Document Type
Article
Abstract
Random forests are powerful and popular machine learning methods. While the general principles of tree induction are straightforward and well understood, the numerous algorithmic treatments implemented in software tools, and their impacts on performance, are less familiar to most users. This paper introduces a new random forest toolkit (the ‘brif’ package in R and Python) along with its key algorithmic design features, and demonstrates how the forest’s hyper-parameters, such as the split search method, tree depth, and voting mechanism, affect classification performance. Summaries of benchmarking experiments are also presented. Results show that ‘brif’ stands out among several other random forest packages in R in both speed and predictive accuracy: it achieves the best overall training speed, AUC, and accuracy on a comprehensive collection of 57 open datasets.
Disciplines
Data Science | Numerical Analysis and Scientific Computing | Software Engineering
Recommended Citation
Liu, Yanchao, "brif: A novel and efficient implementation of random forests based on bit packing and parallel computing" (2023). Industrial and Systems Engineering Faculty Research Publications. 5.
https://digitalcommons.wayne.edu/im_eng_frp/5