March 1, 1994
We report on a series of experiments in which all decision trees consistent with the training data are constructed. These experiments were run to gain an understanding of the properties of the set of consistent decision trees and the factors that affect the accuracy of individual trees. In particular, we investigated the relationship between the size of a decision tree consistent with some training data and the accuracy of the tree on test data. The experiments were performed on a massively parallel MasPar computer. The results of the experiments on several artificial and two real-world problems indicate that, for many of the problems investigated, smaller consistent decision trees are on average less accurate than slightly larger consistent trees.
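To make the experimental setup concrete, the following is a minimal sketch (in Python, not the paper's MasPar implementation) of enumerating every decision tree that perfectly classifies a small binary training set, so that test accuracy can be tabulated by tree size. The tree representation (a leaf label or a `(feature, left, right)` tuple), the size measure (node count), and the toy data are illustrative assumptions; redundant splits of already pure subsets are not enumerated.

```python
# Hypothetical sketch: enumerate all decision trees consistent with a tiny
# binary-feature training set and group their test accuracy by tree size.

def enumerate_trees(rows, features):
    """Yield (size, tree) for every tree that classifies `rows` perfectly.

    rows: list of (feature_dict, label).  A tree is either a leaf label or a
    tuple (feature, left_subtree, right_subtree); size counts all nodes.
    """
    labels = {label for _, label in rows}
    if len(labels) == 1:
        yield 1, labels.pop()                      # pure subset -> single leaf
        return
    for f in features:
        left = [r for r in rows if r[0][f] == 0]
        right = [r for r in rows if r[0][f] == 1]
        if not left or not right:
            continue                               # split separates nothing
        rest = [g for g in features if g != f]
        for ls, lt in enumerate_trees(left, rest):
            for rs, rt in enumerate_trees(right, rest):
                yield ls + rs + 1, (f, lt, rt)

def predict(tree, x):
    while isinstance(tree, tuple):
        f, left, right = tree
        tree = right if x[f] == 1 else left
    return tree

# Toy usage: XOR of a and b plus an irrelevant feature c; hold out two rows.
data = [({"a": a, "b": b, "c": c}, a ^ b) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
train, test = data[:6], data[6:]
by_size = {}
for size, tree in enumerate_trees(train, ["a", "b", "c"]):
    acc = sum(predict(tree, x) == y for x, y in test) / len(test)
    by_size.setdefault(size, []).append(acc)
print({s: sum(v) / len(v) for s, v in sorted(by_size.items())})  # mean test accuracy per size
```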
Similar papers
May 1, 1996
This paper presents new experimental evidence against the utility of Occam's razor. A systematic procedure is presented for post-processing decision trees produced by C4.5. This procedure was derived by rejecting Occam's razor and instead attending to the assumption that similar objects are likely to belong to the same class. It increases a decision tree's complexity without altering the performance of that tree on the training data from which it is inferred. The resulting mo...
March 27, 2013
This paper describes experiments, on two domains, to investigate the effect of averaging over predictions of multiple decision trees, instead of using a single tree. Other authors have pointed out theoretical and commonsense reasons for preferring the multiple tree approach. Ideally, we would like to consider predictions from all trees, weighted by their probability. However, there is a vast number of different trees, and it is difficult to estimate the probability of each tr...
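As a rough illustration of the multiple-tree idea, the sketch below averages class-probability predictions over many trees fitted to bootstrap resamples, using scikit-learn; the dataset, the number of trees, and the use of bootstrap resampling in place of the paper's probability weighting are all assumptions made only for the example.

```python
# Hedged sketch: average predictions of many bootstrap-fitted trees vs. one tree.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
probas = []
for _ in range(50):
    idx = rng.integers(0, len(X_tr), len(X_tr))        # bootstrap resample
    tree = DecisionTreeClassifier(random_state=0).fit(X_tr[idx], y_tr[idx])
    probas.append(tree.predict_proba(X_te))

avg_pred = np.mean(probas, axis=0).argmax(axis=1)       # averaged prediction
single = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr).predict(X_te)
print("averaged:", (avg_pred == y_te).mean(), "single:", (single == y_te).mean())
```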
April 28, 2021
This paper shows that decision trees constructed with Classification and Regression Trees (CART) and C4.5 methodology are consistent for regression and classification tasks, even when the number of predictor variables grows sub-exponentially with the sample size, under natural 0-norm and 1-norm sparsity constraints. The theory applies to a wide range of models, including (ordinary or logistic) additive regression models with component functions that are continuous, of bounded...
December 23, 2019
Random Forests are one of the most popular classifiers in machine learning. The larger they are, the more precise their predictions become. However, this comes at a cost: their running time for classification grows linearly with the number of trees, i.e. the size of the forest. In this paper, we propose a method to aggregate large Random Forests into a single, semantically equivalent decision diagram. Our experiments on various popular datasets show speed-ups of se...
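The aggregation result rests on the same sharing principle as reduced decision diagrams: structurally identical sub-diagrams are stored only once, so an ensemble whose trees overlap can collapse into a compact rooted DAG. The sketch below shows just that reduction step (hash-consing plus elimination of redundant tests); it is a toy illustration, not the paper's construction.

```python
# Toy sketch of decision-diagram reduction: identical subgraphs are shared and
# tests whose branches agree are dropped.  The node layout (test, low, high) is
# an assumption for illustration; leaves are plain class labels.
_unique = {}

def make_node(test, low, high):
    """Return a canonical node for `if test: high else low`, reusing duplicates."""
    if low is high or low == high:
        return low                                  # redundant test: both branches agree
    key = (test, id(low), id(high))
    if key not in _unique:
        _unique[key] = (test, low, high)
    return _unique[key]

def evaluate(node, assignment):
    """Follow the diagram for a dict of feature -> bool until a leaf is reached."""
    while isinstance(node, tuple):
        test, low, high = node
        node = high if assignment[test] else low
    return node

leaf_yes, leaf_no = "yes", "no"
shared = make_node("x1", leaf_no, leaf_yes)
root_a = make_node("x0", shared, shared)            # reduces to `shared` itself
root_b = make_node("x0", leaf_no, shared)
print(root_a is shared, evaluate(root_b, {"x0": True, "x1": True}))  # True yes
```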
April 12, 2019
Inferring a decision tree from a given dataset is one of the classic problems in machine learning. This problem consists of building, from a labelled dataset, a tree in which each leaf corresponds to a class and each path from the root to a leaf corresponds to a conjunction of features that must be satisfied by that class. Following the principle of parsimony, we want to infer a minimal tree consistent with the dataset. Unfortunately, inferring an optimal decision tree is kn...
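Since exact minimization is tractable only on very small instances, a brute-force baseline makes the parsimony objective concrete. The function below is a hypothetical sketch (not the paper's method): it returns the node count of the smallest tree over binary features that perfectly separates the data.

```python
# Hypothetical brute force for the parsimony objective: smallest consistent tree.
# Exponential in the number of features, so usable only on tiny datasets.

def min_tree_size(rows, features):
    """Minimum node count of a tree that classifies `rows` perfectly, or inf."""
    if len({label for _, label in rows}) <= 1:
        return 1                                    # pure subset: one leaf suffices
    best = float("inf")
    for f in features:
        left = [r for r in rows if r[0][f] == 0]
        right = [r for r in rows if r[0][f] == 1]
        if not left or not right:
            continue                                # useless split
        rest = [g for g in features if g != f]
        best = min(best, 1 + min_tree_size(left, rest) + min_tree_size(right, rest))
    return best

# Example: XOR of a and b needs 7 nodes (3 tests + 4 leaves).
rows = [({"a": a, "b": b}, a ^ b) for a in (0, 1) for b in (0, 1)]
print(min_tree_size(rows, ["a", "b"]))              # 7
```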
November 18, 2022
In this paper, we introduce Optimal Classification Forests, a new family of classifiers that takes advantage of an optimal ensemble of decision trees to derive accurate and interpretable classifiers. We propose a novel mathematical optimization-based methodology in which a given number of trees are simultaneously constructed, each of them providing a predicted class for the observations in the feature space. The classification rule is derived by assigning to each observation ...
May 24, 2024
Decision trees are a popular tool in machine learning and yield easy-to-understand models. Several techniques have been proposed in the literature for learning a decision tree classifier, with different techniques working well for data from different domains. In this work, we develop approaches to design decision tree learning algorithms given repeated access to data from the same domain. We propose novel parameterized classes of node splitting criteria in top-down algorithms...
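One concrete example of a parameterized splitting criterion (used here purely as an illustration; whether it matches the paper's family is an assumption) is Tsallis entropy, which recovers Gini impurity at alpha = 2 and Shannon entropy in the limit alpha -> 1, so a single tunable parameter spans the familiar criteria.

```python
# Illustrative parameterized impurity: Tsallis entropy S_alpha = (1 - sum p^alpha) / (alpha - 1).
import numpy as np

def tsallis_impurity(class_probs, alpha):
    """Tsallis entropy of a class distribution; alpha is the tunable parameter."""
    p = np.asarray(class_probs, dtype=float)
    p = p[p > 0]
    if abs(alpha - 1.0) < 1e-9:
        return float(-np.sum(p * np.log(p)))        # Shannon entropy in the alpha -> 1 limit
    return float((1.0 - np.sum(p ** alpha)) / (alpha - 1.0))

print(tsallis_impurity([0.5, 0.5], alpha=2.0))      # 0.5, the Gini impurity of (0.5, 0.5)
print(tsallis_impurity([0.5, 0.5], alpha=1.0))      # ~0.693, the Shannon entropy ln 2
```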
August 1, 1994
This article describes a new system for induction of oblique decision trees. This system, OC1, combines deterministic hill-climbing with two forms of randomization to find a good oblique split (in the form of a hyperplane) at each node of a decision tree. Oblique decision tree methods are tuned especially for domains in which the attributes are numeric, although they can be adapted to symbolic or mixed symbolic/numeric attributes. We present extensive empirical studies, using...
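To show what "a good oblique split" means operationally, here is a much-simplified sketch of searching for a hyperplane split by randomized coordinate-wise hill-climbing on weighted Gini impurity. OC1's actual perturbation scheme, threshold handling, and restarts are more elaborate; everything below is illustrative.

```python
# Simplified, hypothetical oblique-split search: perturb one hyperplane
# coefficient at a time and keep changes that lower weighted Gini impurity.
import numpy as np

def gini(labels):
    """Gini impurity of a label vector."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - float(np.sum(p ** 2))

def split_impurity(X, y, w, b):
    """Weighted Gini impurity of the split X @ w > b (inf if one side is empty)."""
    right = X @ w > b
    n_left, n_right = int((~right).sum()), int(right.sum())
    if n_left == 0 or n_right == 0:
        return float("inf")
    return (n_left * gini(y[~right]) + n_right * gini(y[right])) / len(y)

def hill_climb_split(X, y, iters=200, seed=0):
    """Randomized coordinate-wise hill-climbing over hyperplane coefficients."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])
    b = float(np.median(X @ w))
    best = split_impurity(X, y, w, b)
    for _ in range(iters):
        w_new = w.copy()
        w_new[rng.integers(X.shape[1])] += rng.normal(scale=0.5)   # perturb one coefficient
        b_new = float(np.median(X @ w_new))
        score = split_impurity(X, y, w_new, b_new)
        if score < best:                            # keep only improving moves
            w, b, best = w_new, b_new, score
    return w, b, best

X = np.random.default_rng(1).normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)             # class boundary is an oblique line
print(hill_climb_split(X, y)[2])                     # impurity of the best hyperplane found
```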
February 9, 2024
Predictions using a combination of decision trees are known to be effective in machine learning. Typical ideas for constructing a combination of decision trees for prediction are bagging and boosting. Bagging constructs decision trees independently, without evaluating their combined performance, and averages them afterward. Boosting constructs decision trees sequentially, evaluating only the combined performance of each new decision tree with the fixed past decision trees at ea...
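For reference, the two standard strategies the abstract contrasts can be reproduced directly with scikit-learn; the dataset, base-tree depth, and ensemble size below are arbitrary choices made only for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
base = DecisionTreeClassifier(max_depth=3, random_state=0)

# Bagging: trees fit independently on bootstrap resamples, then averaged.
bagging = BaggingClassifier(base, n_estimators=100, random_state=0)
# Boosting: trees added one at a time against the current ensemble's errors.
boosting = AdaBoostClassifier(base, n_estimators=100, random_state=0)

print("bagging :", cross_val_score(bagging, X, y, cv=5).mean())
print("boosting:", cross_val_score(boosting, X, y, cv=5).mean())
```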
October 4, 2013
Despite widespread interest and practical use, the theoretical properties of random forests are still not well understood. In this paper we contribute to this understanding in two ways. We present a new theoretically tractable variant of random regression forests and prove that our algorithm is consistent. We also provide an empirical evaluation, comparing our algorithm and other theoretically tractable random forest models to the random forest algorithm used in practice. Our...
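As a flavour of what theoretically tractable forest variants look like, the sketch below grows regression trees whose split points do not depend on the labels (a randomly chosen feature is split at the midpoint of the current cell), in the spirit of classic centered or purely random forests; this is an assumption for illustration, not necessarily the paper's algorithm.

```python
# Hypothetical centered-forest sketch: label-independent splits, leaves average y.
import numpy as np

def grow_tree(X, y, lo, hi, depth, rng):
    if depth == 0 or len(y) == 0:
        return float(y.mean()) if len(y) else 0.0   # leaf: mean response (0 if empty)
    j = int(rng.integers(X.shape[1]))
    t = (lo[j] + hi[j]) / 2.0                       # midpoint split, ignores the labels
    left = X[:, j] <= t
    hi_left, lo_right = hi.copy(), lo.copy()
    hi_left[j], lo_right[j] = t, t
    return (j, t,
            grow_tree(X[left], y[left], lo, hi_left, depth - 1, rng),
            grow_tree(X[~left], y[~left], lo_right, hi, depth - 1, rng))

def predict(tree, x):
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if x[j] <= t else right
    return tree

rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 2))
y = np.sin(4 * X[:, 0]) + 0.1 * rng.normal(size=500)
forest = [grow_tree(X, y, np.zeros(2), np.ones(2), 6, rng) for _ in range(50)]
x0 = np.array([0.3, 0.7])
print(np.mean([predict(t, x0) for t in forest]), np.sin(4 * 0.3))  # forest estimate vs. truth
```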