Similar papers
May 28, 2024
There has been considerable effort to better understand the generalization capabilities of deep neural networks, both as a means to unlock a theoretical understanding of their success and as a guide to further improvements. In this paper, we investigate margin-based multiclass generalization bounds for neural networks that rely on the geometric complexity, a complexity measure recently developed for neural networks. We derive a new upper bound on the generali...
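The geometric complexity referenced above is, in the formulation this most likely refers to (Dherin et al., 2022), the average squared Frobenius norm of the network's input-output Jacobian over the data. The sketch below estimates that quantity numerically for an arbitrary model; the finite-difference estimator and the toy two-layer network are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def geometric_complexity(model, X, eps=1e-4):
    """Estimate the geometric complexity of `model` on the rows of X.

    Assumed definition: mean squared Frobenius norm of the input-output
    Jacobian, approximated here with central finite differences.
    """
    n, d = X.shape
    total = 0.0
    for x in X:
        jac_sq = 0.0
        for j in range(d):
            e = np.zeros(d)
            e[j] = eps
            col = (model(x + e) - model(x - e)) / (2 * eps)  # j-th Jacobian column
            jac_sq += np.sum(col ** 2)
        total += jac_sq
    return total / n

# Toy usage: a fixed random two-layer tanh network on 5-dimensional inputs.
rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(16, 5)), rng.normal(size=(3, 16))
model = lambda x: W2 @ np.tanh(W1 @ x)
X = rng.normal(size=(100, 5))
print(geometric_complexity(model, X))
```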
October 2, 2016
In this study, we propose a novel sparsity-driven weighted ensemble classifier (SDWEC) that improves classification accuracy and minimizes the number of classifiers. Using pre-trained classifiers, an ensemble is formed in which the base classifiers vote according to assigned weights; these weights directly affect classification accuracy. In the proposed method, the ensemble weight-finding problem is modeled as a cost function with the following terms: (a) a data fidelity term...
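The truncation cuts off the remaining terms of the SDWEC cost function, so the following is only a minimal sketch of the general scheme described: pre-trained base classifiers vote through learned weights, and the weights are found by minimizing a data-fidelity term plus an assumed $\ell_1$ sparsity penalty. The loss, penalty, and optimizer here are stand-ins, not the paper's exact formulation.

```python
import numpy as np

def fit_ensemble_weights(votes, y, lam=0.1, lr=0.01, steps=2000):
    """Fit sparse voting weights for pre-trained classifiers.

    votes: (n_samples, n_classifiers) base predictions in {-1, +1}
    y:     (n_samples,) labels in {-1, +1}

    Minimizes a squared-error data-fidelity term plus an assumed
    L1 penalty that encourages few active classifiers.
    """
    n, m = votes.shape
    w = np.full(m, 1.0 / m)
    for _ in range(steps):
        resid = votes @ w - y                           # data-fidelity residual
        grad = votes.T @ resid / n + lam * np.sign(w)   # subgradient of the L1 term
        w -= lr * grad
    return w

def ensemble_predict(votes, w):
    """Weighted-majority prediction of the ensemble."""
    return np.sign(votes @ w)
```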
March 28, 2015
We propose an extensive analysis of the behavior of majority votes in binary classification. In particular, we introduce a risk bound for majority votes, called the C-bound, that takes into account the average quality of the voters and their average disagreement. We also propose an extensive PAC-Bayesian analysis that shows how the C-bound can be estimated from various observations contained in the training data. The analysis is intended to be self-contained and can be used as in...
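For context, the C-bound mentioned above is usually stated in terms of the first two moments of the majority vote's margin; the display below follows the standard form from the PAC-Bayes literature (e.g., Germain et al., 2015) rather than the truncated abstract itself.

```latex
% Voters h : X -> [-1, 1], weighted by a distribution Q; data distribution D.
% Margin of the Q-weighted majority vote on an example (x, y):
%   M_Q(x, y) = y * E_{h ~ Q}[ h(x) ]
% Whenever the first moment of the margin is positive, the risk of the
% majority vote B_Q satisfies the C-bound:
\[
  R(B_Q) \;\le\; 1 \;-\;
  \frac{\bigl(\mathbb{E}_{(x,y)\sim D}\, M_Q(x,y)\bigr)^{2}}
       {\mathbb{E}_{(x,y)\sim D}\, M_Q(x,y)^{2}} .
\]
```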
May 10, 2011
Boosting is a popular way to derive powerful learners from simpler hypothesis classes. Following previous work (Mason et al., 1999; Friedman, 2000) on general boosting frameworks, we analyze gradient-based descent algorithms for boosting with respect to any convex objective and introduce a new measure of weak learner performance into this setting that generalizes existing work. We present the weak-to-strong learning guarantees for the existing gradient boosting work for stro...
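As a concrete illustration of the gradient-based boosting framework the abstract refers to, here is a minimal functional-gradient-descent sketch: each round fits a weak learner to the negative gradient of a convex loss and takes a small step. The squared loss, decision-stump learners, and step size are assumptions chosen for brevity, not the paper's setting.

```python
import numpy as np

def fit_stump(X, r):
    """Least-squares decision stump fit to pseudo-residuals r."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left, right = X[:, j] <= t, X[:, j] > t
            if left.sum() == 0 or right.sum() == 0:
                continue
            lv, rv = r[left].mean(), r[right].mean()
            err = ((r[left] - lv) ** 2).sum() + ((r[right] - rv) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, j, t, lv, rv)
    _, j, t, lv, rv = best
    return lambda Z: np.where(Z[:, j] <= t, lv, rv)

def gradient_boost(X, y, n_rounds=50, lr=0.1):
    """Functional gradient descent: each round fits a weak learner to the
    negative gradient of the (here: squared) loss and takes a small step."""
    F = np.zeros(len(y))
    learners = []
    for _ in range(n_rounds):
        residual = y - F                 # negative gradient of 0.5 * (y - F)^2
        h = fit_stump(X, residual)
        learners.append(h)
        F += lr * h(X)
    return lambda Z: sum(lr * h(Z) for h in learners)
```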
April 26, 2022
We introduce a novel bias-variance decomposition for a range of strictly convex margin losses, including the logistic loss (minimized by the classic LogitBoost algorithm), as well as the squared margin loss and canonical boosting loss. Furthermore, we show that, for all strictly convex margin losses, the expected risk decomposes into the risk of a "central" model and a term quantifying variation in the functional margin with respect to variations in the training data. These d...
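The exact decomposition is paper-specific and cut off here, but the quantity it refers to, variation of the functional margin under perturbations of the training data, can be probed empirically. The bootstrap-plus-logistic-regression sketch below only illustrates that quantity; it is not the authors' decomposition.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def margin_variation(X, y, n_resamples=50, seed=0):
    """Probe how the functional margin y * f(x) varies when the training set
    is perturbed by bootstrap resampling (labels y in {-1, +1})."""
    rng = np.random.default_rng(seed)
    margins = []
    for _ in range(n_resamples):
        idx = rng.integers(0, len(y), size=len(y))
        if len(np.unique(y[idx])) < 2:          # skip degenerate resamples
            continue
        clf = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
        f = clf.decision_function(X)            # functional margin before the sign
        margins.append(y * f)
    margins = np.array(margins)                 # (n_kept_resamples, n_points)
    central = margins.mean(axis=0)              # average margin across resamples
    variation = margins.var(axis=0)             # spread induced by training-set noise
    return central.mean(), variation.mean()
```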
December 28, 2013
Diversity or complementarity of experts in ensemble pattern recognition and information processing systems is widely observed by researchers to be crucial for achieving performance improvement upon fusion. Understanding this link between ensemble diversity and fusion performance is thus an important research question. However, prior works have theoretically characterized ensemble diversity and linked it to ensemble performance only in very restricted settings. We present a ...
March 6, 2014
It is generally believed that ensemble approaches, which combine multiple algorithms or models, can outperform any single algorithm at machine learning tasks, such as prediction. In this paper, we propose Bayesian convex and linear aggregation approaches motivated by regression applications. We show that the proposed approach is minimax optimal when the true data-generating model is a convex or linear combination of models in the list. Moreover, the method can adapt to sparsi...
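The abstract is cut off before the method details, so the sketch below only illustrates the generic setup it describes: aggregating a list of pre-fitted regression models with weights constrained to the probability simplex (convex aggregation). The Dirichlet prior and the importance-sampling approximation are assumptions made for illustration, not the authors' posterior computation.

```python
import numpy as np

def convex_aggregate(preds, y, n_draws=5000, alpha=1.0, seed=0):
    """Convex aggregation of pre-fitted model predictions.

    preds: (n_models, n_samples) predictions of the models in the list
    y:     (n_samples,) regression targets

    Draws candidate weight vectors from a Dirichlet prior on the simplex and
    re-weights them by a Gaussian likelihood of the aggregated residuals --
    a simple importance-sampling stand-in for a full Bayesian treatment.
    """
    rng = np.random.default_rng(seed)
    m = preds.shape[0]
    W = rng.dirichlet(alpha * np.ones(m), size=n_draws)   # (n_draws, m) simplex points
    resid = W @ preds - y                                  # (n_draws, n_samples)
    loglik = -0.5 * (resid ** 2).sum(axis=1)
    w = np.exp(loglik - loglik.max())
    w /= w.sum()
    return w @ W                                           # posterior-mean weights
```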
June 14, 2015
This paper studies the generalization performance of multi-class classification algorithms, for which we obtain, for the first time, a data-dependent generalization error bound with a logarithmic dependence on the class size, substantially improving the state-of-the-art linear dependence in the existing data-dependent generalization analysis. The theoretical analysis motivates us to introduce a new multi-class classification machine based on $\ell_p$-norm regularization, wher...
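As a generic illustration of an $\ell_p$-norm regularized multi-class machine (the paper's exact formulation is not recoverable from the excerpt), the sketch below trains a linear multi-class model with a multinomial logistic loss plus an elementwise $\ell_p$ penalty; the loss and optimizer are assumptions.

```python
import numpy as np

def lp_multiclass_train(X, y, p=1.5, lam=0.01, lr=0.1, steps=500):
    """Linear multi-class classifier with an elementwise l_p penalty on the weights.

    Generic illustration only: multinomial logistic loss plus lam * ||W||_p^p,
    minimized by plain gradient descent. Labels y are assumed to be 0..k-1.
    """
    n, d = X.shape
    k = int(y.max()) + 1
    W = np.zeros((d, k))
    Y = np.eye(k)[y]                                       # one-hot labels
    for _ in range(steps):
        Z = X @ W
        Z -= Z.max(axis=1, keepdims=True)                  # numerical stability
        P = np.exp(Z); P /= P.sum(axis=1, keepdims=True)
        grad = X.T @ (P - Y) / n
        grad += lam * p * np.sign(W) * np.abs(W) ** (p - 1)  # d/dW of ||W||_p^p
        W -= lr * grad
    return W

def lp_multiclass_predict(X, W):
    return np.argmax(X @ W, axis=1)
```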
March 26, 2024
Mixture-of-Experts (MoE) is an ensemble methodology that amalgamates predictions from several specialized sub-models (referred to as experts). This fusion is accomplished through a router mechanism that dynamically assigns weights to each expert's contribution based on the input data. Conventional MoE mechanisms select all available experts, incurring substantial computational costs. In contrast, Sparse Mixture-of-Experts (Sparse MoE) selectively engages only a limited ...
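The routing behavior described for Sparse MoE is straightforward to sketch: a router scores all experts for a given input, only the top-$k$ scores are kept and renormalized, and only those $k$ experts are evaluated. The softmax gate and toy dimensions below are common choices assumed for illustration.

```python
import numpy as np

def sparse_moe_forward(x, router_W, experts, k=2):
    """Sparse Mixture-of-Experts forward pass for a single input vector x.

    router_W: (n_experts, d) router weights producing one score per expert
    experts:  list of callables, each mapping x -> an output vector
    Only the top-k experts (by router score) are evaluated and combined.
    """
    scores = router_W @ x                          # one logit per expert
    top = np.argsort(scores)[-k:]                  # indices of the k best experts
    gate = np.exp(scores[top] - scores[top].max())
    gate /= gate.sum()                             # renormalized gating weights
    return sum(g * experts[i](x) for g, i in zip(gate, top))

# Toy usage with 4 linear experts on 8-dimensional inputs.
rng = np.random.default_rng(0)
d, n_experts = 8, 4
experts = [lambda x, W=rng.normal(size=(8, d)): W @ x for _ in range(n_experts)]
router_W = rng.normal(size=(n_experts, d))
out = sparse_moe_forward(rng.normal(size=d), router_W, experts, k=2)
```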
May 13, 2009
Let $(X,Y)$ be a random couple in $S\times T$ with unknown distribution $P$ and $(X_1,Y_1),...,(X_n,Y_n)$ be i.i.d. copies of $(X,Y).$ Denote by $P_n$ the empirical distribution of $(X_1,Y_1),...,(X_n,Y_n).$ Let $h_1,...,h_N:S\mapsto [-1,1]$ be a dictionary consisting of $N$ functions. For $\lambda \in {\mathbb{R}}^N,$ denote $f_{\lambda}:=\sum_{j=1}^N\lambda_jh_j.$ Let $\ell:T\times {\mathbb{R}}\mapsto {\mathbb{R}}$ be a given loss function and suppose it is convex with resp...
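The excerpt ends right after the convexity assumption on the loss; the usual continuation in this line of work is a penalized empirical risk minimizer over the dictionary coefficients. The display below writes that estimator with an $\ell_1$ penalty as an assumed example; the paper's actual penalty is not recoverable from the excerpt.

```latex
% Penalized empirical risk over the dictionary coefficients lambda in R^N
% (the l1 penalty with level eps > 0 is an assumed example of the penalization):
\[
  \hat{\lambda} \;:=\; \mathop{\mathrm{arg\,min}}_{\lambda \in \mathbb{R}^N}
  \Bigl\{ \, P_n\, \ell\bigl(y, f_{\lambda}(x)\bigr) \;+\; \varepsilon \,\|\lambda\|_{\ell_1} \Bigr\},
  \qquad f_{\lambda} := \sum_{j=1}^{N} \lambda_j h_j ,
\]
% where P_n g denotes the empirical average n^{-1} \sum_{i=1}^n g(X_i, Y_i).
```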