Similar papers
December 8, 2015
In many problems in machine learning and operations research, we need to optimize a function whose input is a random variable or a probability density function, i.e., to solve optimization problems in an infinite-dimensional space. Online learning, on the other hand, has the advantage of dealing with streaming examples and can better model a changing environment. In this paper, we extend the celebrated online gradient descent algorithm to Hilbert spaces (function spaces), and an...
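As a concrete illustration of gradient descent over a Hilbert space of functions (a generic sketch, not the algorithm developed in the paper, which the excerpt cuts off), the following Python snippet runs online gradient descent on the squared loss in the RKHS of a Gaussian kernel; the kernel, step size, and toy data stream are assumptions made purely for the example.

```python
import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian RBF kernel between two feature vectors."""
    d = x - z
    return np.exp(-gamma * np.dot(d, d))

class KernelOGD:
    """Online gradient descent on the squared loss in an RKHS.

    The hypothesis is f_t(x) = sum_i alpha_i k(x_i, x); each round appends one
    support point, so memory grows with the stream (practical variants bound
    the expansion by truncation or projection)."""

    def __init__(self, eta=0.5, gamma=2.0):
        self.eta, self.gamma = eta, gamma
        self.support, self.alpha = [], []

    def predict(self, x):
        return sum(a * rbf_kernel(s, x, self.gamma)
                   for s, a in zip(self.support, self.alpha))

    def update(self, x, y):
        # Functional gradient of 0.5*(f(x) - y)^2 at f_t is (f_t(x) - y) k(x, .),
        # so the update appends a new support point with coefficient -eta*residual.
        residual = self.predict(x) - y
        self.support.append(x)
        self.alpha.append(-self.eta * residual)
        return 0.5 * residual ** 2          # loss suffered before the update

# Toy stream: noisy samples of a smooth target function.
rng = np.random.default_rng(0)
learner = KernelOGD()
losses = []
for t in range(300):
    x = rng.uniform(-1.0, 1.0, size=1)
    y = np.sin(3.0 * x[0]) + 0.1 * rng.normal()
    losses.append(learner.update(x, y))
print("mean loss, first 50 rounds:", round(float(np.mean(losses[:50])), 4))
print("mean loss, last 50 rounds :", round(float(np.mean(losses[-50:])), 4))
```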
April 25, 2015
Making use of predictions is a crucial, but under-explored, area of online algorithms. This paper studies a class of online optimization problems where we have external noisy predictions available. We propose a stochastic prediction error model that generalizes prior models in the learning and stochastic control communities, incorporates correlation among prediction errors, and captures the fact that predictions improve as time passes. We prove that achieving sublinear regret...
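The excerpt does not show the proposed error model itself, so the sketch below illustrates one plausible shape such a model can take: the error of a prediction made at time s about time t is a weighted sum of per-step noise terms, which makes errors correlated across nearby targets and makes predictions improve as the prediction time approaches the target. The function, the impulse weights, and the parameters are all illustrative assumptions.

```python
import numpy as np

def noisy_predictions(y, horizon, impulse, rng):
    """Simulate predictions of a trajectory y under a correlated error model.

    pred[s, t] is the prediction, made at time s, of y[t] (for s < t <= s + horizon).
    Its error is a sum of per-step noise terms e(tau), tau = s+1..t, weighted by
    impulse[t - tau]: predictions of nearer targets aggregate fewer noise terms
    (they improve as time passes), and predictions of nearby targets share noise
    terms (their errors are correlated)."""
    T = len(y)
    e = rng.normal(size=T)                       # one noise term per time step
    pred = np.full((T, T), np.nan)
    for s in range(T):
        for t in range(s + 1, min(T, s + 1 + horizon)):
            err = sum(impulse[t - tau] * e[tau] for tau in range(s + 1, t + 1))
            pred[s, t] = y[t] + err
    return pred

rng = np.random.default_rng(1)
T = 60
y = np.sin(np.linspace(0.0, 6.0, T))             # true trajectory to be predicted
impulse = 0.8 ** np.arange(T)                    # decaying weight on older noise terms
pred = noisy_predictions(y, horizon=10, impulse=impulse, rng=rng)

# The error of the prediction of y[40] tends to shrink as the prediction time
# approaches the target time.
t = 40
for s in (30, 35, 39):
    print(f"predicting y[{t}] at time {s}: abs error = {abs(pred[s, t] - y[t]):.3f}")
```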
February 16, 2021
Winner-take-all competitions in forecasting and machine learning suffer from distorted incentives. Witkowski et al. (2018) identified this problem and proposed ELF, a truthful mechanism to select a winner. We show that, from a pool of $n$ forecasters, ELF requires $\Theta(n\log n)$ events or test data points to select a near-optimal forecaster with high probability. We then show that standard online learning algorithms select an $\epsilon$-optimal forecaster using only $O(\log(...
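A canonical example of the standard online learning algorithms the abstract refers to is Hedge (exponential weights). The sketch below uses it to pick a forecaster out of a pool from observed per-event losses; both the algorithm choice and the loss model for the forecasters are illustrative assumptions, not necessarily the setup analyzed in the paper.

```python
import numpy as np

def hedge_select(losses, eta):
    """Hedge (exponential weights) over a pool of forecasters.

    losses[t, i] is forecaster i's loss on event t (e.g. the squared error of its
    predicted probability).  Returns the final weight vector; the forecaster with
    the largest weight is the one the selection procedure would declare the winner."""
    n_events, n_forecasters = losses.shape
    weights = np.ones(n_forecasters) / n_forecasters
    for t in range(n_events):
        weights *= np.exp(-eta * losses[t])
        weights /= weights.sum()
    return weights

rng = np.random.default_rng(2)
n_forecasters, n_events = 10, 200
# Hypothetical setup: each forecaster has a fixed expected loss ("skill"); its
# per-event loss is that skill plus noise, so a lower value means a better forecaster.
skill = rng.uniform(0.2, 0.8, size=n_forecasters)
losses = skill[None, :] + 0.1 * rng.normal(size=(n_events, n_forecasters))

eta = np.sqrt(8.0 * np.log(n_forecasters) / n_events)   # standard Hedge tuning
weights = hedge_select(losses, eta)
print("best forecaster (ground truth):", int(skill.argmin()))
print("forecaster selected by Hedge  :", int(weights.argmax()))
```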
April 27, 2014
This paper provides insight into when, why, and how forecast strategies fail when they are applied to complicated time series. We conjecture that the inherent complexity of real-world time-series data---which results from the dimension, nonlinearity, and non-stationarity of the generating process, as well as from measurement issues like noise, aggregation, and finite data length---is both empirically quantifiable and directly correlated with predictability. In particular, we ...
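As a toy illustration of the idea that complexity can be quantified and tracks predictability (the measures used in the paper are not visible in this excerpt), the sketch below computes one common complexity proxy, normalized permutation entropy, together with the error of a naive persistence forecast, for a regular and an irregular series.

```python
import math
import numpy as np

def permutation_entropy(series, m=3):
    """Normalized permutation entropy: the Shannon entropy of the ordinal patterns
    of length-m windows, divided by log(m!).  Values near 1 suggest a complex,
    hard-to-predict series; values near 0 a very regular one."""
    counts = {}
    for i in range(len(series) - m + 1):
        pattern = tuple(np.argsort(series[i:i + m]))
        counts[pattern] = counts.get(pattern, 0) + 1
    total = sum(counts.values())
    H = -sum(c / total * math.log(c / total) for c in counts.values())
    return H / math.log(math.factorial(m))

def persistence_error(series):
    """Mean absolute error of the naive "tomorrow equals today" forecast."""
    return float(np.mean(np.abs(np.diff(series))))

rng = np.random.default_rng(7)
t = np.arange(1000)
regular = np.sin(0.1 * t) + 0.05 * rng.normal(size=t.size)   # low-complexity series
irregular = rng.normal(size=t.size)                          # high-complexity series

for name, s in [("sine + small noise", regular), ("white noise", irregular)]:
    print(f"{name:<18}  complexity = {permutation_entropy(s):.2f}"
          f"  persistence MAE = {persistence_error(s):.2f}")
```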
October 1, 2020
This book is devoted to the problem of sequential probability forecasting, that is, predicting the probabilities of the next outcome of a growing sequence of observations given the past. This problem is considered in a very general setting that unifies commonly used probabilistic and non-probabilistic settings, making as few assumptions as possible about the mechanism generating the observations. A common form that arises in various formulations of this problem is that o...
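A minimal concrete instance of sequential probability forecasting is the Krichevsky-Trofimov estimator for binary sequences, sketched below purely as an illustration; the setting treated in the book is far more general.

```python
import math

def kt_probability(past, next_symbol):
    """Krichevsky-Trofimov estimate of P(next_symbol | past) for a binary
    sequence: (count of next_symbol in past + 1/2) / (length of past + 1)."""
    ones = sum(past)
    count = ones if next_symbol == 1 else len(past) - ones
    return (count + 0.5) / (len(past) + 1)

def cumulative_log_loss(sequence):
    """Log loss of the KT forecaster run sequentially over the sequence."""
    loss, past = 0.0, []
    for x in sequence:
        loss -= math.log(kt_probability(past, x))
        past.append(x)
    return loss

seq = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1] * 20      # a stationary-looking binary stream
print("KT forecaster, cumulative log loss :", round(cumulative_log_loss(seq), 2))

# Benchmark: the best constant (i.i.d. Bernoulli) predictor chosen in hindsight.
p = sum(seq) / len(seq)
best = -sum(math.log(p if x == 1 else 1 - p) for x in seq)
print("best i.i.d. predictor in hindsight :", round(best, 2))
```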
January 22, 2013
Efficient online learning with pairwise loss functions is a crucial component in building large-scale learning systems that maximize the area under the Receiver Operating Characteristic (ROC) curve. In this paper we investigate the generalization performance of online learning algorithms with pairwise loss functions. We show that the existing proof techniques for generalization bounds of online algorithms with a univariate loss cannot be directly applied to pairwise losses. I...
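To make the pairwise-loss setting concrete, the sketch below runs a simple buffered online update on the pairwise hinge loss for a linear scorer, which targets the AUC; the buffering scheme, step size, and data are assumptions for the example rather than the construction studied in the paper.

```python
import numpy as np

def online_pairwise_hinge(stream, dim, eta=0.1, buffer_size=50, seed=0):
    """Online learning of a linear scorer w with the pairwise hinge loss.

    Each new example is compared against a small buffer of recent examples of the
    opposite class; the loss max(0, 1 - (w.x_pos - w.x_neg)) pushes positives to
    score above negatives, which is what the AUC measures."""
    rng = np.random.default_rng(seed)
    w = np.zeros(dim)
    buffers = {+1: [], -1: []}
    for x, y in stream:
        for x_opp in buffers[-y]:
            x_pos, x_neg = (x, x_opp) if y == 1 else (x_opp, x)
            if w @ x_pos - w @ x_neg < 1:       # hinge is active: take a gradient step
                w += eta * (x_pos - x_neg)
        buffers[y].append(x)
        if len(buffers[y]) > buffer_size:       # drop a random element to cap the buffer
            buffers[y].pop(rng.integers(len(buffers[y])))
    return w

# Toy stream: two Gaussian classes in 5 dimensions.
rng = np.random.default_rng(3)
dim, n = 5, 2000
mu = np.ones(dim)
X = np.vstack([rng.normal(+mu, 1.0, size=(n // 2, dim)),
               rng.normal(-mu, 1.0, size=(n // 2, dim))])
y = np.array([1] * (n // 2) + [-1] * (n // 2))
order = rng.permutation(n)
w = online_pairwise_hinge(zip(X[order], y[order]), dim)

# Empirical AUC of the learned scorer.
scores = X @ w
pos, neg = scores[y == 1], scores[y == -1]
print("empirical AUC:", round(float((pos[:, None] > neg[None, :]).mean()), 3))
```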
September 9, 2016
In the context of Structural Risk Minimization, one is presented with a sequence of classes $\{\mathcal{G}_j\}$ from which, given a random sample $(X_i, Y_i)$, one wants to choose a strongly consistent estimator. For certain types of classes of functions, we present a criterion for choosing an estimator, based on minimizing the sum of the empirical error and a complexity penalty $r(n,j)$ over each class $\mathcal{G}_j$. We also present several other results together with important result...
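A generic instance of this criterion, with nested polynomial regression classes standing in for $\{\mathcal{G}_j\}$ and an illustrative penalty of the form $r(n,j) \propto \sqrt{(j+1)\log n / n}$, might look as follows.

```python
import numpy as np

def srm_select(x, y, max_degree=10, penalty_const=0.1):
    """Structural risk minimization over nested polynomial regression classes.

    For each class G_j (polynomials of degree j) we fit by least squares, record
    the empirical error, and add a complexity penalty r(n, j); the selected class
    is the one minimizing the penalized empirical error."""
    n = len(x)
    best = None
    for j in range(max_degree + 1):
        coeffs = np.polyfit(x, y, deg=j)
        empirical_error = float(np.mean((np.polyval(coeffs, x) - y) ** 2))
        penalty = penalty_const * np.sqrt((j + 1) * np.log(n) / n)   # illustrative r(n, j)
        score = empirical_error + penalty
        if best is None or score < best[0]:
            best = (score, j, coeffs)
    return best

rng = np.random.default_rng(4)
n = 200
x = rng.uniform(-1.0, 1.0, size=n)
y = 1.0 - 2.0 * x + 4.0 * x ** 3 + 0.2 * rng.normal(size=n)   # cubic target plus noise

score, degree, coeffs = srm_select(x, y)
print("selected class: polynomials of degree", degree)
```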
February 8, 2018
Online learning represents an important family of machine learning algorithms, in which a learner attempts to solve an online prediction (or any type of decision-making) task by learning a model/hypothesis from a sequence of data instances one at a time. The goal of online learning is to ensure that the online learner makes a sequence of accurate predictions (or correct decisions) given knowledge of the correct answers to previous prediction or learning tasks and possi...
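To make the protocol concrete, the sketch below runs textbook online gradient descent on a linear model with squared loss and reports the regret against the best fixed model in hindsight; the step-size schedule and the synthetic data are illustrative assumptions.

```python
import numpy as np

def online_gradient_descent(X, y, eta=0.1):
    """The online protocol: at round t the learner predicts with w_t, the correct
    answer y_t is revealed, the learner suffers the squared loss and updates w."""
    n, d = X.shape
    w = np.zeros(d)
    cumulative_loss = 0.0
    for t in range(n):
        pred = X[t] @ w
        cumulative_loss += 0.5 * (pred - y[t]) ** 2
        # Decaying step size eta / sqrt(t+1): a standard schedule giving
        # O(sqrt(T)) regret for convex losses.
        w -= eta / np.sqrt(t + 1) * (pred - y[t]) * X[t]
    return w, cumulative_loss

rng = np.random.default_rng(5)
n, d = 1000, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=n)

w, online_loss = online_gradient_descent(X, y)

# Cumulative loss of the best fixed weight vector in hindsight, for the regret comparison.
w_star, *_ = np.linalg.lstsq(X, y, rcond=None)
offline_loss = 0.5 * float(np.sum((X @ w_star - y) ** 2))
print("online cumulative loss :", round(float(online_loss), 1))
print("best fixed in hindsight:", round(offline_loss, 1))
print("regret                 :", round(float(online_loss) - offline_loss, 1))
```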
July 1, 2024
This paper is concerned with the fundamental limits of nonlinear dynamical system learning from input-output traces. Specifically, we show that recurrent neural networks (RNNs) are capable of learning nonlinear systems that satisfy a Lipschitz property and forget past inputs fast enough in a metric-entropy optimal manner. As the sets of sequence-to-sequence maps realized by the dynamical systems we consider are significantly more massive than function classes generally consid...
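The sketch below, which assumes PyTorch is available, fits a small RNN to input-output traces of a simple contractive (hence Lipschitz and fast-forgetting) nonlinear system; the system, network size, and training setup are illustrative and far simpler than the function classes analyzed in the paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def simulate_system(u):
    """A simple contractive nonlinear system: x_{t+1} = tanh(0.5 x_t + u_t), y_t = x_t.
    The contraction makes it Lipschitz and makes it forget past inputs quickly."""
    batch, T, _ = u.shape
    x = torch.zeros(batch, 1)
    ys = []
    for t in range(T):
        x = torch.tanh(0.5 * x + u[:, t])
        ys.append(x)
    return torch.stack(ys, dim=1)

# Input-output traces of the system.
batch, T = 64, 50
u = torch.randn(batch, T, 1)
y = simulate_system(u)

# Small RNN plus linear readout, trained to reproduce the traces.
rnn = nn.RNN(input_size=1, hidden_size=16, batch_first=True)
readout = nn.Linear(16, 1)
opt = torch.optim.Adam(list(rnn.parameters()) + list(readout.parameters()), lr=1e-2)

for step in range(500):
    h, _ = rnn(u)
    loss = ((readout(h) - y) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Evaluate on fresh traces from the same system.
u_test = torch.randn(batch, T, 1)
with torch.no_grad():
    mse = ((readout(rnn(u_test)[0]) - simulate_system(u_test)) ** 2).mean()
print("test MSE:", float(mse))
```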
August 3, 2012
We consider the minimum error entropy (MEE) criterion and an empirical risk minimization learning algorithm in a regression setting. A learning theory approach is presented for this MEE algorithm and explicit error bounds are provided in terms of the approximation ability and capacity of the involved hypothesis space when the MEE scaling parameter is large. Novel asymptotic analysis is conducted for the generalization error associated with Rényi's entropy and a Parzen window ...
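To make the criterion concrete, the sketch below estimates Rényi's quadratic entropy of regression residuals with a Gaussian Parzen window and then selects, over a one-dimensional model class, the slope whose residuals have the smallest estimated entropy; the kernel width $h$ (the MEE scaling parameter) and the toy data are assumptions of the example.

```python
import numpy as np

def renyi_quadratic_entropy(errors, h):
    """Parzen-window estimate of Renyi's quadratic entropy of the errors:
    H_2 = -log V with information potential V = (1/n^2) sum_{i,j} G_h(e_i - e_j),
    where G_h is a Gaussian kernel of width h (the MEE scaling parameter)."""
    diff = errors[:, None] - errors[None, :]
    V = np.exp(-diff ** 2 / (2 * h ** 2)).mean()
    return -np.log(V)

rng = np.random.default_rng(6)
n = 400
x = rng.normal(size=n)
a_true = 1.5
y = a_true * x + 0.3 * rng.normal(size=n)

# Empirical risk minimization with the MEE criterion over a 1-D model class
# y = a * x: pick the slope whose residuals have the smallest estimated entropy.
grid = np.linspace(0.0, 3.0, 61)
entropies = [renyi_quadratic_entropy(y - a * x, h=0.5) for a in grid]
print("true slope        :", a_true)
print("MEE-selected slope:", round(float(grid[int(np.argmin(entropies))]), 2))
```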