Similar papers 2
September 12, 2007
The principles of statistical mechanics and information theory play an important role in learning and have inspired both theory and the design of numerous machine learning algorithms. The new aspect of this paper is its focus on integrating feedback from the learner. A quantitative approach to interactive learning and adaptive behavior is proposed, integrating model-making and decision-making into one theoretical framework. This paper follows simple principles by requiring that the o...
September 17, 2020
Active inference is a probabilistic framework for modelling the behaviour of biological and artificial agents, which derives from the principle of minimising free energy. In recent years, this framework has successfully been applied to a variety of situations where the goal was to maximise reward, offering comparable and sometimes superior performance to alternative approaches. In this paper, we clarify the connection between reward maximisation and active inference by demons...
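As a rough illustration of how reward maximisation appears inside the free-energy objective, here is a minimal one-step active-inference sketch for a discrete model; the matrices and the decomposition into extrinsic and epistemic terms are standard textbook assumptions, not this paper's specific formulation.

```python
import numpy as np

# One-step active-inference sketch (illustrative, not this paper's model).
# A[o, s] = P(o | s), B[a][s', s] = P(s' | s, a), C = log-preferences over o.
A = np.array([[0.9, 0.1],
              [0.1, 0.9]])
B = {0: np.array([[1.0, 0.0], [0.0, 1.0]]),   # action 0: stay
     1: np.array([[0.0, 1.0], [1.0, 0.0]])}   # action 1: switch states
C = np.log(np.array([0.8, 0.2]))              # observation 0 is "rewarding"
q_s = np.array([0.5, 0.5])                    # current belief over hidden states

def expected_free_energy(a):
    q_s_next = B[a] @ q_s                     # predicted state belief
    q_o = A @ q_s_next                        # predicted observation distribution
    extrinsic = q_o @ C                       # expected log-preference: the reward term
    ambiguity = q_s_next @ (-np.sum(A * np.log(A), axis=0))  # E_s'[H(o | s')]
    info_gain = -(q_o @ np.log(q_o)) - ambiguity             # I(s'; o): epistemic value
    return -extrinsic - info_gain             # G, which the agent minimises

G = {a: expected_free_energy(a) for a in B}
print(G, min(G, key=G.get))                   # argmin G trades off reward and information
```

When the epistemic term is negligible, minimising G reduces to maximising expected log-preference, which is the sense in which reward maximisation emerges as a special case.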
March 20, 2012
This paper proposes a new estimation algorithm for the parameters of an HMM so as to best account for the observed data. In this model, in addition to the observation sequence, we have \emph{partial} and \emph{noisy} access to the hidden state sequence as side information. This access can be seen as "partial labeling" of the hidden states. Furthermore, we model possible mislabeling in the side information in a joint framework and derive the corresponding EM updates. ...
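A sketch of the idea, under the assumption that the side information enters the forward recursion as a per-step state weight with mislabeling probability eps; the function names and the noise model are illustrative, not the paper's actual EM derivation.

```python
import numpy as np

def label_weight(label, K, eps):
    """Likelihood of the noisy side-information label for each of K states."""
    if label is None:
        return np.ones(K)              # no side information at this step
    w = np.full(K, eps / (K - 1))      # mislabeled with probability eps
    w[label] = 1.0 - eps               # correct label with probability 1 - eps
    return w

def forward(pi, T, E, obs, labels, eps):
    """Normalized forward pass of an HMM with partial, noisy state labels."""
    K = len(pi)
    alpha = pi * E[:, obs[0]] * label_weight(labels[0], K, eps)
    alphas = [alpha / alpha.sum()]
    for t in range(1, len(obs)):
        alpha = (T.T @ alphas[-1]) * E[:, obs[t]] * label_weight(labels[t], K, eps)
        alphas.append(alpha / alpha.sum())
    return np.array(alphas)

pi = np.array([0.6, 0.4])
T = np.array([[0.7, 0.3], [0.2, 0.8]])            # T[s, s'] = P(s' | s)
E = np.array([[0.9, 0.1], [0.2, 0.8]])            # E[s, o] = P(o | s)
print(forward(pi, T, E, obs=[0, 1, 1, 0], labels=[0, None, 1, None], eps=0.1))
```

The same weights would enter the backward pass, so the E-step posteriors, and hence the M-step updates, absorb the side information automatically.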
November 26, 2012
We consider the inverse reinforcement learning problem, that is, the problem of learning from, and then predicting or mimicking, a controller based on state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how latent variables of the model can be estimated, and how predictions about actions can be made, in a unified framework. A new Markov chain Monte Carlo (MC...
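A minimal sketch of Bayesian inverse RL with a Metropolis-Hastings sampler on a toy MDP; the softmax demonstrator likelihood, the one-step Q-values, and all numbers are assumptions for illustration, not the paper's model or sampler.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MDP: 3 states, 2 actions; T[s, a, s'] = P(s' | s, a).
T = np.zeros((3, 2, 3))
T[:, 0, :] = [[0.8, 0.2, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]]
T[:, 1, :] = [[0.2, 0.8, 0.0], [0.0, 0.2, 0.8], [0.0, 0.0, 1.0]]
demos = [(0, 1), (1, 1), (2, 1)]        # observed (state, action) pairs

def log_likelihood(r, beta=5.0):
    q = T @ r                            # one-step Q-values: E[r(s') | s, a]
    logp = beta * q - np.logaddexp.reduce(beta * q, axis=1, keepdims=True)
    return sum(logp[s, a] for s, a in demos)

r = np.zeros(3)                          # per-state reward, standard normal prior
samples = []
for _ in range(3000):                    # random-walk Metropolis-Hastings
    r_new = r + 0.3 * rng.standard_normal(3)
    log_accept = (log_likelihood(r_new) - 0.5 * r_new @ r_new) \
               - (log_likelihood(r)     - 0.5 * r @ r)
    if np.log(rng.random()) < log_accept:
        r = r_new
    samples.append(r)
print(np.mean(samples, axis=0))          # posterior mean reward estimate
```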
December 9, 2024
In this review/tutorial article, we present recent progress on optimal control of partially observed Markov decision processes (POMDPs). We first present regularity and continuity conditions for POMDPs and their belief-MDP reductions, namely weak Feller continuity, Wasserstein regularity, and controlled filter stability. These are then utilized to arrive at existence results on optimal policies for both discounted and average cost problems, and regularity of value func...
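The belief-MDP reduction mentioned here rests on the Bayes filter that maps a belief, an action, and an observation to a new belief; a minimal sketch follows, with illustrative placeholder matrices rather than anything from the article.

```python
import numpy as np

# Belief-MDP reduction sketch: the POMDP's information state is the posterior
# over hidden states, propagated by the Bayes filter below.
T = np.zeros((2, 2, 2))                  # T[s, a, s'] = P(s' | s, a)
T[:, 0, :] = [[0.9, 0.1], [0.1, 0.9]]
T[:, 1, :] = [[0.5, 0.5], [0.5, 0.5]]
O = np.array([[0.8, 0.2],                # O[s', o] = P(o | s')
              [0.3, 0.7]])

def belief_update(b, a, o):
    b_pred = T[:, a, :].T @ b            # prediction step: P(s') under action a
    b_post = O[:, o] * b_pred            # correction step: weight by likelihood of o
    return b_post / b_post.sum()         # normalization; filter stability concerns this map

b = np.array([0.5, 0.5])
for a, o in [(0, 0), (1, 1), (0, 0)]:
    b = belief_update(b, a, o)
print(b)                                  # the belief is the state of the belief-MDP
```

Weak Feller and Wasserstein regularity in the article are, roughly, continuity properties of this update map as a controlled Markov kernel on the space of beliefs.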
February 10, 2025
We present a method for active inference with partial observations in stochastic systems through incentive design, also known as the leader-follower game. Consider a leader agent who aims to infer a follower agent's type given a finite set of possible types. Different types of followers differ in either the dynamical model, the reward function, or both. We assume the leader can partially observe a follower's behavior in the stochastic system modeled as a Markov decision proce...
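Under the simplifying assumption that each follower type induces a known stationary policy, the leader's inference problem reduces to a Bayesian posterior update over the finite set of types; the policies and observations below are hypothetical, not the paper's construction.

```python
import numpy as np

# policies[k][s, a] = P(a | s) for follower type k, assumed known to the leader.
policies = {
    "greedy":   np.array([[0.9, 0.1], [0.8, 0.2]]),
    "cautious": np.array([[0.3, 0.7], [0.2, 0.8]]),
}
posterior = {k: 0.5 for k in policies}   # uniform prior over the two types
observed = [(0, 0), (1, 1), (0, 0)]      # (state, action) pairs the leader saw

for s, a in observed:
    for k in posterior:
        posterior[k] *= policies[k][s, a]    # Bayes: weight by action likelihood
    z = sum(posterior.values())
    posterior = {k: v / z for k, v in posterior.items()}
print(posterior)                          # leader's belief over follower types
```

Incentive design then amounts to choosing the leader's side of the game so that the types' induced behaviours separate quickly under such updates.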
August 6, 2022
Reinforcement learning aims to learn optimal policies from interaction with environments whose dynamics are unknown. Many methods rely on the approximation of a value function to derive near-optimal policies. In partially observable environments, these functions depend on the complete sequence of observations and past actions, called the history. In this work, we show empirically that recurrent neural networks trained to approximate such value functions internally filter the ...
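A minimal sketch of such a recurrent value network, here in PyTorch: the GRU consumes the observation and previous-action sequence so that its hidden state can summarize the history; the architecture and dimensions are assumptions, not the paper's.

```python
import torch
import torch.nn as nn

class RecurrentQ(nn.Module):
    """Q-network over histories of observations and previous actions."""
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(obs_dim + n_actions, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, obs_seq, prev_action_seq):
        # obs_seq: (batch, T, obs_dim); prev_action_seq: (batch, T, n_actions), one-hot
        x = torch.cat([obs_seq, prev_action_seq], dim=-1)
        h_seq, _ = self.rnn(x)            # hidden state summarizes the history;
        return self.head(h_seq)           # the claim is that it behaves like a belief filter

q_net = RecurrentQ(obs_dim=4, n_actions=2)
obs = torch.randn(8, 10, 4)
prev_a = torch.zeros(8, 10, 2)
print(q_net(obs, prev_a).shape)           # (8, 10, 2): Q-values per timestep
```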
May 25, 2016
Many interesting real-world domains involve reinforcement learning (RL) in partially observable environments. Efficient learning in such domains is important, but existing sample complexity bounds for partially observable RL are at least exponential in the episode length. We give, to our knowledge, the first partially observable RL algorithm with a polynomial bound on the number of episodes on which the algorithm may not achieve near-optimal performance. Our algorithm is suit...
September 8, 2018
Machine teaching addresses the problem of finding the best training data that can guide a learning algorithm to a target model with minimal effort. In conventional settings, a teacher provides data that are consistent with the true data distribution. However, for sequential learners that actively choose their queries, such as multi-armed bandits and active learners, the teacher can only provide responses to the learner's queries, not design the full data. In this setting, co...
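A toy sketch of this setting: the learner, here an epsilon-greedy bandit, chooses its own queries, and a hypothetical teacher can only shade the rewards it returns; the perturbation rule is illustrative, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(1)

true_means = np.array([0.5, 0.6, 0.4])   # arm 1 is truly best
target_arm = 0                            # but the teacher wants arm 0 learned
Q, counts = np.zeros(3), np.zeros(3)

for t in range(2000):
    # The learner's query: the teacher has no control over which arm is pulled.
    arm = rng.integers(3) if rng.random() < 0.1 else int(np.argmax(Q))
    reward = rng.normal(true_means[arm], 0.1)
    if arm != target_arm:
        reward -= 0.3                     # teacher's bounded perturbation of the response
    counts[arm] += 1
    Q[arm] += (reward - Q[arm]) / counts[arm]   # learner's running-mean update
print(int(np.argmax(Q)))                  # learner steered to the target arm
```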
January 18, 2010
The aim of this work is to address the question of whether we can, in principle, design rational decision-making agents or artificial intelligences embedded in computable physics such that their decisions are optimal in reasonable mathematical senses. Recent developments in rare event probability estimation, recursive Bayesian inference, neural networks, and probabilistic planning are sufficient to explicitly approximate reinforcement learners of the AIXI style with non-trivial...