Paper: On the Difficulty of Unbiased Alpha Divergence Minimization
Short description: Variational inference approximates a target distribution with a simpler one. While traditional variational inference minimizes the “exclusive” KL-divergence, several algorithms have recently been proposed to minimize other divergences. Experimentally, however, these algorithms often seem to fail to converge. In this paper we analyze the variance of the estimators underlying these methods. Our results …
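To illustrate the variance issue the paper studies, here is a minimal sketch (not the paper's estimator) of an importance-weighted Monte Carlo estimate of the alpha-divergence term E_q[(p/q)^α] between two 1-D Gaussians, sampling from q. The function names and the Gaussian setup are illustrative assumptions:

```python
import numpy as np

def log_gauss(x, mu, sigma):
    """Log-density of a 1-D Gaussian N(mu, sigma^2)."""
    return -0.5 * np.log(2 * np.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def alpha_term_estimate(mu_p, sig_p, mu_q, sig_q, alpha, n, seed=0):
    """Estimate E_q[(p/q)^alpha] by sampling from q; return (mean, sample variance)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu_q, sig_q, size=n)                     # samples from q
    log_w = log_gauss(x, mu_p, sig_p) - log_gauss(x, mu_q, sig_q)
    w_alpha = np.exp(alpha * log_w)                          # (p/q)^alpha per sample
    return w_alpha.mean(), w_alpha.var()

# When p == q every importance weight is exactly 1: the estimate is exact, zero variance.
mean_eq, var_eq = alpha_term_estimate(0.0, 1.0, 0.0, 1.0, alpha=2.0, n=10_000)

# With a mismatch between p and q the same estimator becomes noisy: its sample
# variance grows rapidly, which is the kind of behavior the paper analyzes.
mean_mis, var_mis = alpha_term_estimate(2.0, 1.0, 0.0, 1.0, alpha=2.0, n=10_000)
```

The contrast between the two calls is the point: the estimator is unbiased in both cases, but its usefulness is governed entirely by its variance.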
Paper: High Confidence Generalization for Reinforcement Learning
We present several classes of reinforcement learning algorithms that safely generalize to Markov decision processes (MDPs) not seen during training. Specifically, we study the setting in which some set of MDPs is accessible for training. For various definitions of safety, our algorithms give probabilistic guarantees that agents can safely generalize to MDPs that are sampled …
Paper: RealMVP: A Change of Variables Method For Rectangular Matrix-Vector Products
Rectangular matrix-vector products are used extensively throughout machine learning and are fundamental to neural networks such as multi-layer perceptrons, but are notably absent as normalizing flow layers. This paper identifies this methodological gap and plugs it with a tall and wide MVP change of variables formula. Our theory builds up to a practical algorithm that …
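For intuition on why a change of variables for rectangular products is nonstandard, here is a sketch of the textbook volume-change term for a tall linear map z → Wz (W of shape n × d with n > d): the log-determinant of the square Jacobian is replaced by ½ log det(WᵀW). This is the generic injective-linear-map correction, not necessarily the formula the paper derives:

```python
import numpy as np

def tall_mvp_logdet(W):
    """Volume-change term 0.5 * logdet(W^T W) for a tall, full-column-rank W."""
    sign, logdet = np.linalg.slogdet(W.T @ W)
    assert sign > 0, "W must have full column rank"
    return 0.5 * logdet

rng = np.random.default_rng(0)
W = rng.normal(size=(5, 3))          # a tall 5 x 3 map

# Equivalently, this term is the sum of the log singular values of W,
# which reduces to the usual log|det W| when W is square.
sv_sum = np.log(np.linalg.svd(W, compute_uv=False)).sum()
ld = tall_mvp_logdet(W)
```

The singular-value identity is a useful sanity check: for a square matrix both expressions collapse to the familiar log|det W| of an ordinary flow layer.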