{"id":220,"date":"2022-04-05T18:37:00","date_gmt":"2022-04-05T18:37:00","guid":{"rendered":"https:\/\/groups.cs.umass.edu\/equate-ml\/?p=220"},"modified":"2022-04-07T18:16:12","modified_gmt":"2022-04-07T18:16:12","slug":"paper-on-the-difficulty-of-unbiased-alpha-divergence-minimization","status":"publish","type":"post","link":"https:\/\/groups.cs.umass.edu\/equate-ml\/2022\/04\/05\/paper-on-the-difficulty-of-unbiased-alpha-divergence-minimization\/","title":{"rendered":"Paper: On the Difficulty of Unbiased Alpha Divergence Minimization"},"content":{"rendered":"\n<p>Short description: Variational inference approximates a target distribution with a simpler one. While traditional inference minimizes the \u201cinclusive\u201d KL-divergence, several algorithms have recently been proposed to minimize other divergences. Experimentally, however, these algorithms often seem to fail to converge. In this paper we analyze the variance of the gradient estimators underlying these algorithms. Our results are very pessimistic: for any divergence except the traditional one, the signal-to-noise ratio of the gradient estimator decays exponentially in the dimensionality.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a class=\"wp-block-button__link\" href=\"https:\/\/proceedings.mlr.press\/v130\/cunningham21a.html\">Paper<\/a><\/div>\n<\/div>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Short description: Variational inference approximates a target distribution with a simpler one. While traditional inference minimizes the \u201cinclusive\u201d KL-divergence, several algorithms have recently been proposed to minimize other divergences. Experimentally, however, these algorithms often seem to fail to converge. In this paper we analyze the variance of the gradient estimators underlying these algorithms. 
Our results &hellip; <a href=\"https:\/\/groups.cs.umass.edu\/equate-ml\/2022\/04\/05\/paper-on-the-difficulty-of-unbiased-alpha-divergence-minimization\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Paper: On the Difficulty of Unbiased Alpha Divergence Minimization&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":180,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4],"tags":[32,33,26],"class_list":["post-220","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-32","tag-icml","tag-paper","group-blog","hfeed"],"_links":{"self":[{"href":"https:\/\/groups.cs.umass.edu\/equate-ml\/wp-json\/wp\/v2\/posts\/220","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/groups.cs.umass.edu\/equate-ml\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/groups.cs.umass.edu\/equate-ml\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/groups.cs.umass.edu\/equate-ml\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/groups.cs.umass.edu\/equate-ml\/wp-json\/wp\/v2\/comments?post=220"}],"version-history":[{"count":1,"href":"https:\/\/groups.cs.umass.edu\/equate-ml\/wp-json\/wp\/v2\/posts\/220\/revisions"}],"predecessor-version":[{"id":221,"href":"https:\/\/groups.cs.umass.edu\/equate-ml\/wp-json\/wp\/v2\/posts\/220\/revisions\/221"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/groups.cs.umass.edu\/equate-ml\/wp-json\/wp\/v2\/media\/180"}],"wp:attachment":[{"href":"https:\/\/groups.cs.umass.edu\/equate-ml\/wp-json\/wp\/v2\/media?parent=220"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/groups.cs.umass.edu\/equate-ml\/wp-json\/wp\/v2\/categories?post=220"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/groups.cs.umass.edu\/equate-ml\/wp-json\/wp\/v2\/tags?post=220"
}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}