{"id":12,"date":"2019-04-30T20:08:06","date_gmt":"2019-04-30T20:08:06","guid":{"rendered":"http:\/\/groups.cs.umass.edu\/shlomo\/?page_id=12"},"modified":"2022-01-02T16:12:06","modified_gmt":"2022-01-02T16:12:06","slug":"research","status":"publish","type":"page","link":"https:\/\/groups.cs.umass.edu\/shlomo\/research\/","title":{"rendered":"Research"},"content":{"rendered":"<p>We study a wide range of problems in artificial intelligence, automated planning and learning, autonomous systems, reasoning under uncertainty, multi-agent systems, and resource-bounded reasoning. We are particularly interested in the implications of uncertainty and limited computational resources on the design of autonomous agents. In most practical settings, it is not feasible or desirable to find the optimal action, making it necessary to resort to some form of approximate reasoning. This raises a fundamental question: what does it mean for an agent to be \u201crational\u201d when it does not have enough knowledge or computational power to derive the best course of action? Our overall approach to this problem involves meta-level control mechanisms that reason explicitly about the cost of decision-making and can optimize the amount of deliberation (or &#8220;thinking&#8221;) an agent does before taking action. 
We have also developed new planning techniques for situations involving multiple decision makers operating in either collaborative or adversarial domains.<\/p>\n<p><img fetchpriority=\"high\" decoding=\"async\" src=\"https:\/\/groups.cs.umass.edu\/shlomo\/wp-content\/uploads\/sites\/19\/2019\/05\/Word-Art-2-1.png\" alt=\"\" width=\"997\" height=\"501\" \/><\/p>\n<h3><span style=\"color: #264278\"><b>Human Compatible AI<\/b><\/span><\/h3>\n<div>How can we design AI systems that are compatible with human needs: accountable, explainable, equitable, ethical, and mindful of human cognitive biases and shortcomings?<\/div>\n<div><div class=\"bg-margin-for-link\"><input type='hidden' bg_collapse_expand='69d0b4f817db49078657981' value='69d0b4f817db49078657981'><input type='hidden' id='bg-show-more-text-69d0b4f817db49078657981' value='Show Related Publications'><input type='hidden' id='bg-show-less-text-69d0b4f817db49078657981' value='Hide Related Publications'><a id='bg-showmore-action-69d0b4f817db49078657981' class='bg-showmore-plg-link bg-arrow '  style=\" color:#7C2622;\" href='#'>Show Related Publications<\/a><div id='bg-showmore-hidden-69d0b4f817db49078657981' ><div class=\"teachpress_pub_list\"><form name=\"tppublistform\" method=\"get\"><a name=\"tppubs\" id=\"tppubs\"><\/a><\/form><table class=\"teachpress_publication_list\"><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Miura, Shuwa;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1188','tp_links')\" style=\"cursor:pointer;\">Observer-Aware Planning with Implicit and Explicit Communication<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS), <\/span><span 
class=\"tp_pub_additional_address\">Auckland, New Zealand, <\/span><span class=\"tp_pub_additional_year\">2024<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1188\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1188','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1188\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1188','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1188\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1188','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1188\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MZaamas24,<br \/>\r\ntitle = {Observer-Aware Planning with Implicit and Explicit Communication},<br \/>\r\nauthor = {Shuwa Miura and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas24.pdf},<br \/>\r\nyear  = {2024},<br \/>\r\ndate = {2024-01-01},<br \/>\r\nbooktitle = {Proceedings of the The 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},<br \/>\r\naddress = {Auckland, New Zealand},<br \/>\r\nabstract = {This paper presents a computational model designed for planning both implicit and explicit communication of intentions, goals, and desires. Building upon previous research focused on implicit communication of intention via actions, our model seeks to strategically influence an observer\u2019s belief using both the agent\u2019s actions and explicit messages. We show that our proposed model can be considered to be a special case of general multi-agent problems with explicit communication under certain assumptions. 
Since the mental state of the observer depends on histories, computing a policy for the proposed model amounts to optimizing a non-Markovian objective, which we show to be intractable in the worst case. To mitigate this challenge, we propose a technique based on splitting domain and communication actions during planning. We conclude with experimental evaluations of the proposed approach that illustrate its effectiveness.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1188','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1188\" style=\"display:none;\"><div class=\"tp_abstract_entry\">This paper presents a computational model designed for planning both implicit and explicit communication of intentions, goals, and desires. Building upon previous research focused on implicit communication of intention via actions, our model seeks to strategically influence an observer\u2019s belief using both the agent\u2019s actions and explicit messages. We show that our proposed model can be considered to be a special case of general multi-agent problems with explicit communication under certain assumptions. Since the mental state of the observer depends on histories, computing a policy for the proposed model amounts to optimizing a non-Markovian objective, which we show to be intractable in the worst case. To mitigate this challenge, we propose a technique based on splitting domain and communication actions during planning. 
We conclude with experimental evaluations of the proposed approach that illustrate its effectiveness.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1188','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1188\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas24.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas24.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas24.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1188','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Mahmud, Saaduddin;  Vazquez-Chanlatte, Marcell;  Witwicki, Stefan;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1189','tp_links')\" style=\"cursor:pointer;\">Explaining the Behavior of POMDP-based Agents Through the Impact of Counterfactual Information<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Auckland, New Zealand, <\/span><span class=\"tp_pub_additional_year\">2024<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1189\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1189','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1189\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1189','tp_links')\" title=\"Show 
links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1189\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1189','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1189\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MVWZaamas24,<br \/>\r\ntitle = {Explaining the Behavior of POMDP-based Agents Through the Impact of Counterfactual Information},<br \/>\r\nauthor = {Saaduddin Mahmud and Marcell Vazquez-Chanlatte and Stefan Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MVWZaamas24.pdf},<br \/>\r\nyear  = {2024},<br \/>\r\ndate = {2024-01-01},<br \/>\r\nbooktitle = {Proceedings of the The 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},<br \/>\r\naddress = {Auckland, New Zealand},<br \/>\r\nabstract = {In this work, we consider AI agents operating in Partially Observable Markov Decision Processes (POMDPs)\u2013a widely-used framework for sequential decision making with incomplete state information. Agents operating with partial information take actions not only to advance their underlying goals but also to seek information and reduce uncertainty. Despite rapid progress in explainable AI, research on separating information-driven vs. goal-driven behaviors remains sparse. To address this gap, we introduce a novel explanation generation framework called Sequential Information Probing (SIP), to investigate the direct impact of state information, or its absence, on agent behavior. To quantify the impact we also propose two metrics under this SIP framework called Value of Information (VoI) and Influence of Information (IoI). We then theoretically derive several properties of these metrics. 
Finally, we present several experiments, including a case study on an autonomous vehicle, that illustrate the efficacy of our method.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1189','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1189\" style=\"display:none;\"><div class=\"tp_abstract_entry\">In this work, we consider AI agents operating in Partially Observable Markov Decision Processes (POMDPs)\u2013a widely-used framework for sequential decision making with incomplete state information. Agents operating with partial information take actions not only to advance their underlying goals but also to seek information and reduce uncertainty. Despite rapid progress in explainable AI, research on separating information-driven vs. goal-driven behaviors remains sparse. To address this gap, we introduce a novel explanation generation framework called Sequential Information Probing (SIP), to investigate the direct impact of state information, or its absence, on agent behavior. To quantify the impact we also propose two metrics under this SIP framework called Value of Information (VoI) and Influence of Information (IoI). We then theoretically derive several properties of these metrics. 
Finally, we present several experiments, including a case study on an autonomous vehicle, that illustrate the efficacy of our method.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1189','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1189\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MVWZaamas24.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MVWZaamas24.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MVWZaamas24.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1189','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Choudhury, Moumita;  Saisubramanian, Sandhya;  Zhang, Hao;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1190','tp_links')\" style=\"cursor:pointer;\">Minimizing Negative Side Effects in Cooperative Multi-Agent Systems Using Distributed Coordination<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Auckland, New Zealand, <\/span><span class=\"tp_pub_additional_year\">2024<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1190\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1190','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1190\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('1190','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1190\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1190','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1190\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:CSZZaamas24,<br \/>\r\ntitle = {Minimizing Negative Side Effects in Cooperative Multi-Agent Systems Using Distributed Coordination},<br \/>\r\nauthor = {Moumita Choudhury and Sandhya Saisubramanian and Hao Zhang and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZaamas24.pdf},<br \/>\r\nyear  = {2024},<br \/>\r\ndate = {2024-01-01},<br \/>\r\nbooktitle = {Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},<br \/>\r\naddress = {Auckland, New Zealand},<br \/>\r\nabstract = {Autonomous agents in real-world environments may encounter undesirable outcomes or negative side effects (NSEs) when working collaboratively alongside other agents. We frame the challenge of minimizing NSEs in a multi-agent setting as a lexicographic decentralized Markov decision process in which we assume independence of rewards and transitions with respect to the primary assigned tasks, but allowing negative side effects to create a form of dependence among the agents. We present a lexicographic Q-learning approach to mitigate the NSEs using human feedback models while maintaining near-optimality with respect to the assigned tasks\u2013up to some given slack. 
Our empirical evaluation across two domains demonstrates that our collaborative approach effectively mitigates NSEs, outperforming non-collaborative methods.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1190','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1190\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous agents in real-world environments may encounter undesirable outcomes or negative side effects (NSEs) when working collaboratively alongside other agents. We frame the challenge of minimizing NSEs in a multi-agent setting as a lexicographic decentralized Markov decision process in which we assume independence of rewards and transitions with respect to the primary assigned tasks, but allowing negative side effects to create a form of dependence among the agents. We present a lexicographic Q-learning approach to mitigate the NSEs using human feedback models while maintaining near-optimality with respect to the assigned tasks\u2013up to some given slack. 
Our empirical evaluation across two domains demonstrates that our collaborative approach effectively mitigates NSEs, outperforming non-collaborative methods.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1190','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1190\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZaamas24.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZaamas24.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZaamas24.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1190','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Choudhury, Moumita;  Saisubramanian, Sandhya;  Zhang, Hao;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1191','tp_links')\" style=\"cursor:pointer;\">Minimizing Negative Side Effects in Cooperative Multi-Agent Systems Using Distributed Coordination<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 37th International FLAIRS Conference, <\/span><span class=\"tp_pub_additional_address\">Miramar Beach, Florida, <\/span><span class=\"tp_pub_additional_year\">2024<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1191\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1191','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1191\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('1191','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1191\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1191','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1191\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:CSZZflairs24,<br \/>\r\ntitle = {Minimizing Negative Side Effects in Cooperative Multi-Agent Systems Using Distributed Coordination},<br \/>\r\nauthor = {Moumita Choudhury and Sandhya Saisubramanian and Hao Zhang and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZflairs24.pdf},<br \/>\r\nyear  = {2024},<br \/>\r\ndate = {2024-01-01},<br \/>\r\nbooktitle = {Proceedings of the 37th International FLAIRS Conference},<br \/>\r\naddress = {Miramar Beach, Florida},<br \/>\r\nabstract = {Autonomous agents operating in real-world environments frequently encounter undesirable outcomes or negative side effects (NSEs) when working collaboratively alongside other agents. Even when agents can execute their primary task optimally when operating in isolation, their training may not account for potential negative interactions that arise in the presence of other agents. We frame the challenge of minimizing NSEs as a lexicographic decentralized Markov decision process in which we assume independence of rewards and transitions with respect to the primary assigned tasks, but recognize that addressing negative side effects creates a form of dependence among the agents. We present a lexicographic Q-learning approach to mitigate the NSEs using human feedback models while maintaining near-optimality with respect to the assigned tasks\u2013up to some given slack. 
Our empirical evaluation across two domains demonstrates that our collaborative approach effectively mitigates NSEs, outperforming non-collaborative methods.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1191','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1191\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous agents operating in real-world environments frequently encounter undesirable outcomes or negative side effects (NSEs) when working collaboratively alongside other agents. Even when agents can execute their primary task optimally when operating in isolation, their training may not account for potential negative interactions that arise in the presence of other agents. We frame the challenge of minimizing NSEs as a lexicographic decentralized Markov decision process in which we assume independence of rewards and transitions with respect to the primary assigned tasks, but recognize that addressing negative side effects creates a form of dependence among the agents. We present a lexicographic Q-learning approach to mitigate the NSEs using human feedback models while maintaining near-optimality with respect to the assigned tasks\u2013up to some given slack. 
Our empirical evaluation across two domains demonstrates that our collaborative approach effectively mitigates NSEs, outperforming non-collaborative methods.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1191','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1191\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZflairs24.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZflairs24.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZflairs24.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1191','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_incollection\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Mahmud, Saaduddin;  Nashed, Samer B.;  Goldman, Claudia V.;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1173','tp_links')\" style=\"cursor:pointer;\">Estimating Causal Responsibility for Explaining Autonomous Behavior<\/a> <span class=\"tp_pub_type tp_  incollection\">Book Section<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span> Calvaresi, Davide (Ed.): <span class=\"tp_pub_additional_booktitle\">International Workshop on Explainable and Transparent AI and Multi-Agent Systems (EXTRAAMAS), <\/span><span class=\"tp_pub_additional_pages\">pp. 
78\u201394, <\/span><span class=\"tp_pub_additional_publisher\">Springer, <\/span><span class=\"tp_pub_additional_year\">2023<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1173\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1173','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1173\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1173','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1173\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1173','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1173\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@incollection{SZ:MNGZextraamas23,<br \/>\r\ntitle = {Estimating Causal Responsibility for Explaining Autonomous Behavior},<br \/>\r\nauthor = {Saaduddin Mahmud and Samer B. Nashed and Claudia V. Goldman and Shlomo Zilberstein},<br \/>\r\neditor = {Davide Calvaresi},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MNGZextraamas23.pdf},<br \/>\r\ndoi = {10.1007\/978-3-031-40878-6},<br \/>\r\nyear  = {2023},<br \/>\r\ndate = {2023-01-01},<br \/>\r\nbooktitle = {International Workshop on Explainable and Transparent AI and Multi-Agent Systems (EXTRAAMAS)},<br \/>\r\npages = {78\u201394},<br \/>\r\npublisher = {Springer},<br \/>\r\nabstract = {There has been growing interest in causal explanations of stochastic, sequential decision-making systems. Structural causal models and causal reasoning offer several theoretical benefits when exact inference can be applied. Furthermore, users overwhelmingly prefer the resulting causal explanations over other state-of-the-art systems. 
In this work, we focus on one such method, MeanRESP, and its approximate versions that drastically reduce compute load and assign a responsibility score to each variable, which helps identify smaller sets of causes to be used as explanations. However, this method, and its approximate versions in particular, lack deeper theoretical analysis and broader empirical tests. To address these shortcomings, we provide three primary contributions. First, we offer several theoretical insights on the sample complexity and error rate of approximate MeanRESP. Second, we discuss several automated metrics for comparing explanations generated from approximate methods to those generated via exact methods. While we recognize the significance of user studies as the gold standard for evaluating explanations, our aim is to leverage the proposed metrics to systematically compare explanation-generation methods along important quantitative dimensions. Finally, we provide a more detailed discussion of MeanRESP and how its output under different definitions of responsibility compares to existing widely adopted methods that use Shapley values.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {incollection}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1173','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1173\" style=\"display:none;\"><div class=\"tp_abstract_entry\">There has been growing interest in causal explanations of stochastic, sequential decision-making systems. Structural causal models and causal reasoning offer several theoretical benefits when exact inference can be applied. Furthermore, users overwhelmingly prefer the resulting causal explanations over other state-of-the-art systems. 
In this work, we focus on one such method, MeanRESP, and its approximate versions that drastically reduce compute load and assign a responsibility score to each variable, which helps identify smaller sets of causes to be used as explanations. However, this method, and its approximate versions in particular, lack deeper theoretical analysis and broader empirical tests. To address these shortcomings, we provide three primary contributions. First, we offer several theoretical insights on the sample complexity and error rate of approximate MeanRESP. Second, we discuss several automated metrics for comparing explanations generated from approximate methods to those generated via exact methods. While we recognize the significance of user studies as the gold standard for evaluating explanations, our aim is to leverage the proposed metrics to systematically compare explanation-generation methods along important quantitative dimensions. Finally, we provide a more detailed discussion of MeanRESP and how its output under different definitions of responsibility compares to existing widely adopted methods that use Shapley values.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1173','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1173\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MNGZextraamas23.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MNGZextraamas23.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MNGZextraamas23.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1007\/978-3-031-40878-6\" title=\"Follow DOI:10.1007\/978-3-031-40878-6\" target=\"_blank\">doi:10.1007\/978-3-031-40878-6<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" 
onclick=\"teachpress_pub_showhide('1173','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Zilberstein, Shlomo;  Kamar, Ece<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1156','tp_links')\" style=\"cursor:pointer;\">Avoiding Negative Side Effects due to Incomplete Knowledge of AI Systems<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">AI Magazine, <\/span><span class=\"tp_pub_additional_volume\">vol. 42, <\/span><span class=\"tp_pub_additional_number\">no. 4, <\/span><span class=\"tp_pub_additional_pages\">pp. 62\u201371, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1156\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1156','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1156\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1156','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1156\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1156','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1156\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:SZKaimag22,<br \/>\r\ntitle = {Avoiding Negative Side Effects due to Incomplete Knowledge of AI Systems},<br \/>\r\nauthor = {Sandhya Saisubramanian and Shlomo Zilberstein and Ece Kamar},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZKaimag22.pdf},<br \/>\r\ndoi = 
{10.1609\/aaai.12028},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\nurldate = {2022-01-01},<br \/>\r\njournal = {AI Magazine},<br \/>\r\nvolume = {42},<br \/>\r\nnumber = {4},<br \/>\r\npages = {62--71},<br \/>\r\nabstract = {Autonomous agents acting in the real-world often operate based on models that ignore certain aspects of the environment. The incompleteness of any given model \u2013 handcrafted or machine acquired \u2013 is inevitable due to practical limitations of any modeling technique for complex real-world settings. Due to the limited fidelity of its model, an agent\u2019s actions may have unexpected, undesirable consequences during execution. Learning to recognize and avoid such negative side effects (NSEs) of an agent\u2019s actions is critical to improve the safety and reliability of autonomous systems. Mitigating NSEs is an emerging research topic that is attracting increased attention due to the rapid growth in the deployment of AI systems and their broad societal impacts. This article provides a comprehensive overview of different forms of NSEs and the recent research efforts to address them. We identify key characteristics of NSEs, highlight the challenges in avoiding NSEs, and discuss recently developed approaches, contrasting their benefits and limitations. The article concludes with a discussion of open questions and suggestions for future research directions.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1156','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1156\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous agents acting in the real-world often operate based on models that ignore certain aspects of the environment. 
The incompleteness of any given model \u2013 handcrafted or machine acquired \u2013 is inevitable due to practical limitations of any modeling technique for complex real-world settings. Due to the limited fidelity of its model, an agent\u2019s actions may have unexpected, undesirable consequences during execution. Learning to recognize and avoid such negative side effects (NSEs) of an agent\u2019s actions is critical to improve the safety and reliability of autonomous systems. Mitigating NSEs is an emerging research topic that is attracting increased attention due to the rapid growth in the deployment of AI systems and their broad societal impacts. This article provides a comprehensive overview of different forms of NSEs and the recent research efforts to address them. We identify key characteristics of NSEs, highlight the challenges in avoiding NSEs, and discuss recently developed approaches, contrasting their benefits and limitations. The article concludes with a discussion of open questions and suggestions for future research directions.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1156','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1156\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZKaimag22.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZKaimag22.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZKaimag22.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1609\/aaai.12028\" title=\"Follow DOI:10.1609\/aaai.12028\" target=\"_blank\">doi:10.1609\/aaai.12028<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1156','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication 
tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Zilberstein, Shlomo;  Kamar, Ece<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1157','tp_links')\" style=\"cursor:pointer;\">Avoiding Negative Side Effects of Autonomous Systems in the Open World<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Journal of Artificial Intelligence Research (JAIR), <\/span><span class=\"tp_pub_additional_volume\">vol. 74, <\/span><span class=\"tp_pub_additional_pages\">pp. 143\u2013177, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1157\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1157','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1157\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1157','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1157\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1157','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1157\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:SZKjair22,<br \/>\r\ntitle = {Avoiding Negative Side Effects of Autonomous Systems in the Open World},<br \/>\r\nauthor = {Sandhya Saisubramanian and Shlomo Zilberstein and Ece Kamar},<br \/>\r\nurl = {https:\/\/www.jair.org\/index.php\/jair\/article\/view\/13581\/26799},<br \/>\r\ndoi = {10.1613\/jair.1.13581},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\nurldate = {2022-01-01},<br \/>\r\njournal = 
{Journal of Artificial Intelligence Research (JAIR)},<br \/>\r\nvolume = {74},<br \/>\r\npages = {143--177},<br \/>\r\nabstract = {Autonomous systems that operate in the open world often use incomplete models of their environment. Model incompleteness is inevitable due to the practical limitations in precise model specification and data collection about open-world environments. Due to the limited fidelity of the model, agent actions may produce negative side effects (NSEs) when deployed. Negative side effects are undesirable, unmodeled effects of agent actions on the environment. NSEs are inherently challenging to identify at design time and may affect the reliability, usability and safety of the system. We present two complementary approaches to mitigate the NSE via: (1) learning from feedback, and (2) environment shaping. The solution approaches target settings with different assumptions and agent responsibilities. In learning from feedback, the agent learns a penalty function associated with a NSE. We investigate the efficiency of different feedback mechanisms, including human feedback and autonomous exploration. The problem is formulated as a multi-objective Markov decision process such that optimizing the agent\u2019s assigned task is prioritized over mitigating NSE. A slack parameter denotes the maximum allowed deviation from the optimal expected reward for the agent\u2019s task in order to mitigate NSE. In environment shaping, we examine how a human can assist an agent, beyond providing feedback, and utilize their broader scope of knowledge to mitigate the impacts of NSE. We formulate the problem as a human-agent collaboration with decoupled objectives. The agent optimizes its assigned task and may produce NSE during its operation. The human assists the agent by performing modest reconfigurations of the environment so as to mitigate the impacts of NSE, without affecting the agent\u2019s ability to complete its assigned task. 
We present an algorithm for shaping and analyze its properties. Empirical evaluations demonstrate the trade-offs in the performance of different approaches in mitigating NSE in different settings.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1157','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1157\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous systems that operate in the open world often use incomplete models of their environment. Model incompleteness is inevitable due to the practical limitations in precise model specification and data collection about open-world environments. Due to the limited fidelity of the model, agent actions may produce negative side effects (NSEs) when deployed. Negative side effects are undesirable, unmodeled effects of agent actions on the environment. NSEs are inherently challenging to identify at design time and may affect the reliability, usability and safety of the system. We present two complementary approaches to mitigate the NSE via: (1) learning from feedback, and (2) environment shaping. The solution approaches target settings with different assumptions and agent responsibilities. In learning from feedback, the agent learns a penalty function associated with a NSE. We investigate the efficiency of different feedback mechanisms, including human feedback and autonomous exploration. The problem is formulated as a multi-objective Markov decision process such that optimizing the agent\u2019s assigned task is prioritized over mitigating NSE. A slack parameter denotes the maximum allowed deviation from the optimal expected reward for the agent\u2019s task in order to mitigate NSE. 
In environment shaping, we examine how a human can assist an agent, beyond providing feedback, and utilize their broader scope of knowledge to mitigate the impacts of NSE. We formulate the problem as a human-agent collaboration with decoupled objectives. The agent optimizes its assigned task and may produce NSE during its operation. The human assists the agent by performing modest reconfigurations of the environment so as to mitigate the impacts of NSE, without affecting the agent\u2019s ability to complete its assigned task. We present an algorithm for shaping and analyze its properties. Empirical evaluations demonstrate the trade-offs in the performance of different approaches in mitigating NSE in different settings.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1157','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1157\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/www.jair.org\/index.php\/jair\/article\/view\/13581\/26799\" title=\"https:\/\/www.jair.org\/index.php\/jair\/article\/view\/13581\/26799\" target=\"_blank\">https:\/\/www.jair.org\/index.php\/jair\/article\/view\/13581\/26799<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1613\/jair.1.13581\" title=\"Follow DOI:10.1613\/jair.1.13581\" target=\"_blank\">doi:10.1613\/jair.1.13581<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1157','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Miura, Shuwa;  Wray, Kyle Hollins;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1161','tp_links')\" style=\"cursor:pointer;\">Heuristic Search for SSPs with 
Lexicographic Preferences over Multiple Costs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 15th Annual Symposium on Combinatorial Search (SOCS), <\/span><span class=\"tp_pub_additional_address\">Vienna, Austria, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1161\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1161','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1161\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1161','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1161\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1161','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1161\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MWZsocs22,<br \/>\r\ntitle = {Heuristic Search for SSPs with Lexicographic Preferences over Multiple Costs},<br \/>\r\nauthor = {Shuwa Miura and Kyle Hollins Wray and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MWZsocs22.pdf},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\nbooktitle = {Proceedings of the 15th Annual Symposium on Combinatorial Search (SOCS)},<br \/>\r\naddress = {Vienna, Austria},<br \/>\r\nabstract = {Real-world decision problems often involve multiple competing objectives. The Stochastic Shortest Path (SSP) with lexicographic preferences over multiple costs offers an expressive formulation for many practical problems. 
However, the existing solution methods either lack optimality guarantees or require costly computations over the entire state space. We propose the first heuristic search algorithm for this problem, based on the heuristic algorithm for Constrained SSPs. Our experiments show that our heuristic search algorithm can compute optimal policies while avoiding a large portion of the state space. We also analyze the theoretical properties of the problem, establishing the conditions under which SSPs with lexicographic preferences have a proper optimal policy.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1161','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1161\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Real-world decision problems often involve multiple competing objectives. The Stochastic Shortest Path (SSP) with lexicographic preferences over multiple costs offers an expressive formulation for many practical problems. However, the existing solution methods either lack optimality guarantees or require costly computations over the entire state space. We propose the first heuristic search algorithm for this problem, based on the heuristic algorithm for Constrained SSPs. Our experiments show that our heuristic search algorithm can compute optimal policies while avoiding a large portion of the state space. 
We also analyze the theoretical properties of the problem, establishing the conditions under which SSPs with lexicographic preferences have a proper optimal policy.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1161','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1161\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MWZsocs22.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MWZsocs22.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MWZsocs22.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1161','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Svegliato, Justin;  Nashed, Samer B;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1136','tp_links')\" style=\"cursor:pointer;\">Ethically Compliant Sequential Decision Making<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 35th Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_year\">2021<\/span><span class=\"tp_pub_additional_note\">, (Distinguished Paper Award)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1136\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1136','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1136\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1136','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1136\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1136','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1136\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SNZaaai21,<br \/>\r\ntitle = {Ethically Compliant Sequential Decision Making},<br \/>\r\nauthor = {Justin Svegliato and Samer B Nashed and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SNZaaai21.pdf},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the 35th Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {11657--11665},<br \/>\r\nabstract = {Enabling autonomous systems to comply with an ethical theory is critical given their accelerating deployment in domains that impact society. While many ethical theories have been studied extensively in moral philosophy, they are still challenging to implement by developers who build autonomous systems. This paper proposes a novel approach for building ethically compliant autonomous systems that optimize completing a task while following an ethical framework. First, we introduce a definition of an ethically compliant autonomous system and its properties. Next, we offer a range of ethical frameworks for divine command theory, prima facie duties, and virtue ethics. 
Finally, we demonstrate the accuracy and usability of our approach in a set of autonomous driving simulations and a user study of planning and robotics experts.},<br \/>\r\nnote = {Distinguished Paper Award},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1136','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1136\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Enabling autonomous systems to comply with an ethical theory is critical given their accelerating deployment in domains that impact society. While many ethical theories have been studied extensively in moral philosophy, they are still challenging to implement by developers who build autonomous systems. This paper proposes a novel approach for building ethically compliant autonomous systems that optimize completing a task while following an ethical framework. First, we introduce a definition of an ethically compliant autonomous system and its properties. Next, we offer a range of ethical frameworks for divine command theory, prima facie duties, and virtue ethics. 
Finally, we demonstrate the accuracy and usability of our approach in a set of autonomous driving simulations and a user study of planning and robotics experts.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1136','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1136\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SNZaaai21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SNZaaai21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SNZaaai21.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1136','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Miura, Shuwa;  Cohen, Andrew L;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1145','tp_links')\" style=\"cursor:pointer;\">Maximizing Legibility in Stochastic Environments<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 30th IEEE International Conference on Robot &amp; Human Interactive Communication, (RO-MAN), <\/span><span class=\"tp_pub_additional_address\">Vancouver, BC, Canada, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1145\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1145','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1145\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1145','tp_links')\" title=\"Show links and 
resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1145\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1145','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1145\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MCZroman21,<br \/>\r\ntitle = {Maximizing Legibility in Stochastic Environments},<br \/>\r\nauthor = {Shuwa Miura and Andrew L Cohen and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MCZroman21.pdf},<br \/>\r\ndoi = {10.1109\/RO-MAN50785.2021.9515318},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the 30th IEEE International Conference on Robot & Human Interactive Communication, (RO-MAN)},<br \/>\r\npages = {1053--1059},<br \/>\r\naddress = {Vancouver, BC, Canada},<br \/>\r\nabstract = {Making an agent's intentions clear from its observed behavior is crucial for seamless human-agent interaction and for increased transparency and trust in AI systems. Existing methods that address this challenge and maximize legibility of behaviors are limited to deterministic domains. We develop a technique for maximizing legibility in stochastic environments and illustrate that using legibility as an objective improves interpretability of agent behavior in several scenarios. 
We provide initial empirical evidence that human subjects can better interpret legible behavior.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1145','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1145\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Making an agent's intentions clear from its observed behavior is crucial for seamless human-agent interaction and for increased transparency and trust in AI systems. Existing methods that address this challenge and maximize legibility of behaviors are limited to deterministic domains. We develop a technique for maximizing legibility in stochastic environments and illustrate that using legibility as an objective improves interpretability of agent behavior in several scenarios. We provide initial empirical evidence that human subjects can better interpret legible behavior.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1145','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1145\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MCZroman21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MCZroman21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MCZroman21.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1109\/RO-MAN50785.2021.9515318\" title=\"Follow DOI:10.1109\/RO-MAN50785.2021.9515318\" target=\"_blank\">doi:10.1109\/RO-MAN50785.2021.9515318<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1145','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr 
class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Miura, Shuwa;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1146','tp_links')\" style=\"cursor:pointer;\">A Unifying Framework for Observer-Aware Planning and its Complexity<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 37th Conference on Uncertainty in Artificial Intelligence (UAI), <\/span><span class=\"tp_pub_additional_address\">Virtual Event, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1146\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1146','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1146\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1146','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1146\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1146','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1146\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MZuai21,<br \/>\r\ntitle = {A Unifying Framework for Observer-Aware Planning and its Complexity},<br \/>\r\nauthor = {Shuwa Miura and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZuai21.pdf},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the 37th Conference on Uncertainty in Artificial Intelligence (UAI)},<br \/>\r\npages = {610--620},<br \/>\r\naddress = {Virtual Event},<br \/>\r\nabstract = {Being aware of 
observers and the inferences they make about an agent's behavior is crucial for successful multi-agent interaction. Existing works on observer-aware planning use different assumptions and techniques to produce observer-aware behaviors. We argue that observer-aware planning, in its most general form, can be modeled as an Interactive POMDP (I-POMDP), which requires complex modeling and is hard to solve. Hence, we introduce a less complex framework for producing observer-aware behaviors called Observer-Aware MDP (OAMDP) and analyze its relationship to I-POMDP. We establish the complexity of OAMDPs and show that they can improve interpretability of agent behaviors in several scenarios.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1146','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1146\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Being aware of observers and the inferences they make about an agent's behavior is crucial for successful multi-agent interaction. Existing works on observer-aware planning use different assumptions and techniques to produce observer-aware behaviors. We argue that observer-aware planning, in its most general form, can be modeled as an Interactive POMDP (I-POMDP), which requires complex modeling and is hard to solve. Hence, we introduce a less complex framework for producing observer-aware behaviors called Observer-Aware MDP (OAMDP) and analyze its relationship to I-POMDP. 
We establish the complexity of OAMDPs and show that they can improve interpretability of agent behaviors in several scenarios.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1146','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1146\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZuai21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZuai21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZuai21.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1146','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Rabiee, Sadegh;  Basich, Connor;  Wray, Kyle Hollins;  Zilberstein, Shlomo;  Biswas, Joydeep<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1147','tp_links')\" style=\"cursor:pointer;\">Competence-Aware Path Planning via Introspective Perception<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">CoRR, <\/span><span class=\"tp_pub_additional_volume\">vol. 
abs\/2109.13974, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1147\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1147','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1147\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1147','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1147\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1147','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1147\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:SZarXiv21c,<br \/>\r\ntitle = {Competence-Aware Path Planning via Introspective Perception},<br \/>\r\nauthor = {Sadegh Rabiee and Connor Basich and Kyle Hollins Wray and Shlomo Zilberstein and Joydeep Biswas},<br \/>\r\nurl = {https:\/\/arxiv.org\/abs\/2109.13974},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\njournal = {CoRR},<br \/>\r\nvolume = {abs\/2109.13974},<br \/>\r\nabstract = {Robots deployed in the real world over extended periods of time need to reason about unexpected failures, learn to predict them, and to proactively take actions to avoid future failures. Existing approaches for competence-aware planning are either model-based, requiring explicit enumeration of known failure modes, or purely statistical, using state- and location-specific failure statistics to infer competence. We instead propose a structured model-free approach to competence-aware planning by reasoning about plan execution failures due to errors in perception, without requiring a-priori enumeration of failure modes or requiring location-specific failure statistics. 
We introduce competence-aware path planning via introspective perception (CPIP), a Bayesian framework to iteratively learn and exploit task-level competence in novel deployment environments. CPIP factorizes the competence-aware planning problem into two components. First, perception errors are learned in a model-free and location-agnostic setting via introspective perception prior to deployment in novel environments. Second, during actual deployments, the prediction of task-level failures is learned in a context-aware setting. Experiments in a simulation show that the proposed CPIP approach outperforms the frequentist baseline in multiple mobile robot tasks, and is further validated via real robot experiments in an environment with perceptually challenging obstacles and terrain.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1147','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1147\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Robots deployed in the real world over extended periods of time need to reason about unexpected failures, learn to predict them, and to proactively take actions to avoid future failures. Existing approaches for competence-aware planning are either model-based, requiring explicit enumeration of known failure modes, or purely statistical, using state- and location-specific failure statistics to infer competence. We instead propose a structured model-free approach to competence-aware planning by reasoning about plan execution failures due to errors in perception, without requiring a-priori enumeration of failure modes or requiring location-specific failure statistics. 
We introduce competence-aware path planning via introspective perception (CPIP), a Bayesian framework to iteratively learn and exploit task-level competence in novel deployment environments. CPIP factorizes the competence-aware planning problem into two components. First, perception errors are learned in a model-free and location-agnostic setting via introspective perception prior to deployment in novel environments. Second, during actual deployments, the prediction of task-level failures is learned in a context-aware setting. Experiments in a simulation show that the proposed CPIP approach outperforms the frequentist baseline in multiple mobile robot tasks, and is further validated via real robot experiments in an environment with perceptually challenging obstacles and terrain.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1147','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1147\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"ai ai-arxiv\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/arxiv.org\/abs\/2109.13974\" title=\"https:\/\/arxiv.org\/abs\/2109.13974\" target=\"_blank\">https:\/\/arxiv.org\/abs\/2109.13974<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1147','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Nashed, Samer B;  Svegliato, Justin;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1133','tp_links')\" style=\"cursor:pointer;\">Ethically Compliant Planning within Moral Communities<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the AAAI\/ACM Conference on AI, Ethics, and Society (AIES), 
<\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1133\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1133','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1133\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1133','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1133\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1133','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1133\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:NSZaies21,<br \/>\r\ntitle = {Ethically Compliant Planning within Moral Communities},<br \/>\r\nauthor = {Samer B Nashed and Justin Svegliato and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSZaies21.pdf},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the AAAI\/ACM Conference on AI, Ethics, and Society (AIES)},<br \/>\r\nabstract = {Ethically compliant autonomous systems (ECAS) are the state-of- the-art for solving sequential decision-making problems under un- certainty while respecting constraints that encode ethical considerations. This paper defines a novel concept in the context of ECAS that is from moral philosophy, the moral community, which leads to a nuanced taxonomy of explicit ethical agents. We then propose new ethical frameworks that extend the applicability of ECAS to domains where a moral community is required. Next, we provide a formal analysis of the proposed ethical frameworks and conduct experiments that illustrate their differences. 
Finally, we discuss the implications of explicit moral communities that could shape research on standards and guidelines for ethical agents in order to better understand and predict common errors in their design and communicate their capabilities.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1133','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1133\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Ethically compliant autonomous systems (ECAS) are the state-of-the-art for solving sequential decision-making problems under uncertainty while respecting constraints that encode ethical considerations. This paper defines a novel concept in the context of ECAS that is from moral philosophy, the moral community, which leads to a nuanced taxonomy of explicit ethical agents. We then propose new ethical frameworks that extend the applicability of ECAS to domains where a moral community is required. Next, we provide a formal analysis of the proposed ethical frameworks and conduct experiments that illustrate their differences. 
Finally, we discuss the implications of explicit moral communities that could shape research on standards and guidelines for ethical agents in order to better understand and predict common errors in their design and communicate their capabilities.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1133','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1133\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSZaies21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSZaies21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSZaies21.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1133','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Galhotra, Sainyam;  Saisubramanian, Sandhya;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1132','tp_links')\" style=\"cursor:pointer;\">Learning to Generate Fair Clusters from Demonstrations<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the AAAI\/ACM Conference on AI, Ethics, and Society (AIES), <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1132\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1132','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1132\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1132','tp_links')\" title=\"Show links and 
resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1132\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1132','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1132\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:GSZaies21,<br \/>\r\ntitle = {Learning to Generate Fair Clusters from Demonstrations},<br \/>\r\nauthor = {Sainyam Galhotra and Sandhya Saisubramanian and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GSZaies21.pdf},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the AAAI\/ACM Conference on AI, Ethics, and Society (AIES)},<br \/>\r\nabstract = {Fair clustering is the process of grouping similar entities together, while satisfying a mathematically well-defined fairness metric as a constraint. Due to the practical challenges in precise model specification, the prescribed fairness constraints are often incomplete and act as proxies to the intended fairness requirement. Clustering with proxies may lead to biased outcomes when the system is deployed. We examine how to identify the intended fairness constraint for a problem based on limited demonstrations from an expert. Each demonstration is a clustering over a subset of the data. We present an algorithm to identify the fairness metric from demonstrations and generate clusters using existing off-the-shelf clustering techniques, and analyze its theoretical properties. To extend our approach to novel fairness metrics for which clustering algorithms do not currently exist, we present a greedy method for clustering. Additionally, we investigate how to generate interpretable solutions using our approach. 
Empirical evaluation on three real-world datasets demonstrates the effectiveness of our approach in quickly identifying the underlying fairness and interpretability constraints, which are then used to generate fair and interpretable clusters.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1132','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1132\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Fair clustering is the process of grouping similar entities together, while satisfying a mathematically well-defined fairness metric as a constraint. Due to the practical challenges in precise model specification, the prescribed fairness constraints are often incomplete and act as proxies to the intended fairness requirement. Clustering with proxies may lead to biased outcomes when the system is deployed. We examine how to identify the intended fairness constraint for a problem based on limited demonstrations from an expert. Each demonstration is a clustering over a subset of the data. We present an algorithm to identify the fairness metric from demonstrations and generate clusters using existing off-the-shelf clustering techniques, and analyze its theoretical properties. To extend our approach to novel fairness metrics for which clustering algorithms do not currently exist, we present a greedy method for clustering. Additionally, we investigate how to generate interpretable solutions using our approach. 
Empirical evaluation on three real-world datasets demonstrates the effectiveness of our approach in quickly identifying the underlying fairness and interpretability constraints, which are then used to generate fair and interpretable clusters.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1132','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1132\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GSZaies21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GSZaies21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GSZaies21.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1132','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Galhotra, Sainyam;  Saisubramanian, Sandhya;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1138','tp_links')\" style=\"cursor:pointer;\">Learning to Generate Fair Clusters from Demonstrations<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">CoRR, <\/span><span class=\"tp_pub_additional_volume\">vol. 
abs\/2102.03977, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1138\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1138','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1138\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1138','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1138\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1138','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1138\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:GSZarXiv21a,<br \/>\r\ntitle = {Learning to Generate Fair Clusters from Demonstrations},<br \/>\r\nauthor = {Sainyam Galhotra and Sandhya Saisubramanian and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/arxiv.org\/abs\/2102.03977},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\njournal = {CoRR},<br \/>\r\nvolume = {abs\/2102.03977},<br \/>\r\nabstract = {Fair clustering is the process of grouping similar entities together, while satisfying a mathematically well-defined fairness metric as a constraint. Due to the practical challenges in precise model specification, the prescribed fairness constraints are often incomplete and act as proxies to the intended fairness requirement, leading to biased outcomes when the system is deployed. We examine how to identify the intended fairness constraint for a problem based on limited demonstrations from an expert. Each demonstration is a clustering over a subset of the data. 
We present an algorithm to identify the fairness metric from demonstrations and generate clusters using existing off-the-shelf clustering techniques, and analyze its theoretical properties. To extend our approach to novel fairness metrics for which clustering algorithms do not currently exist, we present a greedy method for clustering. Additionally, we investigate how to generate interpretable solutions using our approach. Empirical evaluation on three real-world datasets demonstrates the effectiveness of our approach in quickly identifying the underlying fairness and interpretability constraints, which are then used to generate fair and interpretable clusters.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1138','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1138\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Fair clustering is the process of grouping similar entities together, while satisfying a mathematically well-defined fairness metric as a constraint. Due to the practical challenges in precise model specification, the prescribed fairness constraints are often incomplete and act as proxies to the intended fairness requirement, leading to biased outcomes when the system is deployed. We examine how to identify the intended fairness constraint for a problem based on limited demonstrations from an expert. Each demonstration is a clustering over a subset of the data. We present an algorithm to identify the fairness metric from demonstrations and generate clusters using existing off-the-shelf clustering techniques, and analyze its theoretical properties. To extend our approach to novel fairness metrics for which clustering algorithms do not currently exist, we present a greedy method for clustering. 
Additionally, we investigate how to generate interpretable solutions using our approach. Empirical evaluation on three real-world datasets demonstrates the effectiveness of our approach in quickly identifying the underlying fairness and interpretability constraints, which are then used to generate fair and interpretable clusters.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1138','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1138\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"ai ai-arxiv\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/arxiv.org\/abs\/2102.03977\" title=\"https:\/\/arxiv.org\/abs\/2102.03977\" target=\"_blank\">https:\/\/arxiv.org\/abs\/2102.03977<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1138','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1137','tp_links')\" style=\"cursor:pointer;\">Mitigating Negative Side Effects via Environment Shaping<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">CoRR, <\/span><span class=\"tp_pub_additional_volume\">vol. 
abs\/2102.07017, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1137\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1137','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1137\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1137','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1137\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1137','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1137\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:SZarXiv21b,<br \/>\r\ntitle = {Mitigating Negative Side Effects via Environment Shaping},<br \/>\r\nauthor = {Sandhya Saisubramanian and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/arxiv.org\/abs\/2102.07017},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\njournal = {CoRR},<br \/>\r\nvolume = {abs\/2102.07017},<br \/>\r\nabstract = {Agents operating in unstructured environments often produce negative side effects (NSE), which are difficult to identify at design time. While the agent can learn to mitigate the side effects from human feedback, such feedback is often expensive and the rate of learning is sensitive to the agent's state representation. We examine how humans can assist an agent, beyond providing feedback, and exploit their broader scope of knowledge to mitigate the impacts of NSE. We formulate this problem as a human-agent team with decoupled objectives. The agent optimizes its assigned task, during which its actions may produce NSE. 
The human shapes the environment through minor reconfiguration actions so as to mitigate the impacts of the agent's side effects, without affecting the agent's ability to complete its assigned task. We present an algorithm to solve this problem and analyze its theoretical properties. Through experiments with human subjects, we assess the willingness of users to perform minor environment modifications to mitigate the impacts of NSE. Empirical evaluation of our approach shows that the proposed framework can successfully mitigate NSE, without affecting the agent's ability to complete its assigned task.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1137','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1137\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Agents operating in unstructured environments often produce negative side effects (NSE), which are difficult to identify at design time. While the agent can learn to mitigate the side effects from human feedback, such feedback is often expensive and the rate of learning is sensitive to the agent's state representation. We examine how humans can assist an agent, beyond providing feedback, and exploit their broader scope of knowledge to mitigate the impacts of NSE. We formulate this problem as a human-agent team with decoupled objectives. The agent optimizes its assigned task, during which its actions may produce NSE. The human shapes the environment through minor reconfiguration actions so as to mitigate the impacts of the agent's side effects, without affecting the agent's ability to complete its assigned task. We present an algorithm to solve this problem and analyze its theoretical properties. 
Through experiments with human subjects, we assess the willingness of users to perform minor environment modifications to mitigate the impacts of NSE. Empirical evaluation of our approach shows that the proposed framework can successfully mitigate NSE, without affecting the agent's ability to complete its assigned task.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1137','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1137\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"ai ai-arxiv\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/arxiv.org\/abs\/2102.07017\" title=\"https:\/\/arxiv.org\/abs\/2102.07017\" target=\"_blank\">https:\/\/arxiv.org\/abs\/2102.07017<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1137','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1135','tp_links')\" style=\"cursor:pointer;\">Mitigating Negative Side Effects via Environment Shaping (Extended Abstract)<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1135\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1135','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1135\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('1135','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1135\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1135','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1135\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SZaamas21,<br \/>\r\ntitle = {Mitigating Negative Side Effects via Environment Shaping (Extended Abstract)},<br \/>\r\nauthor = {Sandhya Saisubramanian and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZaamas21.pdf},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS)},<br \/>\r\nabstract = {Agents operating in the open world often produce negative side effects (NSE), which are difficult to identify at design time. We examine how a human can assist an agent, beyond providing feedback, and exploit their broader scope of knowledge to mitigate the impacts of NSE. We formulate this problem as a human-agent team with decoupled objectives. The agent optimizes its assigned task, during which its actions may produce NSE. The human shapes the environment through minor reconfiguration actions so as to mitigate the impacts of agent's side effects, without significantly degrading agent performance. We present an algorithm to solve this problem. 
Empirical evaluation shows that the proposed framework can successfully mitigate NSE, without affecting the agent\u2019s ability to complete its assigned task.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1135','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1135\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Agents operating in the open world often produce negative side effects (NSE), which are difficult to identify at design time. We examine how a human can assist an agent, beyond providing feedback, and exploit their broader scope of knowledge to mitigate the impacts of NSE. We formulate this problem as a human-agent team with decoupled objectives. The agent optimizes its assigned task, during which its actions may produce NSE. The human shapes the environment through minor reconfiguration actions so as to mitigate the impacts of the agent's side effects, without significantly degrading agent performance. We present an algorithm to solve this problem. 
Empirical evaluation shows that the proposed framework can successfully mitigate NSE, without affecting the agent\u2019s ability to complete its assigned task.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1135','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1135\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZaamas21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZaamas21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZaamas21.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1135','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Roberts, Shannon C;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1134','tp_links')\" style=\"cursor:pointer;\">Understanding User Attitudes Towards Negative Side Effects of AI Systems<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">CHI Conference on Human Factors in Computing Systems, Late-Breaking Work, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1134\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1134','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1134\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1134','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1134\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1134','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1134\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SRZchi21,<br \/>\r\ntitle = {Understanding User Attitudes Towards Negative Side Effects of AI Systems},<br \/>\r\nauthor = {Sandhya Saisubramanian and Shannon C Roberts and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SRZchi21.pdf},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {CHI Conference on Human Factors in Computing Systems, Late-Breaking Work},<br \/>\r\npages = {368:1--368:6},<br \/>\r\nabstract = {Artificial Intelligence (AI) systems deployed in the open world may produce negative side effects\u2014which are unanticipated, undesirable outcomes that occur in addition to the intended outcomes of the system\u2019s actions. These negative side effects affect users directly or indirectly, by violating their preferences or altering their environment in an undesirable, potentially harmful, manner. While the existing literature has started to explore techniques to overcome the impacts of negative side effects in deployed systems, there has been no prior efforts to determine how users perceive and respond to negative side effects. We surveyed 183 participants to develop an understanding of user attitudes towards side effects and how side effects impact user trust in the system. The surveys targeted two domains: an autonomous vacuum cleaner and an autonomous vehicle, each with 183 respondents. The results indicate that users are willing to tolerate side effects that are not safety-critical but prefer to minimize them as much as possible. 
Furthermore, users are willing to assist the system in mitigating negative side effects by providing feedback and reconfiguring the environment. Trust in the system diminishes if it fails to minimize the impacts of negative side effects over time. These results support key fundamental assumptions in existing techniques and facilitate the development of new methods to overcome negative side effects of AI systems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1134','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1134\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Artificial Intelligence (AI) systems deployed in the open world may produce negative side effects\u2014which are unanticipated, undesirable outcomes that occur in addition to the intended outcomes of the system\u2019s actions. These negative side effects affect users directly or indirectly, by violating their preferences or altering their environment in an undesirable, potentially harmful, manner. While the existing literature has started to explore techniques to overcome the impacts of negative side effects in deployed systems, there have been no prior efforts to determine how users perceive and respond to negative side effects. We surveyed 183 participants to develop an understanding of user attitudes towards side effects and how side effects impact user trust in the system. The surveys targeted two domains: an autonomous vacuum cleaner and an autonomous vehicle, each with 183 respondents. The results indicate that users are willing to tolerate side effects that are not safety-critical but prefer to minimize them as much as possible. Furthermore, users are willing to assist the system in mitigating negative side effects by providing feedback and reconfiguring the environment. 
Trust in the system diminishes if it fails to minimize the impacts of negative side effects over time. These results support key fundamental assumptions in existing techniques and facilitate the development of new methods to overcome negative side effects of AI systems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1134','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1134\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SRZchi21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SRZchi21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SRZchi21.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1134','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Woolf, Beverly;  Ghosh, Aritra;  Lan, Andrew;  Zilberstein, Shlomo;  Juravich, Tom;  Cohen, Andrew;  Geho, Olivia<\/p><p class=\"tp_pub_title\">AI-Enabled Training in Manufacturing Workforce Development <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">AAAI Spring Symposium on Artificial Intelligence in Manufacturing, <\/span><span class=\"tp_pub_additional_year\">2020<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1116\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1116','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1116\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1116','tp_bibtex')\" title=\"Show BibTeX entry\" 
style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1116\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WGLZJCGspring20,<br \/>\r\ntitle = {AI-Enabled Training in Manufacturing Workforce Development},<br \/>\r\nauthor = {Beverly Woolf and Aritra Ghosh and Andrew Lan and Shlomo Zilberstein and Tom Juravich and Andrew Cohen and Olivia Geho},<br \/>\r\nyear  = {2020},<br \/>\r\ndate = {2020-01-01},<br \/>\r\nbooktitle = {AAAI Spring Symposium on Artificial Intelligence in Manufacturing},<br \/>\r\nabstract = {A highly productive workforce can evolve with the integration of digital devices, such as computer interfaces to operating machines, interconnected smart devices, and robots, in the workplace. However, this potential cannot be realized with the current state-of-the-art systems used to train workers. This problem is acute in manufacturing, where huge skills gaps are evident; most workers lack the necessary skills to operate or collaborate with autonomous systems. We propose to address this problem by using intelligent tutoring systems and worker data analysis. The worker data includes: i) fine-grained on-job performance data, ii) career path data containing the entire career paths of workers, and iii) job posting data over a long period of time indicating the required skills for each job. We will collect and analyze worker data and use it to drive new methods for training and reskilling workers. We detail ideas and tools to be developed by research in intelligent tutoring systems, data science, manufacturing, sociology, labor analysis, education, psychology, and economics. We also describe a convergent approach to developing effective, fair, and scalable software solutions and dynamic intelligent training. 
},<br \/>\r\naddress = {Stanford, California},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1116','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1116\" style=\"display:none;\"><div class=\"tp_abstract_entry\">A highly productive workforce can evolve with the integration of digital devices, such as computer interfaces to operating machines, interconnected smart devices, and robots, in the workplace. However, this potential cannot be realized with the current state-of-the-art systems used to train workers. This problem is acute in manufacturing, where huge skills gaps are evident; most workers lack the necessary skills to operate or collaborate with autonomous systems. We propose to address this problem by using intelligent tutoring systems and worker data analysis. The worker data includes: i) fine-grained on-job performance data, ii) career path data containing the entire career paths of workers, and iii) job posting data over a long period of time indicating the required skills for each job. We will collect and analyze worker data and use it to drive new methods for training and reskilling workers. We detail ideas and tools to be developed by research in intelligent tutoring systems, data science, manufacturing, sociology, labor analysis, education, psychology, and economics. We also describe a convergent approach to developing effective, fair, and scalable software solutions and dynamic intelligent training. 
<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1116','tp_abstract')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Renski, Henry;  Smith-Doerr, Laurel;  Wilkerson, Tiamba;  Roberts, Shannon C;  Zilberstein, Shlomo;  Branch, Enobong H<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1123','tp_links')\" style=\"cursor:pointer;\">Racial Equity and the Future of Work<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Technology| Architecture+ Design, <\/span><span class=\"tp_pub_additional_volume\">vol. 4, <\/span><span class=\"tp_pub_additional_number\">no. 1, <\/span><span class=\"tp_pub_additional_pages\">pp. 17\u201322, <\/span><span class=\"tp_pub_additional_year\">2020<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_resource_link\"><a id=\"tp_links_sh_1123\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1123','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1123\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1123','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1123\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:RSWRZBtad20,<br \/>\r\ntitle = {Racial Equity and the Future of Work},<br \/>\r\nauthor = {Henry Renski and Laurel Smith-Doerr and Tiamba Wilkerson and Shannon C Roberts and Shlomo Zilberstein and Enobong H Branch},<br \/>\r\ndoi = {10.1080\/24751448.2020.1705711},<br \/>\r\nyear  = {2020},<br \/>\r\ndate = {2020-01-01},<br \/>\r\njournal = {Technology| Architecture+ 
Design},<br \/>\r\nvolume = {4},<br \/>\r\nnumber = {1},<br \/>\r\npages = {17--22},<br \/>\r\npublisher = {Taylor & Francis},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1123','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1123\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1080\/24751448.2020.1705711\" title=\"Follow DOI:10.1080\/24751448.2020.1705711\" target=\"_blank\">doi:10.1080\/24751448.2020.1705711<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1123','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><\/table><\/div><\/div>\n<div><\/div><\/div><\/div>\n<h3><span style=\"color: #264278\"><b>Anytime Algorithms<\/b><\/span><\/h3>\n<div>How can we design \u201cwell behaved\u201d algorithms that can be interrupted at any time and still return useful results, and how can we use such algorithms as components of a complex AI system?<\/div>\n<div><div class=\"bg-margin-for-link\"><input type='hidden' bg_collapse_expand='69d0b4f81a7c54039446979' value='69d0b4f81a7c54039446979'><input type='hidden' id='bg-show-more-text-69d0b4f81a7c54039446979' value='Show Related Publications'><input type='hidden' id='bg-show-less-text-69d0b4f81a7c54039446979' value='Hide Related Publications'><a id='bg-showmore-action-69d0b4f81a7c54039446979' class='bg-showmore-plg-link bg-arrow '  style=\" color:#7C2622;;\" href='#'>Show Related Publications<\/a><div id='bg-showmore-hidden-69d0b4f81a7c54039446979' ><div class=\"teachpress_pub_list\"><form name=\"tppublistform\" method=\"get\"><a name=\"tppubs\" id=\"tppubs\"><\/a><\/form><table class=\"teachpress_publication_list\"><tr 
class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Bhatia, Abhinav;  Svegliato, Justin;  Nashed, Samer B.;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1160','tp_links')\" style=\"cursor:pointer;\">Tuning the Hyperparameters of Anytime Planning: A Metareasoning Approach with Deep Reinforcement Learning<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 32nd International Conference on Automated Planning and Scheduling (ICAPS), <\/span><span class=\"tp_pub_additional_address\">Virtual Conference, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1160\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1160','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1160\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1160','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1160\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1160','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1160\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:BSNZicaps22,<br \/>\r\ntitle = {Tuning the Hyperparameters of Anytime Planning: A Metareasoning Approach with Deep Reinforcement Learning},<br \/>\r\nauthor = {Abhinav Bhatia and Justin Svegliato and Samer B. 
Nashed and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSNZicaps22.pdf},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\nbooktitle = {Proceedings of the 32nd International Conference on Automated Planning and Scheduling (ICAPS)},<br \/>\r\naddress = {Virtual Conference},<br \/>\r\nabstract = {Anytime planning algorithms often have hyperparameters that can be tuned at runtime to optimize their performance. While work on metareasoning has focused on when to interrupt an anytime planner and act on the current plan, the scope of metareasoning can be expanded to tuning the hyperparameters of the anytime planner at runtime. This paper introduces a general, decision-theoretic metareasoning approach that optimizes both the stopping point and hyperparameters of anytime planning. We begin by proposing a generalization of the standard meta-level control problem for anytime algorithms. We then offer a meta-level control technique that monitors and controls an anytime algorithm using deep reinforcement learning. Finally, we show that our approach boosts performance on a common benchmark domain that uses anytime weighted A* to solve a range of heuristic search problems and a mobile robot application that uses RRT* to solve motion planning problems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1160','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1160\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Anytime planning algorithms often have hyperparameters that can be tuned at runtime to optimize their performance. 
While work on metareasoning has focused on when to interrupt an anytime planner and act on the current plan, the scope of metareasoning can be expanded to tuning the hyperparameters of the anytime planner at runtime. This paper introduces a general, decision-theoretic metareasoning approach that optimizes both the stopping point and hyperparameters of anytime planning. We begin by proposing a generalization of the standard meta-level control problem for anytime algorithms. We then offer a meta-level control technique that monitors and controls an anytime algorithm using deep reinforcement learning. Finally, we show that our approach boosts performance on a common benchmark domain that uses anytime weighted A* to solve a range of heuristic search problems and a mobile robot application that uses RRT* to solve motion planning problems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1160','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1160\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSNZicaps22.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSNZicaps22.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSNZicaps22.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1160','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Bhatia, Abhinav;  Svegliato, Justin;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1130','tp_links')\" style=\"cursor:pointer;\">On the Benefits of Randomly Adjusting Anytime Weighted A*<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p 
class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 14th International Symposium on Combinatorial Search (SOCS), <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1130\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1130','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1130\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1130','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1130\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1130','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1130\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:BSZsocs21,<br \/>\r\ntitle = {On the Benefits of Randomly Adjusting Anytime Weighted A*},<br \/>\r\nauthor = {Abhinav Bhatia and Justin Svegliato and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSZsocs21.pdf},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the 14th International Symposium on Combinatorial Search (SOCS)},<br \/>\r\nabstract = {Anytime Weighted A*--an anytime heuristic search algorithm that uses a weight to scale the heuristic value of each node in the open list--has proven to be an effective way to manage the trade-off between solution quality and computation time in heuristic search. Finding the best weight, however, is challenging because it depends on not only the characteristics of the domain and the details of the instance at hand, but also the available computation time. 
We propose a randomized version of this algorithm, called Randomized Weighted A*, that randomly adjusts its weight at runtime and show a counterintuitive phenomenon: RWA* generally performs as well or better than AWA* with the best static weight on a range of benchmark problems. The result is a simple algorithm that is easy to implement and performs consistently well without any offline experimentation or parameter tuning.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1130','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1130\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Anytime Weighted A*--an anytime heuristic search algorithm that uses a weight to scale the heuristic value of each node in the open list--has proven to be an effective way to manage the trade-off between solution quality and computation time in heuristic search. Finding the best weight, however, is challenging because it depends on not only the characteristics of the domain and the details of the instance at hand, but also the available computation time. We propose a randomized version of this algorithm, called Randomized Weighted A*, that randomly adjusts its weight at runtime and show a counterintuitive phenomenon: RWA* generally performs as well or better than AWA* with the best static weight on a range of benchmark problems. 
The result is a simple algorithm that is easy to implement and performs consistently well without any offline experimentation or parameter tuning.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1130','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1130\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSZsocs21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSZsocs21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSZsocs21.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1130','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Svegliato, Justin;  Wray, Kyle Hollins;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('870','tp_links')\" style=\"cursor:pointer;\">Meta-Level Control of Anytime Algorithms with Online Performance Prediction<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 27th International Joint Conference on Artificial Intelligence, <\/span><span class=\"tp_pub_additional_address\">Stockholm, Sweden, <\/span><span class=\"tp_pub_additional_year\">2018<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_870\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('870','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_870\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('870','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_870\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('870','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_870\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SWZijcai18,<br \/>\r\ntitle = {Meta-Level Control of Anytime Algorithms with Online Performance Prediction},<br \/>\r\nauthor = {Justin Svegliato and Kyle Hollins Wray and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWZijcai18.pdf},<br \/>\r\ndoi = {10.24963\/ijcai.2018\/208},<br \/>\r\nyear  = {2018},<br \/>\r\ndate = {2018-01-01},<br \/>\r\nbooktitle = {Proceedings of the 27th International Joint Conference on Artificial Intelligence},<br \/>\r\npages = {1499--1505},<br \/>\r\naddress = {Stockholm, Sweden},<br \/>\r\nabstract = {Anytime algorithms enable intelligent systems to trade computation time with solution quality. To exploit this crucial ability in real-time decision-making, the system must decide when to interrupt the anytime algorithm and act on the current solution. Existing meta-level control techniques, however, address this problem by relying on significant offline work that diminishes their practical utility and accuracy. We formally introduce an online performance prediction framework that enables meta-level control to adapt to each instance of a problem without any preprocessing. Using this framework, we then present a meta-level control technique and two stopping conditions. Finally, we show that our approach outperforms existing techniques that require substantial offline work. 
The result is efficient nonmyopic meta-level control that reduces the overhead and increases the benefits of using anytime algorithms in intelligent systems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('870','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_870\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Anytime algorithms enable intelligent systems to trade computation time with solution quality. To exploit this crucial ability in real-time decision-making, the system must decide when to interrupt the anytime algorithm and act on the current solution. Existing meta-level control techniques, however, address this problem by relying on significant offline work that diminishes their practical utility and accuracy. We formally introduce an online performance prediction framework that enables meta-level control to adapt to each instance of a problem without any preprocessing. Using this framework, we then present a meta-level control technique and two stopping conditions. Finally, we show that our approach outperforms existing techniques that require substantial offline work. 
The result is efficient nonmyopic meta-level control that reduces the overhead and increases the benefits of using anytime algorithms in intelligent systems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('870','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_870\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWZijcai18.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWZijcai18.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWZijcai18.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.24963\/ijcai.2018\/208\" title=\"Follow DOI:10.24963\/ijcai.2018\/208\" target=\"_blank\">doi:10.24963\/ijcai.2018\/208<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('870','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Svegliato, Justin;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('872','tp_links')\" style=\"cursor:pointer;\">Adaptive Metareasoning for Bounded Rational Agents<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">IJCAI\/ECAI Workshop on Architectures and Evaluation for Generality, Autonomy and Progress in AI (AEGAP), <\/span><span class=\"tp_pub_additional_address\">Stockholm, Sweden, <\/span><span class=\"tp_pub_additional_year\">2018<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_872\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('872','tp_abstract')\" title=\"Show abstract\" 
style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_872\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('872','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_872\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('872','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_872\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SZijcaiAEGAP18,<br \/>\r\ntitle = {Adaptive Metareasoning for Bounded Rational Agents},<br \/>\r\nauthor = {Justin Svegliato and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZijcaiAEGAP18.pdf},<br \/>\r\nyear  = {2018},<br \/>\r\ndate = {2018-01-01},<br \/>\r\nbooktitle = {IJCAI\/ECAI Workshop on Architectures and Evaluation for Generality, Autonomy and Progress in AI (AEGAP)},<br \/>\r\naddress = {Stockholm, Sweden},<br \/>\r\nabstract = {In computational approaches to bounded rationality, metareasoning enables intelligent agents to optimize their own decision-making process in order to produce effective action in a timely manner. While there have been substantial efforts to develop effective meta-level control for anytime algorithms, existing techniques rely on extensive offline work, imposing several critical assumptions that diminish their effectiveness and limit their practical utility in the real world. In order to eliminate these assumptions, adaptive metareasoning enables intelligent agents to adapt to each individual instance of the problem at hand without the need for significant offline preprocessing. Building on our recent work, we first introduce a model-free approach to meta-level control based on reinforcement learning. We then present a meta-level control technique that uses temporal difference learning. 
Finally, we show empirically that our approach is effective on a common benchmark in meta-level control.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('872','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_872\" style=\"display:none;\"><div class=\"tp_abstract_entry\">In computational approaches to bounded rationality, metareasoning enables intelligent agents to optimize their own decision-making process in order to produce effective action in a timely manner. While there have been substantial efforts to develop effective meta-level control for anytime algorithms, existing techniques rely on extensive offline work, imposing several critical assumptions that diminish their effectiveness and limit their practical utility in the real world. In order to eliminate these assumptions, adaptive metareasoning enables intelligent agents to adapt to each individual instance of the problem at hand without the need for significant offline preprocessing. Building on our recent work, we first introduce a model-free approach to meta-level control based on reinforcement learning. We then present a meta-level control technique that uses temporal difference learning. 
Finally, we show empirically that our approach is effective on a common benchmark in meta-level control.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('872','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_872\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZijcaiAEGAP18.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZijcaiAEGAP18.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZijcaiAEGAP18.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('872','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Arnt, Andrew;  Zilberstein, Shlomo;  Allan, James;  Mouaddib, Abdel-Illah<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1021','tp_links')\" style=\"cursor:pointer;\">Dynamic Composition of Information Retrieval Techniques<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Journal of Intelligent Information Systems (JIIS), <\/span><span class=\"tp_pub_additional_volume\">vol. 23, <\/span><span class=\"tp_pub_additional_number\">no. 1, <\/span><span class=\"tp_pub_additional_pages\">pp. 
67\u201397, <\/span><span class=\"tp_pub_additional_year\">2004<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1021\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1021','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1021\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1021','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1021\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1021','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1021\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:AZAMjiis04,<br \/>\r\ntitle = {Dynamic Composition of Information Retrieval Techniques},<br \/>\r\nauthor = {Andrew Arnt and Shlomo Zilberstein and James Allan and Abdel-Illah Mouaddib},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZAMjiis04.pdf},<br \/>\r\ndoi = {10.1023\/B:JIIS.0000029671.27333.7d},<br \/>\r\nyear  = {2004},<br \/>\r\ndate = {2004-01-01},<br \/>\r\njournal = {Journal of Intelligent Information Systems (JIIS)},<br \/>\r\nvolume = {23},<br \/>\r\nnumber = {1},<br \/>\r\npages = {67--97},<br \/>\r\nabstract = {This paper presents a new approach to information retrieval (IR) based on run-time selection of the best set of techniques to respond to a given query. A technique is selected based on its projected effectiveness with respect to the specific query, the load on the system, and a time-dependent utility function. The paper examines two fundamental questions: (1) can the selection of the best IR techniques be performed at run-time with minimal computational overhead? 
and (2) is it possible to construct a reliable probabilistic model of the performance of an IR technique that is conditioned on the characteristics of the query? We show that both of these questions can be answered positively. These results suggest a new system design that carries a great potential to improve the quality of service of future IR systems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1021','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1021\" style=\"display:none;\"><div class=\"tp_abstract_entry\">This paper presents a new approach to information retrieval (IR) based on run-time selection of the best set of techniques to respond to a given query. A technique is selected based on its projected effectiveness with respect to the specific query, the load on the system, and a time-dependent utility function. The paper examines two fundamental questions: (1) can the selection of the best IR techniques be performed at run-time with minimal computational overhead? and (2) is it possible to construct a reliable probabilistic model of the performance of an IR technique that is conditioned on the characteristics of the query? We show that both of these questions can be answered positively. 
These results suggest a new system design that carries a great potential to improve the quality of service of future IR systems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1021','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1021\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZAMjiis04.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZAMjiis04.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZAMjiis04.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1023\/B:JIIS.0000029671.27333.7d\" title=\"Follow DOI:10.1023\/B:JIIS.0000029671.27333.7d\" target=\"_blank\">doi:10.1023\/B:JIIS.0000029671.27333.7d<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1021','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Zilberstein, Shlomo;  Charpillet, Francois;  Chassaing, Philippe<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1033','tp_links')\" style=\"cursor:pointer;\">Optimal Sequencing of Contract Algorithms<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Annals of Mathematics and Artificial Intelligence (AMAI), <\/span><span class=\"tp_pub_additional_volume\">vol. 39, <\/span><span class=\"tp_pub_additional_number\">no. 1-2, <\/span><span class=\"tp_pub_additional_pages\">pp. 
1-18, <\/span><span class=\"tp_pub_additional_year\">2003<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1033\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1033','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1033\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1033','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1033\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1033','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1033\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:ZCCamai03,<br \/>\r\ntitle = {Optimal Sequencing of Contract Algorithms},<br \/>\r\nauthor = {Shlomo Zilberstein and Francois Charpillet and Philippe Chassaing},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZCCamai03.pdf},<br \/>\r\ndoi = {10.1023\/A:1024412831598},<br \/>\r\nyear  = {2003},<br \/>\r\ndate = {2003-01-01},<br \/>\r\njournal = {Annals of Mathematics and Artificial Intelligence (AMAI)},<br \/>\r\nvolume = {39},<br \/>\r\nnumber = {1-2},<br \/>\r\npages = {1-18},<br \/>\r\nabstract = {We address the problem of building an interruptible real-time system using non-interruptible components. Some artificial intelligence techniques offer a tradeoff between computation time and quality of results, but their run-time must be determined when they are activated. These techniques, called contract algorithms, introduce a complex scheduling problem when there is uncertainty about the amount of time available for problem-solving. We show how to optimally sequence contract algorithms to create the best possible interruptible system with or without stochastic information about the deadline. 
These results extend the foundation of real-time problem-solving and provide useful guidance for embedding contract algorithms in applications.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1033','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1033\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We address the problem of building an interruptible real-time system using non-interruptible components. Some artificial intelligence techniques offer a tradeoff between computation time and quality of results, but their run-time must be determined when they are activated. These techniques, called contract algorithms, introduce a complex scheduling problem when there is uncertainty about the amount of time available for problem-solving. We show how to optimally sequence contract algorithms to create the best possible interruptible system with or without stochastic information about the deadline. 
These results extend the foundation of real-time problem-solving and provide useful guidance for embedding contract algorithms in applications.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1033','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1033\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZCCamai03.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZCCamai03.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZCCamai03.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1023\/A:1024412831598\" title=\"Follow DOI:10.1023\/A:1024412831598\" target=\"_blank\">doi:10.1023\/A:1024412831598<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1033','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Bernstein, Daniel S;  Finkelstein, Lev;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1041','tp_links')\" style=\"cursor:pointer;\">Contract Algorithms and Robots on Rays: Unifying Two Scheduling Problems<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Acapulco, Mexico, <\/span><span class=\"tp_pub_additional_year\">2003<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1041\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1041','tp_abstract')\" title=\"Show abstract\" 
style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1041\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1041','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1041\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1041','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1041\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:BFZijcai03,<br \/>\r\ntitle = {Contract Algorithms and Robots on Rays: Unifying Two Scheduling Problems},<br \/>\r\nauthor = {Daniel S Bernstein and Lev Finkelstein and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BFZijcai03.pdf},<br \/>\r\nyear  = {2003},<br \/>\r\ndate = {2003-01-01},<br \/>\r\nbooktitle = {Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {1211--1217},<br \/>\r\naddress = {Acapulco, Mexico},<br \/>\r\nabstract = {We study two apparently different, but formally similar, scheduling problems. The first problem involves contract algorithms, which can trade off run time for solution quality, as long as the amount of available run time is known in advance. The problem is to schedule contract algorithms to run on parallel processors, under the condition that an interruption can occur at any time, and upon interruption a solution to any one of a number of problems can be requested. Schedules are compared in terms of acceleration ratio, which is a worst-case measure of efficiency. We provide a schedule and prove its optimality among a particular class of schedules. Our second problem involves multiple robots searching for a goal on one of multiple rays. 
Search strategies are compared in terms of time-competitive ratio, the ratio of the total search time to the time it would take for one robot to traverse directly to the goal. We demonstrate that search strategies and contract schedules are formally equivalent. In addition, for our class of schedules, we derive a formula relating the acceleration ratio of a schedule to the time-competitive ratio of the corresponding search strategy.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1041','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1041\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We study two apparently different, but formally similar, scheduling problems. The first problem involves contract algorithms, which can trade off run time for solution quality, as long as the amount of available run time is known in advance. The problem is to schedule contract algorithms to run on parallel processors, under the condition that an interruption can occur at any time, and upon interruption a solution to any one of a number of problems can be requested. Schedules are compared in terms of acceleration ratio, which is a worst-case measure of efficiency. We provide a schedule and prove its optimality among a particular class of schedules. Our second problem involves multiple robots searching for a goal on one of multiple rays. Search strategies are compared in terms of time-competitive ratio, the ratio of the total search time to the time it would take for one robot to traverse directly to the goal. We demonstrate that search strategies and contract schedules are formally equivalent. 
In addition, for our class of schedules, we derive a formula relating the acceleration ratio of a schedule to the time-competitive ratio of the corresponding search strategy.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1041','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1041\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BFZijcai03.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BFZijcai03.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BFZijcai03.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1041','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Bernstein, Daniel S;  Perkins, Theodore J;  Zilberstein, Shlomo;  Finkelstein, Lev<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1046','tp_links')\" style=\"cursor:pointer;\">Scheduling Contract Algorithms on Multiple Processors<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 18th National Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">Edmonton, Alberta, <\/span><span class=\"tp_pub_additional_year\">2002<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1046\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1046','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1046\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1046','tp_links')\" title=\"Show 
links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1046\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1046','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1046\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:BPZFaaai02,<br \/>\r\ntitle = {Scheduling Contract Algorithms on Multiple Processors},<br \/>\r\nauthor = {Daniel S Bernstein and Theodore J Perkins and Shlomo Zilberstein and Lev Finkelstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BPZFaaai02.pdf},<br \/>\r\nyear  = {2002},<br \/>\r\ndate = {2002-01-01},<br \/>\r\nbooktitle = {Proceedings of the 18th National Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {702--706},<br \/>\r\naddress = {Edmonton, Alberta},<br \/>\r\nabstract = {Anytime algorithms offer a tradeoff between computation time and the quality of the result returned. They can be divided into two classes: contract algorithms, for which the total run time must be specified in advance, and interruptible algorithms, which can be queried at any time for a solution. An interruptible algorithm can be constructed from a contract algorithm by repeatedly activating the contract algorithm with increasing run times. The acceleration ratio of a run-time schedule is a worst-case measure of how inefficient the constructed interruptible algorithm is compared to the contract algorithm. The smallest acceleration ratio achievable on a single processor is known. Using multiple processors, smaller acceleration ratios are possible. In this paper, we provide a schedule for m processors and prove that it is optimal for all m. 
Our results provide general guidelines for the use of parallel processors in the design of real-time systems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1046','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1046\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Anytime algorithms offer a tradeoff between computation time and the quality of the result returned. They can be divided into two classes: contract algorithms, for which the total run time must be specified in advance, and interruptible algorithms, which can be queried at any time for a solution. An interruptible algorithm can be constructed from a contract algorithm by repeatedly activating the contract algorithm with increasing run times. The acceleration ratio of a run-time schedule is a worst-case measure of how inefficient the constructed interruptible algorithm is compared to the contract algorithm. The smallest acceleration ratio achievable on a single processor is known. Using multiple processors, smaller acceleration ratios are possible. In this paper, we provide a schedule for m processors and prove that it is optimal for all m. 
Our results provide general guidelines for the use of parallel processors in the design of real-time systems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1046','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1046\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BPZFaaai02.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BPZFaaai02.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BPZFaaai02.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1046','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Hansen, Eric A;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1050','tp_links')\" style=\"cursor:pointer;\">Monitoring and Control of Anytime Algorithms: A Dynamic Programming Approach<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Artificial Intelligence (AIJ), <\/span><span class=\"tp_pub_additional_volume\">vol. 126, <\/span><span class=\"tp_pub_additional_number\">no. 1-2, <\/span><span class=\"tp_pub_additional_pages\">pp. 
139\u2013157, <\/span><span class=\"tp_pub_additional_year\">2001<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1050\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1050','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1050\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1050','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1050\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1050','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1050\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:HZaij01a,<br \/>\r\ntitle = {Monitoring and Control of Anytime Algorithms: A Dynamic Programming Approach},<br \/>\r\nauthor = {Eric A Hansen and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaij01a.pdf},<br \/>\r\ndoi = {10.1016\/S0004-3702(00)00068-0},<br \/>\r\nyear  = {2001},<br \/>\r\ndate = {2001-01-01},<br \/>\r\njournal = {Artificial Intelligence (AIJ)},<br \/>\r\nvolume = {126},<br \/>\r\nnumber = {1-2},<br \/>\r\npages = {139--157},<br \/>\r\nabstract = {Anytime algorithms offer a tradeoff between solution quality and computation time that has proved useful in solving time-critical problems such as planning and scheduling, belief network evaluation, and information gathering. To exploit this tradeoff, a system must be able to decide when to stop deliberation and act on the currently available solution. This paper analyzes the characteristics of existing techniques for meta-level control of anytime algorithms and develops a new framework for monitoring and control. 
The new framework handles effectively the uncertainty associated with the algorithm's performance profile, the uncertainty associated with the domain of operation, and the cost of monitoring progress. The result is an efficient non-myopic solution to the meta-level control problem for anytime algorithms.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1050','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1050\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Anytime algorithms offer a tradeoff between solution quality and computation time that has proved useful in solving time-critical problems such as planning and scheduling, belief network evaluation, and information gathering. To exploit this tradeoff, a system must be able to decide when to stop deliberation and act on the currently available solution. This paper analyzes the characteristics of existing techniques for meta-level control of anytime algorithms and develops a new framework for monitoring and control. The new framework handles effectively the uncertainty associated with the algorithm's performance profile, the uncertainty associated with the domain of operation, and the cost of monitoring progress. 
The result is an efficient non-myopic solution to the meta-level control problem for anytime algorithms.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1050','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1050\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaij01a.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaij01a.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaij01a.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1016\/S0004-3702(00)00068-0\" title=\"Follow DOI:10.1016\/S0004-3702(00)00068-0\" target=\"_blank\">doi:10.1016\/S0004-3702(00)00068-0<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1050','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Grass, Joshua;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1058','tp_links')\" style=\"cursor:pointer;\">A Value-Driven System for Autonomous Information Gathering<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Journal of Intelligent Information Systems (JIIS), <\/span><span class=\"tp_pub_additional_volume\">vol. 14, <\/span><span class=\"tp_pub_additional_number\">no. 1, <\/span><span class=\"tp_pub_additional_pages\">pp. 
5\u201327, <\/span><span class=\"tp_pub_additional_year\">2000<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1058\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1058','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1058\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1058','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1058\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1058','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1058\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:GZjiis00,<br \/>\r\ntitle = {A Value-Driven System for Autonomous Information Gathering},<br \/>\r\nauthor = {Joshua Grass and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GZjiis00.pdf},<br \/>\r\ndoi = {10.1023\/A:1008718418982},<br \/>\r\nyear  = {2000},<br \/>\r\ndate = {2000-01-01},<br \/>\r\njournal = {Journal of Intelligent Information Systems (JIIS)},<br \/>\r\nvolume = {14},<br \/>\r\nnumber = {1},<br \/>\r\npages = {5--27},<br \/>\r\nabstract = {This paper presents a system for autonomous information gathering in an information rich domain under time and monetary resource restrictions. The system gathers information using an explicit representation of the user's decision model and a database of information sources. Information gathering actions (queries) are scheduled myopically by selecting the query with the highest marginal value. This value is determined by the value of the information with respect to the decision being made, the responsiveness of the information source, and a given resource cost function. 
Finally, we compare the value-driven approach to several base-line techniques and show that the overhead of the meta-level control is made up for by the increased decision quality.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1058','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1058\" style=\"display:none;\"><div class=\"tp_abstract_entry\">This paper presents a system for autonomous information gathering in an information rich domain under time and monetary resource restrictions. The system gathers information using an explicit representation of the user's decision model and a database of information sources. Information gathering actions (queries) are scheduled myopically by selecting the query with the highest marginal value. This value is determined by the value of the information with respect to the decision being made, the responsiveness of the information source, and a given resource cost function. 
Finally, we compare the value-driven approach to several base-line techniques and show that the overhead of the meta-level control is made up for by the increased decision quality.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1058','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1058\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GZjiis00.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GZjiis00.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GZjiis00.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1023\/A:1008718418982\" title=\"Follow DOI:10.1023\/A:1008718418982\" target=\"_blank\">doi:10.1023\/A:1008718418982<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1058','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Zilberstein, Shlomo;  Charpillet, Francois;  Chassaing, Philippe<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1064','tp_links')\" style=\"cursor:pointer;\">Real-Time Problem-Solving with Contract Algorithms<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Stockholm, Sweden, <\/span><span class=\"tp_pub_additional_year\">1999<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1064\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1064','tp_abstract')\" title=\"Show 
abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1064\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1064','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1064\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1064','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1064\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:ZCCijcai99,<br \/>\r\ntitle = {Real-Time Problem-Solving with Contract Algorithms},<br \/>\r\nauthor = {Shlomo Zilberstein and Francois Charpillet and Philippe Chassaing},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZCCijcai99.pdf},<br \/>\r\nyear  = {1999},<br \/>\r\ndate = {1999-01-01},<br \/>\r\nbooktitle = {Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {1008--1015},<br \/>\r\naddress = {Stockholm, Sweden},<br \/>\r\nabstract = {This paper addresses the problem of building an interruptible real-time system using contract algorithms. Contract algorithms offer a tradeoff between computation time and quality of results, but their run-time must be determined when they are activated. Many AI techniques provide useful contract algorithms that are not interruptible. We show how to optimally sequence contract algorithms to create the best interruptible system with or without stochastic information about the deadline. 
These results extend the foundation of real-time problem-solving and provide useful guidance for embedding contract algorithms in applications.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1064','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1064\" style=\"display:none;\"><div class=\"tp_abstract_entry\">This paper addresses the problem of building an interruptible real-time system using contract algorithms. Contract algorithms offer a tradeoff between computation time and quality of results, but their run-time must be determined when they are activated. Many AI techniques provide useful contract algorithms that are not interruptible. We show how to optimally sequence contract algorithms to create the best interruptible system with or without stochastic information about the deadline. These results extend the foundation of real-time problem-solving and provide useful guidance for embedding contract algorithms in applications.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1064','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1064\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZCCijcai99.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZCCijcai99.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZCCijcai99.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1064','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Marengoni, Mauricio;  Hanson, Allen;  Zilberstein, 
Shlomo;  Riseman, Edward<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1066','tp_links')\" style=\"cursor:pointer;\">Control in a 3D Reconstruction System Using Selective Perception<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 7th IEEE International Conference on Computer Vision (ICCV), <\/span><span class=\"tp_pub_additional_address\">Kerkyra, Greece, <\/span><span class=\"tp_pub_additional_year\">1999<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1066\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1066','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1066\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1066','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1066\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1066','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1066\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MHZRiccv99,<br \/>\r\ntitle = {Control in a 3D Reconstruction System Using Selective Perception},<br \/>\r\nauthor = {Mauricio Marengoni and Allen Hanson and Shlomo Zilberstein and Edward Riseman},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MHZRiccv99.pdf},<br \/>\r\ndoi = {10.1109\/ICCV.1999.790421},<br \/>\r\nyear  = {1999},<br \/>\r\ndate = {1999-01-01},<br \/>\r\nbooktitle = {Proceedings of the 7th IEEE International Conference on Computer Vision (ICCV)},<br \/>\r\npages = {1229--1236},<br \/>\r\naddress = {Kerkyra, Greece},<br \/>\r\nabstract = {This paper presents a control structure for general 
purpose image understanding that addresses both the high level of uncertainty in local hypotheses and the computational complexity of image interpretation. The control of vision algorithms is performed by an independent subsystem that uses Bayesian networks and utility theory to compute the marginal value of information provided by alternative operators and selects the ones with the highest value. We have implemented and tested this control structure with several aerial image datasets. The results show that the knowledge base used by the system can be acquired using standard learning techniques and that the value-driven approach to the selection of vision algorithms leads to performance gains. Moreover, the modular system architecture simplifies the addition of both control knowledge and new vision algorithms.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1066','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1066\" style=\"display:none;\"><div class=\"tp_abstract_entry\">This paper presents a control structure for general purpose image understanding that addresses both the high level of uncertainty in local hypotheses and the computational complexity of image interpretation. The control of vision algorithms is performed by an independent subsystem that uses Bayesian networks and utility theory to compute the marginal value of information provided by alternative operators and selects the ones with the highest value. We have implemented and tested this control structure with several aerial image datasets. The results show that the knowledge base used by the system can be acquired using standard learning techniques and that the value-driven approach to the selection of vision algorithms leads to performance gains. 
Moreover, the modular system architecture simplifies the addition of both control knowledge and new vision algorithms.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1066','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1066\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MHZRiccv99.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MHZRiccv99.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MHZRiccv99.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1109\/ICCV.1999.790421\" title=\"Follow DOI:10.1109\/ICCV.1999.790421\" target=\"_blank\">doi:10.1109\/ICCV.1999.790421<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1066','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_techreport\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Hansen, Eric A;  Zilberstein, Shlomo;  Danilchenko, Victor A<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1076','tp_links')\" style=\"cursor:pointer;\">Anytime Heuristic Search: First Results<\/a> <span class=\"tp_pub_type tp_  techreport\">Technical Report<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_institution\">Computer Science Department, University of Massachusetts Amherst <\/span><span class=\"tp_pub_additional_number\">no. 
97-50, <\/span><span class=\"tp_pub_additional_year\">1997<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1076\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1076','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1076\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1076','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1076\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1076','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1076\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@techreport{SZ:HZDtr9750,<br \/>\r\ntitle = {Anytime Heuristic Search: First Results},<br \/>\r\nauthor = {Eric A Hansen and Shlomo Zilberstein and Victor A Danilchenko},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZDtr9750.pdf},<br \/>\r\nyear  = {1997},<br \/>\r\ndate = {1997-01-01},<br \/>\r\nnumber = {97-50},<br \/>\r\ninstitution = {Computer Science Department, University of Massachusetts Amherst},<br \/>\r\nabstract = {We describe a simple technique for converting heuristic search algorithms into anytime algorithms that offer a tradeoff between search time and solution quality. The technique is related to work on use of non-admissible evaluation functions that make it possible to find good, but possibly sub-optimal, solutions more quickly than it takes to find an optimal solution. Instead of stopping the search after the first solution is found, however, we continue the search in order to find a sequence of improved solutions that eventually converges to an optimal solution. The performance of anytime heuristic search depends on the non-admissible evaluation function that guides the search. 
We discuss how to design a search heuristic that \"optimizes\" the rate at which the currently available solution improves.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {techreport}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1076','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1076\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We describe a simple technique for converting heuristic search algorithms into anytime algorithms that offer a tradeoff between search time and solution quality. The technique is related to work on use of non-admissible evaluation functions that make it possible to find good, but possibly sub-optimal, solutions more quickly than it takes to find an optimal solution. Instead of stopping the search after the first solution is found, however, we continue the search in order to find a sequence of improved solutions that eventually converges to an optimal solution. The performance of anytime heuristic search depends on the non-admissible evaluation function that guides the search. 
We discuss how to design a search heuristic that \"optimizes\" the rate at which the currently available solution improves.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1076','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1076\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZDtr9750.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZDtr9750.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZDtr9750.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1076','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Zilberstein, Shlomo;  Russell, Stuart J<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1077','tp_links')\" style=\"cursor:pointer;\">Optimal Composition of Real-Time Systems<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Artificial Intelligence (AIJ), <\/span><span class=\"tp_pub_additional_volume\">vol. 82, <\/span><span class=\"tp_pub_additional_number\">no. 1-2, <\/span><span class=\"tp_pub_additional_pages\">pp. 
181\u2013213, <\/span><span class=\"tp_pub_additional_year\">1996<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1077\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1077','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1077\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1077','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1077\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1077','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1077\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:ZRaij96,<br \/>\r\ntitle = {Optimal Composition of Real-Time Systems},<br \/>\r\nauthor = {Shlomo Zilberstein and Stuart J Russell},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZRaij96.pdf},<br \/>\r\nyear  = {1996},<br \/>\r\ndate = {1996-01-01},<br \/>\r\njournal = {Artificial Intelligence (AIJ)},<br \/>\r\nvolume = {82},<br \/>\r\nnumber = {1-2},<br \/>\r\npages = {181--213},<br \/>\r\nabstract = {Real-time systems are designed for environments in which the utility of actions is strongly time-dependent. Recent work by Dean, Horvitz and others has shown that anytime algorithms are a useful tool for real-time system design, since they allow computation time to be traded for decision quality. In order to construct complex systems, however, we need to be able to compose larger systems from smaller, reusable anytime modules. This paper addresses two basic problems associated with composition: how to ensure the interruptibility of the composed system; and how to allocate computation time optimally among the components. 
The first problem is solved by a simple and general construction that incurs only a small, constant penalty. The second is solved by an off-line compilation process. We show that the general compilation problem is NP-complete. However, efficient local compilation techniques, working on a single program structure at a time, yield globally optimal allocations for a large class of programs. We illustrate these results with two simple applications.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1077','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1077\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Real-time systems are designed for environments in which the utility of actions is strongly time-dependent. Recent work by Dean, Horvitz and others has shown that anytime algorithms are a useful tool for real-time system design, since they allow computation time to be traded for decision quality. In order to construct complex systems, however, we need to be able to compose larger systems from smaller, reusable anytime modules. This paper addresses two basic problems associated with composition: how to ensure the interruptibility of the composed system; and how to allocate computation time optimally among the components. The first problem is solved by a simple and general construction that incurs only a small, constant penalty. The second is solved by an off-line compilation process. We show that the general compilation problem is NP-complete. However, efficient local compilation techniques, working on a single program structure at a time, yield globally optimal allocations for a large class of programs. 
We illustrate these results with two simple applications.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1077','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1077\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZRaij96.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZRaij96.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZRaij96.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1077','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1079','tp_links')\" style=\"cursor:pointer;\">Using Anytime Algorithms in Intelligent Systems<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">AI Magazine, <\/span><span class=\"tp_pub_additional_volume\">vol. 17, <\/span><span class=\"tp_pub_additional_number\">no. 3, <\/span><span class=\"tp_pub_additional_pages\">pp. 
73\u201383, <\/span><span class=\"tp_pub_additional_year\">1996<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1079\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1079','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1079\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1079','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1079\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1079','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1079\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:Zaimag96,<br \/>\r\ntitle = {Using Anytime Algorithms in Intelligent Systems},<br \/>\r\nauthor = {Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zaimag96.pdf},<br \/>\r\nyear  = {1996},<br \/>\r\ndate = {1996-01-01},<br \/>\r\njournal = {AI Magazine},<br \/>\r\nvolume = {17},<br \/>\r\nnumber = {3},<br \/>\r\npages = {73--83},<br \/>\r\nabstract = {Anytime algorithms give intelligent systems the capability to trade off deliberation time for quality of results. This capability is essential for successful operation in domains such as signal interpretation, real-time diagnosis and repair, and mobile robot control. What characterizes these domains is that it is not feasible (computationally) or desirable (economically) to compute the optimal answer. This paper surveys the main control problems that arise when a system is composed of several anytime algorithms. These problems relate to optimal management of uncertainty and precision. 
After a brief introduction to anytime computation, the paper outlines a wide range of existing solutions to the meta-level control problem and describes current work that is aimed at increasing the applicability of anytime computation.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1079','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1079\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Anytime algorithms give intelligent systems the capability to trade off deliberation time for quality of results. This capability is essential for successful operation in domains such as signal interpretation, real-time diagnosis and repair, and mobile robot control. What characterizes these domains is that it is not feasible (computationally) or desirable (economically) to compute the optimal answer. This paper surveys the main control problems that arise when a system is composed of several anytime algorithms. These problems relate to optimal management of uncertainty and precision. 
After a brief introduction to anytime computation, the paper outlines a wide range of existing solutions to the meta-level control problem and describes current work that is aimed at increasing the applicability of anytime computation.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1079','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1079\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zaimag96.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zaimag96.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zaimag96.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1079','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Hansen, Eric A;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1083','tp_links')\" style=\"cursor:pointer;\">Monitoring the Progress of Anytime Problem-Solving<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 13th National Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">Portland, Oregon, <\/span><span class=\"tp_pub_additional_year\">1996<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1083\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1083','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1083\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1083','tp_links')\" 
title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1083\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1083','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1083\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:HZaaai96,<br \/>\r\ntitle = {Monitoring the Progress of Anytime Problem-Solving},<br \/>\r\nauthor = {Eric A Hansen and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaaai96.pdf},<br \/>\r\nyear  = {1996},<br \/>\r\ndate = {1996-01-01},<br \/>\r\nbooktitle = {Proceedings of the 13th National Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {1229--1234},<br \/>\r\naddress = {Portland, Oregon},<br \/>\r\nabstract = {Anytime algorithms offer a tradeoff between solution quality and computation time that has proved useful in applying artificial intelligence techniques to time-critical problems. To exploit this tradeoff, a system must be able to determine the best time to stop deliberation and act on the currently available solution. When the rate of improvement of solution quality is uncertain, monitoring the progress of the algorithm can improve the utility of the system. This paper introduces a technique for run-time monitoring of anytime algorithms that is sensitive to the variance of the algorithm's performance, the time-dependent utility of a solution, the ability of the run-time monitor to estimate the quality of the currently available solution, and the cost of monitoring. 
The paper examines the conditions under which the technique is optimal and demonstrates its applicability.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1083','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1083\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Anytime algorithms offer a tradeoff between solution quality and computation time that has proved useful in applying artificial intelligence techniques to time-critical problems. To exploit this tradeoff, a system must be able to determine the best time to stop deliberation and act on the currently available solution. When the rate of improvement of solution quality is uncertain, monitoring the progress of the algorithm can improve the utility of the system. This paper introduces a technique for run-time monitoring of anytime algorithms that is sensitive to the variance of the algorithm's performance, the time-dependent utility of a solution, the ability of the run-time monitor to estimate the quality of the currently available solution, and the cost of monitoring. 
The paper examines the conditions under which the technique is optimal and demonstrates its applicability.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1083','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1083\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaaai96.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaaai96.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaaai96.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1083','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1089','tp_links')\" style=\"cursor:pointer;\">Optimizing Decision Quality with Contract Algorithms<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Montreal, Canada, <\/span><span class=\"tp_pub_additional_year\">1995<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1089\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1089','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1089\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1089','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1089\" 
class=\"tp_show\" onclick=\"teachpress_pub_showhide('1089','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1089\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:Zijcai95,<br \/>\r\ntitle = {Optimizing Decision Quality with Contract Algorithms},<br \/>\r\nauthor = {Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zijcai95.pdf},<br \/>\r\nyear  = {1995},<br \/>\r\ndate = {1995-01-01},<br \/>\r\nbooktitle = {Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {1576--1582},<br \/>\r\naddress = {Montreal, Canada},<br \/>\r\nabstract = {Contract algorithms offer a tradeoff between output quality and computation time, provided that the amount of computation time is determined prior to their activation. Originally, they were introduced as an intermediate step in the composition of interruptible anytime algorithms. However, for many real-time tasks such as information gathering, game playing, and a large class of planning problems, contract algorithms offer an ideal mechanism to optimize decision quality. This paper extends previous results regarding the meta-level control of contract algorithms by handling a more general type of performance description. The output quality of each contract algorithm is described by a probabilistic (rather than deterministic) conditional performance profile. Such profiles map input quality and computation time to a probability distribution of output quality. 
The composition problem is solved by an efficient off-line compilation technique that simplifies the run-time monitoring task.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1089','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1089\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Contract algorithms offer a tradeoff between output quality and computation time, provided that the amount of computation time is determined prior to their activation. Originally, they were introduced as an intermediate step in the composition of interruptible anytime algorithms. However, for many real-time tasks such as information gathering, game playing, and a large class of planning problems, contract algorithms offer an ideal mechanism to optimize decision quality. This paper extends previous results regarding the meta-level control of contract algorithms by handling a more general type of performance description. The output quality of each contract algorithm is described by a probabilistic (rather than deterministic) conditional performance profile. Such profiles map input quality and computation time to a probability distribution of output quality. 
The composition problem is solved by an efficient off-line compilation technique that simplifies the run-time monitoring task.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1089','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1089\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zijcai95.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zijcai95.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zijcai95.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1089','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Mouaddib, Abdel-illah;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1090','tp_links')\" style=\"cursor:pointer;\">Knowledge-Based Anytime Computation<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Montreal, Canada, <\/span><span class=\"tp_pub_additional_year\">1995<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1090\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1090','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1090\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1090','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1090\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1090','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1090\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MZijcai95,<br \/>\r\ntitle = {Knowledge-Based Anytime Computation},<br \/>\r\nauthor = {Abdel-illah Mouaddib and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZijcai95.pdf},<br \/>\r\nyear  = {1995},<br \/>\r\ndate = {1995-01-01},<br \/>\r\nbooktitle = {Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {775--781},<br \/>\r\naddress = {Montreal, Canada},<br \/>\r\nabstract = {This paper describes a real-time decision-making model that combines the expressiveness and flexibility of knowledge-based systems with the real-time advantages of anytime algorithms. Anytime algorithms offer a simple means by which an intelligent system can trade off computation time for quality of results. Previous attempts to develop knowledge-based anytime algorithms failed to produce consistent, predictable improvement of quality over time. Without performance profiles, that describe the output quality as a function of time, it is hard to exploit the flexibility of anytime algorithms. The model of progressive reasoning that is presented here is based on a hierarchy of reasoning units that allow for gradual improvement of decision quality in a predictable manner. 
The result is an important step towards the application of knowledge-based systems in time-critical domains.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1090','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1090\" style=\"display:none;\"><div class=\"tp_abstract_entry\">This paper describes a real-time decision-making model that combines the expressiveness and flexibility of knowledge-based systems with the real-time advantages of anytime algorithms. Anytime algorithms offer a simple means by which an intelligent system can trade off computation time for quality of results. Previous attempts to develop knowledge-based anytime algorithms failed to produce consistent, predictable improvement of quality over time. Without performance profiles, that describe the output quality as a function of time, it is hard to exploit the flexibility of anytime algorithms. The model of progressive reasoning that is presented here is based on a hierarchy of reasoning units that allow for gradual improvement of decision quality in a predictable manner. 
The result is an important step towards the application of knowledge-based systems in time-critical domains.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1090','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1090\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZijcai95.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZijcai95.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZijcai95.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1090','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_phdthesis\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1094','tp_links')\" style=\"cursor:pointer;\">Operational Rationality through Compilation of Anytime Algorithms<\/a> <span class=\"tp_pub_type tp_  phdthesis\">PhD Thesis<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_school\">Computer Science Division, University of California Berkeley, <\/span><span class=\"tp_pub_additional_year\">1993<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1094\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1094','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1094\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1094','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1094\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1094','tp_bibtex')\" 
title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1094\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@phdthesis{SZ:Zshort93,<br \/>\r\ntitle = {Operational Rationality through Compilation of Anytime Algorithms},<br \/>\r\nauthor = {Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zshort93.pdf},<br \/>\r\nyear  = {1993},<br \/>\r\ndate = {1993-01-01},<br \/>\r\nschool = {Computer Science Division, University of California Berkeley},<br \/>\r\nabstract = {An important and largely ignored aspect of real-time decision making is the capability of agents to factor the cost of deliberation into the decision making process. I have developed an efficient model that creates this capability. The model uses as basic components anytime algorithms whose quality of results improves gradually as computation time increases. The main contribution of this work is a compilation process that extends the property of gradual improvement from the level of single algorithms to the level of complex systems. <br \/>\r\nIn standard algorithms, the fixed quality of the output allows for composition to be implemented by a simple call-return mechanism. However, when algorithms have resource allocation as a degree of freedom, there arises the question of how to construct, for example, the optimal composition of two anytime algorithms, one of which feeds its output to the other. This scheduling problem is solved by an off-line compilation process and a run-time monitoring component that together generate a utility maximizing behavior. The crucial meta-level knowledge is kept in the anytime library in the form of conditional performance profiles. These profiles characterize the performance of each elementary anytime algorithm as a function of run-time and input quality. 
The compilation process therefore extends the principles of procedural abstraction and modularity to anytime computation. Its efficiency is significantly improved by using local compilation that works on a single program structure at a time. Local compilation is proved to yield global optimality for a large set of program structures. <br \/>\r\nCompilation produces contract algorithms which require the determination of the total run-time when activated. Some real-time domains require interruptible algorithms whose total run-time is unknown in advance. An important result of this work is a general method by which an interruptible algorithm can be constructed once a contract algorithm is compiled. Finally, the notion of gradual improvement of quality is extended to sensing and plan execution and the application of the model is demonstrated through a simulated robot navigation system. The result is a modular approach for developing real-time agents that act by performing anytime actions and make decisions using anytime computation.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {phdthesis}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1094','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1094\" style=\"display:none;\"><div class=\"tp_abstract_entry\">An important and largely ignored aspect of real-time decision making is the capability of agents to factor the cost of deliberation into the decision making process. I have developed an efficient model that creates this capability. The model uses as basic components anytime algorithms whose quality of results improves gradually as computation time increases. The main contribution of this work is a compilation process that extends the property of gradual improvement from the level of single algorithms to the level of complex systems. 
<br \/>\r\nIn standard algorithms, the fixed quality of the output allows for composition to be implemented by a simple call-return mechanism. However, when algorithms have resource allocation as a degree of freedom, there arises the question of how to construct, for example, the optimal composition of two anytime algorithms, one of which feeds its output to the other. This scheduling problem is solved by an off-line compilation process and a run-time monitoring component that together generate a utility maximizing behavior. The crucial meta-level knowledge is kept in the anytime library in the form of conditional performance profiles. These profiles characterize the performance of each elementary anytime algorithm as a function of run-time and input quality. The compilation process therefore extends the principles of procedural abstraction and modularity to anytime computation. Its efficiency is significantly improved by using local compilation that works on a single program structure at a time. Local compilation is proved to yield global optimality for a large set of program structures. <br \/>\r\nCompilation produces contract algorithms which require the determination of the total run-time when activated. Some real-time domains require interruptible algorithms whose total run-time is unknown in advance. An important result of this work is a general method by which an interruptible algorithm can be constructed once a contract algorithm is compiled. Finally, the notion of gradual improvement of quality is extended to sensing and plan execution and the application of the model is demonstrated through a simulated robot navigation system. 
The result is a modular approach for developing real-time agents that act by performing anytime actions and make decisions using anytime computation.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1094','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1094\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zshort93.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zshort93.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zshort93.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1094','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Zilberstein, Shlomo;  Russell, Stuart J<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1095','tp_links')\" style=\"cursor:pointer;\">Anytime Sensing, Planning and Action: A Practical Model for Robot Control<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 13th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Chambery, France, <\/span><span class=\"tp_pub_additional_year\">1993<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1095\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1095','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1095\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1095','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1095\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1095','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1095\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:ZRijcai93,<br \/>\r\ntitle = {Anytime Sensing, Planning and Action: A Practical Model for Robot Control},<br \/>\r\nauthor = {Shlomo Zilberstein and Stuart J Russell},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZRijcai93.pdf},<br \/>\r\nyear  = {1993},<br \/>\r\ndate = {1993-01-01},<br \/>\r\nbooktitle = {Proceedings of the 13th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {1402--1407},<br \/>\r\naddress = {Chambery, France},<br \/>\r\nabstract = {Anytime algorithms, whose quality of results improves gradually as computation time increases, provide useful performance components for time-critical planning and control of robotic systems. In earlier work, we introduced a compilation scheme for optimal composition of anytime algorithms. In this paper we present an implementation of a navigation system in which an off-line compilation process and a run-time monitoring component guarantee the optimal allocation of time to the anytime modules. The crucial meta-level knowledge is kept in the anytime library in the form of conditional performance profiles. We also extend the notion of gradual improvement to sensing and plan execution. 
The result is an efficient, flexible control for robotic systems that exploits the tradeoff between time and quality in planning, sensing and plan execution.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1095','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1095\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Anytime algorithms, whose quality of results improves gradually as computation time increases, provide useful performance components for time-critical planning and control of robotic systems. In earlier work, we introduced a compilation scheme for optimal composition of anytime algorithms. In this paper we present an implementation of a navigation system in which an off-line compilation process and a run-time monitoring component guarantee the optimal allocation of time to the anytime modules. The crucial meta-level knowledge is kept in the anytime library in the form of conditional performance profiles. We also extend the notion of gradual improvement to sensing and plan execution. 
The result is an efficient, flexible control for robotic systems that exploits the tradeoff between time and quality in planning, sensing and plan execution.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1095','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1095\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZRijcai93.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZRijcai93.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZRijcai93.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1095','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Russell, Stuart J;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1097','tp_links')\" style=\"cursor:pointer;\">Composing Real-Time Systems<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 12th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Sydney, Australia, <\/span><span class=\"tp_pub_additional_year\">1991<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1097\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1097','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1097\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1097','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1097\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1097','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1097\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:RZijcai91,<br \/>\r\ntitle = {Composing Real-Time Systems},<br \/>\r\nauthor = {Stuart J Russell and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/RZijcai91.pdf},<br \/>\r\nyear  = {1991},<br \/>\r\ndate = {1991-01-01},<br \/>\r\nbooktitle = {Proceedings of the 12th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {212--217},<br \/>\r\naddress = {Sydney, Australia},<br \/>\r\nabstract = {We present a method to construct real-time systems using as components anytime algorithms whose quality of results degrades gracefully as computation time decreases. Introducing computation time as a degree of freedom defines a scheduling problem involving the activation and interruption of the anytime components. This scheduling problem is especially complicated when trying to construct interruptible algorithms, whose total run-time is unknown in advance. We introduce a framework to measure the performance of anytime algorithms and solve the problem of constructing interruptible algorithms by a mathematical reduction to the problem of constructing contract algorithms, which require the determination of the total run-time when activated. We show how the composition of anytime algorithms can be mechanized as part of a compiler for a LISP-like programming language for real-time systems. 
The result is a new approach to the construction of complex real-time systems that separates the arrangement of the performance components from the optimization of their scheduling, and automates the latter task.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1097','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1097\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We present a method to construct real-time systems using as components anytime algorithms whose quality of results degrades gracefully as computation time decreases. Introducing computation time as a degree of freedom defines a scheduling problem involving the activation and interruption of the anytime components. This scheduling problem is especially complicated when trying to construct interruptible algorithms, whose total run-time is unknown in advance. We introduce a framework to measure the performance of anytime algorithms and solve the problem of constructing interruptible algorithms by a mathematical reduction to the problem of constructing contract algorithms, which require the determination of the total run-time when activated. We show how the composition of anytime algorithms can be mechanized as part of a compiler for a LISP-like programming language for real-time systems. 
The result is a new approach to the construction of complex real-time systems that separates the arrangement of the performance components from the optimization of their scheduling, and automates the latter task.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1097','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1097\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/RZijcai91.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/RZijcai91.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/RZijcai91.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1097','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><\/table><\/div><\/div>\n<div><\/div><\/div><\/div>\n<h3><span style=\"color: #264278\"><b>Models of Bounded Rationality<\/b><\/span><\/h3>\n<div>\n<div>What does it mean for an agent to be \u201crational\u201d when it does not have enough knowledge or computational power to derive the best course of action?<\/div>\n<div><div class=\"bg-margin-for-link\"><input type='hidden' bg_collapse_expand='69d0b4f81d0af3031865730' value='69d0b4f81d0af3031865730'><input type='hidden' id='bg-show-more-text-69d0b4f81d0af3031865730' value='Show Related Publications'><input type='hidden' id='bg-show-less-text-69d0b4f81d0af3031865730' value='Hide Related Publications'><a id='bg-showmore-action-69d0b4f81d0af3031865730' class='bg-showmore-plg-link bg-arrow '  style=\" color:#7C2622;;\" href='#'>Show Related Publications<\/a><div id='bg-showmore-hidden-69d0b4f81d0af3031865730' ><div class=\"teachpress_pub_list\"><form name=\"tppublistform\" method=\"get\"><a name=\"tppubs\" id=\"tppubs\"><\/a><\/form><table class=\"teachpress_publication_list\"><tr class=\"tp_publication 
tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Svegliato, Justin;  Basich, Connor;  Saisubramanian, Sandhya;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1159','tp_links')\" style=\"cursor:pointer;\">Metareasoning for Safe Decision Making in Autonomous Systems<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), <\/span><span class=\"tp_pub_additional_address\">Philadelphia, Pennsylvania, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1159\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1159','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1159\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1159','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1159\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1159','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1159\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SBSZicra22,<br \/>\r\ntitle = {Metareasoning for Safe Decision Making in Autonomous Systems},<br \/>\r\nauthor = {Justin Svegliato and Connor Basich and Sandhya Saisubramanian and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SBSZicra22.pdf},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},<br \/>\r\naddress = 
{Philadelphia, Pennsylvania},<br \/>\r\nabstract = {Although experts carefully specify the high-level decision-making models in autonomous systems, it is infeasible to guarantee safety across every scenario during operation. We therefore propose a safety metareasoning system that optimizes the severity of the system's safety concerns and the interference to the system's task: the system executes in parallel a task process that completes a specified task and safety processes that each address a specified safety concern with a conflict resolver for arbitration. This paper offers a formal definition of a safety metareasoning system, a recommendation algorithm for a safety process, an arbitration algorithm for a conflict resolver, an application of our approach to planetary rover exploration, and a demonstration that our approach is effective in simulation.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1159','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1159\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Although experts carefully specify the high-level decision-making models in autonomous systems, it is infeasible to guarantee safety across every scenario during operation. We therefore propose a safety metareasoning system that optimizes the severity of the system's safety concerns and the interference to the system's task: the system executes in parallel a task process that completes a specified task and safety processes that each address a specified safety concern with a conflict resolver for arbitration. 
This paper offers a formal definition of a safety metareasoning system, a recommendation algorithm for a safety process, an arbitration algorithm for a conflict resolver, an application of our approach to planetary rover exploration, and a demonstration that our approach is effective in simulation.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1159','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1159\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SBSZicra22.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SBSZicra22.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SBSZicra22.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1159','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Bhatia, Abhinav;  Svegliato, Justin;  Nashed, Samer B.;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1160','tp_links')\" style=\"cursor:pointer;\">Tuning the Hyperparameters of Anytime Planning: A Metareasoning Approach with Deep Reinforcement Learning<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 32nd International Conference on Automated Planning and Scheduling (ICAPS), <\/span><span class=\"tp_pub_additional_address\">Virtual Conference, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1160\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1160','tp_abstract')\" title=\"Show abstract\" 
style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1160\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1160','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1160\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1160','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1160\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:BSNZicaps22,<br \/>\r\ntitle = {Tuning the Hyperparameters of Anytime Planning: A Metareasoning Approach with Deep Reinforcement Learning},<br \/>\r\nauthor = {Abhinav Bhatia and Justin Svegliato and Samer B. Nashed and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSNZicaps22.pdf},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\nbooktitle = {Proceedings of the 32nd International Conference on Automated Planning and Scheduling (ICAPS)},<br \/>\r\naddress = {Virtual Conference},<br \/>\r\nabstract = {Anytime planning algorithms often have hyperparameters that can be tuned at runtime to optimize their performance. While work on metareasoning has focused on when to interrupt an anytime planner and act on the current plan, the scope of metareasoning can be expanded to tuning the hyperparameters of the anytime planner at runtime. This paper introduces a general, decision-theoretic metareasoning approach that optimizes both the stopping point and hyperparameters of anytime planning. We begin by proposing a generalization of the standard meta-level control problem for anytime algorithms. We then offer a meta-level control technique that monitors and controls an anytime algorithm using deep reinforcement learning. 
Finally, we show that our approach boosts performance on a common benchmark domain that uses anytime weighted A* to solve a range of heuristic search problems and a mobile robot application that uses RRT* to solve motion planning problems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1160','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1160\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Anytime planning algorithms often have hyperparameters that can be tuned at runtime to optimize their performance. While work on metareasoning has focused on when to interrupt an anytime planner and act on the current plan, the scope of metareasoning can be expanded to tuning the hyperparameters of the anytime planner at runtime. This paper introduces a general, decision-theoretic metareasoning approach that optimizes both the stopping point and hyperparameters of anytime planning. We begin by proposing a generalization of the standard meta-level control problem for anytime algorithms. We then offer a meta-level control technique that monitors and controls an anytime algorithm using deep reinforcement learning. 
Finally, we show that our approach boosts performance on a common benchmark domain that uses anytime weighted A* to solve a range of heuristic search problems and a mobile robot application that uses RRT* to solve motion planning problems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1160','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1160\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSNZicaps22.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSNZicaps22.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSNZicaps22.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1160','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Svegliato, Justin;  Sharma, Prakhar;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1120','tp_links')\" style=\"cursor:pointer;\">A Model-Free Approach to Meta-Level Control of Anytime Algorithms<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), <\/span><span class=\"tp_pub_additional_address\">Paris, France, <\/span><span class=\"tp_pub_additional_year\">2020<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1120\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1120','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1120\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('1120','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1120\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1120','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1120\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SSZicra20,<br \/>\r\ntitle = {A Model-Free Approach to Meta-Level Control of Anytime Algorithms},<br \/>\r\nauthor = {Justin Svegliato and Prakhar Sharma and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SSZicra20.pdf},<br \/>\r\nyear  = {2020},<br \/>\r\ndate = {2020-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},<br \/>\r\naddress = {Paris, France},<br \/>\r\nabstract = {Anytime algorithms offer a trade-off between solution quality and computation time that has proven to be useful in autonomous systems for a wide range of real-time planning problems. In order to optimize this trade-off, an autonomous system has to solve a challenging meta-level control problem: it must decide when to interrupt the anytime algorithm and act on the current solution. Prevailing meta-level control techniques, however, make a number of unrealistic assumptions that reduce their effectiveness and usefulness in the real world. Eliminating these assumptions, we first introduce a model-free approach to meta-level control based on reinforcement learning and prove its optimality. We then offer a general meta-level control technique that can use different reinforcement learning methods. 
Finally, we show that our approach is effective across several common benchmark domains and a mobile robot domain.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1120','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1120\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Anytime algorithms offer a trade-off between solution quality and computation time that has proven to be useful in autonomous systems for a wide range of real-time planning problems. In order to optimize this trade-off, an autonomous system has to solve a challenging meta-level control problem: it must decide when to interrupt the anytime algorithm and act on the current solution. Prevailing meta-level control techniques, however, make a number of unrealistic assumptions that reduce their effectiveness and usefulness in the real world. Eliminating these assumptions, we first introduce a model-free approach to meta-level control based on reinforcement learning and prove its optimality. We then offer a general meta-level control technique that can use different reinforcement learning methods. 
Finally, we show that our approach is effective across several common benchmark domains and a mobile robot domain.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1120','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1120\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SSZicra20.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SSZicra20.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SSZicra20.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1120','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Carlin, Alan;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('935','tp_links')\" style=\"cursor:pointer;\">Decentralized Monitoring of Distributed Anytime Algorithms<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Taipei, Taiwan, <\/span><span class=\"tp_pub_additional_year\">2011<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_935\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('935','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_935\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('935','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_935\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('935','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_935\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:CZaamas11,<br \/>\r\ntitle = {Decentralized Monitoring of Distributed Anytime Algorithms},<br \/>\r\nauthor = {Alan Carlin and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CZaamas11.pdf},<br \/>\r\nyear  = {2011},<br \/>\r\ndate = {2011-01-01},<br \/>\r\nbooktitle = {Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},<br \/>\r\npages = {157--164},<br \/>\r\naddress = {Taipei, Taiwan},<br \/>\r\nabstract = {Anytime algorithms allow a system to trade solution quality for computation time. In previous work, monitoring techniques have been developed to allow agents to stop the computation at the \"right\" time so as to optimize a given time-dependent utility function. However, these results apply only to the single-agent case. In this paper we analyze the problems that arise when several agents solve components of a larger problem, each using an anytime algorithm. Monitoring in this case is more challenging as each agent is uncertain about the progress made so far by the others. We develop a formal framework for decentralized monitoring, establish the complexity of several interesting variants of the problem, and propose solution techniques for each one. 
Finally, we show that the framework can be applied to decentralized flow and planning problems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('935','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_935\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Anytime algorithms allow a system to trade solution quality for computation time. In previous work, monitoring techniques have been developed to allow agents to stop the computation at the \"right\" time so as to optimize a given time-dependent utility function. However, these results apply only to the single-agent case. In this paper we analyze the problems that arise when several agents solve components of a larger problem, each using an anytime algorithm. Monitoring in this case is more challenging as each agent is uncertain about the progress made so far by the others. We develop a formal framework for decentralized monitoring, establish the complexity of several interesting variants of the problem, and propose solution techniques for each one. 
Finally, we show that the framework can be applied to decentralized flow and planning problems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('935','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_935\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CZaamas11.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CZaamas11.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CZaamas11.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('935','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_incollection\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('946','tp_links')\" style=\"cursor:pointer;\">Metareasoning and Bounded Rationality<\/a> <span class=\"tp_pub_type tp_  incollection\">Book Section<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span> Cox, M;  Raja, A (Ed.): <span class=\"tp_pub_additional_booktitle\">Metareasoning: Thinking about Thinking, <\/span><span class=\"tp_pub_additional_pages\">pp. 
27\u201340, <\/span><span class=\"tp_pub_additional_publisher\">MIT Press, <\/span><span class=\"tp_pub_additional_address\">Cambridge, MA, USA, <\/span><span class=\"tp_pub_additional_year\">2011<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_resource_link\"><a id=\"tp_links_sh_946\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('946','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_946\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('946','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_946\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@incollection{SZ:Zmetareasoning11,<br \/>\r\ntitle = {Metareasoning and Bounded Rationality},<br \/>\r\nauthor = {Shlomo Zilberstein},<br \/>\r\neditor = {M Cox and A Raja},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZCh3-2011.pdf},<br \/>\r\nyear  = {2011},<br \/>\r\ndate = {2011-01-01},<br \/>\r\nbooktitle = {Metareasoning: Thinking about Thinking},<br \/>\r\npages = {27--40},<br \/>\r\npublisher = {MIT Press},<br \/>\r\naddress = {Cambridge, MA, USA},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {incollection}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('946','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_946\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZCh3-2011.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZCh3-2011.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZCh3-2011.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" 
onclick=\"teachpress_pub_showhide('946','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_incollection\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Carlin, Alan;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('947','tp_links')\" style=\"cursor:pointer;\">Bounded Rationality in Multiagent Systems Using Decentralized Metareasoning<\/a> <span class=\"tp_pub_type tp_  incollection\">Book Section<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span> Guy, T;  Karny, M;  Wolpert, D (Ed.): <span class=\"tp_pub_additional_booktitle\">Decision Making with Imperfect Decision Makers, <\/span><span class=\"tp_pub_additional_pages\">pp. 1\u201328, <\/span><span class=\"tp_pub_additional_publisher\">Springer, <\/span><span class=\"tp_pub_additional_address\">Berlin, Heidelberg, <\/span><span class=\"tp_pub_additional_year\">2011<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_resource_link\"><a id=\"tp_links_sh_947\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('947','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_947\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('947','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_947\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@incollection{SZ:CZdecisionmaking11,<br \/>\r\ntitle = {Bounded Rationality in Multiagent Systems Using Decentralized Metareasoning},<br \/>\r\nauthor = {Alan Carlin and Shlomo Zilberstein},<br \/>\r\neditor = {T Guy and M Karny and D Wolpert},<br \/>\r\nurl = {http:\/\/www.springerlink.com\/content\/g136745180478228\/},<br \/>\r\ndoi = {10.1007\/978-3-642-24647-0},<br \/>\r\nyear  = {2011},<br \/>\r\ndate = {2011-01-01},<br
\/>\r\nbooktitle = {Decision Making with Imperfect Decision Makers},<br \/>\r\npages = {1--28},<br \/>\r\npublisher = {Springer},<br \/>\r\naddress = {Berlin, Heidelberg},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {incollection}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('947','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_947\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/www.springerlink.com\/content\/g136745180478228\/\" title=\"http:\/\/www.springerlink.com\/content\/g136745180478228\/\" target=\"_blank\">http:\/\/www.springerlink.com\/content\/g136745180478228\/<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1007\/978-3-642-24647-0\" title=\"Follow DOI:10.1007\/978-3-642-24647-0\" target=\"_blank\">doi:10.1007\/978-3-642-24647-0<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('947','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Petrik, Marek;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1005','tp_links')\" style=\"cursor:pointer;\">Learning Parallel Portfolios of Algorithms<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Annals of Mathematics and Artificial Intelligence (AMAI), <\/span><span class=\"tp_pub_additional_volume\">vol. 48, <\/span><span class=\"tp_pub_additional_number\">no. 
1-2, <\/span><span class=\"tp_pub_additional_pages\">pp. 85\u2013106, <\/span><span class=\"tp_pub_additional_year\">2006<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1005\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1005','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1005\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1005','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1005\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1005','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1005\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:PZamai06,<br \/>\r\ntitle = {Learning Parallel Portfolios of Algorithms},<br \/>\r\nauthor = {Marek Petrik and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZamai06.pdf},<br \/>\r\ndoi = {10.1007\/s10472-007-9050-9},<br \/>\r\nyear  = {2006},<br \/>\r\ndate = {2006-01-01},<br \/>\r\njournal = {Annals of Mathematics and Artificial Intelligence (AMAI)},<br \/>\r\nvolume = {48},<br \/>\r\nnumber = {1-2},<br \/>\r\npages = {85--106},<br \/>\r\nabstract = {A wide range of combinatorial optimization algorithms have been developed for complex reasoning tasks. Frequently, no single algorithm outperforms all the others. This has raised interest in leveraging the performance of a collection of algorithms to improve performance. We show how to accomplish this using a Parallel Portfolio of Algorithms (PPA). A PPA is a collection of diverse algorithms for solving a single problem, all running concurrently on a single processor until a solution is produced. 
The performance of the portfolio may be controlled by assigning different shares of processor time to each algorithm. We present an effective method for finding a PPA in which the share of processor time allocated to each algorithm is fixed. Finding the optimal static schedule is shown to be an NP-complete problem for a general class of utility functions. We present bounds on the performance of the PPA over random instances and evaluate the performance empirically on a collection of 23 state-of-the-art SAT algorithms. The results show significant performance gains over the fastest individual algorithm in the collection.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1005','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1005\" style=\"display:none;\"><div class=\"tp_abstract_entry\">A wide range of combinatorial optimization algorithms have been developed for complex reasoning tasks. Frequently, no single algorithm outperforms all the others. This has raised interest in leveraging the performance of a collection of algorithms to improve performance. We show how to accomplish this using a Parallel Portfolio of Algorithms (PPA). A PPA is a collection of diverse algorithms for solving a single problem, all running concurrently on a single processor until a solution is produced. The performance of the portfolio may be controlled by assigning different shares of processor time to each algorithm. We present an effective method for finding a PPA in which the share of processor time allocated to each algorithm is fixed. Finding the optimal static schedule is shown to be an NP-complete problem for a general class of utility functions. 
We present bounds on the performance of the PPA over random instances and evaluate the performance empirically on a collection of 23 state-of-the-art SAT algorithms. The results show significant performance gains over the fastest individual algorithm in the collection.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1005','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1005\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZamai06.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZamai06.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZamai06.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1007\/s10472-007-9050-9\" title=\"Follow DOI:10.1007\/s10472-007-9050-9\" target=\"_blank\">doi:10.1007\/s10472-007-9050-9<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1005','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Petrik, Marek;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1007','tp_links')\" style=\"cursor:pointer;\">Learning Static Parallel Portfolios of Algorithms<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 9th International Symposium on Artificial Intelligence and Mathematics (ISAIM), <\/span><span class=\"tp_pub_additional_address\">Ft. 
Lauderdale, Florida, <\/span><span class=\"tp_pub_additional_year\">2006<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1007\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1007','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1007\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1007','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1007\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1007','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1007\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:PZisaim06,<br \/>\r\ntitle = {Learning Static Parallel Portfolios of Algorithms},<br \/>\r\nauthor = {Marek Petrik and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZisaim06.pdf},<br \/>\r\nyear  = {2006},<br \/>\r\ndate = {2006-01-01},<br \/>\r\nbooktitle = {Proceedings of the 9th International Symposium on Artificial Intelligence and Mathematics (ISAIM)},<br \/>\r\naddress = {Ft. Lauderdale, Florida},<br \/>\r\nabstract = {We present an approach for improving the performance of combinatorial optimization algorithms by generating an optimal Parallel Portfolio of Algorithms (PPA). A PPA is a collection of diverse algorithms for solving a single problem, all running concurrently on a single processor until a solution is produced. The performance of the portfolio may be controlled by assigning different shares of processor time to each algorithm. We present a method for finding a static PPA, in which the share of processor time allocated to each algorithm is fixed. The schedule is shown to be optimal with respect to a given training set of instances. 
We draw bounds on the performance of the PPA over random instances and evaluate the performance empirically on a collection of 23 state-of-the-art SAT algorithms. The results show significant performance gains (up to a factor of 2) over the fastest individual algorithm in a realistic setting.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1007','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1007\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We present an approach for improving the performance of combinatorial optimization algorithms by generating an optimal Parallel Portfolio of Algorithms (PPA). A PPA is a collection of diverse algorithms for solving a single problem, all running concurrently on a single processor until a solution is produced. The performance of the portfolio may be controlled by assigning different shares of processor time to each algorithm. We present a method for finding a static PPA, in which the share of processor time allocated to each algorithm is fixed. The schedule is shown to be optimal with respect to a given training set of instances. We draw bounds on the performance of the PPA over random instances and evaluate the performance empirically on a collection of 23 state-of-the-art SAT algorithms. 
The results show significant performance gains (up to a factor of 2) over the fastest individual algorithm in a realistic setting.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1007','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1007\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZisaim06.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZisaim06.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZisaim06.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1007','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Hansen, Eric A;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1050','tp_links')\" style=\"cursor:pointer;\">Monitoring and Control of Anytime Algorithms: A Dynamic Programming Approach<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Artificial Intelligence (AIJ), <\/span><span class=\"tp_pub_additional_volume\">vol. 126, <\/span><span class=\"tp_pub_additional_number\">no. 1-2, <\/span><span class=\"tp_pub_additional_pages\">pp. 
139\u2013157, <\/span><span class=\"tp_pub_additional_year\">2001<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1050\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1050','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1050\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1050','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1050\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1050','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1050\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:HZaij01a,<br \/>\r\ntitle = {Monitoring and Control of Anytime Algorithms: A Dynamic Programming Approach},<br \/>\r\nauthor = {Eric A Hansen and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaij01a.pdf},<br \/>\r\ndoi = {10.1016\/S0004-3702(00)00068-0},<br \/>\r\nyear  = {2001},<br \/>\r\ndate = {2001-01-01},<br \/>\r\njournal = {Artificial Intelligence (AIJ)},<br \/>\r\nvolume = {126},<br \/>\r\nnumber = {1-2},<br \/>\r\npages = {139--157},<br \/>\r\nabstract = {Anytime algorithms offer a tradeoff between solution quality and computation time that has proved useful in solving time-critical problems such as planning and scheduling, belief network evaluation, and information gathering. To exploit this tradeoff, a system must be able to decide when to stop deliberation and act on the currently available solution. This paper analyzes the characteristics of existing techniques for meta-level control of anytime algorithms and develops a new framework for monitoring and control. 
The new framework handles effectively the uncertainty associated with the algorithm's performance profile, the uncertainty associated with the domain of operation, and the cost of monitoring progress. The result is an efficient non-myopic solution to the meta-level control problem for anytime algorithms.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1050','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1050\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Anytime algorithms offer a tradeoff between solution quality and computation time that has proved useful in solving time-critical problems such as planning and scheduling, belief network evaluation, and information gathering. To exploit this tradeoff, a system must be able to decide when to stop deliberation and act on the currently available solution. This paper analyzes the characteristics of existing techniques for meta-level control of anytime algorithms and develops a new framework for monitoring and control. The new framework handles effectively the uncertainty associated with the algorithm's performance profile, the uncertainty associated with the domain of operation, and the cost of monitoring progress. 
The result is an efficient non-myopic solution to the meta-level control problem for anytime algorithms.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1050','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1050\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaij01a.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaij01a.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaij01a.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1016\/S0004-3702(00)00068-0\" title=\"Follow DOI:10.1016\/S0004-3702(00)00068-0\" target=\"_blank\">doi:10.1016\/S0004-3702(00)00068-0<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1050','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Cardon, Stephane;  Mouaddib, Abdel-Illah;  Zilberstein, Shlomo;  Washington, Richard<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1053','tp_links')\" style=\"cursor:pointer;\">Adaptive Control of Acyclic Progressive Processing Task Structures<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 17th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Seattle, Washington, <\/span><span class=\"tp_pub_additional_year\">2001<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1053\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1053','tp_abstract')\" title=\"Show abstract\" 
style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1053\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1053','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1053\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1053','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1053\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:CMZWijcai01,<br \/>\r\ntitle = {Adaptive Control of Acyclic Progressive Processing Task Structures},<br \/>\r\nauthor = {Stephane Cardon and Abdel-Illah Mouaddib and Shlomo Zilberstein and Richard Washington},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CMZWijcai01.pdf},<br \/>\r\nyear  = {2001},<br \/>\r\ndate = {2001-01-01},<br \/>\r\nbooktitle = {Proceedings of the 17th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {701--706},<br \/>\r\naddress = {Seattle, Washington},<br \/>\r\nabstract = {The progressive processing model allows a system to trade off resource consumption against the quality of the outcome by mapping each activity to a graph of potential solution methods. In the past, only semi-linear graphs have been used. We examine the application of the model to control the operation of an autonomous rover which operates under tight resource constraints. The task structure is generalized to directed acyclic graphs for which the optimal schedule can be computed by solving a corresponding Markov decision problem. 
We evaluate the complexity of the solution analytically and experimentally and show that it provides a practical approach to building an adaptive controller for this application.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1053','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1053\" style=\"display:none;\"><div class=\"tp_abstract_entry\">The progressive processing model allows a system to trade off resource consumption against the quality of the outcome by mapping each activity to a graph of potential solution methods. In the past, only semi-linear graphs have been used. We examine the application of the model to control the operation of an autonomous rover which operates under tight resource constraints. The task structure is generalized to directed acyclic graphs for which the optimal schedule can be computed by solving a corresponding Markov decision problem. 
We evaluate the complexity of the solution analytically and experimentally and show that it provides a practical approach to building an adaptive controller for this application.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1053','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1053\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CMZWijcai01.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CMZWijcai01.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CMZWijcai01.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1053','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Zilberstein, Shlomo;  Mouaddib, Abdel-Illah<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1057','tp_links')\" style=\"cursor:pointer;\">Optimal Scheduling of Progressive Processing Tasks<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">International Journal of Approximate Reasoning (IJAR), <\/span><span class=\"tp_pub_additional_volume\">vol. 25, <\/span><span class=\"tp_pub_additional_number\">no. 3, <\/span><span class=\"tp_pub_additional_pages\">pp. 
169\u2013186, <\/span><span class=\"tp_pub_additional_year\">2000<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1057\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1057','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1057\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1057','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1057\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1057','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1057\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:ZMijar00,<br \/>\r\ntitle = {Optimal Scheduling of Progressive Processing Tasks},<br \/>\r\nauthor = {Shlomo Zilberstein and Abdel-Illah Mouaddib},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZMijar00.pdf},<br \/>\r\ndoi = {10.1016\/S0888-613X(00)00049-9},<br \/>\r\nyear  = {2000},<br \/>\r\ndate = {2000-01-01},<br \/>\r\njournal = {International Journal of Approximate Reasoning (IJAR)},<br \/>\r\nvolume = {25},<br \/>\r\nnumber = {3},<br \/>\r\npages = {169--186},<br \/>\r\nabstract = {Progressive processing is an approximate reasoning model that allows a system to satisfy a set of requests under time pressure by limiting the amount of processing allocated to each task based on a predefined hierarchical task structure. It is a useful model for a variety of real-time tasks such as information retrieval, automated diagnosis, or real-time image tracking and speech recognition. 
In performing these tasks it is often necessary to trade-off computational resources for quality of results. This paper addresses progressive processing of information retrieval requests that are characterized by high duration uncertainty associated with each computational unit and dynamic operation allowing new requests to be added at run-time. We introduce a new approach to scheduling the processing units by constructing and solving a particular Markov decision problem. The resulting policy is an optimal schedule for the progressive processing problem. Evaluation of the technique shows that it offers a significant improvement over existing heuristic scheduling techniques. Moreover, the framework presented in this paper can be applied to real-time scheduling of a wide variety of task structures other than progressive processing.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1057','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1057\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Progressive processing is an approximate reasoning model that allows a system to satisfy a set of requests under time pressure by limiting the amount of processing allocated to each task based on a predefined hierarchical task structure. It is a useful model for a variety of real-time tasks such as information retrieval, automated diagnosis, or real-time image tracking and speech recognition. In performing these tasks it is often necessary to trade-off computational resources for quality of results. 
This paper addresses progressive processing of information retrieval requests that are characterized by high duration uncertainty associated with each computational unit and dynamic operation allowing new requests to be added at run-time. We introduce a new approach to scheduling the processing units by constructing and solving a particular Markov decision problem. The resulting policy is an optimal schedule for the progressive processing problem. Evaluation of the technique shows that it offers a significant improvement over existing heuristic scheduling techniques. Moreover, the framework presented in this paper can be applied to real-time scheduling of a wide variety of task structures other than progressive processing.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1057','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1057\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZMijar00.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZMijar00.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZMijar00.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1016\/S0888-613X(00)00049-9\" title=\"Follow DOI:10.1016\/S0888-613X(00)00049-9\" target=\"_blank\">doi:10.1016\/S0888-613X(00)00049-9<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1057','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Zilberstein, Shlomo;  Mouaddib, Abdel-Illah<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1065','tp_links')\" style=\"cursor:pointer;\">Reactive Control of Dynamic Progressive 
Processing<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Stockholm, Sweden, <\/span><span class=\"tp_pub_additional_year\">1999<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1065\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1065','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1065\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1065','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1065\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1065','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1065\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:ZMijcai99,<br \/>\r\ntitle = {Reactive Control of Dynamic Progressive Processing},<br \/>\r\nauthor = {Shlomo Zilberstein and Abdel-Illah Mouaddib},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZMijcai99.pdf},<br \/>\r\nyear  = {1999},<br \/>\r\ndate = {1999-01-01},<br \/>\r\nbooktitle = {Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {1268--1273},<br \/>\r\naddress = {Stockholm, Sweden},<br \/>\r\nabstract = {Progressive processing is a model of computation that allows a system to tradeoff computational resources against the quality of results. This paper generalizes the existing model to make it suitable for dynamic composition of information retrieval techniques. 
The new framework addresses effectively the uncertainty associated with the duration and output quality of each component. We show how to construct an optimal meta-level controller for a single task based on solving a corresponding Markov decision problem, and how to extend the solution to the case of multiple and dynamic tasks using the notion of an opportunity cost.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1065','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1065\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Progressive processing is a model of computation that allows a system to tradeoff computational resources against the quality of results. This paper generalizes the existing model to make it suitable for dynamic composition of information retrieval techniques. The new framework addresses effectively the uncertainty associated with the duration and output quality of each component. 
We show how to construct an optimal meta-level controller for a single task based on solving a corresponding Markov decision problem, and how to extend the solution to the case of multiple and dynamic tasks using the notion of an opportunity cost.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1065','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1065\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZMijcai99.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZMijcai99.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZMijcai99.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1065','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Mouaddib, Abdel-Illah;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1070','tp_links')\" style=\"cursor:pointer;\">Optimal Scheduling of Dynamic Progressive Processing<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 13th European Conference on Artificial Intelligence (ECAI), <\/span><span class=\"tp_pub_additional_address\">Brighton, UK, <\/span><span class=\"tp_pub_additional_year\">1998<\/span><span class=\"tp_pub_additional_note\">, (Best Paper Award)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1070\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1070','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a 
id=\"tp_links_sh_1070\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1070','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1070\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1070','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1070\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MZecai98,<br \/>\r\ntitle = {Optimal Scheduling of Dynamic Progressive Processing},<br \/>\r\nauthor = {Abdel-Illah Mouaddib and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZecai98.pdf},<br \/>\r\nyear  = {1998},<br \/>\r\ndate = {1998-01-01},<br \/>\r\nbooktitle = {Proceedings of the 13th European Conference on Artificial Intelligence (ECAI)},<br \/>\r\npages = {499--503},<br \/>\r\naddress = {Brighton, UK},<br \/>\r\nabstract = {Progressive processing allows a system to satisfy a set of requests under time pressure by limiting the amount of processing allocated to each task based on a predefined hierarchical task structure. It is a useful model for a variety of real-time AI tasks such as diagnosis and planning in which it is necessary to trade-off computational resources for quality of results. This paper addresses progressive processing of information retrieval requests that are characterized by high duration uncertainty associated with each computational unit and dynamic operation allowing new requests to be added at run-time. We introduce a new approach to scheduling the processing units by constructing and solving a particular Markov decision problem. The resulting policy is an optimal schedule for the progressive processing problem. 
Finally, we evaluate the technique and show that it offers a significant improvement over existing heuristic scheduling techniques.},<br \/>\r\nnote = {Best Paper Award},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1070','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1070\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Progressive processing allows a system to satisfy a set of requests under time pressure by limiting the amount of processing allocated to each task based on a predefined hierarchical task structure. It is a useful model for a variety of real-time AI tasks such as diagnosis and planning in which it is necessary to trade-off computational resources for quality of results. This paper addresses progressive processing of information retrieval requests that are characterized by high duration uncertainty associated with each computational unit and dynamic operation allowing new requests to be added at run-time. We introduce a new approach to scheduling the processing units by constructing and solving a particular Markov decision problem. The resulting policy is an optimal schedule for the progressive processing problem. 
Finally, we evaluate the technique and show that it offers a significant improvement over existing heuristic scheduling techniques.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1070','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1070\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZecai98.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZecai98.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZecai98.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1070','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Mouaddib, Abdel-Illah;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1075','tp_links')\" style=\"cursor:pointer;\">Handling Duration Uncertainty in Meta-Level Control of Progressive Processing<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Nagoya, Japan, <\/span><span class=\"tp_pub_additional_year\">1997<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1075\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1075','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1075\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1075','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1075\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1075','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1075\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MZijcai97,<br \/>\r\ntitle = {Handling Duration Uncertainty in Meta-Level Control of Progressive Processing},<br \/>\r\nauthor = {Abdel-Illah Mouaddib and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZijcai97.pdf},<br \/>\r\nyear  = {1997},<br \/>\r\ndate = {1997-01-01},<br \/>\r\nbooktitle = {Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {1201--1207},<br \/>\r\naddress = {Nagoya, Japan},<br \/>\r\nabstract = {Progressive processing is a resource-bounded reasoning technique that allows a system to incrementally construct a solution to a problem using a hierarchy of processing levels. This paper focuses on the problem of meta-level control of progressive processing in domains characterized by rapid change and high level of duration uncertainty. We show that progressive processing facilitates efficient run-time monitoring and meta-level control. Our solution is based on an incremental scheduler that can handle duration uncertainty by dynamically revising the schedule during execution time based on run-time information. We also show that a probabilistic representation of duration uncertainty reduces the frequency of schedule revisions and thus improves the performance of the system. 
Finally, an experimental evaluation shows the contributions of this approach and its suitability for a data transmission application.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1075','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1075\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Progressive processing is a resource-bounded reasoning technique that allows a system to incrementally construct a solution to a problem using a hierarchy of processing levels. This paper focuses on the problem of meta-level control of progressive processing in domains characterized by rapid change and high level of duration uncertainty. We show that progressive processing facilitates efficient run-time monitoring and meta-level control. Our solution is based on an incremental scheduler that can handle duration uncertainty by dynamically revising the schedule during execution time based on run-time information. We also show that a probabilistic representation of duration uncertainty reduces the frequency of schedule revisions and thus improves the performance of the system. 
Finally, an experimental evaluation shows the contributions of this approach and its suitability for a data transmission application.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1075','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1075\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZijcai97.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZijcai97.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZijcai97.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1075','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1078','tp_links')\" style=\"cursor:pointer;\">Resource-Bounded Sensing and Planning in Autonomous Systems<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Autonomous Robots, <\/span><span class=\"tp_pub_additional_volume\">vol. 3, <\/span><span class=\"tp_pub_additional_pages\">pp. 
31\u201348, <\/span><span class=\"tp_pub_additional_year\">1996<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1078\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1078','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1078\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1078','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1078\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1078','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1078\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:Zar96,<br \/>\r\ntitle = {Resource-Bounded Sensing and Planning in Autonomous Systems},<br \/>\r\nauthor = {Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zar96.pdf},<br \/>\r\nyear  = {1996},<br \/>\r\ndate = {1996-01-01},<br \/>\r\njournal = {Autonomous Robots},<br \/>\r\nvolume = {3},<br \/>\r\npages = {31--48},<br \/>\r\nabstract = {This paper is concerned with the implications of limited computational resources and uncertainty on the design of autonomous systems. To address this problem, we redefine the principal role of sensor interpretation and planning processes. Following Agre and Chapman's plan-as-communication approach, sensing and planning are treated as computational processes that provide information to an execution architecture and thus improve the overall performance of the system. We argue that autonomous systems must be able to trade off the quality of this information with the computational resources required to produce it. 
Anytime algorithms, whose quality of results improves gradually as computation time increases, provide useful performance components for time-critical sensing and planning in robotic systems. In our earlier work, we introduced a compilation scheme for optimal composition of anytime algorithms. This paper demonstrates the applicability of the compilation technique to the construction of autonomous systems. The result is a flexible approach to construct systems that can operate robustly in real-time by exploiting the tradeoff between time and quality in planning, sensing and plan execution.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1078','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1078\" style=\"display:none;\"><div class=\"tp_abstract_entry\">This paper is concerned with the implications of limited computational resources and uncertainty on the design of autonomous systems. To address this problem, we redefine the principal role of sensor interpretation and planning processes. Following Agre and Chapman's plan-as-communication approach, sensing and planning are treated as computational processes that provide information to an execution architecture and thus improve the overall performance of the system. We argue that autonomous systems must be able to trade off the quality of this information with the computational resources required to produce it. Anytime algorithms, whose quality of results improves gradually as computation time increases, provide useful performance components for time-critical sensing and planning in robotic systems. In our earlier work, we introduced a compilation scheme for optimal composition of anytime algorithms. This paper demonstrates the applicability of the compilation technique to the construction of autonomous systems. 
The result is a flexible approach to construct systems that can operate robustly in real-time by exploiting the tradeoff between time and quality in planning, sensing and plan execution.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1078','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1078\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zar96.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zar96.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zar96.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1078','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Hansen, Eric A;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1083','tp_links')\" style=\"cursor:pointer;\">Monitoring the Progress of Anytime Problem-Solving<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 13th National Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">Portland, Oregon, <\/span><span class=\"tp_pub_additional_year\">1996<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1083\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1083','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1083\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1083','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1083\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1083','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1083\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:HZaaai96,<br \/>\r\ntitle = {Monitoring the Progress of Anytime Problem-Solving},<br \/>\r\nauthor = {Eric A Hansen and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaaai96.pdf},<br \/>\r\nyear  = {1996},<br \/>\r\ndate = {1996-01-01},<br \/>\r\nbooktitle = {Proceedings of the 13th National Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {1229--1234},<br \/>\r\naddress = {Portland, Oregon},<br \/>\r\nabstract = {Anytime algorithms offer a tradeoff between solution quality and computation time that has proved useful in applying artificial intelligence techniques to time-critical problems. To exploit this tradeoff, a system must be able to determine the best time to stop deliberation and act on the currently available solution. When the rate of improvement of solution quality is uncertain, monitoring the progress of the algorithm can improve the utility of the system. This paper introduces a technique for run-time monitoring of anytime algorithms that is sensitive to the variance of the algorithm's performance, the time-dependent utility of a solution, the ability of the run-time monitor to estimate the quality of the currently available solution, and the cost of monitoring. 
The paper examines the conditions under which the technique is optimal and demonstrates its applicability.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1083','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1083\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Anytime algorithms offer a tradeoff between solution quality and computation time that has proved useful in applying artificial intelligence techniques to time-critical problems. To exploit this tradeoff, a system must be able to determine the best time to stop deliberation and act on the currently available solution. When the rate of improvement of solution quality is uncertain, monitoring the progress of the algorithm can improve the utility of the system. This paper introduces a technique for run-time monitoring of anytime algorithms that is sensitive to the variance of the algorithm's performance, the time-dependent utility of a solution, the ability of the run-time monitor to estimate the quality of the currently available solution, and the cost of monitoring. 
The paper examines the conditions under which the technique is optimal and demonstrates its applicability.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1083','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1083\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaaai96.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaaai96.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaaai96.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1083','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1089','tp_links')\" style=\"cursor:pointer;\">Optimizing Decision Quality with Contract Algorithms<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Montreal, Canada, <\/span><span class=\"tp_pub_additional_year\">1995<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1089\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1089','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1089\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1089','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1089\" 
class=\"tp_show\" onclick=\"teachpress_pub_showhide('1089','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1089\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:Zijcai95,<br \/>\r\ntitle = {Optimizing Decision Quality with Contract Algorithms},<br \/>\r\nauthor = {Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zijcai95.pdf},<br \/>\r\nyear  = {1995},<br \/>\r\ndate = {1995-01-01},<br \/>\r\nbooktitle = {Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {1576--1582},<br \/>\r\naddress = {Montreal, Canada},<br \/>\r\nabstract = {Contract algorithms offer a tradeoff between output quality and computation time, provided that the amount of computation time is determined prior to their activation. Originally, they were introduced as an intermediate step in the composition of interruptible anytime algorithms. However, for many real-time tasks such as information gathering, game playing, and a large class of planning problems, contract algorithms offer an ideal mechanism to optimize decision quality. This paper extends previous results regarding the meta-level control of contract algorithms by handling a more general type of performance description. The output quality of each contract algorithm is described by a probabilistic (rather than deterministic) conditional performance profile. Such profiles map input quality and computation time to a probability distribution of output quality. 
The composition problem is solved by an efficient off-line compilation technique that simplifies the run-time monitoring task.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1089','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1089\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Contract algorithms offer a tradeoff between output quality and computation time, provided that the amount of computation time is determined prior to their activation. Originally, they were introduced as an intermediate step in the composition of interruptible anytime algorithms. However, for many real-time tasks such as information gathering, game playing, and a large class of planning problems, contract algorithms offer an ideal mechanism to optimize decision quality. This paper extends previous results regarding the meta-level control of contract algorithms by handling a more general type of performance description. The output quality of each contract algorithm is described by a probabilistic (rather than deterministic) conditional performance profile. Such profiles map input quality and computation time to a probability distribution of output quality. 
The composition problem is solved by an efficient off-line compilation technique that simplifies the run-time monitoring task.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1089','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1089\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zijcai95.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zijcai95.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zijcai95.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1089','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_phdthesis\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1094','tp_links')\" style=\"cursor:pointer;\">Operational Rationality through Compilation of Anytime Algorithms<\/a> <span class=\"tp_pub_type tp_  phdthesis\">PhD Thesis<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_school\">Computer Science Division, University of California Berkeley, <\/span><span class=\"tp_pub_additional_year\">1993<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1094\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1094','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1094\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1094','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1094\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('1094','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1094\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@phdthesis{SZ:Zshort93,<br \/>\r\ntitle = {Operational Rationality through Compilation of Anytime Algorithms},<br \/>\r\nauthor = {Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zshort93.pdf},<br \/>\r\nyear  = {1993},<br \/>\r\ndate = {1993-01-01},<br \/>\r\nschool = {Computer Science Division, University of California Berkeley},<br \/>\r\nabstract = {An important and largely ignored aspect of real-time decision making is the capability of agents to factor the cost of deliberation into the decision making process. I have developed an efficient model that creates this capability. The model uses as basic components anytime algorithms whose quality of results improves gradually as computation time increases. The main contribution of this work is a compilation process that extends the property of gradual improvement from the level of single algorithms to the level of complex systems. <br \/>\r\nIn standard algorithms, the fixed quality of the output allows for composition to be implemented by a simple call-return mechanism. However, when algorithms have resource allocation as a degree of freedom, there arises the question of how to construct, for example, the optimal composition of two anytime algorithms, one of which feeds its output to the other. This scheduling problem is solved by an off-line compilation process and a run-time monitoring component that together generate a utility maximizing behavior. The crucial meta-level knowledge is kept in the anytime library in the form of conditional performance profiles. These profiles characterize the performance of each elementary anytime algorithm as a function of run-time and input quality. 
The compilation process therefore extends the principles of procedural abstraction and modularity to anytime computation. Its efficiency is significantly improved by using local compilation that works on a single program structure at a time. Local compilation is proved to yield global optimality for a large set of program structures. <br \/>\r\nCompilation produces contract algorithms which require the determination of the total run-time when activated. Some real-time domains require interruptible algorithms whose total run-time is unknown in advance. An important result of this work is a general method by which an interruptible algorithm can be constructed once a contract algorithm is compiled. Finally, the notion of gradual improvement of quality is extended to sensing and plan execution and the application of the model is demonstrated through a simulated robot navigation system. The result is a modular approach for developing real-time agents that act by performing anytime actions and make decisions using anytime computation.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {phdthesis}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1094','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1094\" style=\"display:none;\"><div class=\"tp_abstract_entry\">An important and largely ignored aspect of real-time decision making is the capability of agents to factor the cost of deliberation into the decision making process. I have developed an efficient model that creates this capability. The model uses as basic components anytime algorithms whose quality of results improves gradually as computation time increases. The main contribution of this work is a compilation process that extends the property of gradual improvement from the level of single algorithms to the level of complex systems. 
<br \/>\r\nIn standard algorithms, the fixed quality of the output allows for composition to be implemented by a simple call-return mechanism. However, when algorithms have resource allocation as a degree of freedom, there arises the question of how to construct, for example, the optimal composition of two anytime algorithms, one of which feeds its output to the other. This scheduling problem is solved by an off-line compilation process and a run-time monitoring component that together generate a utility maximizing behavior. The crucial meta-level knowledge is kept in the anytime library in the form of conditional performance profiles. These profiles characterize the performance of each elementary anytime algorithm as a function of run-time and input quality. The compilation process therefore extends the principles of procedural abstraction and modularity to anytime computation. Its efficiency is significantly improved by using local compilation that works on a single program structure at a time. Local compilation is proved to yield global optimality for a large set of program structures. <br \/>\r\nCompilation produces contract algorithms which require the determination of the total run-time when activated. Some real-time domains require interruptible algorithms whose total run-time is unknown in advance. An important result of this work is a general method by which an interruptible algorithm can be constructed once a contract algorithm is compiled. Finally, the notion of gradual improvement of quality is extended to sensing and plan execution and the application of the model is demonstrated through a simulated robot navigation system. 
The result is a modular approach for developing real-time agents that act by performing anytime actions and make decisions using anytime computation.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1094','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1094\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zshort93.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zshort93.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zshort93.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1094','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Zilberstein, Shlomo;  Russell, Stuart J<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1095','tp_links')\" style=\"cursor:pointer;\">Anytime Sensing, Planning and Action: A Practical Model for Robot Control<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 13th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Chambery, France, <\/span><span class=\"tp_pub_additional_year\">1993<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1095\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1095','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1095\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1095','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1095\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1095','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1095\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:ZRijcai93,<br \/>\r\ntitle = {Anytime Sensing, Planning and Action: A Practical Model for Robot Control},<br \/>\r\nauthor = {Shlomo Zilberstein and Stuart J Russell},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZRijcai93.pdf},<br \/>\r\nyear  = {1993},<br \/>\r\ndate = {1993-01-01},<br \/>\r\nbooktitle = {Proceedings of the 13th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {1402--1407},<br \/>\r\naddress = {Chambery, France},<br \/>\r\nabstract = {Anytime algorithms, whose quality of results improves gradually as computation time increases, provide useful performance components for time-critical planning and control of robotic systems. In earlier work, we introduced a compilation scheme for optimal composition of anytime algorithms. In this paper we present an implementation of a navigation system in which an off-line compilation process and a run-time monitoring component guarantee the optimal allocation of time to the anytime modules. The crucial meta-level knowledge is kept in the anytime library in the form of conditional performance profiles. We also extend the notion of gradual improvement to sensing and plan execution. 
The result is an efficient, flexible control for robotic systems that exploits the tradeoff between time and quality in planning, sensing and plan execution.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1095','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1095\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Anytime algorithms, whose quality of results improves gradually as computation time increases, provide useful performance components for time-critical planning and control of robotic systems. In earlier work, we introduced a compilation scheme for optimal composition of anytime algorithms. In this paper we present an implementation of a navigation system in which an off-line compilation process and a run-time monitoring component guarantee the optimal allocation of time to the anytime modules. The crucial meta-level knowledge is kept in the anytime library in the form of conditional performance profiles. We also extend the notion of gradual improvement to sensing and plan execution. 
The result is an efficient, flexible control for robotic systems that exploits the tradeoff between time and quality in planning, sensing and plan execution.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1095','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1095\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZRijcai93.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZRijcai93.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ZRijcai93.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1095','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><\/table><\/div><\/div>\n<div><\/div><\/div><\/div>\n<\/div>\n<h3><span style=\"color: #264278\"><b>Scalable Algorithms for Probabilistic Reasoning<\/b><\/span><\/h3>\n<div>\n<div>How can AI systems cope with uncertainty in large sequential decision problems, and how to leverage heuristic search and reachability analysis to solve complex probabilistic planning problems?<\/div>\n<div><div class=\"bg-margin-for-link\"><input type='hidden' bg_collapse_expand='69d0b4f81f5261093154562' value='69d0b4f81f5261093154562'><input type='hidden' id='bg-show-more-text-69d0b4f81f5261093154562' value='Show Related Publications'><input type='hidden' id='bg-show-less-text-69d0b4f81f5261093154562' value='Hide Related Publications'><a id='bg-showmore-action-69d0b4f81f5261093154562' class='bg-showmore-plg-link bg-arrow '  style=\" color:#7C2622;;\" href='#'>Show Related Publications<\/a><div id='bg-showmore-hidden-69d0b4f81f5261093154562' ><div class=\"teachpress_pub_list\"><form name=\"tppublistform\" method=\"get\"><a name=\"tppubs\" id=\"tppubs\"><\/a><\/form><table class=\"teachpress_publication_list\"><tr class=\"tp_publication 
tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Miura, Shuwa;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1188','tp_links')\" style=\"cursor:pointer;\">Observer-Aware Planning with Implicit and Explicit Communication<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Auckland, New Zealand, <\/span><span class=\"tp_pub_additional_year\">2024<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1188\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1188','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1188\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1188','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1188\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1188','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1188\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MZaamas24,<br \/>\r\ntitle = {Observer-Aware Planning with Implicit and Explicit Communication},<br \/>\r\nauthor = {Shuwa Miura and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas24.pdf},<br \/>\r\nyear  = {2024},<br \/>\r\ndate = {2024-01-01},<br \/>\r\nbooktitle = {Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},<br \/>\r\naddress = {Auckland, New Zealand},<br \/>\r\nabstract = {This paper
presents a computational model designed for planning both implicit and explicit communication of intentions, goals, and desires. Building upon previous research focused on implicit communication of intention via actions, our model seeks to strategically influence an observer\u2019s belief using both the agent\u2019s actions and explicit messages. We show that our proposed model can be considered to be a special case of general multi-agent problems with explicit communication under certain assumptions. Since the mental state of the observer depends on histories, computing a policy for the proposed model amounts to optimizing a non-Markovian objective, which we show to be intractable in the worst case. To mitigate this challenge, we propose a technique based on splitting domain and communication actions during planning. We conclude with experimental evaluations of the proposed approach that illustrate its effectiveness.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1188','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1188\" style=\"display:none;\"><div class=\"tp_abstract_entry\">This paper presents a computational model designed for planning both implicit and explicit communication of intentions, goals, and desires. Building upon previous research focused on implicit communication of intention via actions, our model seeks to strategically influence an observer\u2019s belief using both the agent\u2019s actions and explicit messages. We show that our proposed model can be considered to be a special case of general multi-agent problems with explicit communication under certain assumptions. 
Since the mental state of the observer depends on histories, computing a policy for the proposed model amounts to optimizing a non-Markovian objective, which we show to be intractable in the worst case. To mitigate this challenge, we propose a technique based on splitting domain and communication actions during planning. We conclude with experimental evaluations of the proposed approach that illustrate its effectiveness.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1188','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1188\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas24.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas24.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas24.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1188','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Bhatia, Abhinav;  Nashed, Samer B.;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1172','tp_links')\" style=\"cursor:pointer;\">RL3: Boosting Meta Reinforcement Learning via RL inside RL2<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">NeurIPS Workshop on Generalized Planning (GenPlan), <\/span><span class=\"tp_pub_additional_address\">New Orleans, Louisiana, <\/span><span class=\"tp_pub_additional_year\">2023<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1172\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1172','tp_abstract')\" title=\"Show 
abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1172\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1172','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1172\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1172','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1172\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:BNZgenplan23,<br \/>\r\ntitle = {RL3: Boosting Meta Reinforcement Learning via RL inside RL2},<br \/>\r\nauthor = {Abhinav Bhatia and Samer B. Nashed and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BNZgenplan23.pdf},<br \/>\r\nyear  = {2023},<br \/>\r\ndate = {2023-01-01},<br \/>\r\nurldate = {2023-01-01},<br \/>\r\nbooktitle = {NeurIPS Workshop on Generalized Planning (GenPlan)},<br \/>\r\naddress = {New Orleans, Louisiana},<br \/>\r\nabstract = {Meta reinforcement learning (meta-RL) methods such as RL2 have emerged as promising approaches for learning data-efficient RL algorithms tailored to a given task distribution. However, these RL algorithms struggle with long-horizon tasks and out-of-distribution tasks since they rely on recurrent neural networks to pro- cess the sequence of experiences instead of summarizing them into general RL components such as value functions. Moreover, even transformers have a practical limit to the length of histories they can efficiently reason about before training and inference costs become prohibitive. In contrast, traditional RL algorithms are data-inefficient since they do not leverage domain knowledge, but they do converge to an optimal policy as more data becomes available. 
In this paper, we propose RL3, a principled hybrid approach that combines traditional RL and meta-RL by incorporating task-specific action-values learned through traditional RL as an input to the meta-RL neural network. We show that RL3 earns greater cumulative reward on long-horizon and out-of-distribution tasks compared to RL2, while maintaining the efficiency of the latter in the short term. Experiments are conducted on both custom and benchmark discrete domains from the meta-RL literature that exhibit a range of short-term, long-term, and complex dependencies.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1172','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1172\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Meta reinforcement learning (meta-RL) methods such as RL2 have emerged as promising approaches for learning data-efficient RL algorithms tailored to a given task distribution. However, these RL algorithms struggle with long-horizon tasks and out-of-distribution tasks since they rely on recurrent neural networks to process the sequence of experiences instead of summarizing them into general RL components such as value functions. Moreover, even transformers have a practical limit to the length of histories they can efficiently reason about before training and inference costs become prohibitive. In contrast, traditional RL algorithms are data-inefficient since they do not leverage domain knowledge, but they do converge to an optimal policy as more data becomes available. In this paper, we propose RL3, a principled hybrid approach that combines traditional RL and meta-RL by incorporating task-specific action-values learned through traditional RL as an input to the meta-RL neural network. 
We show that RL3 earns greater cumulative reward on long-horizon and out-of-distribution tasks compared to RL2, while maintaining the efficiency of the latter in the short term. Experiments are conducted on both custom and benchmark discrete domains from the meta-RL literature that exhibit a range of short-term, long-term, and complex dependencies.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1172','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1172\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BNZgenplan23.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BNZgenplan23.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BNZgenplan23.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1172','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_incollection\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Mahmud, Saaduddin;  Nashed, Samer B.;  Goldman, Claudia V.;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1173','tp_links')\" style=\"cursor:pointer;\">Estimating Causal Responsibility for Explaining Autonomous Behavior<\/a> <span class=\"tp_pub_type tp_  incollection\">Book Section<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span> Calvaresi, Davide (Ed.): <span class=\"tp_pub_additional_booktitle\">International Workshop on Explainable and Transparent AI and Multi-Agent Systems (EXTRAAMAS), <\/span><span class=\"tp_pub_additional_pages\">pp. 
78\u201394, <\/span><span class=\"tp_pub_additional_publisher\">Springer, <\/span><span class=\"tp_pub_additional_year\">2023<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1173\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1173','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1173\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1173','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1173\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1173','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1173\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@incollection{SZ:MNGZextraamas23,<br \/>\r\ntitle = {Estimating Causal Responsibility for Explaining Autonomous Behavior},<br \/>\r\nauthor = {Saaduddin Mahmud and Samer B. Nashed and Claudia V. Goldman and Shlomo Zilberstein},<br \/>\r\neditor = {Davide Calvaresi},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MNGZextraamas23.pdf},<br \/>\r\ndoi = {10.1007\/978-3-031-40878-6},<br \/>\r\nyear  = {2023},<br \/>\r\ndate = {2023-01-01},<br \/>\r\nbooktitle = {International Workshop on Explainable and Transparent AI and Multi-Agent Systems (EXTRAAMAS)},<br \/>\r\npages = {78\u201394},<br \/>\r\npublisher = {Springer},<br \/>\r\nabstract = {There has been growing interest in causal explanations of stochastic, sequential decision-making systems. Structural causal models and causal reasoning offer several theoretical benefits when exact inference can be applied. Furthermore, users overwhelmingly prefer the resulting causal explanations over other state-of-the-art systems. 
In this work, we focus on one such method, MeanRESP, and its approximate versions that drastically reduce compute load and assign a responsibility score to each variable, which helps identify smaller sets of causes to be used as explanations. However, this method, and its approximate versions in particular, lack deeper theoretical analysis and broader empirical tests. To address these shortcomings, we provide three primary contributions. First, we offer several theoretical insights on the sample complexity and error rate of approximate MeanRESP. Second, we discuss several automated metrics for comparing explanations generated from approximate methods to those generated via exact methods. While we recognize the significance of user studies as the gold standard for evaluating explanations, our aim is to leverage the proposed metrics to systematically compare explanation-generation methods along important quantitative dimensions. Finally, we provide a more detailed discussion of MeanRESP and how its output under different definitions of responsibility compares to existing widely adopted methods that use Shapley values.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {incollection}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1173','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1173\" style=\"display:none;\"><div class=\"tp_abstract_entry\">There has been growing interest in causal explanations of stochastic, sequential decision-making systems. Structural causal models and causal reasoning offer several theoretical benefits when exact inference can be applied. Furthermore, users overwhelmingly prefer the resulting causal explanations over other state-of-the-art systems. 
In this work, we focus on one such method, MeanRESP, and its approximate versions that drastically reduce compute load and assign a responsibility score to each variable, which helps identify smaller sets of causes to be used as explanations. However, this method, and its approximate versions in particular, lack deeper theoretical analysis and broader empirical tests. To address these shortcomings, we provide three primary contributions. First, we offer several theoretical insights on the sample complexity and error rate of approximate MeanRESP. Second, we discuss several automated metrics for comparing explanations generated from approximate methods to those generated via exact methods. While we recognize the significance of user studies as the gold standard for evaluating explanations, our aim is to leverage the proposed metrics to systematically compare explanation-generation methods along important quantitative dimensions. Finally, we provide a more detailed discussion of MeanRESP and how its output under different definitions of responsibility compares to existing widely adopted methods that use Shapley values.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1173','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1173\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MNGZextraamas23.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MNGZextraamas23.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MNGZextraamas23.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1007\/978-3-031-40878-6\" title=\"Follow DOI:10.1007\/978-3-031-40878-6\" target=\"_blank\">doi:10.1007\/978-3-031-40878-6<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" 
onclick=\"teachpress_pub_showhide('1173','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Nashed, Samer;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1155','tp_links')\" style=\"cursor:pointer;\">A Survey of Opponent Modeling in Adversarial Domains<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Journal of Artificial Intelligence Research (JAIR), <\/span><span class=\"tp_pub_additional_volume\">vol. 73, <\/span><span class=\"tp_pub_additional_pages\">pp. 277\u2013327, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1155\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1155','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1155\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1155','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1155\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1155','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1155\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:NZjair22,<br \/>\r\ntitle = {A Survey of Opponent Modeling in Adversarial Domains},<br \/>\r\nauthor = {Samer Nashed and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/jair.org\/index.php\/jair\/article\/view\/12889\/26762},<br \/>\r\ndoi = {10.1613\/jair.1.12889},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\njournal = {Journal of 
Artificial Intelligence Research (JAIR)},<br \/>\r\nvolume = {73},<br \/>\r\npages = {277--327},<br \/>\r\nabstract = {Opponent modeling is the ability to use prior knowledge and observations in order to predict the behavior of an opponent. This survey presents a comprehensive overview of existing opponent modeling techniques for adversarial domains, many of which must address stochastic, continuous, or concurrent actions, and sparse, partially observable payoff structures. We discuss all the components of opponent modeling systems, including feature extraction, learning algorithms, and strategy abstractions. These discussions lead us to propose a new form of analysis for describing and predicting the evolution of game states over time. We then introduce a new framework that facilitates method comparison, analyze a representative selection of techniques using the proposed framework, and highlight common trends among recently proposed methods. Finally, we list several open problems and discuss future research directions inspired by AI research on opponent modeling and related research in other disciplines.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1155','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1155\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Opponent modeling is the ability to use prior knowledge and observations in order to predict the behavior of an opponent. This survey presents a comprehensive overview of existing opponent modeling techniques for adversarial domains, many of which must address stochastic, continuous, or concurrent actions, and sparse, partially observable payoff structures. We discuss all the components of opponent modeling systems, including feature extraction, learning algorithms, and strategy abstractions. 
These discussions lead us to propose a new form of analysis for describing and predicting the evolution of game states over time. We then introduce a new framework that facilitates method comparison, analyze a representative selection of techniques using the proposed framework, and highlight common trends among recently proposed methods. Finally, we list several open problems and discuss future research directions inspired by AI research on opponent modeling and related research in other disciplines.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1155','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1155\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/jair.org\/index.php\/jair\/article\/view\/12889\/26762\" title=\"https:\/\/jair.org\/index.php\/jair\/article\/view\/12889\/26762\" target=\"_blank\">https:\/\/jair.org\/index.php\/jair\/article\/view\/12889\/26762<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1613\/jair.1.12889\" title=\"Follow DOI:10.1613\/jair.1.12889\" target=\"_blank\">doi:10.1613\/jair.1.12889<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1155','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Zilberstein, Shlomo;  Kamar, Ece<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1157','tp_links')\" style=\"cursor:pointer;\">Avoiding Negative Side Effects of Autonomous Systems in the Open World<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span 
class=\"tp_pub_additional_journal\">Journal of Artificial Intelligence Research (JAIR), <\/span><span class=\"tp_pub_additional_volume\">vol. 74, <\/span><span class=\"tp_pub_additional_pages\">pp. 143\u2013177, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1157\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1157','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1157\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1157','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1157\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1157','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1157\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:SZKjair22,<br \/>\r\ntitle = {Avoiding Negative Side Effects of Autonomous Systems in the Open World},<br \/>\r\nauthor = {Sandhya Saisubramanian and Shlomo Zilberstein and Ece Kamar},<br \/>\r\nurl = {https:\/\/www.jair.org\/index.php\/jair\/article\/view\/13581\/26799},<br \/>\r\ndoi = {10.1613\/jair.1.13581},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\nurldate = {2022-01-01},<br \/>\r\njournal = {Journal of Artificial Intelligence Research (JAIR)},<br \/>\r\nvolume = {74},<br \/>\r\npages = {143--177},<br \/>\r\nabstract = {Autonomous systems that operate in the open world often use incomplete models of their environment. Model incompleteness is inevitable due to the practical limitations in precise model specification and data collection about open-world environments. Due to the limited fidelity of the model, agent actions may produce negative side effects (NSEs) when deployed. 
Negative side effects are undesirable, unmodeled effects of agent actions on the environment. NSEs are inherently challenging to identify at design time and may affect the reliability, usability and safety of the system. We present two complementary approaches to mitigate the NSE via: (1) learning from feedback, and (2) environment shaping. The solution approaches target settings with different assumptions and agent responsibilities. In learning from feedback, the agent learns a penalty function associated with a NSE. We investigate the efficiency of different feedback mechanisms, including human feedback and autonomous exploration. The problem is formulated as a multi-objective Markov decision process such that optimizing the agent\u2019s assigned task is prioritized over mitigating NSE. A slack parameter denotes the maximum allowed deviation from the optimal expected reward for the agent\u2019s task in order to mitigate NSE. In environment shaping, we examine how a human can assist an agent, beyond providing feedback, and utilize their broader scope of knowledge to mitigate the impacts of NSE. We formulate the problem as a human-agent collaboration with decoupled objectives. The agent optimizes its assigned task and may produce NSE during its operation. The human assists the agent by performing modest reconfigurations of the environment so as to mitigate the impacts of NSE, without affecting the agent\u2019s ability to complete its assigned task. We present an algorithm for shaping and analyze its properties. 
Empirical evaluations demonstrate the trade-offs in the performance of different approaches in mitigating NSE in different settings.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1157','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1157\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous systems that operate in the open world often use incomplete models of their environment. Model incompleteness is inevitable due to the practical limitations in precise model specification and data collection about open-world environments. Due to the limited fidelity of the model, agent actions may produce negative side effects (NSEs) when deployed. Negative side effects are undesirable, unmodeled effects of agent actions on the environment. NSEs are inherently challenging to identify at design time and may affect the reliability, usability and safety of the system. We present two complementary approaches to mitigate the NSE via: (1) learning from feedback, and (2) environment shaping. The solution approaches target settings with different assumptions and agent responsibilities. In learning from feedback, the agent learns a penalty function associated with a NSE. We investigate the efficiency of different feedback mechanisms, including human feedback and autonomous exploration. The problem is formulated as a multi-objective Markov decision process such that optimizing the agent\u2019s assigned task is prioritized over mitigating NSE. A slack parameter denotes the maximum allowed deviation from the optimal expected reward for the agent\u2019s task in order to mitigate NSE. In environment shaping, we examine how a human can assist an agent, beyond providing feedback, and utilize their broader scope of knowledge to mitigate the impacts of NSE. 
We formulate the problem as a human-agent collaboration with decoupled objectives. The agent optimizes its assigned task and may produce NSE during its operation. The human assists the agent by performing modest reconfigurations of the environment so as to mitigate the impacts of NSE, without affecting the agent\u2019s ability to complete its assigned task. We present an algorithm for shaping and analyze its properties. Empirical evaluations demonstrate the trade-offs in the performance of different approaches in mitigating NSE in different settings.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1157','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1157\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/www.jair.org\/index.php\/jair\/article\/view\/13581\/26799\" title=\"https:\/\/www.jair.org\/index.php\/jair\/article\/view\/13581\/26799\" target=\"_blank\">https:\/\/www.jair.org\/index.php\/jair\/article\/view\/13581\/26799<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1613\/jair.1.13581\" title=\"Follow DOI:10.1613\/jair.1.13581\" target=\"_blank\">doi:10.1613\/jair.1.13581<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1157','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Rabiee, Sadegh;  Basich, Connor;  Wray, Kyle Hollins;  Zilberstein, Shlomo;  Biswas, Joydeep<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1158','tp_links')\" style=\"cursor:pointer;\">Competence-Aware Path Planning Via Introspective Perception<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span 
class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">IEEE Robotics and Automation Letters, <\/span><span class=\"tp_pub_additional_volume\">vol. 7, <\/span><span class=\"tp_pub_additional_number\">no. 2, <\/span><span class=\"tp_pub_additional_pages\">pp. 3218\u20133225, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1158\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1158','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1158\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1158','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1158\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1158','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1158\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:RBWZBlra22,<br \/>\r\ntitle = {Competence-Aware Path Planning Via Introspective Perception},<br \/>\r\nauthor = {Sadegh Rabiee and Connor Basich and Kyle Hollins Wray and Shlomo Zilberstein and Joydeep Biswas},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/RBWZBlra22.pdf},<br \/>\r\ndoi = {10.1109\/LRA.2022.3145517},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\njournal = {IEEE Robotics and Automation Letters},<br \/>\r\nvolume = {7},<br \/>\r\nnumber = {2},<br \/>\r\npages = {3218--3225},<br \/>\r\nabstract = {Robots deployed in the real world over extended periods of time need to reason about unexpected failures, learn to predict them, and to proactively take actions to avoid future failures. 
Existing approaches for competence-aware planning are either model-based, requiring explicit enumeration of known failure sources, or purely statistical, using state- and location-specific failure statistics to infer competence. We instead propose a structured model-free approach to competence-aware planning by reasoning about plan execution failures due to errors in perception, without requiring a priori enumeration of failure sources or requiring location-specific failure statistics. We introduce competence-aware path planning via introspective perception (CPIP), a Bayesian framework to iteratively learn and exploit task-level competence in novel deployment environments. CPIP factorizes the competence-aware planning problem into two components. First, perception errors are learned in a model-free and location-agnostic setting via introspective perception prior to deployment in novel environments. Second, during actual deployments, the prediction of task-level failures is learned in a context-aware setting. Experiments in a simulation show that the proposed CPIP approach outperforms the frequentist baseline in multiple mobile robot tasks, and is further validated via real robot experiments in environments with perceptually challenging obstacles and terrain.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1158','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1158\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Robots deployed in the real world over extended periods of time need to reason about unexpected failures, learn to predict them, and to proactively take actions to avoid future failures. 
Existing approaches for competence-aware planning are either model-based, requiring explicit enumeration of known failure sources, or purely statistical, using state- and location-specific failure statistics to infer competence. We instead propose a structured model-free approach to competence-aware planning by reasoning about plan execution failures due to errors in perception, without requiring a priori enumeration of failure sources or requiring location-specific failure statistics. We introduce competence-aware path planning via introspective perception (CPIP), a Bayesian framework to iteratively learn and exploit task-level competence in novel deployment environments. CPIP factorizes the competence-aware planning problem into two components. First, perception errors are learned in a model-free and location-agnostic setting via introspective perception prior to deployment in novel environments. Second, during actual deployments, the prediction of task-level failures is learned in a context-aware setting. 
Experiments in a simulation show that the proposed CPIP approach outperforms the frequentist baseline in multiple mobile robot tasks, and is further validated via real robot experiments in environments with perceptually challenging obstacles and terrain.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1158','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1158\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/RBWZBlra22.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/RBWZBlra22.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/RBWZBlra22.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1109\/LRA.2022.3145517\" title=\"Follow DOI:10.1109\/LRA.2022.3145517\" target=\"_blank\">doi:10.1109\/LRA.2022.3145517<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1158','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Svegliato, Justin;  Basich, Connor;  Saisubramanian, Sandhya;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1159','tp_links')\" style=\"cursor:pointer;\">Metareasoning for Safe Decision Making in Autonomous Systems<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), <\/span><span class=\"tp_pub_additional_address\">Philadelphia, Pennsylvania, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a 
id=\"tp_abstract_sh_1159\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1159','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1159\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1159','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1159\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1159','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1159\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SBSZicra22,<br \/>\r\ntitle = {Metareasoning for Safe Decision Making in Autonomous Systems},<br \/>\r\nauthor = {Justin Svegliato and Connor Basich and Sandhya Saisubramanian and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SBSZicra22.pdf},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},<br \/>\r\naddress = {Philadelphia, Pennsylvania},<br \/>\r\nabstract = {Although experts carefully specify the high-level decision-making models in autonomous systems, it is infeasible to guarantee safety across every scenario during operation. We therefore propose a safety metareasoning system that optimizes the severity of the system's safety concerns and the interference to the system's task: the system executes in parallel a task process that completes a specified task and safety processes that each address a specified safety concern with a conflict resolver for arbitration. 
This paper offers a formal definition of a safety metareasoning system, a recommendation algorithm for a safety process, an arbitration algorithm for a conflict resolver, an application of our approach to planetary rover exploration, and a demonstration that our approach is effective in simulation.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1159','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1159\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Although experts carefully specify the high-level decision-making models in autonomous systems, it is infeasible to guarantee safety across every scenario during operation. We therefore propose a safety metareasoning system that optimizes the severity of the system's safety concerns and the interference to the system's task: the system executes in parallel a task process that completes a specified task and safety processes that each address a specified safety concern with a conflict resolver for arbitration. 
This paper offers a formal definition of a safety metareasoning system, a recommendation algorithm for a safety process, an arbitration algorithm for a conflict resolver, an application of our approach to planetary rover exploration, and a demonstration that our approach is effective in simulation.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1159','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1159\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SBSZicra22.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SBSZicra22.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SBSZicra22.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1159','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Miura, Shuwa;  Wray, Kyle Hollins;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1161','tp_links')\" style=\"cursor:pointer;\">Heuristic Search for SSPs with Lexicographic Preferences over Multiple Costs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 15th Annual Symposium on Combinatorial Search (SOCS), <\/span><span class=\"tp_pub_additional_address\">Vienna, Austria, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1161\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1161','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span 
class=\"tp_resource_link\"><a id=\"tp_links_sh_1161\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1161','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1161\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1161','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1161\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MWZsocs22,<br \/>\r\ntitle = {Heuristic Search for SSPs with Lexicographic Preferences over Multiple Costs},<br \/>\r\nauthor = {Shuwa Miura and Kyle Hollins Wray and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MWZsocs22.pdf},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\nbooktitle = {Proceedings of the 15th Annual Symposium on Combinatorial Search (SOCS)},<br \/>\r\naddress = {Vienna, Austria},<br \/>\r\nabstract = {Real-world decision problems often involve multiple competing objectives. The Stochastic Shortest Path (SSP) with lexicographic preferences over multiple costs offers an expressive formulation for many practical problems. However, the existing solution methods either lack optimality guarantees or require costly computations over the entire state space. We propose the first heuristic search algorithm for this problem, based on the heuristic algorithm for Constrained SSPs. Our experiments show that our heuristic search algorithm can compute optimal policies while avoiding a large portion of the state space. 
We also analyze the theoretical properties of the problem, establishing the conditions under which SSPs with lexicographic preferences have a proper optimal policy.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1161','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1161\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Real-world decision problems often involve multiple competing objectives. The Stochastic Shortest Path (SSP) with lexicographic preferences over multiple costs offers an expressive formulation for many practical problems. However, the existing solution methods either lack optimality guarantees or require costly computations over the entire state space. We propose the first heuristic search algorithm for this problem, based on the heuristic algorithm for Constrained SSPs. Our experiments show that our heuristic search algorithm can compute optimal policies while avoiding a large portion of the state space. 
We also analyze the theoretical properties of the problem, establishing the conditions under which SSPs with lexicographic preferences have a proper optimal policy.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1161','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1161\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MWZsocs22.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MWZsocs22.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MWZsocs22.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1161','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Basich, Connor;  Peterson, John;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1164','tp_links')\" style=\"cursor:pointer;\">Planning with Intermittent State Observability: Knowing When to Act Blind<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), <\/span><span class=\"tp_pub_additional_address\">Kyoto, Japan, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1164\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1164','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1164\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1164','tp_links')\" title=\"Show links and 
resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1164\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1164','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1164\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:BPZiros22,<br \/>\r\ntitle = {Planning with Intermittent State Observability: Knowing When to Act Blind},<br \/>\r\nauthor = {Connor Basich and John Peterson and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BPZiros22.pdf},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)},<br \/>\r\npages = {11657--11664},<br \/>\r\naddress = {Kyoto, Japan},<br \/>\r\nabstract = {Contemporary planning models and methods often rely on constant availability of free state information at each step of execution. However, autonomous systems are increasingly deployed in the open world where state information may be costly or simply unavailable in certain situations. Failing to account for sensor limitations may lead to costly behavior or even catastrophic failure. While the partially observable Markov decision process (POMDP) can be used to model this problem, solving POMDPs is often intractable. We introduce a planning model called a semi-observable Markov decision process (SOMDP) specifically designed for MDPs where state observability may be intermittent. We propose an approach for solving SOMDPs that uses memory states to proactively plan for the potential loss of sensor information while exploiting the unique structure of SOMDPs. 
Our theoretical analysis and empirical evaluation demonstrate the advantages of SOMDPs relative to existing planning models.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1164','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1164\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Contemporary planning models and methods often rely on constant availability of free state information at each step of execution. However, autonomous systems are increasingly deployed in the open world where state information may be costly or simply unavailable in certain situations. Failing to account for sensor limitations may lead to costly behavior or even catastrophic failure. While the partially observable Markov decision process (POMDP) can be used to model this problem, solving POMDPs is often intractable. We introduce a planning model called a semi-observable Markov decision process (SOMDP) specifically designed for MDPs where state observability may be intermittent. We propose an approach for solving SOMDPs that uses memory states to proactively plan for the potential loss of sensor information while exploiting the unique structure of SOMDPs. 
Our theoretical analysis and empirical evaluation demonstrate the advantages of SOMDPs relative to existing planning models.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1164','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1164\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BPZiros22.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BPZiros22.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BPZiros22.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1164','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Nashed, Samer B.;  Svegliato, Justin;  Bhatia, Abhinav;  Russell, Stuart;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1165','tp_links')\" style=\"cursor:pointer;\">Selecting the Partial State Abstractions of MDPs: A Metareasoning Approach with Deep Reinforcement Learning<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), <\/span><span class=\"tp_pub_additional_address\">Kyoto, Japan, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1165\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1165','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1165\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('1165','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1165\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1165','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1165\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:NSBRZiros22,<br \/>\r\ntitle = {Selecting the Partial State Abstractions of MDPs: A Metareasoning Approach with Deep Reinforcement Learning},<br \/>\r\nauthor = {Samer B. Nashed and Justin Svegliato and Abhinav Bhatia and Stuart Russell and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSBRZiros22.pdf},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)},<br \/>\r\npages = {116665--11670},<br \/>\r\naddress = {Kyoto, Japan},<br \/>\r\nabstract = {Markov decision processes (MDPs) are a common general-purpose model used in robotics for representing sequential decision-making problems. Given the complexity of robotics applications, a popular approach for approximately solving MDPs relies on state aggregation to reduce the size of the state space but at the expense of policy fidelity--offering a trade-off between policy quality and computation time. Naturally, this poses a challenging metareasoning problem: how can an autonomous system dynamically select different state abstractions that optimize this trade-off as it operates online? In this paper, we formalize this metareasoning problem with a notion of time-dependent utility and solve it using deep reinforcement learning. To do this, we develop several general, cheap heuristics that summarize the reward structure and transition topology of the MDP at hand to serve as effective features. 
Empirically, we demonstrate that our metareasoning approach outperforms several baseline approaches and a strong heuristic approach on a standard benchmark domain.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1165','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1165\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Markov decision processes (MDPs) are a common general-purpose model used in robotics for representing sequential decision-making problems. Given the complexity of robotics applications, a popular approach for approximately solving MDPs relies on state aggregation to reduce the size of the state space but at the expense of policy fidelity--offering a trade-off between policy quality and computation time. Naturally, this poses a challenging metareasoning problem: how can an autonomous system dynamically select different state abstractions that optimize this trade-off as it operates online? In this paper, we formalize this metareasoning problem with a notion of time-dependent utility and solve it using deep reinforcement learning. To do this, we develop several general, cheap heuristics that summarize the reward structure and transition topology of the MDP at hand to serve as effective features. 
Empirically, we demonstrate that our metareasoning approach outperforms several baseline approaches and a strong heuristic approach on a standard benchmark domain.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1165','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1165\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSBRZiros22.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSBRZiros22.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSBRZiros22.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1165','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Nashed, Samer B;  Svegliato, Justin;  Brucato, Matteo;  Basich, Connor;  Grupen, Roderic A;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1131','tp_links')\" style=\"cursor:pointer;\">Solving Markov Decision Processes with Partial State Abstractions<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1131\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1131','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1131\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1131','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1131\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1131','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1131\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:NSBBGZicra21,<br \/>\r\ntitle = {Solving Markov Decision Processes with Partial State Abstractions},<br \/>\r\nauthor = {Samer B Nashed and Justin Svegliato and Matteo Brucato and Connor Basich and Roderic A Grupen and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSBBGZicra21.pdf},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},<br \/>\r\nabstract = {Autonomous systems often use approximate planners that exploit state abstractions to solve large MDPs in real-time decision-making problems. However, these planners can eliminate details needed to produce effective behavior in autonomous systems. We therefore propose a novel model, a partially abstract MDP, with a set of abstract states that each compress a set of ground states to condense irrelevant details and a set of ground states that expand from a set of grounded abstract states to retain relevant details. This paper offers (1) a definition of a partially abstract MDP that (2) generalizes its ground MDP and its abstract MDP and exhibits bounded optimality depending on its abstract MDP along with (3) a lazy algorithm for planning and execution in autonomous systems. 
The result is a scalable approach that computes near-optimal solutions to large problems in minutes rather than hours.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1131','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1131\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous systems often use approximate planners that exploit state abstractions to solve large MDPs in real-time decision-making problems. However, these planners can eliminate details needed to produce effective behavior in autonomous systems. We therefore propose a novel model, a partially abstract MDP, with a set of abstract states that each compress a set of ground states to condense irrelevant details and a set of ground states that expand from a set of grounded abstract states to retain relevant details. This paper offers (1) a definition of a partially abstract MDP that (2) generalizes its ground MDP and its abstract MDP and exhibits bounded optimality depending on its abstract MDP along with (3) a lazy algorithm for planning and execution in autonomous systems. 
The result is a scalable approach that computes near-optimal solutions to large problems in minutes rather than hours.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1131','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1131\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSBBGZicra21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSBBGZicra21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSBBGZicra21.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1131','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Basich, Connor;  Svegliato, Justin;  Beach, Allyson;  Wray, Kyle Hollins;  Witwicki, Stefan J;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1142','tp_links')\" style=\"cursor:pointer;\">Improving Competence via Iterative State Space Refinement<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), <\/span><span class=\"tp_pub_additional_address\">Prague, Czech Republic, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1142\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1142','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1142\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1142','tp_links')\" 
title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1142\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1142','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1142\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:BSBWWZiros21,<br \/>\r\ntitle = {Improving Competence via Iterative State Space Refinement},<br \/>\r\nauthor = {Connor Basich and Justin Svegliato and Allyson Beach and Kyle Hollins Wray and Stefan J Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSBWWZiros21.pdf},<br \/>\r\ndoi = {10.1109\/IROS51168.2021.9636239},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)},<br \/>\r\npages = {1865--1871},<br \/>\r\naddress = {Prague, Czech Republic},<br \/>\r\nabstract = {Despite considerable efforts by human designers, accounting for every unique situation that an autonomous robotic system deployed in the real world could face is often an infeasible task. As a result, many such deployed systems still rely on human assistance in various capacities to complete certain tasks while staying safe. Competence-aware systems (CAS) is a recently proposed model for reducing such reliance on human assistance while in turn optimizing the system\u2019s global autonomous operation by learning its own competence. However, such systems are limited by a fixed model of their environment and may perform poorly if their a priori planning model does not include certain features that emerge as important over the course of the system\u2019s deployment. 
In this paper, we propose a method for improving the competence of a CAS over time by identifying important state features missing from the system\u2019s model and incorporating them into its state representation, thereby refining its state space. Our approach exploits information that exists in the standard CAS model and adds no extra work to the human. The result is an agent that better predicts human involvement, improving its competence, reliability, and overall performance.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1142','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1142\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Despite considerable efforts by human designers, accounting for every unique situation that an autonomous robotic system deployed in the real world could face is often an infeasible task. As a result, many such deployed systems still rely on human assistance in various capacities to complete certain tasks while staying safe. Competence-aware systems (CAS) is a recently proposed model for reducing such reliance on human assistance while in turn optimizing the system\u2019s global autonomous operation by learning its own competence. However, such systems are limited by a fixed model of their environment and may perform poorly if their a priori planning model does not include certain features that emerge as important over the course of the system\u2019s deployment. In this paper, we propose a method for improving the competence of a CAS over time by identifying important state features missing from the system\u2019s model and incorporating them into its state representation, thereby refining its state space. Our approach exploits information that exists in the standard CAS model and adds no extra work to the human. 
The result is an agent that better predicts human involvement, improving its competence, reliability, and overall performance.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1142','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1142\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSBWWZiros21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSBWWZiros21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSBWWZiros21.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1109\/IROS51168.2021.9636239\" title=\"Follow DOI:10.1109\/IROS51168.2021.9636239\" target=\"_blank\">doi:10.1109\/IROS51168.2021.9636239<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1142','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Parr, Shane;  Khatri, Ishan;  Svegliato, Justin;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1143','tp_links')\" style=\"cursor:pointer;\">Agent-Aware State Estimation in Autonomous Vehicles<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), <\/span><span class=\"tp_pub_additional_address\">Prague, Czech Republic, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1143\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1143','tp_abstract')\" title=\"Show 
abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1143\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1143','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1143\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1143','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1143\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:PKSZiros21,<br \/>\r\ntitle = {Agent-Aware State Estimation in Autonomous Vehicles},<br \/>\r\nauthor = {Shane Parr and Ishan Khatri and Justin Svegliato and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PKSZiros21.pdf},<br \/>\r\ndoi = {10.1109\/IROS51168.2021.9636210},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)},<br \/>\r\npages = {6694--6699},<br \/>\r\naddress = {Prague, Czech Republic},<br \/>\r\nabstract = {Autonomous systems often operate in environments where the behavior of multiple agents is coordinated by a shared global state. Reliable estimation of the global state is thus critical for successfully operating in a multi-agent setting. We introduce agent-aware state estimation--a framework for calculating indirect estimations of state given observations of the behavior of other agents in the environment. We also introduce transition-independent agent-aware state estimation--a tractable class of agent-aware state estimation--and show that it allows the speed of inference to scale linearly with the number of agents in the environment. As an example, we model traffic light classification in instances of complete loss of direct observation. 
By taking into account observations of vehicular behavior from multiple directions of traffic, our approach exhibits accuracy higher than that of existing traffic light-only HMM methods on a real-world autonomous vehicle data set under a variety of simulated occlusion scenarios.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1143','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1143\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous systems often operate in environments where the behavior of multiple agents is coordinated by a shared global state. Reliable estimation of the global state is thus critical for successfully operating in a multi-agent setting. We introduce agent-aware state estimation--a framework for calculating indirect estimations of state given observations of the behavior of other agents in the environment. We also introduce transition-independent agent-aware state estimation--a tractable class of agent-aware state estimation--and show that it allows the speed of inference to scale linearly with the number of agents in the environment. As an example, we model traffic light classification in instances of complete loss of direct observation. 
By taking into account observations of vehicular behavior from multiple directions of traffic, our approach exhibits accuracy higher than that of existing traffic light-only HMM methods on a real-world autonomous vehicle data set under a variety of simulated occlusion scenarios.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1143','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1143\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PKSZiros21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PKSZiros21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PKSZiros21.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1109\/IROS51168.2021.9636210\" title=\"Follow DOI:10.1109\/IROS51168.2021.9636210\" target=\"_blank\">doi:10.1109\/IROS51168.2021.9636210<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1143','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Basich, Connor;  Wang, Daniel;  Russino, Joseph;  Chien, Steve;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\">A Sampling-Based Optimization Approach to Handling Environmental Uncertainty for a Planetary Lander <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">ICAPS Workshop on Planning and Robotics (PlanRob), <\/span><span class=\"tp_pub_additional_address\">Guangzhou, China, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1144\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('1144','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1144\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1144','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1144\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:BWRCZicaps21ws1,<br \/>\r\ntitle = {A Sampling-Based Optimization Approach to Handling Environmental Uncertainty for a Planetary Lander},<br \/>\r\nauthor = {Connor Basich and Daniel Wang and Joseph Russino and Steve Chien and Shlomo Zilberstein},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {ICAPS Workshop on Planning and Robotics (PlanRob)},<br \/>\r\naddress = {Guangzhou, China},<br \/>\r\nabstract = {TBD.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1144','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1144\" style=\"display:none;\"><div class=\"tp_abstract_entry\">TBD.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1144','tp_abstract')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Miura, Shuwa;  Cohen, Andrew L;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1145','tp_links')\" style=\"cursor:pointer;\">Maximizing Legibility in Stochastic Environments<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 30th IEEE International Conference on Robot &amp; 
Human Interactive Communication, (RO-MAN), <\/span><span class=\"tp_pub_additional_address\">Vancouver, BC, Canada, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1145\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1145','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1145\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1145','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1145\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1145','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1145\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MCZroman21,<br \/>\r\ntitle = {Maximizing Legibility in Stochastic Environments},<br \/>\r\nauthor = {Shuwa Miura and Andrew L Cohen and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MCZroman21.pdf},<br \/>\r\ndoi = {10.1109\/RO-MAN50785.2021.9515318},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the 30th IEEE International Conference on Robot & Human Interactive Communication, (RO-MAN)},<br \/>\r\npages = {1053--1059},<br \/>\r\naddress = {Vancouver, BC, Canada},<br \/>\r\nabstract = {Making an agent's intentions clear from its observed behavior is crucial for seamless human-agent interaction and for increased transparency and trust in AI systems. Existing methods that address this challenge and maximize legibility of behaviors are limited to deterministic domains. 
We develop a technique for maximizing legibility in stochastic environments and illustrate that using legibility as an objective improves interpretability of agent behavior in several scenarios. We provide initial empirical evidence that human subjects can better interpret legible behavior.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1145','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1145\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Making an agent's intentions clear from its observed behavior is crucial for seamless human-agent interaction and for increased transparency and trust in AI systems. Existing methods that address this challenge and maximize legibility of behaviors are limited to deterministic domains. We develop a technique for maximizing legibility in stochastic environments and illustrate that using legibility as an objective improves interpretability of agent behavior in several scenarios. 
We provide initial empirical evidence that human subjects can better interpret legible behavior.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1145','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1145\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MCZroman21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MCZroman21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MCZroman21.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1109\/RO-MAN50785.2021.9515318\" title=\"Follow DOI:10.1109\/RO-MAN50785.2021.9515318\" target=\"_blank\">doi:10.1109\/RO-MAN50785.2021.9515318<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1145','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Miura, Shuwa;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1146','tp_links')\" style=\"cursor:pointer;\">A Unifying Framework for Observer-Aware Planning and its Complexity<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 37th Conference on Uncertainty in Artificial Intelligence (UAI), <\/span><span class=\"tp_pub_additional_address\">Virtual Event, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1146\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1146','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span 
class=\"tp_resource_link\"><a id=\"tp_links_sh_1146\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1146','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1146\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1146','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1146\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MZuai21,<br \/>\r\ntitle = {A Unifying Framework for Observer-Aware Planning and its Complexity},<br \/>\r\nauthor = {Shuwa Miura and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZuai21.pdf},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the 37th Conference on Uncertainty in Artificial Intelligence (UAI)},<br \/>\r\npages = {610--620},<br \/>\r\naddress = {Virtual Event},<br \/>\r\nabstract = {Being aware of observers and the inferences they make about an agent's behavior is crucial for successful multi-agent interaction. Existing works on observer-aware planning use different assumptions and techniques to produce observer-aware behaviors. We argue that observer-aware planning, in its most general form, can be modeled as an Interactive POMDP (I-POMDP), which requires complex modeling and is hard to solve. Hence, we introduce a less complex framework for producing observer-aware behaviors called Observer-Aware MDP (OAMDP) and analyze its relationship to I-POMDP. 
We establish the complexity of OAMDPs and show that they can improve interpretability of agent behaviors in several scenarios.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1146','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1146\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Being aware of observers and the inferences they make about an agent's behavior is crucial for successful multi-agent interaction. Existing works on observer-aware planning use different assumptions and techniques to produce observer-aware behaviors. We argue that observer-aware planning, in its most general form, can be modeled as an Interactive POMDP (I-POMDP), which requires complex modeling and is hard to solve. Hence, we introduce a less complex framework for producing observer-aware behaviors called Observer-Aware MDP (OAMDP) and analyze its relationship to I-POMDP. 
We establish the complexity of OAMDPs and show that they can improve interpretability of agent behaviors in several scenarios.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1146','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1146\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZuai21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZuai21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZuai21.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1146','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Pineda, Luis;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('859','tp_links')\" style=\"cursor:pointer;\">Soft Labeling in Stochastic Shortest Path Problems<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Montreal, Quebec, CA, <\/span><span class=\"tp_pub_additional_year\">2019<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_859\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('859','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_859\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('859','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_859\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('859','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_859\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:PZaamas19,<br \/>\r\ntitle = {Soft Labeling in Stochastic Shortest Path Problems},<br \/>\r\nauthor = {Luis Pineda and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZaamas19.pdf},<br \/>\r\nyear  = {2019},<br \/>\r\ndate = {2019-01-01},<br \/>\r\nbooktitle = {Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS)},<br \/>\r\npages = {467--475},<br \/>\r\naddress = {Montreal, Quebec, CA},<br \/>\r\nabstract = {The Stochastic Shortest Path (SSP) is an established model for goal-directed probabilistic planning. Despite its broad applicability, wide adoption of the model has been impaired by its high computational complexity. Efforts to address this challenge have produced promising algorithms that leverage two popular mechanisms: labeling and short-sightedness. The resulting algorithms can generate near-optimal solutions much faster than optimal solvers, albeit at the cost of poor theoretical guarantees. In this work, we introduce a generalization of labeling, called soft labeling, which results in a framework that encompasses a wide spectrum of efficient labeling algorithms, and offers better theoretical guarantees than existing short-sighted labeling approaches. 
We also propose a novel instantiation of this framework, the SOFT-FLARES algorithm, which achieves state-of-the-art performance on a diverse set of benchmarks.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('859','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_859\" style=\"display:none;\"><div class=\"tp_abstract_entry\">The Stochastic Shortest Path (SSP) is an established model for goal-directed probabilistic planning. Despite its broad applicability, wide adoption of the model has been impaired by its high computational complexity. Efforts to address this challenge have produced promising algorithms that leverage two popular mechanisms: labeling and short-sightedness. The resulting algorithms can generate near-optimal solutions much faster than optimal solvers, albeit at the cost of poor theoretical guarantees. In this work, we introduce a generalization of labeling, called soft labeling, which results in a framework that encompasses a wide spectrum of efficient labeling algorithms, and offers better theoretical guarantees than existing short-sighted labeling approaches. 
We also propose a novel instantiation of this framework, the SOFT-FLARES algorithm, which achieves state-of-the-art performance on a diverse set of benchmarks.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('859','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_859\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZaamas19.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZaamas19.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZaamas19.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('859','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Wray, Kyle Hollins;  Pineda, Luis Enrique;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\">Planning in Stochastic Environments with Goal Uncertainty <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">ICAPS Workshop on Planning and Robotics (PlanRob), <\/span><span class=\"tp_pub_additional_address\">Berkeley, CA, <\/span><span class=\"tp_pub_additional_year\">2019<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1103\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1103','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1103\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1103','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1103\" style=\"display:none;\"><div 
class=\"tp_bibtex_entry\"><pre>@conference{SZ:SWPZicaps19ws1,<br \/>\r\ntitle = {Planning in Stochastic Environments with Goal Uncertainty},<br \/>\r\nauthor = {Sandhya Saisubramanian and Kyle Hollins Wray and Luis Enrique Pineda and Shlomo Zilberstein},<br \/>\r\nyear  = {2019},<br \/>\r\ndate = {2019-01-01},<br \/>\r\nbooktitle = {ICAPS Workshop on Planning and Robotics (PlanRob)},<br \/>\r\naddress = {Berkeley, CA},<br \/>\r\nabstract = {TBD.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1103','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1103\" style=\"display:none;\"><div class=\"tp_abstract_entry\">TBD.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1103','tp_abstract')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Basich, Connor;  Zilberstein, Shlomo;  Goldman, Claudia V<\/p><p class=\"tp_pub_title\">The Value of Incorporating Social Preferences in Dynamic Ridesharing <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">ICAPS Workshop on Scheduling and Planning Applications (SPARK), <\/span><span class=\"tp_pub_additional_address\">Berkeley, CA, <\/span><span class=\"tp_pub_additional_year\">2019<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1104\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1104','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1104\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1104','tp_bibtex')\" title=\"Show BibTeX entry\" 
style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1104\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SBZGicaps19ws2,<br \/>\r\ntitle = {The Value of Incorporating Social Preferences in Dynamic Ridesharing},<br \/>\r\nauthor = {Sandhya Saisubramanian and Connor Basich and Shlomo Zilberstein and Claudia V Goldman},<br \/>\r\nyear  = {2019},<br \/>\r\ndate = {2019-01-01},<br \/>\r\nbooktitle = {ICAPS Workshop on Scheduling and Planning Applications (SPARK)},<br \/>\r\naddress = {Berkeley, CA},<br \/>\r\nabstract = {TBD.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1104','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1104\" style=\"display:none;\"><div class=\"tp_abstract_entry\">TBD.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1104','tp_abstract')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Pineda, Luis Enrique;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1105','tp_links')\" style=\"cursor:pointer;\">Probabilistic Planning with Reduced Models<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Journal of Artificial Intelligence Research (JAIR), <\/span><span class=\"tp_pub_additional_volume\">vol. 65, <\/span><span class=\"tp_pub_additional_pages\">pp. 
271\u2013306, <\/span><span class=\"tp_pub_additional_year\">2019<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1105\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1105','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1105\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1105','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1105\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1105','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1105\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:PZjair19,<br \/>\r\ntitle = {Probabilistic Planning with Reduced Models},<br \/>\r\nauthor = {Luis Enrique Pineda and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZjair19.pdf},<br \/>\r\ndoi = {10.1613\/jair.1.11569},<br \/>\r\nyear  = {2019},<br \/>\r\ndate = {2019-01-01},<br \/>\r\njournal = {Journal of Artificial Intelligence Research (JAIR)},<br \/>\r\nvolume = {65},<br \/>\r\npages = {271--306},<br \/>\r\nabstract = {Reduced models are simplified versions of a given domain, designed to accelerate the planning process. Interest in reduced models has grown since the surprising success of determinization in the first international probabilistic planning competition, leading to the development of several enhanced determinization techniques. To address the drawbacks of previous determinization methods, we introduce a family of reduced models in which probabilistic outcomes are classified as one of two types: primary and exceptional. 
In each model that belongs to this family of reductions, primary outcomes can occur an unbounded number of times per trajectory, while exceptions can occur at most a finite number of times, specified by a parameter. Distinct reduced models are characterized by two parameters: the maximum number of primary outcomes per action, and the maximum number of occurrences of exceptions per trajectory. This family of reductions generalizes the well-known most-likely-outcome determinization approach, which includes one primary outcome per action and zero exceptional outcomes per plan. We present a framework to determine the benefits of planning with reduced models, and develop a continual planning approach that handles situations where the number of exceptions exceeds the specified bound during plan execution. Using this framework, we compare the performance of various reduced models and consider the challenge of generating good ones automatically. We show that each one of the dimensions--allowing more than one primary outcome or planning for some limited number of exceptions--could improve performance relative to standard determinization. The results place previous work on determinization in a broader context and lay the foundation for a systematic exploration of the space of model reductions.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1105','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1105\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Reduced models are simplified versions of a given domain, designed to accelerate the planning process. Interest in reduced models has grown since the surprising success of determinization in the first international probabilistic planning competition, leading to the development of several enhanced determinization techniques. 
To address the drawbacks of previous determinization methods, we introduce a family of reduced models in which probabilistic outcomes are classified as one of two types: primary and exceptional. In each model that belongs to this family of reductions, primary outcomes can occur an unbounded number of times per trajectory, while exceptions can occur at most a finite number of times, specified by a parameter. Distinct reduced models are characterized by two parameters: the maximum number of primary outcomes per action, and the maximum number of occurrences of exceptions per trajectory. This family of reductions generalizes the well-known most-likely-outcome determinization approach, which includes one primary outcome per action and zero exceptional outcomes per plan. We present a framework to determine the benefits of planning with reduced models, and develop a continual planning approach that handles situations where the number of exceptions exceeds the specified bound during plan execution. Using this framework, we compare the performance of various reduced models and consider the challenge of generating good ones automatically. We show that each one of the dimensions--allowing more than one primary outcome or planning for some limited number of exceptions--could improve performance relative to standard determinization. 
The results place previous work on determinization in a broader context and lay the foundation for a systematic exploration of the space of model reductions.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1105','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1105\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZjair19.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZjair19.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZjair19.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1613\/jair.1.11569\" title=\"Follow DOI:10.1613\/jair.1.11569\" target=\"_blank\">doi:10.1613\/jair.1.11569<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1105','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Wray, Kyle Hollins;  Pineda, Luis Enrique;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1106','tp_links')\" style=\"cursor:pointer;\">Planning in Stochastic Environments with Goal Uncertainty<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), <\/span><span class=\"tp_pub_additional_address\">Macau, China, <\/span><span class=\"tp_pub_additional_year\">2019<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1106\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1106','tp_abstract')\" title=\"Show 
abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1106\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1106','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1106\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1106','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1106\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SWPZiros19,<br \/>\r\ntitle = {Planning in Stochastic Environments with Goal Uncertainty},<br \/>\r\nauthor = {Sandhya Saisubramanian and Kyle Hollins Wray and Luis Enrique Pineda and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWPZiros19.pdf},<br \/>\r\nyear  = {2019},<br \/>\r\ndate = {2019-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)},<br \/>\r\naddress = {Macau, China},<br \/>\r\nabstract = {We present the Goal Uncertain Stochastic Shortest Path (GUSSP) problem -- a general framework to model path planning and decision making in stochastic environments with goal uncertainty. The framework extends the stochastic shortest path (SSP) model to dynamic environments in which it is impossible to determine the exact goal states ahead of plan execution. GUSSPs introduce flexibility in goal specification by allowing a belief over possible goal configurations. The unique observations at potential goals help the agent identify the true goal during plan execution. The partial observability is restricted to goals, facilitating the reduction to an SSP with a modified state space. We formally define a GUSSP and discuss its theoretical properties. 
We then propose an admissible heuristic that reduces the planning time using FLARES -- a state-of-the-art probabilistic planner. We also propose a determinization approach for solving this class of problems. Finally, we present empirical results on a search and rescue mobile robot and three other problem domains in simulation.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1106','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1106\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We present the Goal Uncertain Stochastic Shortest Path (GUSSP) problem -- a general framework to model path planning and decision making in stochastic environments with goal uncertainty. The framework extends the stochastic shortest path (SSP) model to dynamic environments in which it is impossible to determine the exact goal states ahead of plan execution. GUSSPs introduce flexibility in goal specification by allowing a belief over possible goal configurations. The unique observations at potential goals help the agent identify the true goal during plan execution. The partial observability is restricted to goals, facilitating the reduction to an SSP with a modified state space. We formally define a GUSSP and discuss its theoretical properties. We then propose an admissible heuristic that reduces the planning time using FLARES -- a state-of-the-art probabilistic planner. We also propose a determinization approach for solving this class of problems. 
Finally, we present empirical results on a search and rescue mobile robot and three other problem domains in simulation.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1106','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1106\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWPZiros19.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWPZiros19.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWPZiros19.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1106','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Svegliato, Justin;  Wray, Kyle Hollins;  Witwicki, Stefan J;  Biswas, Joydeep;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1108','tp_links')\" style=\"cursor:pointer;\">Belief Space Metareasoning for Exception Recovery<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), <\/span><span class=\"tp_pub_additional_address\">Macau, China, <\/span><span class=\"tp_pub_additional_year\">2019<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1108\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1108','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1108\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1108','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1108\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1108','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1108\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SWWBZiros19,<br \/>\r\ntitle = {Belief Space Metareasoning for Exception Recovery},<br \/>\r\nauthor = {Justin Svegliato and Kyle Hollins Wray and Stefan J Witwicki and Joydeep Biswas and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWWBZiros19.pdf},<br \/>\r\nyear  = {2019},<br \/>\r\ndate = {2019-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)},<br \/>\r\naddress = {Macau, China},<br \/>\r\nabstract = {Due to the complexity of the real world, autonomous systems use decision-making models that rely on simplifying assumptions to make them computationally tractable and feasible to design. However, since these limited representations cannot fully capture the domain of operation, an autonomous system may encounter unanticipated scenarios that cannot be resolved effectively. We first formally introduce an introspective autonomous system that uses belief space metareasoning to recover from exceptions by interleaving a main decision process with a set of exception handlers. We then apply introspective autonomy to autonomous driving. 
Finally, we demonstrate that an introspective autonomous vehicle is effective in simulation and on a fully operational prototype.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1108','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1108\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Due to the complexity of the real world, autonomous systems use decision-making models that rely on simplifying assumptions to make them computationally tractable and feasible to design. However, since these limited representations cannot fully capture the domain of operation, an autonomous system may encounter unanticipated scenarios that cannot be resolved effectively. We first formally introduce an introspective autonomous system that uses belief space metareasoning to recover from exceptions by interleaving a main decision process with a set of exception handlers. We then apply introspective autonomy to autonomous driving. 
Finally, we demonstrate that an introspective autonomous vehicle is effective in simulation and on a fully operational prototype.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1108','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1108\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWWBZiros19.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWWBZiros19.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWWBZiros19.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1108','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1107','tp_links')\" style=\"cursor:pointer;\">Adaptive Outcome Selection for Planning With Reduced Models<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), <\/span><span class=\"tp_pub_additional_address\">Macau, China, <\/span><span class=\"tp_pub_additional_year\">2019<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1107\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1107','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1107\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1107','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1107\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1107','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1107\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SZiros19,<br \/>\r\ntitle = {Adaptive Outcome Selection for Planning With Reduced Models},<br \/>\r\nauthor = {Sandhya Saisubramanian and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZiros19.pdf},<br \/>\r\nyear  = {2019},<br \/>\r\ndate = {2019-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)},<br \/>\r\naddress = {Macau, China},<br \/>\r\nabstract = {Reduced models allow autonomous robots to cope with the complexity of planning in stochastic environments by simplifying the model and reducing its accuracy. The solution quality of a reduced model depends on its fidelity. We present a 0\/1 reduced model that selectively improves model fidelity in certain states by switching between using a simplified deterministic model and the full model, without significantly compromising the run time gains. We measure the reduction impact for a reduced model based on the values of the ignored outcomes and use this as a heuristic for outcome selection. 
Finally, we present empirical results of our approach on three different domains, including an electric vehicle charging problem using real-world data from a university campus.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1107','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1107\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Reduced models allow autonomous robots to cope with the complexity of planning in stochastic environments by simplifying the model and reducing its accuracy. The solution quality of a reduced model depends on its fidelity. We present a 0\/1 reduced model that selectively improves model fidelity in certain states by switching between using a simplified deterministic model and the full model, without significantly compromising the run time gains. We measure the reduction impact for a reduced model based on the values of the ignored outcomes and use this as a heuristic for outcome selection. 
Finally, we present empirical results of our approach on three different domains, including an electric vehicle charging problem using real-world data from a university campus.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1107','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1107\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZiros19.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZiros19.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZiros19.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1107','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Zilberstein, Shlomo;  Shenoy, Prashant J<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('868','tp_links')\" style=\"cursor:pointer;\">Planning Using a Portfolio of Reduced Models<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Stockholm, Sweden, <\/span><span class=\"tp_pub_additional_year\">2018<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_868\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('868','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_868\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('868','tp_links')\" title=\"Show links and 
resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_868\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('868','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_868\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SZSaamas18,<br \/>\r\ntitle = {Planning Using a Portfolio of Reduced Models},<br \/>\r\nauthor = {Sandhya Saisubramanian and Shlomo Zilberstein and Prashant J Shenoy},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZSaamas18.pdf},<br \/>\r\nyear  = {2018},<br \/>\r\ndate = {2018-01-01},<br \/>\r\nbooktitle = {Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS)},<br \/>\r\npages = {2057--2059},<br \/>\r\naddress = {Stockholm, Sweden},<br \/>\r\nabstract = {Existing reduced model techniques simplify a problem by applying a uniform principle to reduce the number of considered outcomes for all state-action pairs. It is non-trivial to identify which outcome selection principle will work well across all problem instances in a domain. We aim to create reduced models that yield near-optimal solutions, without compromising the run time gains of using a reduced model. First, we introduce planning using a portfolio of reduced models, a framework that provides flexibility in the reduced model formulation by using a portfolio of outcome selection principles. Second, we propose planning using cost adjustment, a technique that improves the solution quality by accounting for the outcomes ignored in the reduced model. 
Empirical evaluation of these techniques confirms their effectiveness in several domains.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('868','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_868\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Existing reduced model techniques simplify a problem by applying a uniform principle to reduce the number of considered outcomes for all state-action pairs. It is non-trivial to identify which outcome selection principle will work well across all problem instances in a domain. We aim to create reduced models that yield near-optimal solutions, without compromising the run time gains of using a reduced model. First, we introduce planning using a portfolio of reduced models, a framework that provides flexibility in the reduced model formulation by using a portfolio of outcome selection principles. Second, we propose planning using cost adjustment, a technique that improves the solution quality by accounting for the outcomes ignored in the reduced model. 
Empirical evaluation of these techniques confirms their effectiveness in several domains.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('868','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_868\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZSaamas18.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZSaamas18.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZSaamas18.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('868','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Srivastava, Siddharth;  Desai, Nishant;  Freedman, Richard G;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('866','tp_links')\" style=\"cursor:pointer;\">An Anytime Algorithm for Task and Motion MDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">ICAPS Workshop on Planning and Robotics (PlanRob), <\/span><span class=\"tp_pub_additional_address\">Delft, The Netherlands, <\/span><span class=\"tp_pub_additional_year\">2018<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_866\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('866','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_866\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('866','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a 
id=\"tp_bibtex_sh_866\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('866','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_866\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SDFZicaps18ws1,<br \/>\r\ntitle = {An Anytime Algorithm for Task and Motion MDPs},<br \/>\r\nauthor = {Siddharth Srivastava and Nishant Desai and Richard G Freedman and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/arxiv.org\/abs\/1802.05835},<br \/>\r\nyear  = {2018},<br \/>\r\ndate = {2018-01-01},<br \/>\r\nbooktitle = {ICAPS Workshop on Planning and Robotics (PlanRob)},<br \/>\r\naddress = {Delft, The Netherlands},<br \/>\r\nabstract = {Integrated task and motion planning has emerged as a challenging problem in sequential decision making, where a robot needs to compute high-level strategy and low-level motion plans for solving complex tasks. While high-level strategies require decision making over longer time-horizons and scales, their feasibility depends on low-level constraints based upon the geometries and continuous dynamics of the environment. The hybrid nature of this problem makes it difficult to scale; most existing approaches focus on deterministic, fully observable scenarios. We present a new approach where the high-level decision problem occurs in a stochastic setting and can be modeled as a Markov decision process. In contrast to prior efforts, we show that complete MDP policies, or contingent behaviors, can be computed effectively in an anytime fashion. Our algorithm continuously improves the quality of the solution and is guaranteed to be probabilistically complete. We evaluate the performance of our approach on a challenging, realistic test problem: autonomous aircraft inspection. 
Our results show that we can effectively compute consistent task and motion policies for the most likely execution-time outcomes using only a fraction of the computation required to develop the complete task and motion policy.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('866','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_866\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Integrated task and motion planning has emerged as a challenging problem in sequential decision making, where a robot needs to compute high-level strategy and low-level motion plans for solving complex tasks. While high-level strategies require decision making over longer time-horizons and scales, their feasibility depends on low-level constraints based upon the geometries and continuous dynamics of the environment. The hybrid nature of this problem makes it difficult to scale; most existing approaches focus on deterministic, fully observable scenarios. We present a new approach where the high-level decision problem occurs in a stochastic setting and can be modeled as a Markov decision process. In contrast to prior efforts, we show that complete MDP policies, or contingent behaviors, can be computed effectively in an anytime fashion. Our algorithm continuously improves the quality of the solution and is guaranteed to be probabilistically complete. We evaluate the performance of our approach on a challenging, realistic test problem: autonomous aircraft inspection. 
Our results show that we can effectively compute consistent task and motion policies for the most likely execution-time outcomes using only a fraction of the computation required to develop the complete task and motion policy.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('866','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_866\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"ai ai-arxiv\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/arxiv.org\/abs\/1802.05835\" title=\"http:\/\/arxiv.org\/abs\/1802.05835\" target=\"_blank\">http:\/\/arxiv.org\/abs\/1802.05835<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('866','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Pineda, Luis Enrique;  Wray, Kyle Hollins;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('876','tp_links')\" style=\"cursor:pointer;\">Fast SSP Solvers Using Short-Sighted Labeling<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 31st Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">San Francisco, California, <\/span><span class=\"tp_pub_additional_year\">2017<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_876\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('876','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_876\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('876','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_876\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('876','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_876\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:PWZaaai17,<br \/>\r\ntitle = {Fast SSP Solvers Using Short-Sighted Labeling},<br \/>\r\nauthor = {Luis Enrique Pineda and Kyle Hollins Wray and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PWZaaai17.pdf},<br \/>\r\nyear  = {2017},<br \/>\r\ndate = {2017-01-01},<br \/>\r\nbooktitle = {Proceedings of the 31st Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {3629--3635},<br \/>\r\naddress = {San Francisco, California},<br \/>\r\nabstract = {State-of-the-art methods for solving SSPs often work by limiting planning to restricted regions of the state space. The resulting problems can then be solved quickly, and the process is repeated during execution when states outside the restricted region are encountered. Typically, these approaches focus on states that are within some distance measure of the start state (e.g., number of actions or probability of being reached). However, these short-sighted approaches make it difficult to propagate information from states that are closer to a goal than to the start state, thus missing opportunities to improve planning. We present an alternative approach in which short-sightedness is used only to determine whether a state should be labeled as solved or not, but otherwise the set of states that can be accounted for during planning is unrestricted. 
Based on this idea, we propose the FLARES algorithm and show that it performs consistently well on a wide range of benchmark problems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('876','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_876\" style=\"display:none;\"><div class=\"tp_abstract_entry\">State-of-the-art methods for solving SSPs often work by limiting planning to restricted regions of the state space. The resulting problems can then be solved quickly, and the process is repeated during execution when states outside the restricted region are encountered. Typically, these approaches focus on states that are within some distance measure of the start state (e.g., number of actions or probability of being reached). However, these short-sighted approaches make it difficult to propagate information from states that are closer to a goal than to the start state, thus missing opportunities to improve planning. We present an alternative approach in which short-sightedness is used only to determine whether a state should be labeled as solved or not, but otherwise the set of states that can be accounted for during planning is unrestricted. 
Based on this idea, we propose the FLARES algorithm and show that it performs consistently well on a wide range of benchmark problems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('876','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_876\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PWZaaai17.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PWZaaai17.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PWZaaai17.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('876','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Zilberstein, Shlomo;  Mouaddib, Abdel-Illah<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('899','tp_links')\" style=\"cursor:pointer;\">Multi-Objective MDPs with Conditional Lexicographic Reward Preferences<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 29th Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">Austin, Texas, <\/span><span class=\"tp_pub_additional_year\">2015<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_899\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('899','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_899\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('899','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_899\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('899','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_899\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WZMaaai15,<br \/>\r\ntitle = {Multi-Objective MDPs with Conditional Lexicographic Reward Preferences},<br \/>\r\nauthor = {Kyle Hollins Wray and Shlomo Zilberstein and Abdel-Illah Mouaddib},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZMaaai15.pdf},<br \/>\r\nyear  = {2015},<br \/>\r\ndate = {2015-01-01},<br \/>\r\nbooktitle = {Proceedings of the 29th Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {3418--3424},<br \/>\r\naddress = {Austin, Texas},<br \/>\r\nabstract = {Sequential decision problems that involve multiple objectives are prevalent. Consider for example a driver of a semi-autonomous car who may want to optimize competing objectives such as travel time and the effort associated with manual driving. We introduce a rich model called Lexicographic MDP (LMDP) and a corresponding planning algorithm called LVI that generalize previous work by allowing for conditional lexicographic preferences with slack. We analyze the convergence characteristics of LVI and establish its game theoretic properties. The performance of LVI in practice is tested within a realistic benchmark problem in the domain of semi-autonomous driving. 
Finally, we demonstrate how GPU-based optimization can improve the scalability of LVI and other value iteration algorithms for MDPs.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('899','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_899\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Sequential decision problems that involve multiple objectives are prevalent. Consider for example a driver of a semi-autonomous car who may want to optimize competing objectives such as travel time and the effort associated with manual driving. We introduce a rich model called Lexicographic MDP (LMDP) and a corresponding planning algorithm called LVI that generalize previous work by allowing for conditional lexicographic preferences with slack. We analyze the convergence characteristics of LVI and establish its game theoretic properties. The performance of LVI in practice is tested within a realistic benchmark problem in the domain of semi-autonomous driving. 
Finally, we demonstrate how GPU-based optimization can improve the scalability of LVI and other value iteration algorithms for MDPs.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('899','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_899\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZMaaai15.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZMaaai15.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZMaaai15.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('899','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Pineda, Luis;  Wray, Kyle Hollins;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('911','tp_links')\" style=\"cursor:pointer;\">Revisiting Multi-Objective MDPs with Relaxed Lexicographic Preferences<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">AAAI Fall Symposium on Sequential Decision Making for Intelligent Agents (SDMIA), <\/span><span class=\"tp_pub_additional_address\">Arlington, Virginia, <\/span><span class=\"tp_pub_additional_year\">2015<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_911\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('911','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_911\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('911','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_911\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('911','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_911\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:PWZfall15,<br \/>\r\ntitle = {Revisiting Multi-Objective MDPs with Relaxed Lexicographic Preferences},<br \/>\r\nauthor = {Luis Pineda and Kyle Hollins Wray and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PWZfall15.pdf},<br \/>\r\nyear  = {2015},<br \/>\r\ndate = {2015-01-01},<br \/>\r\nbooktitle = {AAAI Fall Symposium on Sequential Decision Making for Intelligent Agents (SDMIA)},<br \/>\r\naddress = {Arlington, Virginia},<br \/>\r\nabstract = {We consider stochastic planning problems that involve multiple objectives such as minimizing task completion time and energy consumption. These problems can be modeled as multi-objective Markov decision processes (MOMDPs), an extension of the widely-used MDP model to handle problems involving multiple value functions. We focus on a subclass of MOMDPs in which the objectives have a relaxed lexicographic structure, allowing an agent to seek improvement in a lower-priority objective when the impact on a higher-priority objective is within some small given tolerance. We examine the relationship between this class of problems and constrained MDPs, showing that the latter offer an alternative solution method with strong guarantees. 
We show empirically that a recently introduced algorithm for MOMDPs may not offer the same strong guarantees, but it does perform well in practice.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('911','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_911\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We consider stochastic planning problems that involve multiple objectives such as minimizing task completion time and energy consumption. These problems can be modeled as multi-objective Markov decision processes (MOMDPs), an extension of the widely-used MDP model to handle problems involving multiple value functions. We focus on a subclass of MOMDPs in which the objectives have a relaxed lexicographic structure, allowing an agent to seek improvement in a lower-priority objective when the impact on a higher-priority objective is within some small given tolerance. We examine the relationship between this class of problems and constrained MDPs, showing that the latter offer an alternative solution method with strong guarantees. 
We show empirically that a recently introduced algorithm for MOMDPs may not offer the same strong guarantees, but it does perform well in practice.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('911','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_911\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PWZfall15.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PWZfall15.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PWZfall15.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('911','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Pineda, Luis;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('916','tp_links')\" style=\"cursor:pointer;\">Planning Under Uncertainty Using Reduced Models: Revisiting Determinization<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 24th International Conference on Automated Planning and Scheduling (ICAPS), <\/span><span class=\"tp_pub_additional_address\">Portsmouth, New Hampshire, <\/span><span class=\"tp_pub_additional_year\">2014<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_916\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('916','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_916\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('916','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_916\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('916','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_916\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:PZicaps14,<br \/>\r\ntitle = {Planning Under Uncertainty Using Reduced Models: Revisiting Determinization},<br \/>\r\nauthor = {Luis Pineda and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZicaps14.pdf},<br \/>\r\nyear  = {2014},<br \/>\r\ndate = {2014-01-01},<br \/>\r\nbooktitle = {Proceedings of the 24th International Conference on Automated Planning and Scheduling (ICAPS)},<br \/>\r\npages = {217--225},<br \/>\r\naddress = {Portsmouth, New Hampshire},<br \/>\r\nabstract = {We introduce a family of MDP reduced models characterized by two parameters: the maximum number of primary outcomes per action that are fully accounted for and the maximum number of occurrences of the remaining exceptional outcomes that are planned for in advance. Reduced models can be solved much faster using heuristic search algorithms such as LAO*, benefiting from the dramatic reduction in the number of reachable states. A commonly used determinization approach is a special case of this family of reductions, with one primary outcome per action and zero exceptional outcomes per plan. We present a framework to compute the benefits of planning with reduced models, relying on online planning when the number of exceptional outcomes exceeds the bound. Using this framework, we compare the performance of various reduced models and consider the challenge of generating good ones automatically. We show that each one of the dimensions--allowing more than one primary outcome or planning for some limited number of exceptions--could improve performance relative to standard determinization. 
The results place recent work on determinization in a broader context and lay the foundation for efficient and systematic exploration of the space of MDP model reductions.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('916','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_916\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We introduce a family of MDP reduced models characterized by two parameters: the maximum number of primary outcomes per action that are fully accounted for and the maximum number of occurrences of the remaining exceptional outcomes that are planned for in advance. Reduced models can be solved much faster using heuristic search algorithms such as LAO*, benefiting from the dramatic reduction in the number of reachable states. A commonly used determinization approach is a special case of this family of reductions, with one primary outcome per action and zero exceptional outcomes per plan. We present a framework to compute the benefits of planning with reduced models, relying on online planning when the number of exceptional outcomes exceeds the bound. Using this framework, we compare the performance of various reduced models and consider the challenge of generating good ones automatically. We show that each one of the dimensions--allowing more than one primary outcome or planning for some limited number of exceptions--could improve performance relative to standard determinization. 
The results place recent work on determinization in a broader context and lay the foundation for efficient and systematic exploration of the space of MDP model reductions.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('916','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_916\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZicaps14.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZicaps14.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZicaps14.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('916','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Pineda, Luis;  Lu, Yi;  Zilberstein, Shlomo;  Goldman, Claudia V<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('922','tp_links')\" style=\"cursor:pointer;\">Fault-Tolerant Planning Under Uncertainty<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Beijing, China, <\/span><span class=\"tp_pub_additional_year\">2013<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_922\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('922','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_922\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('922','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_922\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('922','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_922\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:PLZGijcai13,<br \/>\r\ntitle = {Fault-Tolerant Planning Under Uncertainty},<br \/>\r\nauthor = {Luis Pineda and Yi Lu and Shlomo Zilberstein and Claudia V Goldman},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PLZGijcai13.pdf},<br \/>\r\nyear  = {2013},<br \/>\r\ndate = {2013-01-01},<br \/>\r\nbooktitle = {Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {2350--2356},<br \/>\r\naddress = {Beijing, China},<br \/>\r\nabstract = {A fault represents some erroneous operation of a system that could result from an action selection error or some abnormal condition. We formally define error models that characterize the likelihood of various faults and consider the problem of fault-tolerant planning, which optimizes performance given an error model. We show that factoring the possibility of errors significantly degrades the performance of stochastic planning algorithms such as LAO*, because the number of reachable states grows dramatically. We introduce an approach to plan for a bounded number of faults and analyze its theoretical properties. When combined with a continual planning paradigm, the k-fault-tolerant planning method can produce near-optimal performance, even when the number of faults exceeds the bound. 
Empirical results in two challenging domains confirm the effectiveness of the approach in handling different types of runtime errors.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('922','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_922\" style=\"display:none;\"><div class=\"tp_abstract_entry\">A fault represents some erroneous operation of a system that could result from an action selection error or some abnormal condition. We formally define error models that characterize the likelihood of various faults and consider the problem of fault-tolerant planning, which optimizes performance given an error model. We show that factoring the possibility of errors significantly degrades the performance of stochastic planning algorithms such as LAO*, because the number of reachable states grows dramatically. We introduce an approach to plan for a bounded number of faults and analyze its theoretical properties. When combined with a continual planning paradigm, the k-fault-tolerant planning method can produce near-optimal performance, even when the number of faults exceeds the bound. 
Empirical results in two challenging domains confirm the effectiveness of the approach in handling different types of runtime errors.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('922','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_922\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PLZGijcai13.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PLZGijcai13.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PLZGijcai13.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('922','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Petrik, Marek;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('934','tp_links')\" style=\"cursor:pointer;\">Robust Approximate Bilinear Programming for Value Function Approximation<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Journal of Machine Learning Research (JMLR), <\/span><span class=\"tp_pub_additional_volume\">vol. 12, <\/span><span class=\"tp_pub_additional_pages\">pp. 
3027\u20133063, <\/span><span class=\"tp_pub_additional_year\">2011<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_934\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('934','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_934\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('934','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_934\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('934','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_934\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:PZjmlr11,<br \/>\r\ntitle = {Robust Approximate Bilinear Programming for Value Function Approximation},<br \/>\r\nauthor = {Marek Petrik and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZjmlr11.pdf},<br \/>\r\nyear  = {2011},<br \/>\r\ndate = {2011-01-01},<br \/>\r\njournal = {Journal of Machine Learning Research (JMLR)},<br \/>\r\nvolume = {12},<br \/>\r\npages = {3027--3063},<br \/>\r\nabstract = {Value function approximation methods have been successfully used in many applications, but the prevailing techniques often lack useful a priori error bounds. We propose a new approximate bilinear programming formulation of value function approximation, which employs global optimization. The formulation provides strong a priori guarantees on both robust and expected policy loss by minimizing specific norms of the Bellman residual. Solving a bilinear program optimally is NP-hard, but this worst-case complexity is unavoidable because the Bellman-residual minimization itself is NP-hard. We describe and analyze the formulation as well as a simple approximate algorithm for solving bilinear programs. 
The analysis shows that this algorithm offers a convergent generalization of approximate policy iteration. We also briefly analyze the behavior of bilinear programming algorithms under incomplete samples. Finally, we demonstrate that the proposed approach can consistently minimize the Bellman residual on simple benchmark problems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('934','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_934\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Value function approximation methods have been successfully used in many applications, but the prevailing techniques often lack useful a priori error bounds. We propose a new approximate bilinear programming formulation of value function approximation, which employs global optimization. The formulation provides strong a priori guarantees on both robust and expected policy loss by minimizing specific norms of the Bellman residual. Solving a bilinear program optimally is NP-hard, but this worst-case complexity is unavoidable because the Bellman-residual minimization itself is NP-hard. We describe and analyze the formulation as well as a simple approximate algorithm for solving bilinear programs. The analysis shows that this algorithm offers a convergent generalization of approximate policy iteration. We also briefly analyze the behavior of bilinear programming algorithms under incomplete samples. 
Finally, we demonstrate that the proposed approach can consistently minimize the Bellman residual on simple benchmark problems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('934','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_934\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZjmlr11.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZjmlr11.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZjmlr11.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('934','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Petrik, Marek;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('942','tp_links')\" style=\"cursor:pointer;\">Linear Dynamic Programs for Resource Management<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 25th Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">San Francisco, California, <\/span><span class=\"tp_pub_additional_year\">2011<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_942\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('942','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_942\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('942','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a 
id=\"tp_bibtex_sh_942\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('942','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_942\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:PZaaai11,<br \/>\r\ntitle = {Linear Dynamic Programs for Resource Management},<br \/>\r\nauthor = {Marek Petrik and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZaaai11.pdf},<br \/>\r\nyear  = {2011},<br \/>\r\ndate = {2011-01-01},<br \/>\r\nbooktitle = {Proceedings of the 25th Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {1377--1383},<br \/>\r\naddress = {San Francisco, California},<br \/>\r\nabstract = {Sustainable resource management in many domains presents large continuous stochastic optimization problems, which can often be modeled as Markov decision processes (MDPs). To solve such large MDPs, we identify and leverage linearity in state and action sets that is common in resource management. In particular, we introduce linear dynamic programs (LDPs) that generalize resource management problems and partially observable MDPs (POMDPs). We show that the LDP framework makes it possible to adapt point-based methods -- the state of the art in solving POMDPs -- to solving LDPs. The experimental results demonstrate the efficiency of this approach in managing the water level of a river reservoir. 
Finally, we discuss the relationship with dual dynamic programming, a method used to optimize hydroelectric systems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('942','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_942\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Sustainable resource management in many domains presents large continuous stochastic optimization problems, which can often be modeled as Markov decision processes (MDPs). To solve such large MDPs, we identify and leverage linearity in state and action sets that is common in resource management. In particular, we introduce linear dynamic programs (LDPs) that generalize resource management problems and partially observable MDPs (POMDPs). We show that the LDP framework makes it possible to adapt point-based methods -- the state of the art in solving POMDPs -- to solving LDPs. The experimental results demonstrate the efficiency of this approach in managing the water level of a river reservoir. 
Finally, we discuss the relationship with dual dynamic programming, a method used to optimize hydroelectric systems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('942','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_942\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZaaai11.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZaaai11.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZaaai11.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('942','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wu, Xiaojian;  Kumar, Akshat;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('945','tp_links')\" style=\"cursor:pointer;\">Influence Diagrams with Memory States: Representation and Algorithms<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 2nd International Conference on Algorithmic Decision Theory (ADT), <\/span><span class=\"tp_pub_additional_address\">Piscataway, New Jersey, <\/span><span class=\"tp_pub_additional_year\">2011<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_945\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('945','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_945\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('945','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_945\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('945','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_945\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WKZadt11,<br \/>\r\ntitle = {Influence Diagrams with Memory States: Representation and Algorithms},<br \/>\r\nauthor = {Xiaojian Wu and Akshat Kumar and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKZadt11.pdf},<br \/>\r\ndoi = {10.1007\/978-3-642-24873-3_23},<br \/>\r\nyear  = {2011},<br \/>\r\ndate = {2011-01-01},<br \/>\r\nbooktitle = {Proceedings of the 2nd International Conference on Algorithmic Decision Theory (ADT)},<br \/>\r\npages = {306--319},<br \/>\r\naddress = {Piscataway, New Jersey},<br \/>\r\nabstract = {Influence diagrams (IDs) offer a powerful framework for decision making under uncertainty, but their applicability has been hindered by the exponential growth of runtime and memory usage--largely due to the no-forgetting assumption. We present a novel way to maintain a limited amount of memory to inform each decision and still obtain near-optimal policies. The approach is based on augmenting the graphical model with memory states that represent key aspects of previous observations--a method that has proved useful in POMDP solvers. We also derive an efficient EM-based message-passing algorithm to compute the policy. 
Experimental results show that this approach produces high-quality approximate policies and offers better scalability than existing methods.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('945','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_945\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Influence diagrams (IDs) offer a powerful framework for decision making under uncertainty, but their applicability has been hindered by the exponential growth of runtime and memory usage--largely due to the no-forgetting assumption. We present a novel way to maintain a limited amount of memory to inform each decision and still obtain near-optimal policies. The approach is based on augmenting the graphical model with memory states that represent key aspects of previous observations--a method that has proved useful in POMDP solvers. We also derive an efficient EM-based message-passing algorithm to compute the policy. 
Experimental results show that this approach produces high-quality approximate policies and offers better scalability than existing methods.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('945','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_945\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKZadt11.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKZadt11.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKZadt11.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1007\/978-3-642-24873-3_23\" title=\"Follow DOI:10.1007\/978-3-642-24873-3_23\" target=\"_blank\">doi:10.1007\/978-3-642-24873-3_23<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('945','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Petrik, Marek;  Taylor, Gavin;  Parr, Ron;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('953','tp_links')\" style=\"cursor:pointer;\">Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 27th International Conference on Machine Learning (ICML), <\/span><span class=\"tp_pub_additional_address\">Haifa, Israel, <\/span><span class=\"tp_pub_additional_year\">2010<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_953\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('953','tp_abstract')\" title=\"Show 
abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_953\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('953','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_953\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('953','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_953\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:PTPZicml10,<br \/>\r\ntitle = {Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes},<br \/>\r\nauthor = {Marek Petrik and Gavin Taylor and Ron Parr and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PTPZicml10.pdf},<br \/>\r\nyear  = {2010},<br \/>\r\ndate = {2010-01-01},<br \/>\r\nbooktitle = {Proceedings of the 27th International Conference on Machine Learning (ICML)},<br \/>\r\npages = {871--878},<br \/>\r\naddress = {Haifa, Israel},<br \/>\r\nabstract = {Approximate dynamic programming has been used successfully in a large variety of domains, but it relies on a small set of provided approximation features to calculate solutions reliably. Large and rich sets of features can cause existing algorithms to overfit because of a limited number of samples. We address this shortcoming using L1 regularization in approximate linear programming. Because the proposed method can automatically select the appropriate richness of features, its performance does not degrade with an increasing number of features. These results rely on new and stronger sampling bounds for regularized approximate linear programs. We also propose a computationally efficient homotopy method. 
The empirical evaluation of the approach shows that the proposed method performs well on simple MDPs and standard benchmark problems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('953','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_953\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Approximate dynamic programming has been used successfully in a large variety of domains, but it relies on a small set of provided approximation features to calculate solutions reliably. Large and rich sets of features can cause existing algorithms to overfit because of a limited number of samples. We address this shortcoming using L1 regularization in approximate linear programming. Because the proposed method can automatically select the appropriate richness of features, its performance does not degrade with an increasing number of features. These results rely on new and stronger sampling bounds for regularized approximate linear programs. We also propose a computationally efficient homotopy method. 
The empirical evaluation of the approach shows that the proposed method performs well on simple MDPs and standard benchmark problems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('953','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_953\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PTPZicml10.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PTPZicml10.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PTPZicml10.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('953','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Petrik, Marek;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('968','tp_links')\" style=\"cursor:pointer;\">Constraint Relaxation in Approximate Linear Programs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 26th International Conference on Machine Learning (ICML), <\/span><span class=\"tp_pub_additional_address\">Montreal, Canada, <\/span><span class=\"tp_pub_additional_year\">2009<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_968\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('968','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_968\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('968','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_968\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('968','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_968\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:PZicml09,<br \/>\r\ntitle = {Constraint Relaxation in Approximate Linear Programs},<br \/>\r\nauthor = {Marek Petrik and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZicml09.pdf},<br \/>\r\nyear  = {2009},<br \/>\r\ndate = {2009-01-01},<br \/>\r\nbooktitle = {Proceedings of the 26th International Conference on Machine Learning (ICML)},<br \/>\r\npages = {809--816},<br \/>\r\naddress = {Montreal, Canada},<br \/>\r\nabstract = {Approximate Linear Programming (ALP) is a reinforcement learning technique with nice theoretical properties, but it often performs poorly in practice. We identify some reasons for the poor quality of ALP solutions in problems where the approximation induces virtual loops. We then introduce two methods for improving solution quality. One method rolls out selected constraints of the ALP, guided by the dual information. The second method is a relaxation of the ALP, based on external penalty methods. The latter method is applicable in domains in which rolling out constraints is impractical. 
Both approaches show promising empirical results for simple benchmark problems as well as for a realistic blood inventory management problem.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('968','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_968\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Approximate Linear Programming (ALP) is a reinforcement learning technique with nice theoretical properties, but it often performs poorly in practice. We identify some reasons for the poor quality of ALP solutions in problems where the approximation induces virtual loops. We then introduce two methods for improving solution quality. One method rolls out selected constraints of the ALP, guided by the dual information. The second method is a relaxation of the ALP, based on external penalty methods. The latter method is applicable in domains in which rolling out constraints is impractical. 
Both approaches show promising empirical results for simple benchmark problems as well as for a realistic blood inventory management problem.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('968','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_968\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZicml09.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZicml09.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZicml09.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('968','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Petrik, Marek;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('977','tp_links')\" style=\"cursor:pointer;\">Robust Value Function Approximation Using Bilinear Programming<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 23rd Neural Information Processing Systems Conference (NIPS), <\/span><span class=\"tp_pub_additional_address\">Vancouver, British Columbia, Canada, <\/span><span class=\"tp_pub_additional_year\">2009<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_977\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('977','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_977\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('977','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_977\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('977','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_977\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:PZnips09,<br \/>\r\ntitle = {Robust Value Function Approximation Using Bilinear Programming},<br \/>\r\nauthor = {Marek Petrik and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZnips09.pdf},<br \/>\r\nyear  = {2009},<br \/>\r\ndate = {2009-01-01},<br \/>\r\nbooktitle = {Proceedings of the 23rd Neural Information Processing Systems Conference (NIPS)},<br \/>\r\npages = {1446--1454},<br \/>\r\naddress = {Vancouver, British Columbia, Canada},<br \/>\r\nabstract = {Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose approximate bilinear programming, a new formulation of value function approximation that provides strong a priori guarantees. In particular, this approach provably finds an approximate value function that minimizes the Bellman residual. Solving a bilinear program optimally is NP-hard, but this is unavoidable because the Bellman-residual minimization itself is NP-hard. We therefore employ and analyze a common approximate algorithm for bilinear programs. The analysis shows that this algorithm offers a convergent generalization of approximate policy iteration. 
Finally, we demonstrate that the proposed approach can consistently minimize the Bellman residual on a simple benchmark problem.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('977','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_977\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose approximate bilinear programming, a new formulation of value function approximation that provides strong a priori guarantees. In particular, this approach provably finds an approximate value function that minimizes the Bellman residual. Solving a bilinear program optimally is NP-hard, but this is unavoidable because the Bellman-residual minimization itself is NP-hard. We therefore employ and analyze a common approximate algorithm for bilinear programs. The analysis shows that this algorithm offers a convergent generalization of approximate policy iteration. 
Finally, we demonstrate that the proposed approach can consistently minimize the Bellman residual on a simple benchmark problem.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('977','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_977\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZnips09.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZnips09.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZnips09.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('977','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Petrik, Marek;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('996','tp_links')\" style=\"cursor:pointer;\">Average-Reward Decentralized Markov Decision Processes<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Hyderabad, India, <\/span><span class=\"tp_pub_additional_year\">2007<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_996\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('996','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_996\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('996','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_996\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('996','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_996\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:PZijcai07,<br \/>\r\ntitle = {Average-Reward Decentralized Markov Decision Processes},<br \/>\r\nauthor = {Marek Petrik and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZijcai07.pdf},<br \/>\r\nyear  = {2007},<br \/>\r\ndate = {2007-01-01},<br \/>\r\nbooktitle = {Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {1997--2002},<br \/>\r\naddress = {Hyderabad, India},<br \/>\r\nabstract = {Formal analysis of decentralized decision making has become a thriving research area in recent years, producing a number of multi-agent extensions of Markov decision processes. While much of the work has focused on optimizing discounted cumulative reward, optimizing average reward is sometimes a more suitable criterion. We formalize a class of such problems and analyze its characteristics, showing that it is NP-complete and that optimal policies are deterministic. Our analysis lays the foundation for designing two optimal algorithms. 
Experimental results with a standard problem from the literature illustrate the applicability of these solution techniques.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('996','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_996\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Formal analysis of decentralized decision making has become a thriving research area in recent years, producing a number of multi-agent extensions of Markov decision processes. While much of the work has focused on optimizing discounted cumulative reward, optimizing average reward is sometimes a more suitable criterion. We formalize a class of such problems and analyze its characteristics, showing that it is NP-complete and that optimal policies are deterministic. Our analysis lays the foundation for designing two optimal algorithms. 
Experimental results with a standard problem from the literature illustrate the applicability of these solution techniques.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('996','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_996\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZijcai07.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZijcai07.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZijcai07.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('996','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Feng, Zhengzhu;  Hansen, Eric A;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1040','tp_links')\" style=\"cursor:pointer;\">Symbolic Generalization for On-line Planning<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 19th Conference on Uncertainty in Artificial Intelligence (UAI), <\/span><span class=\"tp_pub_additional_address\">Acapulco, Mexico, <\/span><span class=\"tp_pub_additional_year\">2003<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1040\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1040','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1040\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1040','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1040\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1040','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1040\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:FHZuai03,<br \/>\r\ntitle = {Symbolic Generalization for On-line Planning},<br \/>\r\nauthor = {Zhengzhu Feng and Eric A Hansen and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FHZuai03.pdf},<br \/>\r\nyear  = {2003},<br \/>\r\ndate = {2003-01-01},<br \/>\r\nbooktitle = {Proceedings of the 19th Conference on Uncertainty in Artificial Intelligence (UAI)},<br \/>\r\npages = {209--216},<br \/>\r\naddress = {Acapulco, Mexico},<br \/>\r\nabstract = {Symbolic representations have been used successfully in off-line planning algorithms for Markov decision processes. We show that they can also improve the performance of on-line planners. In addition to reducing computation time, symbolic generalization can reduce the amount of costly real-world interactions required for convergence. We introduce Symbolic Real-Time Dynamic Programming (or sRTDP), an extension of RTDP. After each step of on-line interaction with an environment, sRTDP uses symbolic model-checking techniques to generalizes its experience by updating a group of states rather than a single state. 
We examine two heuristic approaches to dynamic grouping of states and show that they accelerate the planning process significantly in terms of both CPU time and the number of steps of interaction with the environment.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1040','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1040\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Symbolic representations have been used successfully in off-line planning algorithms for Markov decision processes. We show that they can also improve the performance of on-line planners. In addition to reducing computation time, symbolic generalization can reduce the amount of costly real-world interactions required for convergence. We introduce Symbolic Real-Time Dynamic Programming (or sRTDP), an extension of RTDP. After each step of on-line interaction with an environment, sRTDP uses symbolic model-checking techniques to generalize its experience by updating a group of states rather than a single state. 
We examine two heuristic approaches to dynamic grouping of states and show that they accelerate the planning process significantly in terms of both CPU time and the number of steps of interaction with the environment.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1040','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1040\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FHZuai03.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FHZuai03.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FHZuai03.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1040','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Hansen, Eric A;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1051','tp_links')\" style=\"cursor:pointer;\">LAO*: A Heuristic Search Algorithm that Finds Solutions with Loops<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Artificial Intelligence (AIJ), <\/span><span class=\"tp_pub_additional_volume\">vol. 129, <\/span><span class=\"tp_pub_additional_number\">no. 1-2, <\/span><span class=\"tp_pub_additional_pages\">pp. 
35\u201362, <\/span><span class=\"tp_pub_additional_year\">2001<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1051\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1051','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1051\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1051','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1051\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1051','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1051\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:HZaij01bb,<br \/>\r\ntitle = {LAO*: A Heuristic Search Algorithm that Finds Solutions with Loops},<br \/>\r\nauthor = {Eric A Hansen and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaij01b.pdf},<br \/>\r\ndoi = {10.1016\/S0004-3702(01)00106-0},<br \/>\r\nyear  = {2001},<br \/>\r\ndate = {2001-01-01},<br \/>\r\njournal = {Artificial Intelligence (AIJ)},<br \/>\r\nvolume = {129},<br \/>\r\nnumber = {1-2},<br \/>\r\npages = {35--62},<br \/>\r\nabstract = {Classic heuristic search algorithms can find solutions that take the form of a simple path (A*), a tree, or an acyclic graph (AO*). In this paper, we describe a novel generalization of heuristic search, called LAO*, that can find solutions with loops. 
We show that LAO* can be used to solve Markov decision problems and that it shares the advantage heuristic search has over dynamic programming for other classes of problems: given a start state, it can find an optimal solution without evaluating the entire state space.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1051','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1051\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Classic heuristic search algorithms can find solutions that take the form of a simple path (A*), a tree, or an acyclic graph (AO*). In this paper, we describe a novel generalization of heuristic search, called LAO*, that can find solutions with loops. We show that LAO* can be used to solve Markov decision problems and that it shares the advantage heuristic search has over dynamic programming for other classes of problems: given a start state, it can find an optimal solution without evaluating the entire state space.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1051','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1051\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaij01b.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaij01b.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaij01b.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1016\/S0004-3702(01)00106-0\" title=\"Follow DOI:10.1016\/S0004-3702(01)00106-0\" target=\"_blank\">doi:10.1016\/S0004-3702(01)00106-0<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" 
onclick=\"teachpress_pub_showhide('1051','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Hansen, Eric A;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1068','tp_links')\" style=\"cursor:pointer;\">Heuristic Search in Cyclic AND\/OR Graphs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 15th National Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">Madison, Wisconsin, <\/span><span class=\"tp_pub_additional_year\">1998<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1068\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1068','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1068\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1068','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1068\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1068','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1068\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:HZaaai98,<br \/>\r\ntitle = {Heuristic Search in Cyclic AND\/OR Graphs},<br \/>\r\nauthor = {Eric A Hansen and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaaai98.pdf},<br \/>\r\nyear  = {1998},<br \/>\r\ndate = {1998-01-01},<br \/>\r\nbooktitle = {Proceedings of the 15th National Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {412--418},<br \/>\r\naddress = {Madison, 
Wisconsin},<br \/>\r\nabstract = {Heuristic search algorithms can find solutions that take the form of a simple path (A*), a tree or an acyclic graph (AO*). We present a novel generalization of heuristic search (called LAO*) that can find solutions with loops, that is, solutions that take the form of a cyclic graph. We show that it can be used to solve Markov decision problems without evaluating the entire state space, giving it an advantage over dynamic-programming algorithms such as policy iteration and value iteration as an approach to stochastic planning.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1068','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1068\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Heuristic search algorithms can find solutions that take the form of a simple path (A*), a tree or an acyclic graph (AO*). We present a novel generalization of heuristic search (called LAO*) that can find solutions with loops, that is, solutions that take the form of a cyclic graph. 
We show that it can be used to solve Markov decision problems without evaluating the entire state space, giving it an advantage over dynamic-programming algorithms such as policy iteration and value iteration as an approach to stochastic planning.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1068','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1068\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaaai98.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaaai98.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HZaaai98.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1068','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Hansen, Eric A;  Barto, Andrew G;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1084','tp_links')\" style=\"cursor:pointer;\">Reinforcement Learning for Mixed Open-loop and Closed-loop Control<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 9th Neural Information Processing Systems Conference (NIPS), <\/span><span class=\"tp_pub_additional_address\">Denver, Colorado, <\/span><span class=\"tp_pub_additional_year\">1996<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1084\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1084','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1084\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('1084','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1084\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1084','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1084\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:HBZnips96,<br \/>\r\ntitle = {Reinforcement Learning for Mixed Open-loop and Closed-loop Control},<br \/>\r\nauthor = {Eric A Hansen and Andrew G Barto and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HBZnips96.pdf},<br \/>\r\nyear  = {1996},<br \/>\r\ndate = {1996-01-01},<br \/>\r\nbooktitle = {Proceedings of the 9th Neural Information Processing Systems Conference (NIPS)},<br \/>\r\npages = {1026--1032},<br \/>\r\naddress = {Denver, Colorado},<br \/>\r\nabstract = {Closed-loop control relies on sensory feedback that is usually assumed to be free. But if sensing incurs a cost, it may be cost effective to take sequences of actions in open-loop mode. We describe a reinforcement learning algorithm that learns to combine open-loop and closed-loop control when sensing incurs a cost. Although we assume reliable sensors, use of open-loop control means that actions must sometimes be taken when the current state of the controlled system is uncertain. This is a special case of the hidden-state problem in reinforcement learning, and to cope, our algorithm relies on short-term memory. The main result of the paper is a rule that significantly limits exploration of possible memory states by pruning memory states for which the estimated value of information is greater than its cost. 
We prove that this rule allows convergence to an optimal policy.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1084','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1084\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Closed-loop control relies on sensory feedback that is usually assumed to be free. But if sensing incurs a cost, it may be cost effective to take sequences of actions in open-loop mode. We describe a reinforcement learning algorithm that learns to combine open-loop and closed-loop control when sensing incurs a cost. Although we assume reliable sensors, use of open-loop control means that actions must sometimes be taken when the current state of the controlled system is uncertain. This is a special case of the hidden-state problem in reinforcement learning, and to cope, our algorithm relies on short-term memory. The main result of the paper is a rule that significantly limits exploration of possible memory states by pruning memory states for which the estimated value of information is greater than its cost. 
We prove that this rule allows convergence to an optimal policy.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1084','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1084\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HBZnips96.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HBZnips96.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/HBZnips96.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1084','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><\/table><\/div><\/div>\n<div><\/div><\/div><\/div>\n<\/div>\n<h3><span style=\"color: #264278\"><b>Belief-Space Planning and POMDPs<\/b><\/span><\/h3>\n<div>How can an agent select actions based on partial and imprecise information about the environment, and how can we design efficient algorithms for planning in belief space?<\/div>\n<div><div class=\"bg-margin-for-link\"><input type='hidden' bg_collapse_expand='69d0b4f82298e3086563540' value='69d0b4f82298e3086563540'><input type='hidden' id='bg-show-more-text-69d0b4f82298e3086563540' value='Show Related Publications'><input type='hidden' id='bg-show-less-text-69d0b4f82298e3086563540' value='Hide Related Publications'><a id='bg-showmore-action-69d0b4f82298e3086563540' class='bg-showmore-plg-link bg-arrow '  style=\" color:#7C2622;;\" href='#'>Show Related Publications<\/a><div id='bg-showmore-hidden-69d0b4f82298e3086563540' ><div class=\"teachpress_pub_list\"><form name=\"tppublistform\" method=\"get\"><a name=\"tppubs\" id=\"tppubs\"><\/a><\/form><table class=\"teachpress_publication_list\"><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Miura, Shuwa;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a 
class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1188','tp_links')\" style=\"cursor:pointer;\">Observer-Aware Planning with Implicit and Explicit Communication<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Auckland, New Zealand, <\/span><span class=\"tp_pub_additional_year\">2024<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1188\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1188','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1188\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1188','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1188\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1188','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1188\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MZaamas24,<br \/>\r\ntitle = {Observer-Aware Planning with Implicit and Explicit Communication},<br \/>\r\nauthor = {Shuwa Miura and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas24.pdf},<br \/>\r\nyear  = {2024},<br \/>\r\ndate = {2024-01-01},<br \/>\r\nbooktitle = {Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},<br \/>\r\naddress = {Auckland, New Zealand},<br \/>\r\nabstract = {This paper presents a computational model designed for planning both implicit and explicit communication of intentions, goals, and desires. 
Building upon previous research focused on implicit communication of intention via actions, our model seeks to strategically influence an observer\u2019s belief using both the agent\u2019s actions and explicit messages. We show that our proposed model can be considered to be a special case of general multi-agent problems with explicit communication under certain assumptions. Since the mental state of the observer depends on histories, computing a policy for the proposed model amounts to optimizing a non-Markovian objective, which we show to be intractable in the worst case. To mitigate this challenge, we propose a technique based on splitting domain and communication actions during planning. We conclude with experimental evaluations of the proposed approach that illustrate its effectiveness.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1188','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1188\" style=\"display:none;\"><div class=\"tp_abstract_entry\">This paper presents a computational model designed for planning both implicit and explicit communication of intentions, goals, and desires. Building upon previous research focused on implicit communication of intention via actions, our model seeks to strategically influence an observer\u2019s belief using both the agent\u2019s actions and explicit messages. We show that our proposed model can be considered to be a special case of general multi-agent problems with explicit communication under certain assumptions. Since the mental state of the observer depends on histories, computing a policy for the proposed model amounts to optimizing a non-Markovian objective, which we show to be intractable in the worst case. 
To mitigate this challenge, we propose a technique based on splitting domain and communication actions during planning. We conclude with experimental evaluations of the proposed approach that illustrate its effectiveness.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1188','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1188\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas24.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas24.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas24.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1188','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Mahmud, Saaduddin;  Vazquez-Chanlatte, Marcell;  Witwicki, Stefan;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1189','tp_links')\" style=\"cursor:pointer;\">Explaining the Behavior of POMDP-based Agents Through the Impact of Counterfactual Information<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Auckland, New Zealand, <\/span><span class=\"tp_pub_additional_year\">2024<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1189\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1189','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span 
class=\"tp_resource_link\"><a id=\"tp_links_sh_1189\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1189','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1189\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1189','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1189\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MVWZaamas24,<br \/>\r\ntitle = {Explaining the Behavior of POMDP-based Agents Through the Impact of Counterfactual Information},<br \/>\r\nauthor = {Saaduddin Mahmud and Marcell Vazquez-Chanlatte and Stefan Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MVWZaamas24.pdf},<br \/>\r\nyear  = {2024},<br \/>\r\ndate = {2024-01-01},<br \/>\r\nbooktitle = {Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},<br \/>\r\naddress = {Auckland, New Zealand},<br \/>\r\nabstract = {In this work, we consider AI agents operating in Partially Observable Markov Decision Processes (POMDPs)\u2013a widely-used framework for sequential decision making with incomplete state information. Agents operating with partial information take actions not only to advance their underlying goals but also to seek information and reduce uncertainty. Despite rapid progress in explainable AI, research on separating information-driven vs. goal-driven behaviors remains sparse. To address this gap, we introduce a novel explanation generation framework called Sequential Information Probing (SIP), to investigate the direct impact of state information, or its absence, on agent behavior. To quantify the impact we also propose two metrics under this SIP framework called Value of Information (VoI) and Influence of Information (IoI). 
We then theoretically derive several properties of these metrics. Finally, we present several experiments, including a case study on an autonomous vehicle, that illustrate the efficacy of our method.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1189','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1189\" style=\"display:none;\"><div class=\"tp_abstract_entry\">In this work, we consider AI agents operating in Partially Observable Markov Decision Processes (POMDPs)\u2013a widely-used framework for sequential decision making with incomplete state information. Agents operating with partial information take actions not only to advance their underlying goals but also to seek information and reduce uncertainty. Despite rapid progress in explainable AI, research on separating information-driven vs. goal-driven behaviors remains sparse. To address this gap, we introduce a novel explanation generation framework called Sequential Information Probing (SIP), to investigate the direct impact of state information, or its absence, on agent behavior. To quantify the impact we also propose two metrics under this SIP framework called Value of Information (VoI) and Influence of Information (IoI). We then theoretically derive several properties of these metrics. 
Finally, we present several experiments, including a case study on an autonomous vehicle, that illustrate the efficacy of our method.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1189','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1189\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MVWZaamas24.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MVWZaamas24.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MVWZaamas24.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1189','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Basich, Connor;  Peterson, John;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1164','tp_links')\" style=\"cursor:pointer;\">Planning with Intermittent State Observability: Knowing When to Act Blind<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), <\/span><span class=\"tp_pub_additional_address\">Kyoto, Japan, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1164\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1164','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1164\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1164','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1164\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1164','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1164\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:BPZiros22,<br \/>\r\ntitle = {Planning with Intermittent State Observability: Knowing When to Act Blind},<br \/>\r\nauthor = {Connor Basich and John Peterson and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BPZiros22.pdf},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)},<br \/>\r\npages = {11657--11664},<br \/>\r\naddress = {Kyoto, Japan},<br \/>\r\nabstract = {Contemporary planning models and methods often rely on constant availability of free state information at each step of execution. However, autonomous systems are increasingly deployed in the open world where state information may be costly or simply unavailable in certain situations. Failing to account for sensor limitations may lead to costly behavior or even catastrophic failure. While the partially observable Markov decision process (POMDP) can be used to model this problem, solving POMDPs is often intractable. We introduce a planning model called a semi-observable Markov decision process (SOMDP) specifically designed for MDPs where state observability may be intermittent. We propose an approach for solving SOMDPs that uses memory states to proactively plan for the potential loss of sensor information while exploiting the unique structure of SOMDPs. 
Our theoretical analysis and empirical evaluation demonstrate the advantages of SOMDPs relative to existing planning models.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1164','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1164\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Contemporary planning models and methods often rely on constant availability of free state information at each step of execution. However, autonomous systems are increasingly deployed in the open world where state information may be costly or simply unavailable in certain situations. Failing to account for sensor limitations may lead to costly behavior or even catastrophic failure. While the partially observable Markov decision process (POMDP) can be used to model this problem, solving POMDPs is often intractable. We introduce a planning model called a semi-observable Markov decision process (SOMDP) specifically designed for MDPs where state observability may be intermittent. We propose an approach for solving SOMDPs that uses memory states to proactively plan for the potential loss of sensor information while exploiting the unique structure of SOMDPs. 
Our theoretical analysis and empirical evaluation demonstrate the advantages of SOMDPs relative to existing planning models.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1164','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1164\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BPZiros22.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BPZiros22.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BPZiros22.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1164','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Basich, Connor;  Svegliato, Justin;  Beach, Allyson;  Wray, Kyle Hollins;  Witwicki, Stefan J;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1142','tp_links')\" style=\"cursor:pointer;\">Improving Competence via Iterative State Space Refinement<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), <\/span><span class=\"tp_pub_additional_address\">Prague, Czech Republic, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1142\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1142','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1142\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1142','tp_links')\" 
title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1142\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1142','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1142\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:BSBWWZiros21,<br \/>\r\ntitle = {Improving Competence via Iterative State Space Refinement},<br \/>\r\nauthor = {Connor Basich and Justin Svegliato and Allyson Beach and Kyle Hollins Wray and Stefan J Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSBWWZiros21.pdf},<br \/>\r\ndoi = {10.1109\/IROS51168.2021.9636239},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)},<br \/>\r\npages = {1865--1871},<br \/>\r\naddress = {Prague, Czech Republic},<br \/>\r\nabstract = {Despite considerable efforts by human designers, accounting for every unique situation that an autonomous robotic system deployed in the real world could face is often an infeasible task. As a result, many such deployed systems still rely on human assistance in various capacities to complete certain tasks while staying safe. Competence-aware systems (CAS) is a recently proposed model for reducing such reliance on human assistance while in turn optimizing the system\u2019s global autonomous operation by learning its own competence. However, such systems are limited by a fixed model of their environment and may perform poorly if their a priori planning model does not include certain features that emerge as important over the course of the system\u2019s deployment. 
In this paper, we propose a method for improving the competence of a CAS over time by identifying important state features missing from the system\u2019s model and incorporating them into its state representation, thereby refining its state space. Our approach exploits information that exists in the standard CAS model and adds no extra work to the human. The result is an agent that better predicts human involvement, improving its competence, reliability, and overall performance.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1142','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1142\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Despite considerable efforts by human designers, accounting for every unique situation that an autonomous robotic system deployed in the real world could face is often an infeasible task. As a result, many such deployed systems still rely on human assistance in various capacities to complete certain tasks while staying safe. Competence-aware systems (CAS) is a recently proposed model for reducing such reliance on human assistance while in turn optimizing the system\u2019s global autonomous operation by learning its own competence. However, such systems are limited by a fixed model of their environment and may perform poorly if their a priori planning model does not include certain features that emerge as important over the course of the system\u2019s deployment. In this paper, we propose a method for improving the competence of a CAS over time by identifying important state features missing from the system\u2019s model and incorporating them into its state representation, thereby refining its state space. Our approach exploits information that exists in the standard CAS model and adds no extra work to the human. 
The result is an agent that better predicts human involvement, improving its competence, reliability, and overall performance.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1142','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1142\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSBWWZiros21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSBWWZiros21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSBWWZiros21.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1109\/IROS51168.2021.9636239\" title=\"Follow DOI:10.1109\/IROS51168.2021.9636239\" target=\"_blank\">doi:10.1109\/IROS51168.2021.9636239<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1142','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Parr, Shane;  Khatri, Ishan;  Svegliato, Justin;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1143','tp_links')\" style=\"cursor:pointer;\">Agent-Aware State Estimation in Autonomous Vehicles<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), <\/span><span class=\"tp_pub_additional_address\">Prague, Czech Republic, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1143\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1143','tp_abstract')\" title=\"Show 
abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1143\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1143','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1143\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1143','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1143\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:PKSZiros21,<br \/>\r\ntitle = {Agent-Aware State Estimation in Autonomous Vehicles},<br \/>\r\nauthor = {Shane Parr and Ishan Khatri and Justin Svegliato and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PKSZiros21.pdf},<br \/>\r\ndoi = {10.1109\/IROS51168.2021.9636210},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)},<br \/>\r\npages = {6694--6699},<br \/>\r\naddress = {Prague, Czech Republic},<br \/>\r\nabstract = {Autonomous systems often operate in environments where the behavior of multiple agents is coordinated by a shared global state. Reliable estimation of the global state is thus critical for successfully operating in a multi-agent setting. We introduce agent-aware state estimation--a framework for calculating indirect estimations of state given observations of the behavior of other agents in the environment. We also introduce transition-independent agent-aware state estimation--a tractable class of agent-aware state estimation--and show that it allows the speed of inference to scale linearly with the number of agents in the environment. As an example, we model traffic light classification in instances of complete loss of direct observation. 
By taking into account observations of vehicular behavior from multiple directions of traffic, our approach exhibits accuracy higher than that of existing traffic light-only HMM methods on a real-world autonomous vehicle data set under a variety of simulated occlusion scenarios.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1143','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1143\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous systems often operate in environments where the behavior of multiple agents is coordinated by a shared global state. Reliable estimation of the global state is thus critical for successfully operating in a multi-agent setting. We introduce agent-aware state estimation--a framework for calculating indirect estimations of state given observations of the behavior of other agents in the environment. We also introduce transition-independent agent-aware state estimation--a tractable class of agent-aware state estimation--and show that it allows the speed of inference to scale linearly with the number of agents in the environment. As an example, we model traffic light classification in instances of complete loss of direct observation. 
By taking into account observations of vehicular behavior from multiple directions of traffic, our approach exhibits accuracy higher than that of existing traffic light-only HMM methods on a real-world autonomous vehicle data set under a variety of simulated occlusion scenarios.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1143','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1143\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PKSZiros21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PKSZiros21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PKSZiros21.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1109\/IROS51168.2021.9636210\" title=\"Follow DOI:10.1109\/IROS51168.2021.9636210\" target=\"_blank\">doi:10.1109\/IROS51168.2021.9636210<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1143','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Miura, Shuwa;  Cohen, Andrew L;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1145','tp_links')\" style=\"cursor:pointer;\">Maximizing Legibility in Stochastic Environments<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 30th IEEE International Conference on Robot &amp; Human Interactive Communication, (RO-MAN), <\/span><span class=\"tp_pub_additional_address\">Vancouver, BC, Canada, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span 
class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1145\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1145','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1145\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1145','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1145\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1145','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1145\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MCZroman21,<br \/>\r\ntitle = {Maximizing Legibility in Stochastic Environments},<br \/>\r\nauthor = {Shuwa Miura and Andrew L Cohen and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MCZroman21.pdf},<br \/>\r\ndoi = {10.1109\/RO-MAN50785.2021.9515318},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the 30th IEEE International Conference on Robot & Human Interactive Communication, (RO-MAN)},<br \/>\r\npages = {1053--1059},<br \/>\r\naddress = {Vancouver, BC, Canada},<br \/>\r\nabstract = {Making an agent's intentions clear from its observed behavior is crucial for seamless human-agent interaction and for increased transparency and trust in AI systems. Existing methods that address this challenge and maximize legibility of behaviors are limited to deterministic domains. We develop a technique for maximizing legibility in stochastic environments and illustrate that using legibility as an objective improves interpretability of agent behavior in several scenarios. 
We provide initial empirical evidence that human subjects can better interpret legible behavior.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1145','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1145\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Making an agent's intentions clear from its observed behavior is crucial for seamless human-agent interaction and for increased transparency and trust in AI systems. Existing methods that address this challenge and maximize legibility of behaviors are limited to deterministic domains. We develop a technique for maximizing legibility in stochastic environments and illustrate that using legibility as an objective improves interpretability of agent behavior in several scenarios. We provide initial empirical evidence that human subjects can better interpret legible behavior.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1145','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1145\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MCZroman21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MCZroman21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MCZroman21.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1109\/RO-MAN50785.2021.9515318\" title=\"Follow DOI:10.1109\/RO-MAN50785.2021.9515318\" target=\"_blank\">doi:10.1109\/RO-MAN50785.2021.9515318<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1145','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr 
class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Miura, Shuwa;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1146','tp_links')\" style=\"cursor:pointer;\">A Unifying Framework for Observer-Aware Planning and its Complexity<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 37th Conference on Uncertainty in Artificial Intelligence (UAI), <\/span><span class=\"tp_pub_additional_address\">Virtual Event, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1146\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1146','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1146\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1146','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1146\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1146','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1146\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MZuai21,<br \/>\r\ntitle = {A Unifying Framework for Observer-Aware Planning and its Complexity},<br \/>\r\nauthor = {Shuwa Miura and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZuai21.pdf},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the 37th Conference on Uncertainty in Artificial Intelligence (UAI)},<br \/>\r\npages = {610--620},<br \/>\r\naddress = {Virtual Event},<br \/>\r\nabstract = {Being aware of 
observers and the inferences they make about an agent's behavior is crucial for successful multi-agent interaction. Existing works on observer-aware planning use different assumptions and techniques to produce observer-aware behaviors. We argue that observer-aware planning, in its most general form, can be modeled as an Interactive POMDP (I-POMDP), which requires complex modeling and is hard to solve. Hence, we introduce a less complex framework for producing observer-aware behaviors called Observer-Aware MDP (OAMDP) and analyze its relationship to I-POMDP. We establish the complexity of OAMDPs and show that they can improve interpretability of agent behaviors in several scenarios.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1146','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1146\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Being aware of observers and the inferences they make about an agent's behavior is crucial for successful multi-agent interaction. Existing works on observer-aware planning use different assumptions and techniques to produce observer-aware behaviors. We argue that observer-aware planning, in its most general form, can be modeled as an Interactive POMDP (I-POMDP), which requires complex modeling and is hard to solve. Hence, we introduce a less complex framework for producing observer-aware behaviors called Observer-Aware MDP (OAMDP) and analyze its relationship to I-POMDP. 
We establish the complexity of OAMDPs and show that they can improve interpretability of agent behaviors in several scenarios.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1146','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1146\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZuai21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZuai21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZuai21.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1146','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1098','tp_links')\" style=\"cursor:pointer;\">Generalized Controllers in POMDP Decision-Making<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), <\/span><span class=\"tp_pub_additional_address\">Montreal, Quebec, CA, <\/span><span class=\"tp_pub_additional_year\">2019<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1098\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1098','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1098\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1098','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1098\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1098','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1098\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WZicra19,<br \/>\r\ntitle = {Generalized Controllers in POMDP Decision-Making},<br \/>\r\nauthor = {Kyle Hollins Wray and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZicra19.pdf},<br \/>\r\nyear  = {2019},<br \/>\r\ndate = {2019-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},<br \/>\r\npages = {7166--7172},<br \/>\r\naddress = {Montreal, Quebec, CA},<br \/>\r\nabstract = {We present a general policy formulation for partially observable Markov decision processes (POMDPs) called controller family policies that may be used as a framework to facilitate the design of new policy forms. We prove how modern approximate policy forms: point-based, finite state controller (FSC), and belief compression, are instances of this family of generalized controller policies. Our analysis provides a deeper understanding of the POMDP model and suggests novel ways to design POMDP solutions that can combine the benefits of different state-of-the-art methods. We illustrate this capability by creating a new customized POMDP policy form called the belief-integrated FSC (BI-FSC) tailored to overcome the shortcomings of a state-of-the-art algorithm that uses non-linear programming (NLP). Specifically, experiments show that for NLP the BI-FSC offers improved performance over a vanilla FSC-based policy form on benchmark domains. Furthermore, we demonstrate the BI-FSC's execution on a real robot navigating in a maze environment. 
Results confirm the value of using the controller family policy as a framework to design customized policies in POMDP robotic solutions.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1098','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1098\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We present a general policy formulation for partially observable Markov decision processes (POMDPs) called controller family policies that may be used as a framework to facilitate the design of new policy forms. We prove how modern approximate policy forms: point-based, finite state controller (FSC), and belief compression, are instances of this family of generalized controller policies. Our analysis provides a deeper understanding of the POMDP model and suggests novel ways to design POMDP solutions that can combine the benefits of different state-of-the-art methods. We illustrate this capability by creating a new customized POMDP policy form called the belief-integrated FSC (BI-FSC) tailored to overcome the shortcomings of a state-of-the-art algorithm that uses non-linear programming (NLP). Specifically, experiments show that for NLP the BI-FSC offers improved performance over a vanilla FSC-based policy form on benchmark domains. Furthermore, we demonstrate the BI-FSC's execution on a real robot navigating in a maze environment. 
Results confirm the value of using the controller family policy as a framework to design customized policies in POMDP robotic solutions.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1098','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1098\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZicra19.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZicra19.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZicra19.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1098','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Wray, Kyle Hollins;  Pineda, Luis Enrique;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1106','tp_links')\" style=\"cursor:pointer;\">Planning in Stochastic Environments with Goal Uncertainty<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), <\/span><span class=\"tp_pub_additional_address\">Macau, China, <\/span><span class=\"tp_pub_additional_year\">2019<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1106\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1106','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1106\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1106','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1106\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1106','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1106\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SWPZiros19,<br \/>\r\ntitle = {Planning in Stochastic Environments with Goal Uncertainty},<br \/>\r\nauthor = {Sandhya Saisubramanian and Kyle Hollins Wray and Luis Enrique Pineda and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWPZiros19.pdf},<br \/>\r\nyear  = {2019},<br \/>\r\ndate = {2019-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)},<br \/>\r\naddress = {Macau, China},<br \/>\r\nabstract = {We present the Goal Uncertain Stochastic Shortest Path (GUSSP) problem -- a general framework to model path planning and decision making in stochastic environments with goal uncertainty. The framework extends the stochastic shortest path (SSP) model to dynamic environments in which it is impossible to determine the exact goal states ahead of plan execution. GUSSPs introduce flexibility in goal specification by allowing a belief over possible goal configurations. The unique observations at potential goals help the agent identify the true goal during plan execution. The partial observability is restricted to goals, facilitating the reduction to an SSP with a modified state space. We formally define a GUSSP and discuss its theoretical properties. We then propose an admissible heuristic that reduces the planning time using FLARES -- a state-of-the-art probabilistic planner. We also propose a determinization approach for solving this class of problems. 
Finally, we present empirical results on a search and rescue mobile robot and three other problem domains in simulation.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1106','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1106\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We present the Goal Uncertain Stochastic Shortest Path (GUSSP) problem -- a general framework to model path planning and decision making in stochastic environments with goal uncertainty. The framework extends the stochastic shortest path (SSP) model to dynamic environments in which it is impossible to determine the exact goal states ahead of plan execution. GUSSPs introduce flexibility in goal specification by allowing a belief over possible goal configurations. The unique observations at potential goals help the agent identify the true goal during plan execution. The partial observability is restricted to goals, facilitating the reduction to an SSP with a modified state space. We formally define a GUSSP and discuss its theoretical properties. We then propose an admissible heuristic that reduces the planning time using FLARES -- a state-of-the-art probabilistic planner. We also propose a determinization approach for solving this class of problems. 
Finally, we present empirical results on a search and rescue mobile robot and three other problem domains in simulation.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1106','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1106\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWPZiros19.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWPZiros19.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWPZiros19.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1106','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('878','tp_links')\" style=\"cursor:pointer;\">Approximating Reachable Belief Points in POMDPs with Applications to Robotic Navigation and Localization<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">ICAPS Workshop on Planning and Robotics (PlanRob), <\/span><span class=\"tp_pub_additional_address\">Pittsburgh, Pennsylvania, <\/span><span class=\"tp_pub_additional_year\">2017<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_878\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('878','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_878\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('878','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | 
<span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_878\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('878','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_878\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WZplanrob17,<br \/>\r\ntitle = {Approximating Reachable Belief Points in POMDPs with Applications to Robotic Navigation and Localization},<br \/>\r\nauthor = {Kyle Hollins Wray and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZplanrob17.pdf},<br \/>\r\nyear  = {2017},<br \/>\r\ndate = {2017-01-01},<br \/>\r\nbooktitle = {ICAPS Workshop on Planning and Robotics (PlanRob)},<br \/>\r\naddress = {Pittsburgh, Pennsylvania},<br \/>\r\nabstract = {Stochastic network design is a general framework for optimizing network connectivity. It has several applications in computational sustainability including spatial conservation planning, pre-disaster network preparation, and river network optimization. A common assumption made in previous work is that network parameters (e.g., probability of species colonization) are precisely known, which is unrealistic in real-world settings. We therefore address the robust river network design problem where the goal is to optimize river connectivity for fish movement by removing barriers. We assume that fish passability probabilities are known only imprecisely, but are within some interval bounds. We then develop a planning approach that computes the policies with either high robust ratio or low regret. Empirically, our approach scales well to large river networks. 
We also provide insights into the solutions generated by our robust approach, which has significantly higher robust ratio than the baseline solution with mean parameter estimates.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('878','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_878\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Stochastic network design is a general framework for optimizing network connectivity. It has several applications in computational sustainability including spatial conservation planning, pre-disaster network preparation, and river net- work optimization. A common assumption in previous work has been made that network parameters (e.g., probability of species colonization) are precisely known, which is unrealistic in real-world settings. We therefore address the robust river network design problem where the goal is to optimize river connectivity for fish movement by removing barriers. We assume that fish passability probabilities are known only imprecisely, but are within some interval bounds. We then develop a planning approach that computes the policies with either high robust ratio or low regret. Empirically, our approach scales well to large river networks. 
We also provide insights into the solutions generated by our robust approach, which has significantly higher robust ratio than the baseline solution with mean parameter estimates.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('878','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_878\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZplanrob17.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZplanrob17.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZplanrob17.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('878','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('885','tp_links')\" style=\"cursor:pointer;\">Approximating reachable belief points in POMDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), <\/span><span class=\"tp_pub_additional_address\">Vancouver, BC, Canada, <\/span><span class=\"tp_pub_additional_year\">2017<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_885\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('885','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_885\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('885','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_885\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('885','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_885\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WZiros17,<br \/>\r\ntitle = {Approximating reachable belief points in POMDPs},<br \/>\r\nauthor = {Kyle Hollins Wray and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZiros17.pdf},<br \/>\r\ndoi = {10.1109\/IROS.2017.8202146},<br \/>\r\nyear  = {2017},<br \/>\r\ndate = {2017-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)},<br \/>\r\npages = {117--122},<br \/>\r\naddress = {Vancouver, BC, Canada},<br \/>\r\nabstract = {We propose an algorithm called ?-approximation that compresses the non-zero values of beliefs for partially observable Markov decision processes (POMDPs) in order to improve performance and reduce memory usage. Specifically, we approximate individual belief vectors with a fixed bound on the number of non-zero values they may contain. We prove the correctness and a strong error bound when the ?-approximation is used with the point-based value iteration (PBVI) family algorithms. An analysis compares the algorithm on six larger domains, varying the number of non-zero values for the ?-approximation. Results clearly demonstrate that when the algorithm used with PBVI (?-PBVI), we can achieve over an order of magnitude improvement. 
We ground our claims with a full robotic implementation for simultaneous navigation and localization using POMDPs with ?-PBVI.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('885','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_885\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We propose an algorithm called ?-approximation that compresses the non-zero values of beliefs for partially observable Markov decision processes (POMDPs) in order to improve performance and reduce memory usage. Specifically, we approximate individual belief vectors with a fixed bound on the number of non-zero values they may contain. We prove the correctness and a strong error bound when the ?-approximation is used with the point-based value iteration (PBVI) family algorithms. An analysis compares the algorithm on six larger domains, varying the number of non-zero values for the ?-approximation. Results clearly demonstrate that when the algorithm used with PBVI (?-PBVI), we can achieve over an order of magnitude improvement. 
We ground our claims with a full robotic implementation for simultaneous navigation and localization using POMDPs with ?-PBVI.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('885','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_885\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZiros17.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZiros17.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZiros17.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1109\/IROS.2017.8202146\" title=\"Follow DOI:10.1109\/IROS.2017.8202146\" target=\"_blank\">doi:10.1109\/IROS.2017.8202146<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('885','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('888','tp_links')\" style=\"cursor:pointer;\">A POMDP Formulation of Proactive Learning<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 30th Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">Phoenix, Arizona, <\/span><span class=\"tp_pub_additional_year\">2016<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_888\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('888','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a 
id=\"tp_links_sh_888\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('888','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_888\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('888','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_888\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WZaaai16,<br \/>\r\ntitle = {A POMDP Formulation of Proactive Learning},<br \/>\r\nauthor = {Kyle Hollins Wray and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZaaai16.pdf},<br \/>\r\nyear  = {2016},<br \/>\r\ndate = {2016-01-01},<br \/>\r\nbooktitle = {Proceedings of the 30th Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {3202--3208},<br \/>\r\naddress = {Phoenix, Arizona},<br \/>\r\nabstract = {We cast the Proactive Learning (PAL) problem--Active Learning (AL) with multiple reluctant, fallible, cost-varying oracles--as a Partially Observable Markov Decision Process (POMDP). The agent selects an oracle at each time step to label a data point while it maintains a belief over the true underlying correctness of its current dataset's labels. The goal is to minimize labeling costs while considering the value of obtaining correct labels, thus maximizing final resultant classifier accuracy. We prove three properties that show our particular formulation leads to a structured and bounded-size set of belief points, enabling strong performance of point-based methods to solve the POMDP. Our method is compared with the original three algorithms proposed by Donmez and Carbonell and a simple baseline. We demonstrate that our approach matches or improves upon the original approach within five different oracle scenarios, each on two datasets. 
Finally, our algorithm provides a general, well-defined mathematical foundation to build upon.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('888','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_888\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We cast the Proactive Learning (PAL) problem--Active Learning (AL) with multiple reluctant, fallible, cost-varying oracles--as a Partially Observable Markov Decision Process (POMDP). The agent selects an oracle at each time step to label a data point while it maintains a belief over the true underlying correctness of its current dataset's labels. The goal is to minimize labeling costs while considering the value of obtaining correct labels, thus maximizing final resultant classifier accuracy. We prove three properties that show our particular formulation leads to a structured and bounded-size set of belief points, enabling strong performance of point-based methods to solve the POMDP. Our method is compared with the original three algorithms proposed by Donmez and Carbonell and a simple baseline. We demonstrate that our approach matches or improves upon the original approach within five different oracle scenarios, each on two datasets. 
Finally, our algorithm provides a general, well-defined mathematical foundation to build upon.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('888','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_888\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZaaai16.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZaaai16.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZaaai16.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('888','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Kumar, Akshat;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('903','tp_links')\" style=\"cursor:pointer;\">History-Based Controller Design and Optimization for Partially Observable MDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 25th International Conference on Automated Planning and Scheduling (ICAPS), <\/span><span class=\"tp_pub_additional_address\">Jerusalem, Israel, <\/span><span class=\"tp_pub_additional_year\">2015<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_903\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('903','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_903\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('903','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_903\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('903','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_903\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:KZicaps15,<br \/>\r\ntitle = {History-Based Controller Design and Optimization for Partially Observable MDPs},<br \/>\r\nauthor = {Akshat Kumar and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZicaps15.pdf},<br \/>\r\nyear  = {2015},<br \/>\r\ndate = {2015-01-01},<br \/>\r\nbooktitle = {Proceedings of the 25th International Conference on Automated Planning and Scheduling (ICAPS)},<br \/>\r\npages = {156--164},<br \/>\r\naddress = {Jerusalem, Israel},<br \/>\r\nabstract = {Partially observable MDPs provide an elegant framework for sequential decision making. Finite-state controllers (FSCs) are often used to represent policies for infinite-horizon problems as they offer a compact representation, simple-to-execute plans, and adjustable tradeoff between computational complexity and policy size. We develop novel connections between optimizing FSCs for POMDPs and the dual linear program for MDPs. Building on that, we present a dual mixed integer linear program (MIP) for optimizing FSCs. To assign well-defined meaning to FSC nodes as well as aid in policy search, we show how to associate history-based features with each FSC node. Using this representation, we address another challenging problem, that of iteratively deciding which nodes to add to FSC to get a better policy. 
Using an efficient off-the-shelf MIP solver, we show that this new approach can find compact near-optimal FSCs for several large benchmark domains, and is competitive with previous best approaches.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('903','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_903\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Partially observable MDPs provide an elegant framework for sequential decision making. Finite-state controllers (FSCs) are often used to represent policies for infinite-horizon problems as they offer a compact representation, simple-to-execute plans, and adjustable tradeoff between computational complexity and policy size. We develop novel connections between optimizing FSCs for POMDPs and the dual linear program for MDPs. Building on that, we present a dual mixed integer linear program (MIP) for optimizing FSCs. To assign well-defined meaning to FSC nodes as well as aid in policy search, we show how to associate history-based features with each FSC node. Using this representation, we address another challenging problem, that of iteratively deciding which nodes to add to FSC to get a better policy. 
Using an efficient off-the-shelf MIP solver, we show that this new approach can find compact near-optimal FSCs for several large benchmark domains, and is competitive with previous best approaches.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('903','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_903\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZicaps15.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZicaps15.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZicaps15.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('903','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('905','tp_links')\" style=\"cursor:pointer;\">Multi-Objective POMDPs with Lexicographic Reward Preferences<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Buenos Aires, Argentina, <\/span><span class=\"tp_pub_additional_year\">2015<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_905\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('905','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_905\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('905','tp_links')\" title=\"Show links 
and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_905\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('905','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_905\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:KZijcai15,<br \/>\r\ntitle = {Multi-Objective POMDPs with Lexicographic Reward Preferences},<br \/>\r\nauthor = {Kyle Hollins Wray and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZijcai15.pdf},<br \/>\r\nyear  = {2015},<br \/>\r\ndate = {2015-01-01},<br \/>\r\nbooktitle = {Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {1719--1725},<br \/>\r\naddress = {Buenos Aires, Argentina},<br \/>\r\nabstract = {We propose a model, Lexicographic Partially Observable Markov Decision Process (LPOMDP), which extends POMDPs with lexicographic preferences over multiple value functions. It allows for slack--slightly less-than-optimal values--for higher-priority preferences to facilitate improvement in lower-priority value functions. Many real life situations are naturally captured by LPOMDPs with slack. We consider a semi-autonomous driving scenario in which time spent on the road is minimized, while maximizing time spent driving autonomously. We propose two solutions to LPOMDPs--Lexicographic Value Iteration (LVI) and Lexicographic Point-Based Value Iteration (LPBVI), establishing convergence results and correctness within strong slack bounds. We test the algorithms using real-world road data provided by Open Street Map (OSM) within 10 major cities. 
Finally, we present GPU-based optimizations for point-based solvers, demonstrating that their application enables us to quickly solve vastly larger LPOMDPs and other variations of POMDPs.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('905','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_905\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We propose a model, Lexicographic Partially Observable Markov Decision Process (LPOMDP), which extends POMDPs with lexicographic preferences over multiple value functions. It allows for slack--slightly less-than-optimal values--for higher-priority preferences to facilitate improvement in lower-priority value functions. Many real life situations are naturally captured by LPOMDPs with slack. We consider a semi-autonomous driving scenario in which time spent on the road is minimized, while maximizing time spent driving autonomously. We propose two solutions to LPOMDPs--Lexicographic Value Iteration (LVI) and Lexicographic Point-Based Value Iteration (LPBVI), establishing convergence results and correctness within strong slack bounds. We test the algorithms using real-world road data provided by Open Street Map (OSM) within 10 major cities. 
Finally, we present GPU-based optimizations for point-based solvers, demonstrating that their application enables us to quickly solve vastly larger LPOMDPs and other variations of POMDPs.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('905','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_905\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZijcai15.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZijcai15.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZijcai15.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('905','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('912','tp_links')\" style=\"cursor:pointer;\">A Parallel Point-Based POMDP Algorithm Leveraging GPUs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">AAAI Fall Symposium on Sequential Decision Making for Intelligent Agents (SDMIA), <\/span><span class=\"tp_pub_additional_address\">Arlington, Virginia, <\/span><span class=\"tp_pub_additional_year\">2015<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_912\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('912','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_912\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('912','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_912\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('912','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_912\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WZfall15,<br \/>\r\ntitle = {A Parallel Point-Based POMDP Algorithm Leveraging GPUs},<br \/>\r\nauthor = {Kyle Hollins Wray and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZfall15.pdf},<br \/>\r\nyear  = {2015},<br \/>\r\ndate = {2015-01-01},<br \/>\r\nbooktitle = {AAAI Fall Symposium on Sequential Decision Making for Intelligent Agents (SDMIA)},<br \/>\r\naddress = {Arlington, Virginia},<br \/>\r\nabstract = {We parallelize the Point-Based Value Iteration (PBVI) algorithm, which approximates the solution to Partially Observable Markov Decision Processes (POMDPs), using a Graph- ics Processing Unit (GPU). We detail additional optimizations, such as leveraging the bounded size of non-zero values over all belief point vectors, usable by serial and parallel algorithms. We compare serial (CPU) and parallel (GPU) implementations on 10 distinct problem domains, and demonstrate that our approach provides an order of magnitude improvement.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('912','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_912\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We parallelize the Point-Based Value Iteration (PBVI) algorithm, which approximates the solution to Partially Observable Markov Decision Processes (POMDPs), using a Graph- ics Processing Unit (GPU). 
We detail additional optimizations, such as leveraging the bounded size of non-zero values over all belief point vectors, usable by serial and parallel algorithms. We compare serial (CPU) and parallel (GPU) implementations on 10 distinct problem domains, and demonstrate that our approach provides an order of magnitude improvement.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('912','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_912\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZfall15.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZfall15.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZfall15.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('912','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Amato, Christopher;  Bernstein, Daniel S;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('948','tp_links')\" style=\"cursor:pointer;\">Optimizing Fixed-Size Stochastic Controllers for POMDPs and Decentralized POMDPs<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Autonomous Agents and Multi-Agent Systems (JAAMAS), <\/span><span class=\"tp_pub_additional_volume\">vol. 21, <\/span><span class=\"tp_pub_additional_number\">no. 3, <\/span><span class=\"tp_pub_additional_pages\">pp. 
293\u2013320, <\/span><span class=\"tp_pub_additional_year\">2010<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_948\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('948','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_948\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('948','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_948\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('948','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_948\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:ABZjaamas10,<br \/>\r\ntitle = {Optimizing Fixed-Size Stochastic Controllers for POMDPs and Decentralized POMDPs},<br \/>\r\nauthor = {Christopher Amato and Daniel S Bernstein and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZjaamas10.pdf},<br \/>\r\ndoi = {10.1007\/s10458-009-9103-z},<br \/>\r\nyear  = {2010},<br \/>\r\ndate = {2010-01-01},<br \/>\r\njournal = {Autonomous Agents and Multi-Agent Systems (JAAMAS)},<br \/>\r\nvolume = {21},<br \/>\r\nnumber = {3},<br \/>\r\npages = {293--320},<br \/>\r\nabstract = {Coordination of distributed agents is required for problems arising in many areas, including multi-robot systems, networking and e-commerce. As a formal framework for such problems, we use the decentralized partially observable Markov decision process (DEC-POMDP). Though much work has been done on optimal dynamic programming algorithms for the single-agent version of the problem, optimal algorithms for the multiagent case have been elusive. The main contribution of this paper is an optimal policy iteration algorithm for solving DEC-POMDPs. 
The algorithm uses stochastic finite-state controllers to represent policies. The solution can include a correlation device, which allows agents to correlate their actions without communicating. This approach alternates between expanding the controller and performing value-preserving transformations, which modify the controller without sacrificing value. We present two efficient value-preserving transformations: one can reduce the size of the controller and the other can improve its value while keeping the size fixed. Empirical results demonstrate the usefulness of value-preserving transformations in increasing value while keeping controller size to a minimum. To broaden the applicability of the approach, we also present a heuristic version of the policy iteration algorithm, which sacrifices convergence to optimality. This algorithm further reduces the size of the controllers at each step by assuming that probability distributions over the other agents' actions are known. While this assumption may not hold in general, it helps produce higher quality solutions in our test problems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('948','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_948\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Coordination of distributed agents is required for problems arising in many areas, including multi-robot systems, networking and e-commerce. As a formal framework for such problems, we use the decentralized partially observable Markov decision process (DEC-POMDP). Though much work has been done on optimal dynamic programming algorithms for the single-agent version of the problem, optimal algorithms for the multiagent case have been elusive. 
The main contribution of this paper is an optimal policy iteration algorithm for solving DEC-POMDPs. The algorithm uses stochastic finite-state controllers to represent policies. The solution can include a correlation device, which allows agents to correlate their actions without communicating. This approach alternates between expanding the controller and performing value-preserving transformations, which modify the controller without sacrificing value. We present two efficient value-preserving transformations: one can reduce the size of the controller and the other can improve its value while keeping the size fixed. Empirical results demonstrate the usefulness of value-preserving transformations in increasing value while keeping controller size to a minimum. To broaden the applicability of the approach, we also present a heuristic version of the policy iteration algorithm, which sacrifices convergence to optimality. This algorithm further reduces the size of the controllers at each step by assuming that probability distributions over the other agents' actions are known. 
While this assumption may not hold in general, it helps produce higher quality solutions in our test problems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('948','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_948\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZjaamas10.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZjaamas10.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZjaamas10.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1007\/s10458-009-9103-z\" title=\"Follow DOI:10.1007\/s10458-009-9103-z\" target=\"_blank\">doi:10.1007\/s10458-009-9103-z<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('948','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Amato, Christopher;  Bonet, Blai;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('957','tp_links')\" style=\"cursor:pointer;\">Finite-State Controllers Based on Mealy Machines for Centralized and Decentralized POMDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 24th Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">Atlanta, Georgia, <\/span><span class=\"tp_pub_additional_year\">2010<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_957\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('957','tp_abstract')\" title=\"Show abstract\" 
style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_957\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('957','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_957\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('957','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_957\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:ABZaaai10,<br \/>\r\ntitle = {Finite-State Controllers Based on Mealy Machines for Centralized and Decentralized POMDPs},<br \/>\r\nauthor = {Christopher Amato and Blai Bonet and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZaaai10.pdf},<br \/>\r\nyear  = {2010},<br \/>\r\ndate = {2010-01-01},<br \/>\r\nbooktitle = {Proceedings of the 24th Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {1052--1058},<br \/>\r\naddress = {Atlanta, Georgia},<br \/>\r\nabstract = {Existing controller-based approaches for centralized and decentralized POMDPs are based on automata with output known as Moore machines. In this paper, we show that several advantages can be gained by utilizing another type of automata, the Mealy machine. Mealy machines are more powerful than Moore machines, provide a richer structure that can be exploited by solution methods, and can be easily incorporated into current controller-based approaches. To demonstrate this, we adapted some existing controller-based algorithms to use Mealy machines and obtained results on a set of benchmark domains. The Mealy-based approach always outperformed the Moore-based approach and often outperformed the state-of-the-art algorithms for both centralized and decentralized POMDPs. 
These findings provide fresh and general insights for the improvement of existing algorithms and the development of new ones.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('957','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_957\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Existing controller-based approaches for centralized and decentralized POMDPs are based on automata with output known as Moore machines. In this paper, we show that several advantages can be gained by utilizing another type of automata, the Mealy machine. Mealy machines are more powerful than Moore machines, provide a richer structure that can be exploited by solution methods, and can be easily incorporated into current controller-based approaches. To demonstrate this, we adapted some existing controller-based algorithms to use Mealy machines and obtained results on a set of benchmark domains. The Mealy-based approach always outperformed the Moore-based approach and often outperformed the state-of-the-art algorithms for both centralized and decentralized POMDPs. 
These findings provide fresh and general insights for the improvement of existing algorithms and the development of new ones.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('957','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_957\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZaaai10.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZaaai10.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZaaai10.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('957','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Amato, Christopher;  Bernstein, Daniel S;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('994','tp_links')\" style=\"cursor:pointer;\">Solving POMDPs Using Quadratically Constrained Linear Programs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Hyderabad, India, <\/span><span class=\"tp_pub_additional_year\">2007<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_994\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('994','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_994\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('994','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_994\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('994','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_994\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:ABZijcai07,<br \/>\r\ntitle = {Solving POMDPs Using Quadratically Constrained Linear Programs},<br \/>\r\nauthor = {Christopher Amato and Daniel S Bernstein and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZijcai07.pdf},<br \/>\r\nyear  = {2007},<br \/>\r\ndate = {2007-01-01},<br \/>\r\nbooktitle = {Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {2418--2424},<br \/>\r\naddress = {Hyderabad, India},<br \/>\r\nabstract = {Developing scalable algorithms for solving partially observable Markov decision processes (POMDPs) is an important challenge. One approach that effectively addresses the intractable memory requirements of POMDP algorithms is based on representing POMDP policies as finite-state controllers. In this paper, we illustrate some fundamental disadvantages of existing techniques that use controllers. We then propose a new approach that formulates the problem as a quadratically constrained linear program (QCLP), which defines an optimal controller of a desired size. This representation allows a wide range of powerful nonlinear programming algorithms to be used to solve POMDPs. Although QCLP optimization techniques guarantee only local optimality, the results we obtain using an existing optimization method show significant solution improvement over the state-of-the-art techniques. 
The results open up promising research directions for solving large POMDPs using nonlinear programming methods.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('994','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_994\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Developing scalable algorithms for solving partially observable Markov decision processes (POMDPs) is an important challenge. One approach that effectively addresses the intractable memory requirements of POMDP algorithms is based on representing POMDP policies as finite-state controllers. In this paper, we illustrate some fundamental disadvantages of existing techniques that use controllers. We then propose a new approach that formulates the problem as a quadratically constrained linear program (QCLP), which defines an optimal controller of a desired size. This representation allows a wide range of powerful nonlinear programming algorithms to be used to solve POMDPs. Although QCLP optimization techniques guarantee only local optimality, the results we obtain using an existing optimization method show significant solution improvement over the state-of-the-art techniques. 
The results open up promising research directions for solving large POMDPs using nonlinear programming methods.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('994','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_994\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZijcai07.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZijcai07.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZijcai07.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('994','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Feng, Zhengzhu;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1011','tp_links')\" style=\"cursor:pointer;\">Efficient Maximization in Solving POMDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 20th National Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">Pittsburgh, Pennsylvania, <\/span><span class=\"tp_pub_additional_year\">2005<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1011\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1011','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1011\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1011','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a 
id=\"tp_bibtex_sh_1011\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1011','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1011\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:FZaaai05,<br \/>\r\ntitle = {Efficient Maximization in Solving POMDPs},<br \/>\r\nauthor = {Zhengzhu Feng and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZaaai05.pdf},<br \/>\r\nyear  = {2005},<br \/>\r\ndate = {2005-01-01},<br \/>\r\nbooktitle = {Proceedings of the 20th National Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {975--980},<br \/>\r\naddress = {Pittsburgh, Pennsylvania},<br \/>\r\nabstract = {We present a simple, yet effective improvement to the dynamic programming algorithm for solving partially observable Markov decision processes. The technique targets the vector pruning operation during the maximization step, a key source of complexity in POMDP algorithms. We identify two types of structures in the belief space and exploit them to reduce significantly the number of constraints in the linear programs used for pruning. The benefits of the new technique are evaluated both analytically and experimentally, showing that it can lead to significant performance improvement. The results open up new research opportunities to enhance the performance and scalability of several POMDP algorithms.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1011','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1011\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We present a simple, yet effective improvement to the dynamic programming algorithm for solving partially observable Markov decision processes. 
The technique targets the vector pruning operation during the maximization step, a key source of complexity in POMDP algorithms. We identify two types of structures in the belief space and exploit them to reduce significantly the number of constraints in the linear programs used for pruning. The benefits of the new technique are evaluated both analytically and experimentally, showing that it can lead to significant performance improvement. The results open up new research opportunities to enhance the performance and scalability of several POMDP algorithms.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1011','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1011\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZaaai05.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZaaai05.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZaaai05.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1011','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Feng, Zhengzhu;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1027','tp_links')\" style=\"cursor:pointer;\">Region-Based Incremental Pruning for POMDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence (UAI), <\/span><span class=\"tp_pub_additional_address\">Banff, Canada, <\/span><span class=\"tp_pub_additional_year\">2004<\/span>.<\/p><p class=\"tp_pub_menu\"><span 
class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1027\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1027','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1027\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1027','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1027\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1027','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1027\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:FZuai04,<br \/>\r\ntitle = {Region-Based Incremental Pruning for POMDPs},<br \/>\r\nauthor = {Zhengzhu Feng and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZuai04.pdf},<br \/>\r\nyear  = {2004},<br \/>\r\ndate = {2004-01-01},<br \/>\r\nbooktitle = {Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence (UAI)},<br \/>\r\npages = {146--153},<br \/>\r\naddress = {Banff, Canada},<br \/>\r\nabstract = {We present a major improvement to the incremental pruning algorithm for solving partially observable Markov decision processes. Our technique targets the cross-sum step of the dynamic programming (DP) update, a key source of complexity in POMDP algorithms. Instead of reasoning about the whole belief space when pruning the cross-sums, our algorithm divides the belief space into smaller regions and performs independent pruning in each region. We evaluate the benefits of the new technique both analytically and experimentally, and show that it produces very significant performance gains. 
The results contribute to the scalability of POMDP algorithms to domains that cannot be handled by the best existing techniques.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1027','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1027\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We present a major improvement to the incremental pruning algorithm for solving partially observable Markov decision processes. Our technique targets the cross-sum step of the dynamic programming (DP) update, a key source of complexity in POMDP algorithms. Instead of reasoning about the whole belief space when pruning the cross-sums, our algorithm divides the belief space into smaller regions and performs independent pruning in each region. We evaluate the benefits of the new technique both analytically and experimentally, and show that it produces very significant performance gains. 
The results contribute to the scalability of POMDP algorithms to domains that cannot be handled by the best existing techniques.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1027','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1027\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZuai04.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZuai04.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZuai04.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1027','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><\/table><\/div><\/div>\n<div><\/div><\/div><\/div>\n<h3><span style=\"color: #264278\"><b>Multiagent Planning and DEC-POMDPs<\/b><\/span><\/h3>\n<div>\n<div>How can a group of intelligent agents coordinate their decisions in spite of stochasticity and limited information, and how to extend decision-theoretic models to such complex multiagent settings?<\/div>\n<div><div class=\"bg-margin-for-link\"><input type='hidden' bg_collapse_expand='69d0b4f8250540053566301' value='69d0b4f8250540053566301'><input type='hidden' id='bg-show-more-text-69d0b4f8250540053566301' value='Show Related Publications'><input type='hidden' id='bg-show-less-text-69d0b4f8250540053566301' value='Hide Related Publications'><a id='bg-showmore-action-69d0b4f8250540053566301' class='bg-showmore-plg-link bg-arrow '  style=\" color:#7C2622;;\" href='#'>Show Related Publications<\/a><div id='bg-showmore-hidden-69d0b4f8250540053566301' ><div class=\"teachpress_pub_list\"><form name=\"tppublistform\" method=\"get\"><a name=\"tppubs\" id=\"tppubs\"><\/a><\/form><div class=\"tablenav\"><div class=\"tablenav-pages\"><span class=\"displaying-num\">65 entries<\/span> <a class=\"page-numbers button 
&laquo;">
disabled\">&laquo;<\/a> <a class=\"page-numbers button disabled\">&lsaquo;<\/a> 1 of 2 <a href=\"https:\/\/groups.cs.umass.edu\/shlomo\/research\/?limit=2&amp;tgid=&amp;yr=&amp;type=&amp;usr=&amp;auth=&amp;tsr=#tppubs\" title=\"next page\" class=\"page-numbers button\">&rsaquo;<\/a> <a href=\"https:\/\/groups.cs.umass.edu\/shlomo\/research\/?limit=2&amp;tgid=&amp;yr=&amp;type=&amp;usr=&amp;auth=&amp;tsr=#tppubs\" title=\"last page\" class=\"page-numbers button\">&raquo;<\/a> <\/div><\/div><table class=\"teachpress_publication_list\"><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Choudhury, Moumita;  Saisubramanian, Sandhya;  Zhang, Hao;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1190','tp_links')\" style=\"cursor:pointer;\">Minimizing Negative Side Effects in Cooperative Multi-Agent Systems Using Distributed Coordination<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Auckland, New Zealand, <\/span><span class=\"tp_pub_additional_year\">2024<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1190\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1190','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1190\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1190','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1190\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1190','tp_bibtex')\" title=\"Show BibTeX entry\" 
style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1190\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:CSZZaamas24,<br \/>\r\ntitle = {Minimizing Negative Side Effects in Cooperative Multi-Agent Systems Using Distributed Coordination},<br \/>\r\nauthor = {Moumita Choudhury and Sandhya Saisubramanian and Hao Zhang and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZaamas24.pdf},<br \/>\r\nyear  = {2024},<br \/>\r\ndate = {2024-01-01},<br \/>\r\nbooktitle = {Proceedings of the The 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},<br \/>\r\naddress = {Auckland, New Zealand},<br \/>\r\nabstract = {Autonomous agents in real-world environments may encounter undesirable outcomes or negative side effects (NSEs) when working collaboratively alongside other agents. We frame the challenge of minimizing NSEs in a multi-agent setting as a lexicographic decentralized Markov decision process in which we assume independence of rewards and transitions with respect to the primary assigned tasks, but allowing negative side effects to create a form of dependence among the agents. We present a lexicographic Q-learning approach to mitigate the NSEs using human feedback models while maintaining near-optimality with respect to the assigned tasks\u2013up to some given slack. 
Our empirical evaluation across two domains demonstrates that our collaborative approach effectively mitigates NSEs, outperforming non-collaborative methods.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1190','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1190\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous agents in real-world environments may encounter undesirable outcomes or negative side effects (NSEs) when working collaboratively alongside other agents. We frame the challenge of minimizing NSEs in a multi-agent setting as a lexicographic decentralized Markov decision process in which we assume independence of rewards and transitions with respect to the primary assigned tasks, but allow negative side effects to create a form of dependence among the agents. We present a lexicographic Q-learning approach to mitigate the NSEs using human feedback models while maintaining near-optimality with respect to the assigned tasks\u2013up to some given slack. 
Our empirical evaluation across two domains demonstrates that our collaborative approach effectively mitigates NSEs, outperforming non-collaborative methods.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1190','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1190\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZaamas24.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZaamas24.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZaamas24.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1190','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Choudhury, Moumita;  Saisubramanian, Sandhya;  Zhang, Hao;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1191','tp_links')\" style=\"cursor:pointer;\">Minimizing Negative Side Effects in Cooperative Multi-Agent Systems Using Distributed Coordination<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 37th International FLAIRS Conference, <\/span><span class=\"tp_pub_additional_address\">Miramar Beach, Florida, <\/span><span class=\"tp_pub_additional_year\">2024<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1191\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1191','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1191\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('1191','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1191\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1191','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1191\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:CSZZflairs24,<br \/>\r\ntitle = {Minimizing Negative Side Effects in Cooperative Multi-Agent Systems Using Distributed Coordination},<br \/>\r\nauthor = {Moumita Choudhury and Sandhya Saisubramanian and Hao Zhang and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZflairs24.pdf},<br \/>\r\nyear  = {2024},<br \/>\r\ndate = {2024-01-01},<br \/>\r\nbooktitle = {Proceedings of the The 37th International FLAIRS Conference},<br \/>\r\naddress = {Miramar Beach, Florida},<br \/>\r\nabstract = {Autonomous agents operating in real-world environments frequently encounter undesirable outcomes or negative side effects (NSEs) when working collaboratively alongside other agents. Even when agents can execute their primary task optimally when operating in isolation, their training may not account for potential negative interactions that arise in the presence of other agents. We frame the challenge of minimizing NSEs as a lexicographic decentralized Markov decision process in which we assume independence of rewards and transitions with respect to the primary assigned tasks, but recognize that addressing negative side effects creates a form of dependence among the agents. We present a lexicographic Q-learning approach to mitigate the NSEs using human feedback models while maintaining near-optimality with respect to the assigned tasks\u2013up to some given slack. 
Our empirical evaluation across two domains demonstrates that our collaborative approach effectively mitigates NSEs, outperforming non-collaborative methods.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1191','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1191\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous agents operating in real-world environments frequently encounter undesirable outcomes or negative side effects (NSEs) when working collaboratively alongside other agents. Even when agents can execute their primary task optimally when operating in isolation, their training may not account for potential negative interactions that arise in the presence of other agents. We frame the challenge of minimizing NSEs as a lexicographic decentralized Markov decision process in which we assume independence of rewards and transitions with respect to the primary assigned tasks, but recognize that addressing negative side effects creates a form of dependence among the agents. We present a lexicographic Q-learning approach to mitigate the NSEs using human feedback models while maintaining near-optimality with respect to the assigned tasks\u2013up to some given slack. 
Our empirical evaluation across two domains demonstrates that our collaborative approach effectively mitigates NSEs, outperforming non-collaborative methods.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1191','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1191\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZflairs24.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZflairs24.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZflairs24.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1191','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_incollection\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Mahmud, Saaduddin;  Nashed, Samer B.;  Goldman, Claudia V.;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1173','tp_links')\" style=\"cursor:pointer;\">Estimating Causal Responsibility for Explaining Autonomous Behavior<\/a> <span class=\"tp_pub_type tp_  incollection\">Book Section<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span> Calvaresi, Davide (Ed.): <span class=\"tp_pub_additional_booktitle\">International Workshop on Explainable and Transparent AI and Multi-Agent Systems (EXTRAAMAS), <\/span><span class=\"tp_pub_additional_pages\">pp. 
78\u201394, <\/span><span class=\"tp_pub_additional_publisher\">Springer, <\/span><span class=\"tp_pub_additional_year\">2023<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1173\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1173','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1173\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1173','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1173\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1173','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1173\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@incollection{SZ:MNGZextraamas23,<br \/>\r\ntitle = {Estimating Causal Responsibility for Explaining Autonomous Behavior},<br \/>\r\nauthor = {Saaduddin Mahmud and Samer B. Nashed and Claudia V. Goldman and Shlomo Zilberstein},<br \/>\r\neditor = {Davide Calvaresi},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MNGZextraamas23.pdf},<br \/>\r\ndoi = {10.1007\/978-3-031-40878-6},<br \/>\r\nyear  = {2023},<br \/>\r\ndate = {2023-01-01},<br \/>\r\nbooktitle = {International Workshop on Explainable and Transparent AI and Multi-Agent Systems (EXTRAAMAS)},<br \/>\r\npages = {78\u201394},<br \/>\r\npublisher = {Springer},<br \/>\r\nabstract = {There has been growing interest in causal explanations of stochastic, sequential decision-making systems. Structural causal models and causal reasoning offer several theoretical benefits when exact inference can be applied. Furthermore, users overwhelmingly prefer the resulting causal explanations over other state-of-the-art systems. 
In this work, we focus on one such method, MeanRESP, and its approximate versions that drastically reduce compute load and assign a responsibility score to each variable, which helps identify smaller sets of causes to be used as explanations. However, this method, and its approximate versions in particular, lack deeper theoretical analysis and broader empirical tests. To address these shortcomings, we provide three primary contributions. First, we offer several theoretical insights on the sample complexity and error rate of approximate MeanRESP. Second, we discuss several automated metrics for comparing explanations generated from approximate methods to those generated via exact methods. While we recognize the significance of user studies as the gold standard for evaluating explanations, our aim is to leverage the proposed metrics to systematically compare explanation-generation methods along important quantitative dimensions. Finally, we provide a more detailed discussion of MeanRESP and how its output under different definitions of responsibility compares to existing widely adopted methods that use Shapley values.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {incollection}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1173','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1173\" style=\"display:none;\"><div class=\"tp_abstract_entry\">There has been growing interest in causal explanations of stochastic, sequential decision-making systems. Structural causal models and causal reasoning offer several theoretical benefits when exact inference can be applied. Furthermore, users overwhelmingly prefer the resulting causal explanations over other state-of-the-art systems. 
In this work, we focus on one such method, MeanRESP, and its approximate versions that drastically reduce compute load and assign a responsibility score to each variable, which helps identify smaller sets of causes to be used as explanations. However, this method, and its approximate versions in particular, lack deeper theoretical analysis and broader empirical tests. To address these shortcomings, we provide three primary contributions. First, we offer several theoretical insights on the sample complexity and error rate of approximate MeanRESP. Second, we discuss several automated metrics for comparing explanations generated from approximate methods to those generated via exact methods. While we recognize the significance of user studies as the gold standard for evaluating explanations, our aim is to leverage the proposed metrics to systematically compare explanation-generation methods along important quantitative dimensions. Finally, we provide a more detailed discussion of MeanRESP and how its output under different definitions of responsibility compares to existing widely adopted methods that use Shapley values.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1173','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1173\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MNGZextraamas23.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MNGZextraamas23.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MNGZextraamas23.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1007\/978-3-031-40878-6\" title=\"Follow DOI:10.1007\/978-3-031-40878-6\" target=\"_blank\">doi:10.1007\/978-3-031-40878-6<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" 
onclick=\"teachpress_pub_showhide('1173','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Parr, Shane;  Khatri, Ishan;  Svegliato, Justin;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1143','tp_links')\" style=\"cursor:pointer;\">Agent-Aware State Estimation in Autonomous Vehicles<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), <\/span><span class=\"tp_pub_additional_address\">Prague, Czech Republic, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1143\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1143','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1143\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1143','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1143\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1143','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1143\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:PKSZiros21,<br \/>\r\ntitle = {Agent-Aware State Estimation in Autonomous Vehicles},<br \/>\r\nauthor = {Shane Parr and Ishan Khatri and Justin Svegliato and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PKSZiros21.pdf},<br \/>\r\ndoi = {10.1109\/IROS51168.2021.9636210},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br 
\/>\r\nbooktitle = {Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)},<br \/>\r\npages = {6694--6699},<br \/>\r\naddress = {Prague, Czech Republic},<br \/>\r\nabstract = {Autonomous systems often operate in environments where the behavior of multiple agents is coordinated by a shared global state. Reliable estimation of the global state is thus critical for successfully operating in a multi-agent setting. We introduce agent-aware state estimation--a framework for calculating indirect estimations of state given observations of the behavior of other agents in the environment. We also introduce transition-independent agent-aware state estimation--a tractable class of agent-aware state estimation--and show that it allows the speed of inference to scale linearly with the number of agents in the environment. As an example, we model traffic light classification in instances of complete loss of direct observation. By taking into account observations of vehicular behavior from multiple directions of traffic, our approach exhibits accuracy higher than that of existing traffic light-only HMM methods on a real-world autonomous vehicle data set under a variety of simulated occlusion scenarios.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1143','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1143\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous systems often operate in environments where the behavior of multiple agents is coordinated by a shared global state. Reliable estimation of the global state is thus critical for successfully operating in a multi-agent setting. 
We introduce agent-aware state estimation--a framework for calculating indirect estimations of state given observations of the behavior of other agents in the environment. We also introduce transition-independent agent-aware state estimation--a tractable class of agent-aware state estimation--and show that it allows the speed of inference to scale linearly with the number of agents in the environment. As an example, we model traffic light classification in instances of complete loss of direct observation. By taking into account observations of vehicular behavior from multiple directions of traffic, our approach exhibits accuracy higher than that of existing traffic light-only HMM methods on a real-world autonomous vehicle data set under a variety of simulated occlusion scenarios.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1143','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1143\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PKSZiros21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PKSZiros21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PKSZiros21.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1109\/IROS51168.2021.9636210\" title=\"Follow DOI:10.1109\/IROS51168.2021.9636210\" target=\"_blank\">doi:10.1109\/IROS51168.2021.9636210<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1143','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wu, Feng;  Zilberstein, Shlomo;  Jennings, Nicholas R<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1129','tp_links')\" 
style=\"cursor:pointer;\">Multi-Agent Planning with High-Level Human Guidance<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of Principles and Practice of Multi-Agent Systems (PRIMA), <\/span><span class=\"tp_pub_additional_year\">2020<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1129\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1129','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1129\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1129','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1129\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1129','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1129\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WZJprima20,<br \/>\r\ntitle = {Multi-Agent Planning with High-Level Human Guidance},<br \/>\r\nauthor = {Feng Wu and Shlomo Zilberstein and Nicholas R Jennings},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZJprima20.pdf},<br \/>\r\nyear  = {2020},<br \/>\r\ndate = {2020-01-01},<br \/>\r\nbooktitle = {Proceedings of Principles and Practice of Multi-Agent Systems (PRIMA)},<br \/>\r\nabstract = {Planning and coordination of multiple agents in the presence of uncertainty and noisy sensors is extremely hard. A human operator who observes a multi-agent team can provide valuable guidance to the team based on her superior ability to interpret observations and assess the overall situation. We propose an extension of decentralized POMDPs that allows such human guidance to be factored into the planning and execution processes. 
Human guidance in our framework consists of intuitive high-level commands that the agents must translate into a suitable joint plan that is sensitive to what they know from local observations. The result is a framework that allows multi-agent systems to benefit from the complex strategic thinking of a human supervising them. We evaluate this approach on several common benchmark problems and show that it can lead to dramatic improvement in performance.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1129','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1129\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Planning and coordination of multiple agents in the presence of uncertainty and noisy sensors is extremely hard. A human operator who observes a multi-agent team can provide valuable guidance to the team based on her superior ability to interpret observations and assess the overall situation. We propose an extension of decentralized POMDPs that allows such human guidance to be factored into the planning and execution processes. Human guidance in our framework consists of intuitive high-level commands that the agents must translate into a suitable joint plan that is sensitive to what they know from local observations. The result is a framework that allows multi-agent systems to benefit from the complex strategic thinking of a human supervising them. 
We evaluate this approach on several common benchmark problems and show that it can lead to dramatic improvement in performance.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1129','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1129\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZJprima20.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZJprima20.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZJprima20.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1129','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wu, Feng;  Zilberstein, Shlomo;  Jennings, Nicholas R<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1111','tp_links')\" style=\"cursor:pointer;\">Stochastic Multi-agent Planning with Partial State Models<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the First International Conference on Distributed Artificial Intelligence (DAI), <\/span><span class=\"tp_pub_additional_address\">Beijing, China, <\/span><span class=\"tp_pub_additional_year\">2019<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1111\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1111','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1111\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1111','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1111\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1111','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1111\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WZJdai19,<br \/>\r\ntitle = {Stochastic Multi-agent Planning with Partial State Models},<br \/>\r\nauthor = {Feng Wu and Shlomo Zilberstein and Nicholas R Jennings},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZJdai19.pdf},<br \/>\r\ndoi = {10.1145\/3356464.3357699},<br \/>\r\nyear  = {2019},<br \/>\r\ndate = {2019-01-01},<br \/>\r\nbooktitle = {Proceedings of the First International Conference on Distributed Artificial Intelligence (DAI)},<br \/>\r\npages = {1-8},<br \/>\r\naddress = {Beijing, China},<br \/>\r\nabstract = {People who observe a multi-agent team can often provide valuable information to the agents based on their superior cognitive abilities to interpret sequences of observations and assess the overall situation. The knowledge they possess is often difficult to be fully represented using a formal model such as DEC-POMDP. To deal with this, we propose an extension of the DEC-POMDP that allows states to be partially specified and benefit from expert knowledge, while preserving the partial observability and decentralized operation of the agents. In particular, we present an algorithm for computing policies based on history samples that include human labeled data in the form of reward reshaping. We also consider ways to minimize the burden on human experts during the labeling phase. The results offer the first approach to incorporating human knowledge in such complex multi-agent settings. 
We demonstrate the benefits of our approach using a disaster recovery scenario, comparing it to several baseline approaches.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1111','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1111\" style=\"display:none;\"><div class=\"tp_abstract_entry\">People who observe a multi-agent team can often provide valuable information to the agents based on their superior cognitive abilities to interpret sequences of observations and assess the overall situation. The knowledge they possess is often difficult to be fully represented using a formal model such as DEC-POMDP. To deal with this, we propose an extension of the DEC-POMDP that allows states to be partially specified and benefit from expert knowledge, while preserving the partial observability and decentralized operation of the agents. In particular, we present an algorithm for computing policies based on history samples that include human labeled data in the form of reward reshaping. We also consider ways to minimize the burden on human experts during the labeling phase. The results offer the first approach to incorporating human knowledge in such complex multi-agent settings. 
We demonstrate the benefits of our approach using a disaster recovery scenario, comparing it to several baseline approaches.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1111','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1111\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZJdai19.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZJdai19.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZJdai19.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1145\/3356464.3357699\" title=\"Follow DOI:10.1145\/3356464.3357699\" target=\"_blank\">doi:10.1145\/3356464.3357699<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1111','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wu, Feng;  Zilberstein, Shlomo;  Chen, Xiaoping<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('862','tp_links')\" style=\"cursor:pointer;\">Privacy-Preserving Policy Iteration for Decentralized POMDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 32nd Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">New Orleans, Louisiana, <\/span><span class=\"tp_pub_additional_year\">2018<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_862\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('862','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span 
class=\"tp_resource_link\"><a id=\"tp_links_sh_862\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('862','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_862\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('862','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_862\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WZCaaai18,<br \/>\r\ntitle = {Privacy-Preserving Policy Iteration for Decentralized POMDPs},<br \/>\r\nauthor = {Feng Wu and Shlomo Zilberstein and Xiaoping Chen},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCaaai18.pdf},<br \/>\r\nyear  = {2018},<br \/>\r\ndate = {2018-01-01},<br \/>\r\nbooktitle = {Proceedings of the 32nd Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {4759--4766},<br \/>\r\naddress = {New Orleans, Louisiana},<br \/>\r\nabstract = {We propose the first privacy-preserving approach to address the privacy issues that arise in multi-agent planning problems modeled as a Dec-POMDP. Our solution is a distributed message-passing algorithm based on trials, where the agents' policies are optimized using the cross-entropy method. In our algorithm, the agents' private information is protected using a public-key homomorphic cryptosystem. We prove the correctness of our algorithm and analyze its complexity in terms of message passing and encryption\/decryption operations. Furthermore, we analyze several privacy aspects of our algorithm and show that it can preserve the agent privacy of non-neighbors, model privacy, and decision privacy. 
Our experimental results on several common Dec-POMDP benchmark problems confirm the effectiveness of our approach.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('862','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_862\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We propose the first privacy-preserving approach to address the privacy issues that arise in multi-agent planning problems modeled as a Dec-POMDP. Our solution is a distributed message-passing algorithm based on trials, where the agents' policies are optimized using the cross-entropy method. In our algorithm, the agents' private information is protected using a public-key homomorphic cryptosystem. We prove the correctness of our algorithm and analyze its complexity in terms of message passing and encryption\/decryption operations. Furthermore, we analyze several privacy aspects of our algorithm and show that it can preserve the agent privacy of non-neighbors, model privacy, and decision privacy. 
Our experimental results on several common Dec-POMDP benchmark problems confirm the effectiveness of our approach.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('862','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_862\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCaaai18.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCaaai18.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCaaai18.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('862','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Kumar, Akshat;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('863','tp_links')\" style=\"cursor:pointer;\">Integrated Cooperation and Competition in Multi-Agent Decision-Making<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 32nd Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">New Orleans, Louisiana, <\/span><span class=\"tp_pub_additional_year\">2018<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_863\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('863','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_863\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('863','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_863\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('863','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_863\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WKZaaai18,<br \/>\r\ntitle = {Integrated Cooperation and Competition in Multi-Agent Decision-Making},<br \/>\r\nauthor = {Kyle Hollins Wray and Akshat Kumar and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKZaaai18.pdf},<br \/>\r\nyear  = {2018},<br \/>\r\ndate = {2018-01-01},<br \/>\r\nbooktitle = {Proceedings of the 32nd Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {4751--4758},<br \/>\r\naddress = {New Orleans, Louisiana},<br \/>\r\nabstract = {Observing that many real-world sequential decision problems are not purely cooperative or purely competitive, we propose a new model--cooperative-competitive process (CCP)--that can simultaneously encapsulate both cooperation and competition. First, we discuss how the CCP model bridges the gap between cooperative and competitive models. Next, we investigate a specific class of group-dominant CCPs, in which agents cooperate to achieve a common goal as their primary objective, while also pursuing individual goals as a secondary objective. We provide an approximate solution for this class of problems that leverages stochastic finite-state controllers. 
The model is grounded in two multi-robot meeting and box-pushing domains that are implemented in simulation and demonstrated on two real robots.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('863','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_863\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Observing that many real-world sequential decision problems are not purely cooperative or purely competitive, we propose a new model--cooperative-competitive process (CCP)--that can simultaneously encapsulate both cooperation and competition. First, we discuss how the CCP model bridges the gap between cooperative and competitive models. Next, we investigate a specific class of group-dominant CCPs, in which agents cooperate to achieve a common goal as their primary objective, while also pursuing individual goals as a secondary objective. We provide an approximate solution for this class of problems that leverages stochastic finite-state controllers. 
The model is grounded in two multi-robot meeting and box-pushing domains that are implemented in simulation and demonstrated on two real robots.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('863','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_863\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKZaaai18.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKZaaai18.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKZaaai18.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('863','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wu, Feng;  Zilberstein, Shlomo;  Chen, Xiaoping<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('883','tp_links')\" style=\"cursor:pointer;\">Multi-Agent Planning with Baseline Regret Minimization<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_year\">2017<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_883\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('883','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_883\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('883','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_883\" 
class=\"tp_show\" onclick=\"teachpress_pub_showhide('883','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_883\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WZCijcai17,<br \/>\r\ntitle = {Multi-Agent Planning with Baseline Regret Minimization},<br \/>\r\nauthor = {Feng Wu and Shlomo Zilberstein and Xiaoping Chen},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCijcai17.pdf},<br \/>\r\ndoi = {10.24963\/ijcai.2017\/63},<br \/>\r\nyear  = {2017},<br \/>\r\ndate = {2017-01-01},<br \/>\r\nbooktitle = {Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {444--450},<br \/>\r\nabstract = {We propose a novel baseline regret minimization algorithm for multi-agent planning problems modeled as finite-horizon decentralized POMDPs. It guarantees to produce a policy that is provably at least as good as a given baseline policy. We also propose an iterative belief generation algorithm to efficiently minimize the baseline regret, which only requires necessary iterations so as to converge to the policy with minimum baseline regret. Experimental results on common benchmark problems confirm the benefits of the algorithm compared with the state-of-the-art approaches.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('883','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_883\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We propose a novel baseline regret minimization algorithm for multi-agent planning problems modeled as finite-horizon decentralized POMDPs. It guarantees to produce a policy that is provably at least as good as a given baseline policy. 
We also propose an iterative belief generation algorithm to efficiently minimize the baseline regret, which requires only as many iterations as necessary to converge to the policy with minimum baseline regret. Experimental results on common benchmark problems confirm the benefits of the algorithm compared with state-of-the-art approaches.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('883','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_883\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCijcai17.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCijcai17.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCijcai17.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.24963\/ijcai.2017\/63\" title=\"Follow DOI:10.24963\/ijcai.2017\/63\" target=\"_blank\">doi:10.24963\/ijcai.2017\/63<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('883','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Kumar, Akshat;  Mostafa, Hala;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('891','tp_links')\" style=\"cursor:pointer;\">Dual Formulations for Optimizing Dec-POMDP Controllers<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 26th International Conference on Automated Planning and Scheduling (ICAPS), <\/span><span class=\"tp_pub_additional_address\">London, UK, <\/span><span class=\"tp_pub_additional_year\">2016<\/span>.<\/p><p 
class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_891\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('891','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_891\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('891','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_891\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('891','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_891\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:KMZicaps16,<br \/>\r\ntitle = {Dual Formulations for Optimizing Dec-POMDP Controllers},<br \/>\r\nauthor = {Akshat Kumar and Hala Mostafa and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KMZicaps16.pdf},<br \/>\r\nyear  = {2016},<br \/>\r\ndate = {2016-01-01},<br \/>\r\nbooktitle = {Proceedings of the 26th International Conference on Automated Planning and Scheduling (ICAPS)},<br \/>\r\npages = {202--210},<br \/>\r\naddress = {London, UK},<br \/>\r\nabstract = {Decentralized POMDP is an expressive model for multiagent planning. Finite-state controllers (FSCs)--often used to represent policies for infinite-horizon problems---offer a compact, simple-to-execute policy representation. We exploit novel connections between optimizing decentralized FSCs and the dual linear program for MDPs. Consequently, we describe a dual mixed integer linear program (MIP) for optimizing deterministic FSCs. We exploit the Dec-POMDP structure to devise a compact MIP and formulate constraints that result in policies executable in partially-observable decentralized settings. 
We show analytically that the dual formulation can also be exploited within the expectation maximization (EM) framework to optimize stochastic FSCs. The resulting EM algorithm can be implemented by solving a sequence of linear programs, without requiring expensive message passing over the Dec-POMDP DBN. We also present an efficient technique for policy improvement based on a weighted entropy measure. Compared with state-of-the-art FSC methods, our approach offers over an order-of-magnitude speedup, while producing similar or better solutions.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('891','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_891\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Decentralized POMDP is an expressive model for multiagent planning. Finite-state controllers (FSCs), often used to represent policies for infinite-horizon problems, offer a compact, simple-to-execute policy representation. We exploit novel connections between optimizing decentralized FSCs and the dual linear program for MDPs. Consequently, we describe a dual mixed integer linear program (MIP) for optimizing deterministic FSCs. We exploit the Dec-POMDP structure to devise a compact MIP and formulate constraints that result in policies executable in partially-observable decentralized settings. We show analytically that the dual formulation can also be exploited within the expectation maximization (EM) framework to optimize stochastic FSCs. The resulting EM algorithm can be implemented by solving a sequence of linear programs, without requiring expensive message passing over the Dec-POMDP DBN. We also present an efficient technique for policy improvement based on a weighted entropy measure. 
Compared with state-of-the-art FSC methods, our approach offers over an order-of-magnitude speedup, while producing similar or better solutions.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('891','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_891\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KMZicaps16.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KMZicaps16.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KMZicaps16.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('891','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Kumar, Akshat;  Zilberstein, Shlomo;  Toussaint, Marc<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('898','tp_links')\" style=\"cursor:pointer;\">Probabilistic Inference Techniques for Scalable Multiagent Decision Making<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Journal of Artificial Intelligence Research (JAIR), <\/span><span class=\"tp_pub_additional_volume\">vol. 53, <\/span><span class=\"tp_pub_additional_pages\">pp. 
223\u2013270, <\/span><span class=\"tp_pub_additional_year\">2015<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_898\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('898','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_898\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('898','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_898\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('898','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_898\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:KZTjair15,<br \/>\r\ntitle = {Probabilistic Inference Techniques for Scalable Multiagent Decision Making},<br \/>\r\nauthor = {Akshat Kumar and Shlomo Zilberstein and Marc Toussaint},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZTjair15.pdf},<br \/>\r\ndoi = {10.1613\/jair.4649},<br \/>\r\nyear  = {2015},<br \/>\r\ndate = {2015-01-01},<br \/>\r\njournal = {Journal of Artificial Intelligence Research (JAIR)},<br \/>\r\nvolume = {53},<br \/>\r\npages = {223--270},<br \/>\r\nabstract = {Decentralized POMDPs provide an expressive framework for multiagent sequential decision making. However, the complexity of these models -- NEXP-Complete even for two agents -- has limited their scalability. We present a promising new class of approximation algorithms by developing novel connections between multiagent planning and machine learning. We show how the multiagent planning problem can be reformulated as inference in a mixture of dynamic Bayesian networks (DBNs). This planning-as-inference approach paves the way for the application of efficient inference techniques in DBNs to multiagent decision making. 
To further improve scalability, we identify certain conditions that are sufficient to extend the approach to multiagent systems with dozens of agents. Specifically, we show that the necessary inference within the expectation-maximization framework can be decomposed into processes that often involve a small subset of agents, thereby facilitating scalability. We further show that a number of existing multiagent planning models satisfy these conditions. Experiments on large planning benchmarks confirm the benefits of our approach in terms of runtime and scalability with respect to existing techniques.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('898','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_898\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Decentralized POMDPs provide an expressive framework for multiagent sequential decision making. However, the complexity of these models (NEXP-complete even for two agents) has limited their scalability. We present a promising new class of approximation algorithms by developing novel connections between multiagent planning and machine learning. We show how the multiagent planning problem can be reformulated as inference in a mixture of dynamic Bayesian networks (DBNs). This planning-as-inference approach paves the way for the application of efficient inference techniques in DBNs to multiagent decision making. To further improve scalability, we identify certain conditions that are sufficient to extend the approach to multiagent systems with dozens of agents. Specifically, we show that the necessary inference within the expectation-maximization framework can be decomposed into processes that often involve a small subset of agents, thereby facilitating scalability. 
We further show that a number of existing multiagent planning models satisfy these conditions. Experiments on large planning benchmarks confirm the benefits of our approach in terms of runtime and scalability with respect to existing techniques.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('898','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_898\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZTjair15.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZTjair15.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZTjair15.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1613\/jair.4649\" title=\"Follow DOI:10.1613\/jair.4649\" target=\"_blank\">doi:10.1613\/jair.4649<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('898','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Nguyen, Duc Thien;  Yeoh, William;  Lau, Hoong Chuin;  Zilberstein, Shlomo;  Zhang, Chongjie<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('917','tp_links')\" style=\"cursor:pointer;\">Decentralized Multi-Agent Reinforcement Learning in Average-Reward Dynamic DCOPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 28th Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">Quebec City, Canada, <\/span><span class=\"tp_pub_additional_year\">2014<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_917\" 
class=\"tp_show\" onclick=\"teachpress_pub_showhide('917','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_917\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('917','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_917\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('917','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_917\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:NYLZZaaai14,<br \/>\r\ntitle = {Decentralized Multi-Agent Reinforcement Learning in Average-Reward Dynamic DCOPs},<br \/>\r\nauthor = {Duc Thien Nguyen and William Yeoh and Hoong Chuin Lau and Shlomo Zilberstein and Chongjie Zhang},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NYLZZaaai14.pdf},<br \/>\r\nyear  = {2014},<br \/>\r\ndate = {2014-01-01},<br \/>\r\nbooktitle = {Proceedings of the 28th Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {1447--1455},<br \/>\r\naddress = {Quebec City, Canada},<br \/>\r\nabstract = {Researchers have introduced the Dynamic Distributed Constraint Optimization Problem (Dynamic DCOP) formulation to model dynamically changing multi-agent coordination problems, where a dynamic DCOP is a sequence of (static canonical) DCOPs, each partially different from the DCOP preceding it. Existing work typically assumes that the problem in each time step is decoupled from the problems in other time steps, which might not hold in some applications. 
Therefore, in this paper, we make the following contributions: (i) We introduce a new model, called Markovian Dynamic DCOPs (MD-DCOPs), where the DCOP in the next time step is a function of the value assignments in the current time step; (ii) We introduce two distributed reinforcement learning algorithms, the Distributed RVI Q-learning algorithm and the Distributed R-learning algorithm, that balance exploration and exploitation to solve MD-DCOPs in an online manner; and (iii) We empirically evaluate them against an existing multiarm bandit DCOP algorithm on dynamic DCOPs.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('917','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_917\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Researchers have introduced the Dynamic Distributed Constraint Optimization Problem (Dynamic DCOP) formulation to model dynamically changing multi-agent coordination problems, where a dynamic DCOP is a sequence of (static canonical) DCOPs, each partially different from the DCOP preceding it. Existing work typically assumes that the problem in each time step is decoupled from the problems in other time steps, which might not hold in some applications. 
Therefore, in this paper, we make the following contributions: (i) We introduce a new model, called Markovian Dynamic DCOPs (MD-DCOPs), where the DCOP in the next time step is a function of the value assignments in the current time step; (ii) We introduce two distributed reinforcement learning algorithms, the Distributed RVI Q-learning algorithm and the Distributed R-learning algorithm, that balance exploration and exploitation to solve MD-DCOPs in an online manner; and (iii) We empirically evaluate them against an existing multiarm bandit DCOP algorithm on dynamic DCOPs.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('917','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_917\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NYLZZaaai14.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NYLZZaaai14.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NYLZZaaai14.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('917','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Brafman, Ronen I;  Shani, Guy;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('921','tp_links')\" style=\"cursor:pointer;\">Qualitative Planning under Partial Observability in Multi-Agent Domains<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 27th Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">Bellevue, Washington, <\/span><span 
class=\"tp_pub_additional_year\">2013<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_921\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('921','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_921\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('921','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_921\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('921','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_921\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:BSZaaai13,<br \/>\r\ntitle = {Qualitative Planning under Partial Observability in Multi-Agent Domains},<br \/>\r\nauthor = {Ronen I Brafman and Guy Shani and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSZaaai13.pdf},<br \/>\r\nyear  = {2013},<br \/>\r\ndate = {2013-01-01},<br \/>\r\nbooktitle = {Proceedings of the 27th Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {130--137},<br \/>\r\naddress = {Bellevue, Washington},<br \/>\r\nabstract = {Decentralized POMDPs (Dec-POMDPs) provide a rich, attractive model for planning under uncertainty and partial observability in cooperative multi-agent domains with a growing body of research. In this paper we formulate a qualitative, propositional model for multi-agent planning under uncertainty with partial observability, which we call Qualitative Dec-POMDP (QDec-POMDP). We show that the worst-case complexity of planning in QDec-POMDPs is similar to that of Dec-POMDPs. Still, because the model is more \"classical\" in nature, it is more compact and easier to specify. 
Furthermore, it eases the adaptation of methods used in classical and contingent planning to solve problems that challenge current Dec-POMDP solvers. In particular, in this paper we describe a method based on compilation to classical planning, which handles multi-agent planning problems significantly larger than those handled by current Dec-POMDP algorithms.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('921','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_921\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Decentralized POMDPs (Dec-POMDPs) provide a rich, attractive model for planning under uncertainty and partial observability in cooperative multi-agent domains with a growing body of research. In this paper we formulate a qualitative, propositional model for multi-agent planning under uncertainty with partial observability, which we call Qualitative Dec-POMDP (QDec-POMDP). We show that the worst-case complexity of planning in QDec-POMDPs is similar to that of Dec-POMDPs. Still, because the model is more \"classical\" in nature, it is more compact and easier to specify. Furthermore, it eases the adaptation of methods used in classical and contingent planning to solve problems that challenge current Dec-POMDP solvers. 
In particular, in this paper we describe a method based on compilation to classical planning, which handles multi-agent planning problems significantly larger than those handled by current Dec-POMDP algorithms.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('921','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_921\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSZaaai13.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSZaaai13.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSZaaai13.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('921','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wu, Feng;  Zilberstein, Shlomo;  Jennings, Nicholas R<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('923','tp_links')\" style=\"cursor:pointer;\">Monte-Carlo Expectation Maximization for Decentralized POMDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Beijing, China, <\/span><span class=\"tp_pub_additional_year\">2013<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_923\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('923','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_923\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('923','tp_links')\" 
title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_923\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('923','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_923\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WZJijcai13,<br \/>\r\ntitle = {Monte-Carlo Expectation Maximization for Decentralized POMDPs},<br \/>\r\nauthor = {Feng Wu and Shlomo Zilberstein and Nicholas R Jennings},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZJijcai13.pdf},<br \/>\r\nyear  = {2013},<br \/>\r\ndate = {2013-01-01},<br \/>\r\nbooktitle = {Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {397--403},<br \/>\r\naddress = {Beijing, China},<br \/>\r\nabstract = {We address two significant drawbacks of state-of-the-art solvers of decentralized POMDPs (DEC-POMDPs): the reliance on complete knowledge of the model and limited scalability as the complexity of the domain grows. We extend a recently proposed approach for solving DEC-POMDPs via a reduction to the maximum likelihood problem, which in turn can be solved using EM. We introduce a model-free version of this approach that employs Monte-Carlo EM (MCEM). 
While a naive implementation of MCEM is inadequate in multi-agent settings, we introduce several improvements in sampling that produce high-quality results on a variety of DEC-POMDP benchmarks, including large problems with thousands of agents.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('923','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_923\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We address two significant drawbacks of state-of-the-art solvers of decentralized POMDPs (DEC-POMDPs): the reliance on complete knowledge of the model and limited scalability as the complexity of the domain grows. We extend a recently proposed approach for solving DEC-POMDPs via a reduction to the maximum likelihood problem, which in turn can be solved using EM. We introduce a model-free version of this approach that employs Monte-Carlo EM (MCEM). 
While a naive implementation of MCEM is inadequate in multi-agent settings, we introduce several improvements in sampling that produce high-quality results on a variety of DEC-POMDP benchmarks, including large problems with thousands of agents.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('923','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_923\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZJijcai13.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZJijcai13.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZJijcai13.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('923','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Yeoh, William;  Kumar, Akshat;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('925','tp_links')\" style=\"cursor:pointer;\">Automated Generation of Interaction Graphs for Value-Factored Dec-POMDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Beijing, China, <\/span><span class=\"tp_pub_additional_year\">2013<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_925\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('925','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_925\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('925','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_925\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('925','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_925\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:YKZijcai13,<br \/>\r\ntitle = {Automated Generation of Interaction Graphs for Value-Factored Dec-POMDPs},<br \/>\r\nauthor = {William Yeoh and Akshat Kumar and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/YKZijcai13.pdf},<br \/>\r\nyear  = {2013},<br \/>\r\ndate = {2013-01-01},<br \/>\r\nbooktitle = {Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {411--417},<br \/>\r\naddress = {Beijing, China},<br \/>\r\nabstract = {The Decentralized Partially Observable Markov Decision Process (Dec-POMDP) is a powerful model for multiagent planning under uncertainty, but its applicability is hindered by its high complexity -- solving Dec-POMDPs optimally is NEXP-hard. Recently, Kumar et al. introduced the Value Factorization (VF) framework, which exploits decomposable value functions that can be factored into subfunctions. This framework has been shown to be a generalization of several models that leverage sparse agent interactions such as TI-Dec-MDPs, ND-POMDPs and TD-POMDPs. Existing algorithms for these models assume that the interaction graph of the problem is given. In this paper, we introduce three algorithms to automatically generate interaction graphs for models within the VF framework and establish lower and upper bounds on the expected reward of an optimal joint policy. 
We illustrate experimentally the benefits of these techniques for sensor placement in a decentralized tracking application.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('925','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_925\" style=\"display:none;\"><div class=\"tp_abstract_entry\">The Decentralized Partially Observable Markov Decision Process (Dec-POMDP) is a powerful model for multiagent planning under uncertainty, but its applicability is hindered by its high complexity -- solving Dec-POMDPs optimally is NEXP-hard. Recently, Kumar et al. introduced the Value Factorization (VF) framework, which exploits decomposable value functions that can be factored into subfunctions. This framework has been shown to be a generalization of several models that leverage sparse agent interactions such as TI-Dec-MDPs, ND-POMDPs and TD-POMDPs. Existing algorithms for these models assume that the interaction graph of the problem is given. In this paper, we introduce three algorithms to automatically generate interaction graphs for models within the VF framework and establish lower and upper bounds on the expected reward of an optimal joint policy. 
We illustrate experimentally the benefits of these techniques for sensor placement in a decentralized tracking application.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('925','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_925\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/YKZijcai13.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/YKZijcai13.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/YKZijcai13.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('925','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_incollection\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Durfee, Edmund;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('927','tp_links')\" style=\"cursor:pointer;\">Multiagent Planning, Control, and Execution<\/a> <span class=\"tp_pub_type tp_  incollection\">Book Section<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span> Weiss, G (Ed.): <span class=\"tp_pub_additional_booktitle\">Multiagent Systems, Second Edition, <\/span><span class=\"tp_pub_additional_pages\">pp. 
485\u2013546, <\/span><span class=\"tp_pub_additional_publisher\">MIT Press, <\/span><span class=\"tp_pub_additional_address\">Cambridge, MA, USA, <\/span><span class=\"tp_pub_additional_year\">2013<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_resource_link\"><a id=\"tp_links_sh_927\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('927','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_927\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('927','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_927\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@incollection{SZ:DZmultiagent13,<br \/>\r\ntitle = {Multiagent Planning, Control, and Execution},<br \/>\r\nauthor = {Edmund Durfee and Shlomo Zilberstein},<br \/>\r\neditor = {G Weiss},<br \/>\r\nurl = {https:\/\/mitpress.mit.edu\/books\/multiagent-systems-second-edition},<br \/>\r\nyear  = {2013},<br \/>\r\ndate = {2013-01-01},<br \/>\r\nbooktitle = {Multiagent Systems, Second Edition},<br \/>\r\npages = {485--546},<br \/>\r\npublisher = {MIT Press},<br \/>\r\naddress = {Cambridge, MA, USA},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {incollection}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('927','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_927\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/mitpress.mit.edu\/books\/multiagent-systems-second-edition\" title=\"https:\/\/mitpress.mit.edu\/books\/multiagent-systems-second-edition\" target=\"_blank\">https:\/\/mitpress.mit.edu\/books\/multiagent-systems-second-edition<\/a><\/li><\/ul><\/div><p 
class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('927','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wu, Feng;  Zilberstein, Shlomo;  Chen, Xiaoping<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('933','tp_links')\" style=\"cursor:pointer;\">Online Planning for Multi-Agent Systems with Bounded Communication<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Artificial Intelligence (AIJ), <\/span><span class=\"tp_pub_additional_volume\">vol. 175, <\/span><span class=\"tp_pub_additional_number\">no. 2, <\/span><span class=\"tp_pub_additional_pages\">pp. 487\u2013511, <\/span><span class=\"tp_pub_additional_year\">2011<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_933\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('933','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_933\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('933','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_933\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('933','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_933\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:WZCaij11,<br \/>\r\ntitle = {Online Planning for Multi-Agent Systems with Bounded Communication},<br \/>\r\nauthor = {Feng Wu and Shlomo Zilberstein and Xiaoping Chen},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCaij11.pdf},<br \/>\r\ndoi 
= {10.1016\/j.artint.2010.09.008},<br \/>\r\nyear  = {2011},<br \/>\r\ndate = {2011-01-01},<br \/>\r\njournal = {Artificial Intelligence (AIJ)},<br \/>\r\nvolume = {175},<br \/>\r\nnumber = {2},<br \/>\r\npages = {487--511},<br \/>\r\nabstract = {We propose an online algorithm for planning under uncertainty in multi-agent settings modeled as DEC-POMDPs. The algorithm helps overcome the high computational complexity of solving such problems offline. The key challenges in decentralized operation are to maintain coordinated behavior with little or no communication and, when communication is allowed, to optimize value with minimal communication. The algorithm addresses these challenges by generating identical conditional plans based on common knowledge and communicating only when history inconsistency is detected, allowing communication to be postponed when necessary. To be suitable for online operation, the algorithm computes good local policies using a new and fast local search method implemented using linear programming. Moreover, it bounds the amount of memory used at each step and can be applied to problems with arbitrary horizons. The experimental results confirm that the algorithm can solve problems that are too large for the best existing offline planning algorithms and it outperforms the best online method, producing much higher value with much less communication in most cases. The algorithm also proves to be effective when the communication channel is imperfect (periodically unavailable). 
These results contribute to the scalability of decision-theoretic planning in multi-agent settings.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('933','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_933\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We propose an online algorithm for planning under uncertainty in multi-agent settings modeled as DEC-POMDPs. The algorithm helps overcome the high computational complexity of solving such problems offline. The key challenges in decentralized operation are to maintain coordinated behavior with little or no communication and, when communication is allowed, to optimize value with minimal communication. The algorithm addresses these challenges by generating identical conditional plans based on common knowledge and communicating only when history inconsistency is detected, allowing communication to be postponed when necessary. To be suitable for online operation, the algorithm computes good local policies using a new and fast local search method implemented using linear programming. Moreover, it bounds the amount of memory used at each step and can be applied to problems with arbitrary horizons. The experimental results confirm that the algorithm can solve problems that are too large for the best existing offline planning algorithms and it outperforms the best online method, producing much higher value with much less communication in most cases. The algorithm also proves to be effective when the communication channel is imperfect (periodically unavailable). 
These results contribute to the scalability of decision-theoretic planning in multi-agent settings.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('933','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_933\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCaij11.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCaij11.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCaij11.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1016\/j.artint.2010.09.008\" title=\"Follow DOI:10.1016\/j.artint.2010.09.008\" target=\"_blank\">doi:10.1016\/j.artint.2010.09.008<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('933','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Kumar, Akshat;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('936','tp_links')\" style=\"cursor:pointer;\">Message-Passing Algorithms for Large Structured Decentralized POMDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Taipei, Taiwan, <\/span><span class=\"tp_pub_additional_year\">2011<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_936\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('936','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span 
class=\"tp_resource_link\"><a id=\"tp_links_sh_936\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('936','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_936\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('936','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_936\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:KZaamas11,<br \/>\r\ntitle = {Message-Passing Algorithms for Large Structured Decentralized POMDPs},<br \/>\r\nauthor = {Akshat Kumar and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZaamas11.pdf},<br \/>\r\nyear  = {2011},<br \/>\r\ndate = {2011-01-01},<br \/>\r\nbooktitle = {Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},<br \/>\r\npages = {1087--1088},<br \/>\r\naddress = {Taipei, Taiwan},<br \/>\r\nabstract = {Anytime algorithms allow a system to trade solution quality for computation time. In previous work, monitoring techniques have been developed to allow agents to stop the computation at the \"right\" time so as to optimize a given time-dependent utility function. However, these results apply only to the single-agent case. In this paper we analyze the problems that arise when several agents solve components of a larger problem, each using an anytime algorithm. Monitoring in this case is more challenging as each agent is uncertain about the progress made so far by the others. We develop a formal framework for decentralized monitoring, establish the complexity of several interesting variants of the problem, and propose solution techniques for each one. 
Finally, we show that the framework can be applied to decentralized flow and planning problems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('936','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_936\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Anytime algorithms allow a system to trade solution quality for computation time. In previous work, monitoring techniques have been developed to allow agents to stop the computation at the \"right\" time so as to optimize a given time-dependent utility function. However, these results apply only to the single-agent case. In this paper we analyze the problems that arise when several agents solve components of a larger problem, each using an anytime algorithm. Monitoring in this case is more challenging as each agent is uncertain about the progress made so far by the others. We develop a formal framework for decentralized monitoring, establish the complexity of several interesting variants of the problem, and propose solution techniques for each one. 
Finally, we show that the framework can be applied to decentralized flow and planning problems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('936','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_936\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZaamas11.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZaamas11.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZaamas11.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('936','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Kumar, Akshat;  Zilberstein, Shlomo;  Toussaint, Marc<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('939','tp_links')\" style=\"cursor:pointer;\">Scalable Multiagent Planning Using Probabilistic Inference<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Barcelona, Spain, <\/span><span class=\"tp_pub_additional_year\">2011<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_939\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('939','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_939\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('939','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a 
id=\"tp_bibtex_sh_939\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('939','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_939\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:KZTijcai11,<br \/>\r\ntitle = {Scalable Multiagent Planning Using Probabilistic Inference},<br \/>\r\nauthor = {Akshat Kumar and Shlomo Zilberstein and Marc Toussaint},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZTijcai11.pdf},<br \/>\r\ndoi = {10.5591\/978-1-57735-516-8\/IJCAI11-357},<br \/>\r\nyear  = {2011},<br \/>\r\ndate = {2011-01-01},<br \/>\r\nbooktitle = {Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {2140--2146},<br \/>\r\naddress = {Barcelona, Spain},<br \/>\r\nabstract = {Multiagent planning has seen much progress with the development of formal models such as Dec-POMDPs. However, the complexity of these models -- NEXP-Complete even for two agents -- has limited scalability. We identify certain mild conditions that are sufficient to make multiagent planning amenable to a scalable approximation w.r.t. the number of agents. This is achieved by constructing a graphical model in which likelihood maximization is equivalent to plan optimization. Using the Expectation-Maximization framework for likelihood maximization, we show that the necessary inference can be decomposed into processes that often involve a small subset of agents, thereby facilitating scalability. We derive a global update rule that combines these local inferences to monotonically increase the overall solution quality. 
Experiments on a large multiagent planning benchmark confirm the benefits of the new approach in terms of runtime and scalability.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('939','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_939\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Multiagent planning has seen much progress with the development of formal models such as Dec-POMDPs. However, the complexity of these models -- NEXP-Complete even for two agents -- has limited scalability. We identify certain mild conditions that are sufficient to make multiagent planning amenable to a scalable approximation w.r.t. the number of agents. This is achieved by constructing a graphical model in which likelihood maximization is equivalent to plan optimization. Using the Expectation-Maximization framework for likelihood maximization, we show that the necessary inference can be decomposed into processes that often involve a small subset of agents, thereby facilitating scalability. We derive a global update rule that combines these local inferences to monotonically increase the overall solution quality. 
Experiments on a large multiagent planning benchmark confirm the benefits of the new approach in terms of runtime and scalability.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('939','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_939\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZTijcai11.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZTijcai11.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZTijcai11.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.5591\/978-1-57735-516-8\/IJCAI11-357\" title=\"Follow DOI:10.5591\/978-1-57735-516-8\/IJCAI11-357\" target=\"_blank\">doi:10.5591\/978-1-57735-516-8\/IJCAI11-357<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('939','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wu, Feng;  Zilberstein, Shlomo;  Chen, Xiaoping<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('940','tp_links')\" style=\"cursor:pointer;\">Online Planning for Ad Hoc Autonomous Agent Teams<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Barcelona, Spain, <\/span><span class=\"tp_pub_additional_year\">2011<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_940\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('940','tp_abstract')\" title=\"Show abstract\" 
style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_940\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('940','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_940\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('940','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_940\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WZCijcai11,<br \/>\r\ntitle = {Online Planning for Ad Hoc Autonomous Agent Teams},<br \/>\r\nauthor = {Feng Wu and Shlomo Zilberstein and Xiaoping Chen},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCijcai11.pdf},<br \/>\r\ndoi = {10.5591\/978-1-57735-516-8\/IJCAI11-081},<br \/>\r\nyear  = {2011},<br \/>\r\ndate = {2011-01-01},<br \/>\r\nbooktitle = {Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {439--445},<br \/>\r\naddress = {Barcelona, Spain},<br \/>\r\nabstract = {We propose a novel online planning algorithm for ad hoc team settings -- challenging situations in which an agent must collaborate with unknown teammates without prior coordination. Our approach is based on constructing and solving a series of stage games, and then using biased adaptive play to choose actions. The utility function in each stage game is estimated via Monte-Carlo tree search using the UCT algorithm. 
We establish analytically the convergence of the algorithm and show that it performs well in a variety of ad hoc team domains.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('940','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_940\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We propose a novel online planning algorithm for ad hoc team settings -- challenging situations in which an agent must collaborate with unknown teammates without prior coordination. Our approach is based on constructing and solving a series of stage games, and then using biased adaptive play to choose actions. The utility function in each stage game is estimated via Monte-Carlo tree search using the UCT algorithm. We establish analytically the convergence of the algorithm and show that it performs well in a variety of ad hoc team domains.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('940','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_940\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCijcai11.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCijcai11.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCijcai11.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.5591\/978-1-57735-516-8\/IJCAI11-081\" title=\"Follow DOI:10.5591\/978-1-57735-516-8\/IJCAI11-081\" target=\"_blank\">doi:10.5591\/978-1-57735-516-8\/IJCAI11-081<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('940','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr 
class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Amato, Christopher;  Bernstein, Daniel S;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('948','tp_links')\" style=\"cursor:pointer;\">Optimizing Fixed-Size Stochastic Controllers for POMDPs and Decentralized POMDPs<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Autonomous Agents and Multi-Agent Systems (JAAMAS), <\/span><span class=\"tp_pub_additional_volume\">vol. 21, <\/span><span class=\"tp_pub_additional_number\">no. 3, <\/span><span class=\"tp_pub_additional_pages\">pp. 293\u2013320, <\/span><span class=\"tp_pub_additional_year\">2010<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_948\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('948','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_948\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('948','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_948\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('948','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_948\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:ABZjaamas10,<br \/>\r\ntitle = {Optimizing Fixed-Size Stochastic Controllers for POMDPs and Decentralized POMDPs},<br \/>\r\nauthor = {Christopher Amato and Daniel S Bernstein and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZjaamas10.pdf},<br \/>\r\ndoi = {10.1007\/s10458-009-9103-z},<br \/>\r\nyear  = 
{2010},<br \/>\r\ndate = {2010-01-01},<br \/>\r\njournal = {Autonomous Agents and Multi-Agent Systems (JAAMAS)},<br \/>\r\nvolume = {21},<br \/>\r\nnumber = {3},<br \/>\r\npages = {293--320},<br \/>\r\nabstract = {Coordination of distributed agents is required for problems arising in many areas, including multi-robot systems, networking and e-commerce. As a formal framework for such problems, we use the decentralized partially observable Markov decision process (DEC-POMDP). Though much work has been done on optimal dynamic programming algorithms for the single-agent version of the problem, optimal algorithms for the multiagent case have been elusive. The main contribution of this paper is an optimal policy iteration algorithm for solving DEC-POMDPs. The algorithm uses stochastic finite-state controllers to represent policies. The solution can include a correlation device, which allows agents to correlate their actions without communicating. This approach alternates between expanding the controller and performing value-preserving transformations, which modify the controller without sacrificing value. We present two efficient value-preserving transformations: one can reduce the size of the controller and the other can improve its value while keeping the size fixed. Empirical results demonstrate the usefulness of value-preserving transformations in increasing value while keeping controller size to a minimum. To broaden the applicability of the approach, we also present a heuristic version of the policy iteration algorithm, which sacrifices convergence to optimality. This algorithm further reduces the size of the controllers at each step by assuming that probability distributions over the other agents' actions are known. 
While this assumption may not hold in general, it helps produce higher quality solutions in our test problems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('948','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_948\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Coordination of distributed agents is required for problems arising in many areas, including multi-robot systems, networking and e-commerce. As a formal framework for such problems, we use the decentralized partially observable Markov decision process (DEC-POMDP). Though much work has been done on optimal dynamic programming algorithms for the single-agent version of the problem, optimal algorithms for the multiagent case have been elusive. The main contribution of this paper is an optimal policy iteration algorithm for solving DEC-POMDPs. The algorithm uses stochastic finite-state controllers to represent policies. The solution can include a correlation device, which allows agents to correlate their actions without communicating. This approach alternates between expanding the controller and performing value-preserving transformations, which modify the controller without sacrificing value. We present two efficient value-preserving transformations: one can reduce the size of the controller and the other can improve its value while keeping the size fixed. Empirical results demonstrate the usefulness of value-preserving transformations in increasing value while keeping controller size to a minimum. To broaden the applicability of the approach, we also present a heuristic version of the policy iteration algorithm, which sacrifices convergence to optimality. 
This algorithm further reduces the size of the controllers at each step by assuming that probability distributions over the other agents' actions are known. While this assumption may not hold in general, it helps produce higher quality solutions in our test problems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('948','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_948\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZjaamas10.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZjaamas10.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZjaamas10.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1007\/s10458-009-9103-z\" title=\"Follow DOI:10.1007\/s10458-009-9103-z\" target=\"_blank\">doi:10.1007\/s10458-009-9103-z<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('948','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Kumar, Akshat;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('949','tp_links')\" style=\"cursor:pointer;\">Point-Based Backup for Decentralized POMDPs: Complexity and New Algorithms<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Toronto, Canada, <\/span><span class=\"tp_pub_additional_year\">2010<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a 
id=\"tp_abstract_sh_949\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('949','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_949\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('949','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_949\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('949','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_949\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:KZaamas10,<br \/>\r\ntitle = {Point-Based Backup for Decentralized POMDPs: Complexity and New Algorithms},<br \/>\r\nauthor = {Akshat Kumar and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZaamas10.pdf},<br \/>\r\nyear  = {2010},<br \/>\r\ndate = {2010-01-01},<br \/>\r\nbooktitle = {Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},<br \/>\r\npages = {1315--1322},<br \/>\r\naddress = {Toronto, Canada},<br \/>\r\nabstract = {Decentralized POMDPs provide an expressive framework for sequential multi-agent decision making. Despite their high complexity, there has been significant progress in scaling up existing algorithms, largely due to the use of point-based methods. Performing point-based backup is a fundamental operation in state-of-the-art algorithms. We show that even a single backup step in the multi-agent setting is NP-Complete. Despite this negative worst-case result, we present an efficient and scalable optimal algorithm as well as a principled approximation scheme. The optimal algorithm exploits recent advances in the weighted CSP literature to overcome the complexity of the backup operation. 
The polytime approximation scheme provides a constant factor approximation guarantee based on the number of belief points. In experiments on standard domains, the optimal approach provides significant speedup (up to 2 orders of magnitude) over the previous best optimal algorithm and is able to increase the number of belief points by more than a factor of 3. The approximation scheme also works well in practice, providing near-optimal solutions to the backup problem.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('949','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_949\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Decentralized POMDPs provide an expressive framework for sequential multi-agent decision making. Despite their high complexity, there has been significant progress in scaling up existing algorithms, largely due to the use of point-based methods. Performing point-based backup is a fundamental operation in state-of-the-art algorithms. We show that even a single backup step in the multi-agent setting is NP-Complete. Despite this negative worst-case result, we present an efficient and scalable optimal algorithm as well as a principled approximation scheme. The optimal algorithm exploits recent advances in the weighted CSP literature to overcome the complexity of the backup operation. The polytime approximation scheme provides a constant factor approximation guarantee based on the number of belief points. In experiments on standard domains, the optimal approach provides significant speedup (up to 2 orders of magnitude) over the previous best optimal algorithm and is able to increase the number of belief points by more than a factor of 3. 
The approximation scheme also works well in practice, providing near-optimal solutions to the backup problem.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('949','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_949\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZaamas10.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZaamas10.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZaamas10.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('949','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wu, Feng;  Zilberstein, Shlomo;  Chen, Xiaoping<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('951','tp_links')\" style=\"cursor:pointer;\">Point-Based Policy Generation for Decentralized POMDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Toronto, Canada, <\/span><span class=\"tp_pub_additional_year\">2010<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_951\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('951','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_951\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('951','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_951\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('951','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_951\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WZCaamas10,<br \/>\r\ntitle = {Point-Based Policy Generation for Decentralized POMDPs},<br \/>\r\nauthor = {Feng Wu and Shlomo Zilberstein and Xiaoping Chen},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCaamas10.pdf},<br \/>\r\nyear  = {2010},<br \/>\r\ndate = {2010-01-01},<br \/>\r\nbooktitle = {Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},<br \/>\r\npages = {1307--1314},<br \/>\r\naddress = {Toronto, Canada},<br \/>\r\nabstract = {Memory-bounded techniques have shown great promise in solving complex multi-agent planning problems modeled as DEC-POMDPs. Much of the performance gains can be attributed to pruning techniques that alleviate the complexity of the exhaustive backup step of the original MBDP algorithm. Despite these improvements, state-of-the-art algorithms can still handle a relatively small pool of candidate policies, which limits the quality of the solution in some benchmark problems. We present a new algorithm, Point-Based Policy Generation, which avoids altogether searching the entire joint policy space. The key observation is that the best joint policy for each reachable belief state can be constructed directly, instead of producing first a large set of candidates. We also provide an efficient approximate implementation of this operation. 
The experimental results show that our solution technique improves the performance significantly in terms of both runtime and solution quality.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('951','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_951\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Memory-bounded techniques have shown great promise in solving complex multi-agent planning problems modeled as DEC-POMDPs. Much of the performance gains can be attributed to pruning techniques that alleviate the complexity of the exhaustive backup step of the original MBDP algorithm. Despite these improvements, state-of-the-art algorithms can still handle a relatively small pool of candidate policies, which limits the quality of the solution in some benchmark problems. We present a new algorithm, Point-Based Policy Generation, which avoids altogether searching the entire joint policy space. The key observation is that the best joint policy for each reachable belief state can be constructed directly, instead of producing first a large set of candidates. We also provide an efficient approximate implementation of this operation. 
The experimental results show that our solution technique improves the performance significantly in terms of both runtime and solution quality.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('951','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_951\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCaamas10.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCaamas10.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCaamas10.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('951','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Kumar, Akshat;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('954','tp_links')\" style=\"cursor:pointer;\">Anytime Planning for Decentralized POMDPs using Expectation Maximization<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI), <\/span><span class=\"tp_pub_additional_address\">Catalina Island, California, <\/span><span class=\"tp_pub_additional_year\">2010<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_954\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('954','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_954\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('954','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_954\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('954','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_954\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:KZuai10,<br \/>\r\ntitle = {Anytime Planning for Decentralized POMDPs using Expectation Maximization},<br \/>\r\nauthor = {Akshat Kumar and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZuai10.pdf},<br \/>\r\nyear  = {2010},<br \/>\r\ndate = {2010-01-01},<br \/>\r\nbooktitle = {Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI)},<br \/>\r\npages = {294--301},<br \/>\r\naddress = {Catalina Island, California},<br \/>\r\nabstract = {Decentralized POMDPs provide an expressive framework for multi-agent sequential decision making. While finite-horizon DEC-POMDPs have enjoyed significant success, progress remains slow for the infinite-horizon case mainly due to the inherent complexity of optimizing stochastic controllers representing agent policies. We present a promising new class of algorithms for the infinite-horizon case, which recasts the optimization problem as inference in a mixture of DBNs. An attractive feature of this approach is the straightforward adoption of existing inference techniques in DBNs for solving DEC-POMDPs and supporting richer representations such as factored or continuous states and actions. We also derive the Expectation Maximization (EM) algorithm to optimize the joint policy represented as DBNs. 
Experiments on benchmark domains show that EM compares favorably against the state-of-the-art solvers.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('954','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_954\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Decentralized POMDPs provide an expressive framework for multi-agent sequential decision making. While finite-horizon DEC-POMDPs have enjoyed significant success, progress remains slow for the infinite-horizon case mainly due to the inherent complexity of optimizing stochastic controllers representing agent policies. We present a promising new class of algorithms for the infinite-horizon case, which recasts the optimization problem as inference in a mixture of DBNs. An attractive feature of this approach is the straightforward adoption of existing inference techniques in DBNs for solving DEC-POMDPs and supporting richer representations such as factored or continuous states and actions. We also derive the Expectation Maximization (EM) algorithm to optimize the joint policy represented as DBNs. 
Experiments on benchmark domains show that EM compares favorably against the state-of-the-art solvers.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('954','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_954\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZuai10.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZuai10.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZuai10.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('954','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wu, Feng;  Zilberstein, Shlomo;  Chen, Xiaoping<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('955','tp_links')\" style=\"cursor:pointer;\">Rollout Sampling Policy Iteration for Decentralized POMDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI), <\/span><span class=\"tp_pub_additional_address\">Catalina Island, California, <\/span><span class=\"tp_pub_additional_year\">2010<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_955\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('955','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_955\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('955','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a 
id=\"tp_bibtex_sh_955\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('955','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_955\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WZCuai10,<br \/>\r\ntitle = {Rollout Sampling Policy Iteration for Decentralized POMDPs},<br \/>\r\nauthor = {Feng Wu and Shlomo Zilberstein and Xiaoping Chen},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCuai10.pdf},<br \/>\r\nyear  = {2010},<br \/>\r\ndate = {2010-01-01},<br \/>\r\nbooktitle = {Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI)},<br \/>\r\npages = {666--673},<br \/>\r\naddress = {Catalina Island, California},<br \/>\r\nabstract = {We present decentralized rollout sampling policy iteration (DecRSPI)--a new algorithm for multiagent decision problems formalized as DEC-POMDPs. DecRSPI is designed to improve scalability and tackle problems that lack an explicit model. The algorithm uses Monte-Carlo methods to generate a sample of reachable belief states. Then it computes a joint policy for each belief state based on the rollout estimations. A new policy representation allows us to represent solutions compactly. The key benefits of the algorithm are its linear time complexity over the number of agents, its bounded memory usage and good solution quality. It can solve larger problems that are intractable for existing planning algorithms. 
Experimental results confirm the effectiveness and scalability of the approach.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('955','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_955\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We present decentralized rollout sampling policy iteration (DecRSPI)--a new algorithm for multiagent decision problems formalized as DEC-POMDPs. DecRSPI is designed to improve scalability and tackle problems that lack an explicit model. The algorithm uses Monte-Carlo methods to generate a sample of reachable belief states. Then it computes a joint policy for each belief state based on the rollout estimations. A new policy representation allows us to represent solutions compactly. The key benefits of the algorithm are its linear time complexity over the number of agents, its bounded memory usage and good solution quality. It can solve larger problems that are intractable for existing planning algorithms. 
Experimental results confirm the effectiveness and scalability of the approach.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('955','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_955\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCuai10.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCuai10.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCuai10.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('955','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Amato, Christopher;  Bonet, Blai;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('957','tp_links')\" style=\"cursor:pointer;\">Finite-State Controllers Based on Mealy Machines for Centralized and Decentralized POMDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 24th Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">Atlanta, Georgia, <\/span><span class=\"tp_pub_additional_year\">2010<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_957\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('957','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_957\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('957','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a 
id=\"tp_bibtex_sh_957\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('957','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_957\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:ABZaaai10,<br \/>\r\ntitle = {Finite-State Controllers Based on Mealy Machines for Centralized and Decentralized POMDPs},<br \/>\r\nauthor = {Christopher Amato and Blai Bonet and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZaaai10.pdf},<br \/>\r\nyear  = {2010},<br \/>\r\ndate = {2010-01-01},<br \/>\r\nbooktitle = {Proceedings of the 24th Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {1052--1058},<br \/>\r\naddress = {Atlanta, Georgia},<br \/>\r\nabstract = {Existing controller-based approaches for centralized and decentralized POMDPs are based on automata with output known as Moore machines. In this paper, we show that several advantages can be gained by utilizing another type of automata, the Mealy machine. Mealy machines are more powerful than Moore machines, provide a richer structure that can be exploited by solution methods, and can be easily incorporated into current controller-based approaches. To demonstrate this, we adapted some existing controller-based algorithms to use Mealy machines and obtained results on a set of benchmark domains. The Mealy-based approach always outperformed the Moore-based approach and often outperformed the state-of-the-art algorithms for both centralized and decentralized POMDPs. 
These findings provide fresh and general insights for the improvement of existing algorithms and the development of new ones.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('957','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_957\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Existing controller-based approaches for centralized and decentralized POMDPs are based on automata with output known as Moore machines. In this paper, we show that several advantages can be gained by utilizing another type of automata, the Mealy machine. Mealy machines are more powerful than Moore machines, provide a richer structure that can be exploited by solution methods, and can be easily incorporated into current controller-based approaches. To demonstrate this, we adapted some existing controller-based algorithms to use Mealy machines and obtained results on a set of benchmark domains. The Mealy-based approach always outperformed the Moore-based approach and often outperformed the state-of-the-art algorithms for both centralized and decentralized POMDPs. 
These findings provide fresh and general insights for the improvement of existing algorithms and the development of new ones.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('957','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_957\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZaaai10.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZaaai10.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZaaai10.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('957','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wu, Feng;  Zilberstein, Shlomo;  Chen, Xiaoping<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('958','tp_links')\" style=\"cursor:pointer;\">Trial-Based Dynamic Programming for Multi-Agent Planning<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 24th Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">Atlanta, Georgia, <\/span><span class=\"tp_pub_additional_year\">2010<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_958\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('958','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_958\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('958','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a 
id=\"tp_bibtex_sh_958\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('958','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_958\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WZCaaai10,<br \/>\r\ntitle = {Trial-Based Dynamic Programming for Multi-Agent Planning},<br \/>\r\nauthor = {Feng Wu and Shlomo Zilberstein and Xiaoping Chen},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCaaai10.pdf},<br \/>\r\nyear  = {2010},<br \/>\r\ndate = {2010-01-01},<br \/>\r\nbooktitle = {Proceedings of the 24th Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {908--914},<br \/>\r\naddress = {Atlanta, Georgia},<br \/>\r\nabstract = {Trial-based approaches offer an efficient way to solve single-agent MDPs and POMDPs. These approaches allow agents to focus their computations on regions of the environment they encounter during the trials, leading to significant computational savings. We present a novel trial-based dynamic programming (TBDP) algorithm for DEC-POMDPs that extends these benefits to multi-agent settings. The algorithm uses trial-based methods for both belief generation and policy evaluation. Policy improvement is implemented efficiently using linear programming and a sub-policy reuse technique that helps bound the amount of memory. The results show that TBDP can produce significant value improvements and is much faster than the best existing planning algorithms.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('958','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_958\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Trial-based approaches offer an efficient way to solve single-agent MDPs and POMDPs. 
These approaches allow agents to focus their computations on regions of the environment they encounter during the trials, leading to significant computational savings. We present a novel trial-based dynamic programming (TBDP) algorithm for DEC-POMDPs that extends these benefits to multi-agent settings. The algorithm uses trial-based methods for both belief generation and policy evaluation. Policy improvement is implemented efficiently using linear programming and a sub-policy reuse technique that helps bound the amount of memory. The results show that TBDP can produce significant value improvements and is much faster than the best existing planning algorithms.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('958','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_958\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCaaai10.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCaaai10.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCaaai10.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('958','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Bernstein, Daniel S;  Amato, Christopher;  Hansen, Eric A;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('960','tp_links')\" style=\"cursor:pointer;\">Policy Iteration for Decentralized Control of Markov Decision Processes<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Journal of Artificial Intelligence Research 
(JAIR), <\/span><span class=\"tp_pub_additional_volume\">vol. 34, <\/span><span class=\"tp_pub_additional_pages\">pp. 89\u2013132, <\/span><span class=\"tp_pub_additional_year\">2009<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_960\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('960','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_960\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('960','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_960\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('960','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_960\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:BAHZjair09,<br \/>\r\ntitle = {Policy Iteration for Decentralized Control of Markov Decision Processes},<br \/>\r\nauthor = {Daniel S Bernstein and Christopher Amato and Eric A Hansen and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BAHZjair09.pdf},<br \/>\r\ndoi = {10.1613\/jair.2667},<br \/>\r\nyear  = {2009},<br \/>\r\ndate = {2009-01-01},<br \/>\r\njournal = {Journal of Artificial Intelligence Research (JAIR)},<br \/>\r\nvolume = {34},<br \/>\r\npages = {89--132},<br \/>\r\nabstract = {Coordination of distributed agents is required for problems arising in many areas, including multi-robot systems, networking and e-commerce. As a formal framework for such problems, we use the decentralized partially observable Markov decision process (DEC-POMDP). Though much work has been done on optimal dynamic programming algorithms for the single-agent version of the problem, optimal algorithms for the multiagent case have been elusive. 
The main contribution of this paper is an optimal policy iteration algorithm for solving DEC-POMDPs. The algorithm uses stochastic finite-state controllers to represent policies. The solution can include a correlation device, which allows agents to correlate their actions without communicating. This approach alternates between expanding the controller and performing value-preserving transformations, which modify the controller without sacrificing value. We present two efficient value-preserving transformations: one can reduce the size of the controller and the other can improve its value while keeping the size fixed. Empirical results demonstrate the usefulness of value-preserving transformations in increasing value while keeping controller size to a minimum. To broaden the applicability of the approach, we also present a heuristic version of the policy iteration algorithm, which sacrifices convergence to optimality. This algorithm further reduces the size of the controllers at each step by assuming that probability distributions over the other agents' actions are known. While this assumption may not hold in general, it helps produce higher quality solutions in our test problems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('960','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_960\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Coordination of distributed agents is required for problems arising in many areas, including multi-robot systems, networking and e-commerce. As a formal framework for such problems, we use the decentralized partially observable Markov decision process (DEC-POMDP). 
Though much work has been done on optimal dynamic programming algorithms for the single-agent version of the problem, optimal algorithms for the multiagent case have been elusive. The main contribution of this paper is an optimal policy iteration algorithm for solving DEC-POMDPs. The algorithm uses stochastic finite-state controllers to represent policies. The solution can include a correlation device, which allows agents to correlate their actions without communicating. This approach alternates between expanding the controller and performing value-preserving transformations, which modify the controller without sacrificing value. We present two efficient value-preserving transformations: one can reduce the size of the controller and the other can improve its value while keeping the size fixed. Empirical results demonstrate the usefulness of value-preserving transformations in increasing value while keeping controller size to a minimum. To broaden the applicability of the approach, we also present a heuristic version of the policy iteration algorithm, which sacrifices convergence to optimality. This algorithm further reduces the size of the controllers at each step by assuming that probability distributions over the other agents' actions are known. 
While this assumption may not hold in general, it helps produce higher quality solutions in our test problems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('960','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_960\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BAHZjair09.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BAHZjair09.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BAHZjair09.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1613\/jair.2667\" title=\"Follow DOI:10.1613\/jair.2667\" target=\"_blank\">doi:10.1613\/jair.2667<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('960','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Petrik, Marek;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('961','tp_links')\" style=\"cursor:pointer;\">A Bilinear Programming Approach for Multiagent Planning<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Journal of Artificial Intelligence Research (JAIR), <\/span><span class=\"tp_pub_additional_volume\">vol. 35, <\/span><span class=\"tp_pub_additional_pages\">pp. 
235\u2013274, <\/span><span class=\"tp_pub_additional_year\">2009<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_961\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('961','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_961\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('961','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_961\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('961','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_961\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:PZjair09,<br \/>\r\ntitle = {A Bilinear Programming Approach for Multiagent Planning},<br \/>\r\nauthor = {Marek Petrik and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZjair09.pdf},<br \/>\r\ndoi = {10.1613\/jair.2673},<br \/>\r\nyear  = {2009},<br \/>\r\ndate = {2009-01-01},<br \/>\r\njournal = {Journal of Artificial Intelligence Research (JAIR)},<br \/>\r\nvolume = {35},<br \/>\r\npages = {235--274},<br \/>\r\nabstract = {Multiagent planning and coordination problems are common and known to be computationally hard. We show that a wide range of two-agent problems can be formulated as bilinear programs. We present a successive approximation algorithm that significantly outperforms the coverage set algorithm, which is the state-of-the-art method for this class of multiagent problems. Because the algorithm is formulated for bilinear programs, it is more general and simpler to implement. The new algorithm can be terminated at any time and--unlike the coverage set algorithm--it facilitates the derivation of a useful online performance bound. 
It is also much more efficient, on average reducing the computation time of the optimal solution by about four orders of magnitude. Finally, we introduce an automatic dimensionality reduction method that improves the effectiveness of the algorithm, extending its applicability to new domains and providing a new way to analyze a subclass of bilinear programs.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('961','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_961\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Multiagent planning and coordination problems are common and known to be computationally hard. We show that a wide range of two-agent problems can be formulated as bilinear programs. We present a successive approximation algorithm that significantly outperforms the coverage set algorithm, which is the state-of-the-art method for this class of multiagent problems. Because the algorithm is formulated for bilinear programs, it is more general and simpler to implement. The new algorithm can be terminated at any time and--unlike the coverage set algorithm--it facilitates the derivation of a useful online performance bound. It is also much more efficient, on average reducing the computation time of the optimal solution by about four orders of magnitude. 
Finally, we introduce an automatic dimensionality reduction method that improves the effectiveness of the algorithm, extending its applicability to new domains and providing a new way to analyze a subclass of bilinear programs.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('961','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_961\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZjair09.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZjair09.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZjair09.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1613\/jair.2673\" title=\"Follow DOI:10.1613\/jair.2673\" target=\"_blank\">doi:10.1613\/jair.2673<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('961','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Becker, Raphen;  Carlin, Alan;  Lesser, Victor;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('962','tp_links')\" style=\"cursor:pointer;\">Analyzing Myopic Approaches for Multi-Agent Communication<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Computational Intelligence, <\/span><span class=\"tp_pub_additional_volume\">vol. 25, <\/span><span class=\"tp_pub_additional_number\">no. 1, <\/span><span class=\"tp_pub_additional_pages\">pp. 
31\u201350, <\/span><span class=\"tp_pub_additional_year\">2009<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_962\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('962','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_962\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('962','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_962\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('962','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_962\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:BCLZci09,<br \/>\r\ntitle = {Analyzing Myopic Approaches for Multi-Agent Communication},<br \/>\r\nauthor = {Raphen Becker and Alan Carlin and Victor Lesser and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BCLZci09.pdf},<br \/>\r\ndoi = {10.1111\/j.1467-8640.2008.01329.x},<br \/>\r\nyear  = {2009},<br \/>\r\ndate = {2009-01-01},<br \/>\r\njournal = {Computational Intelligence},<br \/>\r\nvolume = {25},<br \/>\r\nnumber = {1},<br \/>\r\npages = {31--50},<br \/>\r\nabstract = {Choosing when to communicate is a fundamental problem in multi-agent systems. This problem becomes particularly challenging when communication is constrained and each agent has different partial information about the overall situation. We take a decision-theoretic approach to this problem that balances the benefits of communication against the costs. Although computing the exact value of communication is intractable, it can be estimated using a standard myopic assumption--that communication is only possible at the present time. 
We examine specific situations in which this assumption leads to poor performance and demonstrate an alternative approach that relaxes the assumption and improves performance. The results provide an effective method for value-driven communication policies in multi-agent systems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('962','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_962\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Choosing when to communicate is a fundamental problem in multi-agent systems. This problem becomes particularly challenging when communication is constrained and each agent has different partial information about the overall situation. We take a decision-theoretic approach to this problem that balances the benefits of communication against the costs. Although computing the exact value of communication is intractable, it can be estimated using a standard myopic assumption--that communication is only possible at the present time. We examine specific situations in which this assumption leads to poor performance and demonstrate an alternative approach that relaxes the assumption and improves performance. 
The results provide an effective method for value-driven communication policies in multi-agent systems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('962','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_962\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BCLZci09.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BCLZci09.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BCLZci09.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1111\/j.1467-8640.2008.01329.x\" title=\"Follow DOI:10.1111\/j.1467-8640.2008.01329.x\" target=\"_blank\">doi:10.1111\/j.1467-8640.2008.01329.x<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('962','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Amato, Christopher;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('963','tp_links')\" style=\"cursor:pointer;\">Achieving Goals in Decentralized POMDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Budapest, Hungary, <\/span><span class=\"tp_pub_additional_year\">2009<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_963\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('963','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span 
class=\"tp_resource_link\"><a id=\"tp_links_sh_963\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('963','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_963\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('963','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_963\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{ASZ:Zaamas09,<br \/>\r\ntitle = {Achieving Goals in Decentralized POMDPs},<br \/>\r\nauthor = {Christopher Amato and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZaamas09.pdf},<br \/>\r\nyear  = {2009},<br \/>\r\ndate = {2009-01-01},<br \/>\r\nbooktitle = {Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},<br \/>\r\npages = {593--600},<br \/>\r\naddress = {Budapest, Hungary},<br \/>\r\nabstract = {Coordination of multiple agents under uncertainty in the decentralized POMDP model is known to be NEXP-complete, even when the agents have a joint set of goals. Nevertheless, we show that the existence of goals can help develop effective planning algorithms. We examine an approach to model these problems as indefinite-horizon decentralized POMDPs, suitable for many practical problems that terminate after some unspecified number of steps. Our algorithm for solving these problems is optimal under some common assumptions--that terminal actions exist for each agent and rewards for non-terminal actions are negative. We also propose an infinite-horizon approximation method that allows us to relax these assumptions while maintaining goal conditions. An optimality bound is developed for this sample-based approach and experimental results show that it is able to exploit the goal structure effectively. 
Compared with the state-of-the-art, our approach can solve larger problems and produce significantly better solutions.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('963','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_963\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Coordination of multiple agents under uncertainty in the decentralized POMDP model is known to be NEXP-complete, even when the agents have a joint set of goals. Nevertheless, we show that the existence of goals can help develop effective planning algorithms. We examine an approach to model these problems as indefinite-horizon decentralized POMDPs, suitable for many practical problems that terminate after some unspecified number of steps. Our algorithm for solving these problems is optimal under some common assumptions--that terminal actions exist for each agent and rewards for non-terminal actions are negative. We also propose an infinite-horizon approximation method that allows us to relax these assumptions while maintaining goal conditions. An optimality bound is developed for this sample-based approach and experimental results show that it is able to exploit the goal structure effectively. 
Compared with the state-of-the-art, our approach can solve larger problems and produce significantly better solutions.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('963','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_963\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZaamas09.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZaamas09.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZaamas09.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('963','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Kumar, Akshat;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('964','tp_links')\" style=\"cursor:pointer;\">Constraint-Based Dynamic Programming for Decentralized POMDPs with Structured Interactions<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Budapest, Hungary, <\/span><span class=\"tp_pub_additional_year\">2009<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_964\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('964','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_964\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('964','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_964\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('964','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_964\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:KZaamas09,<br \/>\r\ntitle = {Constraint-Based Dynamic Programming for Decentralized POMDPs with Structured Interactions},<br \/>\r\nauthor = {Akshat Kumar and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZaamas09.pdf},<br \/>\r\nyear  = {2009},<br \/>\r\ndate = {2009-01-01},<br \/>\r\nbooktitle = {Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},<br \/>\r\npages = {561--568},<br \/>\r\naddress = {Budapest, Hungary},<br \/>\r\nabstract = {Decentralized partially observable MDPs (DEC-POMDPs) provide a rich framework for modeling decision making by a team of agents. Despite rapid progress in this area, the limited scalability of solution techniques has restricted the applicability of the model. To overcome this computational barrier, research has focused on restricted classes of DEC-POMDPs, which are easier to solve yet rich enough to capture many practical problems. We present CBDP, an efficient and scalable point-based dynamic programming algorithm for one such model called ND-POMDP (Network Distributed POMDP). Specifically, CBDP provides orders of magnitude speedup in the policy computation and generates better quality solutions for all test instances. It has linear complexity in the number of agents and horizon length. Furthermore, the complexity per horizon for the examined class of problems is exponential only in a small parameter that depends upon the interaction among the agents, achieving significant scalability for large, loosely coupled multi-agent systems. 
The efficiency of CBDP lies in exploiting the structure of interactions using constraint networks. These results extend significantly the effectiveness of decision-theoretic planning in multi-agent settings.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('964','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_964\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Decentralized partially observable MDPs (DEC-POMDPs) provide a rich framework for modeling decision making by a team of agents. Despite rapid progress in this area, the limited scalability of solution techniques has restricted the applicability of the model. To overcome this computational barrier, research has focused on restricted classes of DEC-POMDPs, which are easier to solve yet rich enough to capture many practical problems. We present CBDP, an efficient and scalable point-based dynamic programming algorithm for one such model called ND-POMDP (Network Distributed POMDP). Specifically, CBDP provides orders of magnitude speedup in the policy computation and generates better quality solutions for all test instances. It has linear complexity in the number of agents and horizon length. Furthermore, the complexity per horizon for the examined class of problems is exponential only in a small parameter that depends upon the interaction among the agents, achieving significant scalability for large, loosely coupled multi-agent systems. The efficiency of CBDP lies in exploiting the structure of interactions using constraint networks. 
These results extend significantly the effectiveness of decision-theoretic planning in multi-agent settings.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('964','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_964\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZaamas09.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZaamas09.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZaamas09.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('964','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Kumar, Akshat;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('966','tp_links')\" style=\"cursor:pointer;\">Dynamic Programming Approximations for Partially Observable Stochastic Games<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 22nd International FLAIRS Conference, <\/span><span class=\"tp_pub_additional_address\">Sanibel Island, Florida, <\/span><span class=\"tp_pub_additional_year\">2009<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_966\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('966','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_966\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('966','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a 
id=\"tp_bibtex_sh_966\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('966','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_966\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:KZflairs09,<br \/>\r\ntitle = {Dynamic Programming Approximations for Partially Observable Stochastic Games},<br \/>\r\nauthor = {Akshat Kumar and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CZmsdm09.pdf},<br \/>\r\nyear  = {2009},<br \/>\r\ndate = {2009-01-01},<br \/>\r\nbooktitle = {Proceedings of the 22nd International FLAIRS Conference},<br \/>\r\npages = {547--552},<br \/>\r\naddress = {Sanibel Island, Florida},<br \/>\r\nabstract = {Partially observable stochastic games (POSGs) provide a rich mathematical framework for planning under uncertainty by a group of agents. However, this modeling advantage comes with a price, namely a high computational cost. Solving POSGs optimally quickly becomes intractable after a few decision cycles. Our main contribution is to provide bounded approximation techniques, which enable us to scale POSG algorithms by several orders of magnitude. We study both the POSG model and its cooperative counterpart, DEC-POMDP. Experiments on a number of problems confirm the scalability of our approach while still providing useful policies.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('966','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_966\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Partially observable stochastic games (POSGs) provide a rich mathematical framework for planning under uncertainty by a group of agents. 
However, this modeling advantage comes with a price, namely a high computational cost. Solving POSGs optimally quickly becomes intractable after a few decision cycles. Our main contribution is to provide bounded approximation techniques, which enable us to scale POSG algorithms by several orders of magnitude. We study both the POSG model and its cooperative counterpart, DEC-POMDP. Experiments on a number of problems confirm the scalability of our approach while still providing useful policies.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('966','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_966\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CZmsdm09.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CZmsdm09.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CZmsdm09.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('966','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Kumar, Akshat;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('969','tp_links')\" style=\"cursor:pointer;\">Event-Detecting Multi-Agent MDPs: Complexity and Constant-Factor Approximation<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Pasadena, California, <\/span><span class=\"tp_pub_additional_year\">2009<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a 
id=\"tp_abstract_sh_969\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('969','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_969\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('969','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_969\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('969','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_969\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:KZijcai09,<br \/>\r\ntitle = {Event-Detecting Multi-Agent MDPs: Complexity and Constant-Factor Approximation},<br \/>\r\nauthor = {Akshat Kumar and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZijcai09.pdf},<br \/>\r\nyear  = {2009},<br \/>\r\ndate = {2009-01-01},<br \/>\r\nbooktitle = {Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {201--207},<br \/>\r\naddress = {Pasadena, California},<br \/>\r\nabstract = {Planning under uncertainty for multiple agents has grown rapidly with the development of formal models such as multi-agent MDPs and decentralized MDPs. But despite their richness, the applicability of these models remains limited due to their computational complexity. We present the class of event-detecting multi-agent MDPs (eMMDPs), designed to detect multiple mobile targets by a team of sensor agents. We show that eMMDPs are NP-Hard and present a scalable 2-approximation algorithm for solving them using matroid theory and constraint optimization. The complexity of the algorithm is linear in the state-space and number of agents, quadratic in the horizon, and exponential only in a small parameter that depends on the interaction among the agents. 
Despite the worst-case approximation ratio of 2, experimental results show that the algorithm produces near-optimal policies for a range of test problems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('969','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_969\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Planning under uncertainty for multiple agents has grown rapidly with the development of formal models such as multi-agent MDPs and decentralized MDPs. But despite their richness, the applicability of these models remains limited due to their computational complexity. We present the class of event-detecting multi-agent MDPs (eMMDPs), designed to detect multiple mobile targets by a team of sensor agents. We show that eMMDPs are NP-Hard and present a scalable 2-approximation algorithm for solving them using matroid theory and constraint optimization. The complexity of the algorithm is linear in the state-space and number of agents, quadratic in the horizon, and exponential only in a small parameter that depends on the interaction among the agents. 
Despite the worst-case approximation ratio of 2, experimental results show that the algorithm produces near-optimal policies for a range of test problems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('969','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_969\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZijcai09.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZijcai09.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZijcai09.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('969','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Amato, Christopher;  Dibangoye, Jilles Steeve;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('971','tp_links')\" style=\"cursor:pointer;\">Incremental Policy Generation for Finite-Horizon DEC-POMDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 19th International Conference on Automated Planning and Scheduling (ICAPS), <\/span><span class=\"tp_pub_additional_address\">Thessaloniki, Greece, <\/span><span class=\"tp_pub_additional_year\">2009<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_971\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('971','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_971\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('971','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_971\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('971','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_971\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:ADZicaps09,<br \/>\r\ntitle = {Incremental Policy Generation for Finite-Horizon DEC-POMDPs},<br \/>\r\nauthor = {Christopher Amato and Jilles Steeve Dibangoye and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ADZicaps09.pdf},<br \/>\r\nyear  = {2009},<br \/>\r\ndate = {2009-01-01},<br \/>\r\nbooktitle = {Proceedings of the 19th International Conference on Automated Planning and Scheduling (ICAPS)},<br \/>\r\npages = {2--9},<br \/>\r\naddress = {Thessaloniki, Greece},<br \/>\r\nabstract = {Decentralized partially observable MDPs (DEC-POMDPs) provide a rich framework for modeling decision making by a team of agents. Despite rapid progress in this area, the limited scalability of solution techniques has restricted the applicability of the model. To overcome this computational barrier, research has focused on restricted classes of DEC-POMDPs, which are easier to solve yet rich enough to capture many practical problems. We present CBDP, an efficient and scalable point-based dynamic programming algorithm for one such model called ND-POMDP (Network Distributed POMDP). Specifically, CBDP provides orders of magnitude of speedup in policy computation and generates better quality solutions for all test instances. It has linear complexity in the number of agents and horizon length. Furthermore, the complexity per horizon for the examined class of problems is exponential only in a small parameter that depends upon the interaction among the agents, achieving significant scalability for large, loosely coupled multi-agent systems. 
The efficiency of CBDP lies in exploiting the structure of interactions using constraint networks. These results extend significantly the effectiveness of decision-theoretic planning in multi-agent settings.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('971','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_971\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Decentralized partially observable MDPs (DEC-POMDPs) provide a rich framework for modeling decision making by a team of agents. Despite rapid progress in this area, the limited scalability of solution techniques has restricted the applicability of the model. To overcome this computational barrier, research has focused on restricted classes of DEC-POMDPs, which are easier to solve yet rich enough to capture many practical problems. We present CBDP, an efficient and scalable point-based dynamic programming algorithm for one such model called ND-POMDP (Network Distributed POMDP). Specifically, CBDP provides orders of magnitude of speedup in policy computation and generates better quality solutions for all test instances. It has linear complexity in the number of agents and horizon length. Furthermore, the complexity per horizon for the examined class of problems is exponential only in a small parameter that depends upon the interaction among the agents, achieving significant scalability for large, loosely coupled multi-agent systems. The efficiency of CBDP lies in exploiting the structure of interactions using constraint networks. 
These results extend significantly the effectiveness of decision-theoretic planning in multi-agent settings.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('971','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_971\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ADZicaps09.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ADZicaps09.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ADZicaps09.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('971','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wu, Feng;  Zilberstein, Shlomo;  Chen, Xiaoping<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('972','tp_links')\" style=\"cursor:pointer;\">Multi-Agent Online Planning with Communication<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 19th International Conference on Automated Planning and Scheduling (ICAPS), <\/span><span class=\"tp_pub_additional_address\">Thessaloniki, Greece, <\/span><span class=\"tp_pub_additional_year\">2009<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_972\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('972','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_972\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('972','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_972\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('972','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_972\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WZCicaps09,<br \/>\r\ntitle = {Multi-Agent Online Planning with Communication},<br \/>\r\nauthor = {Feng Wu and Shlomo Zilberstein and Xiaoping Chen},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCicaps09.pdf},<br \/>\r\nyear  = {2009},<br \/>\r\ndate = {2009-01-01},<br \/>\r\nbooktitle = {Proceedings of the 19th International Conference on Automated Planning and Scheduling (ICAPS)},<br \/>\r\npages = {321--329},<br \/>\r\naddress = {Thessaloniki, Greece},<br \/>\r\nabstract = {We propose an online algorithm for planning under uncertainty in multi-agent settings modeled as DEC-POMDPs. The algorithm helps overcome the high computational complexity of solving such problems off-line. The key challenge is to produce coordinated behavior using little or no communication. When communication is allowed but constrained, the challenge is to produce high value with minimal communication. The algorithm addresses these challenges by communicating only when history inconsistency is detected, allowing communication to be postponed if necessary. Moreover, it bounds the memory usage at each step and can be applied to problems with arbitrary horizons. 
The experimental results confirm that the algorithm can solve problems that are too large for the best existing off-line planning algorithms and it outperforms the best online method, producing higher value with much less communication in most cases.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('972','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_972\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We propose an online algorithm for planning under uncertainty in multi-agent settings modeled as DEC-POMDPs. The algorithm helps overcome the high computational complexity of solving such problems off-line. The key challenge is to produce coordinated behavior using little or no communication. When communication is allowed but constrained, the challenge is to produce high value with minimal communication. The algorithm addresses these challenges by communicating only when history inconsistency is detected, allowing communication to be postponed if necessary. Moreover, it bounds the memory usage at each step and can be applied to problems with arbitrary horizons. 
The experimental results confirm that the algorithm can solve problems that are too large for the best existing off-line planning algorithms and it outperforms the best online method, producing higher value with much less communication in most cases.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('972','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_972\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCicaps09.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCicaps09.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZCicaps09.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('972','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Allen, Martin;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('976','tp_links')\" style=\"cursor:pointer;\">Complexity of Decentralized Control: Special Cases<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 23rd Neural Information Processing Systems Conference (NIPS), <\/span><span class=\"tp_pub_additional_address\">Vancouver, British Columbia, Canada, <\/span><span class=\"tp_pub_additional_year\">2009<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_976\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('976','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_976\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('976','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_976\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('976','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_976\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:AZnips09,<br \/>\r\ntitle = {Complexity of Decentralized Control: Special Cases},<br \/>\r\nauthor = {Martin Allen and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZnips09.pdf},<br \/>\r\nyear  = {2009},<br \/>\r\ndate = {2009-01-01},<br \/>\r\nbooktitle = {Proceedings of the 23rd Neural Information Processing Systems Conference (NIPS)},<br \/>\r\npages = {19--27},<br \/>\r\naddress = {Vancouver, British Columbia, Canada},<br \/>\r\nabstract = {The worst-case complexity of general decentralized POMDPs, which are equivalent to partially observable stochastic games (POSGs), is very high, both for the cooperative and competitive cases. Some reductions in complexity have been achieved by exploiting independence relations in some models. We show that these results are somewhat limited: when these independence assumptions are relaxed in very small ways, complexity returns to that of the general case.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('976','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_976\" style=\"display:none;\"><div class=\"tp_abstract_entry\">The worst-case complexity of general decentralized POMDPs, which are equivalent to partially observable stochastic games (POSGs), is very high, both for the cooperative and competitive cases. 
Some reductions in complexity have been achieved by exploiting independence relations in some models. We show that these results are somewhat limited: when these independence assumptions are relaxed in very small ways, complexity returns to that of the general case.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('976','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_976\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZnips09.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZnips09.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZnips09.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('976','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Goldman, Claudia V;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('978','tp_links')\" style=\"cursor:pointer;\">Communication-Based Decomposition Mechanisms for Decentralized MDPs<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Journal of Artificial Intelligence Research (JAIR), <\/span><span class=\"tp_pub_additional_volume\">vol. 32, <\/span><span class=\"tp_pub_additional_pages\">pp. 
169\u2013202, <\/span><span class=\"tp_pub_additional_year\">2008<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_978\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('978','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_978\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('978','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_978\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('978','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_978\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:GZjair08,<br \/>\r\ntitle = {Communication-Based Decomposition Mechanisms for Decentralized MDPs},<br \/>\r\nauthor = {Claudia V Goldman and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GZjair08.pdf},<br \/>\r\ndoi = {10.1613\/jair.2466},<br \/>\r\nyear  = {2008},<br \/>\r\ndate = {2008-01-01},<br \/>\r\njournal = {Journal of Artificial Intelligence Research (JAIR)},<br \/>\r\nvolume = {32},<br \/>\r\npages = {169--202},<br \/>\r\nabstract = {Multi-agent planning in stochastic environments can be framed formally as a decentralized Markov decision problem. Many real-life distributed problems that arise in manufacturing, multi-robot coordination and information gathering scenarios can be formalized using this framework. However, finding the optimal solution in the general case is hard, limiting the applicability of recently developed algorithms. This paper provides a practical approach for solving decentralized control problems when communication among the decision makers is possible, but costly. 
We develop the notion of communication-based mechanism that allows us to decompose a decentralized MDP into multiple single-agent problems. In this framework, referred to as decentralized semi-Markov decision process with direct communication (Dec-SMDP-Com), agents operate separately between communications. We show that finding an optimal mechanism is equivalent to solving optimally a Dec-SMDP-Com. We also provide a heuristic search algorithm that converges on the optimal decomposition. Restricting the decomposition to some specific types of local behaviors reduces significantly the complexity of planning. In particular, we present a polynomial-time algorithm for the case in which individual agents perform goal-oriented behaviors between communications. The paper concludes with an additional tractable algorithm that enables the introduction of human knowledge, thereby reducing the overall problem to finding the best time to communicate. Empirical results show that these approaches provide good approximate solutions.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('978','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_978\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Multi-agent planning in stochastic environments can be framed formally as a decentralized Markov decision problem. Many real-life distributed problems that arise in manufacturing, multi-robot coordination and information gathering scenarios can be formalized using this framework. However, finding the optimal solution in the general case is hard, limiting the applicability of recently developed algorithms. This paper provides a practical approach for solving decentralized control problems when communication among the decision makers is possible, but costly. 
We develop the notion of communication-based mechanism that allows us to decompose a decentralized MDP into multiple single-agent problems. In this framework, referred to as decentralized semi-Markov decision process with direct communication (Dec-SMDP-Com), agents operate separately between communications. We show that finding an optimal mechanism is equivalent to solving optimally a Dec-SMDP-Com. We also provide a heuristic search algorithm that converges on the optimal decomposition. Restricting the decomposition to some specific types of local behaviors reduces significantly the complexity of planning. In particular, we present a polynomial-time algorithm for the case in which individual agents perform goal-oriented behaviors between communications. The paper concludes with an additional tractable algorithm that enables the introduction of human knowledge, thereby reducing the overall problem to finding the best time to communicate. Empirical results show that these approaches provide good approximate solutions.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('978','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_978\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GZjair08.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GZjair08.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GZjair08.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1613\/jair.2466\" title=\"Follow DOI:10.1613\/jair.2466\" target=\"_blank\">doi:10.1613\/jair.2466<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('978','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p 
class=\"tp_pub_author\"> Seuken, Sven;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('979','tp_links')\" style=\"cursor:pointer;\">Formal Models and Algorithms for Decentralized Decision Making under Uncertainty<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Autonomous Agents and Multi-Agent Systems (JAAMAS), <\/span><span class=\"tp_pub_additional_volume\">vol. 17, <\/span><span class=\"tp_pub_additional_number\">no. 2, <\/span><span class=\"tp_pub_additional_pages\">pp. 190\u2013250, <\/span><span class=\"tp_pub_additional_year\">2008<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_979\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('979','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_979\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('979','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_979\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('979','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_979\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:SZjaamas08,<br \/>\r\ntitle = {Formal Models and Algorithms for Decentralized Decision Making under Uncertainty},<br \/>\r\nauthor = {Sven Seuken and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZjaamas08.pdf},<br \/>\r\ndoi = {10.1007\/s10458-007-9026-5},<br \/>\r\nyear  = {2008},<br \/>\r\ndate = {2008-01-01},<br \/>\r\njournal = {Autonomous Agents and Multi-Agent Systems (JAAMAS)},<br \/>\r\nvolume = 
{17},<br \/>\r\nnumber = {2},<br \/>\r\npages = {190--250},<br \/>\r\nabstract = {Multi-agent planning in stochastic environments can be framed formally as a decentralized Markov decision problem. Many real-life distributed problems that arise in manufacturing, multi-robot coordination and information gathering scenarios can be formalized using this framework. However, finding the optimal solution in the general case is hard, limiting the applicability of recently developed algorithms. This paper provides a practical approach for solving decentralized control problems when communication among the decision makers is possible, but costly. We develop the notion of communication-based mechanism that allows us to decompose a decentralized MDP into multiple single-agent problems. In this framework, referred to as decentralized semi-Markov decision process with direct communication (Dec-SMDP-Com), agents operate separately between communications. We show that finding an optimal mechanism is equivalent to solving optimally a Dec-SMDP-Com. We also provide a heuristic search algorithm that converges on the optimal decomposition. Restricting the decomposition to some specific types of local behaviors reduces significantly the complexity of planning. In particular, we present a polynomial-time algorithm for the case in which individual agents perform goal-oriented behaviors between communications. The paper concludes with an additional tractable algorithm that enables the introduction of human knowledge, thereby reducing the overall problem to finding the best time to communicate. 
Empirical results show that these approaches provide good approximate solutions.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('979','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_979\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Multi-agent planning in stochastic environments can be framed formally as a decentralized Markov decision problem. Many real-life distributed problems that arise in manufacturing, multi-robot coordination and information gathering scenarios can be formalized using this framework. However, finding the optimal solution in the general case is hard, limiting the applicability of recently developed algorithms. This paper provides a practical approach for solving decentralized control problems when communication among the decision makers is possible, but costly. We develop the notion of communication-based mechanism that allows us to decompose a decentralized MDP into multiple single-agent problems. In this framework, referred to as decentralized semi-Markov decision process with direct communication (Dec-SMDP-Com), agents operate separately between communications. We show that finding an optimal mechanism is equivalent to solving optimally a Dec-SMDP-Com. We also provide a heuristic search algorithm that converges on the optimal decomposition. Restricting the decomposition to some specific types of local behaviors reduces significantly the complexity of planning. In particular, we present a polynomial-time algorithm for the case in which individual agents perform goal-oriented behaviors between communications. The paper concludes with an additional tractable algorithm that enables the introduction of human knowledge, thereby reducing the overall problem to finding the best time to communicate. 
Empirical results show that these approaches provide good approximate solutions.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('979','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_979\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZjaamas08.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZjaamas08.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZjaamas08.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1007\/s10458-007-9026-5\" title=\"Follow DOI:10.1007\/s10458-007-9026-5\" target=\"_blank\">doi:10.1007\/s10458-007-9026-5<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('979','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Petrik, Marek;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('980','tp_links')\" style=\"cursor:pointer;\">A Successive Approximation Algorithm for Coordination Problems<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 10th International Symposium on Artificial Intelligence and Mathematics (ISAIM), <\/span><span class=\"tp_pub_additional_address\">Ft. 
Lauderdale, Florida, <\/span><span class=\"tp_pub_additional_year\">2008<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_980\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('980','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_980\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('980','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_980\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('980','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_980\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:PZisaim08,<br \/>\r\ntitle = {A Successive Approximation Algorithm for Coordination Problems},<br \/>\r\nauthor = {Marek Petrik and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZisaim08.pdf},<br \/>\r\nyear  = {2008},<br \/>\r\ndate = {2008-01-01},<br \/>\r\nbooktitle = {Proceedings of the 10th International Symposium on Artificial Intelligence and Mathematics (ISAIM)},<br \/>\r\naddress = {Ft. Lauderdale, Florida},<br \/>\r\nabstract = {Developing scalable coordination algorithms for multi-agent systems is a hard computational challenge. One useful approach, demonstrated by the Coverage Set Algorithm (CSA), exploits structured interaction to produce significant computational gains. Empirically, CSA exhibits very good anytime performance, but an error bound on the results has not been established. We reformulate the algorithm and derive an online error bound for approximate solutions. Moreover, we propose an effective way to automatically reduce the complexity of the interaction. 
Our experiments show that this is a promising approach to solve a broad class of decentralized decision problems. The general formulation used by the algorithm makes it both easy to implement and widely applicable to a variety of other AI problems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('980','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_980\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Developing scalable coordination algorithms for multi-agent systems is a hard computational challenge. One useful approach, demonstrated by the Coverage Set Algorithm (CSA), exploits structured interaction to produce significant computational gains. Empirically, CSA exhibits very good anytime performance, but an error bound on the results has not been established. We reformulate the algorithm and derive an online error bound for approximate solutions. Moreover, we propose an effective way to automatically reduce the complexity of the interaction. Our experiments show that this is a promising approach to solve a broad class of decentralized decision problems. 
The general formulation used by the algorithm makes it both easy to implement and widely applicable to a variety of other AI problems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('980','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_980\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZisaim08.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZisaim08.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZisaim08.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('980','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Carlin, Alan;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('982','tp_links')\" style=\"cursor:pointer;\">Value-Based Observation Compression for DEC-POMDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 7th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Estoril, Portugal, <\/span><span class=\"tp_pub_additional_year\">2008<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_982\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('982','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_982\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('982','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_982\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('982','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_982\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:CZaamas08,<br \/>\r\ntitle = {Value-Based Observation Compression for DEC-POMDPs},<br \/>\r\nauthor = {Alan Carlin and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CZaamas08.pdf},<br \/>\r\nyear  = {2008},<br \/>\r\ndate = {2008-01-01},<br \/>\r\nbooktitle = {Proceedings of the 7th International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},<br \/>\r\npages = {501--508},<br \/>\r\naddress = {Estoril, Portugal},<br \/>\r\nabstract = {Representing agent policies compactly is essential for improving the scalability of multi-agent planning algorithms. In this paper, we focus on developing a pruning technique that allows us to merge certain observations within agent policies, while minimizing loss of value. This is particularly important for solving finite-horizon decentralized POMDPs, where agent policies are represented as trees, and where the size of policy trees grows exponentially with the number of observations. We introduce a value-based observation compression technique that prunes the least valuable observations while maintaining an error bound on the value lost as a result of pruning. We analyze the characteristics of this pruning strategy and show empirically that it is effective. 
Thus, we use compact policies to obtain significantly higher values compared with the best existing DEC-POMDP algorithm.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('982','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_982\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Representing agent policies compactly is essential for improving the scalability of multi-agent planning algorithms. In this paper, we focus on developing a pruning technique that allows us to merge certain observations within agent policies, while minimizing loss of value. This is particularly important for solving finite-horizon decentralized POMDPs, where agent policies are represented as trees, and where the size of policy trees grows exponentially with the number of observations. We introduce a value-based observation compression technique that prunes the least valuable observations while maintaining an error bound on the value lost as a result of pruning. We analyze the characteristics of this pruning strategy and show empirically that it is effective. 
Thus, we use compact policies to obtain significantly higher values compared with the best existing DEC-POMDP algorithm.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('982','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_982\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CZaamas08.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CZaamas08.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CZaamas08.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('982','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Carlin, Alan;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('984','tp_links')\" style=\"cursor:pointer;\">Observation Compression in DEC-POMDP Policy Trees<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">AAMAS Workshop on Multi-Agent Sequential Decision Making in Uncertain Domains (MSDM), <\/span><span class=\"tp_pub_additional_address\">Estoril, Portugal, <\/span><span class=\"tp_pub_additional_year\">2008<\/span><span class=\"tp_pub_additional_note\">, (Best Paper Award)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_984\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('984','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_984\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('984','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_984\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('984','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_984\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:CZmsdm08,<br \/>\r\ntitle = {Observation Compression in DEC-POMDP Policy Trees},<br \/>\r\nauthor = {Alan Carlin and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CZmsdm08.pdf},<br \/>\r\nyear  = {2008},<br \/>\r\ndate = {2008-01-01},<br \/>\r\nbooktitle = {AAMAS Workshop on Multi-Agent Sequential Decision Making in Uncertain Domains (MSDM)},<br \/>\r\npages = {31--45},<br \/>\r\naddress = {Estoril, Portugal},<br \/>\r\nabstract = {Representing agent policies compactly is essential for improving the scalability of multi-agent planning algorithms. In this paper, we focus on developing a pruning technique that allows us to merge certain observations from agent policies, while minimizing the loss of value. This is particularly important for solving finite-horizon decentralized POMDPs, where agent policies are represented as trees, and where the size of policy trees grows exponentially with the number of observations. We introduce a value-based observation compression technique that prunes the least valuable observations while maintaining an error bound on the value lost as a result of pruning. We analyze the characteristics of this pruning strategy and show empirically that it is effective. 
Thus, we use compact policies to obtain significantly higher values compared with the best existing DEC-POMDP algorithm.},<br \/>\r\nnote = {Best Paper Award},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('984','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_984\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Representing agent policies compactly is essential for improving the scalability of multi-agent planning algorithms. In this paper, we focus on developing a pruning technique that allows us to merge certain observations from agent policies, while minimizing the loss of value. This is particularly important for solving finite-horizon decentralized POMDPs, where agent policies are represented as trees, and where the size of policy trees grows exponentially with the number of observations. We introduce a value-based observation compression technique that prunes the least valuable observations while maintaining an error bound on the value lost as a result of pruning. We analyze the characteristics of this pruning strategy and show empirically that it is effective. 
Thus, we use compact policies to obtain significantly higher values compared with the best existing DEC-POMDP algorithm.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('984','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_984\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CZmsdm08.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CZmsdm08.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CZmsdm08.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('984','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Amato, Christopher;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('990','tp_links')\" style=\"cursor:pointer;\">What's Worth Memorizing: Attribute-based Planning for DEC-POMDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">ICAPS Workshop on Multiagent Planning, <\/span><span class=\"tp_pub_additional_address\">Sydney, Australia, <\/span><span class=\"tp_pub_additional_year\">2008<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_990\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('990','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_990\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('990','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_990\" 
class=\"tp_show\" onclick=\"teachpress_pub_showhide('990','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_990\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:AZmasplan08,<br \/>\r\ntitle = {What's Worth Memorizing: Attribute-based Planning for DEC-POMDPs},<br \/>\r\nauthor = {Christopher Amato and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZmasplan08.pdf},<br \/>\r\nyear  = {2008},<br \/>\r\ndate = {2008-01-01},<br \/>\r\nbooktitle = {ICAPS Workshop on Multiagent Planning},<br \/>\r\naddress = {Sydney, Australia},<br \/>\r\nabstract = {Current algorithms for decentralized partially observable Markov decision processes (DEC-POMDPs) require a large amount of memory to produce high quality plans. To combat this, existing methods optimize a set of finite-state controllers with an arbitrary amount of fixed memory. While this works well for some problems, in general, scalability and solution quality remain limited. As an alternative, we propose remembering some attributes that summarize key aspects of an agent's action and observation history. These attributes are often simple to determine, provide a well-motivated choice of controller size and focus the solution search on important components of agent histories. 
We show that for a range of DEC-POMDPs such attribute-based representation improves plan quality and scalability.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('990','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_990\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Current algorithms for decentralized partially observable Markov decision processes (DEC-POMDPs) require a large amount of memory to produce high quality plans. To combat this, existing methods optimize a set of finite-state controllers with an arbitrary amount of fixed memory. While this works well for some problems, in general, scalability and solution quality remain limited. As an alternative, we propose remembering some attributes that summarize key aspects of an agent's action and observation history. These attributes are often simple to determine, provide a well-motivated choice of controller size and focus the solution search on important components of agent histories. 
We show that for a range of DEC-POMDPs such attribute-based representation improves plan quality and scalability.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('990','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_990\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZmasplan08.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZmasplan08.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZmasplan08.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('990','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Goldman, Claudia V;  Allen, Martin;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('992','tp_links')\" style=\"cursor:pointer;\">Learning to Communicate in a Decentralized Environment<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Autonomous Agents and Multi-Agent Systems (JAAMAS), <\/span><span class=\"tp_pub_additional_volume\">vol. 15, <\/span><span class=\"tp_pub_additional_number\">no. 1, <\/span><span class=\"tp_pub_additional_pages\">pp. 
47\u201390, <\/span><span class=\"tp_pub_additional_year\">2007<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_992\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('992','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_992\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('992','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_992\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('992','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_992\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:GAZjaamas07,<br \/>\r\ntitle = {Learning to Communicate in a Decentralized Environment},<br \/>\r\nauthor = {Claudia V Goldman and Martin Allen and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GAZjaamas07.pdf},<br \/>\r\ndoi = {10.1007\/s10458-006-0008-9},<br \/>\r\nyear  = {2007},<br \/>\r\ndate = {2007-01-01},<br \/>\r\njournal = {Autonomous Agents and Multi-Agent Systems (JAAMAS)},<br \/>\r\nvolume = {15},<br \/>\r\nnumber = {1},<br \/>\r\npages = {47--90},<br \/>\r\nabstract = {Learning to communicate is an emerging challenge in AI research. It is known that agents interacting in decentralized, stochastic environments can benefit from exchanging information. Multi-agent planning generally assumes that agents share a common means of communication; however, in building robust distributed systems it is important to address potential miscoordination resulting from misinterpretation of messages exchanged. This paper lays foundations for studying this problem, examining its properties analytically and empirically in a decision-theoretic context. 
We establish a formal framework for the problem, and identify a collection of necessary and sufficient properties for decision problems that allow agents to employ probabilistic updating schemes in order to learn how to interpret what others are communicating. Solving the problem optimally is often intractable, but our approach enables agents using different languages to converge upon coordination over time. Our experimental work establishes how these methods perform when applied to problems of varying complexity.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('992','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_992\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Learning to communicate is an emerging challenge in AI research. It is known that agents interacting in decentralized, stochastic environments can benefit from exchanging information. Multi-agent planning generally assumes that agents share a common means of communication; however, in building robust distributed systems it is important to address potential miscoordination resulting from misinterpretation of messages exchanged. This paper lays foundations for studying this problem, examining its properties analytically and empirically in a decision-theoretic context. We establish a formal framework for the problem, and identify a collection of necessary and sufficient properties for decision problems that allow agents to employ probabilistic updating schemes in order to learn how to interpret what others are communicating. Solving the problem optimally is often intractable, but our approach enables agents using different languages to converge upon coordination over time. 
Our experimental work establishes how these methods perform when applied to problems of varying complexity.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('992','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_992\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GAZjaamas07.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GAZjaamas07.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GAZjaamas07.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1007\/s10458-006-0008-9\" title=\"Follow DOI:10.1007\/s10458-006-0008-9\" target=\"_blank\">doi:10.1007\/s10458-006-0008-9<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('992','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Seuken, Sven;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('997','tp_links')\" style=\"cursor:pointer;\">Memory-Bounded Dynamic Programming for DEC-POMDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Hyderabad, India, <\/span><span class=\"tp_pub_additional_year\">2007<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_997\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('997','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a 
id=\"tp_links_sh_997\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('997','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_997\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('997','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_997\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SZijcai07,<br \/>\r\ntitle = {Memory-Bounded Dynamic Programming for DEC-POMDPs},<br \/>\r\nauthor = {Sven Seuken and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZijcai07.pdf},<br \/>\r\nyear  = {2007},<br \/>\r\ndate = {2007-01-01},<br \/>\r\nbooktitle = {Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {2009--2015},<br \/>\r\naddress = {Hyderabad, India},<br \/>\r\nabstract = {Decentralized decision making under uncertainty has been shown to be intractable when each agent has different partial information about the domain. Thus, improving the applicability and scalability of planning algorithms is an important challenge. We present the first memory-bounded dynamic programming algorithm for finite-horizon decentralized POMDPs. A set of heuristics is used to identify relevant points of the infinitely large belief space. Using these belief points, the algorithm successively selects the best joint policies for each horizon. The algorithm is extremely efficient, having linear time and space complexity with respect to the horizon length. Experimental results show that it can handle horizons that are multiple orders of magnitude larger than what was previously possible, while achieving the same or better solution quality. 
These results significantly increase the applicability of decentralized decision-making techniques.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('997','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_997\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Decentralized decision making under uncertainty has been shown to be intractable when each agent has different partial information about the domain. Thus, improving the applicability and scalability of planning algorithms is an important challenge. We present the first memory-bounded dynamic programming algorithm for finite-horizon decentralized POMDPs. A set of heuristics is used to identify relevant points of the infinitely large belief space. Using these belief points, the algorithm successively selects the best joint policies for each horizon. The algorithm is extremely efficient, having linear time and space complexity with respect to the horizon length. Experimental results show that it can handle horizons that are multiple orders of magnitude larger than what was previously possible, while achieving the same or better solution quality. 
These results significantly increase the applicability of decentralized decision-making techniques.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('997','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_997\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZijcai07.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZijcai07.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZijcai07.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('997','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Amato, Christopher;  Bernstein, Daniel S;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('999','tp_links')\" style=\"cursor:pointer;\">Optimizing Memory-Bounded Controllers for Decentralized POMDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 23rd Conference on Uncertainty in Artificial Intelligence (UAI), <\/span><span class=\"tp_pub_additional_address\">Vancouver, British Columbia, <\/span><span class=\"tp_pub_additional_year\">2007<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_999\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('999','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_999\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('999','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_999\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('999','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_999\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:ABZuai07,<br \/>\r\ntitle = {Optimizing Memory-Bounded Controllers for Decentralized POMDPs},<br \/>\r\nauthor = {Christopher Amato and Daniel S Bernstein and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZuai07.pdf},<br \/>\r\nyear  = {2007},<br \/>\r\ndate = {2007-01-01},<br \/>\r\nbooktitle = {Proceedings of the 23rd Conference on Uncertainty in Artificial Intelligence (UAI)},<br \/>\r\npages = {1--8},<br \/>\r\naddress = {Vancouver, British Columbia},<br \/>\r\nabstract = {We present a memory-bounded optimization approach for solving infinite-horizon decentralized POMDPs. Policies for each agent are represented by stochastic finite state controllers. We formulate the problem of optimizing these policies as a nonlinear program, leveraging powerful existing nonlinear optimization techniques for solving the problem. While existing solvers only guarantee locally optimal solutions, we show that our formulation produces higher quality controllers than the state-of-the-art approach. We also incorporate a shared source of randomness in the form of a correlation device to further increase solution quality with only a limited increase in space and time. 
Our experimental results show that nonlinear optimization can be used to provide high quality, concise solutions to decentralized decision problems under uncertainty.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('999','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_999\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We present a memory-bounded optimization approach for solving infinite-horizon decentralized POMDPs. Policies for each agent are represented by stochastic finite state controllers. We formulate the problem of optimizing these policies as a nonlinear program, leveraging powerful existing nonlinear optimization techniques for solving the problem. While existing solvers only guarantee locally optimal solutions, we show that our formulation produces higher quality controllers than the state-of-the-art approach. We also incorporate a shared source of randomness in the form of a correlation device to further increase solution quality with only a limited increase in space and time. 
Our experimental results show that nonlinear optimization can be used to provide high quality, concise solutions to decentralized decision problems under uncertainty.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('999','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_999\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZuai07.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZuai07.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/ABZuai07.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('999','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Seuken, Sven;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1000','tp_links')\" style=\"cursor:pointer;\">Improved Memory-Bounded Dynamic Programming for Decentralized POMDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 23rd Conference on Uncertainty in Artificial Intelligence (UAI), <\/span><span class=\"tp_pub_additional_address\">Vancouver, British Columbia, <\/span><span class=\"tp_pub_additional_year\">2007<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1000\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1000','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1000\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1000','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1000\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1000','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1000\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SZuai07,<br \/>\r\ntitle = {Improved Memory-Bounded Dynamic Programming for Decentralized POMDPs},<br \/>\r\nauthor = {Sven Seuken and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZuai07.pdf},<br \/>\r\nyear  = {2007},<br \/>\r\ndate = {2007-01-01},<br \/>\r\nbooktitle = {Proceedings of the 23rd Conference on Uncertainty in Artificial Intelligence (UAI)},<br \/>\r\npages = {344--351},<br \/>\r\naddress = {Vancouver, British Columbia},<br \/>\r\nabstract = {Memory-Bounded Dynamic Programming (MBDP) has proved extremely effective in solving decentralized POMDPs with large horizons. We generalize the algorithm and improve its scalability by reducing the complexity with respect to the number of observations from exponential to polynomial. We derive error bounds on solution quality with respect to this new approximation and analyze the convergence behavior. To evaluate the effectiveness of the improvements, we introduce a new, larger benchmark problem. 
Experimental results show that despite the high complexity of decentralized POMDPs, scalable solution techniques such as MBDP perform surprisingly well.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1000','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1000\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Memory-Bounded Dynamic Programming (MBDP) has proved extremely effective in solving decentralized POMDPs with large horizons. We generalize the algorithm and improve its scalability by reducing the complexity with respect to the number of observations from exponential to polynomial. We derive error bounds on solution quality with respect to this new approximation and analyze the convergence behavior. To evaluate the effectiveness of the improvements, we introduce a new, larger benchmark problem. 
Experimental results show that despite the high complexity of decentralized POMDPs, scalable solution techniques such as MBDP perform surprisingly well.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1000','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1000\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZuai07.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZuai07.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZuai07.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1000','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Allen, Martin;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1001','tp_links')\" style=\"cursor:pointer;\">Agent Influence as a Predictor of Difficulty for Decentralized Problem-Solving<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 22nd Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">Vancouver, British Columbia, <\/span><span class=\"tp_pub_additional_year\">2007<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1001\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1001','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1001\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1001','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1001\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1001','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1001\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:AZaaai07,<br \/>\r\ntitle = {Agent Influence as a Predictor of Difficulty for Decentralized Problem-Solving},<br \/>\r\nauthor = {Martin Allen and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZaaai07.pdf},<br \/>\r\nyear  = {2007},<br \/>\r\ndate = {2007-01-01},<br \/>\r\nbooktitle = {Proceedings of the 22nd Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {688--693},<br \/>\r\naddress = {Vancouver, British Columbia},<br \/>\r\nabstract = {We study the effect of problem structure on the practical performance of optimal dynamic programming for decentralized decision problems. It is shown that restricting agent influence over problem dynamics can make the problem easier to solve. Experimental results establish that agent influence correlates with problem difficulty: as the gap between the influence of different agents grows, problems tend to become much easier to solve. The measure thus provides a general-purpose, automatic characterization of decentralized problems, identifying those for which optimal methods are more or less likely to work. 
Such a measure is also of possible use as a heuristic in the design of algorithms that create task decompositions and control hierarchies in order to simplify multiagent problems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1001','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1001\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We study the effect of problem structure on the practical performance of optimal dynamic programming for decentralized decision problems. It is shown that restricting agent influence over problem dynamics can make the problem easier to solve. Experimental results establish that agent influence correlates with problem difficulty: as the gap between the influence of different agents grows, problems tend to become much easier to solve. The measure thus provides a general-purpose, automatic characterization of decentralized problems, identifying those for which optimal methods are more or less likely to work. 
Such a measure is also of possible use as a heuristic in the design of algorithms that create task decompositions and control hierarchies in order to simplify multiagent problems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1001','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1001\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZaaai07.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZaaai07.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/AZaaai07.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1001','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Petrik, Marek;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1002','tp_links')\" style=\"cursor:pointer;\">Anytime Coordination Using Separable Bilinear Programs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 22nd Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">Vancouver, British Columbia, <\/span><span class=\"tp_pub_additional_year\">2007<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1002\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1002','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1002\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1002','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1002\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1002','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1002\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:PZaaai07,<br \/>\r\ntitle = {Anytime Coordination Using Separable Bilinear Programs},<br \/>\r\nauthor = {Marek Petrik and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZaaai07.pdf},<br \/>\r\nyear  = {2007},<br \/>\r\ndate = {2007-01-01},<br \/>\r\nbooktitle = {Proceedings of the 22nd Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {750--755},<br \/>\r\naddress = {Vancouver, British Columbia},<br \/>\r\nabstract = {Developing scalable coordination algorithms for multi-agent systems is a hard computational challenge. One useful approach, demonstrated by the Coverage Set Algorithm (CSA), exploits structured interaction to produce significant computational gains. Empirically, CSA exhibits very good anytime performance, but an error bound on the results has not been established. We reformulate the algorithm and derive both online and offline error bounds for approximate solutions. Moreover, we propose an effective way to automatically reduce the complexity of the interaction. Our experiments show that this is a promising approach to solve a broad class of decentralized decision problems. 
The general formulation used by the algorithm makes it both easy to implement and widely applicable to a variety of other AI problems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1002','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1002\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Developing scalable coordination algorithms for multi-agent systems is a hard computational challenge. One useful approach, demonstrated by the Coverage Set Algorithm (CSA), exploits structured interaction to produce significant computational gains. Empirically, CSA exhibits very good anytime performance, but an error bound on the results has not been established. We reformulate the algorithm and derive both online and offline error bounds for approximate solutions. Moreover, we propose an effective way to automatically reduce the complexity of the interaction. Our experiments show that this is a promising approach to solve a broad class of decentralized decision problems. 
The general formulation used by the algorithm makes it both easy to implement and widely applicable to a variety of other AI problems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1002','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1002\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZaaai07.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZaaai07.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PZaaai07.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1002','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Szer, Daniel;  Charpillet, Francois;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1013','tp_links')\" style=\"cursor:pointer;\">MAA*: A Heuristic Search Algorithm for Solving Decentralized POMDPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 21st Conference on Uncertainty in Artificial Intelligence (UAI), <\/span><span class=\"tp_pub_additional_address\">Edinburgh, Scotland, <\/span><span class=\"tp_pub_additional_year\">2005<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1013\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1013','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1013\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1013','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1013\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1013','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1013\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SCZuai05,<br \/>\r\ntitle = {MAA*: A Heuristic Search Algorithm for Solving Decentralized POMDPs},<br \/>\r\nauthor = {Daniel Szer and Francois Charpillet and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SCZuai05.pdf},<br \/>\r\nyear  = {2005},<br \/>\r\ndate = {2005-01-01},<br \/>\r\nbooktitle = {Proceedings of the 21st Conference on Uncertainty in Artificial Intelligence (UAI)},<br \/>\r\npages = {576--583},<br \/>\r\naddress = {Edinburgh, Scotland},<br \/>\r\nabstract = {We present multi-agent A* (MAA*), the first complete and optimal heuristic search algorithm for solving decentralized partially-observable Markov decision problems (DEC-POMDPs) with finite horizon. The algorithm is suitable for computing optimal plans for a cooperative group of agents that operate in a stochastic environment such as multi-robot coordination, network traffic control, or distributed resource allocation. Solving such problems effectively is a major challenge in the area of planning under uncertainty. Our solution is based on a synthesis of classical heuristic search and decentralized control theory. Experimental results show that MAA* has significant advantages. 
We introduce an anytime variant of MAA* and conclude with a discussion of promising extensions such as an approach to solving infinite-horizon problems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1013','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1013\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We present multi-agent A* (MAA*), the first complete and optimal heuristic search algorithm for solving decentralized partially-observable Markov decision problems (DEC-POMDPs) with finite horizon. The algorithm is suitable for computing optimal plans for a cooperative group of agents that operate in a stochastic environment such as multi-robot coordination, network traffic control, or distributed resource allocation. Solving such problems effectively is a major challenge in the area of planning under uncertainty. Our solution is based on a synthesis of classical heuristic search and decentralized control theory. Experimental results show that MAA* has significant advantages. 
We introduce an anytime variant of MAA* and conclude with a discussion of promising extensions such as an approach to solving infinite-horizon problems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1013','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1013\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SCZuai05.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SCZuai05.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SCZuai05.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1013','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><\/table><div class=\"tablenav\"><div class=\"tablenav-pages\"><span class=\"displaying-num\">65 entries<\/span> <a class=\"page-numbers button disabled\">&laquo;<\/a> <a class=\"page-numbers button disabled\">&lsaquo;<\/a> 1 of 2 <a href=\"https:\/\/groups.cs.umass.edu\/shlomo\/research\/?limit=2&amp;tgid=&amp;yr=&amp;type=&amp;usr=&amp;auth=&amp;tsr=#tppubs\" title=\"next page\" class=\"page-numbers button\">&rsaquo;<\/a> <a href=\"https:\/\/groups.cs.umass.edu\/shlomo\/research\/?limit=2&amp;tgid=&amp;yr=&amp;type=&amp;usr=&amp;auth=&amp;tsr=#tppubs\" title=\"last page\" class=\"page-numbers button\">&raquo;<\/a> <\/div><\/div><\/div><\/div>\n<div><\/div><\/div><\/div>\n<\/div>\n<div>\n<h3><span style=\"color: #264278\"><b>Generalized Planning<\/b><\/span><\/h3>\n<div>\n<div>How can agents create generalized plans, which are algorithm-like plans that include loops and branches, can handle unknown quantities of objects, and work for large classes of problem instances?<\/div>\n<div><div class=\"bg-margin-for-link\"><input type='hidden' bg_collapse_expand='69d0b4f8296633020177767' value='69d0b4f8296633020177767'><input type='hidden' 
id='bg-show-more-text-69d0b4f8296633020177767' value='Show Related Publications'><input type='hidden' id='bg-show-less-text-69d0b4f8296633020177767' value='Hide Related Publications'><a id='bg-showmore-action-69d0b4f8296633020177767' class='bg-showmore-plg-link bg-arrow '  style=\" color:#7C2622;;\" href='#'>Show Related Publications<\/a><div id='bg-showmore-hidden-69d0b4f8296633020177767' ><div class=\"teachpress_pub_list\"><form name=\"tppublistform\" method=\"get\"><a name=\"tppubs\" id=\"tppubs\"><\/a><\/form><table class=\"teachpress_publication_list\"><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Bhatia, Abhinav;  Nashed, Samer B.;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1172','tp_links')\" style=\"cursor:pointer;\">RL3: Boosting Meta Reinforcement Learning via RL inside RL2<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">NeurIPS Workshop on Generalized Planning (GenPlan), <\/span><span class=\"tp_pub_additional_address\">New Orleans, Louisiana, <\/span><span class=\"tp_pub_additional_year\">2023<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1172\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1172','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1172\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1172','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1172\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1172','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1172\" 
style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:BNZgenplan23,<br \/>\r\ntitle = {RL3: Boosting Meta Reinforcement Learning via RL inside RL2},<br \/>\r\nauthor = {Abhinav Bhatia and Samer B. Nashed and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BNZgenplan23.pdf},<br \/>\r\nyear  = {2023},<br \/>\r\ndate = {2023-01-01},<br \/>\r\nurldate = {2023-01-01},<br \/>\r\nbooktitle = {NeurIPS Workshop on Generalized Planning (GenPlan)},<br \/>\r\naddress = {New Orleans, Louisiana},<br \/>\r\nabstract = {Meta reinforcement learning (meta-RL) methods such as RL2 have emerged as promising approaches for learning data-efficient RL algorithms tailored to a given task distribution. However, these RL algorithms struggle with long-horizon tasks and out-of-distribution tasks since they rely on recurrent neural networks to process the sequence of experiences instead of summarizing them into general RL components such as value functions. Moreover, even transformers have a practical limit to the length of histories they can efficiently reason about before training and inference costs become prohibitive. In contrast, traditional RL algorithms are data-inefficient since they do not leverage domain knowledge, but they do converge to an optimal policy as more data becomes available. In this paper, we propose RL3, a principled hybrid approach that combines traditional RL and meta-RL by incorporating task-specific action-values learned through traditional RL as an input to the meta-RL neural network. We show that RL3 earns greater cumulative reward on long-horizon and out-of-distribution tasks compared to RL2, while maintaining the efficiency of the latter in the short term. 
Experiments are conducted on both custom and benchmark discrete domains from the meta-RL literature that exhibit a range of short-term, long-term, and complex dependencies.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1172','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1172\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Meta reinforcement learning (meta-RL) methods such as RL2 have emerged as promising approaches for learning data-efficient RL algorithms tailored to a given task distribution. However, these RL algorithms struggle with long-horizon tasks and out-of-distribution tasks since they rely on recurrent neural networks to process the sequence of experiences instead of summarizing them into general RL components such as value functions. Moreover, even transformers have a practical limit to the length of histories they can efficiently reason about before training and inference costs become prohibitive. In contrast, traditional RL algorithms are data-inefficient since they do not leverage domain knowledge, but they do converge to an optimal policy as more data becomes available. In this paper, we propose RL3, a principled hybrid approach that combines traditional RL and meta-RL by incorporating task-specific action-values learned through traditional RL as an input to the meta-RL neural network. We show that RL3 earns greater cumulative reward on long-horizon and out-of-distribution tasks compared to RL2, while maintaining the efficiency of the latter in the short term. 
Experiments are conducted on both custom and benchmark discrete domains from the meta-RL literature that exhibit a range of short-term, long-term, and complex dependencies.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1172','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1172\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BNZgenplan23.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BNZgenplan23.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BNZgenplan23.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1172','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Srivastava, Siddharth;  Zilberstein, Shlomo;  Gupta, Abhishek;  Abbeel, Pieter;  Russell, Stuart J<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('901','tp_links')\" style=\"cursor:pointer;\">Tractability of Planning with Loops<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 29th Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">Austin, Texas, <\/span><span class=\"tp_pub_additional_year\">2015<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_901\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('901','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_901\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('901','tp_links')\" title=\"Show links and 
resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_901\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('901','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_901\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SZGARaaai15,<br \/>\r\ntitle = {Tractability of Planning with Loops},<br \/>\r\nauthor = {Siddharth Srivastava and Shlomo Zilberstein and Abhishek Gupta and Pieter Abbeel and Stuart J Russell},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZGARaaai15.pdf},<br \/>\r\nyear  = {2015},<br \/>\r\ndate = {2015-01-01},<br \/>\r\nbooktitle = {Proceedings of the 29th Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {3393--3401},<br \/>\r\naddress = {Austin, Texas},<br \/>\r\nabstract = {We create a unified framework for analyzing and synthesizing plans with loops for solving problems with non-deterministic numeric effects and a limited form of partial observability. Three different action models -- with deterministic, qualitative non-deterministic and Boolean non-deterministic semantics -- are handled using a single abstract representation. We establish the conditions under which the correctness and termination of solutions, represented as abstract policies, can be verified. We also examine the feasibility of learning abstract policies from examples. We demonstrate our techniques on several planning problems and show that they apply to challenging real-world tasks such as doing the laundry with a PR2 robot. 
These results resolve a number of open questions about planning with loops and facilitate the development of new algorithms and applications.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('901','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_901\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We create a unified framework for analyzing and synthesizing plans with loops for solving problems with non-deterministic numeric effects and a limited form of partial observability. Three different action models -- with deterministic, qualitative non-deterministic and Boolean non-deterministic semantics -- are handled using a single abstract representation. We establish the conditions under which the correctness and termination of solutions, represented as abstract policies, can be verified. We also examine the feasibility of learning abstract policies from examples. We demonstrate our techniques on several planning problems and show that they apply to challenging real-world tasks such as doing the laundry with a PR2 robot. 
These results resolve a number of open questions about planning with loops and facilitate the development of new algorithms and applications.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('901','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_901\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZGARaaai15.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZGARaaai15.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZGARaaai15.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('901','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Srivastava, Siddharth;  Immerman, Neil;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('928','tp_links')\" style=\"cursor:pointer;\">Applicability Conditions for Plans with Loops: Computability Results and Algorithms<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Artificial Intelligence (AIJ), <\/span><span class=\"tp_pub_additional_volume\">vol. 191, <\/span><span class=\"tp_pub_additional_pages\">pp. 
1\u201319, <\/span><span class=\"tp_pub_additional_year\">2012<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_928\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('928','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_928\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('928','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_928\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('928','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_928\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:SIZaij12,<br \/>\r\ntitle = {Applicability Conditions for Plans with Loops: Computability Results and Algorithms},<br \/>\r\nauthor = {Siddharth Srivastava and Neil Immerman and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaij12.pdf},<br \/>\r\ndoi = {10.1016\/j.artint.2012.07.005},<br \/>\r\nyear  = {2012},<br \/>\r\ndate = {2012-01-01},<br \/>\r\njournal = {Artificial Intelligence (AIJ)},<br \/>\r\nvolume = {191},<br \/>\r\npages = {1--19},<br \/>\r\nabstract = {The utility of including loops in plans has been long recognized by the planning community. Loops in a plan help increase both its applicability and the compactness of its representation. However, progress in finding such plans has been limited largely due to lack of methods for reasoning about the correctness and safety properties of loops of actions. We present novel algorithms for determining the applicability and progress made by a general class of loops of actions. 
These methods can be used for directing the search for plans with loops towards greater applicability while guaranteeing termination, as well as in post-processing of computed plans to precisely characterize their applicability. Experimental results demonstrate the efficiency of these algorithms. We also discuss the factors which can make the problem of determining applicability conditions for plans with loops incomputable.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('928','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_928\" style=\"display:none;\"><div class=\"tp_abstract_entry\">The utility of including loops in plans has been long recognized by the planning community. Loops in a plan help increase both its applicability and the compactness of its representation. However, progress in finding such plans has been limited largely due to lack of methods for reasoning about the correctness and safety properties of loops of actions. We present novel algorithms for determining the applicability and progress made by a general class of loops of actions. These methods can be used for directing the search for plans with loops towards greater applicability while guaranteeing termination, as well as in post-processing of computed plans to precisely characterize their applicability. Experimental results demonstrate the efficiency of these algorithms. 
We also discuss the factors which can make the problem of determining applicability conditions for plans with loops incomputable.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('928','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_928\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaij12.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaij12.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaij12.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1016\/j.artint.2012.07.005\" title=\"Follow DOI:10.1016\/j.artint.2012.07.005\" target=\"_blank\">doi:10.1016\/j.artint.2012.07.005<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('928','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Srivastava, Siddharth;  Immerman, Neil;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('932','tp_links')\" style=\"cursor:pointer;\">A New Representation and Associated Algorithms for Generalized Planning<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Artificial Intelligence (AIJ), <\/span><span class=\"tp_pub_additional_volume\">vol. 175, <\/span><span class=\"tp_pub_additional_number\">no. 2, <\/span><span class=\"tp_pub_additional_pages\">pp. 
615\u2013647, <\/span><span class=\"tp_pub_additional_year\">2011<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_932\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('932','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_932\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('932','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_932\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('932','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_932\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:SIZaij11,<br \/>\r\ntitle = {A New Representation and Associated Algorithms for Generalized Planning},<br \/>\r\nauthor = {Siddharth Srivastava and Neil Immerman and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaij11.pdf},<br \/>\r\ndoi = {10.1016\/j.artint.2010.10.006},<br \/>\r\nyear  = {2011},<br \/>\r\ndate = {2011-01-01},<br \/>\r\njournal = {Artificial Intelligence (AIJ)},<br \/>\r\nvolume = {175},<br \/>\r\nnumber = {2},<br \/>\r\npages = {615--647},<br \/>\r\nabstract = {Constructing plans that can handle multiple problem instances is a longstanding open problem in AI. We present a framework for generalized planning that captures the notion of algorithm-like plans and unifies various approaches developed for addressing this problem. Using this framework, and building on the TVLA system for static analysis of programs, we develop a novel approach for computing generalizations of classical plans by identifying sequences of actions that will make measurable progress when placed in a loop. 
In a wide class of problems that we characterize formally in the paper, these methods allow us to find generalized plans with loops for solving problem instances of unbounded sizes and also to determine the correctness and applicability of the computed generalized plans. We demonstrate the scope and scalability of the proposed approach on a wide range of planning problems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('932','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_932\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Constructing plans that can handle multiple problem instances is a longstanding open problem in AI. We present a framework for generalized planning that captures the notion of algorithm-like plans and unifies various approaches developed for addressing this problem. Using this framework, and building on the TVLA system for static analysis of programs, we develop a novel approach for computing generalizations of classical plans by identifying sequences of actions that will make measurable progress when placed in a loop. In a wide class of problems that we characterize formally in the paper, these methods allow us to find generalized plans with loops for solving problem instances of unbounded sizes and also to determine the correctness and applicability of the computed generalized plans. 
We demonstrate the scope and scalability of the proposed approach on a wide range of planning problems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('932','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_932\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaij11.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaij11.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaij11.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1016\/j.artint.2010.10.006\" title=\"Follow DOI:10.1016\/j.artint.2010.10.006\" target=\"_blank\">doi:10.1016\/j.artint.2010.10.006<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('932','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Srivastava, Siddharth;  Immerman, Neil;  Zilberstein, Shlomo;  Zhang, Tianjiao<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('937','tp_links')\" style=\"cursor:pointer;\">Directed Search for Generalized Plans Using Classical Planners<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 21st International Conference on Automated Planning and Scheduling (ICAPS), <\/span><span class=\"tp_pub_additional_address\">Freiburg, Germany, <\/span><span class=\"tp_pub_additional_year\">2011<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_937\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('937','tp_abstract')\" title=\"Show abstract\" 
style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_937\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('937','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_937\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('937','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_937\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SIZZicaps11,<br \/>\r\ntitle = {Directed Search for Generalized Plans Using Classical Planners},<br \/>\r\nauthor = {Siddharth Srivastava and Neil Immerman and Shlomo Zilberstein and Tianjiao Zhang},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZZicaps11.pdf},<br \/>\r\nyear  = {2011},<br \/>\r\ndate = {2011-01-01},<br \/>\r\nbooktitle = {Proceedings of the 21st International Conference on Automated Planning and Scheduling (ICAPS)},<br \/>\r\npages = {226--233},<br \/>\r\naddress = {Freiburg, Germany},<br \/>\r\nabstract = {We consider the problem of finding generalized plans for situations where the number of objects may be unknown and unbounded during planning. The input is a domain specification, a goal condition, and a class of concrete problem instances or initial states to be solved, expressed in an abstract first-order representation. Starting with an empty generalized plan, our overall approach is to incrementally increase the applicability of the plan by identifying a problem instance that it cannot solve, invoking a classical planner to solve that problem, generalizing the obtained solution and merging it back into the generalized plan. 
The main contributions of this paper are methods for (a) generating and solving small problem instances not yet covered by an existing generalized plan, (b) translating between concrete classical plans and abstract plan representations, and (c) extending partial generalized plans and increasing their applicability. We analyze the theoretical properties of these methods, prove their correctness, and illustrate experimentally their scalability. The resulting hybrid approach shows that solving only a few, small, classical planning problems can be sufficient to produce a generalized plan that applies to infinitely many problems with unknown numbers of objects.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('937','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_937\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We consider the problem of finding generalized plans for situations where the number of objects may be unknown and unbounded during planning. The input is a domain specification, a goal condition, and a class of concrete problem instances or initial states to be solved, expressed in an abstract first-order representation. Starting with an empty generalized plan, our overall approach is to incrementally increase the applicability of the plan by identifying a problem instance that it cannot solve, invoking a classical planner to solve that problem, generalizing the obtained solution and merging it back into the generalized plan. The main contributions of this paper are methods for (a) generating and solving small problem instances not yet covered by an existing generalized plan, (b) translating between concrete classical plans and abstract plan representations, and (c) extending partial generalized plans and increasing their applicability. 
We analyze the theoretical properties of these methods, prove their correctness, and illustrate experimentally their scalability. The resulting hybrid approach shows that solving only a few, small, classical planning problems can be sufficient to produce a generalized plan that applies to infinitely many problems with unknown numbers of objects.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('937','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_937\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZZicaps11.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZZicaps11.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZZicaps11.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('937','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Srivastava, Siddharth;  Zilberstein, Shlomo;  Immerman, Neil;  Geffner, Hector<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('943','tp_links')\" style=\"cursor:pointer;\">Qualitative Numeric Planning<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 25th Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">San Francisco, California, <\/span><span class=\"tp_pub_additional_year\">2011<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_943\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('943','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span 
class=\"tp_resource_link\"><a id=\"tp_links_sh_943\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('943','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_943\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('943','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_943\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SZIGaaai11,<br \/>\r\ntitle = {Qualitative Numeric Planning},<br \/>\r\nauthor = {Siddharth Srivastava and Shlomo Zilberstein and Neil Immerman and Hector Geffner},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZIGaaai11.pdf},<br \/>\r\nyear  = {2011},<br \/>\r\ndate = {2011-01-01},<br \/>\r\nbooktitle = {Proceedings of the 25th Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {1010--1016},<br \/>\r\naddress = {San Francisco, California},<br \/>\r\nabstract = {We consider a new class of planning problems involving a set of non-negative real variables, and a set of non-deterministic actions that increase or decrease the values of these variables by some arbitrary amount. The formulas specifying the initial state, goal state, or action preconditions can only assert whether certain variables are equal to zero or not. Assuming that the state of the variables is fully observable, we obtain two results. First, the solution to the problem can be expressed as a policy mapping qualitative states into actions, where a qualitative state includes a Boolean variable for each original variable, indicating whether its value is zero or not. Second, testing whether any such policy, that may express nested loops of actions, is a solution to the problem, can be determined in time that is polynomial in the qualitative state space, which is much smaller than the original infinite state space. 
We also report experimental results using a simple generate-and-test planner to illustrate these findings.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('943','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_943\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We consider a new class of planning problems involving a set of non-negative real variables, and a set of non-deterministic actions that increase or decrease the values of these variables by some arbitrary amount. The formulas specifying the initial state, goal state, or action preconditions can only assert whether certain variables are equal to zero or not. Assuming that the state of the variables is fully observable, we obtain two results. First, the solution to the problem can be expressed as a policy mapping qualitative states into actions, where a qualitative state includes a Boolean variable for each original variable, indicating whether its value is zero or not. Second, testing whether any such policy, that may express nested loops of actions, is a solution to the problem, can be determined in time that is polynomial in the qualitative state space, which is much smaller than the original infinite state space. 
We also report experimental results using a simple generate-and-test planner to illustrate these findings.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('943','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_943\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZIGaaai11.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZIGaaai11.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZIGaaai11.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('943','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Srivastava, Siddharth;  Immerman, Neil;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('944','tp_links')\" style=\"cursor:pointer;\">Termination and Correctness Analysis of Cyclic Control<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 25th Conference on Artificial Intelligence (AAAI Nectar Track), <\/span><span class=\"tp_pub_additional_address\">San Francisco, California, <\/span><span class=\"tp_pub_additional_year\">2011<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_944\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('944','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_944\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('944','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_944\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('944','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_944\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SIZaaai11,<br \/>\r\ntitle = {Termination and Correctness Analysis of Cyclic Control},<br \/>\r\nauthor = {Siddharth Srivastava and Neil Immerman and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaaai11.pdf},<br \/>\r\nyear  = {2011},<br \/>\r\ndate = {2011-01-01},<br \/>\r\nbooktitle = {Proceedings of the 25th Conference on Artificial Intelligence (AAAI Nectar Track)},<br \/>\r\npages = {1567--1570},<br \/>\r\naddress = {San Francisco, California},<br \/>\r\nabstract = {We consider a new class of planning problems involving a set of non-negative real variables, and a set of non-deterministic actions that increase or decrease the values of these variables by some arbitrary amount. The formulas specifying the initial state, goal state, or action preconditions can only assert whether certain variables are equal to zero or not. Assuming that the state of the variables is fully observable, we obtain two results. First, the solution to the problem can be expressed as a policy mapping qualitative states into actions, where a qualitative state includes a Boolean variable for each original variable, indicating whether its value is zero or not. Second, testing whether any such policy, that may express nested loops of actions, is a solution to the problem, can be determined in time that is polynomial in the qualitative state space, which is much smaller than the original infinite state space. 
We also report experimental results using a simple generate-and-test planner to illustrate these findings.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('944','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_944\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We consider a new class of planning problems involving a set of non-negative real variables, and a set of non-deterministic actions that increase or decrease the values of these variables by some arbitrary amount. The formulas specifying the initial state, goal state, or action preconditions can only assert whether certain variables are equal to zero or not. Assuming that the state of the variables is fully observable, we obtain two results. First, the solution to the problem can be expressed as a policy mapping qualitative states into actions, where a qualitative state includes a Boolean variable for each original variable, indicating whether its value is zero or not. Second, testing whether any such policy, that may express nested loops of actions, is a solution to the problem, can be determined in time that is polynomial in the qualitative state space, which is much smaller than the original infinite state space. 
We also report experimental results using a simple generate-and-test planner to illustrate these findings.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('944','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_944\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaaai11.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaaai11.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaaai11.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('944','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Srivastava, Siddharth;  Immerman, Neil;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('952','tp_links')\" style=\"cursor:pointer;\">Computing Applicability Conditions for Plans with Loops<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 20th International Conference on Automated Planning and Scheduling (ICAPS), <\/span><span class=\"tp_pub_additional_address\">Toronto, Canada, <\/span><span class=\"tp_pub_additional_year\">2010<\/span><span class=\"tp_pub_additional_note\">, (Best Paper Award)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_952\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('952','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_952\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('952','tp_links')\" title=\"Show links and 
resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_952\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('952','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_952\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SIZicaps10,<br \/>\r\ntitle = {Computing Applicability Conditions for Plans with Loops},<br \/>\r\nauthor = {Siddharth Srivastava and Neil Immerman and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZicaps10.pdf},<br \/>\r\nyear  = {2010},<br \/>\r\ndate = {2010-01-01},<br \/>\r\nbooktitle = {Proceedings of the 20th International Conference on Automated Planning and Scheduling (ICAPS)},<br \/>\r\npages = {161--168},<br \/>\r\naddress = {Toronto, Canada},<br \/>\r\nabstract = {The utility of including loops in plans has been long recognized by the planning community. Loops in a plan help increase both its applicability and the compactness of representation. However, progress in finding such plans has been limited largely due to lack of methods for reasoning about the correctness and safety properties of loops of actions. We present novel algorithms for determining the applicability and progress made by a general class of loops of actions. These methods can be used for directing the search for plans with loops towards greater applicability while guaranteeing termination, as well as in post-processing of computed plans to precisely characterize their applicability. 
Experimental results demonstrate the efficiency of these algorithms.},<br \/>\r\nnote = {Best Paper Award},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('952','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_952\" style=\"display:none;\"><div class=\"tp_abstract_entry\">The utility of including loops in plans has been long recognized by the planning community. Loops in a plan help increase both its applicability and the compactness of representation. However, progress in finding such plans has been limited largely due to lack of methods for reasoning about the correctness and safety properties of loops of actions. We present novel algorithms for determining the applicability and progress made by a general class of loops of actions. These methods can be used for directing the search for plans with loops towards greater applicability while guaranteeing termination, as well as in post-processing of computed plans to precisely characterize their applicability. 
Experimental results demonstrate the efficiency of these algorithms.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('952','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_952\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZicaps10.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZicaps10.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZicaps10.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('952','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Srivastava, Siddharth;  Immerman, Neil;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('950','tp_links')\" style=\"cursor:pointer;\">Merging Example Plans into Generalized Plans for Non-deterministic Environments<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Toronto, Canada, <\/span><span class=\"tp_pub_additional_year\">2010<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_950\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('950','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_950\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('950','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_950\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('950','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_950\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SIZaamas10,<br \/>\r\ntitle = {Merging Example Plans into Generalized Plans for Non-deterministic Environments},<br \/>\r\nauthor = {Siddharth Srivastava and Neil Immerman and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaamas10.pdf},<br \/>\r\nyear  = {2010},<br \/>\r\ndate = {2010-01-01},<br \/>\r\nbooktitle = {Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},<br \/>\r\npages = {1341--1348},<br \/>\r\naddress = {Toronto, Canada},<br \/>\r\nabstract = {We present a new approach for finding generalized contingent plans with loops and branches in situations where there is uncertainty in state properties and object quantities, but lack of probabilistic information about these uncertainties. We use a state abstraction technique from static analysis of programs, which uses 3-valued logic to compactly represent belief states with unbounded numbers of objects. Our approach for finding plans is to incrementally generalize and merge input example plans which can be generated by classical planners. 
The expressiveness and scope of this approach are demonstrated using experimental results on common benchmark domains.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('950','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_950\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We present a new approach for finding generalized contingent plans with loops and branches in situations where there is uncertainty in state properties and object quantities, but lack of probabilistic information about these uncertainties. We use a state abstraction technique from static analysis of programs, which uses 3-valued logic to compactly represent belief states with unbounded numbers of objects. Our approach for finding plans is to incrementally generalize and merge input example plans which can be generated by classical planners. 
The expressiveness and scope of this approach are demonstrated using experimental results on common benchmark domains.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('950','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_950\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaamas10.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaamas10.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaamas10.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('950','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Srivastava, Siddharth;  Immerman, Neil;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('970','tp_links')\" style=\"cursor:pointer;\">Abstract Planning with Unknown Object Quantities and Properties<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 8th Symposium on Abstraction, Reformulation, and Approximation (SARA), <\/span><span class=\"tp_pub_additional_address\">Lake Arrowhead, California, <\/span><span class=\"tp_pub_additional_year\">2009<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_970\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('970','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_970\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('970','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_970\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('970','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_970\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SIZsara09,<br \/>\r\ntitle = {Abstract Planning with Unknown Object Quantities and Properties},<br \/>\r\nauthor = {Siddharth Srivastava and Neil Immerman and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZsara09.pdf},<br \/>\r\nyear  = {2009},<br \/>\r\ndate = {2009-01-01},<br \/>\r\nbooktitle = {Proceedings of the 8th Symposium on Abstraction, Reformulation, and Approximation (SARA)},<br \/>\r\npages = {143--150},<br \/>\r\naddress = {Lake Arrowhead, California},<br \/>\r\nabstract = {State abstraction has been widely used for state aggregation in approaches to AI search and planning. In this paper we use a powerful abstraction technique from software model checking for representing collections of states with different object quantities and properties. We exploit this method to develop precise abstractions and action operators for use in AI. This enables us to find scalable, algorithm-like plans with branches and loops which can solve problems of unbounded sizes. 
We describe how this method of abstraction can be effectively used in AI, with compelling results from implementations of two planning algorithms.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('970','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_970\" style=\"display:none;\"><div class=\"tp_abstract_entry\">State abstraction has been widely used for state aggregation in approaches to AI search and planning. In this paper we use a powerful abstraction technique from software model checking for representing collections of states with different object quantities and properties. We exploit this method to develop precise abstractions and action operators for use in AI. This enables us to find scalable, algorithm-like plans with branches and loops which can solve problems of unbounded sizes. We describe how this method of abstraction can be effectively used in AI, with compelling results from implementations of two planning algorithms.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('970','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_970\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZsara09.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZsara09.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZsara09.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('970','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Srivastava, Siddharth;  Immerman, Neil;  Zilberstein, 
Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('981','tp_links')\" style=\"cursor:pointer;\">Using Abstraction for Generalized Planning<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 10th International Symposium on Artificial Intelligence and Mathematics (ISAIM), <\/span><span class=\"tp_pub_additional_address\">Ft. Lauderdale, Florida, <\/span><span class=\"tp_pub_additional_year\">2008<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_981\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('981','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_981\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('981','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_981\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('981','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_981\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SIZisaim08,<br \/>\r\ntitle = {Using Abstraction for Generalized Planning},<br \/>\r\nauthor = {Siddharth Srivastava and Neil Immerman and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZisaim08.pdf},<br \/>\r\nyear  = {2008},<br \/>\r\ndate = {2008-01-01},<br \/>\r\nbooktitle = {Proceedings of the 10th International Symposium on Artificial Intelligence and Mathematics (ISAIM)},<br \/>\r\naddress = {Ft. Lauderdale, Florida},<br \/>\r\nabstract = {Given the complexity of planning, it is often beneficial to create plans that work for a wide class of problems. 
This facilitates reuse of existing plans for different instances of the same problem or even for other problems that are somehow similar. We present novel approaches for finding such plans through search and for learning them from examples. We use state representation and abstraction techniques originally developed for static analysis of programs. The generalized plans that we compute include loops and work for classes of problems having varying numbers of objects that must be manipulated to reach the goal. Our algorithm for learning generalized plans takes as its input an example plan for a certain problem instance and a finite 3-valued first-order structure representing a set of initial states from different problem instances. It learns a generalized plan along with a classification of the problem instances where it works. The algorithm for finding plans takes as input a similar 3-valued structure and a goal test. Its output is a set of generalized plans and conditions describing the problem instances for which they work.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('981','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_981\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Given the complexity of planning, it is often beneficial to create plans that work for a wide class of problems. This facilitates reuse of existing plans for different instances of the same problem or even for other problems that are somehow similar. We present novel approaches for finding such plans through search and for learning them from examples. We use state representation and abstraction techniques originally developed for static analysis of programs. 
The generalized plans that we compute include loops and work for classes of problems having varying numbers of objects that must be manipulated to reach the goal. Our algorithm for learning generalized plans takes as its input an example plan for a certain problem instance and a finite 3-valued first-order structure representing a set of initial states from different problem instances. It learns a generalized plan along with a classification of the problem instances where it works. The algorithm for finding plans takes as input a similar 3-valued structure and a goal test. Its output is a set of generalized plans and conditions describing the problem instances for which they work.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('981','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_981\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZisaim08.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZisaim08.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZisaim08.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('981','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Srivastava, Siddharth;  Immerman, Neil;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('985','tp_links')\" style=\"cursor:pointer;\">Learning Generalized Plans Using Abstract Counting<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 23rd Conference on Artificial Intelligence (AAAI), <\/span><span 
class=\"tp_pub_additional_address\">Chicago, Illinois, <\/span><span class=\"tp_pub_additional_year\">2008<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_985\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('985','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_985\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('985','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_985\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('985','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_985\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SIZaaai08,<br \/>\r\ntitle = {Learning Generalized Plans Using Abstract Counting},<br \/>\r\nauthor = {Siddharth Srivastava and Neil Immerman and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaaai08.pdf},<br \/>\r\nyear  = {2008},<br \/>\r\ndate = {2008-01-01},<br \/>\r\nbooktitle = {Proceedings of the 23rd Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {991--997},<br \/>\r\naddress = {Chicago, Illinois},<br \/>\r\nabstract = {Given the complexity of planning, it is often beneficial to create plans that work for a wide class of problems. This facilitates reuse of existing plans for different instances drawn from the same problem or from an infinite family of similar problems. We define a class of such planning problems called generalized planning problems and present a novel approach for transforming classical plans into generalized plans. These algorithm-like plans include loops and work for problem instances having varying numbers of objects that must be manipulated to reach the goal. 
Our approach takes as input a classical plan for a certain problem instance. It outputs a generalized plan along with a classification of the problem instances where it is guaranteed to work. We illustrate the utility of our approach through results of a working implementation on various practical examples.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('985','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_985\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Given the complexity of planning, it is often beneficial to create plans that work for a wide class of problems. This facilitates reuse of existing plans for different instances drawn from the same problem or from an infinite family of similar problems. We define a class of such planning problems called generalized planning problems and present a novel approach for transforming classical plans into generalized plans. These algorithm-like plans include loops and work for problem instances having varying numbers of objects that must be manipulated to reach the goal. Our approach takes as input a classical plan for a certain problem instance. It outputs a generalized plan along with a classification of the problem instances where it is guaranteed to work. 
We illustrate the utility of our approach through results of a working implementation on various practical examples.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('985','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_985\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaaai08.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaaai08.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SIZaaai08.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('985','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><\/table><\/div><\/div>\n<div><\/div><\/div><\/div>\n<\/div>\n<\/div>\n<h3><span style=\"color: #264278\"><b>Introspective Autonomy<\/b><\/span><\/h3>\n<div>\n<div>How can autonomous AI systems acquire a model of their own capabilities and limitations, seek human assistance when needed, and become progressively independent?<\/div>\n<div><div class=\"bg-margin-for-link\"><input type='hidden' bg_collapse_expand='69d0b4f82b5908007430763' value='69d0b4f82b5908007430763'><input type='hidden' id='bg-show-more-text-69d0b4f82b5908007430763' value='Show Related Publications'><input type='hidden' id='bg-show-less-text-69d0b4f82b5908007430763' value='Hide Related Publications'><a id='bg-showmore-action-69d0b4f82b5908007430763' class='bg-showmore-plg-link bg-arrow '  style=\" color:#7C2622;;\" href='#'>Show Related Publications<\/a><div id='bg-showmore-hidden-69d0b4f82b5908007430763' ><div class=\"teachpress_pub_list\"><form name=\"tppublistform\" method=\"get\"><a name=\"tppubs\" id=\"tppubs\"><\/a><\/form><table class=\"teachpress_publication_list\"><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, 
Stefan;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1176','tp_links')\" style=\"cursor:pointer;\">Belief State Determination for Real-Time Decision-Making<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2024<\/span><span class=\"tp_pub_additional_note\">, (US Patent 11,921,506)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1176\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1176','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1176\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1176','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1176\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1176','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1176\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:WWZpatent24c,<br \/>\r\ntitle = {Belief State Determination for Real-Time Decision-Making},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US11921506B2\/en},<br \/>\r\nyear  = {2024},<br \/>\r\ndate = {2024-03-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {Real-time decision-making for a vehicle using belief state determination is described. Operational environment data is received while the vehicle is traversing a vehicle transportation network, where the data includes data associated with an external object. An operational environment monitor establishes an observation that relates the object to a distinct vehicle operation scenario. 
A belief state model of the monitor computes a belief state for the observation directly from the operational environment data. The monitor provides the computed belief state to a decision component implementing a policy that maps a respective belief state for the object within the distinct vehicle operation scenario to a respective candidate vehicle control action. A candidate vehicle control action is received from the policy of the decision component, and a vehicle control action is selected for traversing the vehicle transportation network from any available candidate vehicle control actions.},<br \/>\r\nnote = {US Patent 11,921,506},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1176','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1176\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Real-time decision-making for a vehicle using belief state determination is described. Operational environment data is received while the vehicle is traversing a vehicle transportation network, where the data includes data associated with an external object. An operational environment monitor establishes an observation that relates the object to a distinct vehicle operation scenario. A belief state model of the monitor computes a belief state for the observation directly from the operational environment data. The monitor provides the computed belief state to a decision component implementing a policy that maps a respective belief state for the object within the distinct vehicle operation scenario to a respective candidate vehicle control action. 
A candidate vehicle control action is received from the policy of the decision component, and a vehicle control action is selected for traversing the vehicle transportation network from any available candidate vehicle control actions.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1176','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1176\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US11921506B2\/en\" title=\"https:\/\/patents.google.com\/patent\/US11921506B2\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US11921506B2\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1176','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1177','tp_links')\" style=\"cursor:pointer;\">Objective-Based Reasoning in Autonomous Vehicle Decision-Making<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2024<\/span><span class=\"tp_pub_additional_note\">, (US Patent 11,899,454)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1177\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1177','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1177\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1177','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a 
id=\"tp_bibtex_sh_1177\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1177','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1177\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:WWZpatent24b,<br \/>\r\ntitle = {Objective-Based Reasoning in Autonomous Vehicle Decision-Making},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US11899454B2\/en},<br \/>\r\nyear  = {2024},<br \/>\r\ndate = {2024-02-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {An autonomous vehicle traverses a vehicle transportation network using a multi-objective policy based on a model for specific scenarios. The multi-objective policy includes a topographical map that shows a relationship between at least two objectives. The autonomous vehicle receives a candidate vehicle control action associated with each of the at least two objectives. The autonomous vehicle selects a vehicle control action based on a buffer value that is associated with the at least two objectives. The autonomous vehicle traverses a portion of the vehicle transportation network in accordance with the selected vehicle control action.},<br \/>\r\nnote = {US Patent 11,899,454},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1177','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1177\" style=\"display:none;\"><div class=\"tp_abstract_entry\">An autonomous vehicle traverses a vehicle transportation network using a multi-objective policy based on a model for specific scenarios. The multi-objective policy includes a topographical map that shows a relationship between at least two objectives. 
The autonomous vehicle receives a candidate vehicle control action associated with each of the at least two objectives. The autonomous vehicle selects a vehicle control action based on a buffer value that is associated with the at least two objectives. The autonomous vehicle traverses a portion of the vehicle transportation network in accordance with the selected vehicle control action.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1177','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1177\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US11899454B2\/en\" title=\"https:\/\/patents.google.com\/patent\/US11899454B2\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US11899454B2\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1177','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1178','tp_links')\" style=\"cursor:pointer;\">Shared Autonomous Vehicle Operational Management<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2024<\/span><span class=\"tp_pub_additional_note\">, (US Patent 11,874,120)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1178\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1178','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1178\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('1178','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1178\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1178','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1178\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:WWZpatent24a,<br \/>\r\ntitle = {Shared Autonomous Vehicle Operational Management},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US11874120B2\/en},<br \/>\r\nyear  = {2024},<br \/>\r\ndate = {2024-01-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {Traversing, by an autonomous vehicle, a vehicle transportation network, may include identifying a distinct vehicle operational scenario, wherein traversing the vehicle transportation network includes traversing a portion of the vehicle transportation network that includes the distinct vehicle operational scenario, communicating shared scenario-specific operational control management data associated with the distinct vehicle operational scenario with an external shared scenario-specific operational control management system, operating a scenario-specific operational control evaluation module instance including an instance of a scenario-specific operational control evaluation model of the distinct vehicle operational scenario, and wherein operating the scenario-specific operational control evaluation module instance includes identifying a policy for the scenario-specific operational control evaluation model, receiving a candidate vehicle control action from the policy for the scenario-specific operational control evaluation model, and traversing a portion of the vehicle transportation network based on the candidate vehicle control action.},<br \/>\r\nnote = 
{US Patent 11,874,120},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1178','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1178\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Traversing, by an autonomous vehicle, a vehicle transportation network, may include identifying a distinct vehicle operational scenario, wherein traversing the vehicle transportation network includes traversing a portion of the vehicle transportation network that includes the distinct vehicle operational scenario, communicating shared scenario-specific operational control management data associated with the distinct vehicle operational scenario with an external shared scenario-specific operational control management system, operating a scenario-specific operational control evaluation module instance including an instance of a scenario-specific operational control evaluation model of the distinct vehicle operational scenario, and wherein operating the scenario-specific operational control evaluation module instance includes identifying a policy for the scenario-specific operational control evaluation model, receiving a candidate vehicle control action from the policy for the scenario-specific operational control evaluation model, and traversing a portion of the vehicle transportation network based on the candidate vehicle control action.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1178','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1178\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US11874120B2\/en\" title=\"https:\/\/patents.google.com\/patent\/US11874120B2\/en\" 
target=\"_blank\">https:\/\/patents.google.com\/patent\/US11874120B2\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1178','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan;  Zilberstein, Shlomo;  Bentahar, Omar;  Jamgochian, Arec<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1179','tp_links')\" style=\"cursor:pointer;\">Explainability of Autonomous Vehicle Decision Making<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2023<\/span><span class=\"tp_pub_additional_note\">, (US Patent 11,714,971)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1179\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1179','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1179\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1179','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1179\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1179','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1179\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:WWZBJpatent23e,<br \/>\r\ntitle = {Explainability of Autonomous Vehicle Decision Making},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan Witwicki and Shlomo Zilberstein and Omar Bentahar and Arec Jamgochian},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US11714971B2\/en},<br \/>\r\nyear  = {2023},<br \/>\r\ndate = {2023-08-01},<br \/>\r\npublisher = 
{Google Patents},<br \/>\r\nabstract = {A processor is configured to execute instructions stored in a memory to identify distinct vehicle operational scenarios; instantiate decision components, where each of the decision components is an instance of a respective decision problem, and where each of the decision components maintains a respective state describing the respective vehicle operational scenario; receive respective candidate vehicle control actions from the decision components; select an action from the respective candidate vehicle control actions, where the action is from a selected decision component of the decision components, and where the action is used to control the AV to traverse a portion of the vehicle transportation network; and generate an explanation as to why the action was selected, where the explanation includes respective descriptors of the action, the selected decision component, and a state factor of the respective state of the selected decision component.},<br \/>\r\nnote = {US Patent 11,714,971},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1179','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1179\" style=\"display:none;\"><div class=\"tp_abstract_entry\">A processor is configured to execute instructions stored in a memory to identify distinct vehicle operational scenarios; instantiate decision components, where each of the decision components is an instance of a respective decision problem, and where each of the decision components maintains a respective state describing the respective vehicle operational scenario; receive respective candidate vehicle control actions from the decision components; select an action from the respective candidate vehicle control actions, where the action is from a selected decision component of the 
decision components, and where the action is used to control the AV to traverse a portion of the vehicle transportation network; and generate an explanation as to why the action was selected, where the explanation includes respective descriptors of the action, the selected decision component, and a state factor of the respective state of the selected decision component.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1179','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1179\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US11714971B2\/en\" title=\"https:\/\/patents.google.com\/patent\/US11714971B2\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US11714971B2\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1179','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1180','tp_links')\" style=\"cursor:pointer;\">Autonomous Vehicle Operation with Explicit Occlusion Reasoning<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2023<\/span><span class=\"tp_pub_additional_note\">, (US Patent 11,702,070)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1180\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1180','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1180\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('1180','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1180\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1180','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1180\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:WWZpatent23d,<br \/>\r\ntitle = {Autonomous Vehicle Operation with Explicit Occlusion Reasoning},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US11702070B2\/en},<br \/>\r\nyear  = {2023},<br \/>\r\ndate = {2023-07-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {Autonomous vehicle operation with explicit occlusion reasoning may include traversing, by a vehicle, a vehicle transportation network. Traversing the vehicle transportation network can include receiving, from a sensor of the vehicle, sensor data for a portion of a vehicle operational environment, determining, using the sensor data, a visibility grid comprising coordinates forming an unobserved region within a defined distance from the vehicle, computing a probability of a presence of an external object within the unobserved region by comparing the visibility grid to a map (e.g., a high-definition map), and traversing a portion of the vehicle transportation network using the probability. 
An apparatus and a vehicle are also described.},<br \/>\r\nnote = {US Patent 11,702,070},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1180','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1180\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous vehicle operation with explicit occlusion reasoning may include traversing, by a vehicle, a vehicle transportation network. Traversing the vehicle transportation network can include receiving, from a sensor of the vehicle, sensor data for a portion of a vehicle operational environment, determining, using the sensor data, a visibility grid comprising coordinates forming an unobserved region within a defined distance from the vehicle, computing a probability of a presence of an external object within the unobserved region by comparing the visibility grid to a map (e.g., a high-definition map), and traversing a portion of the vehicle transportation network using the probability. 
An apparatus and a vehicle are also described.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1180','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1180\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US11702070B2\/en\" title=\"https:\/\/patents.google.com\/patent\/US11702070B2\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US11702070B2\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1180','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1181','tp_links')\" style=\"cursor:pointer;\">Risk Aware Executor with Action Set Recommendations<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2023<\/span><span class=\"tp_pub_additional_note\">, (US Patent 11,635,758)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1181\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1181','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1181\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1181','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1181\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1181','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div 
class=\"tp_bibtex\" id=\"tp_bibtex_1181\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:WWZpatent23c,<br \/>\r\ntitle = {Risk Aware Executor with Action Set Recommendations},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US11635758B2\/en},<br \/>\r\nyear  = {2023},<br \/>\r\ndate = {2023-04-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {A method for use in traversing a vehicle transportation network by an autonomous vehicle (AV) includes traversing, by the AV, the vehicle transportation network. Traversing the vehicle transportation network includes identifying a distinct vehicle operational scenario; instantiating a first decision component instance; receiving a first set of candidate vehicle control actions from the first decision component instance; selecting an action; and controlling the AV to traverse a portion of the vehicle transportation network based on the action. The first decision component instance is an instance of a first decision component modeling the distinct vehicle operational scenario.},<br \/>\r\nnote = {US Patent 11,635,758},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1181','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1181\" style=\"display:none;\"><div class=\"tp_abstract_entry\">A method for use in traversing a vehicle transportation network by an autonomous vehicle (AV) includes traversing, by the AV, the vehicle transportation network. 
Traversing the vehicle transportation network includes identifying a distinct vehicle operational scenario; instantiating a first decision component instance; receiving a first set of candidate vehicle control actions from the first decision component instance; selecting an action; and controlling the AV to traverse a portion of the vehicle transportation network based on the action. The first decision component instance is an instance of a first decision component modeling the distinct vehicle operational scenario.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1181','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1181\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US11635758B2\/en\" title=\"https:\/\/patents.google.com\/patent\/US11635758B2\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US11635758B2\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1181','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Basich, Connor;  Svegliato, Justin;  Wray, Kyle Hollins;  Witwicki, Stefan;  Biswas, Joydeep;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1166','tp_links')\" style=\"cursor:pointer;\">Competence-Aware Systems<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Artificial Intelligence (AIJ), <\/span><span class=\"tp_pub_additional_issue\">iss. 316, <\/span><span class=\"tp_pub_additional_pages\">pp. 
103844, <\/span><span class=\"tp_pub_additional_year\">2023<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1166\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1166','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1166\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1166','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1166\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1166','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1166\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:BSWWBZaij23,<br \/>\r\ntitle = {Competence-Aware Systems},<br \/>\r\nauthor = {Connor Basich and Justin Svegliato and Kyle Hollins Wray and Stefan Witwicki and Joydeep Biswas and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSWWBZaij23.pdf},<br \/>\r\ndoi = {10.1016\/j.artint.2022.103844},<br \/>\r\nyear  = {2023},<br \/>\r\ndate = {2023-03-16},<br \/>\r\nurldate = {2023-03-16},<br \/>\r\njournal = {Artificial Intelligence (AIJ)},<br \/>\r\nissue = {316},<br \/>\r\npages = {103844},<br \/>\r\nabstract = {Building autonomous systems for deployment in the open world has been a longstanding objective in both artificial intelligence and robotics. The open world, however, presents challenges that question some of the assumptions often made in contemporary AI models. Autonomous systems that operate in the open world face complex, non-stationary environments wherein enumerating all situations the system may face over the course of its deployment is intractable. Nevertheless, these systems are expected to operate safely and reliably for extended durations. 
Consequently, AI systems often rely on some degree of human assistance to mitigate risks while completing their tasks, and are hence better treated as semi-autonomous systems. In order to reduce unnecessary reliance on humans and optimize autonomy, we propose a novel introspective planning model\u2014competence-aware systems (CAS)\u2014that enables a semi-autonomous system to reason about its own competence and allowed level of autonomy by leveraging human feedback or assistance. A CAS learns to adjust its level of autonomy based on experience and interactions with a human authority so as to reduce improper reliance on the human and optimize the degree of autonomy it employs in any given circumstance. To handle situations in which the initial CAS model has insufficient state information to properly discriminate feedback received from humans, we introduce a methodology called iterative state space refinement that gradually increases the granularity of the state space online. The approach exploits information that exists in the standard CAS model and requires no additional input from the human. The result is an agent that can more confidently predict the correct feedback from the human authority in each level of autonomy, enabling it to learn its competence in a larger portion of the state space.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1166','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1166\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Building autonomous systems for deployment in the open world has been a longstanding objective in both artificial intelligence and robotics. The open world, however, presents challenges that question some of the assumptions often made in contemporary AI models. 
Autonomous systems that operate in the open world face complex, non-stationary environments wherein enumerating all situations the system may face over the course of its deployment is intractable. Nevertheless, these systems are expected to operate safely and reliably for extended durations. Consequently, AI systems often rely on some degree of human assistance to mitigate risks while completing their tasks, and are hence better treated as semi-autonomous systems. In order to reduce unnecessary reliance on humans and optimize autonomy, we propose a novel introspective planning model\u2014competence-aware systems (CAS)\u2014that enables a semi-autonomous system to reason about its own competence and allowed level of autonomy by leveraging human feedback or assistance. A CAS learns to adjust its level of autonomy based on experience and interactions with a human authority so as to reduce improper reliance on the human and optimize the degree of autonomy it employs in any given circumstance. To handle situations in which the initial CAS model has insufficient state information to properly discriminate feedback received from humans, we introduce a methodology called iterative state space refinement that gradually increases the granularity of the state space online. The approach exploits information that exists in the standard CAS model and requires no additional input from the human. 
The result is an agent that can more confidently predict the correct feedback from the human authority in each level of autonomy, enabling it to learn its competence in a larger portion of the state space.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1166','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1166\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSWWBZaij23.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSWWBZaij23.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSWWBZaij23.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1016\/j.artint.2022.103844\" title=\"Follow DOI:10.1016\/j.artint.2022.103844\" target=\"_blank\">doi:10.1016\/j.artint.2022.103844<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1166','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1182','tp_links')\" style=\"cursor:pointer;\">Learning Safety and Human-Centered Constraints in Autonomous Vehicles<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2023<\/span><span class=\"tp_pub_additional_note\">, (US Patent 11,613,269)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1182\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1182','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span 
class=\"tp_resource_link\"><a id=\"tp_links_sh_1182\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1182','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1182\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1182','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1182\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:WWZpatent23b,<br \/>\r\ntitle = {Learning Safety and Human-Centered Constraints in Autonomous Vehicles},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US11613269B2\/en},<br \/>\r\nyear  = {2023},<br \/>\r\ndate = {2023-03-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {Traversing a vehicle transportation network includes operating a scenario-specific operational control evaluation module instance. The scenario-specific operational control evaluation module instance includes an instance of a scenario-specific operational control evaluation model of a distinct vehicle operational scenario. Operating the scenario-specific operational control evaluation module instance includes identifying a multi-objective policy for the scenario-specific operational control evaluation model. The multi-objective policy may include a relationship between at least two objectives. Traversing the vehicle transportation network includes receiving a candidate vehicle control action associated with each of the at least two objectives. Traversing the vehicle transportation network includes selecting a vehicle control action based on a buffer value. 
Traversing the vehicle transportation network includes performing the selected vehicle control action, determining a preference indicator for each objective, and updating the multi-objective policy.},<br \/>\r\nnote = {US Patent 11,613,269},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1182','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1182\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Traversing a vehicle transportation network includes operating a scenario-specific operational control evaluation module instance. The scenario-specific operational control evaluation module instance includes an instance of a scenario-specific operational control evaluation model of a distinct vehicle operational scenario. Operating the scenario-specific operational control evaluation module instance includes identifying a multi-objective policy for the scenario-specific operational control evaluation model. The multi-objective policy may include a relationship between at least two objectives. Traversing the vehicle transportation network includes receiving a candidate vehicle control action associated with each of the at least two objectives. Traversing the vehicle transportation network includes selecting a vehicle control action based on a buffer value. 
Traversing the vehicle transportation network includes performing the selected vehicle control action, determining a preference indicator for each objective, and updating the multi-objective policy.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1182','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1182\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US11613269B2\/en\" title=\"https:\/\/patents.google.com\/patent\/US11613269B2\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US11613269B2\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1182','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Bentahar, Omar;  Vagadia, Astha;  Cesafsky, Laura;  Jamgochian, Arec;  Witwicki, Stefan;  Baig, Najamuddin Mirza;  Gyorfi, Julius S;  Zilberstein, Shlomo;  Sharma, Sparsh<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1183','tp_links')\" style=\"cursor:pointer;\">Explainability of Autonomous Vehicle Decision Making<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2023<\/span><span class=\"tp_pub_additional_note\">, (US Patent 11,577,746)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1183\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1183','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1183\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1183','tp_links')\" title=\"Show links and 
resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1183\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1183','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1183\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:WBVCJWpatent23a,<br \/>\r\ntitle = {Explainability of Autonomous Vehicle Decision Making},<br \/>\r\nauthor = {Kyle Hollins Wray and Omar Bentahar and Astha Vagadia and Laura Cesafsky and Arec Jamgochian and Stefan Witwicki and Najamuddin Mirza Baig and Julius S Gyorfi and Shlomo Zilberstein and Sparsh Sharma},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US11577746B2\/en},<br \/>\r\nyear  = {2023},<br \/>\r\ndate = {2023-02-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {A processor is configured to execute instructions stored in a memory to determine, in response to identifying vehicle operational scenarios of a scene, an action for controlling the AV, where the action is from a selected decision component that determined the action based on level of certainty associated with a state factor; generate an explanation as to why the action was selected, such that the explanation includes respective descriptors of the action, the selected decision component, and the state factor; and display the explanation in a graphical view that includes a first graphical indicator of a world object of the selected decision component, a second graphical indicator describing the state factor, and a third graphical indicator describing the action.},<br \/>\r\nnote = {US Patent 11,577,746},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1183','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" 
id=\"tp_abstract_1183\" style=\"display:none;\"><div class=\"tp_abstract_entry\">A processor is configured to execute instructions stored in a memory to determine, in response to identifying vehicle operational scenarios of a scene, an action for controlling the AV, where the action is from a selected decision component that determined the action based on level of certainty associated with a state factor; generate an explanation as to why the action was selected, such that the explanation includes respective descriptors of the action, the selected decision component, and the state factor; and display the explanation in a graphical view that includes a first graphical indicator of a world object of the selected decision component, a second graphical indicator describing the state factor, and a third graphical indicator describing the action.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1183','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1183\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US11577746B2\/en\" title=\"https:\/\/patents.google.com\/patent\/US11577746B2\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US11577746B2\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1183','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle;  Witwicki, Stefan;  Zilberstein, Shlomo;  Pedersen, Liam<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1184','tp_links')\" style=\"cursor:pointer;\">Autonomous Vehicle Operational Management Including Operating a Partially Observable Markov Decision Process Model Instance<\/a> <span 
class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2022<\/span><span class=\"tp_pub_additional_note\">, (US Patent 11,500,380)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1184\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1184','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1184\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1184','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1184\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1184','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1184\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:WWZPpatent22d,<br \/>\r\ntitle = {Autonomous Vehicle Operational Management Including Operating a Partially Observable Markov Decision Process Model Instance},<br \/>\r\nauthor = {Kyle Wray and Stefan Witwicki and Shlomo Zilberstein and Liam Pedersen},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US11500380B2\/en},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-11-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {Autonomous vehicle operational management may include traversing, by an autonomous vehicle, a vehicle transportation network. Traversing the vehicle transportation network may include operating a scenario-specific operational control evaluation module instance, wherein the scenario-specific operational control evaluation module instance is an instance of a scenario-specific operational control evaluation module, wherein the scenario-specific operational control evaluation module implements a partially observable Markov decision process. 
Traversing the vehicle transportation network may include receiving a candidate vehicle control action from the scenario-specific operational control evaluation module instance, and traversing a portion of the vehicle transportation network based on the candidate vehicle control action.},<br \/>\r\nnote = {US Patent 11,500,380},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1184','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1184\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous vehicle operational management may include traversing, by an autonomous vehicle, a vehicle transportation network. Traversing the vehicle transportation network may include operating a scenario-specific operational control evaluation module instance, wherein the scenario-specific operational control evaluation module instance is an instance of a scenario-specific operational control evaluation module, wherein the scenario-specific operational control evaluation module implements a partially observable Markov decision process. 
Traversing the vehicle transportation network may include receiving a candidate vehicle control action from the scenario-specific operational control evaluation module instance, and traversing a portion of the vehicle transportation network based on the candidate vehicle control action.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1184','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1184\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US11500380B2\/en\" title=\"https:\/\/patents.google.com\/patent\/US11500380B2\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US11500380B2\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1184','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Basich, Connor;  Wray, Kyle Hollins;  Witwicki, Stefan;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1185','tp_links')\" style=\"cursor:pointer;\">Introspective Competence Modeling for AV Decision Making<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2022<\/span><span class=\"tp_pub_additional_note\">, (US Patent 11,307,585)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1185\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1185','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1185\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1185','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1185\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1185','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1185\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:BWWZpatent22c,<br \/>\r\ntitle = {Introspective Competence Modeling for AV Decision Making},<br \/>\r\nauthor = {Connor Basich and Kyle Hollins Wray and Stefan Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US11307585B2\/en},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-04-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {A first method includes detecting, based on sensor data, an environment state; selecting an action based on the environment state; determining an autonomy level associated with the environment state and the action; and performing the action according to the autonomy level. The autonomy level can be selected based at least on an autonomy model and a feedback model. A second method includes calculating, by solving an extended Stochastic Shortest Path (SSP) problem, a policy for solving a task. The policy can map environment states and autonomy levels to actions and autonomy levels. 
Calculating the policy can include generating plans that operate across multiple levels of autonomy.},<br \/>\r\nnote = {US Patent 11,307,585},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1185','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1185\" style=\"display:none;\"><div class=\"tp_abstract_entry\">A first method includes detecting, based on sensor data, an environment state; selecting an action based on the environment state; determining an autonomy level associated with the environment state and the action; and performing the action according to the autonomy level. The autonomy level can be selected based at least on an autonomy model and a feedback model. A second method includes calculating, by solving an extended Stochastic Shortest Path (SSP) problem, a policy for solving a task. The policy can map environment states and autonomy levels to actions and autonomy levels. 
Calculating the policy can include generating plans that operate across multiple levels of autonomy.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1185','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1185\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US11307585B2\/en\" title=\"https:\/\/patents.google.com\/patent\/US11307585B2\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US11307585B2\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1185','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1186','tp_links')\" style=\"cursor:pointer;\">Multiple Objective Explanation and Control Interface Design<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2022<\/span><span class=\"tp_pub_additional_note\">, (US Patent 11,300,957)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1186\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1186','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1186\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1186','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1186\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1186','tp_bibtex')\" title=\"Show BibTeX entry\" 
style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1186\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:WWZpatent22b,<br \/>\r\ntitle = {Multiple Objective Explanation and Control Interface Design},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US11300957B2\/en},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-04-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {A vehicle traversing a vehicle transportation network may use a scenario-specific operational control evaluation model instance. A multi-objective policy for the model is received, wherein the policy includes at least a first objective, a second objective, and a priority of the first objective relative to the second objective. A representation of the policy (e.g., the first objective, the second objective, and the priority) is generated using a user interface. Based on feedback to the user interface, a change to the multi-objective policy for the scenario-specific operational control evaluation model is received. The change is to the first objective, the second objective, the priority, or some combination thereof. 
Then, for determining a vehicle control action for traversing the vehicle transportation network, an updated multi-objective policy for the scenario-specific operational control evaluation model is generated to include the change to the policy.},<br \/>\r\nnote = {US Patent 11,300,957},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1186','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1186\" style=\"display:none;\"><div class=\"tp_abstract_entry\">A vehicle traversing a vehicle transportation network may use a scenario-specific operational control evaluation model instance. A multi-objective policy for the model is received, wherein the policy includes at least a first objective, a second objective, and a priority of the first objective relative to the second objective. A representation of the policy (e.g., the first objective, the second objective, and the priority) is generated using a user interface. Based on feedback to the user interface, a change to the multi-objective policy for the scenario-specific operational control evaluation model is received. The change is to the first objective, the second objective, the priority, or some combination thereof. 
Then, for determining a vehicle control action for traversing the vehicle transportation network, an updated multi-objective policy for the scenario-specific operational control evaluation model is generated to include the change to the policy.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1186','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1186\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US11300957B2\/en\" title=\"https:\/\/patents.google.com\/patent\/US11300957B2\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US11300957B2\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1186','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Rabiee, Sadegh;  Basich, Connor;  Wray, Kyle Hollins;  Zilberstein, Shlomo;  Biswas, Joydeep<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1158','tp_links')\" style=\"cursor:pointer;\">Competence-Aware Path Planning Via Introspective Perception<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">IEEE Robotics and Automation Letters, <\/span><span class=\"tp_pub_additional_volume\">vol. 7, <\/span><span class=\"tp_pub_additional_number\">no. 2, <\/span><span class=\"tp_pub_additional_pages\">pp. 
3218\u20133225, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1158\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1158','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1158\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1158','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1158\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1158','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1158\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:RBWZBlra22,<br \/>\r\ntitle = {Competence-Aware Path Planning Via Introspective Perception},<br \/>\r\nauthor = {Sadegh Rabiee and Connor Basich and Kyle Hollins Wray and Shlomo Zilberstein and Joydeep Biswas},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/RBWZBlra22.pdf},<br \/>\r\ndoi = {10.1109\/LRA.2022.3145517},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\njournal = {IEEE Robotics and Automation Letters},<br \/>\r\nvolume = {7},<br \/>\r\nnumber = {2},<br \/>\r\npages = {3218--3225},<br \/>\r\nabstract = {Robots deployed in the real world over extended periods of time need to reason about unexpected failures, learn to predict them, and to proactively take actions to avoid future failures. Existing approaches for competence-aware planning are either model-based, requiring explicit enumeration of known failure sources, or purely statistical, using state- and location-specific failure statistics to infer competence. 
We instead propose a structured model-free approach to competence-aware planning by reasoning about plan execution failures due to errors in perception, without requiring a priori enumeration of failure sources or requiring location-specific failure statistics. We introduce competence-aware path planning via introspective perception (CPIP), a Bayesian framework to iteratively learn and exploit task-level competence in novel deployment environments. CPIP factorizes the competence-aware planning problem into two components. First, perception errors are learned in a model-free and location-agnostic setting via introspective perception prior to deployment in novel environments. Second, during actual deployments, the prediction of task-level failures is learned in a context-aware setting. Experiments in a simulation show that the proposed CPIP approach outperforms the frequentist baseline in multiple mobile robot tasks, and is further validated via real robot experiments in environments with perceptually challenging obstacles and terrain.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1158','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1158\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Robots deployed in the real world over extended periods of time need to reason about unexpected failures, learn to predict them, and to proactively take actions to avoid future failures. Existing approaches for competence-aware planning are either model-based, requiring explicit enumeration of known failure sources, or purely statistical, using state- and location-specific failure statistics to infer competence. 
We instead propose a structured model-free approach to competence-aware planning by reasoning about plan execution failures due to errors in perception, without requiring a priori enumeration of failure sources or requiring location-specific failure statistics. We introduce competence-aware path planning via introspective perception (CPIP), a Bayesian framework to iteratively learn and exploit task-level competence in novel deployment environments. CPIP factorizes the competence-aware planning problem into two components. First, perception errors are learned in a model-free and location-agnostic setting via introspective perception prior to deployment in novel environments. Second, during actual deployments, the prediction of task-level failures is learned in a context-aware setting. Experiments in a simulation show that the proposed CPIP approach outperforms the frequentist baseline in multiple mobile robot tasks, and is further validated via real robot experiments in environments with perceptually challenging obstacles and terrain.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1158','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1158\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/RBWZBlra22.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/RBWZBlra22.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/RBWZBlra22.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1109\/LRA.2022.3145517\" title=\"Follow DOI:10.1109\/LRA.2022.3145517\" target=\"_blank\">doi:10.1109\/LRA.2022.3145517<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1158','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication 
tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Ostafew, Christopher;  Vagadia, Astha;  Baig, Najamuddin;  James, Viju;  Witwicki, Stefan;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1187','tp_links')\" style=\"cursor:pointer;\">Exception Situation Playback for Tele-Operators<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2022<\/span><span class=\"tp_pub_additional_note\">, (US Patent 11,215,987)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1187\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1187','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1187\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1187','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1187\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1187','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1187\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:OVBJWZpatent22a,<br \/>\r\ntitle = {Exception Situation Playback for Tele-Operators},<br \/>\r\nauthor = {Christopher Ostafew and Astha Vagadia and Najamuddin Baig and Viju James and Stefan Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US11215987B2\/en},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {Resolving an exception situation in autonomous driving includes receiving an assistance request to resolve the exception situation from an autonomous vehicle (AV); identifying a solution to the 
exception situation; forwarding the solution to a tele-operator; receiving a request for playback data from the tele-operator; receiving, from the AV, the playback data; and obtaining, from the tele-operator, a validated solution based on the tele-operator using the playback data. The playback data includes snapshots n_i of data related to autonomous driving stored at the AV at respective consecutive times t_i, for i = 1, ..., N.},<br \/>\r\nnote = {US Patent 11,215,987},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1187','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1187\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Resolving an exception situation in autonomous driving includes receiving an assistance request to resolve the exception situation from an autonomous vehicle (AV); identifying a solution to the exception situation; forwarding the solution to a tele-operator; receiving a request for playback data from the tele-operator; receiving, from the AV, the playback data; and obtaining, from the tele-operator, a validated solution based on the tele-operator using the playback data. The playback data includes snapshots n_i of data related to autonomous driving stored at the AV at respective consecutive times t_i, for i = 1, ...
, N.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1187','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1187\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US11215987B2\/en\" title=\"https:\/\/patents.google.com\/patent\/US11215987B2\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US11215987B2\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1187','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan J;  Zilberstein, Shlomo;  Bentahar, Omar;  Jamgochian, Arec<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1149','tp_links')\" style=\"cursor:pointer;\">Explainability of Autonomous Vehicle Decision Making<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2021<\/span><span class=\"tp_pub_additional_note\">, (US Patent App. 
16\/778,890)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1149\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1149','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1149\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1149','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1149\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1149','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1149\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:BWWZpatent21j,<br \/>\r\ntitle = {Explainability of Autonomous Vehicle Decision Making},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan J Witwicki and Shlomo Zilberstein and Omar Bentahar and Arec Jamgochian},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US20210240190A1\/en},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-08-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {A processor is configured to execute instructions stored in a memory to identify distinct vehicle operational scenarios; instantiate decision components, where each of the decision components is an instance of a respective decision problem, and where each of the decision components maintains a respective state describing the respective vehicle operational scenario; receive respective candidate vehicle control actions from the decision components; select an action from the respective candidate vehicle control actions, where the action is from a selected decision component of the decision components, and where the action is used to control the AV to traverse a portion of the vehicle transportation network; and generate an explanation as to why the action was selected, 
where the explanation includes respective descriptors of the action, the selected decision component, and a state factor of the respective state of the selected decision component.},<br \/>\r\nnote = {US Patent App. 16\/778,890},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1149','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1149\" style=\"display:none;\"><div class=\"tp_abstract_entry\">A processor is configured to execute instructions stored in a memory to identify distinct vehicle operational scenarios; instantiate decision components, where each of the decision components is an instance of a respective decision problem, and where each of the decision components maintains a respective state describing the respective vehicle operational scenario; receive respective candidate vehicle control actions from the decision components; select an action from the respective candidate vehicle control actions, where the action is from a selected decision component of the decision components, and where the action is used to control the AV to traverse a portion of the vehicle transportation network; and generate an explanation as to why the action was selected, where the explanation includes respective descriptors of the action, the selected decision component, and a state factor of the respective state of the selected decision component.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1149','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1149\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US20210240190A1\/en\" 
title=\"https:\/\/patents.google.com\/patent\/US20210240190A1\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US20210240190A1\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1149','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan J;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1152','tp_links')\" style=\"cursor:pointer;\">Reinforcement and Model Learning for Vehicle Operation<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2021<\/span><span class=\"tp_pub_additional_note\">, (US Patent 11,027,751)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1152\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1152','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1152\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1152','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1152\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1152','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1152\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:BWWZpatent21f,<br \/>\r\ntitle = {Reinforcement and Model Learning for Vehicle Operation},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan J Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US11027751B2\/en},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-06-01},<br \/>\r\npublisher 
= {Google Patents},<br \/>\r\nabstract = {Methods and vehicles may be configured to gain experience in the form of state-action and\/or action-observation histories for an operational scenario as the vehicle traverses a vehicle transportation network. The histories may be incorporated into a model in the form of learning to improve the model over time. The learning may be used to improve integration with human behavior. Driver feedback may be used in the learning examples to improve future performance and to integrate with human behavior. The learning may be used to create customized scenario solutions. The learning may be used to transfer a learned solution and apply the learned solution to a similar scenario.},<br \/>\r\nnote = {US Patent 11,027,751},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1152','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1152\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Methods and vehicles may be configured to gain experience in the form of state-action and\/or action-observation histories for an operational scenario as the vehicle traverses a vehicle transportation network. The histories may be incorporated into a model in the form of learning to improve the model over time. The learning may be used to improve integration with human behavior. Driver feedback may be used in the learning examples to improve future performance and to integrate with human behavior. The learning may be used to create customized scenario solutions. 
The learning may be used to transfer a learned solution and apply the learned solution to a similar scenario.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1152','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1152\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US11027751B2\/en\" title=\"https:\/\/patents.google.com\/patent\/US11027751B2\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US11027751B2\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1152','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan J;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1154','tp_links')\" style=\"cursor:pointer;\">Risk Aware Executor with Action Set Recommendations<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2021<\/span><span class=\"tp_pub_additional_note\">, (US Patent App. 
16\/696,235)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1154\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1154','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1154\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1154','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1154\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1154','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1154\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:BWWZpatent21d,<br \/>\r\ntitle = {Risk Aware Executor with Action Set Recommendations},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan J Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US20210157315A1\/en},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-05-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {A method for use in traversing a vehicle transportation network by an autonomous vehicle (AV) includes traversing, by the AV, the vehicle transportation network. Traversing the vehicle transportation network includes identifying a distinct vehicle operational scenario; instantiating a first decision component instance; receiving a first set of candidate vehicle control actions from the first decision component instance; selecting an action; and controlling the AV to traverse a portion of the vehicle transportation network based on the action. The first decision component instance is an instance of a first decision component modeling the distinct vehicle operational scenario.},<br \/>\r\nnote = {US Patent App. 
16\/696,235},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1154','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1154\" style=\"display:none;\"><div class=\"tp_abstract_entry\">A method for use in traversing a vehicle transportation network by an autonomous vehicle (AV) includes traversing, by the AV, the vehicle transportation network. Traversing the vehicle transportation network includes identifying a distinct vehicle operational scenario; instantiating a first decision component instance; receiving a first set of candidate vehicle control actions from the first decision component instance; selecting an action; and controlling the AV to traverse a portion of the vehicle transportation network based on the action. The first decision component instance is an instance of a first decision component modeling the distinct vehicle operational scenario.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1154','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1154\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US20210157315A1\/en\" title=\"https:\/\/patents.google.com\/patent\/US20210157315A1\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US20210157315A1\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1154','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Basich, Connor;  Wray, Kyle Hollins;  Witwicki, Stefan J;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" 
onclick=\"teachpress_pub_showhide('1139','tp_links')\" style=\"cursor:pointer;\">Introspective Competence Modeling for AV Decision Making<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2021<\/span><span class=\"tp_pub_additional_note\">, (US Patent App. 16\/668,584)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1139\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1139','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1139\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1139','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1139\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1139','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1139\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:BWWZpatent21c,<br \/>\r\ntitle = {Introspective Competence Modeling for AV Decision Making},<br \/>\r\nauthor = {Connor Basich and Kyle Hollins Wray and Stefan J Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US20210132606A1\/en},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {A first method includes detecting, based on sensor data, an environment state; selecting an action based on the environment state; determining an autonomy level associated with the environment state and the action; and performing the action according to the autonomy level. The autonomy level can be selected based at least on an autonomy model and a feedback model. 
A second method includes calculating, by solving an extended Stochastic Shortest Path (SSP) problem, a policy for solving a task. The policy can map environment states and autonomy levels to actions and autonomy levels. Calculating the policy can include generating plans that operate across multiple levels of autonomy.},<br \/>\r\nnote = {US Patent App. 16\/668,584},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1139','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1139\" style=\"display:none;\"><div class=\"tp_abstract_entry\">A first method includes detecting, based on sensor data, an environment state; selecting an action based on the environment state; determining an autonomy level associated with the environment state and the action; and performing the action according to the autonomy level. The autonomy level can be selected based at least on an autonomy model and a feedback model. A second method includes calculating, by solving an extended Stochastic Shortest Path (SSP) problem, a policy for solving a task. The policy can map environment states and autonomy levels to actions and autonomy levels. 
Calculating the policy can include generating plans that operate across multiple levels of autonomy.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1139','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1139\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US20210132606A1\/en\" title=\"https:\/\/patents.google.com\/patent\/US20210132606A1\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US20210132606A1\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1139','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Rabiee, Sadegh;  Basich, Connor;  Wray, Kyle Hollins;  Zilberstein, Shlomo;  Biswas, Joydeep<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1147','tp_links')\" style=\"cursor:pointer;\">Competence-Aware Path Planning via Introspective Perception<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">CoRR, <\/span><span class=\"tp_pub_additional_volume\">vol. 
abs\/2109.13974, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1147\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1147','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1147\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1147','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1147\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1147','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1147\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:SZarXiv21c,<br \/>\r\ntitle = {Competence-Aware Path Planning via Introspective Perception},<br \/>\r\nauthor = {Sadegh Rabiee and Connor Basich and Kyle Hollins Wray and Shlomo Zilberstein and Joydeep Biswas},<br \/>\r\nurl = {https:\/\/arxiv.org\/abs\/2109.13974},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\njournal = {CoRR},<br \/>\r\nvolume = {abs\/2109.13974},<br \/>\r\nabstract = {Robots deployed in the real world over extended periods of time need to reason about unexpected failures, learn to predict them, and to proactively take actions to avoid future failures. Existing approaches for competence-aware planning are either model-based, requiring explicit enumeration of known failure modes, or purely statistical, using state- and location-specific failure statistics to infer competence. We instead propose a structured model-free approach to competence-aware planning by reasoning about plan execution failures due to errors in perception, without requiring a-priori enumeration of failure modes or requiring location-specific failure statistics. 
We introduce competence-aware path planning via introspective perception (CPIP), a Bayesian framework to iteratively learn and exploit task-level competence in novel deployment environments. CPIP factorizes the competence-aware planning problem into two components. First, perception errors are learned in a model-free and location-agnostic setting via introspective perception prior to deployment in novel environments. Second, during actual deployments, the prediction of task-level failures is learned in a context-aware setting. Experiments in a simulation show that the proposed CPIP approach outperforms the frequentist baseline in multiple mobile robot tasks, and is further validated via real robot experiments in an environment with perceptually challenging obstacles and terrain.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1147','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1147\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Robots deployed in the real world over extended periods of time need to reason about unexpected failures, learn to predict them, and to proactively take actions to avoid future failures. Existing approaches for competence-aware planning are either model-based, requiring explicit enumeration of known failure modes, or purely statistical, using state- and location-specific failure statistics to infer competence. We instead propose a structured model-free approach to competence-aware planning by reasoning about plan execution failures due to errors in perception, without requiring a-priori enumeration of failure modes or requiring location-specific failure statistics. 
We introduce competence-aware path planning via introspective perception (CPIP), a Bayesian framework to iteratively learn and exploit task-level competence in novel deployment environments. CPIP factorizes the competence-aware planning problem into two components. First, perception errors are learned in a model-free and location-agnostic setting via introspective perception prior to deployment in novel environments. Second, during actual deployments, the prediction of task-level failures is learned in a context-aware setting. Experiments in a simulation show that the proposed CPIP approach outperforms the frequentist baseline in multiple mobile robot tasks, and is further validated via real robot experiments in an environment with perceptually challenging obstacles and terrain.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1147','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1147\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"ai ai-arxiv\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/arxiv.org\/abs\/2109.13974\" title=\"https:\/\/arxiv.org\/abs\/2109.13974\" target=\"_blank\">https:\/\/arxiv.org\/abs\/2109.13974<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1147','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan J;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1150','tp_links')\" style=\"cursor:pointer;\">Multiple Objective Explanation and Control Interface Design<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2021<\/span><span class=\"tp_pub_additional_note\">, (US Patent App. 
16\/727,038)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1150\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1150','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1150\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1150','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1150\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1150','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1150\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:BWWZpatent21h,<br \/>\r\ntitle = {Multiple Objective Explanation and Control Interface Design},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan J Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US20210200208A1\/en},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {A vehicle traversing a vehicle transportation network may use a scenario-specific operational control evaluation model instance. A multi-objective policy for the model is received, wherein the policy includes at least a first objective, a second objective, and a priority of the first objective relative to the second objective. A representation of the policy (e.g., the first objective, the second objective, and the priority) is generated using a user interface. Based on feedback to the user interface, a change to the multi-objective policy for the scenario-specific operational control evaluation model is received. The change is to the first objective, the second objective, the priority, or some combination thereof. 
Then, for determining a vehicle control action for traversing the vehicle transportation network, an updated multi-objective policy for the scenario-specific operational control evaluation model is generated to include the change to the policy.},<br \/>\r\nnote = {US Patent App. 16\/727,038},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1150','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1150\" style=\"display:none;\"><div class=\"tp_abstract_entry\">A vehicle traversing a vehicle transportation network may use a scenario-specific operational control evaluation model instance. A multi-objective policy for the model is received, wherein the policy includes at least a first objective, a second objective, and a priority of the first objective relative to the second objective. A representation of the policy (e.g., the first objective, the second objective, and the priority) is generated using a user interface. Based on feedback to the user interface, a change to the multi-objective policy for the scenario-specific operational control evaluation model is received. The change is to the first objective, the second objective, the priority, or some combination thereof. 
Then, for determining a vehicle control action for traversing the vehicle transportation network, an updated multi-objective policy for the scenario-specific operational control evaluation model is generated to include the change to the policy.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1150','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1150\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US20210200208A1\/en\" title=\"https:\/\/patents.google.com\/patent\/US20210200208A1\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US20210200208A1\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1150','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan J;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1140','tp_links')\" style=\"cursor:pointer;\">Shared Autonomous Vehicle Operational Management<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2021<\/span><span class=\"tp_pub_additional_note\">, (US Patent App. 
16\/955,531)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1140\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1140','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1140\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1140','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1140\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1140','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1140\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:WWZpatent21b,<br \/>\r\ntitle = {Shared Autonomous Vehicle Operational Management},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan J Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US20210078602A1\/en},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-00-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {Traversing, by an autonomous vehicle, a vehicle transportation network, may include identifying a distinct vehicle operational scenario, wherein traversing the vehicle transportation network includes traversing a portion of the vehicle transportation network that includes the distinct vehicle operational scenario, communicating shared scenario-specific operational control management data associated with the distinct vehicle operational scenario with an external shared scenario-specific operational control management system, operating a scenario-specific operational control evaluation module instance including an instance of a scenario-specific operational control evaluation model of the distinct vehicle operational scenario, and wherein operating the scenario-specific operational control evaluation module instance 
includes identifying a policy for the scenario-specific operational control evaluation model, receiving a candidate vehicle control action from the policy for the scenario-specific operational control evaluation model, and traversing a portion of the vehicle transportation network based on the candidate vehicle control action.},<br \/>\r\nnote = {US Patent App. 16\/955,531},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1140','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1140\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Traversing, by an autonomous vehicle, a vehicle transportation network, may include identifying a distinct vehicle operational scenario, wherein traversing the vehicle transportation network includes traversing a portion of the vehicle transportation network that includes the distinct vehicle operational scenario, communicating shared scenario-specific operational control management data associated with the distinct vehicle operational scenario with an external shared scenario-specific operational control management system, operating a scenario-specific operational control evaluation module instance including an instance of a scenario-specific operational control evaluation model of the distinct vehicle operational scenario, and wherein operating the scenario-specific operational control evaluation module instance includes identifying a policy for the scenario-specific operational control evaluation model, receiving a candidate vehicle control action from the policy for the scenario-specific operational control evaluation model, and traversing a portion of the vehicle transportation network based on the candidate vehicle control action.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" 
onclick=\"teachpress_pub_showhide('1140','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1140\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US20210078602A1\/en\" title=\"https:\/\/patents.google.com\/patent\/US20210078602A1\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US20210078602A1\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1140','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan J;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1141','tp_links')\" style=\"cursor:pointer;\">Centralized Shared Autonomous Vehicle Operational Management<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2021<\/span><span class=\"tp_pub_additional_note\">, (US Patent App. 
16\/955,531)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1141\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1141','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1141\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1141','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1141\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1141','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1141\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:WWZpatent21a,<br \/>\r\ntitle = {Centralized Shared Autonomous Vehicle Operational Management},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan J Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US20210009154A1\/en},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-00-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {Centralized shared scenario-specific operational control management includes receiving, at a centralized shared scenario-specific operational control management device, shared scenario-specific operational control management input data, from an autonomous vehicle, validating the shared scenario-specific operational control management input data, identifying a current distinct vehicle operational scenario based on the shared scenario-specific operational control management input data, generating shared scenario-specific operational control management output data based on the current distinct vehicle operational scenario, and transmitting the shared scenario-specific operational control management output data.},<br \/>\r\nnote = {US Patent App. 
16\/955,531},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1141','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1141\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Centralized shared scenario-specific operational control management includes receiving, at a centralized shared scenario-specific operational control management device, shared scenario-specific operational control management input data, from an autonomous vehicle, validating the shared scenario-specific operational control management input data, identifying a current distinct vehicle operational scenario based on the shared scenario-specific operational control management input data, generating shared scenario-specific operational control management output data based on the current distinct vehicle operational scenario, and transmitting the shared scenario-specific operational control management output data.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1141','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1141\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US20210009154A1\/en\" title=\"https:\/\/patents.google.com\/patent\/US20210009154A1\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US20210009154A1\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1141','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan J;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a 
class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1148','tp_links')\" style=\"cursor:pointer;\">Autonomous Vehicle Operation with Explicit Occlusion Reasoning<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2021<\/span><span class=\"tp_pub_additional_note\">, (US Patent App. 16\/753,601)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1148\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1148','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1148\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1148','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1148\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1148','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1148\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:BWWZpatent21k,<br \/>\r\ntitle = {Autonomous Vehicle Operation with Explicit Occlusion Reasoning},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan J Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US20210261123A1\/en},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-00-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {Autonomous vehicle operation with explicit occlusion reasoning may include traversing, by a vehicle, a vehicle transportation network. 
Traversing the vehicle transportation network can include receiving, from a sensor of the vehicle, sensor data for a portion of a vehicle operational environment, determining, using the sensor data, a visibility grid comprising coordinates forming an unobserved region within a defined distance from the vehicle, computing a probability of a presence of an external object within the unobserved region by comparing the visibility grid to a map (e.g., a high-definition map), and traversing a portion of the vehicle transportation network using the probability. An apparatus and a vehicle are also described.},<br \/>\r\nnote = {US Patent App. 16\/753,601},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1148','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1148\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous vehicle operation with explicit occlusion reasoning may include traversing, by a vehicle, a vehicle transportation network. Traversing the vehicle transportation network can include receiving, from a sensor of the vehicle, sensor data for a portion of a vehicle operational environment, determining, using the sensor data, a visibility grid comprising coordinates forming an unobserved region within a defined distance from the vehicle, computing a probability of a presence of an external object within the unobserved region by comparing the visibility grid to a map (e.g., a high-definition map), and traversing a portion of the vehicle transportation network using the probability. 
An apparatus and a vehicle are also described.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1148','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1148\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US20210261123A1\/en\" title=\"https:\/\/patents.google.com\/patent\/US20210261123A1\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US20210261123A1\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1148','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan J;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1151','tp_links')\" style=\"cursor:pointer;\">Learning Safety and Human-Centered Constraints in Autonomous Vehicles<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2021<\/span><span class=\"tp_pub_additional_note\">, (US Patent App. 
16\/724,635)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1151\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1151','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1151\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1151','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1151\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1151','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1151\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:BWWZpatent21g,<br \/>\r\ntitle = {Learning Safety and Human-Centered Constraints in Autonomous Vehicles},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan J Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US20210132606A1\/en},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-00-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {Traversing a vehicle transportation network includes operating a scenario-specific operational control evaluation module instance. The scenario-specific operational control evaluation module instance includes an instance of a scenario-specific operational control evaluation model of a distinct vehicle operational scenario. Operating the scenario-specific operational control evaluation module instance includes identifying a multi-objective policy for the scenario-specific operational control evaluation model. The multi-objective policy may include a relationship between at least two objectives. Traversing the vehicle transportation network includes receiving a candidate vehicle control action associated with each of the at least two objectives. 
Traversing the vehicle transportation network includes selecting a vehicle control action based on a buffer value. Traversing the vehicle transportation network includes performing the selected vehicle control action, determining a preference indicator for each objective, and updating the multi-objective policy.},<br \/>\r\nnote = {US Patent App. 16\/724,635},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1151','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1151\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Traversing a vehicle transportation network includes operating a scenario-specific operational control evaluation module instance. The scenario-specific operational control evaluation module instance includes an instance of a scenario-specific operational control evaluation model of a distinct vehicle operational scenario. Operating the scenario-specific operational control evaluation module instance includes identifying a multi-objective policy for the scenario-specific operational control evaluation model. The multi-objective policy may include a relationship between at least two objectives. Traversing the vehicle transportation network includes receiving a candidate vehicle control action associated with each of the at least two objectives. Traversing the vehicle transportation network includes selecting a vehicle control action based on a buffer value. 
Traversing the vehicle transportation network includes performing the selected vehicle control action, determining a preference indicator for each objective, and updating the multi-objective policy.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1151','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1151\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US20210132606A1\/en\" title=\"https:\/\/patents.google.com\/patent\/US20210132606A1\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US20210132606A1\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1151','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan J;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1153','tp_links')\" style=\"cursor:pointer;\">Objective-Based Reasoning in Autonomous Vehicle Decision-Making<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2021<\/span><span class=\"tp_pub_additional_note\">, (US Patent App. 
16\/695,613)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1153\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1153','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1153\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1153','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1153\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1153','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1153\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:BWWZpatent21e,<br \/>\r\ntitle = {Objective-Based Reasoning in Autonomous Vehicle Decision-Making},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan J Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US20210157314A1\/en},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-00-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {Traversing a vehicle transportation network includes operating a scenario-specific operational control evaluation module instance. The scenario-specific operational control evaluation module instance includes an instance of a scenario-specific operational control evaluation model of a distinct vehicle operational scenario. Operating the scenario-specific operational control evaluation module instance includes identifying a multi-objective policy for the scenario-specific operational control evaluation model. The multi-objective policy may include a relationship between at least two objectives. Traversing the vehicle transportation network includes receiving a candidate vehicle control action associated with each of the at least two objectives. 
Traversing the vehicle transportation network includes selecting a vehicle control action based on a buffer value. Traversing the vehicle transportation network includes traversing a portion of the vehicle transportation network in accordance with the selected vehicle control action.},<br \/>\r\nnote = {US Patent App. 16\/695,613},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1153','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1153\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Traversing a vehicle transportation network includes operating a scenario-specific operational control evaluation module instance. The scenario-specific operational control evaluation module instance includes an instance of a scenario-specific operational control evaluation model of a distinct vehicle operational scenario. Operating the scenario-specific operational control evaluation module instance includes identifying a multi-objective policy for the scenario-specific operational control evaluation model. The multi-objective policy may include a relationship between at least two objectives. Traversing the vehicle transportation network includes receiving a candidate vehicle control action associated with each of the at least two objectives. Traversing the vehicle transportation network includes selecting a vehicle control action based on a buffer value. 
Traversing the vehicle transportation network includes traversing a portion of the vehicle transportation network in accordance with the selected vehicle control action.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1153','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1153\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US20210157314A1\/en\" title=\"https:\/\/patents.google.com\/patent\/US20210157314A1\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US20210157314A1\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1153','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan J;  Zilberstein, Shlomo;  Pedersen, Liam<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1128','tp_links')\" style=\"cursor:pointer;\">Autonomous Vehicle Operational Management Control<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2020<\/span><span class=\"tp_pub_additional_note\">, (US Patent 10,654,476)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1128\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1128','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1128\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1128','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1128\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('1128','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1128\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:WWZCpatent20e,<br \/>\r\ntitle = {Autonomous Vehicle Operational Management Control},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan J Witwicki and Shlomo Zilberstein and Liam Pedersen},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US10654476B2\/en},<br \/>\r\nyear  = {2020},<br \/>\r\ndate = {2020-05-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {Autonomous vehicle operational management may include traversing, by an autonomous vehicle, a vehicle transportation network. Traversing the vehicle transportation network may include receiving, from a sensor of the autonomous vehicle, sensor information corresponding to an external object within a defined distance of the autonomous vehicle, identifying a distinct vehicle operational scenario in response to receiving the sensor information, instantiating a scenario-specific operational control evaluation module instance, wherein the scenario-specific operational control evaluation module instance is an instance of a scenario-specific operational control evaluation module modeling the distinct vehicle operational scenario, receiving a candidate vehicle control action from the scenario-specific operational control evaluation module instance, and traversing a portion of the vehicle transportation network based on the candidate vehicle control action.},<br \/>\r\nnote = {US Patent 10,654,476},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1128','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1128\" style=\"display:none;\"><div 
class=\"tp_abstract_entry\">Autonomous vehicle operational management may include traversing, by an autonomous vehicle, a vehicle transportation network. Traversing the vehicle transportation network may include receiving, from a sensor of the autonomous vehicle, sensor information corresponding to an external object within a defined distance of the autonomous vehicle, identifying a distinct vehicle operational scenario in response to receiving the sensor information, instantiating a scenario-specific operational control evaluation module instance, wherein the scenario-specific operational control evaluation module instance is an instance of a scenario-specific operational control evaluation module modeling the distinct vehicle operational scenario, receiving a candidate vehicle control action from the scenario-specific operational control evaluation module instance, and traversing a portion of the vehicle transportation network based on the candidate vehicle control action.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1128','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1128\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US10654476B2\/en\" title=\"https:\/\/patents.google.com\/patent\/US10654476B2\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US10654476B2\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1128','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan J;  Zilberstein, Shlomo;  Pedersen, Liam<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1113','tp_links')\" 
style=\"cursor:pointer;\">Autonomous Vehicle Operational Management Including Operating A Partially Observable Markov Decision Process Model Instance<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2020<\/span><span class=\"tp_pub_additional_note\">, (US Patent App. 16\/473,148)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1113\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1113','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1113\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1113','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1113\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1113','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1113\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:WWZPpatent20b,<br \/>\r\ntitle = {Autonomous Vehicle Operational Management Including Operating A Partially Observable Markov Decision Process Model Instance},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan J Witwicki and Shlomo Zilberstein and Liam Pedersen},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US20200097003A1\/en},<br \/>\r\nyear  = {2020},<br \/>\r\ndate = {2020-03-26},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {Autonomous vehicle operational management may include traversing, by an autonomous vehicle, a vehicle transportation network. 
Traversing the vehicle transportation network may include operating a scenario-specific operational control evaluation module instance, wherein the scenario-specific operational control evaluation module instance is an instance of a scenario-specific operational control evaluation module, wherein the scenario-specific operational control evaluation module implements a partially observable Markov decision process. Traversing the vehicle transportation network may include receiving a candidate vehicle control action from the scenario-specific operational control evaluation module instance, and traversing a portion of the vehicle transportation network based on the candidate vehicle control action.},<br \/>\r\nnote = {US Patent App. 16\/473,148},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1113','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1113\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous vehicle operational management may include traversing, by an autonomous vehicle, a vehicle transportation network. Traversing the vehicle transportation network may include operating a scenario-specific operational control evaluation module instance, wherein the scenario-specific operational control evaluation module instance is an instance of a scenario-specific operational control evaluation module, wherein the scenario-specific operational control evaluation module implements a partially observable Markov decision process. 
Traversing the vehicle transportation network may include receiving a candidate vehicle control action from the scenario-specific operational control evaluation module instance, and traversing a portion of the vehicle transportation network based on the candidate vehicle control action.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1113','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1113\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US20200097003A1\/en\" title=\"https:\/\/patents.google.com\/patent\/US20200097003A1\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US20200097003A1\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1113','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan J;  Zilberstein, Shlomo;  Pedersen, Liam<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1114','tp_links')\" style=\"cursor:pointer;\">Autonomous Vehicle Operational Management Blocking Monitoring<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2020<\/span><span class=\"tp_pub_additional_note\">, (US Patent App. 
16\/473,037)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1114\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1114','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1114\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1114','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1114\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1114','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1114\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:WWZPpatent20c,<br \/>\r\ntitle = {Autonomous Vehicle Operational Management Blocking Monitoring},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan J Witwicki and Shlomo Zilberstein and Liam Pedersen},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US20200098269A1\/en},<br \/>\r\nyear  = {2020},<br \/>\r\ndate = {2020-03-26},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {Autonomous vehicle operational management including blocking monitoring may include traversing, by an autonomous vehicle, a vehicle transportation network. Traversing the vehicle transportation network may include operating a blocking monitor instance, which may include identifying operational environment information including information corresponding to a first external object within a defined distance of the autonomous vehicle, determining a first area of the vehicle transportation network based on a current geospatial location of the autonomous vehicle in the vehicle transportation network and an identified route for the autonomous vehicle, and determining a probability of availability for the first area based on the operational environment information. 
Traversing the vehicle transportation network may include traversing a portion of the vehicle transportation network based on the probability of availability.},<br \/>\r\nnote = {US Patent App. 16\/473,037},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1114','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1114\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous vehicle operational management including blocking monitoring may include traversing, by an autonomous vehicle, a vehicle transportation network. Traversing the vehicle transportation network may include operating a blocking monitor instance, which may include identifying operational environment information including information corresponding to a first external object within a defined distance of the autonomous vehicle, determining a first area of the vehicle transportation network based on a current geospatial location of the autonomous vehicle in the vehicle transportation network and an identified route for the autonomous vehicle, and determining a probability of availability for the first area based on the operational environment information. 
Traversing the vehicle transportation network may include traversing a portion of the vehicle transportation network based on the probability of availability.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1114','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1114\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US20200098269A1\/en\" title=\"https:\/\/patents.google.com\/patent\/US20200098269A1\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US20200098269A1\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1114','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan J;  Zilberstein, Shlomo;  Cefkin, Melissa<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1112','tp_links')\" style=\"cursor:pointer;\">Orientation-Adjust Actions for Autonomous Vehicle Operational Management<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2020<\/span><span class=\"tp_pub_additional_note\">, (US Patent App. 
16\/023,710)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1112\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1112','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1112\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1112','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1112\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1112','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1112\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:WWZCpatent20a,<br \/>\r\ntitle = {Orientation-Adjust Actions for Autonomous Vehicle Operational Management},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan J Witwicki and Shlomo Zilberstein and Melissa Cefkin},<br \/>\r\nurl = {http:\/\/www.freepatentsonline.com\/y2020\/0005645.html},<br \/>\r\nyear  = {2020},<br \/>\r\ndate = {2020-01-02},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {Traversing, by an autonomous vehicle, a vehicle transportation network, may include identifying a policy for a scenario-specific operational control evaluation model of a distinct vehicle operational scenario, receiving a candidate vehicle control action from the policy, wherein, in response to a determination that an uncertainty value for the distinct vehicle operational scenario exceeds a defined uncertainty threshold, the candidate vehicle control action is an orientation-adjust vehicle control action, and traversing a portion of the vehicle transportation network in accordance with the candidate vehicle control action, wherein the portion of the vehicle transportation network includes the distinct vehicle operational scenario.},<br \/>\r\nnote = {US Patent App. 
16\/023,710},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1112','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1112\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Traversing, by an autonomous vehicle, a vehicle transportation network, may include identifying a policy for a scenario-specific operational control evaluation model of a distinct vehicle operational scenario, receiving a candidate vehicle control action from the policy, wherein, in response to a determination that an uncertainty value for the distinct vehicle operational scenario exceeds a defined uncertainty threshold, the candidate vehicle control action is an orientation-adjust vehicle control action, and traversing a portion of the vehicle transportation network in accordance with the candidate vehicle control action, wherein the portion of the vehicle transportation network includes the distinct vehicle operational scenario.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1112','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1112\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/www.freepatentsonline.com\/y2020\/0005645.html\" title=\"http:\/\/www.freepatentsonline.com\/y2020\/0005645.html\" target=\"_blank\">http:\/\/www.freepatentsonline.com\/y2020\/0005645.html<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1112','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Basich, Connor;  Svegliato, Justin;  Wray, Kyle Hollins;  Witwicki, Stefan J;  Biswas, 
Joydeep;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1117','tp_links')\" style=\"cursor:pointer;\">Learning to Optimize Autonomy in Competence-Aware Systems<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Auckland, New Zealand, <\/span><span class=\"tp_pub_additional_year\">2020<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1117\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1117','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1117\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1117','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1117\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1117','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1117\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:BSWWBZaamas20,<br \/>\r\ntitle = {Learning to Optimize Autonomy in Competence-Aware Systems},<br \/>\r\nauthor = {Connor Basich and Justin Svegliato and Kyle Hollins Wray and Stefan J Witwicki and Joydeep Biswas and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSWWBZaamas20.pdf},<br \/>\r\nyear  = {2020},<br \/>\r\ndate = {2020-01-01},<br \/>\r\nbooktitle = {Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS)},<br \/>\r\naddress = {Auckland, New Zealand},<br \/>\r\nabstract = {Interest in semi-autonomous 
systems (SAS) is growing rapidly as a paradigm to deploy autonomous systems in domains that require occasional reliance on humans. This paradigm allows service robots or autonomous vehicles to operate at varying levels of autonomy and offer safety in situations that require human judgment. We propose an introspective model of autonomy that is learned and updated online through experience and dictates the extent to which the agent can act autonomously in any given situation. We define a competence-aware system (CAS) that explicitly models its own proficiency at different levels of autonomy and the available human feedback. A CAS learns to adjust its level of autonomy based on experience to maximize overall efficiency, factoring in the cost of human assistance. We analyze the convergence properties of CAS and provide experimental results for robot delivery and autonomous driving domains that demonstrate the benefits of the approach.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1117','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1117\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Interest in semi-autonomous systems (SAS) is growing rapidly as a paradigm to deploy autonomous systems in domains that require occasional reliance on humans. This paradigm allows service robots or autonomous vehicles to operate at varying levels of autonomy and offer safety in situations that require human judgment. We propose an introspective model of autonomy that is learned and updated online through experience and dictates the extent to which the agent can act autonomously in any given situation. We define a competence-aware system (CAS) that explicitly models its own proficiency at different levels of autonomy and the available human feedback. 
A CAS learns to adjust its level of autonomy based on experience to maximize overall efficiency, factoring in the cost of human assistance. We analyze the convergence properties of CAS and provide experimental results for robot delivery and autonomous driving domains that demonstrate the benefits of the approach.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1117','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1117\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSWWBZaamas20.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSWWBZaamas20.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSWWBZaamas20.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1117','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Miura, Shuwa;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1118','tp_links')\" style=\"cursor:pointer;\">Maximizing Plan Legibility in Stochastic Environments (Extended Abstract)<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Auckland, New Zealand, <\/span><span class=\"tp_pub_additional_year\">2020<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1118\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1118','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> 
| <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1118\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1118','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1118\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1118','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1118\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MZaamas20,<br \/>\r\ntitle = {Maximizing Plan Legibility in Stochastic Environments (Extended Abstract)},<br \/>\r\nauthor = {Shuwa Miura and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas20.pdf},<br \/>\r\nyear  = {2020},<br \/>\r\ndate = {2020-01-01},<br \/>\r\nbooktitle = {Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS)},<br \/>\r\naddress = {Auckland, New Zealand},<br \/>\r\nabstract = {Legible behavior allows an observing agent to infer the intention of an observed agent. Producing legible behavior is crucial for successful multi-agent interaction in many domains. We introduce techniques for legible planning in stochastic environments. Maximizing legibility, however, presents a complex trade-off with maximizing the underlying rewards. Hence, we propose a method to balance the trade-off. 
In our experiments, we demonstrate that maximizing legibility results in unambiguous behaviors.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1118','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1118\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Legible behavior allows an observing agent to infer the intention of an observed agent. Producing legible behavior is crucial for successful multi-agent interaction in many domains. We introduce techniques for legible planning in stochastic environments. Maximizing legibility, however, presents a complex trade-off with maximizing the underlying rewards. Hence, we propose a method to balance the trade-off. In our experiments, we demonstrate that maximizing legibility results in unambiguous behaviors.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1118','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1118\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas20.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas20.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas20.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1118','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Kamar, Ece;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1119','tp_links')\" style=\"cursor:pointer;\">Mitigating the Negative Side 
Effects of Reasoning with Imperfect Models: A Multi-Objective Approach (Extended Abstract)<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Auckland, New Zealand, <\/span><span class=\"tp_pub_additional_year\">2020<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1119\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1119','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1119\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1119','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1119\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1119','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1119\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SKZaamas20,<br \/>\r\ntitle = {Mitigating the Negative Side Effects of Reasoning with Imperfect Models: A Multi-Objective Approach (Extended Abstract)},<br \/>\r\nauthor = {Sandhya Saisubramanian and Ece Kamar and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SKZaamas20.pdf},<br \/>\r\nyear  = {2020},<br \/>\r\ndate = {2020-01-01},<br \/>\r\nbooktitle = {Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS)},<br \/>\r\naddress = {Auckland, New Zealand},<br \/>\r\nabstract = {Agents often operate using imperfect models of the environment that ignore certain aspects of the real world. 
Reasoning with such models may lead to negative side effects (NSE) when satisfying the primary objective of the available model; such side effects are inherently difficult to identify at design time. We examine how various forms of feedback can be used to learn a penalty function associated with NSE during execution. We formulate the problem of mitigating the impact of NSE as a multi-objective Markov decision process with lexicographic reward preferences and slack. Empirical evaluation of our approach on three domains shows that the proposed framework can successfully mitigate NSE.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1119','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1119\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Agents often operate using imperfect models of the environment that ignore certain aspects of the real world. Reasoning with such models may lead to negative side effects (NSE) when satisfying the primary objective of the available model; such side effects are inherently difficult to identify at design time. We examine how various forms of feedback can be used to learn a penalty function associated with NSE during execution. We formulate the problem of mitigating the impact of NSE as a multi-objective Markov decision process with lexicographic reward preferences and slack. 
Empirical evaluation of our approach on three domains shows that the proposed framework can successfully mitigate NSE.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1119','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1119\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SKZaamas20.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SKZaamas20.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SKZaamas20.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1119','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Basich, Connor;  Svegliato, Justin;  Wray, Kyle Hollins;  Witwicki, Stefan J;  Biswas, Joydeep;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1124','tp_links')\" style=\"cursor:pointer;\">Learning to Optimize Autonomy in Competence-Aware Systems<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">CoRR, <\/span><span class=\"tp_pub_additional_volume\">vol. 
abs\/2003.07745, <\/span><span class=\"tp_pub_additional_year\">2020<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1124\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1124','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1124\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1124','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1124\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1124','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1124\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:BSWWBZarXiv20a,<br \/>\r\ntitle = {Learning to Optimize Autonomy in Competence-Aware Systems},<br \/>\r\nauthor = {Connor Basich and Justin Svegliato and Kyle Hollins Wray and Stefan J Witwicki and Joydeep Biswas and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/arxiv.org\/abs\/2003.07745},<br \/>\r\nyear  = {2020},<br \/>\r\ndate = {2020-01-01},<br \/>\r\njournal = {CoRR},<br \/>\r\nvolume = {abs\/2003.07745},<br \/>\r\nabstract = {Interest in semi-autonomous systems (SAS) is growing rapidly as a paradigm to deploy autonomous systems in domains that require occasional reliance on humans. This paradigm allows service robots or autonomous vehicles to operate at varying levels of autonomy and offer safety in situations that require human judgment. We propose an introspective model of autonomy that is learned and updated online through experience and dictates the extent to which the agent can act autonomously in any given situation. We define a competence-aware system (CAS) that explicitly models its own proficiency at different levels of autonomy and the available human feedback. 
A CAS learns to adjust its level of autonomy based on experience to maximize overall efficiency, factoring in the cost of human assistance. We analyze the convergence properties of CAS and provide experimental results for robot delivery and autonomous driving domains that demonstrate the benefits of the approach.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1124','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1124\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Interest in semi-autonomous systems (SAS) is growing rapidly as a paradigm to deploy autonomous systems in domains that require occasional reliance on humans. This paradigm allows service robots or autonomous vehicles to operate at varying levels of autonomy and offer safety in situations that require human judgment. We propose an introspective model of autonomy that is learned and updated online through experience and dictates the extent to which the agent can act autonomously in any given situation. We define a competence-aware system (CAS) that explicitly models its own proficiency at different levels of autonomy and the available human feedback. A CAS learns to adjust its level of autonomy based on experience to maximize overall efficiency, factoring in the cost of human assistance. 
We analyze the convergence properties of CAS and provide experimental results for robot delivery and autonomous driving domains that demonstrate the benefits of the approach.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1124','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1124\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"ai ai-arxiv\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/arxiv.org\/abs\/2003.07745\" title=\"https:\/\/arxiv.org\/abs\/2003.07745\" target=\"_blank\">https:\/\/arxiv.org\/abs\/2003.07745<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1124','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Parr, Shane;  Khatri, Ishan;  Svegliato, Justin;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1125','tp_links')\" style=\"cursor:pointer;\">Agent-Aware State Estimation: Effective Traffic Light Classification for Autonomous Vehicles<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">ICRA 2020 Workshop on Sensing, Estimating and Understanding the Dynamic World, <\/span><span class=\"tp_pub_additional_year\">2020<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1125\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1125','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1125\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1125','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a 
id=\"tp_bibtex_sh_1125\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1125','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1125\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:PKSZicra20ws,<br \/>\r\ntitle = {Agent-Aware State Estimation: Effective Traffic Light Classification for Autonomous Vehicles},<br \/>\r\nauthor = {Shane Parr and Ishan Khatri and Justin Svegliato and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PKSZicra20ws.pdf},<br \/>\r\nyear  = {2020},<br \/>\r\ndate = {2020-01-01},<br \/>\r\nbooktitle = {ICRA 2020 Workshop on Sensing, Estimating and Understanding the Dynamic World},<br \/>\r\nabstract = {Autonomous systems often operate in environments where the behavior of all agents is mostly governed by the perception of a specific feature of the environment. When an autonomous system cannot recover this feature, there can be disastrous consequences. We introduce a novel framework for agent-aware state estimation that exploits the dependency of all agents' behavior on a feature to better indirectly observe the feature. To allow for fast and accurate inference, we provide a mapping of our framework to a dynamic Bayesian network and show that speed of inference scales favorably with the number of agents in the environment. We then apply our approach to traffic light classification, focusing on instances where direct vision of the light may be obstructed by glare, heavy rain, vehicles, or other environmental factors. 
Finally, we show that agent-aware state estimation outperforms prevailing methods that only use direct image data of the traffic light on a real-world autonomous vehicle data set of challenging scenarios.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1125','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1125\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous systems often operate in environments where the behavior of all agents is mostly governed by the perception of a specific feature of the environment. When an autonomous system cannot recover this feature, there can be disastrous consequences. We introduce a novel framework for agent-aware state estimation that exploits the dependency of all agents' behavior on a feature to better indirectly observe the feature. To allow for fast and accurate inference, we provide a mapping of our framework to a dynamic Bayesian network and show that speed of inference scales favorably with the number of agents in the environment. We then apply our approach to traffic light classification, focusing on instances where direct vision of the light may be obstructed by glare, heavy rain, vehicles, or other environmental factors. 
Finally, we show that agent-aware state estimation outperforms prevailing methods that only use direct image data of the traffic light on a real-world autonomous vehicle data set of challenging scenarios.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1125','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1125\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PKSZicra20ws.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PKSZicra20ws.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/PKSZicra20ws.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1125','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Svegliato, Justin;  Witwicki, Stefan J;  Wray, Kyle Hollins;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1127','tp_links')\" style=\"cursor:pointer;\">Introspective Autonomous Vehicle Operational Management<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2020<\/span><span class=\"tp_pub_additional_note\">, (US Patent 10,649,453)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1127\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1127','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1127\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1127','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1127\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1127','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1127\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:SWWZpatent20d,<br \/>\r\ntitle = {Introspective Autonomous Vehicle Operational Management},<br \/>\r\nauthor = {Justin Svegliato and Stefan J Witwicki and Kyle Hollins Wray and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/patents.google.com\/patent\/US10649453B1\/en},<br \/>\r\nyear  = {2020},<br \/>\r\ndate = {2020-01-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {Introspective autonomous vehicle operational management includes operating an introspective autonomous vehicle operational management controller including a policy for a model of an introspective autonomous vehicle operational management domain. Operating the controller includes, in response to a determination that a current belief state of the policy indicates an exceptional condition, identifying an exception handler for controlling the autonomous vehicle. Operating the controller includes, in response to a determination that the current belief state indicates an unexceptional condition, identifying a primary handler as the active handler. 
Operating the controller includes controlling the autonomous vehicle to traverse a current portion of the vehicle transportation network in accordance with the active handler, receiving an indicator output by the active handler, generating an updated belief state based on the indicator, and controlling the autonomous vehicle to traverse a subsequent portion of the vehicle transportation network based on the updated belief state.},<br \/>\r\nnote = {US Patent 10,649,453},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1127','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1127\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Introspective autonomous vehicle operational management includes operating an introspective autonomous vehicle operational management controller including a policy for a model of an introspective autonomous vehicle operational management domain. Operating the controller includes, in response to a determination that a current belief state of the policy indicates an exceptional condition, identifying an exception handler for controlling the autonomous vehicle. Operating the controller includes, in response to a determination that the current belief state indicates an unexceptional condition, identifying a primary handler as the active handler. 
Operating the controller includes controlling the autonomous vehicle to traverse a current portion of the vehicle transportation network in accordance with the active handler, receiving an indicator output by the active handler, generating an updated belief state based on the indicator, and controlling the autonomous vehicle to traverse a subsequent portion of the vehicle transportation network based on the updated belief state.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1127','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1127\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patents.google.com\/patent\/US10649453B1\/en\" title=\"https:\/\/patents.google.com\/patent\/US10649453B1\/en\" target=\"_blank\">https:\/\/patents.google.com\/patent\/US10649453B1\/en<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1127','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('860','tp_links')\" style=\"cursor:pointer;\">Policy Networks: A Framework for Scalable Integration of Multiple Decision-Making Models (Extended Abstract)<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Montreal, Quebec, CA, <\/span><span class=\"tp_pub_additional_year\">2019<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_860\" 
class=\"tp_show\" onclick=\"teachpress_pub_showhide('860','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_860\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('860','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_860\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('860','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_860\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WZaamas19,<br \/>\r\ntitle = {Policy Networks: A Framework for Scalable Integration of Multiple Decision-Making Models (Extended Abstract)},<br \/>\r\nauthor = {Kyle Hollins Wray and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZaamas19.pdf},<br \/>\r\nyear  = {2019},<br \/>\r\ndate = {2019-01-01},<br \/>\r\nbooktitle = {Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS)},<br \/>\r\npages = {2270--2272},<br \/>\r\naddress = {Montreal, Quebec, CA},<br \/>\r\nabstract = {Policy networks are graphical models that integrate decision-making models. They allow for multiple Markov decision processes (MDPs) that describe distinct focused aspects of a domain to work in harmony to solve a large-scale problem. 
This paper defines policy networks and shows how they are able to naturally generalize many previous models, such as options and constrained MDPs.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('860','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_860\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Policy networks are graphical models that integrate decision-making models. They allow for multiple Markov decision processes (MDPs) that describe distinct focused aspects of a domain to work in harmony to solve a large-scale problem. This paper defines policy networks and shows how they are able to naturally generalize many previous models, such as options and constrained MDPs.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('860','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_860\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZaamas19.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZaamas19.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZaamas19.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('860','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Svegliato, Justin;  Wray, Kyle Hollins;  Witwicki, Stefan J;  Biswas, Joydeep;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1108','tp_links')\" style=\"cursor:pointer;\">Belief Space Metareasoning for Exception Recovery<\/a> <span 
class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), <\/span><span class=\"tp_pub_additional_address\">Macau, China, <\/span><span class=\"tp_pub_additional_year\">2019<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1108\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1108','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1108\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1108','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1108\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1108','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1108\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SWWBZiros19,<br \/>\r\ntitle = {Belief Space Metareasoning for Exception Recovery},<br \/>\r\nauthor = {Justin Svegliato and Kyle Hollins Wray and Stefan J Witwicki and Joydeep Biswas and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWWBZiros19.pdf},<br \/>\r\nyear  = {2019},<br \/>\r\ndate = {2019-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)},<br \/>\r\naddress = {Macau, China},<br \/>\r\nabstract = {Due to the complexity of the real world, autonomous systems use decision-making models that rely on simplifying assumptions to make them computationally tractable and feasible to design. 
However, since these limited representations cannot fully capture the domain of operation, an autonomous system may encounter unanticipated scenarios that cannot be resolved effectively. We first formally introduce an introspective autonomous system that uses belief space metareasoning to recover from exceptions by interleaving a main decision process with a set of exception handlers. We then apply introspective autonomy to autonomous driving. Finally, we demonstrate that an introspective autonomous vehicle is effective in simulation and on a fully operational prototype.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1108','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1108\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Due to the complexity of the real world, autonomous systems use decision-making models that rely on simplifying assumptions to make them computationally tractable and feasible to design. However, since these limited representations cannot fully capture the domain of operation, an autonomous system may encounter unanticipated scenarios that cannot be resolved effectively. We first formally introduce an introspective autonomous system that uses belief space metareasoning to recover from exceptions by interleaving a main decision process with a set of exception handlers. We then apply introspective autonomy to autonomous driving. 
Finally, we demonstrate that an introspective autonomous vehicle is effective in simulation and on a fully operational prototype.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1108','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1108\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWWBZiros19.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWWBZiros19.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SWWBZiros19.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1108','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_misc\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan J;  Zilberstein, Shlomo;  Pedersen, Liam<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1110','tp_links')\" style=\"cursor:pointer;\">Autonomous Vehicle Operational Management Control<\/a> <span class=\"tp_pub_type tp_  misc\">Miscellaneous<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_year\">2019<\/span><span class=\"tp_pub_additional_note\">, (US Patent App. 
16\/472,573)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1110\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1110','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1110\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1110','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1110\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1110','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1110\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@misc{SZ:WWZPpatent19a,<br \/>\r\ntitle = {Autonomous Vehicle Operational Management Control},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan J Witwicki and Shlomo Zilberstein and Liam Pedersen},<br \/>\r\nurl = {https:\/\/patentimages.storage.googleapis.com\/60\/6e\/6e\/91882eea0fc0e7\/US20190329771A1.pdf},<br \/>\r\nyear  = {2019},<br \/>\r\ndate = {2019-01-01},<br \/>\r\npublisher = {Google Patents},<br \/>\r\nabstract = {Autonomous vehicle operational management may include traversing, by an autonomous vehicle, a vehicle transportation network. 
Traversing the vehicle transportation network may include receiving, from a sensor of the autonomous vehicle, sensor information corresponding to an external object within a defined distance of the autonomous vehicle, identifying a distinct vehicle operational scenario in response to receiving the sensor information, instantiating a scenario-specific operational control evaluation module instance, wherein the scenario-specific operational control evaluation module instance is an instance of a scenario-specific operational control evaluation module modeling the distinct vehicle operational scenario, receiving a candidate vehicle control action from the scenario-specific operational control evaluation module instance, and traversing a portion of the vehicle transportation network based on the candidate vehicle control action.},<br \/>\r\nnote = {US Patent App. 16\/472,573},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {misc}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1110','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1110\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous vehicle operational management may include traversing, by an autonomous vehicle, a vehicle transportation network. 
Traversing the vehicle transportation network may include receiving, from a sensor of the autonomous vehicle, sensor information corresponding to an external object within a defined distance of the autonomous vehicle, identifying a distinct vehicle operational scenario in response to receiving the sensor information, instantiating a scenario-specific operational control evaluation module instance, wherein the scenario-specific operational control evaluation module instance is an instance of a scenario-specific operational control evaluation module modeling the distinct vehicle operational scenario, receiving a candidate vehicle control action from the scenario-specific operational control evaluation module instance, and traversing a portion of the vehicle transportation network based on the candidate vehicle control action.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1110','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1110\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/patentimages.storage.googleapis.com\/60\/6e\/6e\/91882eea0fc0e7\/US20190329771A1.pdf\" title=\"https:\/\/patentimages.storage.googleapis.com\/60\/6e\/6e\/91882eea0fc0e7\/US2019032977[...]\" target=\"_blank\">https:\/\/patentimages.storage.googleapis.com\/60\/6e\/6e\/91882eea0fc0e7\/US2019032977[...]<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1110','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_proceedings\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Shaw, Julie A.;  Stone, Peter;  Witwicki, Stefan J.;  Zilberstein, Shlomo (Ed.)<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1099','tp_links')\" style=\"cursor:pointer;\">Proceedings 
of the AAAI Fall Symposium on Reasoning and Learning in Real-World Systems for Long-Term Autonomy<\/a> <span class=\"tp_pub_type tp_  proceedings\">Proceedings<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_address\">Arlington, VA, <\/span><span class=\"tp_pub_additional_year\">2018<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1099\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1099','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1099\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1099','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1099\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1099','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1099\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@proceedings{SZ:WSSWZlta18,<br \/>\r\ntitle = {Proceedings of the AAAI Fall Symposium on Reasoning and Learning in Real-World Systems for Long-Term Autonomy},<br \/>\r\neditor = {Kyle Hollins Wray and Julie A. Shaw and Peter Stone and Stefan J. Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/web.cs.umass.edu\/publication\/details.php?id=2462},<br \/>\r\nyear  = {2018},<br \/>\r\ndate = {2018-01-01},<br \/>\r\naddress = {Arlington, VA},<br \/>\r\nabstract = {Over the past decade, decision-making agents have been increasingly deployed in industrial settings, consumer products, healthcare, education, and entertainment. The development of drone delivery services, virtual assistants, and autonomous vehicles has highlighted numerous challenges surrounding the operation of autonomous systems in unstructured environments. 
This includes mechanisms to support autonomous operations over extended periods of time, techniques that facilitate the use of human assistance in learning and decision-making, learning to reduce the reliance on humans over time, addressing the practical scalability of existing methods, relaxing unrealistic assumptions, and alleviating safety concerns about deploying these systems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {proceedings}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1099','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1099\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Over the past decade, decision-making agents have been increasingly deployed in industrial settings, consumer products, healthcare, education, and entertainment. The development of drone delivery services, virtual assistants, and autonomous vehicles has highlighted numerous challenges surrounding the operation of autonomous systems in unstructured environments. 
This includes mechanisms to support autonomous operations over extended periods of time, techniques that facilitate the use of human assistance in learning and decision-making, learning to reduce the reliance on humans over time, addressing the practical scalability of existing methods, relaxing unrealistic assumptions, and alleviating safety concerns about deploying these systems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1099','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1099\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/web.cs.umass.edu\/publication\/details.php?id=2462\" title=\"https:\/\/web.cs.umass.edu\/publication\/details.php?id=2462\" target=\"_blank\">https:\/\/web.cs.umass.edu\/publication\/details.php?id=2462<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1099','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1100','tp_links')\" style=\"cursor:pointer;\">Policy Networks for Reasoning in Long-Term Autonomy<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">AAAI Fall Symposium on Reasoning and Learning in Real-World Systems for Long-Term Autonomy (LTA), <\/span><span class=\"tp_pub_additional_address\">Arlington, Virginia, <\/span><span class=\"tp_pub_additional_year\">2018<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1100\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1100','tp_abstract')\" title=\"Show abstract\" 
style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1100\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1100','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1100\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1100','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1100\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WZlta18,<br \/>\r\ntitle = {Policy Networks for Reasoning in Long-Term Autonomy},<br \/>\r\nauthor = {Kyle Hollins Wray and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZlta18.pdf},<br \/>\r\nyear  = {2018},<br \/>\r\ndate = {2018-01-01},<br \/>\r\nbooktitle = {AAAI Fall Symposium on Reasoning and Learning in Real-World Systems for Long-Term Autonomy (LTA)},<br \/>\r\naddress = {Arlington, Virginia},<br \/>\r\nabstract = {Policy networks are graphical models that integrate decision-making models. They allow for multiple Markov decision processes (MDPs) that describe distinct focused aspects of a domain to work in harmony to solve a large-scale problem. This paper presents the formalization of policy networks and their use in modeling reasoning tasks necessary for scalable long-term autonomy. We prove that policy networks generalize a wide array of previous models, such as options and constrained MDPs, which can be equivalently viewed as the integration of multiple models. To illustrate the approach, we apply policy networks to the challenging real world domain of robotic home health care. 
We demonstrate the benefits of policy networks on a real robot and show how they facilitate scalable integration of multiple decision-making models.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1100','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1100\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Policy networks are graphical models that integrate decision-making models. They allow for multiple Markov decision processes (MDPs) that describe distinct focused aspects of a domain to work in harmony to solve a large-scale problem. This paper presents the formalization of policy networks and their use in modeling reasoning tasks necessary for scalable long-term autonomy. We prove that policy networks generalize a wide array of previous models, such as options and constrained MDPs, which can be equivalently viewed as the integration of multiple models. To illustrate the approach, we apply policy networks to the challenging real world domain of robotic home health care. 
We demonstrate the benefits of policy networks on a real robot and show how they facilitate scalable integration of multiple decision-making models.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1100','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1100\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZlta18.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZlta18.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WZlta18.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1100','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Witwicki, Stefan J;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('882','tp_links')\" style=\"cursor:pointer;\">Online Decision Making for Scalable Autonomous Systems<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_year\">2017<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_882\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('882','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_882\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('882','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a 
id=\"tp_bibtex_sh_882\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('882','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_882\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WWZijcai17,<br \/>\r\ntitle = {Online Decision Making for Scalable Autonomous Systems},<br \/>\r\nauthor = {Kyle Hollins Wray and Stefan J Witwicki and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WWZijcai17.pdf},<br \/>\r\ndoi = {10.24963\/ijcai.2017\/664},<br \/>\r\nyear  = {2017},<br \/>\r\ndate = {2017-01-01},<br \/>\r\nbooktitle = {Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {4768--4774},<br \/>\r\nabstract = {We present a general formal model called MODIA that can tackle a central challenge for autonomous vehicles (AVs), namely the ability to interact with an unspecified, large number of world entities. In MODIA, a collection of possible decision-problems (DPs), known a priori, are instantiated online and executed as decision-components (DCs), unknown a priori. To combine the individual action recommendations of the DCs into a single action, we propose the lexicographic executor action function (LEAF) mechanism. We analyze the complexity of MODIA and establish LEAF's relation to regret minimization. Finally, we implement MODIA and LEAF using collections of partially observable Markov decision process (POMDP) DPs, and use them for complex AV intersection decision-making. 
We evaluate the approach in six scenarios within a realistic vehicle simulator and present its use on an AV prototype.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('882','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_882\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We present a general formal model called MODIA that can tackle a central challenge for autonomous vehicles (AVs), namely the ability to interact with an unspecified, large number of world entities. In MODIA, a collection of possible decision-problems (DPs), known a priori, are instantiated online and executed as decision-components (DCs), unknown a priori. To combine the individual action recommendations of the DCs into a single action, we propose the lexicographic executor action function (LEAF) mechanism. We analyze the complexity of MODIA and establish LEAF's relation to regret minimization. Finally, we implement MODIA and LEAF using collections of partially observable Markov decision process (POMDP) DPs, and use them for complex AV intersection decision-making. 
We evaluate the approach in six scenarios within a realistic vehicle simulator and present its use on an AV prototype.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('882','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_882\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WWZijcai17.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WWZijcai17.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WWZijcai17.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.24963\/ijcai.2017\/664\" title=\"Follow DOI:10.24963\/ijcai.2017\/664\" target=\"_blank\">doi:10.24963\/ijcai.2017\/664<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('882','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wray, Kyle Hollins;  Pineda, Luis Enrique;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('892','tp_links')\" style=\"cursor:pointer;\">Hierarchical Approach to Transfer of Control in Semi-Autonomous Systems<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">New York, NY, <\/span><span class=\"tp_pub_additional_year\">2016<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_892\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('892','tp_abstract')\" title=\"Show abstract\" 
style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_892\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('892','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_892\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('892','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_892\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WPZijcai16,<br \/>\r\ntitle = {Hierarchical Approach to Transfer of Control in Semi-Autonomous Systems},<br \/>\r\nauthor = {Kyle Hollins Wray and Luis Enrique Pineda and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WPZijcai16.pdf},<br \/>\r\nyear  = {2016},<br \/>\r\ndate = {2016-01-01},<br \/>\r\nbooktitle = {Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {517--523},<br \/>\r\naddress = {New York, NY},<br \/>\r\nabstract = {Semi-Autonomous Systems (SAS) encapsulate a stochastic decision process explicitly controlled by both an agent and a human, in order to leverage the distinct capabilities of each actor. Planning in SAS must address the challenge of transferring control quickly, safely, and smoothly back-and-forth between the agent and the human. We formally define SAS and the requirements to guarantee that the controlling entities are always able to act competently. We then consider applying the model to Semi-Autonomous VEhicles (SAVE), using a hierarchical approach in which micro-level transfer-of-control actions are governed by a high-fidelity POMDP model. Macro-level path planning in our hierarchical approach is performed by solving a Stochastic Shortest Path (SSP) problem. We analyze the integrated model and show that it provides the required guarantees. 
Finally, we test the SAVE model using real-world road data from Open Street Map (OSM) within 10 cities, showing the benefits of the collaboration between the agent and human.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('892','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_892\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Semi-Autonomous Systems (SAS) encapsulate a stochastic decision process explicitly controlled by both an agent and a human, in order to leverage the distinct capabilities of each actor. Planning in SAS must address the challenge of transferring control quickly, safely, and smoothly back-and-forth between the agent and the human. We formally define SAS and the requirements to guarantee that the controlling entities are always able to act competently. We then consider applying the model to Semi-Autonomous VEhicles (SAVE), using a hierarchical approach in which micro-level transfer-of-control actions are governed by a high-fidelity POMDP model. Macro-level path planning in our hierarchical approach is performed by solving a Stochastic Shortest Path (SSP) problem. We analyze the integrated model and show that it provides the required guarantees. 
Finally, we test the SAVE model using real-world road data from Open Street Map (OSM) within 10 cities, showing the benefits of the collaboration between the agent and human.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('892','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_892\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WPZijcai16.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WPZijcai16.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WPZijcai16.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('892','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('900','tp_links')\" style=\"cursor:pointer;\">Building Strong Semi-Autonomous Systems<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 29th Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">Austin, Texas, <\/span><span class=\"tp_pub_additional_year\">2015<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_900\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('900','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_900\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('900','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_900\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('900','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_900\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:Zaaai15,<br \/>\r\ntitle = {Building Strong Semi-Autonomous Systems},<br \/>\r\nauthor = {Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zaaai15.pdf},<br \/>\r\nyear  = {2015},<br \/>\r\ndate = {2015-01-01},<br \/>\r\nbooktitle = {Proceedings of the 29th Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {4088--4092},<br \/>\r\naddress = {Austin, Texas},<br \/>\r\nabstract = {The vision of populating the world with autonomous systems that reduce human labor and improve safety is gradually becoming a reality. Autonomous systems have changed the way space exploration is conducted and are beginning to transform everyday life with a range of household products. In many areas, however, there are considerable barriers to the deployment of fully autonomous systems. We refer to systems that require some degree of human intervention in order to complete a task as semi-autonomous systems. We examine the broad rationale for semi-autonomy and define basic properties of such systems. Accounting for the human in the loop presents a considerable challenge for current planning techniques. We examine various design choices in the development of semi-autonomous systems and their implications on planning and execution. 
Finally, we discuss fruitful research directions for advancing the science of semi-autonomy.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('900','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_900\" style=\"display:none;\"><div class=\"tp_abstract_entry\">The vision of populating the world with autonomous systems that reduce human labor and improve safety is gradually becoming a reality. Autonomous systems have changed the way space exploration is conducted and are beginning to transform everyday life with a range of household products. In many areas, however, there are considerable barriers to the deployment of fully autonomous systems. We refer to systems that require some degree of human intervention in order to complete a task as semi-autonomous systems. We examine the broad rationale for semi-autonomy and define basic properties of such systems. Accounting for the human in the loop presents a considerable challenge for current planning techniques. We examine various design choices in the development of semi-autonomous systems and their implications on planning and execution. 
Finally, we discuss fruitful research directions for advancing the science of semi-autonomy.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('900','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_900\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zaaai15.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zaaai15.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/Zaaai15.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('900','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Mouaddib, Abdel-Illah;  Jeanpierre, Laurent;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('904','tp_links')\" style=\"cursor:pointer;\">Handling Advice in MDPs for Semi-Autonomous Systems<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">ICAPS Workshop on Planning and Robotics (PlanRob), <\/span><span class=\"tp_pub_additional_address\">Jerusalem, Israel, <\/span><span class=\"tp_pub_additional_year\">2015<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_904\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('904','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_904\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('904','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_904\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('904','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_904\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MJZplanrob15,<br \/>\r\ntitle = {Handling Advice in MDPs for Semi-Autonomous Systems},<br \/>\r\nauthor = {Abdel-Illah Mouaddib and Laurent Jeanpierre and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MJZplanrob15.pdf},<br \/>\r\nyear  = {2015},<br \/>\r\ndate = {2015-01-01},<br \/>\r\nbooktitle = {ICAPS Workshop on Planning and Robotics (PlanRob)},<br \/>\r\naddress = {Jerusalem, Israel},<br \/>\r\nabstract = {This paper proposes an effective new model for decision making in situations where full autonomy is not feasible due to the inability to fully model and reason about the domain. To overcome this limitation, we consider a human operator who can supervise the system and guide its operation by providing high-level advice. We define a rich representation for advice and describe an effective algorithm for generating a new policy that conforms to the given advice. Advice is designed to improve the efficiency and safety of the system by imposing constraints on state visitation (either encouraging or discouraging the system to visit certain states). Coupled with the standard reward maximization criterion for MDPs, advice poses a complex multi-criteria decision problem. 
We present and analyze an effective algorithm for optimizing the policy in the presence of advice.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('904','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_904\" style=\"display:none;\"><div class=\"tp_abstract_entry\">This paper proposes an effective new model for decision making in situations where full autonomy is not feasible due to the inability to fully model and reason about the domain. To overcome this limitation, we consider a human operator who can supervise the system and guide its operation by providing high-level advice. We define a rich representation for advice and describe an effective algorithm for generating a new policy that conforms to the given advice. Advice is designed to improve the efficiency and safety of the system by imposing constraints on state visitation (either encouraging or discouraging the system to visit certain states). Coupled with the standard reward maximization criterion for MDPs, advice poses a complex multi-criteria decision problem. 
We present and analyze an effective algorithm for optimizing the policy in the presence of advice.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('904','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_904\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MJZplanrob15.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MJZplanrob15.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MJZplanrob15.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('904','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Mouaddib, Abdel-Illah;  Zilberstein, Shlomo;  Beynier, Aurelie;  Jeanpierre, Laurent<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('956','tp_links')\" style=\"cursor:pointer;\">A Decision-Theoretic Approach to Cooperative Control and Adjustable Autonomy<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 19th European Conference on Artificial Intelligence (ECAI), <\/span><span class=\"tp_pub_additional_address\">Lisbon, Portugal, <\/span><span class=\"tp_pub_additional_year\">2010<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_956\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('956','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_956\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('956','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_956\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('956','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_956\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MZBJecai10,<br \/>\r\ntitle = {A Decision-Theoretic Approach to Cooperative Control and Adjustable Autonomy},<br \/>\r\nauthor = {Abdel-Illah Mouaddib and Shlomo Zilberstein and Aurelie Beynier and Laurent Jeanpierre},<br \/>\r\nurl = {https:\/\/doi.org\/10.3233\/978-1-60750-606-5-971},<br \/>\r\ndoi = {10.3233\/978-1-60750-606-5-971},<br \/>\r\nyear  = {2010},<br \/>\r\ndate = {2010-01-01},<br \/>\r\nbooktitle = {Proceedings of the 19th European Conference on Artificial Intelligence (ECAI)},<br \/>\r\npages = {971--972},<br \/>\r\naddress = {Lisbon, Portugal},<br \/>\r\nabstract = {Cooperative control can help overcome the limitations of autonomous systems (AS) by introducing a supervision unit (SU) (human or another system) into the control loop and creating adjustable autonomy. We present a decision-theoretic approach to accomplish this using Mixed Markov Decision Processes (MI-MDPs). The solution is an optimal plan that tells the AS what actions to perform as well as when to request SU attention or transfer control to the SU. 
This provides a varying degree of autonomy, particularly suitable for robots exploring a domain with regions that are too complex or risky for autonomous operation, or intelligent vehicles operating in heavy traffic.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('956','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_956\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Cooperative control can help overcome the limitations of autonomous systems (AS) by introducing a supervision unit (SU) (human or another system) into the control loop and creating adjustable autonomy. We present a decision-theoretic approach to accomplish this using Mixed Markov Decision Processes (MI-MDPs). The solution is an optimal plan that tells the AS what actions to perform as well as when to request SU attention or transfer control to the SU. 
This provides a varying degree of autonomy, particularly suitable for robots exploring a domain with regions that are too complex or risky for autonomous operation, or intelligent vehicles operating in heavy traffic.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('956','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_956\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/doi.org\/10.3233\/978-1-60750-606-5-971\" title=\"https:\/\/doi.org\/10.3233\/978-1-60750-606-5-971\" target=\"_blank\">https:\/\/doi.org\/10.3233\/978-1-60750-606-5-971<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.3233\/978-1-60750-606-5-971\" title=\"Follow DOI:10.3233\/978-1-60750-606-5-971\" target=\"_blank\">doi:10.3233\/978-1-60750-606-5-971<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('956','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><\/table><\/div><\/div>\n<div><\/div><\/div><\/div>\n<\/div>\n<h3><span style=\"color: #264278\"><b>Building Safe AI systems<\/b><\/span><\/h3>\n<div>\n<div>How can we create AI systems that are safe, transparent, and ethical?<\/div>\n<div><div class=\"bg-margin-for-link\"><input type='hidden' bg_collapse_expand='69d0b4f82ef131089946330' value='69d0b4f82ef131089946330'><input type='hidden' id='bg-show-more-text-69d0b4f82ef131089946330' value='Show Related Publications'><input type='hidden' id='bg-show-less-text-69d0b4f82ef131089946330' value='Hide Related Publications'><a id='bg-showmore-action-69d0b4f82ef131089946330' class='bg-showmore-plg-link bg-arrow '  style=\" color:#7C2622;;\" href='#'>Show Related Publications<\/a><div id='bg-showmore-hidden-69d0b4f82ef131089946330' ><div class=\"teachpress_pub_list\"><form name=\"tppublistform\" method=\"get\"><a 
name=\"tppubs\" id=\"tppubs\"><\/a><\/form><table class=\"teachpress_publication_list\"><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Choudhury, Moumita;  Saisubramanian, Sandhya;  Zhang, Hao;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1190','tp_links')\" style=\"cursor:pointer;\">Minimizing Negative Side Effects in Cooperative Multi-Agent Systems Using Distributed Coordination<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the The 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Auckland, New Zealand, <\/span><span class=\"tp_pub_additional_year\">2024<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1190\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1190','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1190\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1190','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1190\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1190','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1190\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:CSZZaamas24,<br \/>\r\ntitle = {Minimizing Negative Side Effects in Cooperative Multi-Agent Systems Using Distributed Coordination},<br \/>\r\nauthor = {Moumita Choudhury and Sandhya Saisubramanian and Hao Zhang and Shlomo Zilberstein},<br \/>\r\nurl = 
{http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZaamas24.pdf},<br \/>\r\nyear  = {2024},<br \/>\r\ndate = {2024-01-01},<br \/>\r\nbooktitle = {Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},<br \/>\r\naddress = {Auckland, New Zealand},<br \/>\r\nabstract = {Autonomous agents in real-world environments may encounter undesirable outcomes or negative side effects (NSEs) when working collaboratively alongside other agents. We frame the challenge of minimizing NSEs in a multi-agent setting as a lexicographic decentralized Markov decision process in which we assume independence of rewards and transitions with respect to the primary assigned tasks, but allowing negative side effects to create a form of dependence among the agents. We present a lexicographic Q-learning approach to mitigate the NSEs using human feedback models while maintaining near-optimality with respect to the assigned tasks\u2013up to some given slack. Our empirical evaluation across two domains demonstrates that our collaborative approach effectively mitigates NSEs, outperforming non-collaborative methods.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1190','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1190\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous agents in real-world environments may encounter undesirable outcomes or negative side effects (NSEs) when working collaboratively alongside other agents. We frame the challenge of minimizing NSEs in a multi-agent setting as a lexicographic decentralized Markov decision process in which we assume independence of rewards and transitions with respect to the primary assigned tasks, but allowing negative side effects to create a form of dependence among the agents. 
We present a lexicographic Q-learning approach to mitigate the NSEs using human feedback models while maintaining near-optimality with respect to the assigned tasks\u2013up to some given slack. Our empirical evaluation across two domains demonstrates that our collaborative approach effectively mitigates NSEs, outperforming non-collaborative methods.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1190','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1190\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZaamas24.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZaamas24.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZaamas24.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1190','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Choudhury, Moumita;  Saisubramanian, Sandhya;  Zhang, Hao;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1191','tp_links')\" style=\"cursor:pointer;\">Minimizing Negative Side Effects in Cooperative Multi-Agent Systems Using Distributed Coordination<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 37th International FLAIRS Conference, <\/span><span class=\"tp_pub_additional_address\">Miramar Beach, Florida, <\/span><span class=\"tp_pub_additional_year\">2024<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1191\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1191','tp_abstract')\" 
title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1191\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1191','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1191\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1191','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1191\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:CSZZflairs24,<br \/>\r\ntitle = {Minimizing Negative Side Effects in Cooperative Multi-Agent Systems Using Distributed Coordination},<br \/>\r\nauthor = {Moumita Choudhury and Sandhya Saisubramanian and Hao Zhang and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZflairs24.pdf},<br \/>\r\nyear  = {2024},<br \/>\r\ndate = {2024-01-01},<br \/>\r\nbooktitle = {Proceedings of the 37th International FLAIRS Conference},<br \/>\r\naddress = {Miramar Beach, Florida},<br \/>\r\nabstract = {Autonomous agents operating in real-world environments frequently encounter undesirable outcomes or negative side effects (NSEs) when working collaboratively alongside other agents. Even when agents can execute their primary task optimally when operating in isolation, their training may not account for potential negative interactions that arise in the presence of other agents. We frame the challenge of minimizing NSEs as a lexicographic decentralized Markov decision process in which we assume independence of rewards and transitions with respect to the primary assigned tasks, but recognize that addressing negative side effects creates a form of dependence among the agents. 
We present a lexicographic Q-learning approach to mitigate the NSEs using human feedback models while maintaining near-optimality with respect to the assigned tasks\u2013up to some given slack. Our empirical evaluation across two domains demonstrates that our collaborative approach effectively mitigates NSEs, outperforming non-collaborative methods.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1191','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1191\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous agents operating in real-world environments frequently encounter undesirable outcomes or negative side effects (NSEs) when working collaboratively alongside other agents. Even when agents can execute their primary task optimally when operating in isolation, their training may not account for potential negative interactions that arise in the presence of other agents. We frame the challenge of minimizing NSEs as a lexicographic decentralized Markov decision process in which we assume independence of rewards and transitions with respect to the primary assigned tasks, but recognize that addressing negative side effects creates a form of dependence among the agents. We present a lexicographic Q-learning approach to mitigate the NSEs using human feedback models while maintaining near-optimality with respect to the assigned tasks\u2013up to some given slack. 
Our empirical evaluation across two domains demonstrates that our collaborative approach effectively mitigates NSEs, outperforming non-collaborative methods.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1191','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1191\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZflairs24.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZflairs24.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/CSZZflairs24.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1191','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Basich, Connor;  Svegliato, Justin;  Wray, Kyle Hollins;  Witwicki, Stefan;  Biswas, Joydeep;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1166','tp_links')\" style=\"cursor:pointer;\">Competence-Aware Systems<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Artificial Intelligence (AIJ), <\/span><span class=\"tp_pub_additional_issue\">iss. 316, <\/span><span class=\"tp_pub_additional_pages\">pp. 
103844, <\/span><span class=\"tp_pub_additional_year\">2023<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1166\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1166','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1166\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1166','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1166\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1166','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1166\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:BSWWBZaij23,<br \/>\r\ntitle = {Competence-Aware Systems},<br \/>\r\nauthor = {Connor Basich and Justin Svegliato and Kyle Hollins Wray and Stefan Witwicki and Joydeep Biswas and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSWWBZaij23.pdf},<br \/>\r\ndoi = {10.1016\/j.artint.2022.103844},<br \/>\r\nyear  = {2023},<br \/>\r\ndate = {2023-03-16},<br \/>\r\nurldate = {2023-03-16},<br \/>\r\njournal = {Artificial Intelligence (AIJ)},<br \/>\r\nissue = {316},<br \/>\r\npages = {103844},<br \/>\r\nabstract = {Building autonomous systems for deployment in the open world has been a longstanding objective in both artificial intelligence and robotics. The open world, however, presents challenges that question some of the assumptions often made in contemporary AI models. Autonomous systems that operate in the open world face complex, non-stationary environments wherein enumerating all situations the system may face over the course of its deployment is intractable. Nevertheless, these systems are expected to operate safely and reliably for extended durations. 
Consequently, AI systems often rely on some degree of human assistance to mitigate risks while completing their tasks, and are hence better treated as semi-autonomous systems. In order to reduce unnecessary reliance on humans and optimize autonomy, we propose a novel introspective planning model\u2014competence-aware systems (CAS)\u2014that enables a semi-autonomous system to reason about its own competence and allowed level of autonomy by leveraging human feedback or assistance. A CAS learns to adjust its level of autonomy based on experience and interactions with a human authority so as to reduce improper reliance on the human and optimize the degree of autonomy it employs in any given circumstance. To handle situations in which the initial CAS model has insufficient state information to properly discriminate feedback received from humans, we introduce a methodology called iterative state space refinement that gradually increases the granularity of the state space online. The approach exploits information that exists in the standard CAS model and requires no additional input from the human. The result is an agent that can more confidently predict the correct feedback from the human authority in each level of autonomy, enabling it to learn its competence in a larger portion of the state space.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1166','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1166\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Building autonomous systems for deployment in the open world has been a longstanding objective in both artificial intelligence and robotics. The open world, however, presents challenges that question some of the assumptions often made in contemporary AI models. 
Autonomous systems that operate in the open world face complex, non-stationary environments wherein enumerating all situations the system may face over the course of its deployment is intractable. Nevertheless, these systems are expected to operate safely and reliably for extended durations. Consequently, AI systems often rely on some degree of human assistance to mitigate risks while completing their tasks, and are hence better treated as semi-autonomous systems. In order to reduce unnecessary reliance on humans and optimize autonomy, we propose a novel introspective planning model\u2014competence-aware systems (CAS)\u2014that enables a semi-autonomous system to reason about its own competence and allowed level of autonomy by leveraging human feedback or assistance. A CAS learns to adjust its level of autonomy based on experience and interactions with a human authority so as to reduce improper reliance on the human and optimize the degree of autonomy it employs in any given circumstance. To handle situations in which the initial CAS model has insufficient state information to properly discriminate feedback received from humans, we introduce a methodology called iterative state space refinement that gradually increases the granularity of the state space online. The approach exploits information that exists in the standard CAS model and requires no additional input from the human. 
The result is an agent that can more confidently predict the correct feedback from the human authority in each level of autonomy, enabling it to learn its competence in a larger portion of the state space.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1166','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1166\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSWWBZaij23.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSWWBZaij23.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BSWWBZaij23.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1016\/j.artint.2022.103844\" title=\"Follow DOI:10.1016\/j.artint.2022.103844\" target=\"_blank\">doi:10.1016\/j.artint.2022.103844<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1166','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Kamath, Aishwarya;  Saisubramanian, Sandhya;  Paruchuri, Praveen;  Kumar, Akshat;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1167','tp_links')\" style=\"cursor:pointer;\">Planning and Learning for Non-Markovian Negative Side Effects Using Finite State Controllers<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 37th Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_year\">2023<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1167\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('1167','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1167\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1167','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1167\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1167','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1167\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SSPKZaaai23,<br \/>\r\ntitle = {Planning and Learning for Non-Markovian Negative Side Effects Using Finite State Controllers},<br \/>\r\nauthor = {Aishwarya Kamath and Sandhya Saisubramanian and Praveen Paruchuri and Akshat Kumar and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SSPKZaaai23.pdf},<br \/>\r\nyear  = {2023},<br \/>\r\ndate = {2023-01-01},<br \/>\r\nurldate = {2023-01-01},<br \/>\r\nbooktitle = {Proceedings of the 37th Conference on Artificial Intelligence (AAAI)},<br \/>\r\nabstract = {Autonomous systems are often deployed in the open world where it is hard to obtain complete specifications of objectives and constraints. Operating based on an incomplete model can produce negative side effects (NSEs), which affect the safety and reliability of the system. We focus on mitigating NSEs in environments modeled as Markov decision processes (MDPs). First, we learn a model of NSEs using observed data that contains state-action trajectories and the severity of the associated NSEs. Unlike previous works that associate NSEs with state-action pairs, our framework associates NSEs with entire trajectories, which is more general and captures non-Markovian dependence on states and actions. 
Second, we learn finite state controllers (FSCs) that predict the NSE severity for a given trajectory and generalize well to unseen data. Finally, we develop a constrained MDP model that uses information from both the underlying MDP and the learned FSC for planning while avoiding NSEs. Our empirical evaluation demonstrates the effectiveness of our approach in learning and mitigating Markovian and non-Markovian NSEs.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1167','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1167\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous systems are often deployed in the open world where it is hard to obtain complete specifications of objectives and constraints. Operating based on an incomplete model can produce negative side effects (NSEs), which affect the safety and reliability of the system. We focus on mitigating NSEs in environments modeled as Markov decision processes (MDPs). First, we learn a model of NSEs using observed data that contains state-action trajectories and the severity of the associated NSEs. Unlike previous works that associate NSEs with state-action pairs, our framework associates NSEs with entire trajectories, which is more general and captures non-Markovian dependence on states and actions. Second, we learn finite state controllers (FSCs) that predict the NSE severity for a given trajectory and generalize well to unseen data. Finally, we develop a constrained MDP model that uses information from both the underlying MDP and the learned FSC for planning while avoiding NSEs. 
Our empirical evaluation demonstrates the effectiveness of our approach in learning and mitigating Markovian and non-Markovian NSEs.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1167','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1167\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SSPKZaaai23.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SSPKZaaai23.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SSPKZaaai23.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1167','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Basich, Connor;  Zilberstein, Shlomo;  Biswas, Joydeep<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1168','tp_links')\" style=\"cursor:pointer;\">Competence-Aware Autonomy: An Essential Skill for Robots in the Real World<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 37th Conference on Artificial Intelligence (AAAI) Bridge Program, <\/span><span class=\"tp_pub_additional_year\">2023<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1168\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1168','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1168\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1168','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a 
id=\"tp_bibtex_sh_1168\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1168','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1168\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:BBZaaai23bridge,<br \/>\r\ntitle = {Competence-Aware Autonomy: An Essential Skill for Robots in the Real World},<br \/>\r\nauthor = {Connor Basich and Shlomo Zilberstein and Joydeep Biswas},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BBZaaai23bridge.pdf},<br \/>\r\nyear  = {2023},<br \/>\r\ndate = {2023-01-01},<br \/>\r\nbooktitle = {Proceedings of the 37th Conference on Artificial Intelligence (AAAI) Bridge Program},<br \/>\r\nabstract = {Recent efforts in AI and robotics towards deploying intelligent robotic systems in the real world offer the possibility of transformational impacts on society. For such systems to be successful while reliably maintaining safe operation, they must be cognizant of their limitations, and when uncertain about their autonomous capabilities, solicit human assistance. However, system designers cannot fully enumerate the space of all situations that a robot deployed in the real world might face, prompting the challenge of endowing robots with actionable awareness of their capabilities and limitations in unseen settings. We propose competence-aware autonomy as a means of addressing this challenge in a well-defined manner motivated by real world examples. 
We discuss recent prior work in this area and suggest several research challenges and opportunities for future work.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1168','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1168\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Recent efforts in AI and robotics towards deploying intelligent robotic systems in the real world offer the possibility of transformational impacts on society. For such systems to be successful while reliably maintaining safe operation, they must be cognizant of their limitations, and when uncertain about their autonomous capabilities, solicit human assistance. However, system designers cannot fully enumerate the space of all situations that a robot deployed in the real world might face, prompting the challenge of endowing robots with actionable awareness of their capabilities and limitations in unseen settings. We propose competence-aware autonomy as a means of addressing this challenge in a well-defined manner motivated by real world examples. 
We discuss recent prior work in this area and suggest several research challenges and opportunities for future work.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1168','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1168\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BBZaaai23bridge.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BBZaaai23bridge.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BBZaaai23bridge.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1168','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Mahmud, Saaduddin;  Basich, Connor;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1169','tp_links')\" style=\"cursor:pointer;\">Semi-Autonomous Systems with Contextual Competence Awareness<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 22nd International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_year\">2023<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1169\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1169','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1169\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1169','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a 
id=\"tp_bibtex_sh_1169\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1169','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1169\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MBZaamas23,<br \/>\r\ntitle = {Semi-Autonomous Systems with Contextual Competence Awareness},<br \/>\r\nauthor = {Saaduddin Mahmud and Connor Basich and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MBZaamas23.pdf},<br \/>\r\nyear  = {2023},<br \/>\r\ndate = {2023-01-01},<br \/>\r\nbooktitle = {Proceedings of the 22nd International Conference on Autonomous Agents and MultiAgent Systems (AAMAS)},<br \/>\r\npages = {689\u2013697},<br \/>\r\nabstract = {Competence modeling is critical for the efficient and safe operation of semi-autonomous systems (SAS) with varying levels of autonomy. In this paper, we extend the notion of competence modeling by introducing a contextual competence model. While previous work on competence-aware systems (CAS) defined the competence of a SAS relative to a single static operator, we present an augmented operator model that is contextualized by Markovian state information capable of capturing multiple operators. Access to such information allows the SAS to account for the stochastic shifts that may occur in the behavior of the operator(s) during deployment and optimize its autonomy accordingly. 
We show that the extended model called Contextual Competence Aware System (CoCAS) has the same convergence guarantees as CAS, and empirically illustrate the benefit of our approach over both the original CAS model as well as other relevant work in shared autonomy.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1169','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1169\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Competence modeling is critical for the efficient and safe operation of semi-autonomous systems (SAS) with varying levels of autonomy. In this paper, we extend the notion of competence modeling by introducing a contextual competence model. While previous work on competence-aware systems (CAS) defined the competence of a SAS relative to a single static operator, we present an augmented operator model that is contextualized by Markovian state information capable of capturing multiple operators. Access to such information allows the SAS to account for the stochastic shifts that may occur in the behavior of the operator(s) during deployment and optimize its autonomy accordingly. 
We show that the extended model called Contextual Competence Aware System (CoCAS) has the same convergence guarantees as CAS, and empirically illustrate the benefit of our approach over both the original CAS model as well as other relevant work in shared autonomy.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1169','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1169\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MBZaamas23.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MBZaamas23.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MBZaamas23.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1169','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Nashed, Samer B.;  Mahmud, Saaduddin;  Goldman, Claudia V.;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1170','tp_links')\" style=\"cursor:pointer;\">Causal Explanations for Sequential Decision Making Under Uncertainty (Extended Abstract)<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 22nd International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_year\">2023<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1170\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1170','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1170\" 
class=\"tp_show\" onclick=\"teachpress_pub_showhide('1170','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1170\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1170','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1170\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:NMGZaamas23,<br \/>\r\ntitle = {Causal Explanations for Sequential Decision Making Under Uncertainty (Extended Abstract)},<br \/>\r\nauthor = {Samer B. Nashed and Saaduddin Mahmud and Claudia V. Goldman and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NMGZaamas23.pdf},<br \/>\r\nyear  = {2023},<br \/>\r\ndate = {2023-01-01},<br \/>\r\nbooktitle = {Proceedings of the 22nd International Conference on Autonomous Agents and MultiAgent Systems (AAMAS)},<br \/>\r\npages = {2307\u20132309},<br \/>\r\nabstract = {Competence modeling is critical for the efficient and safe operation of semi-autonomous systems (SAS) with varying levels of autonomy. In this paper, we extend the notion of competence modeling by introducing a contextual competence model. While previous work on competence-aware systems (CAS) defined the competence of a SAS relative to a single static operator, we present an augmented operator model that is contextualized by Markovian state information capable of capturing multiple operators. Access to such information allows the SAS to account for the stochastic shifts that may occur in the behavior of the operator(s) during deployment and optimize its autonomy accordingly. 
We show that the extended model called Contextual Competence Aware System (CoCAS) has the same convergence guarantees as CAS, and empirically illustrate the benefit of our approach over both the original CAS model as well as other relevant work in shared autonomy.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1170','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1170\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Competence modeling is critical for the efficient and safe operation of semi-autonomous systems (SAS) with varying levels of autonomy. In this paper, we extend the notion of competence modeling by introducing a contextual competence model. While previous work on competence-aware systems (CAS) defined the competence of a SAS relative to a single static operator, we present an augmented operator model that is contextualized by Markovian state information capable of capturing multiple operators. Access to such information allows the SAS to account for the stochastic shifts that may occur in the behavior of the operator(s) during deployment and optimize its autonomy accordingly. 
We show that the extended model called Contextual Competence Aware System (CoCAS) has the same convergence guarantees as CAS, and empirically illustrate the benefit of our approach over both the original CAS model as well as other relevant work in shared autonomy.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1170','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1170\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NMGZaamas23.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NMGZaamas23.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NMGZaamas23.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1170','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Mahmud, Saaduddin;  Saisubramanian, Sandhya;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1171','tp_links')\" style=\"cursor:pointer;\">Explanation-Guided Reward Alignment<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 32nd International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_year\">2023<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1171\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1171','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1171\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1171','tp_links')\" title=\"Show 
links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1171\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1171','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1171\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MSZijcai23,<br \/>\r\ntitle = {Explanation-Guided Reward Alignment},<br \/>\r\nauthor = {Saaduddin Mahmud and Sandhya Saisubramanian and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MSZijcai23.pdf},<br \/>\r\nyear  = {2023},<br \/>\r\ndate = {2023-01-01},<br \/>\r\nurldate = {2020-01-01},<br \/>\r\nbooktitle = {Proceedings of the 32nd International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\nabstract = {Agents often need to infer a reward function from observations in order to learn desired behaviors. However, agents may infer a reward function that does not align with the original intent, as there can be multiple reward functions consistent with their observations. Operating based on such misaligned rewards can be risky. Furthermore, black-box representations make it difficult to verify the learned reward functions and prevent harmful behavior. We present a framework for verifying and improving reward alignment using explanations, and we show how explanations can help detect misalignment and reveal failure cases in novel scenarios. The problem is formulated as inverse reinforcement learning from ranked trajectories. Verification tests created from the trajectory dataset are used to iteratively verify and improve reward alignment. The agent explains its learned reward, and a tester signals whether the explanation passes the test. In cases where the explanation fails, the agent offers alternative explanations to gather feedback, which is then used to improve the learned reward. 
We analyze the efficiency of our approach in improving reward alignment using different types of explanations and demonstrate its effectiveness in five domains.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1171','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1171\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Agents often need to infer a reward function from observations in order to learn desired behaviors. However, agents may infer a reward function that does not align with the original intent, as there can be multiple reward functions consistent with their observations. Operating based on such misaligned rewards can be risky. Furthermore, black-box representations make it difficult to verify the learned reward functions and prevent harmful behavior. We present a framework for verifying and improving reward alignment using explanations, and we show how explanations can help detect misalignment and reveal failure cases in novel scenarios. The problem is formulated as inverse reinforcement learning from ranked trajectories. Verification tests created from the trajectory dataset are used to iteratively verify and improve reward alignment. The agent explains its learned reward, and a tester signals whether the explanation passes the test. In cases where the explanation fails, the agent offers alternative explanations to gather feedback, which is then used to improve the learned reward. 
We analyze the efficiency of our approach in improving reward alignment using different types of explanations and demonstrate its effectiveness in five domains.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1171','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1171\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MSZijcai23.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MSZijcai23.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MSZijcai23.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1171','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Basich, Connor;  Mahmud, Saaduddin;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1174','tp_links')\" style=\"cursor:pointer;\">Learning Constraints on Autonomous Behavior from Proactive Feedback<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), <\/span><span class=\"tp_pub_additional_address\">Detroit, Michigan, <\/span><span class=\"tp_pub_additional_year\">2023<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1174\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1174','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1174\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1174','tp_links')\" title=\"Show links and 
resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1174\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1174','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1174\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:BMZiros23,<br \/>\r\ntitle = {Learning Constraints on Autonomous Behavior from Proactive Feedback},<br \/>\r\nauthor = {Connor Basich and Sadduddin Mahmud and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BMZiros23.pdf},<br \/>\r\nyear  = {2023},<br \/>\r\ndate = {2023-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)},<br \/>\r\npages = {3680\u20133687},<br \/>\r\naddress = {Detroit, Michigan},<br \/>\r\nabstract = {Learning from feedback is a common paradigm to acquire information that is hard to specify a priori. In this work, we consider a planning agent with a known nominal reward model that captures their high-level task objective, but is subject to constraints that are unknown a priori and must be inferred from human interventions. Unlike existing methods, our approach does not rely on full or partial demonstration trajectories or assume a fully reactive human. Instead, we assume access only to sparse interventions, which may in fact be generated proactively by the human, and make only minimal assumptions about the human. We provide both theoretical bounds on performance, and empirical validations of our method. 
We show that our method enables an agent to learn a constraint set with high accuracy that generalizes well to new environments within a domain, whereas methods that only consider reactive feedback learn an incorrect constraint set that does not generalize well, making constraint violations more likely in new environments.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1174','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1174\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Learning from feedback is a common paradigm to acquire information that is hard to specify a priori. In this work, we consider a planning agent with a known nominal reward model that captures their high-level task objective, but is subject to constraints that are unknown a priori and must be inferred from human interventions. Unlike existing methods, our approach does not rely on full or partial demonstration trajectories or assume a fully reactive human. Instead, we assume access only to sparse interventions, which may in fact be generated proactively by the human, and make only minimal assumptions about the human. We provide both theoretical bounds on performance, and empirical validations of our method. 
We show that our method enables an agent to learn a constraint set with high accuracy that generalizes well to new environments within a domain, whereas methods that only consider reactive feedback learn an incorrect constraint set that does not generalize well, making constraint violations more likely in new environments.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1174','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1174\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BMZiros23.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BMZiros23.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BMZiros23.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1174','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Nakamura, Mason;  Svegliato, Justin;  Nashed, Samer B.;  Zilberstein, Shlomo;  Russell, Stuart<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1175','tp_links')\" style=\"cursor:pointer;\">Formal Composition of Robotic Systems as Contract Programs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), <\/span><span class=\"tp_pub_additional_address\">Detroit, Michigan, <\/span><span class=\"tp_pub_additional_year\">2023<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1175\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1175','tp_abstract')\" title=\"Show abstract\" 
style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1175\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1175','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1175\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1175','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1175\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:NSNZRiros23,<br \/>\r\ntitle = {Formal Composition of Robotic Systems as Contract Programs},<br \/>\r\nauthor = {Mason Nakamura and Justin Svegliato and Samer B. Nashed and Shlomo Zilberstein and Stuart Russell},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSNZRiros23.pdf},<br \/>\r\nyear  = {2023},<br \/>\r\ndate = {2023-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)},<br \/>\r\npages = {6727\u20136732},<br \/>\r\naddress = {Detroit, Michigan},<br \/>\r\nabstract = {Robotic systems are often composed of modular algorithms that each perform a specific function within a larger architecture, ranging from state estimation and task planning to trajectory optimization and object recognition. Existing work for specifying these systems as a formal composition of contract algorithms has limited expressiveness compared to the variety of sophisticated architectures that are commonly used in practice. Therefore, in this paper, we (1) propose a novel metareasoning framework for formally composing robotic systems as a contract program with programming constructs for functional, conditional, and looping semantics and (2) introduce a recursive hill climbing algorithm that finds a locally optimal time allocation of a contract program. 
In our experiments, we demonstrate that our approach outperforms baseline techniques in a simulated pick-and-place robot domain.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1175','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1175\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Robotic systems are often composed of modular algorithms that each perform a specific function within a larger architecture, ranging from state estimation and task planning to trajectory optimization and object recognition. Existing work for specifying these systems as a formal composition of contract algorithms has limited expressiveness compared to the variety of sophisticated architectures that are commonly used in practice. Therefore, in this paper, we (1) propose a novel metareasoning framework for formally composing robotic systems as a contract program with programming constructs for functional, conditional, and looping semantics and (2) introduce a recursive hill climbing algorithm that finds a locally optimal time allocation of a contract program. 
In our experiments, we demonstrate that our approach outperforms baseline techniques in a simulated pick-and-place robot domain.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1175','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1175\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSNZRiros23.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSNZRiros23.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSNZRiros23.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1175','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Zilberstein, Shlomo;  Kamar, Ece<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1156','tp_links')\" style=\"cursor:pointer;\">Avoiding Negative Side Effects due to Incomplete Knowledge of AI Systems<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">AI Magazine, <\/span><span class=\"tp_pub_additional_volume\">vol. 42, <\/span><span class=\"tp_pub_additional_number\">no. 4, <\/span><span class=\"tp_pub_additional_pages\">pp. 
62\u201371, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1156\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1156','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1156\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1156','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1156\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1156','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1156\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:SZKaimag22,<br \/>\r\ntitle = {Avoiding Negative Side Effects due to Incomplete Knowledge of AI Systems},<br \/>\r\nauthor = {Sandhya Saisubramanian and Shlomo Zilberstein and Ece Kamar},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZKaimag22.pdf},<br \/>\r\ndoi = {10.1609\/aaai.12028},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\nurldate = {2022-01-01},<br \/>\r\njournal = {AI Magazine},<br \/>\r\nvolume = {42},<br \/>\r\nnumber = {4},<br \/>\r\npages = {62--71},<br \/>\r\nabstract = {Autonomous agents acting in the real-world often operate based on models that ignore certain aspects of the environment. The incompleteness of any given model \u2013 handcrafted or machine acquired \u2013 is inevitable due to practical limitations of any modeling technique for complex real-world settings. Due to the limited fidelity of its model, an agent\u2019s actions may have unexpected, undesirable consequences during execution. Learning to recognize and avoid such negative side effects (NSEs) of an agent\u2019s actions is critical to improve the safety and reliability of autonomous systems. 
Mitigating NSEs is an emerging research topic that is attracting increased attention due to the rapid growth in the deployment of AI systems and their broad societal impacts. This article provides a comprehensive overview of different forms of NSEs and the recent research efforts to address them. We identify key characteristics of NSEs, highlight the challenges in avoiding NSEs, and discuss recently developed approaches, contrasting their benefits and limitations. The article concludes with a discussion of open questions and suggestions for future research directions.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1156','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1156\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous agents acting in the real-world often operate based on models that ignore certain aspects of the environment. The incompleteness of any given model \u2013 handcrafted or machine acquired \u2013 is inevitable due to practical limitations of any modeling technique for complex real-world settings. Due to the limited fidelity of its model, an agent\u2019s actions may have unexpected, undesirable consequences during execution. Learning to recognize and avoid such negative side effects (NSEs) of an agent\u2019s actions is critical to improve the safety and reliability of autonomous systems. Mitigating NSEs is an emerging research topic that is attracting increased attention due to the rapid growth in the deployment of AI systems and their broad societal impacts. This article provides a comprehensive overview of different forms of NSEs and the recent research efforts to address them. 
We identify key characteristics of NSEs, highlight the challenges in avoiding NSEs, and discuss recently developed approaches, contrasting their benefits and limitations. The article concludes with a discussion of open questions and suggestions for future research directions.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1156','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1156\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZKaimag22.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZKaimag22.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZKaimag22.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1609\/aaai.12028\" title=\"Follow DOI:10.1609\/aaai.12028\" target=\"_blank\">doi:10.1609\/aaai.12028<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1156','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Zilberstein, Shlomo;  Kamar, Ece<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1157','tp_links')\" style=\"cursor:pointer;\">Avoiding Negative Side Effects of Autonomous Systems in the Open World<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Journal of Artificial Intelligence Research (JAIR), <\/span><span class=\"tp_pub_additional_volume\">vol. 74, <\/span><span class=\"tp_pub_additional_pages\">pp. 
143\u2013177, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1157\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1157','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1157\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1157','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1157\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1157','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1157\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:SZKjair22,<br \/>\r\ntitle = {Avoiding Negative Side Effects of Autonomous Systems in the Open World},<br \/>\r\nauthor = {Sandhya Saisubramanian and Shlomo Zilberstein and Ece Kamar},<br \/>\r\nurl = {https:\/\/www.jair.org\/index.php\/jair\/article\/view\/13581\/26799},<br \/>\r\ndoi = {10.1613\/jair.1.13581},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\nurldate = {2022-01-01},<br \/>\r\njournal = {Journal of Artificial Intelligence Research (JAIR)},<br \/>\r\nvolume = {74},<br \/>\r\npages = {143--177},<br \/>\r\nabstract = {Autonomous systems that operate in the open world often use incomplete models of their environment. Model incompleteness is inevitable due to the practical limitations in precise model specification and data collection about open-world environments. Due to the limited fidelity of the model, agent actions may produce negative side effects (NSEs) when deployed. Negative side effects are undesirable, unmodeled effects of agent actions on the environment. 
NSEs are inherently challenging to identify at design time and may affect the reliability, usability and safety of the system. We present two complementary approaches to mitigate the NSE via: (1) learning from feedback, and (2) environment shaping. The solution approaches target settings with different assumptions and agent responsibilities. In learning from feedback, the agent learns a penalty function associated with a NSE. We investigate the efficiency of different feedback mechanisms, including human feedback and autonomous exploration. The problem is formulated as a multi-objective Markov decision process such that optimizing the agent\u2019s assigned task is prioritized over mitigating NSE. A slack parameter denotes the maximum allowed deviation from the optimal expected reward for the agent\u2019s task in order to mitigate NSE. In environment shaping, we examine how a human can assist an agent, beyond providing feedback, and utilize their broader scope of knowledge to mitigate the impacts of NSE. We formulate the problem as a human-agent collaboration with decoupled objectives. The agent optimizes its assigned task and may produce NSE during its operation. The human assists the agent by performing modest reconfigurations of the environment so as to mitigate the impacts of NSE, without affecting the agent\u2019s ability to complete its assigned task. We present an algorithm for shaping and analyze its properties. 
Empirical evaluations demonstrate the trade-offs in the performance of different approaches in mitigating NSE in different settings.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1157','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1157\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Autonomous systems that operate in the open world often use incomplete models of their environment. Model incompleteness is inevitable due to the practical limitations in precise model specification and data collection about open-world environments. Due to the limited fidelity of the model, agent actions may produce negative side effects (NSEs) when deployed. Negative side effects are undesirable, unmodeled effects of agent actions on the environment. NSEs are inherently challenging to identify at design time and may affect the reliability, usability and safety of the system. We present two complementary approaches to mitigate the NSE via: (1) learning from feedback, and (2) environment shaping. The solution approaches target settings with different assumptions and agent responsibilities. In learning from feedback, the agent learns a penalty function associated with a NSE. We investigate the efficiency of different feedback mechanisms, including human feedback and autonomous exploration. The problem is formulated as a multi-objective Markov decision process such that optimizing the agent\u2019s assigned task is prioritized over mitigating NSE. A slack parameter denotes the maximum allowed deviation from the optimal expected reward for the agent\u2019s task in order to mitigate NSE. In environment shaping, we examine how a human can assist an agent, beyond providing feedback, and utilize their broader scope of knowledge to mitigate the impacts of NSE. 
We formulate the problem as a human-agent collaboration with decoupled objectives. The agent optimizes its assigned task and may produce NSE during its operation. The human assists the agent by performing modest reconfigurations of the environment so as to mitigate the impacts of NSE, without affecting the agent\u2019s ability to complete its assigned task. We present an algorithm for shaping and analyze its properties. Empirical evaluations demonstrate the trade-offs in the performance of different approaches in mitigating NSE in different settings.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1157','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1157\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/www.jair.org\/index.php\/jair\/article\/view\/13581\/26799\" title=\"https:\/\/www.jair.org\/index.php\/jair\/article\/view\/13581\/26799\" target=\"_blank\">https:\/\/www.jair.org\/index.php\/jair\/article\/view\/13581\/26799<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1613\/jair.1.13581\" title=\"Follow DOI:10.1613\/jair.1.13581\" target=\"_blank\">doi:10.1613\/jair.1.13581<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1157','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Rabiee, Sadegh;  Basich, Connor;  Wray, Kyle Hollins;  Zilberstein, Shlomo;  Biswas, Joydeep<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1158','tp_links')\" style=\"cursor:pointer;\">Competence-Aware Path Planning Via Introspective Perception<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span 
class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">IEEE Robotics and Automation Letters, <\/span><span class=\"tp_pub_additional_volume\">vol. 7, <\/span><span class=\"tp_pub_additional_number\">no. 2, <\/span><span class=\"tp_pub_additional_pages\">pp. 3218\u20133225, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1158\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1158','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1158\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1158','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1158\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1158','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1158\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:RBWZBlra22,<br \/>\r\ntitle = {Competence-Aware Path Planning Via Introspective Perception},<br \/>\r\nauthor = {Sadegh Rabiee and Connor Basich and Kyle Hollins Wray and Shlomo Zilberstein and Joydeep Biswas},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/RBWZBlra22.pdf},<br \/>\r\ndoi = {10.1109\/LRA.2022.3145517},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\njournal = {IEEE Robotics and Automation Letters},<br \/>\r\nvolume = {7},<br \/>\r\nnumber = {2},<br \/>\r\npages = {3218--3225},<br \/>\r\nabstract = {Robots deployed in the real world over extended periods of time need to reason about unexpected failures, learn to predict them, and to proactively take actions to avoid future failures. 
Existing approaches for competence-aware planning are either model-based, requiring explicit enumeration of known failure sources, or purely statistical, using state- and location-specific failure statistics to infer competence. We instead propose a structured model-free approach to competence-aware planning by reasoning about plan execution failures due to errors in perception, without requiring a priori enumeration of failure sources or requiring location-specific failure statistics. We introduce competence-aware path planning via introspective perception (CPIP), a Bayesian framework to iteratively learn and exploit task-level competence in novel deployment environments. CPIP factorizes the competence-aware planning problem into two components. First, perception errors are learned in a model-free and location-agnostic setting via introspective perception prior to deployment in novel environments. Second, during actual deployments, the prediction of task-level failures is learned in a context-aware setting. Experiments in a simulation show that the proposed CPIP approach outperforms the frequentist baseline in multiple mobile robot tasks, and is further validated via real robot experiments in environments with perceptually challenging obstacles and terrain.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1158','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1158\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Robots deployed in the real world over extended periods of time need to reason about unexpected failures, learn to predict them, and to proactively take actions to avoid future failures. 
Existing approaches for competence-aware planning are either model-based, requiring explicit enumeration of known failure sources, or purely statistical, using state- and location-specific failure statistics to infer competence. We instead propose a structured model-free approach to competence-aware planning by reasoning about plan execution failures due to errors in perception, without requiring a priori enumeration of failure sources or requiring location-specific failure statistics. We introduce competence-aware path planning via introspective perception (CPIP), a Bayesian framework to iteratively learn and exploit task-level competence in novel deployment environments. CPIP factorizes the competence-aware planning problem into two components. First, perception errors are learned in a model-free and location-agnostic setting via introspective perception prior to deployment in novel environments. Second, during actual deployments, the prediction of task-level failures is learned in a context-aware setting. 
Experiments in a simulation show that the proposed CPIP approach outperforms the frequentist baseline in multiple mobile robot tasks, and is further validated via real robot experiments in environments with perceptually challenging obstacles and terrain.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1158','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1158\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/RBWZBlra22.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/RBWZBlra22.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/RBWZBlra22.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1109\/LRA.2022.3145517\" title=\"Follow DOI:10.1109\/LRA.2022.3145517\" target=\"_blank\">doi:10.1109\/LRA.2022.3145517<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1158','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Svegliato, Justin;  Basich, Connor;  Saisubramanian, Sandhya;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1159','tp_links')\" style=\"cursor:pointer;\">Metareasoning for Safe Decision Making in Autonomous Systems<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), <\/span><span class=\"tp_pub_additional_address\">Philadelphia, Pennsylvania, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a 
id=\"tp_abstract_sh_1159\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1159','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1159\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1159','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1159\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1159','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1159\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SBSZicra22,<br \/>\r\ntitle = {Metareasoning for Safe Decision Making in Autonomous Systems},<br \/>\r\nauthor = {Justin Svegliato and Connor Basich and Sandhya Saisubramanian and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SBSZicra22.pdf},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},<br \/>\r\naddress = {Philadelphia, Pennsylvania},<br \/>\r\nabstract = {Although experts carefully specify the high-level decision-making models in autonomous systems, it is infeasible to guarantee safety across every scenario during operation. We therefore propose a safety metareasoning system that optimizes the severity of the system's safety concerns and the interference to the system's task: the system executes in parallel a task process that completes a specified task and safety processes that each address a specified safety concern with a conflict resolver for arbitration. 
This paper offers a formal definition of a safety metareasoning system, a recommendation algorithm for a safety process, an arbitration algorithm for a conflict resolver, an application of our approach to planetary rover exploration, and a demonstration that our approach is effective in simulation.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1159','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1159\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Although experts carefully specify the high-level decision-making models in autonomous systems, it is infeasible to guarantee safety across every scenario during operation. We therefore propose a safety metareasoning system that optimizes the severity of the system's safety concerns and the interference to the system's task: the system executes in parallel a task process that completes a specified task and safety processes that each address a specified safety concern with a conflict resolver for arbitration. 
This paper offers a formal definition of a safety metareasoning system, a recommendation algorithm for a safety process, an arbitration algorithm for a conflict resolver, an application of our approach to planetary rover exploration, and a demonstration that our approach is effective in simulation.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1159','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1159\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SBSZicra22.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SBSZicra22.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SBSZicra22.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1159','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Basich, Connor;  Russino, Joseph A.;  Chien, Steve;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1163','tp_links')\" style=\"cursor:pointer;\">A Sampling Based Approach to Robust Planning for a Planetary Lander<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), <\/span><span class=\"tp_pub_additional_address\">Kyoto, Japan, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1163\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1163','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span 
class=\"tp_resource_link\"><a id=\"tp_links_sh_1163\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1163','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1163\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1163','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1163\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:BRCZiros22,<br \/>\r\ntitle = {A Sampling Based Approach to Robust Planning for a Planetary Lander},<br \/>\r\nauthor = {Connor Basich and Joseph A. Russino and Steve Chien and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BRCZiros22.pdf},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\nbooktitle = {Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)},<br \/>\r\npages = {4106--4111},<br \/>\r\naddress = {Kyoto, Japan},<br \/>\r\nabstract = {Planning for autonomous operation in unknown environments poses a number of technical challenges. The agent must ensure robustness to unknown phenomena, unpredictable variation in execution, and uncertain resources, all while maximizing its objective. These challenges are exacerbated in the context of space missions where uncertainty is often higher, long communication delays necessitate robust autonomous execution, and severely constrained computational resources limit the scope of planning techniques that can be used. We examine this problem in the context of a Europa Lander concept mission where an autonomous lander must collect valuable data and communicate that data back to Earth. We model the problem as a hierarchical task network, framing it as a utility maximization problem constrained by a strictly monotonically decreasing energy resource. 
We propose a novel deterministic planning framework that uses periodic replanning and sampling-based optimization to better handle model uncertainty and execution variation, while remaining computationally tractable. We demonstrate the efficacy of our framework through simulations of a Europa Lander concept mission in which our approach outperforms several baselines in utility maximization and robustness.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1163','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1163\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Planning for autonomous operation in unknown environments poses a number of technical challenges. The agent must ensure robustness to unknown phenomena, unpredictable variation in execution, and uncertain resources, all while maximizing its objective. These challenges are exacerbated in the context of space missions where uncertainty is often higher, long communication delays necessitate robust autonomous execution, and severely constrained computational resources limit the scope of planning techniques that can be used. We examine this problem in the context of a Europa Lander concept mission where an autonomous lander must collect valuable data and communicate that data back to Earth. We model the problem as a hierarchical task network, framing it as a utility maximization problem constrained by a strictly monotonically decreasing energy resource. We propose a novel deterministic planning framework that uses periodic replanning and sampling-based optimization to better handle model uncertainty and execution variation, while remaining computationally tractable. 
We demonstrate the efficacy of our framework through simulations of a Europa Lander concept mission in which our approach outperforms several baselines in utility maximization and robustness.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1163','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1163\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BRCZiros22.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BRCZiros22.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/BRCZiros22.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1163','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Svegliato, Justin;  Nashed, Samer B;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1136','tp_links')\" style=\"cursor:pointer;\">Ethically Compliant Sequential Decision Making<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 35th Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_year\">2021<\/span><span class=\"tp_pub_additional_note\">, (Distinguished Paper Award)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1136\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1136','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1136\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1136','tp_links')\" title=\"Show links and 
resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1136\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1136','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1136\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SNZaaai21,<br \/>\r\ntitle = {Ethically Compliant Sequential Decision Making},<br \/>\r\nauthor = {Justin Svegliato and Samer B Nashed and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SNZaaai21.pdf},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the 35th Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {11657--11665},<br \/>\r\nabstract = {Enabling autonomous systems to comply with an ethical theory is critical given their accelerating deployment in domains that impact society. While many ethical theories have been studied extensively in moral philosophy, they are still challenging to implement by developers who build autonomous systems. This paper proposes a novel approach for building ethically compliant autonomous systems that optimize completing a task while following an ethical framework. First, we introduce a definition of an ethically compliant autonomous system and its properties. Next, we offer a range of ethical frameworks for divine command theory, prima facie duties, and virtue ethics. 
Finally, we demonstrate the accuracy and usability of our approach in a set of autonomous driving simulations and a user study of planning and robotics experts.},<br \/>\r\nnote = {Distinguished Paper Award},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1136','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1136\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Enabling autonomous systems to comply with an ethical theory is critical given their accelerating deployment in domains that impact society. While many ethical theories have been studied extensively in moral philosophy, they are still challenging to implement by developers who build autonomous systems. This paper proposes a novel approach for building ethically compliant autonomous systems that optimize completing a task while following an ethical framework. First, we introduce a definition of an ethically compliant autonomous system and its properties. Next, we offer a range of ethical frameworks for divine command theory, prima facie duties, and virtue ethics. 
Finally, we demonstrate the accuracy and usability of our approach in a set of autonomous driving simulations and a user study of planning and robotics experts.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1136','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1136\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SNZaaai21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SNZaaai21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SNZaaai21.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1136','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Galhotra, Sainyam;  Saisubramanian, Sandhya;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1132','tp_links')\" style=\"cursor:pointer;\">Learning to Generate Fair Clusters from Demonstrations<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the AAAI\/ACM Conference on AI, Ethics, and Society (AIES), <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1132\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1132','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1132\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1132','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a 
id=\"tp_bibtex_sh_1132\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1132','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1132\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:GSZaies21,<br \/>\r\ntitle = {Learning to Generate Fair Clusters from Demonstrations},<br \/>\r\nauthor = {Sainyam Galhotra and Sandhya Saisubramanian and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GSZaies21.pdf},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the AAAI\/ACM Conference on AI, Ethics, and Society (AIES)},<br \/>\r\nabstract = {Fair clustering is the process of grouping similar entities together, while satisfying a mathematically well-defined fairness metric as a constraint. Due to the practical challenges in precise model specification, the prescribed fairness constraints are often incomplete and act as proxies to the intended fairness requirement. Clustering with proxies may lead to biased outcomes when the system is deployed. We examine how to identify the intended fairness constraint for a problem based on limited demonstrations from an expert. Each demonstration is a clustering over a subset of the data. We present an algorithm to identify the fairness metric from demonstrations and generate clusters using existing off-the-shelf clustering techniques, and analyze its theoretical properties. To extend our approach to novel fairness metrics for which clustering algorithms do not currently exist, we present a greedy method for clustering. Additionally, we investigate how to generate interpretable solutions using our approach. 
Empirical evaluation on three real-world datasets demonstrates the effectiveness of our approach in quickly identifying the underlying fairness and interpretability constraints, which are then used to generate fair and interpretable clusters.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1132','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1132\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Fair clustering is the process of grouping similar entities together, while satisfying a mathematically well-defined fairness metric as a constraint. Due to the practical challenges in precise model specification, the prescribed fairness constraints are often incomplete and act as proxies to the intended fairness requirement. Clustering with proxies may lead to biased outcomes when the system is deployed. We examine how to identify the intended fairness constraint for a problem based on limited demonstrations from an expert. Each demonstration is a clustering over a subset of the data. We present an algorithm to identify the fairness metric from demonstrations and generate clusters using existing off-the-shelf clustering techniques, and analyze its theoretical properties. To extend our approach to novel fairness metrics for which clustering algorithms do not currently exist, we present a greedy method for clustering. Additionally, we investigate how to generate interpretable solutions using our approach. 
Empirical evaluation on three real-world datasets demonstrates the effectiveness of our approach in quickly identifying the underlying fairness and interpretability constraints, which are then used to generate fair and interpretable clusters.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1132','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1132\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GSZaies21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GSZaies21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/GSZaies21.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1132','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Nashed, Samer B;  Svegliato, Justin;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1133','tp_links')\" style=\"cursor:pointer;\">Ethically Compliant Planning within Moral Communities<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the AAAI\/ACM Conference on AI, Ethics, and Society (AIES), <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1133\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1133','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1133\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1133','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1133\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1133','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1133\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:NSZaies21,<br \/>\r\ntitle = {Ethically Compliant Planning within Moral Communities},<br \/>\r\nauthor = {Samer B Nashed and Justin Svegliato and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSZaies21.pdf},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the AAAI\/ACM Conference on AI, Ethics, and Society (AIES)},<br \/>\r\nabstract = {Ethically compliant autonomous systems (ECAS) are the state-of-the-art for solving sequential decision-making problems under uncertainty while respecting constraints that encode ethical considerations. This paper defines a novel concept in the context of ECAS that is from moral philosophy, the moral community, which leads to a nuanced taxonomy of explicit ethical agents. We then propose new ethical frameworks that extend the applicability of ECAS to domains where a moral community is required. Next, we provide a formal analysis of the proposed ethical frameworks and conduct experiments that illustrate their differences. 
Finally, we discuss the implications of explicit moral communities that could shape research on standards and guidelines for ethical agents in order to better understand and predict common errors in their design and communicate their capabilities.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1133','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1133\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Ethically compliant autonomous systems (ECAS) are the state-of-the-art for solving sequential decision-making problems under uncertainty while respecting constraints that encode ethical considerations. This paper defines a novel concept in the context of ECAS that is from moral philosophy, the moral community, which leads to a nuanced taxonomy of explicit ethical agents. We then propose new ethical frameworks that extend the applicability of ECAS to domains where a moral community is required. Next, we provide a formal analysis of the proposed ethical frameworks and conduct experiments that illustrate their differences. 
Finally, we discuss the implications of explicit moral communities that could shape research on standards and guidelines for ethical agents in order to better understand and predict common errors in their design and communicate their capabilities.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1133','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1133\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSZaies21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSZaies21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/NSZaies21.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1133','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Roberts, Shannon C;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1134','tp_links')\" style=\"cursor:pointer;\">Understanding User Attitudes Towards Negative Side Effects of AI Systems<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">CHI Conference on Human Factors in Computing Systems, Late-Breaking Work, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1134\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1134','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1134\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1134','tp_links')\" title=\"Show 
links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1134\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1134','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1134\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SRZchi21,<br \/>\r\ntitle = {Understanding User Attitudes Towards Negative Side Effects of AI Systems},<br \/>\r\nauthor = {Sandhya Saisubramanian and Shannon C Roberts and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SRZchi21.pdf},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {CHI Conference on Human Factors in Computing Systems, Late-Breaking Work},<br \/>\r\npages = {368:1--368:6},<br \/>\r\nabstract = {Artificial Intelligence (AI) systems deployed in the open world may produce negative side effects\u2014which are unanticipated, undesirable outcomes that occur in addition to the intended outcomes of the system\u2019s actions. These negative side effects affect users directly or indirectly, by violating their preferences or altering their environment in an undesirable, potentially harmful, manner. While the existing literature has started to explore techniques to overcome the impacts of negative side effects in deployed systems, there has been no prior efforts to determine how users perceive and respond to negative side effects. We surveyed 183 participants to develop an understanding of user attitudes towards side effects and how side effects impact user trust in the system. The surveys targeted two domains: an autonomous vacuum cleaner and an autonomous vehicle, each with 183 respondents. The results indicate that users are willing to tolerate side effects that are not safety-critical but prefer to minimize them as much as possible. 
Furthermore, users are willing to assist the system in mitigating negative side effects by providing feedback and reconfiguring the environment. Trust in the system diminishes if it fails to minimize the impacts of negative side effects over time. These results support key fundamental assumptions in existing techniques and facilitate the development of new methods to overcome negative side effects of AI systems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1134','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1134\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Artificial Intelligence (AI) systems deployed in the open world may produce negative side effects\u2014which are unanticipated, undesirable outcomes that occur in addition to the intended outcomes of the system\u2019s actions. These negative side effects affect users directly or indirectly, by violating their preferences or altering their environment in an undesirable, potentially harmful, manner. While the existing literature has started to explore techniques to overcome the impacts of negative side effects in deployed systems, there have been no prior efforts to determine how users perceive and respond to negative side effects. We surveyed 183 participants to develop an understanding of user attitudes towards side effects and how side effects impact user trust in the system. The surveys targeted two domains: an autonomous vacuum cleaner and an autonomous vehicle, each with 183 respondents. The results indicate that users are willing to tolerate side effects that are not safety-critical but prefer to minimize them as much as possible. Furthermore, users are willing to assist the system in mitigating negative side effects by providing feedback and reconfiguring the environment. 
Trust in the system diminishes if it fails to minimize the impacts of negative side effects over time. These results support key fundamental assumptions in existing techniques and facilitate the development of new methods to overcome negative side effects of AI systems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1134','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1134\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SRZchi21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SRZchi21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SRZchi21.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1134','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1135','tp_links')\" style=\"cursor:pointer;\">Mitigating Negative Side Effects via Environment Shaping (Extended Abstract)<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1135\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1135','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1135\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('1135','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1135\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1135','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1135\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SZaamas21,<br \/>\r\ntitle = {Mitigating Negative Side Effects via Environment Shaping (Extended Abstract)},<br \/>\r\nauthor = {Sandhya Saisubramanian and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZaamas21.pdf},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS)},<br \/>\r\nabstract = {Agents operating in the open world often produce negative side effects (NSE), which are difficult to identify at design time. We examine how a human can assist an agent, beyond providing feedback, and exploit their broader scope of knowledge to mitigate the impacts of NSE. We formulate this problem as a human-agent team with decoupled objectives. The agent optimizes its assigned task, during which its actions may produce NSE. The human shapes the environment through minor reconfiguration actions so as to mitigate the impacts of the agent's side effects, without significantly degrading agent performance. We present an algorithm to solve this problem. 
Empirical evaluation shows that the proposed framework can successfully mitigate NSE, without affecting the agent\u2019s ability to complete its assigned task.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1135','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1135\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Agents operating in the open world often produce negative side effects (NSE), which are difficult to identify at design time. We examine how a human can assist an agent, beyond providing feedback, and exploit their broader scope of knowledge to mitigate the impacts of NSE. We formulate this problem as a human-agent team with decoupled objectives. The agent optimizes its assigned task, during which its actions may produce NSE. The human shapes the environment through minor reconfiguration actions so as to mitigate the impacts of the agent's side effects, without significantly degrading agent performance. We present an algorithm to solve this problem. 
Empirical evaluation shows that the proposed framework can successfully mitigate NSE, without affecting the agent\u2019s ability to complete its assigned task.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1135','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1135\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZaamas21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZaamas21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZaamas21.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1135','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1137','tp_links')\" style=\"cursor:pointer;\">Mitigating Negative Side Effects via Environment Shaping<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">CoRR, <\/span><span class=\"tp_pub_additional_volume\">vol. 
abs\/2102.07017, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1137\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1137','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1137\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1137','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1137\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1137','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1137\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:SZarXiv21b,<br \/>\r\ntitle = {Mitigating Negative Side Effects via Environment Shaping},<br \/>\r\nauthor = {Sandhya Saisubramanian and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/arxiv.org\/abs\/2102.07017},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\njournal = {CoRR},<br \/>\r\nvolume = {abs\/2102.07017},<br \/>\r\nabstract = {Agents operating in unstructured environments often produce negative side effects (NSE), which are difficult to identify at design time. While the agent can learn to mitigate the side effects from human feedback, such feedback is often expensive and the rate of learning is sensitive to the agent's state representation. We examine how humans can assist an agent, beyond providing feedback, and exploit their broader scope of knowledge to mitigate the impacts of NSE. We formulate this problem as a human-agent team with decoupled objectives. The agent optimizes its assigned task, during which its actions may produce NSE. 
The human shapes the environment through minor reconfiguration actions so as to mitigate the impacts of the agent's side effects, without affecting the agent's ability to complete its assigned task. We present an algorithm to solve this problem and analyze its theoretical properties. Through experiments with human subjects, we assess the willingness of users to perform minor environment modifications to mitigate the impacts of NSE. Empirical evaluation of our approach shows that the proposed framework can successfully mitigate NSE, without affecting the agent's ability to complete its assigned task.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1137','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1137\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Agents operating in unstructured environments often produce negative side effects (NSE), which are difficult to identify at design time. While the agent can learn to mitigate the side effects from human feedback, such feedback is often expensive and the rate of learning is sensitive to the agent's state representation. We examine how humans can assist an agent, beyond providing feedback, and exploit their broader scope of knowledge to mitigate the impacts of NSE. We formulate this problem as a human-agent team with decoupled objectives. The agent optimizes its assigned task, during which its actions may produce NSE. The human shapes the environment through minor reconfiguration actions so as to mitigate the impacts of the agent's side effects, without affecting the agent's ability to complete its assigned task. We present an algorithm to solve this problem and analyze its theoretical properties. 
Through experiments with human subjects, we assess the willingness of users to perform minor environment modifications to mitigate the impacts of NSE. Empirical evaluation of our approach shows that the proposed framework can successfully mitigate NSE, without affecting the agent's ability to complete its assigned task.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1137','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1137\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"ai ai-arxiv\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/arxiv.org\/abs\/2102.07017\" title=\"https:\/\/arxiv.org\/abs\/2102.07017\" target=\"_blank\">https:\/\/arxiv.org\/abs\/2102.07017<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1137','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Rabiee, Sadegh;  Basich, Connor;  Wray, Kyle Hollins;  Zilberstein, Shlomo;  Biswas, Joydeep<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1147','tp_links')\" style=\"cursor:pointer;\">Competence-Aware Path Planning via Introspective Perception<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">CoRR, <\/span><span class=\"tp_pub_additional_volume\">vol. 
abs\/2109.13974, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1147\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1147','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1147\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1147','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1147\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1147','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1147\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:SZarXiv21c,<br \/>\r\ntitle = {Competence-Aware Path Planning via Introspective Perception},<br \/>\r\nauthor = {Sadegh Rabiee and Connor Basich and Kyle Hollins Wray and Shlomo Zilberstein and Joydeep Biswas},<br \/>\r\nurl = {https:\/\/arxiv.org\/abs\/2109.13974},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\njournal = {CoRR},<br \/>\r\nvolume = {abs\/2109.13974},<br \/>\r\nabstract = {Robots deployed in the real world over extended periods of time need to reason about unexpected failures, learn to predict them, and to proactively take actions to avoid future failures. Existing approaches for competence-aware planning are either model-based, requiring explicit enumeration of known failure modes, or purely statistical, using state- and location-specific failure statistics to infer competence. We instead propose a structured model-free approach to competence-aware planning by reasoning about plan execution failures due to errors in perception, without requiring a-priori enumeration of failure modes or requiring location-specific failure statistics. 
We introduce competence-aware path planning via introspective perception (CPIP), a Bayesian framework to iteratively learn and exploit task-level competence in novel deployment environments. CPIP factorizes the competence-aware planning problem into two components. First, perception errors are learned in a model-free and location-agnostic setting via introspective perception prior to deployment in novel environments. Second, during actual deployments, the prediction of task-level failures is learned in a context-aware setting. Experiments in a simulation show that the proposed CPIP approach outperforms the frequentist baseline in multiple mobile robot tasks, and is further validated via real robot experiments in an environment with perceptually challenging obstacles and terrain.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1147','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1147\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Robots deployed in the real world over extended periods of time need to reason about unexpected failures, learn to predict them, and to proactively take actions to avoid future failures. Existing approaches for competence-aware planning are either model-based, requiring explicit enumeration of known failure modes, or purely statistical, using state- and location-specific failure statistics to infer competence. We instead propose a structured model-free approach to competence-aware planning by reasoning about plan execution failures due to errors in perception, without requiring a-priori enumeration of failure modes or requiring location-specific failure statistics. 
We introduce competence-aware path planning via introspective perception (CPIP), a Bayesian framework to iteratively learn and exploit task-level competence in novel deployment environments. CPIP factorizes the competence-aware planning problem into two components. First, perception errors are learned in a model-free and location-agnostic setting via introspective perception prior to deployment in novel environments. Second, during actual deployments, the prediction of task-level failures is learned in a context-aware setting. Experiments in a simulation show that the proposed CPIP approach outperforms the frequentist baseline in multiple mobile robot tasks, and is further validated via real robot experiments in an environment with perceptually challenging obstacles and terrain.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1147','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1147\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"ai ai-arxiv\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/arxiv.org\/abs\/2109.13974\" title=\"https:\/\/arxiv.org\/abs\/2109.13974\" target=\"_blank\">https:\/\/arxiv.org\/abs\/2109.13974<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1147','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Kamar, Ece;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1121','tp_links')\" style=\"cursor:pointer;\">A Multi-Objective Approach to Mitigate Negative Side Effects<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 29th International Joint Conference on 
Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_year\">2020<\/span><span class=\"tp_pub_additional_note\">, (Distinguished Paper Award)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1121\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1121','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1121\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1121','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1121\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1121','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1121\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SKZijcai20,<br \/>\r\ntitle = {A Multi-Objective Approach to Mitigate Negative Side Effects},<br \/>\r\nauthor = {Sandhya Saisubramanian and Ece Kamar and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SKZijcai20.pdf},<br \/>\r\nyear  = {2020},<br \/>\r\ndate = {2020-01-01},<br \/>\r\nbooktitle = {Proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\nabstract = {Agents operating in unstructured environments often create negative side effects (NSE) that may not be easy to identify at design time. We examine how various forms of human feedback or autonomous exploration can be used to learn a penalty function associated with NSE during system deployment. We formulate the problem of mitigating the impact of NSE as a multi-objective Markov decision process with lexicographic reward preferences and slack. 
The slack denotes the maximum deviation from an optimal policy with respect to the agent\u2019s primary objective allowed in order to mitigate NSE as a secondary objective. Empirical evaluation of our approach shows that the proposed framework can successfully mitigate NSE and that different feedback mechanisms introduce different biases, which influence the identification of NSE.},<br \/>\r\nnote = {Distinguished Paper Award},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1121','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1121\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Agents operating in unstructured environments often create negative side effects (NSE) that may not be easy to identify at design time. We examine how various forms of human feedback or autonomous exploration can be used to learn a penalty function associated with NSE during system deployment. We formulate the problem of mitigating the impact of NSE as a multi-objective Markov decision process with lexicographic reward preferences and slack. The slack denotes the maximum deviation from an optimal policy with respect to the agent\u2019s primary objective allowed in order to mitigate NSE as a secondary objective. 
Empirical evaluation of our approach shows that the proposed framework can successfully mitigate NSE and that different feedback mechanisms introduce different biases, which influence the identification of NSE.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1121','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1121\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SKZijcai20.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SKZijcai20.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SKZijcai20.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1121','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Galhotra, Sainyam;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1115','tp_links')\" style=\"cursor:pointer;\">Balancing the Tradeoff Between Clustering Value and Interpretability<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the AAAI\/ACM Conference on AI, Ethics, and Society (AIES), <\/span><span class=\"tp_pub_additional_address\">New York, NY, <\/span><span class=\"tp_pub_additional_year\">2020<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1115\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1115','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1115\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('1115','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1115\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1115','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1115\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SGZaies20,<br \/>\r\ntitle = {Balancing the Tradeoff Between Clustering Value and Interpretability},<br \/>\r\nauthor = {Sandhya Saisubramanian and Sainyam Galhotra and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SGZaies20.pdf},<br \/>\r\ndoi = {10.1145\/3375627.3375843},<br \/>\r\nyear  = {2020},<br \/>\r\ndate = {2020-01-01},<br \/>\r\nbooktitle = {Proceedings of the AAAI\/ACM Conference on AI, Ethics, and Society (AIES)},<br \/>\r\npages = {351--357},<br \/>\r\naddress = {New York, NY},<br \/>\r\nabstract = {Graph clustering groups entities -- the vertices of a graph -- based on their similarity, typically using a complex distance function over a large number of features. Successful integration of clustering approaches in automated decision-support systems hinges on the interpretability of the resulting clusters. This paper addresses the problem of generating interpretable clusters, given features of interest that signify interpretability to an end-user, by optimizing interpretability in addition to common clustering objectives. We propose a \u03b2-interpretable clustering algorithm that ensures that at least \u03b2 fraction of nodes in each cluster share the same feature value. The tunable parameter \u03b2 is user-specified. We also present a more efficient algorithm for scenarios with \u03b2 = 1 and analyze the theoretical guarantees of the two algorithms. 
Finally, we empirically demonstrate the benefits of our approaches in generating interpretable clusters using four real-world datasets. The interpretability of the clusters is complemented by generating simple explanations denoting the feature values of the nodes in the clusters, using frequent pattern mining.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1115','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1115\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Graph clustering groups entities -- the vertices of a graph -- based on their similarity, typically using a complex distance function over a large number of features. Successful integration of clustering approaches in automated decision-support systems hinges on the interpretability of the resulting clusters. This paper addresses the problem of generating interpretable clusters, given features of interest that signify interpretability to an end-user, by optimizing interpretability in addition to common clustering objectives. We propose a \u03b2-interpretable clustering algorithm that ensures that at least \u03b2 fraction of nodes in each cluster share the same feature value. The tunable parameter \u03b2 is user-specified. We also present a more efficient algorithm for scenarios with \u03b2 = 1 and analyze the theoretical guarantees of the two algorithms. Finally, we empirically demonstrate the benefits of our approaches in generating interpretable clusters using four real-world datasets. 
The interpretability of the clusters is complemented by generating simple explanations denoting the feature values of the nodes in the clusters, using frequent pattern mining.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1115','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1115\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SGZaies20.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SGZaies20.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SGZaies20.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1145\/3375627.3375843\" title=\"Follow DOI:10.1145\/3375627.3375843\" target=\"_blank\">doi:10.1145\/3375627.3375843<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1115','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('857','tp_links')\" style=\"cursor:pointer;\">Minimizing the Negative Side Effects of Planning with Reduced Models<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">AAAI Workshop on Artificial Intelligence Safety, <\/span><span class=\"tp_pub_additional_address\">Honolulu, Hawaii, <\/span><span class=\"tp_pub_additional_year\">2019<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_857\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('857','tp_abstract')\" title=\"Show abstract\" 
style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_857\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('857','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_857\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('857','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_857\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SZaaai19ws,<br \/>\r\ntitle = {Minimizing the Negative Side Effects of Planning with Reduced Models},<br \/>\r\nauthor = {Sandhya Saisubramanian and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZaaai19ws.pdf},<br \/>\r\nyear  = {2019},<br \/>\r\ndate = {2019-01-01},<br \/>\r\nbooktitle = {AAAI Workshop on Artificial Intelligence Safety},<br \/>\r\naddress = {Honolulu, Hawaii},<br \/>\r\nabstract = {Reduced models of large Markov decision processes accelerate planning by considering a subset of outcomes for each state-action pair. This reduction in reachable states leads to replanning when the agent encounters states without a pre-computed action during plan execution. However, not all states are suitable for replanning. In the worst case, the agent may not be able to reach the goal from the newly encountered state. Agents should be better prepared to handle such risky situations and avoid replanning in risky states. Hence, we consider replanning in states that are unsafe for deliberation as a negative side effect of planning with reduced models. While the negative side effects can be minimized by always using the full model, this defeats the purpose of using reduced models. The challenge is to plan with reduced models, but somehow account for the possibility of encountering risky situations. 
An agent should thus only replan in states that the user has approved as safe for replanning. To that end, we propose planning using a portfolio of reduced models, a planning paradigm that minimizes the negative side effects of planning using reduced models by alternating between different outcome selection approaches. We empirically demonstrate the effectiveness of our approach on three domains: an electric vehicle charging domain using real-world data from a university campus and two benchmark planning problems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('857','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_857\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Reduced models of large Markov decision processes accelerate planning by considering a subset of outcomes for each state-action pair. This reduction in reachable states leads to replanning when the agent encounters states without a pre-computed action during plan execution. However, not all states are suitable for replanning. In the worst case, the agent may not be able to reach the goal from the newly encountered state. Agents should be better prepared to handle such risky situations and avoid replanning in risky states. Hence, we consider replanning in states that are unsafe for deliberation as a negative side effect of planning with reduced models. While the negative side effects can be minimized by always using the full model, this defeats the purpose of using reduced models. The challenge is to plan with reduced models, but somehow account for the possibility of encountering risky situations. An agent should thus only replan in states that the user has approved as safe for replanning. 
To that end, we propose planning using a portfolio of reduced models, a planning paradigm that minimizes the negative side effects of planning using reduced models by alternating between different outcome selection approaches. We empirically demonstrate the effectiveness of our approach on three domains: an electric vehicle charging domain using real-world data from a university campus and two benchmark planning problems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('857','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_857\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZaaai19ws.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZaaai19ws.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZaaai19ws.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('857','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Saisubramanian, Sandhya;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('869','tp_links')\" style=\"cursor:pointer;\">Safe Reduced Models for Probabilistic Planning<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">ICML\/IJCAI\/AAMAS Workshop on Planning and Learning (PAL), <\/span><span class=\"tp_pub_additional_address\">Stockholm, Sweden, <\/span><span class=\"tp_pub_additional_year\">2018<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_869\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('869','tp_abstract')\" title=\"Show abstract\" 
style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_869\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('869','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_869\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('869','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_869\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:SZijcaiPLW18,<br \/>\r\ntitle = {Safe Reduced Models for Probabilistic Planning},<br \/>\r\nauthor = {Sandhya Saisubramanian and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZijcaiPLW18.pdf},<br \/>\r\nyear  = {2018},<br \/>\r\ndate = {2018-01-01},<br \/>\r\nbooktitle = {ICML\/IJCAI\/AAMAS Workshop on Planning and Learning (PAL)},<br \/>\r\naddress = {Stockholm, Sweden},<br \/>\r\nabstract = {Reduced models allow autonomous agents to cope with the complexity of planning under uncertainty by reducing the accuracy of the model. However, the solution quality of a reduced model varies as the model fidelity changes. We present planning using a portfolio of reduced models with cost adjustments, a framework to increase the safety of a reduced model by selectively improving its fidelity in certain states, without significantly compromising runtime. Our framework provides the flexibility to create reduced models with different levels of detail using a portfolio, and a means to account for the ignored details by adjusting the action costs in the reduced model. We show the conditions under which cost adjustments achieve optimal action selection and describe how to use cost adjustments as a heuristic for choosing outcome selection principles in a portfolio. 
Finally, we present empirical results of our approach on three domains that include an electric vehicle charging problem using real-world data from a university campus.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('869','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_869\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Reduced models allow autonomous agents to cope with the complexity of planning under uncertainty by reducing the accuracy of the model. However, the solution quality of a reduced model varies as the model fidelity changes. We present planning using a portfolio of reduced models with cost adjustments, a framework to increase the safety of a reduced model by selectively improving its fidelity in certain states, without significantly compromising runtime. Our framework provides the flexibility to create reduced models with different levels of detail using a portfolio, and a means to account for the ignored details by adjusting the action costs in the reduced model. We show the conditions under which cost adjustments achieve optimal action selection and describe how to use cost adjustments as a heuristic for choosing outcome selection principles in a portfolio. 
Finally, we present empirical results of our approach on three domains that include an electric vehicle charging problem using real-world data from a university campus.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('869','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_869\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZijcaiPLW18.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZijcaiPLW18.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/SZijcaiPLW18.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('869','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Freedman, Richard G;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('896','tp_links')\" style=\"cursor:pointer;\">Safety in AI-HRI: Challenges Complementing User Experience Quality<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">AAAI Fall Symposium on Artificial Intelligence and Human-Robot Interaction (AI-HRI), <\/span><span class=\"tp_pub_additional_address\">Arlington, Virginia, <\/span><span class=\"tp_pub_additional_year\">2016<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_896\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('896','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_896\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('896','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_896\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('896','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_896\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:FZfall16,<br \/>\r\ntitle = {Safety in AI-HRI: Challenges Complementing User Experience Quality},<br \/>\r\nauthor = {Richard G Freedman and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZfall16.pdf},<br \/>\r\nyear  = {2016},<br \/>\r\ndate = {2016-01-01},<br \/>\r\nbooktitle = {AAAI Fall Symposium on Artificial Intelligence and Human-Robot Interaction (AI-HRI)},<br \/>\r\naddress = {Arlington, Virginia},<br \/>\r\nabstract = {Contemporary research in human-robot interaction (HRI) predominantly focuses on the user's experience while controlling a robot. However, with the increased deployment of artificial intelligence (AI) techniques, robots are quickly becoming more autonomous in both academic and industrial experimental settings. In addition to improving the user's interactive experience with AI-operated robots through personalization, dialogue, emotions, and dynamic behavior, there is also a growing need to consider the safety of the interaction. AI may not account for the user's less likely responses, making it possible for an unaware user to be injured by the robot if they have a collision. Issues of trust and acceptance may also come into play if users cannot always understand the robot's thought process, creating a potential for emotional harm. 
We identify challenges that will need to be addressed in safe AI-HRI and provide an overview of approaches to consider for them, many stemming from the contemporary research.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('896','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_896\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Contemporary research in human-robot interaction (HRI) predominantly focuses on the user's experience while controlling a robot. However, with the increased deployment of artificial intelligence (AI) techniques, robots are quickly becoming more autonomous in both academic and industrial experimental settings. In addition to improving the user's interactive experience with AI-operated robots through personalization, dialogue, emotions, and dynamic behavior, there is also a growing need to consider the safety of the interaction. AI may not account for the user's less likely responses, making it possible for an unaware user to be injured by the robot if they have a collision. Issues of trust and acceptance may also come into play if users cannot always understand the robot's thought process, creating a potential for emotional harm. 
We identify challenges that will need to be addressed in safe AI-HRI and provide an overview of approaches to consider for them, many stemming from the contemporary research.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('896','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_896\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZfall16.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZfall16.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZfall16.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('896','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><\/table><\/div><\/div>\n<div><\/div><\/div><\/div>\n<\/div>\n<h3><span style=\"color: #264278\"><b>Plan and Activity Recognition<\/b><\/span><\/h3>\n<div>\n<div>How can agents recognize the plans, activities, and intents of other agents and use that information to plan their response?<\/div>\n<div><div class=\"bg-margin-for-link\"><input type='hidden' bg_collapse_expand='69d0b4f831d700056588400' value='69d0b4f831d700056588400'><input type='hidden' id='bg-show-more-text-69d0b4f831d700056588400' value='Show Related Publications'><input type='hidden' id='bg-show-less-text-69d0b4f831d700056588400' value='Hide Related Publications'><a id='bg-showmore-action-69d0b4f831d700056588400' class='bg-showmore-plg-link bg-arrow '  style=\" color:#7C2622;;\" href='#'>Show Related Publications<\/a><div id='bg-showmore-hidden-69d0b4f831d700056588400' ><div class=\"teachpress_pub_list\"><form name=\"tppublistform\" method=\"get\"><a name=\"tppubs\" id=\"tppubs\"><\/a><\/form><table class=\"teachpress_publication_list\"><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Miura, 
Shuwa;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1188','tp_links')\" style=\"cursor:pointer;\">Observer-Aware Planning with Implicit and Explicit Communication<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS), <\/span><span class=\"tp_pub_additional_address\">Auckland, New Zealand, <\/span><span class=\"tp_pub_additional_year\">2024<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1188\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1188','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1188\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1188','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1188\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1188','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1188\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MZaamas24,<br \/>\r\ntitle = {Observer-Aware Planning with Implicit and Explicit Communication},<br \/>\r\nauthor = {Shuwa Miura and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas24.pdf},<br \/>\r\nyear  = {2024},<br \/>\r\ndate = {2024-01-01},<br \/>\r\nbooktitle = {Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},<br \/>\r\naddress = {Auckland, New Zealand},<br \/>\r\nabstract = {This paper presents a computational model designed for planning both implicit and explicit communication 
of intentions, goals, and desires. Building upon previous research focused on implicit communication of intention via actions, our model seeks to strategically influence an observer\u2019s belief using both the agent\u2019s actions and explicit messages. We show that our proposed model can be considered to be a special case of general multi-agent problems with explicit communication under certain assumptions. Since the mental state of the observer depends on histories, computing a policy for the proposed model amounts to optimizing a non-Markovian objective, which we show to be intractable in the worst case. To mitigate this challenge, we propose a technique based on splitting domain and communication actions during planning. We conclude with experimental evaluations of the proposed approach that illustrate its effectiveness.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1188','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1188\" style=\"display:none;\"><div class=\"tp_abstract_entry\">This paper presents a computational model designed for planning both implicit and explicit communication of intentions, goals, and desires. Building upon previous research focused on implicit communication of intention via actions, our model seeks to strategically influence an observer\u2019s belief using both the agent\u2019s actions and explicit messages. We show that our proposed model can be considered to be a special case of general multi-agent problems with explicit communication under certain assumptions. Since the mental state of the observer depends on histories, computing a policy for the proposed model amounts to optimizing a non-Markovian objective, which we show to be intractable in the worst case. 
To mitigate this challenge, we propose a technique based on splitting domain and communication actions during planning. We conclude with experimental evaluations of the proposed approach that illustrate its effectiveness.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1188','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1188\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas24.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas24.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZaamas24.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1188','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_article\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Nashed, Samer;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1155','tp_links')\" style=\"cursor:pointer;\">A Survey of Opponent Modeling in Adversarial Domains<\/a> <span class=\"tp_pub_type tp_  article\">Journal Article<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_in\">In: <\/span><span class=\"tp_pub_additional_journal\">Journal of Artificial Intelligence Research (JAIR), <\/span><span class=\"tp_pub_additional_volume\">vol. 73, <\/span><span class=\"tp_pub_additional_pages\">pp. 
277\u2013327, <\/span><span class=\"tp_pub_additional_year\">2022<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1155\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1155','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1155\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1155','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1155\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1155','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1155\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@article{SZ:NZjair22,<br \/>\r\ntitle = {A Survey of Opponent Modeling in Adversarial Domains},<br \/>\r\nauthor = {Samer Nashed and Shlomo Zilberstein},<br \/>\r\nurl = {https:\/\/jair.org\/index.php\/jair\/article\/view\/12889\/26762},<br \/>\r\ndoi = {10.1613\/jair.1.12889},<br \/>\r\nyear  = {2022},<br \/>\r\ndate = {2022-01-01},<br \/>\r\njournal = {Journal of Artificial Intelligence Research (JAIR)},<br \/>\r\nvolume = {73},<br \/>\r\npages = {277--327},<br \/>\r\nabstract = {Opponent modeling is the ability to use prior knowledge and observations in order to predict the behavior of an opponent. This survey presents a comprehensive overview of existing opponent modeling techniques for adversarial domains, many of which must address stochastic, continuous, or concurrent actions, and sparse, partially observable payoff structures. We discuss all the components of opponent modeling systems, including feature extraction, learning algorithms, and strategy abstractions. These discussions lead us to propose a new form of analysis for describing and predicting the evolution of game states over time. 
We then introduce a new framework that facilitates method comparison, analyze a representative selection of techniques using the proposed framework, and highlight common trends among recently proposed methods. Finally, we list several open problems and discuss future research directions inspired by AI research on opponent modeling and related research in other disciplines.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {article}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1155','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1155\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Opponent modeling is the ability to use prior knowledge and observations in order to predict the behavior of an opponent. This survey presents a comprehensive overview of existing opponent modeling techniques for adversarial domains, many of which must address stochastic, continuous, or concurrent actions, and sparse, partially observable payoff structures. We discuss all the components of opponent modeling systems, including feature extraction, learning algorithms, and strategy abstractions. These discussions lead us to propose a new form of analysis for describing and predicting the evolution of game states over time. We then introduce a new framework that facilitates method comparison, analyze a representative selection of techniques using the proposed framework, and highlight common trends among recently proposed methods. 
Finally, we list several open problems and discuss future research directions inspired by AI research on opponent modeling and related research in other disciplines.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1155','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1155\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/jair.org\/index.php\/jair\/article\/view\/12889\/26762\" title=\"https:\/\/jair.org\/index.php\/jair\/article\/view\/12889\/26762\" target=\"_blank\">https:\/\/jair.org\/index.php\/jair\/article\/view\/12889\/26762<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1613\/jair.1.12889\" title=\"Follow DOI:10.1613\/jair.1.12889\" target=\"_blank\">doi:10.1613\/jair.1.12889<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1155','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Miura, Shuwa;  Cohen, Andrew L;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1145','tp_links')\" style=\"cursor:pointer;\">Maximizing Legibility in Stochastic Environments<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 30th IEEE International Conference on Robot &amp; Human Interactive Communication, (RO-MAN), <\/span><span class=\"tp_pub_additional_address\">Vancouver, BC, Canada, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1145\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1145','tp_abstract')\" 
title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1145\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1145','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1145\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1145','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1145\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MCZroman21,<br \/>\r\ntitle = {Maximizing Legibility in Stochastic Environments},<br \/>\r\nauthor = {Shuwa Miura and Andrew L Cohen and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MCZroman21.pdf},<br \/>\r\ndoi = {10.1109\/RO-MAN50785.2021.9515318},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the 30th IEEE International Conference on Robot & Human Interactive Communication, (RO-MAN)},<br \/>\r\npages = {1053--1059},<br \/>\r\naddress = {Vancouver, BC, Canada},<br \/>\r\nabstract = {Making an agent's intentions clear from its observed behavior is crucial for seamless human-agent interaction and for increased transparency and trust in AI systems. Existing methods that address this challenge and maximize legibility of behaviors are limited to deterministic domains. We develop a technique for maximizing legibility in stochastic environments and illustrate that using legibility as an objective improves interpretability of agent behavior in several scenarios. 
We provide initial empirical evidence that human subjects can better interpret legible behavior.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1145','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1145\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Making an agent's intentions clear from its observed behavior is crucial for seamless human-agent interaction and for increased transparency and trust in AI systems. Existing methods that address this challenge and maximize legibility of behaviors are limited to deterministic domains. We develop a technique for maximizing legibility in stochastic environments and illustrate that using legibility as an objective improves interpretability of agent behavior in several scenarios. We provide initial empirical evidence that human subjects can better interpret legible behavior.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1145','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1145\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MCZroman21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MCZroman21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MCZroman21.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1109\/RO-MAN50785.2021.9515318\" title=\"Follow DOI:10.1109\/RO-MAN50785.2021.9515318\" target=\"_blank\">doi:10.1109\/RO-MAN50785.2021.9515318<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1145','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr 
class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Miura, Shuwa;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1146','tp_links')\" style=\"cursor:pointer;\">A Unifying Framework for Observer-Aware Planning and its Complexity<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 37th Conference on Uncertainty in Artificial Intelligence (UAI), <\/span><span class=\"tp_pub_additional_address\">Virtual Event, <\/span><span class=\"tp_pub_additional_year\">2021<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1146\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1146','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1146\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1146','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1146\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1146','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1146\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:MZuai21,<br \/>\r\ntitle = {A Unifying Framework for Observer-Aware Planning and its Complexity},<br \/>\r\nauthor = {Shuwa Miura and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZuai21.pdf},<br \/>\r\nyear  = {2021},<br \/>\r\ndate = {2021-01-01},<br \/>\r\nbooktitle = {Proceedings of the 37th Conference on Uncertainty in Artificial Intelligence (UAI)},<br \/>\r\npages = {610--620},<br \/>\r\naddress = {Virtual Event},<br \/>\r\nabstract = {Being aware of 
observers and the inferences they make about an agent's behavior is crucial for successful multi-agent interaction. Existing works on observer-aware planning use different assumptions and techniques to produce observer-aware behaviors. We argue that observer-aware planning, in its most general form, can be modeled as an Interactive POMDP (I-POMDP), which requires complex modeling and is hard to solve. Hence, we introduce a less complex framework for producing observer-aware behaviors called Observer-Aware MDP (OAMDP) and analyze its relationship to I-POMDP. We establish the complexity of OAMDPs and show that they can improve interpretability of agent behaviors in several scenarios.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1146','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1146\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Being aware of observers and the inferences they make about an agent's behavior is crucial for successful multi-agent interaction. Existing works on observer-aware planning use different assumptions and techniques to produce observer-aware behaviors. We argue that observer-aware planning, in its most general form, can be modeled as an Interactive POMDP (I-POMDP), which requires complex modeling and is hard to solve. Hence, we introduce a less complex framework for producing observer-aware behaviors called Observer-Aware MDP (OAMDP) and analyze its relationship to I-POMDP. 
We establish the complexity of OAMDPs and show that they can improve interpretability of agent behaviors in several scenarios.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1146','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1146\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZuai21.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZuai21.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/MZuai21.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1146','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wayllace, Christabel;  Keren, Sarah;  Gal, Avigdor;  Karpas, Erez;  Yeoh, William;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1122','tp_links')\" style=\"cursor:pointer;\">Accounting for Observer's Partial Observability in Stochastic Goal Recognition Design<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 24th European Conference on Artificial Intelligence (ECAI), <\/span><span class=\"tp_pub_additional_year\">2020<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1122\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1122','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1122\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1122','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | 
<span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1122\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1122','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1122\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WKGKYZecai20,<br \/>\r\ntitle = {Accounting for Observer's Partial Observability in Stochastic Goal Recognition Design},<br \/>\r\nauthor = {Christabel Wayllace and Sarah Keren and Avigdor Gal and Erez Karpas and William Yeoh and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKGKYZecai20.pdf},<br \/>\r\nyear  = {2020},<br \/>\r\ndate = {2020-01-01},<br \/>\r\nbooktitle = {Proceedings of the 24th European Conference on Artificial Intelligence (ECAI)},<br \/>\r\nabstract = {Motivated by security applications, where agent intentions are unknown, actions may have stochastic outcomes, and an observer may have an obfuscated view due to low sensor resolution, we introduce partially-observable states and unobservable actions into a stochastic goal recognition design framework. The proposed model is accompanied by a method for calculating the expected maximal number of steps before the goal of an agent is revealed and a new sensor refinement modification that can be applied to enhance goal recognition. 
A preliminary empirical evaluation on a range of benchmark applications shows the effectiveness of our approach.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1122','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1122\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Motivated by security applications, where agent intentions are unknown, actions may have stochastic outcomes, and an observer may have an obfuscated view due to low sensor resolution, we introduce partially-observable states and unobservable actions into a stochastic goal recognition design framework. The proposed model is accompanied by a method for calculating the expected maximal number of steps before the goal of an agent is revealed and a new sensor refinement modification that can be applied to enhance goal recognition. 
A preliminary empirical evaluation on a range of benchmark applications shows the effectiveness of our approach.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1122','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1122\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKGKYZecai20.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKGKYZecai20.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKGKYZecai20.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1122','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Dwaraki, Abhishek;  Freedman, Richard G;  Zilberstein, Shlomo;  Wolf, Tilman<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('858','tp_links')\" style=\"cursor:pointer;\">Using Natural Language Constructs and Concepts to Aid Network Management<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the International Conference on Computing, Networking and Communications, <\/span><span class=\"tp_pub_additional_address\">Honolulu, Hawaii, <\/span><span class=\"tp_pub_additional_year\">2019<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_858\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('858','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_858\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('858','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_858\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('858','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_858\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:DFZWiccnc19,<br \/>\r\ntitle = {Using Natural Language Constructs and Concepts to Aid Network Management},<br \/>\r\nauthor = {Abhishek Dwaraki and Richard G Freedman and Shlomo Zilberstein and Tilman Wolf},<br \/>\r\nurl = {https:\/\/doi.org\/10.1109\/ICCNC.2019.8685639},<br \/>\r\ndoi = {10.1109\/ICCNC.2019.8685639},<br \/>\r\nyear  = {2019},<br \/>\r\ndate = {2019-01-01},<br \/>\r\nbooktitle = {Proceedings of the International Conference on Computing, Networking and Communications},<br \/>\r\npages = {802--808},<br \/>\r\naddress = {Honolulu, Hawaii},<br \/>\r\nabstract = {The increasing complexity of networks together with technological trends that allow for fine-grained control and programmability have made network management a pressing challenge. In this work, we propose to harness the vast amounts of network management data that are available from different sources in an automated system that can infer context and semantics. We present an argument for a Network Processing Language that is based on the ideas of natural language processing. Our approach shows how concepts, such as collocations, can be applied to network management data. We demonstrate the effectiveness of our approach to detect route prefix and sub-prefix hijacks. 
This work presents one step toward effectively using automated tools for network management in complex, programmable networks.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('858','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_858\" style=\"display:none;\"><div class=\"tp_abstract_entry\">The increasing complexity of networks together with technological trends that allow for fine-grained control and programmability have made network management a pressing challenge. In this work, we propose to harness the vast amounts of network management data that are available from different sources in an automated system that can infer context and semantics. We present an argument for a Network Processing Language that is based on the ideas of natural language processing. Our approach shows how concepts, such as collocations, can be applied to network management data. We demonstrate the effectiveness of our approach to detect route prefix and sub-prefix hijacks. 
This work presents one step toward effectively using automated tools for network management in complex, programmable networks.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('858','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_858\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-globe\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/doi.org\/10.1109\/ICCNC.2019.8685639\" title=\"https:\/\/doi.org\/10.1109\/ICCNC.2019.8685639\" target=\"_blank\">https:\/\/doi.org\/10.1109\/ICCNC.2019.8685639<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.1109\/ICCNC.2019.8685639\" title=\"Follow DOI:10.1109\/ICCNC.2019.8685639\" target=\"_blank\">doi:10.1109\/ICCNC.2019.8685639<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('858','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Keren, Sarah;  Pineda, Luis Enrique;  Gal, Avigdor;  Karpas, Erez;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1102','tp_links')\" style=\"cursor:pointer;\">Responsive Planning and Recognition for Closed-Loop Interaction<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 29th International Conference on Automated Planning and Scheduling (ICAPS), <\/span><span class=\"tp_pub_additional_address\">Berkeley, CA, <\/span><span class=\"tp_pub_additional_year\">2019<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1102\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1102','tp_abstract')\" title=\"Show abstract\" 
style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_1102\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1102','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1102\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1102','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1102\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:KPGKZicaps19,<br \/>\r\ntitle = {Responsive Planning and Recognition for Closed-Loop Interaction},<br \/>\r\nauthor = {Sarah Keren and Luis Enrique Pineda and Avigdor Gal and Erez Karpas and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KPGKZicaps19.pdf},<br \/>\r\nyear  = {2019},<br \/>\r\ndate = {2019-01-01},<br \/>\r\nbooktitle = {Proceedings of the 29th International Conference on Automated Planning and Scheduling (ICAPS)},<br \/>\r\npages = {246--254},<br \/>\r\naddress = {Berkeley, CA},<br \/>\r\nabstract = {Given an environment, the utility measure of the agents acting within it, a set of possible environment modifications, and a description of design constraints, the objective of equi-reward utility maximizing design (ER-UMD) is to find a valid sequence of modifications to apply to the environment in order to maximize agent utility. To efficiently traverse the typically large space of possible design options, we use heuristic search and propose new heuristics, which relax the design process; instead of computing the value achieved by a single modification, we use a dominating modification guaranteed to be at least as beneficial. The proposed technique enables heuristic caching for similar nodes thereby saving computational overhead. 
We specify sufficient conditions under which our approach is guaranteed to produce admissible estimates, and describe a range of models that comply with these requirements. Also, for models with lifted representations of environment modifications, we provide simple methods to automatically generate dominating modifications. We evaluate our approach on a range of stochastic settings for which our heuristic is admissible. We demonstrate its efficiency by comparing it to a previously suggested heuristic that employs a relaxation of the environment, and to a compilation from ER-UMD to planning.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1102','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1102\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Given an environment, the utility measure of the agents acting within it, a set of possible environment modifications, and a description of design constraints, the objective of equi-reward utility maximizing design (ER-UMD) is to find a valid sequence of modifications to apply to the environment in order to maximize agent utility. To efficiently traverse the typically large space of possible design options, we use heuristic search and propose new heuristics, which relax the design process; instead of computing the value achieved by a single modification, we use a dominating modification guaranteed to be at least as beneficial. The proposed technique enables heuristic caching for similar nodes thereby saving computational overhead. We specify sufficient conditions under which our approach is guaranteed to produce admissible estimates, and describe a range of models that comply with these requirements. 
Also, for models with lifted representations of environment modifications, we provide simple methods to automatically generate dominating modifications. We evaluate our approach on a range of stochastic settings for which our heuristic is admissible. We demonstrate its efficiency by comparing it to a previously suggested heuristic that employs a relaxation of the environment, and to a compilation from ER-UMD to planning.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1102','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1102\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KPGKZicaps19.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KPGKZicaps19.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KPGKZicaps19.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1102','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Freedman, Richard G;  Fung, Yi Ren;  Ganchin, Roman;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('864','tp_links')\" style=\"cursor:pointer;\">Towards Quicker Probabilistic Recognition with Multiple Goal Heuristic Search<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">AAAI Workshop on Plan, Activity, and Intent Recognition (PAIR), <\/span><span class=\"tp_pub_additional_address\">New Orleans, Louisiana, <\/span><span class=\"tp_pub_additional_year\">2018<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_864\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('864','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_864\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('864','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_864\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('864','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_864\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:FFGZpair18,<br \/>\r\ntitle = {Towards Quicker Probabilistic Recognition with Multiple Goal Heuristic Search},<br \/>\r\nauthor = {Richard G Freedman and Yi Ren Fung and Roman Ganchin and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FFGZpair18.pdf},<br \/>\r\nyear  = {2018},<br \/>\r\ndate = {2018-01-01},<br \/>\r\nbooktitle = {AAAI Workshop on Plan, Activity, and Intent Recognition (PAIR)},<br \/>\r\npages = {601--606},<br \/>\r\naddress = {New Orleans, Louisiana},<br \/>\r\nabstract = {Referred to as an approach for either plan or goal recognition, the original method proposed by Ramirez and Geffner introduced a domain-based approach that did not need a library containing specific plan instances. This introduced a more generalizable means of representing tasks to be recognized, but was also very slow due to its need to run simulations via multiple executions of an off-the-shelf classical planner. Several variations have since been proposed for quicker recognition, but each one uses a drastically different approach that must sacrifice other qualities useful for processing the recognition results in more complex systems. We present work in progress that takes advantage of the shared state space between planner executions to perform multiple goal heuristic search. 
This single execution of a planner will potentially speed up the recognition process using the original method, which also maintains the sacrificed properties and improves some of the assumptions made by Ramirez and Geffner.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('864','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_864\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Referred to as an approach for either plan or goal recognition, the original method proposed by Ramirez and Geffner introduced a domain-based approach that did not need a library containing specific plan instances. This introduced a more generalizable means of representing tasks to be recognized, but was also very slow due to its need to run simulations via multiple executions of an off-the-shelf classical planner. Several variations have since been proposed for quicker recognition, but each one uses a drastically different approach that must sacrifice other qualities useful for processing the recognition results in more complex systems. We present work in progress that takes advantage of the shared state space between planner executions to perform multiple goal heuristic search. 
This single execution of a planner will potentially speed up the recognition process using the original method, which also maintains the sacrificed properties and improves some of the assumptions made by Ramirez and Geffner.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('864','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_864\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FFGZpair18.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FFGZpair18.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FFGZpair18.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('864','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Freedman, Richard G;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('865','tp_links')\" style=\"cursor:pointer;\">Roles that Plan, Activity, and Intent Recognition with Planning Can Play in Games<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">AAAI Workshop on Knowledge Extraction from Games (KEG), <\/span><span class=\"tp_pub_additional_address\">New Orleans, Louisiana, <\/span><span class=\"tp_pub_additional_year\">2018<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_865\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('865','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_865\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('865','tp_links')\" 
title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_865\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('865','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_865\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:FZkeg18,<br \/>\r\ntitle = {Roles that Plan, Activity, and Intent Recognition with Planning Can Play in Games},<br \/>\r\nauthor = {Richard G Freedman and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZkeg18.pdf},<br \/>\r\nyear  = {2018},<br \/>\r\ndate = {2018-01-01},<br \/>\r\nbooktitle = {AAAI Workshop on Knowledge Extraction from Games (KEG)},<br \/>\r\naddress = {New Orleans, Louisiana},<br \/>\r\nabstract = {Planning is one of the oldest areas of research within artificial intelligence, studying the selection of actions for accomplishing goals. The more recently established areas of plan, activity, and intent recognition instead study an agent's behavior and task(s) given observations of its chosen actions. While these areas have been independently studied and applied to games in the past for both understanding player behavior and developing game characters, the potential for their integration presents even more opportunities via adaptive interaction with the player. 
In this manuscript, we discuss recent research on the integration of these areas and investigate potential uses for such integrated systems in games.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('865','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_865\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Planning is one of the oldest areas of research within artificial intelligence, studying the selection of actions for accomplishing goals. The more recently established areas of plan, activity, and intent recognition instead study an agent's behavior and task(s) given observations of its chosen actions. While these areas have been independently studied and applied to games in the past for both understanding player behavior and developing game characters, the potential for their integration presents even more opportunities via adaptive interaction with the player. 
In this manuscript, we discuss recent research on the integration of these areas and investigate potential uses for such integrated systems in games.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('865','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_865\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZkeg18.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZkeg18.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZkeg18.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('865','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Freedman, Richard G;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('875','tp_links')\" style=\"cursor:pointer;\">Integration of Planning with Recognition for Responsive Interaction Using Classical Planners<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 31st Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">San Francisco, California, <\/span><span class=\"tp_pub_additional_year\">2017<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_875\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('875','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_875\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('875','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_875\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('875','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_875\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:FZaaai17,<br \/>\r\ntitle = {Integration of Planning with Recognition for Responsive Interaction Using Classical Planners},<br \/>\r\nauthor = {Richard G Freedman and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZaaai17.pdf},<br \/>\r\nyear  = {2017},<br \/>\r\ndate = {2017-01-01},<br \/>\r\nbooktitle = {Proceedings of the 31st Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {4581--4588},<br \/>\r\naddress = {San Francisco, California},<br \/>\r\nabstract = {Interaction between multiple agents requires some form of coordination and a level of mutual awareness. When computers and robots interact with people, they need to recognize human plans and react appropriately. Plan and goal recognition techniques have focused on identifying an agent's task given a sufficiently long action sequence. However, by the time the plan and\/or goal are recognized, it may be too late for computing an interactive response. We propose an integration of planning with probabilistic recognition where each method uses intermediate results from the other as a guiding heuristic for recognition of the plan\/goal in-progress as well as the interactive response. We show that, like the used recognition method, these interaction problems can be compiled into classical planning problems and solved using off-the-shelf methods. 
In addition to the methodology, this paper introduces problem categories for different forms of interaction, an evaluation metric for the benefits from the interaction, and extensions to the recognition algorithm that make its intermediate results more practical while the plan is in progress.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('875','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_875\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Interaction between multiple agents requires some form of coordination and a level of mutual awareness. When computers and robots interact with people, they need to recognize human plans and react appropriately. Plan and goal recognition techniques have focused on identifying an agent's task given a sufficiently long action sequence. However, by the time the plan and\/or goal are recognized, it may be too late for computing an interactive response. We propose an integration of planning with probabilistic recognition where each method uses intermediate results from the other as a guiding heuristic for recognition of the plan\/goal in-progress as well as the interactive response. We show that, like the used recognition method, these interaction problems can be compiled into classical planning problems and solved using off-the-shelf methods. 
In addition to the methodology, this paper introduces problem categories for different forms of interaction, an evaluation metric for the benefits from the interaction, and extensions to the recognition algorithm that make its intermediate results more practical while the plan is in progress.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('875','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_875\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZaaai17.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZaaai17.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZaaai17.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('875','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Keren, Sarah;  Pineda, Luis Enrique;  Gal, Avigdor;  Karpas, Erez;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('1101','tp_links')\" style=\"cursor:pointer;\">Equi-Reward Utility Maximizing Design in Stochastic Environments<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Melbourne, Australia, <\/span><span class=\"tp_pub_additional_year\">2017<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_1101\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1101','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span 
class=\"tp_resource_link\"><a id=\"tp_links_sh_1101\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1101','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_1101\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('1101','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_1101\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:KPGKZijcai17,<br \/>\r\ntitle = {Equi-Reward Utility Maximizing Design in Stochastic Environments},<br \/>\r\nauthor = {Sarah Keren and Luis Enrique Pineda and Avigdor Gal and Erez Karpas and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KPGKZijcai17.pdf},<br \/>\r\ndoi = {10.24963\/ijcai.2017\/608},<br \/>\r\nyear  = {2017},<br \/>\r\ndate = {2017-01-01},<br \/>\r\nbooktitle = {Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {4353--4360},<br \/>\r\naddress = {Melbourne, Australia},<br \/>\r\nabstract = {We present the Equi-Reward Utility Maximizing Design (ER-UMD) problem for redesigning stochastic environments to maximize agent performance. ER-UMD fits well contemporary applications that require offline design of environments where robots and humans act and cooperate. To find an optimal modification sequence we present two novel solution techniques: a compilation that embeds design into a planning problem, allowing use of off-the-shelf solvers to find a solution, and a heuristic search in the modifications space, for which we present an admissible heuristic. 
Evaluation shows the feasibility of the approach using standard benchmarks from the probabilistic planning competition and a benchmark we created for a vacuum cleaning robot setting.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1101','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_1101\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We present the Equi-Reward Utility Maximizing Design (ER-UMD) problem for redesigning stochastic environments to maximize agent performance. ER-UMD fits well contemporary applications that require offline design of environments where robots and humans act and cooperate. To find an optimal modification sequence we present two novel solution techniques: a compilation that embeds design into a planning problem, allowing use of off-the-shelf solvers to find a solution, and a heuristic search in the modifications space, for which we present an admissible heuristic. 
Evaluation shows the feasibility of the approach using standard benchmarks from the probabilistic planning competition and a benchmark we created for a vacuum cleaning robot setting.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1101','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_1101\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KPGKZijcai17.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KPGKZijcai17.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KPGKZijcai17.pdf<\/a><\/li><li><i class=\"ai ai-doi\"><\/i><a class=\"tp_pub_list\" href=\"https:\/\/dx.doi.org\/10.24963\/ijcai.2017\/608\" title=\"Follow DOI:10.24963\/ijcai.2017\/608\" target=\"_blank\">doi:10.24963\/ijcai.2017\/608<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('1101','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Freedman, Richard G;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('908','tp_links')\" style=\"cursor:pointer;\">Automated Interpretations of Unsupervised Learning-Derived Clusters for Activity Recognition<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Ro-Man Workshop on Learning for Human-Robot Collaboration, <\/span><span class=\"tp_pub_additional_address\">Kobe, Japan, <\/span><span class=\"tp_pub_additional_year\">2015<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_resource_link\"><a id=\"tp_links_sh_908\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('908','tp_links')\" title=\"Show links and 
resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_908\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('908','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_908\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:FZromanW15,<br \/>\r\ntitle = {Automated Interpretations of Unsupervised Learning-Derived Clusters for Activity Recognition},<br \/>\r\nauthor = {Richard G Freedman and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZromanW15.pdf},<br \/>\r\nyear  = {2015},<br \/>\r\ndate = {2015-01-01},<br \/>\r\nbooktitle = {Ro-Man Workshop on Learning for Human-Robot Collaboration},<br \/>\r\naddress = {Kobe, Japan},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('908','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_908\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZromanW15.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZromanW15.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FZromanW15.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('908','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Freedman, Richard G;  Jung, Hee-Tae;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('910','tp_links')\" style=\"cursor:pointer;\">Temporal and Object Relations in Unsupervised Plan 
and Activity Recognition<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">AAAI Fall Symposium on Artificial Intelligence and Human-Robot Interaction (AI-HRI), <\/span><span class=\"tp_pub_additional_address\">Arlington, Virginia, <\/span><span class=\"tp_pub_additional_year\">2015<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_910\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('910','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_910\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('910','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_910\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('910','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_910\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:FJZfall15,<br \/>\r\ntitle = {Temporal and Object Relations in Unsupervised Plan and Activity Recognition},<br \/>\r\nauthor = {Richard G Freedman and Hee-Tae Jung and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FJZfall15.pdf},<br \/>\r\nyear  = {2015},<br \/>\r\ndate = {2015-01-01},<br \/>\r\nbooktitle = {AAAI Fall Symposium on Artificial Intelligence and Human-Robot Interaction (AI-HRI)},<br \/>\r\naddress = {Arlington, Virginia},<br \/>\r\nabstract = {We consider ways to improve the performance of unsupervised plan and activity recognition techniques by considering temporal and object relations in addition to postural data. Temporal relationships can help recognize activities with cyclic structure and are often implicit because plans have degrees of ordering actions. 
Relations with objects can help disambiguate observed activities that otherwise share a user's posture and position. We develop and investigate graphical models that extend the popular latent Dirichlet allocation approach with temporal and object relations, examine the relative performance and runtime trade-offs using a standard dataset, and consider the cost\/benefit trade-offs these extensions offer in the context of human-robot and human-computer interaction.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('910','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_910\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We consider ways to improve the performance of unsupervised plan and activity recognition techniques by considering temporal and object relations in addition to postural data. Temporal relationships can help recognize activities with cyclic structure and are often implicit because plans have degrees of ordering actions. Relations with objects can help disambiguate observed activities that otherwise share a user's posture and position. 
We develop and investigate graphical models that extend the popular latent Dirichlet allocation approach with temporal and object relations, examine the relative performance and runtime trade-offs using a standard dataset, and consider the cost\/benefit trade-offs these extensions offer in the context of human-robot and human-computer interaction.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('910','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_910\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FJZfall15.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FJZfall15.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FJZfall15.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('910','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Freedman, Richard G;  Jung, Hee-Tae;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('915','tp_links')\" style=\"cursor:pointer;\">Plan and Activity Recognition from a Topic Modeling Perspective<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 24th International Conference on Automated Planning and Scheduling (ICAPS), <\/span><span class=\"tp_pub_additional_address\">Portsmouth, New Hampshire, <\/span><span class=\"tp_pub_additional_year\">2014<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_915\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('915','tp_abstract')\" title=\"Show abstract\" 
style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_915\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('915','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_915\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('915','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_915\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:FJZicaps14,<br \/>\r\ntitle = {Plan and Activity Recognition from a Topic Modeling Perspective},<br \/>\r\nauthor = {Richard G Freedman and Hee-Tae Jung and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FJZicaps14.pdf},<br \/>\r\nyear  = {2014},<br \/>\r\ndate = {2014-01-01},<br \/>\r\nbooktitle = {Proceedings of the 24th International Conference on Automated Planning and Scheduling (ICAPS)},<br \/>\r\npages = {360--364},<br \/>\r\naddress = {Portsmouth, New Hampshire},<br \/>\r\nabstract = {We examine new ways to perform plan recognition (PR) using natural language processing (NLP) techniques. PR often focuses on the structural relationships between consecutive observations and ordered activities that comprise plans. However, NLP commonly treats text as a bag-of-words, omitting such structural relationships and using topic models to break down the distribution of concepts discussed in documents. In this paper, we examine an analogous treatment of plans as distributions of activities. We explore the application of Latent Dirichlet Allocation topic models to human skeletal data of plan execution traces obtained from an RGB-D sensor. This investigation focuses on representing the data as text and interpreting learned activities as a form of activity recognition (AR). Additionally, we explain how the system may perform PR. 
The initial empirical results suggest that such NLP methods can be useful in complex PR and AR tasks.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('915','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_915\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We examine new ways to perform plan recognition (PR) using natural language processing (NLP) techniques. PR often focuses on the structural relationships between consecutive observations and ordered activities that comprise plans. However, NLP commonly treats text as a bag-of-words, omitting such structural relationships and using topic models to break down the distribution of concepts discussed in documents. In this paper, we examine an analogous treatment of plans as distributions of activities. We explore the application of Latent Dirichlet Allocation topic models to human skeletal data of plan execution traces obtained from an RGB-D sensor. This investigation focuses on representing the data as text and interpreting learned activities as a form of activity recognition (AR). Additionally, we explain how the system may perform PR. 
The initial empirical results suggest that such NLP methods can be useful in complex PR and AR tasks.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('915','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_915\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FJZicaps14.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FJZicaps14.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/FJZicaps14.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('915','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><\/table><\/div><\/div>\n<div><\/div><\/div><\/div>\n<\/div>\n<h3><b><span style=\"color: #264278\">Stochastic Network Design and Optimization<\/span><\/b><\/h3>\n<div>\n<div>How can we develop scalable algorithms to optimize diffusion processes and use them to control the spread of various phenomena, such as information over a social network or species over a fragmented landscape?<\/div>\n<div><div class=\"bg-margin-for-link\"><input type='hidden' bg_collapse_expand='69d0b4f833f1f7010914995' value='69d0b4f833f1f7010914995'><input type='hidden' id='bg-show-more-text-69d0b4f833f1f7010914995' value='Show Related Publications'><input type='hidden' id='bg-show-less-text-69d0b4f833f1f7010914995' value='Hide Related Publications'><a id='bg-showmore-action-69d0b4f833f1f7010914995' class='bg-showmore-plg-link bg-arrow '  style=\" color:#7C2622;;\" href='#'>Show Related Publications<\/a><div id='bg-showmore-hidden-69d0b4f833f1f7010914995' ><div class=\"teachpress_pub_list\"><form name=\"tppublistform\" method=\"get\"><a name=\"tppubs\" id=\"tppubs\"><\/a><\/form><table class=\"teachpress_publication_list\"><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p 
class=\"tp_pub_author\"> Wu, Xiaojian;  Kumar, Akshat;  Sheldon, Daniel;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('877','tp_links')\" style=\"cursor:pointer;\">Robust Optimization for Tree-Structured Stochastic Network Design<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 31st Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">San Francisco, California, <\/span><span class=\"tp_pub_additional_year\">2017<\/span><span class=\"tp_pub_additional_note\">, (Best Paper Award)<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_877\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('877','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_877\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('877','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_877\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('877','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_877\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WKSZaaai17,<br \/>\r\ntitle = {Robust Optimization for Tree-Structured Stochastic Network Design},<br \/>\r\nauthor = {Xiaojian Wu and Akshat Kumar and Daniel Sheldon and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKSZaaai17.pdf},<br \/>\r\nyear  = {2017},<br \/>\r\ndate = {2017-01-01},<br \/>\r\nbooktitle = {Proceedings of the 31st Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {4545--4551},<br \/>\r\naddress = {San Francisco, 
California},<br \/>\r\nabstract = {Stochastic network design is a general framework for optimizing network connectivity. It has several applications in computational sustainability including spatial conservation planning, pre-disaster network preparation, and river network optimization. A common assumption made in previous work is that network parameters (e.g., probability of species colonization) are precisely known, which is unrealistic in real-world settings. We therefore address the robust river network design problem where the goal is to optimize river connectivity for fish movement by removing barriers. We assume that fish passability probabilities are known only imprecisely, but are within some interval bounds. We then develop a planning approach that computes the policies with either high robust ratio or low regret. Empirically, our approach scales well to large river networks. We also provide insights into the solutions generated by our robust approach, which has significantly higher robust ratio than the baseline solution with mean parameter estimates.},<br \/>\r\nnote = {Best Paper Award},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('877','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_877\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Stochastic network design is a general framework for optimizing network connectivity. It has several applications in computational sustainability including spatial conservation planning, pre-disaster network preparation, and river network optimization. A common assumption made in previous work is that network parameters (e.g., probability of species colonization) are precisely known, which is unrealistic in real-world settings. 
We therefore address the robust river network design problem where the goal is to optimize river connectivity for fish movement by removing barriers. We assume that fish passability probabilities are known only imprecisely, but are within some interval bounds. We then develop a planning approach that computes the policies with either high robust ratio or low regret. Empirically, our approach scales well to large river networks. We also provide insights into the solutions generated by our robust approach, which has significantly higher robust ratio than the baseline solution with mean parameter estimates.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('877','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_877\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKSZaaai17.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKSZaaai17.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKSZaaai17.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('877','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wu, Xiaojian;  Sheldon, Daniel;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('889','tp_links')\" style=\"cursor:pointer;\">Optimizing Resilience in Large Scale Networks<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 30th Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">Phoenix, Arizona, <\/span><span 
class=\"tp_pub_additional_year\">2016<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_889\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('889','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_889\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('889','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_889\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('889','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_889\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WSZaaai16,<br \/>\r\ntitle = {Optimizing Resilience in Large Scale Networks},<br \/>\r\nauthor = {Xiaojian Wu and Daniel Sheldon and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WSZaaai16.pdf},<br \/>\r\nyear  = {2016},<br \/>\r\ndate = {2016-01-01},<br \/>\r\nbooktitle = {Proceedings of the 30th Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {3922--3928},<br \/>\r\naddress = {Phoenix, Arizona},<br \/>\r\nabstract = {We propose a decision making framework to optimize the resilience of road networks to natural disasters such as floods. Our model generalizes an existing one for this problem by allowing roads with a broad class of stochastic delay models. We then present a fast algorithm based on the sample average approximation (SAA) method and network design techniques to solve this problem approximately. 
On a small existing benchmark, our algorithm produces near-optimal solutions and the SAA method converges quickly with a small number of samples. We then apply our algorithm to a large real-world problem to optimize the resilience of a road network to failures of stream crossing structures to minimize travel times of emergency medical service vehicles. On medium-sized networks, our algorithm obtains solutions of comparable quality to a greedy baseline method but is 30--60 times faster. Our algorithm is the only existing algorithm that can scale to the full network, which has many thousands of edges.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('889','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_889\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We propose a decision making framework to optimize the resilience of road networks to natural disasters such as floods. Our model generalizes an existing one for this problem by allowing roads with a broad class of stochastic delay models. We then present a fast algorithm based on the sample average approximation (SAA) method and network design techniques to solve this problem approximately. On a small existing benchmark, our algorithm produces near-optimal solutions and the SAA method converges quickly with a small number of samples. We then apply our algorithm to a large real-world problem to optimize the resilience of a road network to failures of stream crossing structures to minimize travel times of emergency medical service vehicles. On medium-sized networks, our algorithm obtains solutions of comparable quality to a greedy baseline method but is 30--60 times faster. 
Our algorithm is the only existing algorithm that can scale to the full network, which has many thousands of edges.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('889','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_889\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WSZaaai16.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WSZaaai16.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WSZaaai16.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('889','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wu, Xiaojian;  Sheldon, Daniel;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('906','tp_links')\" style=\"cursor:pointer;\">Fast Combinatorial Algorithm for Optimizing the Spread of Cascades<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Buenos Aires, Argentina, <\/span><span class=\"tp_pub_additional_year\">2015<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_906\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('906','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_906\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('906','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | 
<span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_906\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('906','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_906\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WSZijcai15,<br \/>\r\ntitle = {Fast Combinatorial Algorithm for Optimizing the Spread of Cascades},<br \/>\r\nauthor = {Xiaojian Wu and Daniel Sheldon and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WSZijcai15.pdf},<br \/>\r\nyear  = {2015},<br \/>\r\ndate = {2015-01-01},<br \/>\r\nbooktitle = {Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {2655--2661},<br \/>\r\naddress = {Buenos Aires, Argentina},<br \/>\r\nabstract = {We address a spatial conservation planning problem in which the planner purchases a budget-constrained set of land parcels in order to maximize the expected spread of a population of an endangered species. Existing techniques based on the sample average approximation scheme and standard integer programming methods have high complexity and limited scalability. We propose a fast combinatorial optimization algorithm using Lagrangian relaxation and primal-dual techniques to solve the problem approximately. The algorithm provides a new way to address a range of conservation planning and scheduling problems. On the Red-cockaded Woodpecker data, our algorithm produces near optimal solutions and runs significantly faster than a standard mixed integer program solver. Compared with a greedy baseline, the solution quality is comparable or better, but our algorithm is 10--30 times faster. 
On synthetic problems that do not exhibit submodularity, our algorithm significantly outperforms the greedy baseline.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('906','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_906\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We address a spatial conservation planning problem in which the planner purchases a budget-constrained set of land parcels in order to maximize the expected spread of a population of an endangered species. Existing techniques based on the sample average approximation scheme and standard integer programming methods have high complexity and limited scalability. We propose a fast combinatorial optimization algorithm using Lagrangian relaxation and primal-dual techniques to solve the problem approximately. The algorithm provides a new way to address a range of conservation planning and scheduling problems. On the Red-cockaded Woodpecker data, our algorithm produces near optimal solutions and runs significantly faster than a standard mixed integer program solver. Compared with a greedy baseline, the solution quality is comparable or better, but our algorithm is 10--30 times faster. 
On synthetic problems that do not exhibit submodularity, our algorithm significantly outperforms the greedy baseline.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('906','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_906\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WSZijcai15.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WSZijcai15.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WSZijcai15.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('906','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wu, Xiaojian;  Sheldon, Daniel;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('918','tp_links')\" style=\"cursor:pointer;\">Rounded Dynamic Programming for Tree-Structured Stochastic Network Design<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 28th Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">Quebec City, Canada, <\/span><span class=\"tp_pub_additional_year\">2014<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_918\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('918','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_918\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('918','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_918\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('918','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_918\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WSZaaai14,<br \/>\r\ntitle = {Rounded Dynamic Programming for Tree-Structured Stochastic Network Design},<br \/>\r\nauthor = {Xiaojian Wu and Daniel Sheldon and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WSZaaai14.pdf},<br \/>\r\nyear  = {2014},<br \/>\r\ndate = {2014-01-01},<br \/>\r\nbooktitle = {Proceedings of the 28th Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {479--485},<br \/>\r\naddress = {Quebec City, Canada},<br \/>\r\nabstract = {We develop a fast approximation algorithm called rounded dynamic programming (RDP) for stochastic network design problems on directed trees. The underlying model describes phenomena that spread away from the root of a tree, for example, the spread of influence in a hierarchical organization or fish in a river network. Actions can be taken to intervene in the network---for some cost---to increase the probability of propagation along an edge. Our algorithm selects a set of actions to maximize the overall spread in the network under a limited budget. We prove that the algorithm is a fully polynomial-time approximation scheme (FPTAS), that is, it finds (1-ε)-optimal solutions in time polynomial in the input size and 1\/ε. We apply the algorithm to the problem of allocating funds efficiently to remove barriers in a river network so fish can reach greater portions of their native range. 
Our experiments show that the algorithm is able to produce near-optimal solutions much faster than an existing technique.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('918','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_918\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We develop a fast approximation algorithm called rounded dynamic programming (RDP) for stochastic network design problems on directed trees. The underlying model describes phenomena that spread away from the root of a tree, for example, the spread of influence in a hierarchical organization or fish in a river network. Actions can be taken to intervene in the network---for some cost---to increase the probability of propagation along an edge. Our algorithm selects a set of actions to maximize the overall spread in the network under a limited budget. We prove that the algorithm is a fully polynomial-time approximation scheme (FPTAS), that is, it finds (1-ε)-optimal solutions in time polynomial in the input size and 1\/ε. We apply the algorithm to the problem of allocating funds efficiently to remove barriers in a river network so fish can reach greater portions of their native range. 
Our experiments show that the algorithm is able to produce near-optimal solutions much faster than an existing technique.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('918','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_918\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WSZaaai14.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WSZaaai14.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WSZaaai14.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('918','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wu, Xiaojian;  Sheldon, Daniel;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('919','tp_links')\" style=\"cursor:pointer;\">Stochastic Network Design in Bidirected Trees<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 28th Neural Information Processing Systems Conference (NIPS), <\/span><span class=\"tp_pub_additional_address\">Montreal, Canada, <\/span><span class=\"tp_pub_additional_year\">2014<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_919\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('919','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_919\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('919','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span 
<a id=\"tp_bibtex_sh_919\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('919','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_919\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WSZnips14,<br \/>\r\ntitle = {Stochastic Network Design in Bidirected Trees},<br \/>\r\nauthor = {Xiaojian Wu and Daniel Sheldon and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WSZnips14.pdf},<br \/>\r\nyear  = {2014},<br \/>\r\ndate = {2014-01-01},<br \/>\r\nbooktitle = {Proceedings of the 28th Neural Information Processing Systems Conference (NIPS)},<br \/>\r\npages = {882--890},<br \/>\r\naddress = {Montreal, Canada},<br \/>\r\nabstract = {We investigate the problem of stochastic network design in bidirected trees. In this problem, an underlying phenomenon (e.g., a behavior, rumor, or disease) starts at multiple sources in a tree and spreads in both directions along its edges. Actions can be taken to increase the probability of propagation on edges, and the goal is to maximize the total amount of spread away from all sources. Our main result is a rounded dynamic programming approach that leads to a fully polynomial-time approximation scheme (FPTAS), that is, an algorithm that can find (1-ε)-optimal solutions for any problem instance in time polynomial in the input size and 1\/ε. 
Our algorithm outperforms competing approaches on a motivating problem from computational sustainability to remove barriers in river networks to restore the health of aquatic ecosystems.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('919','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_919\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We investigate the problem of stochastic network design in bidirected trees. In this problem, an underlying phenomenon (e.g., a behavior, rumor, or disease) starts at multiple sources in a tree and spreads in both directions along its edges. Actions can be taken to increase the probability of propagation on edges, and the goal is to maximize the total amount of spread away from all sources. Our main result is a rounded dynamic programming approach that leads to a fully polynomial-time approximation scheme (FPTAS), that is, an algorithm that can find (1-ε)-optimal solutions for any problem instance in time polynomial in the input size and 1\/ε. 
Our algorithm outperforms competing approaches on a motivating problem from computational sustainability to remove barriers in river networks to restore the health of aquatic ecosystems.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('919','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_919\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WSZnips14.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WSZnips14.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WSZnips14.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('919','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Wu, Xiaojian;  Kumar, Akshat;  Sheldon, Daniel;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('924','tp_links')\" style=\"cursor:pointer;\">Parameter Learning for Latent Network Diffusion<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI), <\/span><span class=\"tp_pub_additional_address\">Beijing, China, <\/span><span class=\"tp_pub_additional_year\">2013<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_924\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('924','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_924\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('924','tp_links')\" title=\"Show links and 
resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_924\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('924','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_924\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:WKSZijcai13,<br \/>\r\ntitle = {Parameter Learning for Latent Network Diffusion},<br \/>\r\nauthor = {Xiaojian Wu and Akshat Kumar and Daniel Sheldon and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKSZijcai13.pdf},<br \/>\r\nyear  = {2013},<br \/>\r\ndate = {2013-01-01},<br \/>\r\nbooktitle = {Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI)},<br \/>\r\npages = {2923--2930},<br \/>\r\naddress = {Beijing, China},<br \/>\r\nabstract = {Diffusion processes in networks are increasingly used to model dynamic phenomena such as the spread of information, wildlife, or social influence. Our work addresses the problem of learning the underlying parameters that govern such a diffusion process by observing the time at which nodes become active. A key advantage of our approach is that, unlike previous work, it can tolerate missing observations for some nodes in the diffusion process. Having incomplete observations is characteristic of offline networks used to model the spread of wildlife. We develop an EM algorithm to address parameter learning in such settings. Since both the E and M steps are computationally challenging, we employ a number of optimization methods such as nonlinear and difference-of-convex programming to address these challenges. 
Evaluation of the approach on the Red-cockaded Woodpecker conservation problem shows that it is highly robust and accurately learns parameters in various settings, even with more than 80% missing data.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('924','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_924\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Diffusion processes in networks are increasingly used to model dynamic phenomena such as the spread of information, wildlife, or social influence. Our work addresses the problem of learning the underlying parameters that govern such a diffusion process by observing the time at which nodes become active. A key advantage of our approach is that, unlike previous work, it can tolerate missing observations for some nodes in the diffusion process. Having incomplete observations is characteristic of offline networks used to model the spread of wildlife. We develop an EM algorithm to address parameter learning in such settings. Since both the E and M steps are computationally challenging, we employ a number of optimization methods such as nonlinear and difference-of-convex programming to address these challenges. 
Evaluation of the approach on the Red-cockaded Woodpecker conservation problem shows that it is highly robust and accurately learns parameters in various settings, even with more than 80% missing data.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('924','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_924\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKSZijcai13.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKSZijcai13.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/WKSZijcai13.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('924','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Kumar, Akshat;  Zilberstein, Shlomo;  Toussaint, Marc<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('929','tp_links')\" style=\"cursor:pointer;\">Message-Passing Algorithms for MAP Estimation Using DC Programming<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 15th International Conference on Artificial Intelligence and Statistics (AISTATS), <\/span><span class=\"tp_pub_additional_address\">La Palma, Canary Islands, <\/span><span class=\"tp_pub_additional_year\">2012<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_929\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('929','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_929\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('929','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_929\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('929','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_929\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:KZTaistats12,<br \/>\r\ntitle = {Message-Passing Algorithms for MAP Estimation Using DC Programming},<br \/>\r\nauthor = {Akshat Kumar and Shlomo Zilberstein and Marc Toussaint},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZTaistats12.pdf},<br \/>\r\nyear  = {2012},<br \/>\r\ndate = {2012-01-01},<br \/>\r\nbooktitle = {Proceedings of the 15th International Conference on Artificial Intelligence and Statistics (AISTATS)},<br \/>\r\npages = {656--664},<br \/>\r\naddress = {La Palma, Canary Islands},<br \/>\r\nabstract = {We address the problem of finding the most likely assignment or MAP estimation in a Markov random field. We analyze the linear programming formulation of MAP through the lens of difference of convex functions (DC) programming, and use the concave-convex procedure (CCCP) to develop efficient message-passing solvers. The resulting algorithms are guaranteed to converge to a global optimum of the well-studied local polytope, an outer bound on the MAP marginal polytope. To tighten the outer bound, we show how to combine it with the mean-field based inner bound and, again, solve it using CCCP. We also identify a useful relationship between the DC formulations and some recently proposed algorithms based on Bregman divergence. 
Experimentally, this hybrid approach produces optimal solutions for a range of hard OR problems and near-optimal solutions for standard benchmarks.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('929','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_929\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We address the problem of finding the most likely assignment or MAP estimation in a Markov random field. We analyze the linear programming formulation of MAP through the lens of difference of convex functions (DC) programming, and use the concave-convex procedure (CCCP) to develop efficient message-passing solvers. The resulting algorithms are guaranteed to converge to a global optimum of the well-studied local polytope, an outer bound on the MAP marginal polytope. To tighten the outer bound, we show how to combine it with the mean-field based inner bound and, again, solve it using CCCP. We also identify a useful relationship between the DC formulations and some recently proposed algorithms based on Bregman divergence. 
Experimentally, this hybrid approach produces optimal solutions for a range of hard OR problems and near-optimal solutions for standard benchmarks.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('929','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_929\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZTaistats12.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZTaistats12.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZTaistats12.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('929','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Kumar, Akshat;  Wu, Xiaojian;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('930','tp_links')\" style=\"cursor:pointer;\">Lagrangian Relaxation Techniques for Scalable Spatial Conservation Planning<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 26th Conference on Artificial Intelligence (AAAI), <\/span><span class=\"tp_pub_additional_address\">Toronto, Canada, <\/span><span class=\"tp_pub_additional_year\">2012<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_930\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('930','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_930\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('930','tp_links')\" title=\"Show links and resources\" 
style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_930\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('930','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_930\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:KWZaaai12,<br \/>\r\ntitle = {Lagrangian Relaxation Techniques for Scalable Spatial Conservation Planning},<br \/>\r\nauthor = {Akshat Kumar and Xiaojian Wu and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KWZaaai12.pdf},<br \/>\r\nyear  = {2012},<br \/>\r\ndate = {2012-01-01},<br \/>\r\nbooktitle = {Proceedings of the 26th Conference on Artificial Intelligence (AAAI)},<br \/>\r\npages = {309--315},<br \/>\r\naddress = {Toronto, Canada},<br \/>\r\nabstract = {We address the problem of spatial conservation planning in which the goal is to maximize the expected spread of cascades of an endangered species by strategically purchasing land parcels within a given budget. This problem can be solved by standard integer programming methods using the sample average approximation (SAA) scheme. Our main contribution lies in exploiting the separable structure present in this problem and using Lagrangian relaxation techniques to gain scalability over the flat representation. We also generalize the approach to allow the application of the SAA scheme to a range of stochastic optimization problems. Our iterative approach is highly efficient in terms of space requirements and it provides an upper bound over the optimal solution at each iteration. We apply our approach to the Red-cockaded Woodpecker conservation problem. 
The results show that it can find the optimal solution significantly faster -- sometimes by an order-of-magnitude -- than using the flat representation for a range of budget sizes.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('930','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_930\" style=\"display:none;\"><div class=\"tp_abstract_entry\">We address the problem of spatial conservation planning in which the goal is to maximize the expected spread of cascades of an endangered species by strategically purchasing land parcels within a given budget. This problem can be solved by standard integer programming methods using the sample average approximation (SAA) scheme. Our main contribution lies in exploiting the separable structure present in this problem and using Lagrangian relaxation techniques to gain scalability over the flat representation. We also generalize the approach to allow the application of the SAA scheme to a range of stochastic optimization problems. Our iterative approach is highly efficient in terms of space requirements and it provides an upper bound over the optimal solution at each iteration. We apply our approach to the Red-cockaded Woodpecker conservation problem. 
The results show that it can find the optimal solution significantly faster -- sometimes by an order-of-magnitude -- than using the flat representation for a range of budget sizes.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('930','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_930\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KWZaaai12.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KWZaaai12.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KWZaaai12.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('930','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Kumar, Akshat;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('938','tp_links')\" style=\"cursor:pointer;\">Message-Passing Algorithms for Quadratic Programming Formulations of MAP Estimation<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence (UAI), <\/span><span class=\"tp_pub_additional_address\">Barcelona, Spain, <\/span><span class=\"tp_pub_additional_year\">2011<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_938\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('938','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_938\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('938','tp_links')\" title=\"Show links and 
resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_938\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('938','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_938\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:KZuai11,<br \/>\r\ntitle = {Message-Passing Algorithms for Quadratic Programming Formulations of MAP Estimation},<br \/>\r\nauthor = {Akshat Kumar and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZuai11.pdf},<br \/>\r\nyear  = {2011},<br \/>\r\ndate = {2011-01-01},<br \/>\r\nbooktitle = {Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence (UAI)},<br \/>\r\npages = {428--435},<br \/>\r\naddress = {Barcelona, Spain},<br \/>\r\nabstract = {Computing maximum a posteriori (MAP) estimation in graphical models is an important inference problem with many applications. We present message-passing algorithms for quadratic programming (QP) formulations of MAP estimation for pairwise Markov random fields. In particular, we use the concave-convex procedure (CCCP) to obtain a locally optimal algorithm for the non-convex QP formulation. A similar technique is used to derive a globally convergent algorithm for the convex QP relaxation of MAP. We also show that a recently developed expectation-maximization (EM) algorithm for the QP formulation of MAP can be derived from the CCCP perspective. Experiments on synthetic and real-world problems confirm that our new approach is competitive with max-product and its variations. 
Compared with CPLEX, we achieve more than an order-of-magnitude speedup in solving optimally the convex QP relaxation.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('938','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_938\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Computing maximum a posteriori (MAP) estimation in graphical models is an important inference problem with many applications. We present message-passing algorithms for quadratic programming (QP) formulations of MAP estimation for pairwise Markov random fields. In particular, we use the concave-convex procedure (CCCP) to obtain a locally optimal algorithm for the non-convex QP formulation. A similar technique is used to derive a globally convergent algorithm for the convex QP relaxation of MAP. We also show that a recently developed expectation-maximization (EM) algorithm for the QP formulation of MAP can be derived from the CCCP perspective. Experiments on synthetic and real-world problems confirm that our new approach is competitive with max-product and its variations. 
Compared with CPLEX, we achieve more than an order-of-magnitude speedup in solving optimally the convex QP relaxation.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('938','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_938\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZuai11.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZuai11.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZuai11.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('938','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Kumar, Akshat;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('941','tp_links')\" style=\"cursor:pointer;\">On Message-Passing, MAP Estimation in Graphical Models and DCOPs<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">International Workshop on Distributed Constraint Reasoning (DCR), <\/span><span class=\"tp_pub_additional_address\">Barcelona, Spain, <\/span><span class=\"tp_pub_additional_year\">2011<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_941\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('941','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_941\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('941','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a 
id=\"tp_bibtex_sh_941\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('941','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_941\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:KYZdcr11,<br \/>\r\ntitle = {On Message-Passing, MAP Estimation in Graphical Models and DCOPs},<br \/>\r\nauthor = {Akshat Kumar and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KYZdcr11.pdf},<br \/>\r\nyear  = {2011},<br \/>\r\ndate = {2011-01-01},<br \/>\r\nbooktitle = {International Workshop on Distributed Constraint Reasoning (DCR)},<br \/>\r\npages = {57--70},<br \/>\r\naddress = {Barcelona, Spain},<br \/>\r\nabstract = {The maximum a posteriori (MAP) estimation problem in graphical models is a problem common in many applications such as computer vision and bioinformatics. For example, they are used to identify the most likely orientation of proteins in protein design problems. As such, researchers in the machine learning community have developed a variety of approximate algorithms to solve them. On the other hand, distributed constraint optimization problems (DCOPs) are well-suited for modeling many multi-agent coordination problems such as the coordination of sensors in a network and the coordination of power plants. 
In this paper, we show that MAP estimation problems and DCOPs bear strong similarities and, as such, some approximate MAP algorithms such as iterative message passing algorithms can be easily tailored to solve DCOPs as well.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('941','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_941\" style=\"display:none;\"><div class=\"tp_abstract_entry\">The maximum a posteriori (MAP) estimation problem in graphical models is a problem common in many applications such as computer vision and bioinformatics. For example, they are used to identify the most likely orientation of proteins in protein design problems. As such, researchers in the machine learning community have developed a variety of approximate algorithms to solve them. On the other hand, distributed constraint optimization problems (DCOPs) are well-suited for modeling many multi-agent coordination problems such as the coordination of sensors in a network and the coordination of power plants. 
In this paper, we show that MAP estimation problems and DCOPs bear strong similarities and, as such, some approximate MAP algorithms such as iterative message passing algorithms can be easily tailored to solve DCOPs as well.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('941','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_941\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KYZdcr11.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KYZdcr11.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KYZdcr11.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('941','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><tr class=\"tp_publication tp_publication_conference\"><td class=\"tp_pub_info\"><p class=\"tp_pub_author\"> Kumar, Akshat;  Zilberstein, Shlomo<\/p><p class=\"tp_pub_title\"><a class=\"tp_title_link\" onclick=\"teachpress_pub_showhide('959','tp_links')\" style=\"cursor:pointer;\">MAP Estimation for Graphical Models by Likelihood Maximization<\/a> <span class=\"tp_pub_type tp_  conference\">Conference<\/span> <\/p><p class=\"tp_pub_additional\"><span class=\"tp_pub_additional_booktitle\">Proceedings of the 24th Neural Information Processing Systems Conference (NIPS), <\/span><span class=\"tp_pub_additional_address\">Vancouver, British Columbia, Canada, <\/span><span class=\"tp_pub_additional_year\">2010<\/span>.<\/p><p class=\"tp_pub_menu\"><span class=\"tp_abstract_link\"><a id=\"tp_abstract_sh_959\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('959','tp_abstract')\" title=\"Show abstract\" style=\"cursor:pointer;\">Abstract<\/a><\/span> | <span class=\"tp_resource_link\"><a id=\"tp_links_sh_959\" class=\"tp_show\" 
onclick=\"teachpress_pub_showhide('959','tp_links')\" title=\"Show links and resources\" style=\"cursor:pointer;\">Links<\/a><\/span> | <span class=\"tp_bibtex_link\"><a id=\"tp_bibtex_sh_959\" class=\"tp_show\" onclick=\"teachpress_pub_showhide('959','tp_bibtex')\" title=\"Show BibTeX entry\" style=\"cursor:pointer;\">BibTeX<\/a><\/span><\/p><div class=\"tp_bibtex\" id=\"tp_bibtex_959\" style=\"display:none;\"><div class=\"tp_bibtex_entry\"><pre>@conference{SZ:KZnips10,<br \/>\r\ntitle = {MAP Estimation for Graphical Models by Likelihood Maximization},<br \/>\r\nauthor = {Akshat Kumar and Shlomo Zilberstein},<br \/>\r\nurl = {http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZnips10.pdf},<br \/>\r\nyear  = {2010},<br \/>\r\ndate = {2010-01-01},<br \/>\r\nbooktitle = {Proceedings of the 24th Neural Information Processing Systems Conference (NIPS)},<br \/>\r\npages = {1180--1188},<br \/>\r\naddress = {Vancouver, British Columbia, Canada},<br \/>\r\nabstract = {Computing a maximum a posteriori (MAP) assignment in graphical models is a crucial inference problem for many practical applications. Several provably convergent approaches have been successfully developed using linear programming (LP) relaxation of the MAP problem. We present an alternative approach, which transforms the MAP problem into that of inference in a mixture of simple Bayes nets. We then derive the Expectation Maximization (EM) algorithm for this mixture that also monotonically increases a lower bound on the MAP assignment until convergence. The update equations for the EM algorithm are remarkably simple, both conceptually and computationally, and can be implemented using a graph-based message passing paradigm similar to max-product computation. Experiments on the real-world protein design dataset show that EM's convergence rate is significantly higher than the previous LP relaxation based approach MPLP. 
EM also achieves a solution quality within 95% of optimal for most instances.},<br \/>\r\nkeywords = {},<br \/>\r\npubstate = {published},<br \/>\r\ntppubtype = {conference}<br \/>\r\n}<br \/>\r\n<\/pre><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('959','tp_bibtex')\">Close<\/a><\/p><\/div><div class=\"tp_abstract\" id=\"tp_abstract_959\" style=\"display:none;\"><div class=\"tp_abstract_entry\">Computing a maximum a posteriori (MAP) assignment in graphical models is a crucial inference problem for many practical applications. Several provably convergent approaches have been successfully developed using linear programming (LP) relaxation of the MAP problem. We present an alternative approach, which transforms the MAP problem into that of inference in a mixture of simple Bayes nets. We then derive the Expectation Maximization (EM) algorithm for this mixture that also monotonically increases a lower bound on the MAP assignment until convergence. The update equations for the EM algorithm are remarkably simple, both conceptually and computationally, and can be implemented using a graph-based message passing paradigm similar to max-product computation. Experiments on the real-world protein design dataset show that EM's convergence rate is significantly higher than the previous LP relaxation based approach MPLP. 
EM also achieves a solution quality within 95% of optimal for most instances.<\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('959','tp_abstract')\">Close<\/a><\/p><\/div><div class=\"tp_links\" id=\"tp_links_959\" style=\"display:none;\"><div class=\"tp_links_entry\"><ul class=\"tp_pub_list\"><li><i class=\"fas fa-file-pdf\"><\/i><a class=\"tp_pub_list\" href=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZnips10.pdf\" title=\"http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZnips10.pdf\" target=\"_blank\">http:\/\/rbr.cs.umass.edu\/shlomo\/papers\/KZnips10.pdf<\/a><\/li><\/ul><\/div><p class=\"tp_close_menu\"><a class=\"tp_close\" onclick=\"teachpress_pub_showhide('959','tp_links')\">Close<\/a><\/p><\/div><\/td><\/tr><\/table><\/div><\/div>\n<div><\/div><\/div><\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>We study a wide range of problems in artificial intelligence, automated planning and learning, autonomous systems, reasoning under uncertainty, multi-agent systems, and resource-bounded reasoning. We are particularly interested in the implications of uncertainty and limited computational resources on the design of autonomous agents. 
In most practical settings, it is not feasible or desirable to find &hellip; <a href=\"https:\/\/groups.cs.umass.edu\/shlomo\/research\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Research&#8221;<\/span><\/a><\/p>\n","protected":false},"author":3,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"class_list":["post-12","page","type-page","status-publish","hentry","group-blog","hfeed"],"_links":{"self":[{"href":"https:\/\/groups.cs.umass.edu\/shlomo\/wp-json\/wp\/v2\/pages\/12","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/groups.cs.umass.edu\/shlomo\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/groups.cs.umass.edu\/shlomo\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/groups.cs.umass.edu\/shlomo\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/groups.cs.umass.edu\/shlomo\/wp-json\/wp\/v2\/comments?post=12"}],"version-history":[{"count":45,"href":"https:\/\/groups.cs.umass.edu\/shlomo\/wp-json\/wp\/v2\/pages\/12\/revisions"}],"predecessor-version":[{"id":338,"href":"https:\/\/groups.cs.umass.edu\/shlomo\/wp-json\/wp\/v2\/pages\/12\/revisions\/338"}],"wp:attachment":[{"href":"https:\/\/groups.cs.umass.edu\/shlomo\/wp-json\/wp\/v2\/media?parent=12"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}