In this project we study how the user of a machine learning (ML) algorithm can place constraints on the algorithm's behavior. We contend that standard ML algorithms are not user-friendly, in that they can require ML and data science expertise to apply responsibly to real-world applications. We present a new type of ML algorithm that shifts many of the challenges of ensuring that an ML method is safe to use from the user of the algorithm to the researcher who designs it. The resulting algorithms provide a simple interface for specifying what constitutes undesirable behavior, and they provide high-probability guarantees that the algorithm will not produce this undesirable behavior.
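To make the interface concrete, below is a minimal sketch (not the authors' released code) of what such an algorithm can look like: the user supplies a behavioral-constraint function `g` (undesirable behavior corresponds to a positive mean of `g`) and a confidence level `delta`, and training either returns a model that passes a held-out safety test based on a Hoeffding high-confidence bound, or reports that no solution could be found. All names here (`train_with_safety_guarantee`, `hoeffding_upper_bound`, `g`) are illustrative assumptions, not code from the publications below.

```python
import numpy as np

def hoeffding_upper_bound(samples, delta, value_range=2.0):
    """One-sided (1 - delta)-confidence upper bound on the mean of i.i.d.
    samples whose values span at most `value_range`, via Hoeffding's inequality."""
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    return samples.mean() + value_range * np.sqrt(np.log(1.0 / delta) / (2.0 * n))

def train_with_safety_guarantee(candidate_data, safety_data, g, delta, candidate_models):
    """Sketch of a safe (Seldonian-style) training interface.

    g(model, data) returns per-example constraint values; undesirable behavior
    corresponds to E[g] > 0.  Returns a model whose constraint holds with
    probability at least 1 - delta, or None ("No Solution Found")."""
    # Candidate selection: favor the model most likely to pass the safety test.
    best = min(candidate_models,
               key=lambda m: hoeffding_upper_bound(g(m, candidate_data), delta))
    # Safety test: the high-probability guarantee comes from this held-out check.
    if hoeffding_upper_bound(g(best, safety_data), delta) <= 0.0:
        return best
    return None  # No Solution Found: cannot certify safety at level delta
```

Returning "No Solution Found" rather than an uncertified model is the central design choice: the burden of demonstrating that the returned model is safe rests with the algorithm, not with the user.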
Publications
- Yair Zick, Reza Shokri, Martin Strobel. Privacy Risks of Explaining Machine Learning Models.
- Philip S. Thomas, Bruno Castro da Silva, Andrew G. Barto, Stephen Giguere, Yuriy Brun, Emma Brunskill. Preventing Undesirable Behavior of Intelligent Machines. Science, 2019.
- Philip S. Thomas, Bruno Castro da Silva, Andrew G. Barto, Emma Brunskill. On Ensuring that Intelligent Machines Are Well-Behaved. arXiv preprint, 2017.
- Przemyslaw A. Grabowicz, Kenta Takatsu, Luis F. Lafuerza. Supervised learning algorithms resilient to discriminatory data perturbations.
- Blossom Metevier, Stephen Giguere, Sarah Brockman, Ari Kobren, Yuriy Brun, Emma Brunskill, Philip S. Thomas. Offline Contextual Bandits with High Probability Fairness Guarantees. Advances in Neural Information Processing Systems, 2019.