{"id":17,"date":"2019-01-31T21:36:44","date_gmt":"2019-01-31T21:36:44","guid":{"rendered":"http:\/\/groups.cs.umass.edu\/equate\/?p=17"},"modified":"2019-04-03T17:20:58","modified_gmt":"2019-04-03T17:20:58","slug":"data-diversity","status":"publish","type":"post","link":"https:\/\/groups.cs.umass.edu\/equate\/research\/data-diversity","title":{"rendered":"Data Diversity"},"content":{"rendered":"<p><span style=\"font-weight: 400\">The big data revolution and advancements in machine learning technologies have\u00a0<\/span><span style=\"font-weight: 400\">revolutionized decision making, advertising, medicine, and even election\u00a0<\/span><span style=\"font-weight: 400\">campaigns. Yet, data is an imperfect medium, often tainted by skews and\u00a0<\/span><span style=\"font-weight: 400\">biases. Learning systems and analysis software learn and amplify these biases.\u00a0<\/span><span style=\"font-weight: 400\">As a result, discrimination shows up in many data-driven applications, such as\u00a0<\/span><span style=\"font-weight: 400\">advertisements, hotel bookings, image search, and vendor services. Since data\u00a0<\/span><span style=\"font-weight: 400\">skew is often a cause of algorithmic bias, the ability to retrieve balanced,\u00a0<\/span><span style=\"font-weight: 400\">diverse datasets can mitigate the underlying problem. Diversification also has\u00a0<\/span><span style=\"font-weight: 400\">usability implications, as it allows us to produce representative samples of a\u00a0<\/span><span style=\"font-weight: 400\">dataset that are small enough for human consumption. 
Our research focuses on\u00a0<\/span><span style=\"font-weight: 400\">developing methods for producing appropriately diverse subsets of given\u00a0<\/span><span style=\"font-weight: 400\">datasets efficiently and scalably, aiming to alleviate biases in the\u00a0<\/span><span style=\"font-weight: 400\">underlying data and to facilitate user-facing data exploration systems.<\/span><!--more--><\/p>\n<h4>Publications<\/h4>\n<ul>\n<li><span style=\"font-weight: 400\">Yue Wang, Alexandra Meliou, and Gerome Miklau, <a href=\"http:\/\/www.vldb.org\/pvldb\/vol11\/p773-wang.pdf\" target=\"_blank\" rel=\"noopener\">RC-Index: Diversifying Answers to Range Queries<\/a><\/span><span style=\"font-weight: 400\">, PVLDB, vol. 11, no. 7, Sep. 2018, pp. 773\u2013786.<\/span><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>The big data revolution and advancements in machine learning technologies have\u00a0revolutionized decision making, advertising, medicine, and even election\u00a0campaigns. Yet, data is an imperfect medium, often tainted by skews and\u00a0biases. 
Learning systems and analysis software learn and amplify these biases.\u00a0As a result, discrimination shows up in many data-driven applications, such as\u00a0advertisements, hotel bookings, image search, and [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4,2],"tags":[],"class_list":["post-17","post","type-post","status-publish","format-standard","hentry","category-featured","category-research"],"_links":{"self":[{"href":"https:\/\/groups.cs.umass.edu\/equate\/wp-json\/wp\/v2\/posts\/17","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/groups.cs.umass.edu\/equate\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/groups.cs.umass.edu\/equate\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/groups.cs.umass.edu\/equate\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/groups.cs.umass.edu\/equate\/wp-json\/wp\/v2\/comments?post=17"}],"version-history":[{"count":5,"href":"https:\/\/groups.cs.umass.edu\/equate\/wp-json\/wp\/v2\/posts\/17\/revisions"}],"predecessor-version":[{"id":118,"href":"https:\/\/groups.cs.umass.edu\/equate\/wp-json\/wp\/v2\/posts\/17\/revisions\/118"}],"wp:attachment":[{"href":"https:\/\/groups.cs.umass.edu\/equate\/wp-json\/wp\/v2\/media?parent=17"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/groups.cs.umass.edu\/equate\/wp-json\/wp\/v2\/categories?post=17"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/groups.cs.umass.edu\/equate\/wp-json\/wp\/v2\/tags?post=17"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}