Data Visualization and Exploration – Spring 2020

Data Visualization and Analysis

Spring 2020, Tuesday and Thursday, 8:30-9:45 AM, LGRT room 121

Instructor: Dr. Ali Sarvghad

asarv@cs.umass.edu,  CICS 344

Office hours: Monday 3-4:30 pm. Other times by appointment only.

Teaching Assistant: Nazanin Jafari

nazaninjafar@umass.edu

Office hours: 9 am – 11 am, CICS 207


Overview

Information visualization is an area of research that helps people analyze and understand data using visualization techniques. The multi-disciplinary area draws from other areas of science, including human-computer interaction, data science, psychology, and art, to develop new visualization methods and understand how (and why) they are effective.

Information visualization methods are applied to data from many different application domains, including:

  • Political reporting and forecasting – as seen on TV and in the papers in the election season.
  • News reporting – look at the interactive visualizations used by the New York Times, Wall Street Journal, Slate, etc.
  • Social science and economic data, such as census and other surveys, and micro and macroeconomic trends.
  • Social networking and web traffic – to understand patterns of communication.
  • Business intelligence and business dashboards – to forecast sales trends, understand competitive marketplace positions, allocate resources, and manage production and logistics.
  • Text analysis – to determine trends and relationships for literary analysis and information retrieval.
  • Criminal investigations – to portray the relationships between events, people, places, and things.
  • Performance analysis of computer networks and systems.
  • Software engineering – developing, debugging, and maintaining software.
  • Bioinformatics – to understand DNA, gene expression, and systems biology.

Course objectives

  • Learn the principles involved in information visualization
  • Understand the wide variety of information visualizations and know what visualizations are appropriate for various types of data and for different goals
  • Develop skills in critiquing different visualization techniques in the context of user goals and objectives
  • Learn how to implement compelling information visualizations

Recommended text

The following textbooks are strongly recommended for this course. In particular, we will closely follow Tamara Munzner’s book:

  • Visualization Analysis and Design, Tamara Munzner, CRC Press, ISBN 9781466508910. Principles and paradigms of visualization design.
  • Interactive Data Visualization for the Web, Scott Murray, O’Reilly Media, ISBN 9781449339739. All about D3, the programming tool we will be using for homework and projects.

Evaluation

Grading will be based on the midterm exam, class participation, paper reading and discussion, and the course project (its deliverables and the final demo). Final course grades may be curved, but this is not guaranteed. Grading weights are:

Midterm exam 30%
Class participation 10%
Paper reading & Discussion 15%
Course project & deliverables 45%
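
As a rough illustration of how these weights combine, the sketch below computes an uncurved course score. The component names and the sample scores are made up for illustration, all scores are assumed to be on a 0-100 scale, and any curve is not modeled.

```python
# Weights copied from the list above.
# The dictionary keys are illustrative names, not official labels.
WEIGHTS = {
    "midterm_exam": 0.30,
    "class_participation": 0.10,
    "paper_reading_discussion": 0.15,
    "course_project": 0.45,
}


def weighted_grade(scores):
    """Combine per-component scores (each 0-100) into an uncurved course score."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores for one student, for illustration only.
example_scores = {
    "midterm_exam": 80,
    "class_participation": 100,
    "paper_reading_discussion": 90,
    "course_project": 85,
}

print(f"Uncurved course score: {weighted_grade(example_scores):.2f}")
```

Because the project carries the largest weight, a change of a few points in the project score moves the total more than the same change in any other component.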

Project details can be found under the Project tab.

Course Schedule (Lectures, midterm, due dates)

Week | Date | Topic | Project Milestones
1    | 1/21 | Course overview & Intro to InfoVis |
1    | 1/23 | Data abstraction, Exercise: data abstraction |
2    | 1/28 | D3: set up, drawing with SVG |
2    | 1/30 | Task abstraction, Exercise: task abstraction |
3    | 2/4  | Validation, Marks & Channels, rules of thumb, Exercise: Decoding | Form groups
3    | 2/6  | |
4    | 2/11 | Tables, color, Exercise: 2N/BR/CP |
4    | 2/13 | |
5    | 2/18 | No class – Monday Schedule |
5    | 2/20 | Visualizing spatial data, Networks, & Trees, Exercise | Due: Dataset
6    | 2/25 | |
6    | 2/27 | D3: Making basic charts | Due: Data abstraction
7    | 3/3  | Handle data complexity: Manipulate, Facet, Reduce, Exercise |
7    | 3/5  | | Due: Task abstraction
8    | 3/10 | Review |
8    | 3/12 | Midterm exam |
9    | 3/17 | Spring break |
9    | 3/19 | |
10   | 3/24 | D3: scales, axes |
10   | 3/26 | D3: interactivity, layouts |
11   | 3/31 | Storytelling with data | Due: project proposal
11   | 4/2  | Proposal peer evaluation |
12   | 4/7  | User evaluation methods for visualization |
12   | 4/9  | Data analysis |
13   | 4/14 | TBD |
13   | 4/16 | TBD |
14   | 4/21 | TBD |
14   | 4/23 | Wrap up |
15   | 4/28 | Projects demo | Demo

Project overview

The course project carries 45% of the overall course grade. 

  • Groups

    • Each group MUST include both grad and undergrad students. The preferred size of a group is 4. Groups smaller than 3 or larger than 5 will not be allowed (except under special circumstances and with the instructor’s approval). Expectations will NOT be adjusted according to group size.

  • Project proposal (35%)  

    • Data & problem selection (10%)
    • Users and tasks identification (10%)
    • Data and task abstraction (10%)
    • Design (20%)
    • See details
  • Peer evaluation (10%)  

    • Provide feedback about another group’s project proposal
    • See details
  • User evaluation (10%)  

    • Report of methodology, data analysis, and findings
    • See details
  • Final demo (35%)

    • Interactivity (5%)
    • Data manipulation (15%)
    • Supporting exploration (20%)
    • See details
  • Final report (10%)

    • Implementation details 
    • Evaluation (10%)
    • See details

No matter what topic you choose, I am expecting a high-quality project. This project accounts for 45% of your grade in this course and will require a significant amount of time and effort. In particular, I’m seeking creative projects showcasing interesting ideas. A good project should consist of visualization designs and a software artifact that implements the designs. Interaction is key in information visualization, and it is difficult to understand the interaction issues in your project without a running system. Ideally, I would like your efforts to be innovative and to result in work that could potentially be published in venues and styles similar to those of the papers that we have read throughout the semester.

You should develop a web-deployable system so that it can be shown to everyone in the world, and use D3 for visualizing data! Arguments will be entertained for using different visualization toolkits, but in general, D3 is preferred. Using a different toolkit must be approved by the professor before you start writing any code.


Project details

The idea of the project is to take the knowledge and background that you are learning this semester about Information Visualization and put it to good use in a new, creative effort. A real key to the project, however, is to select a data set that people will find interesting and intriguing. Even better would be to select a data set with a clearly identified set of “users” or “analysts” who care deeply about that data. Select a topic that people want to know more about! I cannot emphasize strongly enough the importance of your topic and data set.

Project proposal

The project proposal is a document that you will gradually complete before its due date on TBD. In this document, you will provide detailed information about the most important aspects of your project:

  1. Data
  2. Intended users
  3. Problem
  4. Prior work related to the problem you are investigating
  5. Your proposed visual solution
  6. An argument for why/how your solution will address the problem

 

  1. Data
    • BYOD (Bring Your Own Data)
      • You (or your teammates) have your own data to analyze, such as:
        • thesis/research topic
        • personal interest
        • dovetail with another course (sometimes works, but timing may be tricky)
    • FDOI (Find Data of Interest)
      • Many existing datasets are available on the internet
      • It can be tricky to determine reasonable analysis tasks that users may want to do
  2. Intended user
    •  
  3. Problem
    •  
  4. Related work
    •  
  5. Solution
    •  
  6. Proposal examples

Tips for a Successful Project

It is extremely important to select an interesting problem with data that some group of people will care deeply about. I cannot stress enough how vital it is to start with interesting data. Find some topic that almost everyone cares about (e.g., baby names, feature films, traffic in Boston, flight delays, the stock market, weather, etc.; THERE’S DATA ALL AROUND YOU!) or that some subset of people really cares about (e.g., sports data, politics, personal health, etc.). Consider combining different data sets to produce a new composite data set of special interest. Such a fusion of data often creates a dataset that people want to learn about. Remember that it often takes time and effort to “fuse” multiple data types, so make sure you pick them wisely (i.e., they should support the questions that you want people to be able to answer using your tool).

Two possible styles of successful visualization projects are described here (the space is certainly not limited to these two):

In the first style, the group creates a visualization system that has only one view/representation, but this representation is new and creative. Here, you should focus on designing an innovative new visual representation. The actual user interface may have different components or pieces, but it should be tightly integrated. The real focus here is on creativity and innovation, and the novel representation of the information. These projects emphasize the mappings from the data (and characteristics/variables of the data) to visual encodings, glyphs, and metaphors.

The second type of successful project employs multiple coordinated views, where each view may use some well-known visualization techniques, perhaps customized a little for this problem. The emphasis in this type of project is to create a sound, functional system implementation that can clearly help with data analysis and understanding. It is important in this type of project to have coordinated views that work well together and provide different perspectives on the data. This type of project does not have the same level of visualization innovation as the first, but it comes together in a strong system implementation, including well-designed user interactions that allow users to explore the data and progress through their task to answer the questions they may have about the data.

Required Reading (15%)

Each Tuesday, we will post one or more papers on Moodle related to the topics that we have covered in class. All students MUST read the paper(s) and provide a summary of the research on Piazza. In addition to the summary, graduate students MUST post at least one question/critique about the paper and answer others’ questions/critiques. This is optional for undergrads, though we encourage them to participate in discussions around the papers. Summaries are due on Monday of the following week.

Your summary MUST include the following:

  • Explain the problem that the paper investigates. Why is it important? Whom does it affect?
  • How do the authors propose to investigate/solve this problem?
  • Why and how can the proposed solution address the problem?
  • How do they evaluate their solution? What methodology do they use in their evaluation?
  • What are the most important findings of their evaluation?

We will read all of your summaries and questions/critiques every week. However, we will NOT provide you with feedback unless your work is below the bar or is missing.

You can find many guidelines online that describe how to read and summarize a research paper. Here is an example from Harvard. You can also talk to me or the TA if you have any questions about reading and summarizing research papers.

 

D3 Resources

http://www.youtube.com/watch?v=8jvoTV54nXw – nice overview and run-through video/talk
http://alignedleft.com/tutorials/d3/ – thorough D3 tutorials from an academic instructor and the author of the open O’Reilly book, “Interactive Data Visualization for the Web” (look for the free preview link for the actual book draft)
http://sightlinevis.com/ – many d3 examples
https://www.youtube.com/user/d3Vienno/videos?view=0&flow=grid – many tutorial videos by d3Vienno
http://www.cs171.org/2015/resources/ – list of d3 resources from Harvard CS 171 class
https://github.com/mbostock/d3/wiki/Tutorials – big list of resources from the author of D3
https://github.com/mbostock/d3/wiki/API-Reference – well-done D3 documentation
http://www.d3noob.org – free ebook with lots of tips and tricks, actively updated
http://www.jeromecukier.net/wp-content/uploads/2012/10/d3-cheat-sheet.pdf – cheat sheet for D3, also see parent site for blog posts
https://groups.google.com/forum/?fromgroups=#!forum/d3-js – D3 Google group
http://bost.ocks.org/mike/selection/ – Guide to understanding selections, key part of D3.
http://benclinkinbeard.com/talks/2012/NCDevCon/ – A talk, with interactive examples and code snippets, explaining d3
http://www.udacity.com/course/data-visualization-and-d3js--ud507 – d3.js Udacity Course
http://bl.ocks.org/curran/3a68b0c81991e2e94b19 – Responsive Visualizations (Resizing)
http://bl.ocks.org/hubgit/raw/9133448/ – Nesting CSV Data
http://bost.ocks.org/mike/nest/ – Nesting Visualization Elements
http://www.visualcinnamon.com/blog – Creative Tutorials from Nadieh Bremer