COMPSCI 546: Applied Information Retrieval – Spring 2022

COMPSCI 546 is a graduate-level course that covers information retrieval and other information processing activities from an applied perspective. There will be numerous programming projects and assignments. It provides a richer technical follow-on to COMPSCI 446 (Search Engines) for undergraduates interested in a deeper understanding of the technologies. It also provides a strong basis for continuing on to COMPSCI 646 (Information Retrieval) for graduate students interested in more complete theoretical coverage of the area. Topics include search engine construction (document acquisition, processing, indexing, and querying); learning to rank; information retrieval system performance evaluation; classification and clustering; other machine learning information processing tasks (e.g., basic deep learning models for information retrieval); and more. Undergraduate prerequisites: COMPSCI 320 and one of COMPSCI 383, COMPSCI 446, or COMPSCI 585. 3 credits.

Tue & Thu, 4:00 – 5:15 PM

CS 142


Instructor:
Hamed Zamani
Contact: zamani@cs.umass.edu
Office Hours: Tue & Thu, 9:00 – 10:00 AM @ CS 350

Teaching Assistant:
Lakshmi Vikraman
Contact: lvnair@cs.umass.edu
Office Hours: Mon & Wed, 10:00 – 11:00 AM @ LGRT T220

Prerequisites

  • Proficiency in Python and/or Java
  • Basic knowledge of probability, statistics, and information theory
  • Foundations of applied machine learning and deep learning


Textbook

 

Grading

  • Assignments (8×10%) – One assignment is optional
  • Final exam (30%)

 

Tentative Schedule

# Lecture Date Readings Note
1 Introduction Tue 1/25
  • [WBC] Ch.1
  • [WBC] Ch.7.1
  • [CDM] Ch.8.1, 8.2
 
2 IR Basics Thu 1/27
3 IR Evaluation and Ranking Metrics Tue 2/1  
4 Thu 2/3 Assignment 1 (IR Metrics)
5 Text Processing and Indexing Tue 2/8
  • [WBC] Ch.4.1, 4.2, 4.3
  • [WBC] Ch.5.1, 5.2, 5.3, 5.4, 5.7
 
6 Thu 2/10 Assignment 2 (Text Processing & Indexing)
7 Basic Retrieval Models Tue 2/15  
8 Thu 2/17
  No Class (Monday Class Schedule) Tue 2/22    
9 Language Models for IR Thu 2/24 If you are interested in learning more about language modeling for IR, the book “Statistical Language Models for Information Retrieval” by ChengXiang Zhai is recommended.
10 Tue 3/1 Assignment 3 (Retrieval Models)
11 Query Expansion and Relevance Feedback Thu 3/3  
12 Web Search and Search Engine Technologies Tue 3/8 Assignment 4 (Query Expansion)
13 Thu 3/10  
14 Novelty and Diversity Tue 3/15 Assignment 5 (Link Analysis)
15 Machine Learning Basics Thu 3/17
16 Document Clustering Tue 3/22
17 Document Classification Thu 3/24 Assignment 6 (Clustering)

18 Learning to Rank Tue 3/29  
19 Thu 3/31
20 Introduction to Neural Networks and Neural IR Tue 4/5    
21 Distributed Representation Learning for Text Thu 4/7 Assignment 7 (Learning to Rank)
22 Neural Ranking Models Tue 4/12  
23 Thu 4/14
24 Question Answering Tue 4/19    
25 Information Filtering and Recommendation Thu 4/21  
26 Tue 4/26 Assignment 8 (Collaborative Filtering)
27 IR Applications Thu 4/28    
28 IR Research Tue 5/3    

 

Course Policy

Late Submission

Each student has a total of 6 late days that can be used without penalty, with at most 3 late days per assignment. A late submission of a team assignment charges the late days to every member of the team. For example, if a group of two students submits their project proposal 23 hours after the deadline, each student uses 1 late day.

Once all 6 late days are used, any assignments turned in late will be penalized 20% per late day.

If an assignment is submitted multiple times, only the last submission counts when determining the number of late days used.
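
The late-day arithmetic can be sketched in a few lines of Python. This is only an illustration, not the official grading procedure. The function names are hypothetical, and it makes three assumptions the policy does not spell out: partial days round up (consistent with the 23-hour example), late days beyond the 3-per-assignment cap or beyond the remaining free days are treated as penalized days, and the penalty is capped at 100%.

```python
import math

FREE_LATE_DAYS = 6                 # total penalty-free late days per student (from the policy)
MAX_FREE_DAYS_PER_ASSIGNMENT = 3   # at most 3 late days per assignment (from the policy)
PENALTY_PERCENT_PER_DAY = 20       # 20% per late day once free days run out (from the policy)


def late_days(hours_late):
    """Convert lateness in hours to whole late days.

    Assumption: partial days round up, so 23 hours late is 1 late day.
    """
    return max(0, math.ceil(hours_late / 24))


def apply_late_policy(hours_late, free_days_left=FREE_LATE_DAYS):
    """Return (free_days_used, penalized_days, penalty_percent) for one submission.

    Assumption: days that cannot be covered by free late days (because the
    student has none left or has hit the per-assignment cap) are penalized.
    """
    days = late_days(hours_late)
    free_used = min(days, free_days_left, MAX_FREE_DAYS_PER_ASSIGNMENT)
    penalized = days - free_used
    # Assumption: the penalty cannot exceed 100% of the assignment grade.
    penalty_percent = min(100, PENALTY_PERCENT_PER_DAY * penalized)
    return free_used, penalized, penalty_percent


if __name__ == "__main__":
    # The policy's own example: 23 hours late uses 1 late day, no penalty.
    print(apply_late_policy(23))
    # A student with no free days left who submits 2 days late.
    print(apply_late_policy(48, free_days_left=0))
```

Under these assumptions, the first call reports one free late day used and no penalty, and the second reports two penalized days for a 40% penalty. For a team assignment, the same computation would be applied to each team member's own late-day balance.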

Collaboration and Help

You may discuss the ideas behind assignments with others. You may ask for help understanding class and IR concepts. You may study with friends. However…

The work that you submit must be your own. It may not be copied from the web, from another student in the class, or from anyone else. If you stumble upon and use a solution from the textbook or from class, you are expected to acknowledge the source of the work.

Your effort on the final exam must be your own. Your assignment submissions must be your own work and not in collaboration with anyone. Your project work must be your own work and not a copy of someone else’s work.

Relevant UMass Resources