COMPSCI 646: Information Retrieval – Fall 2022
COMPSCI 646 is a graduate-level course in Information Retrieval, the science and engineering of indexing, organizing, searching, and making sense of unstructured or mostly unstructured information, particularly text. The class focuses primarily on the underlying models used for effective search and organization, but includes some discussion of efficiency concerns. The course also covers current research problems and methodologies in the field of Information Retrieval.
Prerequisites
- Proficiency in Python and/or Java
- Basic knowledge of probability, statistics, and information theory
- Foundations of applied machine learning and deep learning
Textbooks
- [WBC] W. Bruce Croft, Donald Metzler, and Trevor Strohman. Search Engines: Information Retrieval in Practice. Pearson Education, 2009.
- [CDM] Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008.
Grading
- Assignments (3×10%)
- Midterm exam (35%): November 3, 2022 from 7 to 9 PM. Location: HAS0134.
- Final project (35%)
Tentative Schedule
# | Lecture | Date | Readings |
1 | Introduction | Tue 9/6 | |
2 | IR Basics | Thu 9/8 | |
3 | IR Evaluation Methodologies, Metrics, and User Models | Tue 9/13 | |
4 | IR Evaluation Methodologies, Metrics, and User Models (continued) | Thu 9/15 | |
5 | Text Processing and Indexing | Tue 9/20 | |
6 | Basic Retrieval Models: Vector Space Models & Probabilistic Retrieval Models | Thu 9/22 | |
7 | Language Modeling | Tue 9/27 | |
8 | Enhanced Language Modeling | Thu 9/29 | |
9 | Relevance Feedback | Tue 10/4 | |
10 | ML Basics & Learning to Rank | Thu 10/6 | |
11 | Introduction to Neural Networks and Neural IR | Tue 10/11 | |
12 | Distributed Representation Learning for Text | Thu 10/13 | |
13 | Neural Ranking Models | Tue 10/18 | |
14 | Neural Ranking Models (continued) | Thu 10/20 | |
15 | Implicit Feedback, Biases, and Click Models | Tue 10/25 | |
16 | Web Search: Link Analysis, Spam Filtering, MapReduce | Thu 10/27 | |
17 | Context-Awareness and Personalization in Search | Tue 11/1 | |
18 | Novelty and Diversity | Thu 11/3 | |
19 | User Study and Crowdsourcing in IR | Tue 11/8 | |
20 | Cross- and Multi-Lingual IR | Thu 11/10 | |
21 | Information Filtering and Recommendation | Tue 11/15 | |
22 | Information Filtering and Recommendation (continued) | Thu 11/17 | |
- | No Class: Friday Schedule | Tue 11/22 | |
- | No Class: Thanksgiving | Thu 11/24 | |
23 | Question Answering | Tue 11/29 | |
24 | Transparency, Controllability, and Fairness in IR | Thu 12/1 | |
25 | Conversational Information Seeking | Tue 12/6 | |
26 | Current IR Research | Thu 12/8 | |
Course Policy
Late Submission
Each student has a total of 5 late days that can be used without penalty. You can use up to 3 late days on any single assignment or project milestone, except the project’s final report. Once all 5 late days are used, any assignment turned in late will be penalized 20% per late day.
If you submit an assignment more than once, only the last submission is used to determine the number of late days.
Collaboration and Help
You may discuss the ideas behind assignments with others. You may ask for help understanding class and IR concepts. You may study with friends. However…
The work that you submit must be your own. It may not be copied from the web, from another student in the class, or from anyone else. If you stumble upon and use a solution from the textbook or from class, you are expected to acknowledge the source of the work.
Your effort on the midterm exam must be your own. Your assignment submissions must be your own work and not in collaboration with anyone. Your project work must be your own work and not a copy of someone else’s work.