CS 490 :: Search Engines & Recommendation Systems :: Spring 2017
Announcements
- Formative Course Evaluation for CS490
- Final Presentation Time Slots have been posted.
Schedule
Readings are from Manning, Raghavan, & Schütze, Introduction to Information Retrieval (2008), unless otherwise stated.
Week | Topic / Lecture Notes | Readings | Labs |
---|---|---|---|
Week 1 - 1/23 | Intro, Boolean Retrieval | Ch. 1, 2 | Lab Overview, Survey |
Week 2 - 1/30 | Text Pre-Processing, Web Crawling | Ch. 19, 20; RegEx Tutorial in Python | Lab 1: Text Pre-Processing |
Week 3 - 2/6 | Designing a Meta-Search Engine | Ch. 4.3 | Lab 2: Web Crawling |
Week 4 - 2/13 | Ranked Retrieval | Ch. 6, TF-IDF Exercises | Lab 3: Boolean Retrieval |
Week 5 - 2/20 | Faster TF-IDF | Ch. 6 | Lab 4: Ranked Retrieval |
Week 6 - 2/27 | More Ranked-Retrieval | Ch. 7 (optional) | |
Week 7 - 3/6 | Evaluation | Ch. 8 | Lab 5: Evaluation |
SB - 3/13 | Spring Break | IR Presentation | Research Project |
Week 8 - 3/20 | Recommender Systems: Intro (Doug, Wednesday); Collaborative Filtering (David D., Eric G., Friday) | Ricci Ch 1; Netflix 2009 & Amazon 2003 | Idea Generation |
Week 9 - 3/27 | Search Engines: PageRank (Matt B., Chris W., Wednesday); Architecture (David M., Will S., Friday) | Ch 21.1-21.2 & PageRank 1998; Google 1998 | Literature Review (Google Scholar); Proposal Due (3/31) |
Week 10 - 4/4 | Classification: naive Bayes (The Joes, Monday); kNN (Erika R., Kizito U., Wednesday) | Ch. 13 & Better NB 2003; Ch. 14 | Data Collection (API, Crawl) |
Week 11 - 4/10 | Clustering: Flat (Ryan D., Jimmy W., Monday); Hierarchical (Noah Z., Yaw A., Wednesday) | Ch. 16; Ch. 17 | Methods (Search, Classify, Cluster, Recommend); Update Due (4/14) |
Week 12 - 4/17 | Recommendation: Content-based (Shelby C., Jot S., Monday); Music (Jeremy S., Trevor W., Wednesday) | CB 2007; Celma 2009 Ch 2 & 3 | Experiments |
Week 13 - 4/24 | Human Computation (Nicole L., Monday); Final Exam (Wednesday) | ESPGame 2004 & reCaptcha 2008 | Clean Up & Rerun |
Week 14 - 5/1 | NLP (Luke); Final Project Presentations | Disseminate (LaTeX); Talk Slides Due | |
Finals Week (5/8) | Final Project Reports Due (Monday 5/8 at 5pm) |
Course Overview
Search engines, such as Google, YouTube, and Flickr, have had a huge impact on how people find and use information (e.g., webpages, videos, photos). Recommendation systems, such as Netflix, Facebook, and Pandora, help people discover new and exciting things (e.g., movies, friends, songs). In this course, we will explore how information retrieval (IR) and recommendation systems (RecSys) are designed and implemented.
The first half of the class will be devoted to developing traditional IR skills such as web crawling, text & multimedia processing, Boolean & vector-space modeling, classification, clustering, and similarity analysis. The second half of the course will be devoted to creating an information retrieval or recommendation system as a collaborative class project. For this project, groups of students will design and develop individual components of this large-scale system. In the final weeks, we will combine these components and (if all goes well) launch a new IR/RecSys for public use on the Internet.
COURSE FORMAT/STYLE: Lecture, Lab Meeting, Programming Assignments, Research Paper Reading and Dissection, Collaborative Final Project
COURSE REQUIREMENTS & GRADING: Strong programming experience (CS220 or above) is required. Experience with Python is recommended. Advanced web programming (e.g., CS205) experience will also be useful but is not necessary.
Course Information
- Prof. Doug Turnbull (dturnbull@ithaca.edu, (607) 274-5743)
- Class: MWF 12-12:50pm
- Lab: Tu 2:35-3:50pm
- Office Hours: Williams 321E
- 2pm on Wednesdays, 3pm on Fridays in Williams 321E
- By Appointment
- Whenever my door is open
- Course Teaching Assistant: Luke Waldner
- Evening Help Sessions: 7-9pm on Wednesdays in Williams 309
Course Material
- Syllabus
- Gradebook on Sakai
- Textbook: Introduction to Information Retrieval
- Java vs. Python Side-by-Side
Related Courses
- IR and Web Search - Manning, Nayak, Raghavan, Stanford, Spring 2012
- Intro to IR - Kauchak, Pomona College, Fall 2012
- Text & Multimedia IR - Turnbull, Swarthmore College, Spring 2009