CPSC 470/670: Information Storage and Retrieval

Spring Semester, 2007
Time and place: T/TH 12:45pm - 2:00pm Room 104 HRBB
Instructor: Dr. Frank Shipman
Office hours: HRBB 402B, TBA, or by appointment

Description of Course

Information retrieval (IR) covers issues of representation, storage, and access to very large multimedia document collections. This course covers the fundamental data structures, algorithms, and access methods of current information storage and retrieval systems and relates the various techniques to the design and evaluation of complete retrieval systems delivered on the Internet and in digital libraries. Course content includes coverage of algorithms for indexing, compressing, and querying very large digital collections and tools and techniques for managing information services on the Internet.
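
As a rough illustration of what "indexing and querying" a collection means, the sketch below builds a tiny in-memory inverted index and answers AND queries against it. It is not taken from the course materials or the textbook: the class name TinyInvertedIndex, its methods, and the sample documents are all hypothetical, and a real system would add ranking, compression, and on-disk storage.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative only: class and method names are hypothetical, not from the course.
final class TinyInvertedIndex {
    // Maps each term to the sorted set of ids of the documents that contain it.
    private final Map<String, TreeSet<Integer>> postings = new TreeMap<>();

    // Lowercases the text and splits it on runs of non-alphanumeric characters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Indexing: record docId in the posting set of every term in the text.
    void add(int docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Querying: return ids of documents containing ALL query terms (Boolean AND).
    List<Integer> search(String query) {
        TreeSet<Integer> result = null;
        for (String term : tokenize(query)) {
            TreeSet<Integer> docs = postings.get(term);
            if (docs == null) {
                return new ArrayList<>(); // an unseen term matches nothing
            }
            if (result == null) {
                result = new TreeSet<>(docs);
            } else {
                result.retainAll(docs); // intersect posting sets
            }
        }
        return result == null ? new ArrayList<>() : new ArrayList<>(result);
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add(1, "Indexing very large digital collections");
        index.add(2, "Compressing and querying digital libraries");
        index.add(3, "Querying very large collections");
        System.out.println(index.search("digital collections")); // prints [1]
        System.out.println(index.search("querying"));            // prints [2, 3]
    }
}
```

Keeping each posting set sorted makes intersection straightforward and is also the usual starting point for the compression techniques mentioned above, since sorted ids can be stored as gaps.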


Students should be able to design and develop large Java programs and learn new software libraries on their own.

Readings (all required)

Modern Information Retrieval, Ricardo Baeza-Yates and Berthier Ribeiro-Neto,
Addison-Wesley and ACM Press

collected journal and conference papers

Major Topics

Topics to be included in the course are:

Class Work

The class will include readings, homeworks, exams, and projects. Projects will be done in groups of 3-5 people, with larger groups expected to take on larger projects. Individual students' grades for projects will be influenced by their teamwork as evaluated by the other project group members. Each project is to include selecting a collection of materials to provide, using Lucene (or Greenstone or ...) to index the contents, and creating a user interface for searching and browsing the collection. Each group will develop an initial prototype for demonstration to the class at the end of the semester and plan an evaluation of the prototype's success or failure. Project topics must be approved by the instructor.


Grading will be based on reading and participation in class, exams, homeworks, and projects.
  For CPSC 470:                     For CPSC 670:
    class participation    10%        class participation    10%
    exams                  45%        exams                  45%
    homeworks              20%        homeworks              10%
    project                25%        project                25%
                                      term paper             15%

Final Report Format

Your final project report is to be 8-12 pages, formatted according to the ACM Conference Format. You can cut and paste into this format and use the paragraph styles provided. Here is a link to the MS Word Template. You can find RTF and Maker Interchange File formats at this ACM SIGCHI page.