CPSC 670/IR/Fall 2003/Leggett/Programming Lab 5/Due November 12

WWW Indexing Bots

Write an interactive web-based application that uses Wget to mirror a set of web-based resources in preparation for inclusion in a digital library.

      Input:

            The web-based resources available by following NO MORE THAN two links
            (level 2, in-order, as discussed in class) from the class home page:
            www.csdl.tamu.edu/~leggett/courses/ir/

            A button that produces the mirror.
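
One way the mirror button's handler might assemble the Wget call is sketched below. This is only an illustration under stated assumptions: the function name `build_wget_command` and the `mirror` directory are hypothetical, `--convert-links` is one possible way to keep links relative in the local copy, and `--tries=1` is one way to avoid retrying failed URLs. The "in-order" traversal discussed in class is not modeled here.

```python
from typing import List

# Start page taken verbatim from the assignment text.
START_URL = "www.csdl.tamu.edu/~leggett/courses/ir/"


def build_wget_command(start_url: str, mirror_dir: str, depth: int = 2) -> List[str]:
    """Return an argument list for a depth-limited Wget mirror (hypothetical helper)."""
    return [
        "wget",
        "--recursive",                        # follow links from the start page
        f"--level={depth}",                   # no more than `depth` links deep
        "--convert-links",                    # assumption: rewrite links for local browsing
        "--tries=1",                          # assumption: one attempt per URL, no retries
        f"--directory-prefix={mirror_dir}",   # root directory of the mirror
        start_url,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_wget_command(START_URL, "mirror")))
```

The list form can be handed to `subprocess.run` without a shell, which avoids quoting problems with the `~` in the URL.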

      Output:

            The mirrored resources with the same directory structure and relative links.

            A message indicating whether or not the mirror was built successfully.

            A message indicating the number of files and total number of bytes downloaded.

            A button that links to the home page of the mirror.
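
The status and count messages could be produced along the following lines. This is a minimal sketch, not a required design: the names `mirror_stats` and `status_message` are hypothetical, success is assumed to mean a Wget exit status of 0, and the byte total is measured from the files on disk, which may differ slightly from the bytes transferred if Wget rewrites links after downloading.

```python
import os
from typing import Tuple


def mirror_stats(mirror_dir: str) -> Tuple[int, int]:
    """Walk the mirror directory and return (file count, total size in bytes)."""
    file_count = 0
    total_bytes = 0
    for root, _dirs, files in os.walk(mirror_dir):
        for name in files:
            file_count += 1
            total_bytes += os.path.getsize(os.path.join(root, name))
    return file_count, total_bytes


def status_message(exit_code: int, file_count: int, total_bytes: int) -> str:
    """Build the text shown to the user after the mirror attempt."""
    # Assumption: exit status 0 from Wget is treated as success.
    if exit_code == 0:
        outcome = "The mirror was built successfully."
    else:
        outcome = f"The mirror was NOT built successfully (Wget exit status {exit_code})."
    return f"{outcome} Files: {file_count}. Total bytes: {total_bytes}."
```

For example, `status_message(0, *mirror_stats("mirror"))` would report on a mirror stored in a directory named `mirror`.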

Notes:

   1. The web page should be well-designed.

   2. Do not retry URLs that fail.

   3. You should design a web page for the lab that contains links to: 1) the application web page and 2) your source code.

   4. When you have completed the lab, send an email that includes your full name, userid, and the complete URL for the web page described in note 3 above. The lab grade will be emailed sometime after your email is received.