1 The Aurora OCR system is under development by Dr. Art Wetzel, currently with Carnegie Mellon University. The authors wish to recognize his contributions to the project and to this paper.

2 The project uses a subset of the full TEI DTD, currently called TEILITE [2], extended with markup for tables and for other content-specific elements.