The Geo. H.N. Luhrs Family in Phoenix and Arizona 1847-1984, part of the Luhrs Family Collection housed in the Department of Archives and Manuscripts in Hayden Library at Arizona State University, is a typescript volume containing the recollections of George H.N. Luhrs Jr. and assorted family documents he assembled. It was printed in a limited edition of 150 copies. The text runs to more than 900 pages, has no index, and contains important information about territorial-period and early twentieth-century Phoenix, Wickenburg, and Arizona history. It was selected for digitization to make the content more accessible to researchers seeking primary source material on these topics. The book was scanned as one of the 1998-99 Digitization Pilot Projects in Hayden Library.

Our aim in this project was to maintain the look of the original manuscript as much as we could, while creating an online version that was as close to fully machine-readable (and thus searchable online) as possible. We also wanted the online version to load quickly and be readable by any graphical browser. For these reasons we chose not to use PDF but instead to OCR the text, correct it, and mark it up as HTML. We gave up speed and ease of scanning but gained greater searchability and smaller file sizes, which we feel make the manuscript much more accessible to remote users. We used the "type text" tag in Netscape to produce an on-screen font that looks similar to the typewritten text of the original, and attempted to copy the layout of each page to the extent that it was possible and practical to do so.

We did, however, make a few changes in the online version that do not maintain the appearance of the original. For example:

  • We did not indent the first line of each paragraph, as in the manuscript, because there is no browser-independent HTML tag for indenting. Instead we skipped a line between paragraphs. This is adequate for the most part, but there are some pages in the online version where it is not possible to tell whether the first sentence is the beginning of a new paragraph or a continuation of the last paragraph on the preceding page.
  • The manuscript is double-spaced, but we used HTML standard single-spaced text for the online version. Had we attempted to copy the double spacing of the original by inserting break tags, the online version would not have displayed correctly on narrower screens because of the proportional setup of each page. We set the navigation component of each page at 25% of the width of the viewer's screen and the text component at 73%, so the display width depends on the individual viewer's monitor settings rather than on a fixed width chosen by us in creating the pages.
  • We left hyphenated words as they were in the manuscript, except when words were hyphenated at the ends of lines. Because lines of text on the web wrap according to screen width rather than a set line width, as noted above, hyphens were not needed to split words in the online version.
  • Mistakes in punctuation and spelling that appeared in the original manuscript were not corrected. Some examples include "Isthmas" instead of "Isthmus," "Duppe" instead of "Duppa," periods in place of commas and vice versa, and various other errors. We did not use [sic] to call attention to the errors or to the fact that we had noticed them and chosen not to correct them; instead we simply presented the manuscript "warts and all." Because each page went through at least two careful proofreadings, we feel confident that we came very close to 100% accuracy in reproducing the original manuscript, including the errors it contains. Remote users who have questions about errors they find in the online version can check them against the original manuscript, which is housed in the Department of Archives and Manuscripts.
  • The project as originally proposed would have included only the typed memoir of George Luhrs, Jr. Project staff decided instead to reproduce the entire volume, including all letters and photographs. The handwritten letters (not conducive to reproduction with OCR) and all photographic images were scanned in Photoshop as grayscale images and displayed mostly as 72 dpi JPGs. All handwritten information and signatures were scanned in the same manner. Many of the full-page letters and photos resulted in very large image files; even at 72 dpi resolution and considerably reduced from their original print size, they still load at over 500K. These images cannot be enlarged, and the smaller print on many of them is not readable. Furthermore, many images in the manuscript were photocopies of originals, and the quality of the reproduction was very poor; hence, the digital versions of these images also look rather poor. They are displayed simply as representations of the content of the book, with the recognition that the information content of many of the items is not sufficient to warrant larger file sizes or re-scanning from true original documents (rather than photocopies) to allow for improved readability and appearance. Printed text that accompanied image material, such as captions and the typed English translations of correspondence originally handwritten in German, was OCR'd and is readable and searchable online.
  • Pages that were unnumbered in the original manuscript were assigned numbers based on their placement within the text, and the assigned numbers were placed in square brackets. For example, the pages of photos appearing between pages 49 and 50 in the manuscript were numbered [49a], [49b], and so on.
  • An oversized foldout genealogy chart in the manuscript was not included in the online version, as the information appears in text form in the section titled "The Families."
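
Two of the rules in the list above, joining words split by line-end hyphens and labeling unnumbered pages, are mechanical enough to sketch in code. The Python below is only an illustration of those rules, not the procedure project staff followed (the text was corrected and proofread by hand). The function names are hypothetical, and the behavior for cases the list does not cover is an assumption noted in the comments.

```python
import re
from string import ascii_lowercase


def join_line_end_hyphens(lines):
    """Rejoin words split by a hyphen at the end of a line.

    Hyphens inside a line are left untouched. Limitation: a true compound
    that happens to break at its hyphen would also lose the hyphen, so the
    result still needs proofreading.
    """
    text = "\n".join(line.rstrip() for line in lines)
    # letter + "-" + line break + lowercase letter => treat as a split word
    text = re.sub(r"(?<=[A-Za-z])-\n(?=[a-z])", "", text)
    # remaining line breaks become single spaces
    return re.sub(r"\s*\n\s*", " ", text).strip()


def label_pages(printed_numbers):
    """Label pages in order; None marks a page with no printed number.

    Unnumbered pages take the preceding printed number plus a letter,
    in square brackets: [49a], [49b], ...
    """
    labels = []
    last_numbered = None
    suffix_index = 0
    for number in printed_numbers:
        if number is not None:
            last_numbered = number
            suffix_index = 0
            labels.append(str(number))
            continue
        # Assumption: the source does not say how to label unnumbered pages
        # before the first numbered page, or runs longer than 26 pages.
        if last_numbered is None or suffix_index >= len(ascii_lowercase):
            raise ValueError("labeling rule not specified for this case")
        labels.append(f"[{last_numbered}{ascii_lowercase[suffix_index]}]")
        suffix_index += 1
    return labels
```

For instance, `label_pages([49, None, None, 50])` gives the labels 49, [49a], [49b], 50, matching the example in the list.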

The online version of the manuscript is set up so that each page of the original volume corresponds to a single web page. Each of these web pages has a navigation bar on the left-hand side, similar to the one that appears on this page, which allows the user to jump to the next page, previous page, title (first) page, last page, and Table of Contents. The online Table of Contents page is a reproduction of the original except for the addition of links to the sections of the manuscript, which correspond to the page numbers given in the original work. Links have likewise been added to the List of Photographs page, allowing the user to jump to the first page of each group of image pages in the manuscript.
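
The navigation scheme just described can be pictured as a lookup over the ordered list of page files. The following Python is a minimal sketch under assumed names: the file names, the `navigation_links` function, and the `contents_file` default are all hypothetical, and this is not the code used to build the site.

```python
def navigation_links(page_files, current, contents_file="contents.html"):
    """Return the five navigation targets for one page.

    page_files is the full list of page files in manuscript order.
    The first and last pages have no previous or next page, so those
    entries are None.
    """
    index = page_files.index(current)
    return {
        "next": page_files[index + 1] if index + 1 < len(page_files) else None,
        "previous": page_files[index - 1] if index > 0 else None,
        "title": page_files[0],
        "last": page_files[-1],
        "contents": contents_file,
    }


# Hypothetical file names, including one bracketed (unnumbered) page.
pages = ["title.html", "p049.html", "p049a.html", "p050.html"]
links = navigation_links(pages, "p049a.html")
```

Because every page is generated from the same ordered list, inserting a bracketed page such as [49a] only requires adding its file in the right position.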

The title page, Table of Contents page, and this page each have a link to the search interface page, where the user can enter a word and the system finds every page on which that word appears.
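
One common way to support this kind of word lookup is an inverted index that maps each word to the pages containing it. The Python below is a sketch of that general technique only; how the site's search was actually programmed is not described here, the function names are hypothetical, and the sample page text is invented for illustration.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercased word to the page labels it appears on.

    pages is a list of (label, text) pairs in manuscript order, so the
    labels stored for each word stay in manuscript order too.
    """
    index = defaultdict(list)
    for label, text in pages:
        # set() so a page is listed once even if the word repeats on it
        for word in set(re.findall(r"[a-z0-9']+", text.lower())):
            index[word].append(label)
    return index


def search(index, word):
    """Return the labels of every page on which the word appears."""
    return list(index.get(word.lower(), []))


# Invented sample text, not quoted from the manuscript.
sample_pages = [
    ("49", "A note mentioning Duppe."),
    ("[49a]", "Photograph caption: Phoenix street scene."),
    ("50", "Phoenix, and Phoenix again."),
]
index = build_index(sample_pages)
```

With an exact-match lookup like this, searching the sample for "Duppa" finds nothing while "Duppe" finds page 49. If the site's search matches words exactly, the same would apply to the uncorrected spellings preserved from the original.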

Each page's navigation bar also has links to the University Libraries' web site and to the Department of Archives and Manuscripts' main page.

This project was produced in the spring of 1999 by staff in the Library Instruction, Systems and Technology department of ASU Libraries. The scanning and OCR conversion for the project were primarily performed by graduate assistant Shobana Balasubramanian. Heather Knowles designed the site and created the web pages. Pam Dunlock programmed the site's search capability.