We have scanned the original newspaper pages using high-quality scanners, then applied an optical character recognition (OCR) process, which converts the printed text to electronic text. Both of these processes produce the most accurate results possible; however, some errors inevitably slip through.
The quality of the original newspaper affects the outcome and accuracy of the OCR scanning process.
A range of factors affects the accuracy of the OCR output, including:
- Highly complex layout
- Radical differences in layout over time
- Variable font sizes and character types (especially Gothic)
- Narrow space between lines
- Narrow gutter between columns
- Missing or misprinted text
- Poor quality or deteriorated inks
- Poor quality or deteriorated paper
- Irregular alignment of characters in hand-set press
- Annotations by hand
- Graphic devices and/or elements