Dirty or Broken Type and Typesetting Mistakes Can Make Newspaper OCR Difficult


Newspaper Optical Character Recognition Research Issues

Digitized newspapers are a boon to genealogists, family historians and researchers, especially for the big three: births, marriages and deaths. These are followed closely by divorces, separations, bed and board issues, court cases, flitting (the movement of renters on a yearly date), moves, accidents, confirmations, military news, etc.


If the OCR program cannot read the type, it has a hard time recognizing the words, and a search for those words can miss the article.

Below are two examples. The first appears to be broken: the right arm of a lowercase r is missing. The second is a piece of type that is very likely broken. If it is not broken, it is certainly dirty, with ink blotting the lowercase n so that it is unrecognizable, though not necessarily unreadable by humans.

[Image: Broken Type]

Below is an example of a headline that may have a piece of broken type, or perhaps the wrong piece of type inserted, so that the capital G in WAGE looks like a C and the word reads WACE. I can't tell if the capital G is broken or if a capital C was inserted by mistake. I doubt the OCR program can either.

[Image: OCR issues]

Be creative when using programs with OCR to find articles pertaining to your subject. When type got old and broken, there wasn't an apprentice's helper who reviewed every single lead box every single night. Sometimes ink dried in the creases, sometimes the wrong type was in the right box, and sometimes the type had a broken arm and appeared to be another letter or just didn't appear to be the letter needed.
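One way to be creative is to search for the misread spellings as well as the correct one. The short Python sketch below builds a list of alternate spellings for a search term. It is only an illustration: the function name search_variants and the CONFUSIONS table are made up for this example, and only the G to C swap comes from the WAGE headline above. The other swaps are guesses at what a broken r or an inked-over n might be read as.

```python
from itertools import product

# Hypothetical confusion table: each letter maps to characters that a broken
# or dirty piece of type might be read as. Only G -> C comes from the WAGE
# headline example; the r and n entries are illustrative guesses.
CONFUSIONS = {
    "G": ["C"],
    "r": ["i"],
    "n": ["u"],
}


def search_variants(term, confusions=CONFUSIONS, max_variants=50):
    """Return the term followed by spellings with confusable letters swapped."""
    # For each position, list the original character and then its look-alikes.
    choices = [[ch] + confusions.get(ch, []) for ch in term]
    variants = []
    for combo in product(*choices):
        variants.append("".join(combo))
        # Cap the list so long terms do not produce an unmanageable number.
        if len(variants) >= max_variants:
            break
    return variants


print(search_variants("WAGE"))  # ['WAGE', 'WACE']
```

Each spelling in the list can then be typed into the newspaper site's search box in turn. Add your own swaps to the table as you come across misreadings in the papers you search.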
