Thursday, 17 August 2023

OCR converts to Yiddish, rubbish and gibberish

I tried an el cheapo Optical Character Recognition (OCR) program that I have had lying around. ABBYY FineReader has been supplied with most scanners for years, but when you get something for nothing, maybe nothing is what you get. Unless you want to pay a lot extra, I don't think OCR will help much in getting handwritten pages into a typed Word manuscript. These pages are written in a traditional flowing cursive, on different papers and with different pens, and over a long period of time they have faded as well. The results I got from this program were totally unrecognisable. The process looks encouraging at first, as colours are cast down the page, highlighting here and underscoring there. You are meant to intervene manually at this point, but I just saved the result into a file, which turned out to be complete hieroglyphics. The best thing here is just to sit down and patiently retype every word into new Word files.

As far as I could tell the green area was what it found, the red area was to be deleted and the white area was totally ignored.

