Why is OCR at 300 dpi a standard?
Scanning at 300 dpi (dots per inch) is not officially a standard for OCR (optical character recognition), but it is considered the gold standard.
Some people think you can scan at a lower dpi, such as 200 dpi, and then use scanner software to increase the resolution through interpolation. However, interpolation provides no meaningful benefit for OCR. It simply enlarges the image, either by inserting estimated pixels between the scanned ones or by stretching existing pixels, and both approaches produce only an approximation. Some clarity and quality is always lost, so you are better off scanning the document at 300 dpi in the first place.
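To illustrate the point, here is a minimal sketch of nearest-neighbour interpolation, one of the simplest upscaling methods, applied to a single row of pixel values (an integer scale factor is assumed for simplicity). Every "new" pixel is just a copy of an existing one, so the enlarged image contains no character detail that was not already captured by the scanner:

```python
def upscale_nearest(row, factor):
    """Enlarge a row of pixel values by repeating existing pixels."""
    out = []
    for value in row:
        # Each source pixel simply becomes several identical pixels;
        # no new information about the page is created.
        out.extend([value] * factor)
    return out

scanned = [0, 255, 0, 255]            # pixel values captured by the scanner
enlarged = upscale_nearest(scanned, 3)
print(enlarged)
# The set of distinct values is unchanged: nothing new was recovered.
print(set(enlarged) == set(scanned))  # True
```

Smarter methods such as bicubic interpolation blend neighbouring pixels instead of copying them, but they still only estimate what the page might have looked like; they cannot recover detail the scanner never captured.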
Most leading OCR and Automated Forms Processing software companies recommend scanning at a minimum resolution of 300 dots per inch for effective data extraction; in fact, many use 300 dpi as their default setting. In other words, for every square inch of paper, the scanner captures 300 dots horizontally and 300 dots vertically, or 90,000 dots in total (300 × 300 = 90,000 dots per square inch). At a 200 dpi setting you capture only 40,000 dots per square inch instead of 90,000, less than half the data. That is a significant difference.
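The arithmetic above can be checked directly, since resolution scales with the square of the dpi setting:

```python
def dots_per_square_inch(dpi):
    """Total dots captured per square inch at a given scan resolution."""
    # dpi dots horizontally times dpi dots vertically
    return dpi * dpi

print(dots_per_square_inch(300))  # 90000
print(dots_per_square_inch(200))  # 40000

# 300 dpi captures 2.25x the data of 200 dpi, not just 1.5x,
# because the increase applies in both directions.
print(dots_per_square_inch(300) / dots_per_square_inch(200))  # 2.25
```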
So, to answer the question quite simply: higher-resolution scanning equates to improved automated OCR accuracy. OCR is a technology in which a computer decides what each scanned character is, and more dots per inch give the computer more data to base that decision on, allowing it to identify the character more accurately.
Below is an example of the kind of problem too few dots per inch can cause. Is the second character a B or an 8?