I often come across bitmaps that contain nothing but paragraphs of text, so I am looking for a way to identify the font used, the paragraph alignment, the line spacing, the text color, and styles such as bold and italics.
Would an OCR package allow me to do that?
If not, what other options do I have?
Answer
There are several online utilities that can be used to identify the fonts in an image.
These utilities cannot be used to determine the formatting of the text in an image. However, you can use OCR programs such as Tesseract (open source) and Smart OCR (commercial, starting from US$99.90) to detect formatting such as paragraph alignment and line spacing as well as font styles such as bold or italic (see this Stack Overflow question). Note that some OCR programs can attempt to identify the font(s) in an image as well.
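As a rough illustration of how alignment and line spacing could be derived once an OCR engine has located the text, here is a minimal sketch in Python. It assumes you already have one bounding box per text line as `(left, top, width, height)` in pixels, for example aggregated from the word boxes in an OCR engine's layout output. The function name `estimate_layout`, the `tolerance` parameter, and the heuristics themselves are hypothetical and only for illustration; they are not how any particular OCR package does it.

```python
from statistics import median


def estimate_layout(line_boxes, page_width, tolerance=0.02):
    """Guess paragraph alignment and line spacing from per-line boxes.

    line_boxes: list of (left, top, width, height) tuples in pixels,
                ordered top to bottom (assumed to come from an OCR engine).
    page_width: width of the image in pixels.
    tolerance:  allowed edge jitter as a fraction of page_width (assumption).
    """
    if not line_boxes:
        raise ValueError("need at least one line box")

    tol = tolerance * page_width

    def spread(values):
        return max(values) - min(values)

    lefts = [left for left, _, _, _ in line_boxes]
    rights = [left + width for left, _, width, _ in line_boxes]
    centers = [left + width / 2 for left, _, width, _ in line_boxes]
    # The last line of a justified paragraph is usually short, so skip it
    # when checking the right edge (only if enough lines remain).
    body_rights = rights[:-1] if len(rights) > 2 else rights

    if len(line_boxes) < 2:
        alignment = "unknown"  # a single line does not reveal alignment
    elif spread(lefts) <= tol and spread(body_rights) <= tol:
        alignment = "justified"
    elif spread(lefts) <= tol:
        alignment = "left"
    elif spread(rights) <= tol:
        alignment = "right"
    elif spread(centers) <= tol:
        alignment = "center"
    else:
        alignment = "unknown"

    # Line pitch: typical distance between the tops of consecutive lines.
    tops = [top for _, top, _, _ in line_boxes]
    pitches = [b - a for a, b in zip(tops, tops[1:])]
    line_pitch = median(pitches) if pitches else None

    # Relative spacing: pitch divided by the typical line box height.
    line_height = median(height for _, _, _, height in line_boxes)
    relative = line_pitch / line_height if line_pitch and line_height else None

    return {
        "alignment": alignment,
        "line_pitch_px": line_pitch,
        "relative_spacing": relative,
    }


if __name__ == "__main__":
    ragged_right = [(50, 100, 400, 20), (50, 130, 380, 20), (50, 160, 300, 20)]
    print(estimate_layout(ragged_right, page_width=500))
```

The relative spacing here is measured against the height of the detected line boxes, not the font's nominal point size, so it is only an approximation of the line spacing setting used in the original document. Font style (bold, italic) and color are not covered by this sketch; they would have to come from the OCR engine or from analysing the pixels inside each box.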