TestComplete 4 Sneak Peek - OCR

Here's a sneak preview of a new feature that will debut in the forthcoming TestComplete 4:

(No release date available yet. You'll be the first to know when it's announced.)


For most applications, TestComplete can 'read' the words and characters displayed on-screen, which makes it easy to create tests that detect the state of the application and react accordingly. Some applications, however, are hard to test because they 'paint' text onto the screen themselves instead of using Windows to display the characters. We call them 'black-box' applications. Text rendered this way is just dots to the computer and to testing tools, which makes it difficult or impossible to create robust, reliable tests for black-box applications. TestComplete 4 will include a new feature that solves this problem for many applications: Optical Character Recognition, or OCR.

What is OCR?

OCR translates images of printed text into computer-readable text. TestComplete 4 can capture an image of a black-box application screen, use OCR to 'read' the text on it, and convert that text to usable ASCII or Unicode. The recognized text can then be used to create solid, reliable tests.

OCR Is Scriptable

TestComplete 4's OCR feature is, of course, completely scriptable through two new elements, OCR and OCRObject. The new script object, OCR, has just one method: CreateObject. Pass OCR.CreateObject a captured screen image that contains the text to be recognized, and it returns a new OCRObject, which is used to perform the text recognition.

A little bit about OCRObject

To start the character recognition process, we can call OCRObject.GetText or OCRObject.FindRectByText.
OCRObject.GetText takes no parameters and returns all OCR-readable text from the image.
OCRObject.FindRectByText takes a string parameter and tries to locate that text in the image. If it finds the text, it returns the image coordinates of the region where the text was found.

OCRObject.OCROptions provides access to OCR customization settings.

How about a script example?

Here is a simple, one-line TestComplete 4 script that takes an image of the active window and returns all readable text found in it. Examples for all five of TestComplete's scripting languages are included.

VBScript:

Log.Message OCR.CreateObject(Sys.Desktop.ActiveWindow).GetText

JScript:

Log.Message(OCR.CreateObject(Sys.Desktop.ActiveWindow).GetText());

DelphiScript:

Log.Message(OCR.CreateObject(Sys.Desktop.ActiveWindow).GetText);

C#Script and C++Script:

Log["Message"](OCR["CreateObject"](Sys["Desktop"]["ActiveWindow"])["GetText"]());

How does it work?

TestComplete 4 can recognize 52 lower-case and upper-case Latin characters, 10 digits, and 31 special characters in almost any font, size or style. We've shown that it can be done in a script with as little as one line of code, but what's going on inside TestComplete 4? How does it read the text in the image? To successfully read on-screen text, TestComplete 4 has to find common ground between the installed Windows fonts and the captured image of the black-box application text.


Trying to match every installed font on a Windows PC with dozens or even hundreds of installed fonts would waste valuable processing time. TestComplete 4 creates and uses 'font collections' to limit the readable fonts to just the ones needed for the tested application. The default font collection contains Arial, Courier New, Times Roman, Fixed Sys, System, and MS Sans Serif, each in five sizes and five styles. You'll be able to create custom font collections with any combination of installed fonts, sizes and styles.


To prepare a font collection for recognition, TestComplete 4 generates an image of every recognizable character in all designated sizes and styles for each font in the collection. It stores these character images in a master character table, which is later compared against the on-screen image.
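To get a feel for how large the master character table can become, here is a rough sketch in plain JavaScript (not TestComplete script). The font names come from the default collection described above; the specific sizes and styles are hypothetical placeholders, because only their count (five each) is known, and the assumption that every font, size and style combination gets one image per character is ours.

```javascript
// Fonts in the default collection, as named in the article.
const fonts = ["Arial", "Courier New", "Times Roman", "Fixed Sys", "System", "MS Sans Serif"];

// Hypothetical placeholders: the article gives only the count (five sizes,
// five styles), not the actual values.
const sizes = [8, 10, 12, 14, 16];
const styles = ["regular", "bold", "italic", "bold italic", "underline"];

// 52 Latin letters + 10 digits + 31 special characters.
const charsPerVariant = 52 + 10 + 31;

// Build every font/size/style combination in a collection.
function enumerateVariants(fontList, sizeList, styleList) {
  const variants = [];
  for (const font of fontList) {
    for (const size of sizeList) {
      for (const style of styleList) {
        variants.push({ font, size, style });
      }
    }
  }
  return variants;
}

const variants = enumerateVariants(fonts, sizes, styles);

// Assumption: one character image per character per variant.
const tableSize = variants.length * charsPerVariant;

console.log(variants.length + " variants, " + tableSize + " character images");
```

Under these assumptions the default collection already yields 150 variants and 13,950 character images, which suggests why restricting the collection to the fonts an application actually uses saves processing time.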


A process called 'fragmentation' is used to prepare the screen image of the black-box application for comparison. Fragmentation simplifies the internal representation of the image, identifies the recognizable elements, and helps separate the text fragments. It locates the rectangular regions within the screen image and tries to find several non-intersecting rectangular fragments, each with its own predominant color. Then TestComplete 4 transforms the 'fragmented' screen image to a binary representation: every pixel becomes either completely black or completely white, with no shades of gray. The simple black-and-white image of each character is the common ground used to compare the contents of the font collection to the contents of the black-box application screen. TestComplete 4 simplifies the elements and compares every possible item. When a match is found, a character is 'read'.
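The binarize-and-compare idea can be illustrated with a toy sketch in plain JavaScript. This is not TestComplete's actual algorithm: the fixed threshold, the pixel-by-pixel difference count, and the tiny 3x3 templates are all simplifying assumptions made for illustration, and the function names are hypothetical.

```javascript
// Convert a grayscale image (2D array of 0-255 values) to pure black (1)
// and white (0). The fixed threshold of 128 is an assumption.
function binarize(gray, threshold = 128) {
  return gray.map(row => row.map(v => (v < threshold ? 1 : 0)));
}

// Count the pixels at which two equally sized binary images differ.
function pixelDifference(a, b) {
  let diff = 0;
  for (let y = 0; y < a.length; y++) {
    for (let x = 0; x < a[y].length; x++) {
      if (a[y][x] !== b[y][x]) diff++;
    }
  }
  return diff;
}

// Return the character whose template differs least from the glyph,
// or null if even the best candidate differs by more than maxDiff pixels.
function recognize(glyph, table, maxDiff = 0) {
  let best = null;
  let bestDiff = Infinity;
  for (const [ch, template] of Object.entries(table)) {
    // Skip templates whose dimensions do not match the glyph.
    if (template.length !== glyph.length ||
        template[0].length !== glyph[0].length) continue;
    const d = pixelDifference(glyph, template);
    if (d < bestDiff) {
      best = ch;
      bestDiff = d;
    }
  }
  return bestDiff <= maxDiff ? best : null;
}

// Toy "master character table" with three 3x3 templates.
const table = {
  I: [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
  L: [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
  T: [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
};

// A captured fragment: dark text on a light background, with some
// anti-aliasing noise in the gray levels.
const captured = [
  [30, 240, 250],
  [45, 235, 245],
  [20, 60, 90],
];

const result = recognize(binarize(captured), table);
console.log(result); // prints "L"
```

Reducing both the font images and the screen fragment to black and white first means the comparison no longer depends on text color, background color, or anti-aliasing shades. A real engine would also need to deal with scaling, character spacing, and near matches, which this sketch ignores.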

The hard work needed to make OCR tick goes on inside the TestComplete 4 engine, so we can just write a couple of lines of script and get back all of the readable text or search for a specific string. OCR in TestComplete 4 is going to make black-box application testing much easier.


