Why we built vision AI into TestComplete: Solving the complex app testing challenge
When we talk to testing teams at enterprise organizations, we hear the same frustrations repeatedly: “Our automation breaks every time the UI changes.” “We can’t test this application because it doesn’t expose accessible properties.” “We spend more time maintaining tests than creating new ones.”
That’s why we built vision AI into SmartBear TestComplete – as a practical solution to challenges our customers face every day when traditional property-based testing falls short.
Common problems with testing complex apps
Property-based object recognition has been the foundation of test automation for years, and for good reason. When applications expose clean, stable properties, it’s fast, reliable, and efficient. But here’s the reality: many applications don’t cooperate with this approach.
Consider the scenarios testing teams actually encounter in production environments:
Canvas-based interfaces that render everything as pixels, with no accessible DOM or exposed properties. Think about financial trading platforms, engineering design tools, or interactive dashboards where data visualizations are drawn directly to canvas rather than built from standard HTML elements.
Custom graphics engines in specialized software. CAD applications, medical imaging systems, and simulation tools often use proprietary rendering that doesn’t expose the technical properties traditional automation relies on.
Legacy desktop applications built before modern accessibility standards existed. These applications are business-critical – they can’t be replaced or redesigned – but they weren’t built with test automation in mind.
Embedded controls and third-party components that don’t follow standard patterns. Custom drop-downs, specialized data grids, and proprietary UI controls that work perfectly for users but are invisible to property-based automation.
Applications running in virtualized environments like Citrix or remote desktop, where everything appears as a single rendered image without individual element properties accessible to automation tools.
Scenarios like these block test automation adoption for teams that need it most.
Why we added vision AI to TestComplete
The decision to integrate vision AI into TestComplete came from listening to our customers and understanding a fundamental truth: test automation tools need to adapt to applications as they exist, not as we wish they were designed.
We saw teams making painful compromises. Some were maintaining separate automation tools – one for standard applications, another for complex scenarios. Others were falling back to manual testing for critical workflows because their automation simply couldn’t interact with certain UI elements. Many were spending a significant share of their QA capacity on test maintenance rather than actual quality assurance.
Vision AI addresses this by expanding how TestComplete can identify and interact with UI elements. Instead of relying exclusively on technical properties that might not exist or might change unpredictably, vision AI uses computer vision and machine learning to recognize elements the way users see them visually.
How vision AI works in TestComplete
Vision AI in TestComplete uses artificial intelligence to identify UI elements based on their visual appearance rather than their technical properties. It analyzes screenshots of your application to locate buttons, text fields, images, and other controls by understanding what they look like and where they’re positioned.
The practical advantage is straightforward: vision AI can test applications that property-based tools can’t touch, and it can maintain test stability when UI implementations change, as long as the visual appearance remains consistent for users.
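To make the idea concrete, here is a toy sketch of locating a UI element by its pixels rather than its properties. TestComplete’s actual vision AI uses trained machine-learning models on real screenshots; this illustration reduces the concept to simple template matching over a grid of pixel values, with all data invented for the example.

```python
# Toy illustration of visual element location. Screenshots are modeled
# as 2D grids of pixel values; a UI element is found by matching its
# known appearance (a small template) against the larger image.
# This is a conceptual sketch, not TestComplete's actual algorithm.

def find_element(screenshot, template):
    """Return (row, col) of the top-left corner where `template`
    appears inside `screenshot`, or None if it is not found."""
    sh, sw = len(screenshot), len(screenshot[0])
    th, tw = len(template), len(template[0])
    for r in range(sh - th + 1):
        for c in range(sw - tw + 1):
            if all(screenshot[r + i][c + j] == template[i][j]
                   for i in range(th) for j in range(tw)):
                return (r, c)
    return None

# A 4x6 "screenshot" containing a 2x2 "button" (pixel value 9) that
# exposes no properties to automation - only its rendered pixels.
screen = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 9, 9, 0],
    [0, 0, 0, 9, 9, 0],
    [0, 0, 0, 0, 0, 0],
]
button = [[9, 9],
          [9, 9]]

print(find_element(screen, button))  # (1, 3)
```

The point of the sketch: nothing in it asks the application for an ID, a class name, or an accessibility property. That is why the same approach works on canvas-rendered interfaces and remote-desktop sessions, where such properties simply do not exist.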
Here’s why TestComplete’s implementation is particularly effective:
Production-grade AI models
We’ve trained our models on diverse enterprise applications across Fortune 500 deployments. This isn’t experimental technology – it’s battle-tested across real-world complexity.
Context-aware element identification
Vision AI doesn’t just match pixels – it understands element purpose and context. It can distinguish between multiple buttons that look similar by understanding their position and surrounding elements.
Automatic and manual modes
Vision AI can work as an automatic failover when property-based identification encounters unsupported controls or unstable properties. You can also choose to use visual recognition directly for specific scenarios where it’s the most reliable approach. When used as automatic failover, the system switches methods without manual mode changes or script modifications, while still leaving you in control of the overall strategy.
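The failover pattern described above can be sketched in a few lines. Everything here is illustrative – the class, method, and index names are invented for the example and are not TestComplete’s internal API – but it shows the essential control flow: try the fast property-based lookup first, and fall back to visual recognition only when properties are missing.

```python
# Minimal sketch of hybrid recognition with automatic failover.
# All names are hypothetical; this is not TestComplete's internal API.

class HybridFinder:
    def __init__(self, property_index, visual_index):
        # property_index: maps stable property values (e.g. element IDs)
        # to elements; visual_index: maps visual signatures to elements.
        self.property_index = property_index
        self.visual_index = visual_index

    def find(self, element_id, visual_signature):
        # Preferred path: fast, precise property-based identification.
        element = self.property_index.get(element_id)
        if element is not None:
            return element, "property"
        # Failover path: locate the element by how it looks.
        element = self.visual_index.get(visual_signature)
        if element is not None:
            return element, "vision"
        raise LookupError("element not found by either method")

finder = HybridFinder(
    property_index={"btnSubmit": "<Submit button>"},
    visual_index={"blue-rounded-rect": "<canvas Submit button>"},
)
print(finder.find("btnSubmit", "blue-rounded-rect"))   # property hit
print(finder.find("missing-id", "blue-rounded-rect"))  # vision failover
```

Because the failover decision lives inside the lookup, the calling test never changes when an element loses its properties – which is exactly why no script modifications are needed.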
Combined with Google Vision API for text extraction, vision AI can read and validate text that appears in graphics, charts, or any visual element – essential for testing data visualizations and legacy terminal screens.
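A toy sketch can also show why text drawn as pixels remains testable. Real OCR – TestComplete delegates this to the Google Vision API – is far more sophisticated, but the core idea is recognizing known character shapes in a rendered image. The two-letter “alphabet” below is invented purely for illustration.

```python
# Toy sketch of OCR-style text extraction: each character is a tiny
# known bitmap, and text is recovered by matching bitmaps left to
# right. Conceptual only - not how the Google Vision API works.

# 2x2 glyph bitmaps for a tiny two-letter "alphabet" (illustrative).
GLYPHS = {
    "O": [[1, 1],
          [1, 1]],
    "K": [[1, 0],
          [1, 1]],
}

def read_text(image):
    """Recover text from a 2-row image composed of 2x2 glyphs."""
    text = []
    for col in range(0, len(image[0]), 2):
        cell = [row[col:col + 2] for row in image]
        for char, glyph in GLYPHS.items():
            if cell == glyph:
                text.append(char)
                break
    return "".join(text)

# "OK" rendered purely as pixels, with no text property to query.
rendered = [
    [1, 1, 1, 0],
    [1, 1, 1, 1],
]
print(read_text(rendered))  # OK
```

Once text can be recovered from pixels like this, a test can assert on values shown in a chart or a legacy terminal screen even though no text property exists to read.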
Real-world use cases for vision AI
Testing teams use vision AI across diverse scenarios. Here are just a few examples:
- Financial trading platforms – Identify chart elements rendered on canvas and extract and validate real-time market data that appears purely as graphics.
- Engineering design software – Automate CAD applications with proprietary graphics engines by recognizing custom toolbars and specialized controls visually.
- Legacy insurance systems – Enable test coverage for business-critical applications with non-standard controls and limited property exposure.
- Remote desktop applications – Test applications delivered through Citrix or remote desktop where the entire interface appears as a single rendered image.
- Embedded medical devices – Validate specialized controls in custom-built healthcare interfaces before reaching clinical environments.
How vision AI complements hybrid object recognition
Vision AI doesn’t replace property-based testing – it complements it as part of TestComplete’s hybrid recognition approach. This is a critical distinction.
When applications expose stable properties, property-based identification remains the most efficient method. It’s fast, precise, and works beautifully for standard controls. Vision AI steps in when properties aren’t available, aren’t reliable, or when visual validation is specifically needed.
This hybrid model delivers several advantages:
Automatic optimization
As you create your tests, you can choose which object detection capabilities to invoke, or you can let TestComplete intelligently choose the best recognition method for each element. In the latter mode, there’s nothing to decide manually: the system uses property-based identification when it’s available and reliable, and fails over to vision AI when it’s not.
Broader application coverage
The combination handles everything from modern web applications with clean React components to legacy desktop systems with proprietary controls, all within a single automation framework.
Reduced maintenance burden
When UI implementations change but visual appearance remains consistent, vision AI maintains test stability. When properties are stable, property-based testing provides efficiency. You get resilience across different types of change.
Flexible testing strategies
For teams with diverse application portfolios, the hybrid approach eliminates the need for multiple automation tools. Whether you’re testing a modern Angular app or a 15-year-old desktop application, TestComplete provides appropriate recognition methods for each scenario.
Addressing common vision AI questions
When teams first hear about vision AI, they typically have similar questions:
Does this mean I should stop using property-based testing?
No. Property-based identification remains primary when properties are available and stable. Vision AI extends what’s possible when property-based approaches aren’t sufficient.
How does this handle UI changes?
Vision AI maintains stability when visual appearance is consistent even if technical implementation changes. For significant visual changes, you may need to update visual references, similar to how property changes require updates to property-based tests.
What is vision AI’s impact on performance?
Vision AI is optimized for test execution speed. While visual recognition is computationally more intensive than property matching, the performance impact is minimal in practical test runs, and the tradeoff is worthwhile when it’s the only viable recognition method.
Can I use this with my existing tests?
Yes. Vision AI integrates naturally into TestComplete’s existing test framework. You can add visual recognition to existing test suites without rewriting everything.
The practical impact
The real measure of vision AI’s value is what testing teams can accomplish with it.
Teams report being able to automate workflows they previously tested manually because traditional tools couldn’t interact with custom controls. Organizations with legacy systems can finally implement comprehensive test coverage without waiting for application modernization projects. Teams testing across mixed environments – desktop and web, legacy and modern – can use a single automation platform rather than maintaining multiple tools.
Perhaps most importantly, teams spend less time fighting with their automation tools and more time on actual quality assurance. When your automation can reliably interact with the applications you actually need to test, regardless of how they’re built, test automation becomes an enabler rather than a constant maintenance burden.
Moving forward with vision AI
TestComplete’s vision AI represents our commitment to solving real automation challenges rather than just adding features. It’s built on the recognition that enterprise testing teams face diverse, complex applications that don’t always fit neat patterns, and automation tools need to adapt to that reality.
If your testing challenges include complex interfaces, legacy systems, custom controls, or virtualized environments, vision AI provides a practical path forward. Start your free TestComplete trial to experience how vision-based recognition expands what’s possible with test automation.
Frequently asked questions about vision AI for TestComplete
How does vision AI work in TestComplete?
Vision AI is a feature in TestComplete that uses artificial intelligence to identify UI elements based on their visual appearance rather than their technical properties. It analyzes screenshots of your application to locate buttons, text fields, images, and other controls by understanding what they look like and where they’re positioned. This allows for testing complex interfaces like canvas-based applications and legacy systems.
Does vision AI replace property-based testing?
No, vision AI does not replace property-based testing; it complements it as part of TestComplete’s hybrid recognition approach. Property-based identification remains the primary method when properties are available and stable. Vision AI extends what’s possible when property-based approaches aren’t sufficient and can act as an automatic failover.
How does vision AI handle changes in the UI?
When UI implementations change but the visual appearance remains consistent for users, vision AI maintains test stability. It recognizes elements the way users see them visually, rather than relying exclusively on technical properties that might change unpredictably. This resilience across different types of change is a key advantage of the hybrid approach.
How does vision AI impact test execution performance?
Vision AI is optimized for test execution speed. While visual recognition is computationally more intensive than property matching, the performance impact is minimal in practical test runs. You can add visual recognition to existing test suites without rewriting everything, as it integrates naturally into TestComplete’s existing framework.
Can vision AI read text within images or charts?
Yes, combined with the Google Vision API for text extraction, vision AI can read and validate text that appears in graphics, charts, or any visual element. This capability is essential for testing data visualizations and legacy terminal screens where traditional text extraction methods fail.
What types of applications benefit most from vision AI object recognition?
TestComplete’s vision AI is particularly effective for canvas-based interfaces, custom graphics engines, embedded PDFs, and remote desktop solutions like Citrix that render as pixels without accessible properties. It excels in legacy modernization projects where applications span both old and new technology stacks, framework migrations like Angular upgrades, and platform transitions from desktop to web. Vision AI also handles environments like financial dashboards, reporting systems, and legacy terminal screens where critical data appears as rendered text in graphics or charts.