Create tests in Reflect directly from your coding agent!

Prashant Mohan
March 30, 2026

If you’ve used Claude Code, GitHub Copilot, Cursor, or any coding agent, you already know the feeling. You describe what you want in plain language, the agent figures out the steps, and you watch it work. When something goes wrong, it backs up and tries a different approach.

Reflect now brings that same agentic workflow to test automation. Through the SmartBear MCP server, any coding agent that supports MCP can connect to Reflect and build tests from high-level objectives.
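As a sketch of what that connection looks like: MCP-capable clients such as Claude Code typically register servers in a JSON configuration file under an `mcpServers` key. The server name, package, and environment variable below are illustrative placeholders, not SmartBear's published values; check the SmartBear MCP server documentation for the actual setup.

```json
{
  "mcpServers": {
    "smartbear": {
      "command": "npx",
      "args": ["-y", "@smartbear/mcp-server"],
      "env": { "REFLECT_API_KEY": "<your-api-key>" }
    }
  }
}
```

Once registered, the agent discovers the server's tools automatically and can start calling them in response to your objective.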

I’ve spent the last several months watching this play out in practice, and I think most teams are underestimating how much the economics of test automation are about to shift. Not because AI can “generate tests” (that’s been promised and under-delivered for years), but because Reflect’s agentic integration introduces something fundamentally different: an automation partner that can see your application, reason about what it’s looking at, and recover when things don’t go as planned.

There’s a deeper shift happening in how we think about test authoring itself. Traditionally, building an automated test meant providing explicit instructions: click this element, type this value, wait for this condition. Every interaction had to be spelled out. The agentic model moves beyond that. Instead of giving the agent a sequence of recorded steps, you give it test intent: “log in and create a sales quotation” or “search for a hotel in Miami Beach and verify results.” The agent figures out the specific interactions needed to fulfill that objective, adapting to what it actually sees on screen. That’s a fundamentally different starting point, and it means tests can be authored by anyone who understands what the application should do.

How it works

You connect your coding agent to the SmartBear MCP server and give it an objective. The agent gets access to a set of purpose-built tools: it can add test steps using natural language prompts, take screenshots to understand application state, reuse existing test segments, delete and retry steps when something fails, and validate outcomes. Reflect handles all the execution, visual detection, and self-healing underneath.
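The control flow described above can be sketched as a small, self-contained loop. The tool names here (`add_step`, `delete_steps`, `validate`) are hypothetical stand-ins for the MCP tools the agent calls, and `MockReflect` simulates Reflect's execution so the sketch runs on its own; it is an illustration of the add/fail/delete/retry pattern, not Reflect's actual API.

```python
class MockReflect:
    """Simulates Reflect executing natural-language test steps."""

    def __init__(self):
        self.steps = []

    def add_step(self, prompt: str) -> bool:
        # Record the step, then simulate Reflect reporting a failure
        # (e.g. an unexpected dialog blocking the flow).
        self.steps.append(prompt)
        return "via the toolbar" not in prompt

    def delete_steps(self, count: int) -> None:
        # Discard the most recent steps so the agent can retry.
        del self.steps[len(self.steps) - count:]

    def validate(self, expectation: str) -> bool:
        self.steps.append(f"verify: {expectation}")
        return True


def build_test(client, plan):
    """plan: list of (primary_prompt, fallback_prompt) pairs."""
    for primary, fallback in plan:
        if not client.add_step(primary):
            # Recover as in the SAP demo: delete the failed step
            # and try a different approach.
            client.delete_steps(1)
            client.add_step(fallback)
    return client.validate("quotation appears with status 'Open'")


plan = [
    ("log in with the test user credentials", "log in via SSO"),
    ("create a sales quotation via the toolbar",
     "create a sales quotation from the Sales menu"),
]
client = MockReflect()
build_test(client, plan)
print(client.steps)
```

In the real workflow, the decision of which fallback to try is made by the coding agent itself, informed by screenshots of the current application state rather than a precomputed plan.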

Because the agent is working within your existing development environment, it can pull in context from the sources your team already uses: requirement specs like a Jira ticket or a pull request describing new functionality, existing test cases, previous execution results, or test steps from repositories like Zephyr or QMetry.

This works for any type of web or mobile application. To make that concrete, here are two real examples.

SAP: We created a purpose-built SAP skill that’s specifically trained to navigate the complexities of SAP’s interface patterns, transaction flows, and menu hierarchies. In this demo, the agent read a 32-page test document (provided by SAP), connected to Reflect, and built an end-to-end sales quotation workflow: login, form entry, quotation creation, and status verification. Thirty-five steps from a single prompt. When the agent encountered an unexpected error dialog mid-flow, it recovered on its own by deleting the failed steps and trying a different approach. Watch SAP testing in action here.

Mobile apps: The same agentic workflow ran against a native mobile app on a real device. The agent connected to Reflect and built a complete hotel search flow: dismissing permission dialogs, entering a destination, selecting dates, and executing the search. Watch mobile app testing in action here.

The human stays in the loop

What I’ve described isn’t about automating humans out of critical business processes. Even when the agent does most of the heavy lifting, a human is always in the loop. The agent will ask which path to take when there’s ambiguity. It will request test data when it needs specific values. It will pause and ask for guidance when the expected outcome isn’t obvious from the screen alone.

Fully autonomous test generation will ultimately be where this lands. But most teams today are moving from zero to one with agentic workflows, and in that phase, having a human in the loop is what ensures the right guardrails are in place. The human makes sure the tests being built are real, repeatable, and scalable, not just technically valid but actually meaningful for the application.

Agentic test creation from external coding agents is the first of several agent-driven capabilities we’re building into Reflect. Self-healing agents that use agentic loops to automatically fix broken tests are next. Beyond that, we’re working on regression suite generation agents that handle test case selection and prioritization, release readiness agents that assess coverage and risk, and agents that convert production errors and user monitoring signals directly into test cases.

Why this matters now

Most organizations I talk to have a coverage gap that grows wider every sprint. Developers ship faster than QA can build automated checks, so coverage either stalls or gets backfilled with brittle scripts that break every time the UI changes. The standard response has been to hire more SDETs or invest months building out a code-based framework. Both approaches are slow, expensive, and don’t scale linearly with the rate of product change.

Agentic test creation in Reflect changes the ratio. A single QA engineer working with an agent can produce structured, maintainable tests at a pace that was previously impossible without a dedicated automation team.

The enterprise angle gets overlooked. Large organizations often have detailed manual test cases already documented in Jira, Zephyr, or QMetry. Hundreds, sometimes thousands of them. The labor of translating “manual test steps” to “working automation” is real. An agent connected to Reflect that can read those existing test cases and build automation from them is unlocking a backlog that many teams have given up on ever converting.

Why Reflect is built for this

Reflect supports a broad surface area of testing use cases: web, mobile, API, and visual testing, all from one platform. That matters because real-world applications aren’t just web pages. They’re multi-platform flows that span browsers, native apps, API calls, and visual validations. An agent building tests inside Reflect can cover all of these without switching tools or frameworks.

Reflect also uses vision-based technology alongside traditional element detection. This significantly increases the range of applications it can support (including complex enterprise UIs like SAP that break conventional selectors) and reduces test maintenance dramatically. When a button moves ten pixels or a label changes, vision-based detection adapts without requiring someone to update a fragile CSS selector. And when something does go wrong, Reflect provides enterprise-level debugging capabilities. Humans can review step-by-step execution logs, screenshots, and video recordings to understand exactly what happened and correct the test.

For enterprises that care about governance, Reflect integrates directly with major tools across the DevOps ecosystem, from CI/CD pipelines to issue trackers, as well as Zephyr and QMetry, so test executions flow into the system of record with full traceability and reporting.

There’s a broader point here that I think is underappreciated. Test automation tools and test management tools have always existed as separate categories, separate vendors, separate workflows. In the agentic era, that separation becomes a liability. Agents need to move fluidly between creating tests, pulling requirements, syncing results, and assessing coverage. That only works when automation and test management are connected. SmartBear already has this with Reflect, Zephyr, and QMetry, and hundreds of customers are benefiting from it today.

In an upcoming post, I’ll show what this looks like end-to-end: code changes in GitHub automatically triggering impact analysis, test case generation in QMetry, coverage gap detection, and test execution in Reflect, all orchestrated by multiple MCP servers working together.

What I’d watch for

The teams that move first on this will have a meaningful advantage, and not just in test coverage. The real shift is organizational. When test creation stops being a specialized skill that requires deep framework knowledge, the people closest to the product (manual testers, BAs, product owners) can contribute directly to the automation suite. That changes who owns quality and how fast quality feedback reaches the team.

One uncomfortable observation: a lot of the resistance to this kind of tooling comes from automation engineers who’ve built their careers around framework expertise. I get it. But the same thing happened when codeless testing emerged, and the engineers who adapted became more valuable, not less. They moved up the stack to focus on test strategy, coverage architecture, and the harder problems that agents can’t solve on their own yet.

If your team is still manually translating test cases into automation scripts, or if your automation backlog keeps growing faster than your team can clear it, get started here.
