What Is Behavior-Driven Development (BDD)?

The three main ‘pieces’ of BDD are the artifacts, the domain language, and the process. When people talk about BDD, they tend to focus on the artifacts -- an innovative way of expressing requirements and tests at the same time. Without all three pieces in place, attempts to “do” BDD tend to fail. As Curtis Petit, a leading contributor to the TestRetreat format, once put it, “I would say that BDD is a very specific term that gets used to describe a wide variety of abuses of the tools made to support it.”

Here’s how to do better.

BDD As Code and English

Dan North introduced BDD primarily for programmers, as a way to shift their thinking away from testing and toward describing behavior. Shortly after the idea was introduced, Steven Baker created what would become RSpec, a Ruby implementation that eliminated the term ‘test’, instead using language like ‘expect’, ‘should’, and ‘is_expected.’

From developers and unit tests, the idea of driving software with behavior bubbled up, to include acceptance criteria the customer can understand, and even the title of the story itself. One classic example is the bank ATM:

As a bank customer

I want to withdraw money from an ATM

So that I’m not constrained by hours or lines at the teller

This “As a (role), I want (feature), so that (benefit)” pattern is a template for describing high-level features.

When people say “BDD” today, they usually mean Gherkin and the use of given/when/then to describe acceptance criteria. Here’s a typical scenario laid out this way:

Feature: Blogging
   In order to monetize content
   As a site operator
   I want people to be able to create a blog post

   Scenario: Creating a blog post
      Given I am on the home page
      And I follow "New Post"
      When I fill in "Title" with "Hello World!"
      And I fill in "Body" with "This is my first post!"
      And I press "Publish"
      Then I should see "Hello World!"
      And I should see "This is my first post"

The scenario described above looks and reads like English, but it is actually a grammar that can be transformed to call software code. For example, the step “I follow” maps to a function, and “New Post” is the argument passed to it. Once the greater team agrees on the text, a programmer can use a tool called Cucumber to transform the near-English text into stub functions, like these examples of real Ruby code:

When("I fill in {string} with {string}") do |field, value|
  pending # Write code here that turns the phrase into concrete actions
end

When("I press {string}") do |button_name|
  pending # Write code here that turns the phrase into concrete actions
end

Then("I should see {string}") do |content|
  pending # Write code here that turns the phrase into concrete actions
end

The programmers then fill in the step definitions. The scenarios use these functions. Once a full, reasonable set of steps exists, any member of the team, technical or non-technical, can create new scenarios easily. Creating a new scenario could be 80-100% reuse and 0-20% new code.
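As a rough illustration of how near-English step text can end up calling code, here is a minimal sketch in plain Ruby. It is not how Cucumber is implemented: the ToyStepRegistry class and its define and run methods are hypothetical names invented for this example, only the {string} placeholder is handled, and the step text is passed without its Given/When/Then keyword.

```ruby
# Toy step registry: a simplified illustration, NOT Cucumber's real implementation.
class ToyStepRegistry
  def initialize
    @steps = [] # each entry is [regexp, block]
  end

  # Register a step pattern; every {string} placeholder captures a quoted value.
  def define(pattern, &block)
    source = Regexp.escape(pattern).gsub('\{string\}', '"([^"]*)"')
    @steps << [/\A#{source}\z/, block]
  end

  # Find the first pattern that matches the step text and call its block,
  # passing the captured values as arguments.
  def run(text)
    @steps.each do |regexp, block|
      match = regexp.match(text)
      return block.call(*match.captures) if match
    end
    raise ArgumentError, "Undefined step: #{text}"
  end
end

registry = ToyStepRegistry.new
filled = {} # stands in for a real form on a real page

registry.define('I fill in {string} with {string}') do |field, value|
  filled[field] = value
end

registry.run('I fill in "Title" with "Hello World!"')
registry.run('I fill in "Body" with "This is my first post!"')
```

Both calls to run reuse the single definition with different arguments, which is the kind of reuse that lets a team write new scenarios with little or no new code.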

The artifacts are powerful as an idea, but they don’t actually drive development without real process change.

BDD as a Process

It’s worth noting that BDD is a process - a way of driving software development. The story template and the Gherkin are just the artifacts, the permanent record. The process is a collaboration between development, test, and product owner, sometimes called the three amigos. That three amigos meeting is critical to Acceptance Test Driven Development, which BehaviorDriven.org describes as an evolutionary ancestor to BDD.

Instead of one person working alone to create given/when/then, a greater team comes together to discuss what the software will do. This discussion creates a shared understanding. All roles in development can contribute to the work. The programmer needs to understand what to build; the tester, an expert in miscommunication and details, can ask powerful questions to make sure the behavior as described matches the intent. The product owner, analyst, or business person can use the process as a second check, to prevent building the wrong thing.

The goal here is to improve first-time quality by building the right thing up-front and to reduce misunderstandings among the staff. The artifacts make it possible to run actual code, checking the application for high-level regressions on every build. That has value, but if all the staff does is discuss how the software will work and how it will be tested up-front, without creating or automating the artifacts, the results, while not BDD, might just be a step in the right direction.

To add more power, make the steps use a true domain language.

BDD as Domain Language

Pressing a button or typing into a text field is a good thing, but it is not a domain language. Domain Driven Design, the third “parent” of BDD, is about using specific, concrete terms that make it possible to share complex ideas easily.

This domain language is the language of the business. That doesn’t mean web servers, databases, APIs, HTML, CSS or JavaScript. It means customers, books, orders, catalogs, and invoices. We can think of invoices, for example, as an object, with data like invoice number, customer, due date, date created, line items, and so on. Invoices can also have actions, such as receive payment.
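To make the invoice example concrete, here is a minimal sketch of such a domain object in Ruby. The class, method, and field names (Invoice, LineItem, receive_payment, balance_due_cents, and so on) are illustrative assumptions based on the description above, not code from any real system, and storing amounts as whole cents is likewise an assumption.

```ruby
require 'date'

# Hypothetical domain objects for illustration; amounts are whole cents (an assumption).
LineItem = Struct.new(:description, :amount_cents)

class Invoice
  attr_reader :number, :customer, :due_date, :created_on, :line_items

  def initialize(number:, customer:, due_date:, created_on: Date.today, line_items: [])
    @number = number
    @customer = customer
    @due_date = due_date
    @created_on = created_on
    @line_items = line_items
    @paid_cents = 0
  end

  # Sum of all line items on the invoice.
  def total_cents
    line_items.sum(&:amount_cents)
  end

  # Domain action: record a payment against the invoice.
  def receive_payment(amount_cents)
    raise ArgumentError, 'payment must be positive' unless amount_cents.positive?

    @paid_cents += amount_cents
  end

  def balance_due_cents
    total_cents - @paid_cents
  end

  def paid?
    balance_due_cents <= 0
  end
end

invoice = Invoice.new(
  number: 'INV-1001',
  customer: 'Acme Books',
  due_date: Date.new(2024, 1, 31),
  line_items: [LineItem.new('Catalog printing', 12_000)]
)
invoice.receive_payment(5_000)
puts invoice.balance_due_cents # prints 7000
puts invoice.paid?             # prints false
```

With an object like this behind the step definitions, a step can read “the customer pays the invoice” rather than describing which buttons to press.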

Combining domain language with the artifacts makes for a powerful combination. The code will be clearer, shorter, and easier to reuse.

The domain language, the artifacts, and the process put together create a combination that is much more than the sum of its parts.

Why BDD Is Important

Spec files and scenarios create readable tests, tests that anyone can understand, from the business analyst to the business customer to the programmer. Not only can anyone read them, but anyone can extend them, changing the order of the flow, changing the variables that go into the function calls and the expected results. For example, once a simple login test is created, it is easy to extend that test to include incorrect passwords and user IDs. Once trained on what is possible, business users can “spec out” new steps, which will require a programmer to implement, with an incredible reuse rate.

Re-use and automated tests are only a part of what makes BDD so valuable. One of the core features of BDD is that it combines the specification and the test itself. That is, if the feature itself is defined in terms of “given / when / then”, that specification is also the test. One term for this is executable specification, or specification by example. Instead of a battery of tests that exhaustively demonstrate the feature, most teams come up with just enough examples to “flesh out” the feature. These examples are stored, run against new builds automatically, and can easily be recalled and changed.

Once the domain language and basic steps are in place, it is often possible to take a bug and turn it into a failing Gherkin scenario. Once the scenario exists (and fails), then “fixing the bug” becomes almost as easy as making the test pass -- without causing any other existing tests to fail.

Finally, by bringing all the key players of development together to create something with shared understanding, BDD prevents the “telephone game.” This acts as a sort of lubricant for the development process, reducing friction. When a question arises about whether something is a bug, the team can check the spec files to see if the behavior is expected or even defined -- then add a spec file to make the expected behavior clear.

Who Should be Practicing Behavior-Driven Development

The Context of BDD Tools

In BDD, the tests are described in prose English - WHEN this event happens AND this event happens THEN this event is expected. These appear on different lines. The previous generation, Acceptance Test Driven Development (ATDD), was more spreadsheet based: When this function is called and these inputs appear (each in a column) expect these results (each in a column). Rows before the tests could contain setup instructions, and rows after might contain cleanup.

Because of their input/transformation/output focus, the ATDD tools seem to be a better fit for transactional or batch systems with no user interface, or a very simple one. Describing how to navigate a complex GUI in a spreadsheet can be a challenge to debug. BDD tools like Cucumber/Gherkin, on the other hand, are in much more common use in systems with a user flow, such as e-commerce or social media. In those sorts of systems you can have a user click different buttons, or in a different order, and expect a different result.

BDD does support the idea of a table - that essentially the same given/when/then can run many times, with different inputs and expected results. Yet the Gherkin for complex transactional systems tends to result in a lot of words.
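The table idea can be sketched in plain Ruby, without any BDD framework: one given/when/then shape, run once per row of inputs and expected results. The withdraw function and its overdraft rule are hypothetical, borrowed from the ATM story earlier only to give the table something to check.

```ruby
# Hypothetical ATM rule used only for illustration:
# a withdrawal larger than the balance is refused and the balance is unchanged.
def withdraw(balance, amount)
  return balance if amount > balance

  balance - amount
end

# Each row is one example: the same scenario with different inputs and expectations.
examples = [
  { balance: 100, amount: 20,  expected: 80 },
  { balance: 100, amount: 100, expected: 0 },
  { balance: 100, amount: 120, expected: 100 }
]

results = examples.map do |row|
  # Given an account balance, When the customer withdraws an amount,
  # Then the remaining balance matches the expected value.
  actual = withdraw(row[:balance], row[:amount])
  row.merge(actual: actual, passed: actual == row[:expected])
end

results.each do |row|
  status = row[:passed] ? 'ok' : 'FAILED'
  puts "balance #{row[:balance]}, withdraw #{row[:amount]}: #{status}"
end
```

Adding a new case means adding a row, not writing a new scenario, which is why tables suit input/output-heavy behavior.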

When it comes to testing in a behavior-driven development process, there is a wide variety of automated testing tools that can work with Cucumber or Gherkin requirements to streamline your efforts. Automation is vital to the success of any team looking to implement BDD. Since behavior-driven development focuses on the end-user experience and requirements are 'scenario-focused,' BDD greatly impacts how unit testing and acceptance testing are implemented.

Testers need to write test cases while keeping the scenario in mind and not just the underlying code. The functions they are writing tests to validate need to match the BDD requirements being tested. With a larger focus on acceptance and unit testing in BDD, automation is a must. Unit tests can reach into the thousands depending on the scope of the project, and it could take teams days or weeks to work their way through testing without automation.

Challenges of BDD

Ask a dozen people to explain BDD, and you’ll likely get fifteen answers. Tell the team to “do” BDD and the analysts will likely run off and write a huge amount of Gherkin that is never tied to a test - and they’ll do it all by themselves. The programmers will write some RSpec or other unit-level tests and claim to be “done”; the testers will likely complain that all this work “isn’t really testing.” There is even some truth to that. BDD scenarios tend to be a few light, quick, valuable descriptions of system behavior, not an exhaustive check of everything. Companies that try to write BDD scenarios to cover all behavior quickly find themselves exhausted.

Then there is getting the scenarios to actually run. The implementation work typically needs to be done by production programmers, and will only happen if the programmers and analysts are truly collaborating.

While it looks a bit like English, Given/When/Then is not. Because it is executable, the format is actually computer code. The English it does resemble is that of a 17th-century legal proclamation of a feast day. Analysts and other team members who create BDD scenarios by themselves will be creating long, hard-to-read, badly written specifications.

In summary, BDD works when all the pieces -- artifacts, process, and domain language -- come together.

BDD Frameworks

The last few years have seen an explosion in BDD frameworks -- RSpec, EasyB, JBehave, and, by far the most popular, Cucumber. Popularity comes with the benefit of lots of support in the form of language bindings, integrations, and help resources like books, user groups, and conferences. There are Cucumber implementations in just about every language you would care about, like Java, Ruby, and PHP.

It is Cucumber in particular that is known for Gherkin. Gherkin is the language behind the scenes that makes it possible to use a human-readable format to create tests. Cucumber matches each step in the Gherkin file to code written for the framework, and that code performs the described set of behaviors.

Regardless of how simple things seem externally, there are still layers of frameworks and code that make the tool and checks come together.

How to Get Started with BDD

The place to start really depends on the problems of the organization. If there is a great deal of confusion about what to build, or communication errors are leading to defects, then start with the process and the domain language. If the team implements just the process, skipping the artifacts and automation to start, they can still see considerable gains. George Dinwiddie’s excellent article The Three Amigos goes into much more detail on how to do this effectively.

If the problem is not communication but regression defects, then the developers might consider a unit test tool like RSpec, implementing quality from the code level up.

Finally, if the team lacks a shared language, consider building a domain language for the company.

Once one piece is in place, look at the risks involved in adding the other pieces, and conduct some small experiments. For the scenarios, we recommend not just writing the given/when/then but making a policy that for future development “the story isn’t done until the BDD tests run.” Do this on future development - be reluctant to create tests for existing functionality. Over time, as the team adds new features, the old functionality will slowly be covered by new tests.

That is, we mean specs.

Or, to be clear: We mean both for the price of one.

Trying Out Behavior-Driven Development

In the next article on BDD, you'll continue to learn how to get started with the process and review some of the top automated testing tools you can use to efficiently scale your BDD efforts.

Errata

The sample given/when/then example above is derived from the workshop by teamtreehouse.com on behavior driven development with Cucumber.