What is BDD Testing? A Guide to Behavior-Driven Development on Mobile

BDD (behavior-driven development) is a way of writing software tests in plain language so that everyone on the team, not just developers, can read and understand them. Instead of writing test code that only engineers can parse, you write scenarios in structured English that describe how the app should behave from the user's perspective.

The format is Given-When-Then, and it looks like this:

    
```gherkin
Feature: Checkout with promo code

  Scenario: Valid promo code applies discount
    Given the user has a $50 item in the cart
    When the user enters promo code "SAVE20"
    And taps "Apply"
    Then the total should update to $40
    And the discount line should show "-$10.00"
```

That scenario is readable by a product manager, a QA engineer, and a developer. Everyone agrees on what the app should do before anyone writes code. That's the core idea behind BDD: define the expected behavior in a shared language, then build and test against it.

Dan North introduced BDD in 2003 as an evolution of test-driven development (TDD). His argument was that TDD's language ("test," "assert," "expect") confused teams about what they were testing and why. BDD reframes the conversation around behavior ("given this situation, when this happens, then this should be the result"), which maps directly to user stories and acceptance criteria.

BDD vs TDD: the actual difference

TDD says: write a test before you write the code. The test defines what the code should do. You write the minimum code to pass the test, then refactor.

BDD says: write a scenario that describes the behavior from the user's perspective before you write the code. The scenario defines what the user should experience. The test is derived from the scenario.

In practice:

TDD test (unit level):

Perspective: Developer (code level)
Language: Programming language and its test framework (Jest, pytest, JUnit)
Scope: Single function or class
Audience: Developers only
Example: "expect(applyDiscount(5000, 20)).toBe(4000)"

BDD scenario (behavior level):

Perspective: User (behavior level)
Language: Gherkin (Given-When-Then)
Scope: Full user flow or feature
Audience: Everyone (PM, QA, dev, stakeholder)
Example: "Given the user has a $50 item... Then the total should be $40"

The TDD test validates a function. The BDD scenario validates a user experience. TDD operates at the code level. BDD operates at the feature level. Both are useful. TDD catches logic bugs in individual functions. BDD catches flow bugs where the user's journey doesn't match the expected behavior.
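To make the unit-level side concrete, here is a minimal sketch of the TDD example. The `applyDiscount` implementation and its convention of working in integer cents are assumptions for illustration. It uses Node's built-in `assert` module instead of Jest so it runs without extra dependencies.

```javascript
const { strictEqual } = require('node:assert');

// Hypothetical implementation: amounts are integer cents, percent is 0-100.
function applyDiscount(amountCents, percent) {
  if (percent < 0 || percent > 100) {
    throw new RangeError('percent must be between 0 and 100');
  }
  return amountCents - Math.round((amountCents * percent) / 100);
}

// The assertion a TDD workflow would write first: $50.00 with 20% off is $40.00.
strictEqual(applyDiscount(5000, 20), 4000);
```

Nothing in this test knows about screens, buttons, or the user's journey. That is exactly the gap a behavior-level scenario is meant to cover.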

How BDD works in practice (and where it gets complicated)

The BDD workflow has three phases. Cucumber's documentation calls them Discovery, Formulation, and Automation.

Discovery. The team (PM, developer, QA) discusses a feature and agrees on examples of how it should behave. "What happens if the promo code is expired? What if the cart is empty? What if two promo codes are applied?" These conversations uncover edge cases before any code is written.

Formulation. The agreed examples are written as Gherkin scenarios in .feature files. Each scenario follows the Given-When-Then structure. These become living documentation of the app's behavior.

Automation. Each Gherkin step is mapped to a step definition in code. The step definition is the glue that connects the plain-English scenario to actual test code.
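To show what "glue" means mechanically, here is a rough sketch of a step registry. It is an illustration only, not Cucumber's actual implementation. The names `defineStep` and `runStep` are made up for this example, and `{string}` is simplified to match double-quoted values only.

```javascript
// Minimal step registry: each entry pairs a compiled pattern with a handler.
const steps = [];

// Turn a pattern like 'the user enters promo code {string}' into a RegExp
// in which each {string} captures a double-quoted value.
function defineStep(pattern, handler) {
  const escaped = pattern.replace(/[.*+?^$()|[\]\\]/g, '\\$&');
  const source = escaped.replace(/\{string\}/g, '"([^"]*)"');
  steps.push({ regex: new RegExp(`^${source}$`), handler });
}

// Strip the Gherkin keyword, find the first matching definition, and run it.
function runStep(line, world) {
  const text = line.trim().replace(/^(Given|When|Then|And|But)\s+/, '');
  for (const { regex, handler } of steps) {
    const match = regex.exec(text);
    if (match) return handler(world, ...match.slice(1));
  }
  throw new Error(`Undefined step: ${text}`);
}

defineStep('the user enters promo code {string}', (world, code) => {
  world.promoCode = code; // a real step definition would drive the app here
});

const world = {};
runStep('When the user enters promo code "SAVE20"', world);
console.log(world.promoCode); // SAVE20
```

The scenario text is only a lookup key. Everything the test actually does lives in the handler.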

Here's where it gets complicated on mobile:

    
```javascript
const { When } = require('@cucumber/cucumber');

// Step definition for: When the user enters promo code "SAVE20"
// `driver` is assumed to be the Appium/WebdriverIO session provided by the test runner.
When('the user enters promo code {string}', async (code) => {
  const promoField = await driver.$('~promo-code-input');
  await promoField.setValue(code);

  const applyButton = await driver.$('~apply-promo-button');
  await applyButton.click();
});
```

That step definition uses Appium selectors (~promo-code-input, ~apply-promo-button). Those selectors break when the developer renames the accessibility ID, restructures the view hierarchy, or changes the screen layout. The Gherkin scenario stays readable. The step definition underneath it rots.

This is BDD's structural weakness on mobile: you maintain two layers instead of one. The Gherkin scenarios (readable, stable) and the step definitions (code, selector-dependent, fragile). When the UI changes, the Gherkin still makes sense. The step definitions break. Someone has to open Appium Inspector, find the new selector, update the step definition, and re-run the test. The beautiful plain-English scenario is a facade over the same selector maintenance problem every mobile testing approach has.
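The coupling can be shown without a device. The sketch below is a simplified stand-in, not Appium. The `findByAccessibilityId` helper and the two screen objects are hypothetical, and the only thing that changes between them is the ID string.

```javascript
// Hypothetical lookup that mimics finding an element by accessibility ID.
function findByAccessibilityId(screen, id) {
  const element = screen.elements[id];
  if (!element) {
    throw new Error(`No element with accessibility ID "${id}"`);
  }
  return element;
}

// The step body is coupled to the ID string, not to what the user sees.
function enterPromoCode(screen, code) {
  findByAccessibilityId(screen, 'promo-code-input').value = code;
}

// Same visible field in both builds; only the accessibility ID was renamed.
const lastSprint = { elements: { 'promo-code-input': { label: 'Promo code', value: '' } } };
const thisSprint = { elements: { 'promo-input': { label: 'Promo code', value: '' } } };

enterPromoCode(lastSprint, 'SAVE20'); // works

try {
  enterPromoCode(thisSprint, 'SAVE20');
} catch (err) {
  console.log(err.message); // No element with accessibility ID "promo-code-input"
}
```

The user-facing label is identical in both builds, and the scenario text would not change by a single word. The step still fails.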

And there's a second problem most BDD guides won't mention: in most teams, the product manager never reads the Gherkin files. The scenarios are written by QA engineers, reviewed by developers, and maintained by QA. The "shared language" promise of BDD works in theory. In practice, the Gherkin becomes overhead that a technical team maintains for a non-technical audience that doesn't look at it.

What if the plain English was the test itself?

Drizz's authoring model is built on the same principle as BDD: describe what the app should do in plain English, and the test executes it. But it removes the Gherkin layer and the step definitions entirely.

In Drizz, the test IS the plain English:

Tap on the promo code field

Type "SAVE20"

Tap "Apply"

Validate total shows "$40.00"

Validate discount line shows "-$10.00"

There are no step definitions. There are no selectors. Vision AI reads the screen and finds the promo code field visually. It taps "Apply" by recognizing the button text, not by querying an accessibility ID. If the developer renames the field or moves the button, the test doesn't break because it was never coupled to the implementation.

The test runs on real Android and iOS devices. The popup agent handles OEM dialogs. Self-healing adapts when layouts shift. Teams go from 15 tests per month to 200, with flakiness at ~5%.

This is BDD's promise fulfilled without BDD's overhead. The plain English IS the executable test. No Gherkin to maintain. No step definitions to update. No gap between what the scenario says and what the test actually does.

When traditional BDD still makes sense

BDD with Gherkin and Cucumber (or SpecFlow, or Behave) still works well in specific situations:

When stakeholders genuinely participate. If your product manager actively reads, writes, and reviews Gherkin scenarios as part of sprint planning, the shared-language benefit is real. The scenarios become living documentation that the whole team references.

When your app has stable, well-identified UI elements. If your app has consistent accessibility IDs that don't change between builds, step definitions stay stable and the maintenance burden is manageable.

When you're testing APIs, not UI. BDD scenarios for API testing don't depend on selectors. "When I send a POST to /checkout with a valid promo code, then the response total should be $40" maps directly to HTTP calls with no UI fragility.
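As a rough sketch of why the API case is less fragile, the example below walks the same promo-code behavior through Given, When, and Then with no selectors at all. The `postCheckout` function is a hypothetical in-memory stand-in for the endpoint so the example runs without a server. The request and response shapes are assumptions too.

```javascript
const { strictEqual } = require('node:assert');

// Hypothetical in-memory stand-in for POST /checkout.
// A real step definition would issue the HTTP request instead.
function postCheckout(body) {
  const promoPercents = { SAVE20: 20 };
  const percent = promoPercents[body.promoCode] ?? 0;
  const subtotalCents = body.items.reduce((sum, item) => sum + item.priceCents, 0);
  const discountCents = Math.round((subtotalCents * percent) / 100);
  return { status: 200, body: { totalCents: subtotalCents - discountCents } };
}

// Given: a cart with a $50 item and a valid promo code
const request = { items: [{ priceCents: 5000 }], promoCode: 'SAVE20' };

// When: the request is sent to the checkout endpoint
const response = postCheckout(request);

// Then: the response total should be $40
strictEqual(response.status, 200);
strictEqual(response.body.totalCents, 4000);
```

The assertions depend only on the request and response contract, which tends to change far less often than a screen layout.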

On mobile, where the UI changes every sprint, selectors break across OEM skins, and the device matrix multiplies maintenance, plain-English testing without the Gherkin/step-definition layer is a more practical path to the same goal.

FAQ

What does BDD stand for?

Behavior-driven development. It's a software development approach where tests are written in plain language (usually Given-When-Then format) that describes how the app should behave from the user's perspective.

What is the difference between BDD and TDD?

TDD writes tests for code (functions, classes). BDD writes scenarios for behavior (user flows, features). TDD uses a programming language. BDD uses structured English (Gherkin). Both write tests before code, but at different levels.

What is Gherkin in BDD?

Gherkin is the structured language used to write BDD scenarios. It uses the Given-When-Then format. Gherkin files are readable by non-technical stakeholders but require step definitions (code) to execute.

What are common BDD testing tools?

Cucumber (Java, Ruby, JS), SpecFlow (.NET), Behave (Python), and Karate (API testing). On mobile, Appium + Cucumber is the standard pairing. Drizz achieves the same plain-English testing without requiring Gherkin or step definitions.

Is BDD the same as acceptance testing?

BDD scenarios often serve as acceptance tests, but they're not the same thing. Acceptance testing is a testing phase. BDD is a development methodology. BDD scenarios can function as acceptance criteria, but BDD also includes the discovery and collaboration process.

Can BDD work for mobile app testing?

Yes, but the step definitions (the code behind Gherkin scenarios) depend on selectors that break across devices and builds. For mobile teams, plain-English testing tools like Drizz deliver the same behavior-first approach without the selector maintenance.
