Smoke Testing: What to Test and How to Automate It

A smoke test answers one question: is this build stable enough to test further?

You don't test every feature. You don't test edge cases. You run a handful of checks on the most basic functions (can the app launch, can a user log in, does the main screen load, does the primary workflow complete without crashing), and if any of those fail, you stop. The build goes back to development. No one wastes time running a 500-test regression suite on a build where the login page throws a 500 error.

The name comes from hardware engineering. When you power on a new circuit board for the first time, you watch for smoke. If smoke comes out, you turn it off. No further testing needed. The Wikipedia article on smoke testing traces this to a similar practice in plumbing, where pipes are filled with smoke to detect leaks. In software, the concept is identical: a fast, surface-level check before investing time in detailed testing.

What goes in a smoke test (and what doesn't)

This is where most smoke testing guides get vague. They say "test critical functionality" and leave you to figure out what that means. Here's a concrete filter.

Ask three questions about each potential test case:

If this fails, is the user blocked from doing anything at all? If the app doesn't launch, or the login screen crashes, or the database connection is down, nothing else matters. These are smoke test candidates.
Does every user hit this path? The checkout flow in an e-commerce app is used by every buyer. The admin panel's CSV export feature is used by three people. The checkout flow is a smoke test candidate. The CSV export is not.
Can this break because of a build/deployment issue? Smoke tests catch deployment problems: wrong environment variables, missing database migrations, broken API endpoints, misconfigured CDN paths. If a feature depends on infrastructure that can break during deployment, it's a smoke test candidate.

If a test case meets all three criteria, it belongs in your smoke suite. If it meets only one, it probably belongs in your regression suite instead.
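The filter can be written down as a small function. This is only a sketch: the `Candidate` fields and the `classify` name are made up for illustration, and treating anything short of three out of three as a regression test is an assumption, since the two-out-of-three case is a judgment call.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    blocks_all_use: bool        # if it fails, the user can't do anything
    hit_by_every_user: bool     # every user goes through this path
    deployment_sensitive: bool  # can break because of a build/deployment issue


def classify(c: Candidate) -> str:
    """Return 'smoke' only when all three filter questions point the same way."""
    answers = (c.blocks_all_use, c.hit_by_every_user, c.deployment_sensitive)
    # Assumption: anything short of three out of three goes to regression.
    return "smoke" if all(answers) else "regression"


login = Candidate("login", True, True, True)
csv_export = Candidate("admin CSV export", False, False, True)
print(classify(login))       # smoke
print(classify(csv_export))  # regression
```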

A real smoke test suite (e-commerce app example)

Here's what a smoke test suite looks like for a typical e-commerce application. Not hypothetical, not abstract. These are actual checks.

#	Test case	What it validates	Expected time
1	App/site loads without errors	Server is running, assets deployed, no 500 errors	2s
2	Homepage renders with product listings	Database connection works, API returns data	3s
3	User can log in with valid credentials	Authentication service is up, session management works	3s
4	Search returns results for a known term	Search index is connected, query pipeline works	2s
5	User can add a product to cart	Cart service works, product data is valid	3s
6	Checkout page loads with cart items	Cart-to-checkout handoff works, pricing engine responds	3s
7	Payment form accepts test card data	Payment gateway integration is configured	4s
8	Order confirmation page displays	Order creation pipeline works end to end	3s

Total: 8 test cases, ~23 seconds.

That's a smoke suite. Eight tests. Under 30 seconds. If any of these fail, the build is broken in a way that makes further testing pointless. If all pass, the build is stable enough for the full test plan to run.

Notice what's NOT in this suite: password reset flow, guest checkout, coupon codes, product reviews, wishlist functionality, admin dashboard, email notifications, multi-language support. Those are all real features that need testing. But they're not smoke tests. A user can still complete the core purchase flow without them.

Smoke testing vs sanity testing vs regression testing

These three terms get confused constantly. Here's the short version.

Smoke testing runs after a new build to check whether the build is stable enough for any testing at all. It's broad (covers many features) but shallow (only basic checks per feature). It runs first. If it fails, nothing else runs.

Sanity testing runs after a specific bug fix or feature change to check whether that fix actually works and didn't break closely related features. It's narrow (focuses on the changed area) but deeper than smoke. It runs after smoke passes.

Regression testing runs the full test suite to verify that nothing previously working is now broken. It's both broad and deep. It takes the longest. It runs after smoke and sanity pass.

The sequence is: smoke → sanity → regression. Each one is a gate. If smoke fails, you don't run sanity. If sanity fails, you don't run regression. For a deeper comparison between the first two, see our smoke testing vs sanity testing guide.
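The gating logic is simple enough to sketch in a few lines. The `run_gates` function and the lambda stages below are placeholders invented for illustration; in a real pipeline each stage would invoke the corresponding test suite.

```python
from typing import Callable, List, Tuple


def run_gates(stages: List[Tuple[str, Callable[[], bool]]]) -> List[str]:
    """Run stages in order, stop at the first failure, return the stages that ran."""
    ran = []
    for name, stage in stages:
        ran.append(name)
        if not stage():
            break  # a failed gate means later stages never start
    return ran


# Placeholder stages: smoke passes, sanity fails, so regression never runs.
stages = [
    ("smoke", lambda: True),
    ("sanity", lambda: False),
    ("regression", lambda: True),
]
print(run_gates(stages))  # ['smoke', 'sanity']
```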

How to automate smoke tests

Manual smoke testing works for small teams and infrequent releases. You open the app, click through the critical paths, and confirm everything loads. It takes 10-15 minutes.

But if you're releasing daily (or multiple times a day), manual smoke testing becomes a bottleneck. Automating the smoke suite and running it as a CI/CD pipeline gate is the standard approach for teams practicing continuous delivery.

The pattern:

A developer pushes code
The CI server builds the application
The application deploys to a staging environment
The automated smoke suite runs against staging
If smoke passes → trigger the full regression suite
If smoke fails → notify the team, block deployment, stop further testing
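The last two steps of that pattern usually come down to an exit code, because CI servers generally treat a non-zero exit code as a failed step. Here is a minimal sketch, assuming the smoke results have already been collected into a dictionary. The `smoke_gate` name, the result keys, and the `notify` callback are all hypothetical.

```python
from typing import Callable, Dict


def smoke_gate(results: Dict[str, bool], notify: Callable[[str], None]) -> int:
    """Map smoke results to an exit code: 0 lets the pipeline continue, 1 blocks it."""
    failed = [name for name, passed in results.items() if not passed]
    if failed:
        notify("Smoke failed, deployment blocked: " + ", ".join(failed))
        return 1
    return 0


# Hypothetical results; a real run would collect these from the smoke suite.
results = {"site loads": True, "login": False, "checkout loads": True}
exit_code = smoke_gate(results, notify=print)
print(exit_code)  # 1
```

In a pipeline script you would pass that value to `sys.exit()` so the CI server stops before the regression stage.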

What makes a good automated smoke test:

Fast. The entire suite should finish in under 2 minutes. If your smoke suite takes 10 minutes, it's too big. Move tests to the regression suite.
Stable. Smoke tests should not be flaky. A smoke test that fails randomly teaches the team to ignore smoke failures, which defeats the purpose. If a smoke test is flaky, fix it or remove it.
Independent. Each test should run without depending on the result of another test. If test #3 requires test #2 to have run first, and test #2 fails, you've lost the ability to know whether test #3's feature works.
Maintained. When a feature changes, the smoke test for that feature needs updating. A smoke test that checks for a button label that was changed three releases ago is testing nothing useful.
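One rough way to spot a flaky check is to run it several times against the same build and see whether the outcomes disagree. The sketch below assumes each check is a plain function returning True or False. The `is_flaky` helper and the default of 5 runs are illustrative choices, not a standard.

```python
from itertools import cycle
from typing import Callable


def is_flaky(check: Callable[[], bool], runs: int = 5) -> bool:
    """Return True when repeated runs of the same check disagree with each other."""
    outcomes = {bool(check()) for _ in range(runs)}
    return len(outcomes) > 1


# Stand-in checks: one always passes, one alternates between pass and fail.
alternating = cycle([True, False])
print(is_flaky(lambda: True))               # False
print(is_flaky(lambda: next(alternating)))  # True
```

A check that fails every time is not flaky by this definition. It is consistently broken, which is exactly what a smoke test is supposed to report.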

Common mistakes

Making the smoke suite too big

A team starts with 10 smoke tests. Over six months, the suite grows to 85. Every new feature gets a smoke test "just in case." The suite now takes 12 minutes. It's no longer a quick gate. It's a slow regression suite with a misleading name. The fix: set a hard rule. The smoke suite stays under 20 tests and under 2 minutes. Anything beyond that goes into regression.
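A hard rule is easier to keep when something checks it automatically. This sketch assumes you can get per-test durations in seconds from your test runner. The `budget_violations` function and the constant names are hypothetical.

```python
from typing import List

MAX_TESTS = 20      # the suite stays under 20 tests
MAX_SECONDS = 120   # and under 2 minutes


def budget_violations(durations: List[float]) -> List[str]:
    """Return the rules a suite breaks; an empty list means it is within budget."""
    problems = []
    if len(durations) >= MAX_TESTS:
        problems.append(f"{len(durations)} tests (limit: under {MAX_TESTS})")
    total = sum(durations)
    if total >= MAX_SECONDS:
        problems.append(f"{total:.0f}s total (limit: under {MAX_SECONDS}s)")
    return problems


ecommerce = [2, 3, 3, 2, 3, 3, 4, 3]  # timings from the e-commerce table
bloated = [720 / 85] * 85             # 85 tests spread over 12 minutes
print(budget_violations(ecommerce))     # []
print(len(budget_violations(bloated)))  # 2
```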

No clear pass/fail criteria

"Check that the homepage looks correct" is not a smoke test. What does "correct" mean? Does the hero image need to load? Does the navigation menu need 6 items? Does the footer need to be visible? A smoke test needs a binary outcome: it passes or it fails. "Homepage returns HTTP 200 and contains at least one product listing" is a smoke test.

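As a sketch, a binary homepage check reduces to a single boolean expression. The `homepage_ok` name and the `product-card` marker are assumptions made for illustration; use whatever string or selector reliably identifies a product listing in your own markup.

```python
def homepage_ok(status: int, body: str, marker: str = 'class="product-card"') -> bool:
    """Pass only if the response is HTTP 200 and contains at least one product listing."""
    # Assumption: each product listing renders an element carrying this marker.
    return status == 200 and marker in body


print(homepage_ok(200, '<div class="product-card">Shoes</div>'))  # True
print(homepage_ok(500, '<div class="product-card">Shoes</div>'))  # False
print(homepage_ok(200, "<html><body></body></html>"))             # False
```
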
Running smoke tests against the wrong environment

Smoke tests validate a deployed build. Running them against a local development server catches local bugs, not deployment bugs. The whole point of smoke testing is catching problems that happen during the build-and-deploy process: wrong configurations, missing environment variables, failed database migrations. Run smoke tests against the environment you're validating (staging or production).
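One way to guard against this mistake is to make the suite refuse to run without an explicit, non-local target. The sketch below reads the target from an environment variable. The `SMOKE_BASE_URL` variable name, the `smoke_target` function, and the `staging.example.com` host are all assumptions for illustration.

```python
import os
from urllib.parse import urlparse

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def smoke_target(env=None) -> str:
    """Return the base URL to smoke test, refusing missing or local targets."""
    env = os.environ if env is None else env
    url = env.get("SMOKE_BASE_URL", "")  # hypothetical variable name
    host = urlparse(url).hostname
    if not host:
        raise ValueError("SMOKE_BASE_URL is not set to a valid URL")
    if host in LOCAL_HOSTS:
        raise ValueError(f"refusing to smoke test a local server: {url}")
    return url.rstrip("/")


print(smoke_target({"SMOKE_BASE_URL": "https://staging.example.com/"}))
# https://staging.example.com
```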

Skipping smoke tests because "we have good unit tests"

Unit tests verify that individual functions work correctly. They don't verify that those functions work together after deployment. You can have 100% unit test coverage and still ship a build where the login page returns a blank screen because a CSS file didn't deploy. Smoke tests catch integration and deployment failures that unit tests can't. These are different testing layers, and your test automation strategy needs both.

When smoke testing matters most

After every deployment. Whether it's staging or production, run the smoke suite. Automated deployments should include smoke as a built-in step, not an afterthought.

After infrastructure changes. Database migrations, CDN updates, SSL certificate renewals, server scaling events. Any infrastructure change can break the application in ways that only a deployed smoke test can catch.

After dependency updates. Updating a framework version, a third-party SDK, or an API client can introduce subtle incompatibilities. A smoke test won't catch all of them, but it will catch the ones that break core functionality.

Before a major QA cycle. If your QA team is about to spend a week on manual regression testing, run the smoke suite first. There's nothing worse than finding on day 3 that the build was fundamentally broken from the start.

FAQ

What is smoke testing in software testing?

Smoke testing is a quick check that verifies whether a new software build's core functions work. It runs before any other testing. If the smoke test fails, the build goes back to development. The term comes from hardware engineering, where engineers watch for literal smoke when powering on a new device.

How many test cases should a smoke test have?

Most smoke suites have 10-20 test cases covering login, navigation, primary workflows, and key integrations. The full suite should finish in under 2 minutes. If it takes longer, some tests should move to the regression suite instead.

What is the difference between smoke testing and regression testing?

Smoke testing is fast and shallow, checking if the build is stable enough to test at all. Regression testing is thorough and deep, verifying that no previously working features broke. Smoke runs first (under 2 minutes). Regression runs after smoke passes (can take hours).

Can smoke tests be automated?

Yes, and they should be for teams releasing frequently. Automated smoke tests run as a gate in the CI/CD pipeline: if smoke fails, deployment stops and the team gets notified. Selenium, Cypress, Playwright, and most test automation frameworks support smoke suite configuration.

Who performs smoke testing?

In most teams, automated smoke tests run without human involvement as part of the CI/CD pipeline. For manual smoke testing, either developers or QA engineers run the checks. Some teams rotate the responsibility so everyone stays familiar with core user paths.

Should smoke tests run in production?

Yes, as a post-deployment verification. A "production smoke test" runs immediately after a production deployment to confirm the release didn't break anything. It should use synthetic test accounts, not real user data. If it fails, the team can roll back before most users are affected.
