Best practices for Scout tests
This guide covers best practices for writing Scout UI and API tests that are reliable, maintainable, and fast.
Scout is built on Playwright, so the official Playwright Best Practices apply.
New to Scout? Start with our Scout introduction page.
Best practices that apply to both UI and API tests.
Scout is deployment-agnostic: write once, run locally and on Elastic Cloud.
- Every suite must have deployment tags. Use tags to target the environments where your tests apply (for example, a feature that only exists in stateful deployments).
- Within a test, avoid relying on configuration, data, or behavior specific to a single deployment. Test logic should produce the same result locally and on Cloud.
- Run your tests against a real Elastic Cloud project before merging to catch environment-specific surprises early. See Run tests on Elastic Cloud for setup instructions.
When a feature is gated behind a flag, enable it at runtime with apiServices.core.settings() rather than creating a custom server config. Runtime flags work locally and on Cloud, don’t require a server restart, and avoid the CI cost of a dedicated server instance.
For the full guide (including when a custom server config is unavoidable), see Feature flags.
When you add new tests, fix flakes, or make significant changes, run the same tests multiple times to catch flakiness early. A good starting point is 20–50 runs.
Prefer doing this locally first (faster feedback), and use the Flaky Test Runner in CI when needed. See Debug flaky tests for guidance.
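To get a feel for why 20–50 runs is a reasonable starting point, the sketch below estimates the chance of seeing at least one failure across repeated runs. It assumes every run fails independently with the same probability, which real flaky tests only approximate. The function name is illustrative and not part of Scout.

```typescript
// Probability of observing at least one failure in `runs` repetitions,
// assuming every run fails independently with probability `failureRate`.
function chanceOfCatchingFlake(failureRate: number, runs: number): number {
  if (failureRate < 0 || failureRate > 1) {
    throw new RangeError('failureRate must be between 0 and 1');
  }
  if (!Number.isInteger(runs) || runs < 0) {
    throw new RangeError('runs must be a non-negative integer');
  }
  return 1 - Math.pow(1 - failureRate, runs);
}

// A test that fails 5% of the time:
const after20 = chanceOfCatchingFlake(0.05, 20); // roughly 0.64
const after50 = chanceOfCatchingFlake(0.05, 50); // roughly 0.92
console.log(after20.toFixed(2), after50.toFixed(2));
```

Under this model, a handful of runs can easily miss a low-rate flake, while 50 runs will usually surface it.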
- Keep one top-level suite per file (test.describe).
- Avoid nested describe blocks. Use test.step for structure inside a test.
- Don’t rely on test file execution order (it’s not guaranteed).
Test names should read like a sentence describing expected behavior. Clear names make failures self-explanatory and test suites scannable.
Examples
❌ Don’t:
test('test 1', async ({ page }) => {
/* ... */
});
test('works correctly', async ({ page }) => {
/* ... */
});
❌ Don’t: use variables or template literals in test titles as they look opaque in stack traces and test reports:
test(`handles ${dataView.title} correctly`, async ({ page }) => {
/* ... */
});
✔️ Do:
test('viewer can see dashboard but cannot edit', async ({ page }) => {
/* ... */
});
test('returns 403 when missing read privilege', async ({ apiClient }) => {
/* ... */
});
Prefer “one role + one flow per file” and keep spec files small (roughly 4–5 short tests or 2–3 longer ones). The test runner balances work at the spec-file level, so oversized files become bottlenecks during parallel execution. Put shared login/navigation in beforeEach.
Example
// dashboard_viewer.spec.ts
test.beforeEach(async ({ browserAuth, pageObjects }) => {
await browserAuth.loginAsViewer();
await pageObjects.dashboard.goto();
});
test('can see dashboard', async ({ page }) => {
// assertions...
});
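The bottleneck effect of an oversized spec file can be illustrated with a small scheduling model. This is a simplified sketch: it assumes a greedy "longest file first, least-loaded worker" assignment, which may differ from what the real runner does, and the durations are made up.

```typescript
// Simplified model: each spec file goes to the currently least-loaded worker,
// longest files first. The real runner's scheduling may differ.
function estimateWallClock(fileMinutes: number[], workers: number): number {
  if (!Number.isInteger(workers) || workers < 1) {
    throw new RangeError('workers must be a positive integer');
  }
  const load = new Array<number>(workers).fill(0);
  for (const minutes of [...fileMinutes].sort((a, b) => b - a)) {
    const idlest = load.indexOf(Math.min(...load));
    load[idlest] += minutes;
  }
  // Wall-clock time is set by the busiest worker.
  return Math.max(...load);
}

// One 12-minute spec plus four 2-minute specs on 4 workers.
const oversized = estimateWallClock([12, 2, 2, 2, 2], 4);
// The same 20 minutes of tests split into ten 2-minute specs.
const balanced = estimateWallClock(new Array<number>(10).fill(2), 4);
console.log({ oversized, balanced }); // { oversized: 12, balanced: 6 }
```

In this model the total amount of test work is identical, but the single large file keeps one worker busy while the others sit idle.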
If many files share the same “one-time” work (archives, API calls, settings), move it to a global setup hook.
Example
globalSetupHook('Load shared test data (if needed)', async ({ esArchiver, log }) => {
log.debug('[setup] loading archives (only if indexes do not exist)...');
await esArchiver.loadIfNeeded(MY_ARCHIVE);
});
It’s common for test suites to load Elasticsearch or Kibana archives that are barely used (or not used at all). Unused archives slow down setup, waste resources, and make it harder to understand what a test actually depends on. Check that your tests ingest only the data they actually need.
Use esArchiver.loadIfNeeded(), which skips ingestion if the index and documents already exist (useful when multiple suites share the same data).
Examples
❌ Don’t: load archives that no test in the suite relies on:
test.beforeAll(async ({ esArchiver }) => {
await esArchiver.loadIfNeeded('large_metrics_archive');
await esArchiver.loadIfNeeded('user_actions_archive');
});
test('shows metrics dashboard', async ({ page }) => {
// only uses large_metrics_archive — user_actions_archive is never referenced
});
✔️ Do: load only what the suite needs:
test.beforeAll(async ({ esArchiver }) => {
await esArchiver.loadIfNeeded('large_metrics_archive');
});
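The "skip if already present" behavior can be pictured with a small in-memory model. This only illustrates the idempotent-load idea: InMemoryArchiveLoader, its fields, and the document count are hypothetical and say nothing about how esArchiver is actually implemented.

```typescript
type LoadResult = 'loaded' | 'skipped';

// Hypothetical stand-in for an archive loader; not the real esArchiver.
class InMemoryArchiveLoader {
  private readonly archives: Record<string, number>; // archive name -> docs it contains
  private readonly indices = new Map<string, number>(); // index name -> docs ingested

  constructor(archives: Record<string, number>) {
    this.archives = archives;
  }

  loadIfNeeded(archive: string): LoadResult {
    const docCount = this.archives[archive];
    if (docCount === undefined) {
      throw new Error(`Unknown archive: ${archive}`);
    }
    // Skip ingestion when the index already holds documents.
    if ((this.indices.get(archive) ?? 0) > 0) {
      return 'skipped';
    }
    this.indices.set(archive, docCount);
    return 'loaded';
  }
}

const loader = new InMemoryArchiveLoader({ large_metrics_archive: 5000 });
const firstSuite = loader.loadIfNeeded('large_metrics_archive'); // 'loaded'
const secondSuite = loader.loadIfNeeded('large_metrics_archive'); // 'skipped'
console.log(firstSuite, secondSuite);
```

The second call is cheap, which is why suites that share data can each declare the archive they depend on without paying for ingestion twice.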
Cleanup in the test body doesn’t run after a failure. Prefer afterEach / afterAll.
Examples
❌ Don’t: put cleanup at the end of the test body (it’s skipped if the test fails):
test('creates and deletes index', async ({ esClient }) => {
await esClient.indices.create({ index: testIndexName });
// ... assertions ...
await esClient.indices.delete({ index: testIndexName }); // skipped on failure!
});
✔️ Do: use hooks so cleanup always runs:
test.afterEach(async ({ esClient, log }) => {
try {
await esClient.indices.delete({ index: testIndexName });
} catch (e: any) {
log.debug(`Index cleanup failed: ${e.message}`);
}
});
Tests should be clean and declarative. If a helper might return an expected error (for example, 404 during cleanup), the helper should handle it internally, for example by accepting an ignoreErrors option or treating a 404 during deletion as a success.
Examples
❌ Don’t: catch errors in the test:
test.afterAll(async ({ apiServices }) => {
try {
await apiServices.cases.delete(caseId);
} catch {
// might already be deleted
}
});
✔️ Do: let the helper handle expected errors:
test.afterAll(async ({ apiServices }) => {
await apiServices.cases.cleanup.deleteAllCases();
});
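One way a helper might absorb an expected error is sketched below. HttpError, CasesClient, and deleteCaseIfExists are hypothetical names invented for this illustration; they are not part of Scout or apiServices.

```typescript
// Hypothetical error type and client interface, used only for this sketch.
class HttpError extends Error {
  readonly statusCode: number;

  constructor(statusCode: number, message: string) {
    super(message);
    Object.setPrototypeOf(this, HttpError.prototype);
    this.name = 'HttpError';
    this.statusCode = statusCode;
  }
}

interface CasesClient {
  delete(id: string): Promise<void>;
}

// Deletes a case and treats "already gone" (404) as success.
// Any other failure is rethrown so real problems stay visible.
async function deleteCaseIfExists(client: CasesClient, id: string): Promise<boolean> {
  try {
    await client.delete(id);
    return true; // deleted by this call
  } catch (error) {
    if (error instanceof HttpError && error.statusCode === 404) {
      return false; // nothing left to delete
    }
    throw error;
  }
}
```

With a helper like this, the hook body is a single awaited call, and only unexpected errors (for example a 500) fail the cleanup.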
If a value is reused across suites (archive paths, fixed time ranges, endpoints, common headers), extract it into a shared constants.ts file. This reduces duplication and typos, and makes updates safer.
Example
// test/scout/ui/constants.ts
export const LENS_BASIC_TIME_RANGE = {
from: 'Sep 22, 2015 @ 00:00:00.000',
to: 'Sep 23, 2015 @ 00:00:00.000',
};
export const DASHBOARD_SAVED_SEARCH_ARCHIVE =
'src/platform/test/functional/fixtures/kbn_archiver/dashboard/current/kibana';
export const DASHBOARD_DEFAULT_INDEX_TITLE = 'logstash-*';
// test/scout/api/constants.ts
export const COMMON_HEADERS = {
'kbn-xsrf': 'some-xsrf-token',
'x-elastic-internal-origin': 'kibana',
'Content-Type': 'application/json;charset=UTF-8',
} as const;
Avoid admin unless there’s no alternative. Minimal permissions catch real permission bugs and keep tests realistic. Also test the forbidden path: verify that an under-privileged role receives 403 for endpoints it shouldn’t access.
See browser authentication and API authentication.
Examples
❌ Don’t: default to admin for convenience:
test.beforeEach(async ({ browserAuth }) => {
await browserAuth.loginAsAdmin();
});
✔️ Do: use a built-in role when it fits (viewer, editor, etc.), or create a custom one for tighter scoping:
// built-in role
await browserAuth.loginAsViewer();
// custom role for finer-grained control
await browserAuth.loginWithCustomRole('logs_analyst', {
elasticsearch: {
indices: [{ names: ['logs-*'], privileges: ['read'] }],
},
kibana: [{ spaces: ['*'], base: [], feature: { discover: ['read'] } }],
});
Tip: extend browserAuth for repeated roles
If the same custom role appears in many specs, extract it into a browserAuth fixture extension instead of repeating the role descriptor everywhere. Tests then read like intent:
// in your plugin's fixtures/index.ts
await use({
...browserAuth,
loginAsPlatformEngineer: () =>
browserAuth.loginWithCustomRole('platform_engineer', roleDescriptor),
});
// in specs
await browserAuth.loginAsPlatformEngineer();
For setup details, see Reuse role helpers.
Best practices specific to UI tests.
Default to parallel UI suites when possible. Parallel workers share the same Kibana/ES deployment, but run in isolated Spaces.
| Mode | When to use |
|---|---|
| Parallel | UI tests (most suites), suites that share pre-ingested data (often using the global setup hook) |
| Sequential | API tests, suites that require a “clean” Elasticsearch state |
UI tests should answer “does this feature work for the user?” Verify that components render, respond to interaction, and navigate correctly. Leave exact data validation (computed values, aggregation results, edge cases) to API or unit tests, which are faster and less brittle.
| What you’re testing | Recommended layer |
|---|---|
| User flows, navigation, rendering | Scout UI test |
| Data correctness, API contracts, edge cases | Scout API test |
| Isolated component logic (loading/error states, tooltips, field validation) | RTL/Jest unit test |
Examples
❌ Don’t: verify computed values that belong in an API test:
await expect(page.testSubj.locator('row-0-col-count')).toHaveText('1,024');
await expect(page.testSubj.locator('row-0-col-avg')).toHaveText('42.7');
✔️ Do: verify that the UI renders and responds to interaction:
await expect(page.testSubj.locator('datasetQualityTable-loaded')).toBeVisible();
await page.testSubj.click('tableSortByLastActivity');
await expect(page.testSubj.locator('row-0-col-dataset')).not.toHaveText('');
Use test.step() to structure a multi-step flow while keeping one browser context (faster, clearer reporting).
Example
test('navigates through pages', async ({ pageObjects }) => {
await test.step('go to Dashboards', async () => {
await pageObjects.navigation.clickDashboards();
});
await test.step('go to Overview', async () => {
await pageObjects.navigation.clickOverview();
});
});
Setup and teardown through the UI are slow and brittle. Prefer Kibana APIs and fixtures.
Examples
❌ Don’t: create test data through the UI:
test.beforeEach(async ({ page, browserAuth }) => {
await browserAuth.loginAsAdmin();
await page.testSubj.click('createDataViewButton');
await page.testSubj.fill('indexPatternInput', 'logs-*');
await page.testSubj.click('saveDataViewButton');
});
✔️ Do: use API fixtures:
test.beforeEach(async ({ uiSettings, kbnClient }) => {
await uiSettings.setDefaultTime({ from: startTime, to: endTime });
await kbnClient.importExport.load(DATA_VIEW_ARCHIVE_PATH);
});
Playwright actions and web-first assertions already wait/retry. Don’t add redundant waits, and never use page.waitForTimeout() as it’s a hard sleep with no readiness signal and a common source of flakiness.
Examples
❌ Don’t: add unnecessary waits before actions or assertions:
await page.testSubj.waitForSelector('myButton', { state: 'visible' });
await page.testSubj.click('myButton');
await page.testSubj.locator('successToast').waitFor();
await expect(page.testSubj.locator('successToast')).toBeVisible();
✔️ Do: let Playwright handle waiting automatically:
await page.testSubj.click('myButton');
await expect(page.testSubj.locator('successToast')).toBeVisible();
When an action triggers async UI work (navigation, saving, loading data), wait for the resulting state before your next step. This ensures the UI is ready and prevents flaky interactions with elements that haven’t rendered yet.
Example
await page.gotoApp('sample/page/here');
await page.testSubj.waitForSelector('mainContent', { state: 'visible' });
If an action fails, don't wrap it in a retry loop. Playwright already waits for actionability; repeated failures usually point to an app issue (unstable DOM, non-unique selectors, re-render bugs). Fix the component or make your waiting/locators explicit and stable.
Examples
❌ Don't: retry actions in a loop:
for (let i = 0; i < 3; i++) {
try {
await page.testSubj.click('submitButton');
break;
} catch {
await page.waitForTimeout(1000);
}
}
✔️ Do: fix the root cause (for example, wait for a readiness signal):
await expect(page.testSubj.locator('formReady')).toBeVisible();
await page.testSubj.click('submitButton');
Prefer stable data-test-subj attributes accessed using page.testSubj. If data-test-subj is missing, prefer adding one to source code. If that’s not possible, use getByRole inside a scoped container.
Examples
❌ Don’t: use raw CSS selectors or unscoped text matchers (searching the entire page for text is unreliable when duplicates exist):
await page.click('[data-test-subj="myButton"]');
await page.getByText('Delete').click();
❌ Don’t: select elements by index (flagged by Playwright’s recommended ESLint rules). Index-based selectors break in non-clean environments, where tests run without a server restart and extra data may exist:
await page.testSubj.locator('tableRow').nth(0).click();
✔️ Do: use page.testSubj or scoped getByRole:
await page.testSubj.click('myButton');
await page.testSubj.locator('confirmDeleteModal').getByRole('button', { name: 'Delete' }).click();
Scout configures Playwright timeouts (source). Prefer defaults.
- Don’t override suite-level timeouts/retries with test.describe.configure() unless you have a strong reason.
- If you increase a timeout for one operation, keep it well below the test timeout and leave a short rationale. An assertion timeout that exceeds the test timeout is ignored.
- Time spent in hooks (beforeEach, afterEach) counts toward the test timeout. If setup is slow, the test itself may time out even though its assertions are fast.
Example
await expect(editor).toBeVisible(); // will use the default timeout
// justified: report generation can be slow
await expect(downloadBtn).toBeEnabled({ timeout: 30_000 });
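The interaction between hook time, the test timeout, and a per-assertion timeout can be sketched as simple arithmetic. This is a simplified model with hypothetical numbers: the function name is invented for illustration, and Scout's actual timeout values are not assumed here.

```typescript
// Simplified model: an assertion can wait no longer than whatever is left of
// the test timeout after hooks and earlier steps have used their share.
function effectiveAssertionTimeout(
  assertionTimeoutMs: number,
  testTimeoutMs: number,
  elapsedMs: number
): number {
  const remaining = Math.max(0, testTimeoutMs - elapsedMs);
  return Math.min(assertionTimeoutMs, remaining);
}

// Hypothetical numbers: 60s test timeout, 45s already spent in beforeEach.
const requested = 30_000;
const effective = effectiveAssertionTimeout(requested, 60_000, 45_000);
console.log(effective); // 15000
```

In this model the requested 30 seconds never materialize because slow setup has already consumed most of the budget, which is why a larger assertion timeout does not compensate for slow hooks.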
Tables/maps/visualizations can appear before data is rendered. Prefer waiting on a component-specific “loaded” signal rather than global indicators like the Kibana chrome spinner (our data shows they are unreliable for confirming that a particular component has finished rendering).
Example
In source code, use a dynamic data-test-subj:
<EuiBasicTable
data-test-subj={`myTable-${isLoading ? 'loading' : 'loaded'}`}
loading={isLoading}
items={items}
columns={columns}
/>
In tests, wait for the loaded state:
await expect(page.testSubj.locator('myTable-loaded')).toBeVisible();
For Kibana Maps, data-render-complete="true" is often the right “ready” signal.
Prefer existing page objects (and their methods) over rebuilding EUI interactions in test files.
Example
await pageObjects.datePicker.setAbsoluteRange({
from: 'Sep 19, 2015 @ 06:31:44.000',
to: 'Sep 23, 2015 @ 18:31:44.000',
});
Create methods for repeated flows (and make them wait for readiness).
Example
async openNewDashboard() {
await this.page.testSubj.click('newItemButton');
await this.page.testSubj.waitForSelector('emptyDashboardWidget', { state: 'visible' });
}
Playwright creates a fresh browser context for each test, so there is no cached state to work around. Both page object methods and test code should be explicit about the action they perform, not defensive about the current state. Conditional flows (like "if modal is open, close it first") hide bugs, waste time, and make failures harder to understand.
Examples
❌ Don’t: add conditional logic to handle unknown state:
async switchToEditMode() {
const isViewMode = await this.page.testSubj.locator('dashboardViewMode').isVisible();
if (isViewMode) {
await this.page.testSubj.click('dashboardEditMode');
}
}
✔️ Do: make the action explicit, since the caller knows the expected state:
async openEditMode() {
await this.page.testSubj.click('dashboardEditMode');
await this.page.testSubj.waitForSelector('dashboardIsEditing', { state: 'visible' });
}
Prefer explicit expect() in the test file so reviewers can see intent and failure modes. Also prefer expect() over manual boolean checks, as Playwright’s error output includes the locator, call log, and a clear message, which if/throw patterns lose.
Examples
❌ Don’t: hide assertions inside page objects:
// inside page object
async createIndexAndVerify(name: string) {
await this.page.testSubj.click('saveButton');
await expect(this.page.testSubj.locator('indicesTable')).toContainText(name);
}
✔️ Do: keep assertions in the test file:
await pageObjects.indexManagement.clickCreateIndexSaveButton();
await expect(page.testSubj.locator('indicesTable')).toContainText(testIndexName);
When a test verifies multiple independent items (KPI tiles, chart counts, table columns), use expect.soft() so the test continues checking everything instead of stopping at the first failure. Playwright still fails the test at the end if any soft assertion failed.
Example
test('Overview tab shows all KPI values', async ({ pageObjects }) => {
await pageObjects.nodeDetails.clickOverviewTab();
await expect.soft(pageObjects.nodeDetails.getKPI('cpuUsage')).toHaveText('50.0%');
await expect.soft(pageObjects.nodeDetails.getKPI('memoryUsage')).toHaveText('35.0%');
await expect.soft(pageObjects.nodeDetails.getKPI('diskUsage')).toHaveText('80.0%');
});
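Conceptually, soft assertions record failures instead of throwing and report them together at the end. The sketch below models that idea in plain TypeScript; SoftChecks is a hypothetical class for illustration and is not how Playwright implements expect.soft().

```typescript
// Conceptual model of soft assertions; not Playwright's implementation.
class SoftChecks {
  private readonly failures: string[] = [];

  // Records a mismatch instead of throwing, so later checks still run.
  equal<T>(label: string, actual: T, expected: T): void {
    if (actual !== expected) {
      this.failures.push(`${label}: expected ${String(expected)}, got ${String(actual)}`);
    }
  }

  // Called once at the end: fails if anything was recorded.
  assertAll(): void {
    if (this.failures.length > 0) {
      throw new Error(`${this.failures.length} soft check(s) failed:\n${this.failures.join('\n')}`);
    }
  }
}

const checks = new SoftChecks();
checks.equal('cpuUsage', '50.0%', '50.0%');
checks.equal('memoryUsage', '40.0%', '35.0%'); // recorded, does not stop the run
checks.equal('diskUsage', '75.0%', '80.0%'); // still evaluated

let report = '';
try {
  checks.assertAll();
} catch (error) {
  report = (error as Error).message;
}
console.log(report);
```

A single run reports both the memory and disk mismatches, whereas a hard assertion would have stopped at the first one.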
If you must interact with EUI internals, use wrappers from Scout to keep that complexity out of tests.
Example
import { EuiComboBoxWrapper, ScoutPage } from '@kbn/scout';
export class StreamsAppPage {
public readonly fieldComboBox: EuiComboBoxWrapper;
constructor(private readonly page: ScoutPage) {
this.fieldComboBox = new EuiComboBoxWrapper(this.page, 'fieldSelectorComboBox');
}
async selectField(value: string) {
await this.fieldComboBox.selectSingleOption(value);
}
}
Scout supports automated accessibility (a11y) scanning via page.checkA11y. Add checks at high-value points in your UI tests (landing pages, modals, flyouts, wizard steps) rather than on every interaction.
Example
const { violations } = await page.checkA11y({ include: ['[data-test-subj="myPanel"]'] });
expect(violations).toHaveLength(0);
For the full guide (scoping, exclusions, handling pre-existing violations), see Accessibility testing.
If a page has onboarding/getting-started state, set localStorage before navigation.
Example
test.beforeEach(async ({ page, browserAuth, pageObjects }) => {
await browserAuth.loginAsViewer();
await page.addInitScript(() => {
window.localStorage.setItem('gettingStartedVisited', 'true');
});
await pageObjects.homepage.goto();
});
If you build a helper that will benefit other tests, consider upstreaming it:
- Reusable across many plugins/teams: contribute to @kbn/scout
- Reusable but solution-scoped: contribute to the relevant solution Scout package
- Plugin-specific: keep it in your plugin’s test/scout tree
For the full guidance, see Scout.
Best practices specific to API tests.
Use the right fixture for the right purpose:
| Fixture | Use for |
|---|---|
| apiClient | The endpoint under test (with scoped credentials from API auth) |
| apiServices | Setup/teardown and side effects |
| kbnClient, esClient, etc. | Lower-level setup when apiServices doesn’t have a suitable helper |
Prefer tests that read like “call endpoint X as role Y, assert outcome”.
Example
import { expect } from '@kbn/scout/api';
apiTest.beforeAll(async ({ requestAuth, apiServices }) => {
await apiServices.myFeature.createTestData();
viewerCredentials = await requestAuth.getApiKeyForViewer();
});
apiTest('returns data for viewer', async ({ apiClient }) => {
const response = await apiClient.get('api/my-feature/data', {
headers: { ...COMMON_HEADERS, ...viewerCredentials.apiKeyHeader },
});
expect(response).toHaveStatusCode(200);
expect(response.body.items).toHaveLength(3);
});
This pattern validates both endpoint behavior and the permission model.
Status code assertions are necessary but not sufficient. Also validate shape and key fields.
Examples
❌ Don’t: assert only the status code:
apiTest('returns autocomplete definitions', async ({ apiClient }) => {
const response = await apiClient.get('api/console/api_server', {
headers: { ...COMMON_HEADERS, ...viewerCredentials.apiKeyHeader },
});
expect(response).toHaveStatusCode(200);
});
✔️ Do: validate shape and key fields too:
apiTest('returns autocomplete definitions', async ({ apiClient }) => {
const response = await apiClient.get('api/console/api_server', {
headers: { ...COMMON_HEADERS, ...viewerCredentials.apiKeyHeader },
});
expect(response).toHaveStatusCode(200);
expect(response.body).toMatchObject({
es: {
endpoints: expect.any(Object),
globals: expect.any(Object),
name: 'es',
},
});
});