WebQA Agent supports two execution modes, designed for different testing scenarios and workflows.
Use Cases: AI autonomously explores web pages, decomposes business objectives (e.g., “test search logic”), generates test cases, and executes them end to end. This mode is suitable for exploratory testing and comprehensive quality evaluation.
Functional Testing (AI type):
- Two-stage planning: Stage 1 (`filter_model`) prioritizes element filtering for efficiency; Stage 2 (the primary `model`) performs page understanding to generate comprehensive test cases.
- Automatic test plan generation: Test cases are generated according to test design standards, page content, and custom business objectives. When the page structure is complex and no explicit goal is provided, WebQA-Agent automatically plans broader test coverage.
- Adaptive test plan reflection: Test plans are reflected on and regenerated at the planning level based on execution results and coverage feedback.
- Dynamic Step Generation: Automatically generates additional test steps when new UI elements appear during execution, significantly improving test coverage without manual intervention.
How It Works:
- After each action, the system performs a DOM diff analysis to detect new elements
- When ≥ `min_elements_threshold` new elements appear, LLM-based step generation is triggered
- The LLM analyzes the new elements and generates up to `max_dynamic_steps` relevant test steps
- Generated steps are inserted after, or replace, the remaining steps based on test plan coherence
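Conceptually, the trigger described above reduces to a threshold check plus a cap. A minimal sketch (the function name and per-element heuristic are illustrative, not WebQA Agent's actual code; the real generator lets the LLM choose the step count within the cap):

```python
def plan_dynamic_steps(new_elements, min_elements_threshold=2, max_dynamic_steps=8):
    """Decide how many extra steps to request after a post-action DOM diff.

    Returns 0 when the change is below the trigger threshold; otherwise
    a step budget capped at max_dynamic_steps.
    """
    if len(new_elements) < min_elements_threshold:
        return 0  # e.g., a lone loading spinner is filtered out
    # Illustrative heuristic: at most one step per new element, capped
    return min(len(new_elements), max_dynamic_steps)
```

With the defaults, a dropdown exposing six options triggers generation, while a single spinner is skipped.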
Configuration:

```yaml
test_config:
  function_test:
    type: "ai"
    dynamic_step_generation:
      enabled: true              # Master switch (default: true)
      max_dynamic_steps: 8       # Max steps per generation (default: 8, range: 3-15)
      min_elements_threshold: 2  # Min new elements to trigger (default: 2, range: 1-5)
```
Parameter Guide:

| Parameter | Default | Purpose | Tuning Guidance |
|---|---|---|---|
| `enabled` | `true` | Enable/disable the feature | Disable for simple static pages or strict time limits |
| `max_dynamic_steps` | `8` | Upper limit on generated steps | Increase to 10-12 for complex flows (e-commerce, dashboards); decrease to 5 for simple UIs |
| `min_elements_threshold` | `2` | Sensitivity control | Use 1 for maximum coverage (triggers more often); use 3+ for performance-critical scenarios |

Real-World Scenarios:
- Dropdown Selection: User clicks dropdown → 6 option elements appear → Generates 3-4 steps to test each option
- Modal Forms: User clicks "Settings" → Modal with 5 form fields appears → Generates 5-7 steps to fill and validate fields
- Loading Spinner (Filtered): User clicks "Load" → Single spinner element appears → Skipped (below threshold=2)
Performance Impact:
- Each generation adds 5-15 seconds to test execution
- Typical test with 3 generations: +15-45 seconds total
- Token usage: ~4500-5500 tokens per generation
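Both costs scale linearly with the number of generations; a back-of-envelope estimator using the per-generation figures quoted above (illustrative only):

```python
def estimate_overhead(generations, sec_range=(5, 15), tok_range=(4500, 5500)):
    """Rough (min, max) time and token overhead for N dynamic-step generations."""
    seconds = (generations * sec_range[0], generations * sec_range[1])
    tokens = (generations * tok_range[0], generations * tok_range[1])
    return seconds, tokens
```

For a typical test with 3 generations this gives 15-45 extra seconds and roughly 13,500-16,500 extra tokens.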
When to Adjust:
| Scenario | Recommended Settings | Rationale |
|---|---|---|
| E-commerce product browsing | max: 10, threshold: 2 | Complex category/filter interactions |
| SaaS admin dashboard | max: 12, threshold: 1 | Frequent nested menus, critical features |
| Content/blog site | max: 5, threshold: 3 | Static content, reduce noise |
| High-speed smoke tests | max: 5, threshold: 4 | Prioritize speed over coverage |
| Mobile app testing | max: 7, threshold: 2 | Compact UI, modal-heavy |

Strategies:
- Insert: Adds dynamic steps after current step (preserves test plan structure)
- Replace: Replaces remaining steps with dynamic steps (used when new elements provide alternative path to test objective)
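As an example, the high-speed smoke-test recommendation maps onto the configuration keys shown earlier (a sketch, not a complete config file):

```yaml
test_config:
  function_test:
    type: "ai"
    dynamic_step_generation:
      enabled: true
      max_dynamic_steps: 5       # Prioritize speed over coverage
      min_elements_threshold: 4  # Only large UI changes trigger generation
```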
- Example 1: AI Functional Testing + UX Testing
```yaml
target:
  url: https://example.com  # Website URL to test
  description: Website QA testing
  max_concurrent_tests: 2   # Optional, default 2
test_config:
  business_objectives: Test search functionality, generate 3 test cases
  dynamic_step_generation:
    enabled: True              # Enable dynamic step generation
    max_dynamic_steps: 8       # Generate up to 8 steps per discovery
    min_elements_threshold: 2  # Require at least 2 new elements to trigger
  custom_tools:  # Optional custom tools (default: UI actions, UX verification)
    enabled: []  # Empty list: use only default tools
```

- Example 2: Default Functional Traversal + UX + Performance + Security
```yaml
target:
  url: https://example.com  # Website URL to test
  description: Website QA testing
  max_concurrent_tests: 4   # Optional, default 2
test_config:
  business_objectives: Comprehensive testing with performance and security analysis
  custom_tools:  # Enable custom tools
    enabled:
      - lighthouse  # Lighthouse performance testing (requires: npm install -g lighthouse)
      - nuclei      # Nuclei security scanning (requires: nuclei installed)
```

Use Cases: Each step of a test case is precisely defined in YAML files, and the AI executes according to those instructions. This mode suits repeatable and traceable testing scenarios.
- Explicit Test Steps: Test steps and expected behaviors are precisely defined in YAML.
- Multi-modal AI-Driven Actions: Supported browser and page operations include `Click`, `Hover`, `Input`, `Clear`, `KeyboardPress`, `Scroll`, `MouseMove`, `MouseWheel`, `Drag`, `Sleep`, `Upload`, `GoToPage`, and `GoBack`.
- Multi-modal Verification: Supports visual confirmation, URL/path validation, and combined image–element verification.
- End-to-End Automatic Monitoring: Captures browser Console logs and Network request status in real time. Optional `ignore_rules` can be used to suppress known console or network noise.
Run Mode configuration files must include the `cases` field.
```yaml
target:
  url: https://example.com   # Target website URL
  max_concurrent_tests: 2    # Maximum concurrent test count
browser_config:              # Browser configuration
  viewport: {"width": 1280, "height": 720}
  cookies: /path/to/cookie.json  # Load cookie data
  # cookies: []
  headless: false
ignore_rules:                # Ignore rules configuration (optional)
  network:                   # Network request ignore rules
    - pattern: ".*\\.google-analytics\\.com.*"
      type: "domain"
  console:                   # Console log ignore rules
    - pattern: "Failed to load resource.*favicon"
      match_type: "regex"
    - pattern: "Warning:"
      match_type: "contains"
cases:                       # Test case list
  - name: Image Upload       # Test case name
    steps:                   # Test steps
      - action: Upload icon is the image icon in the input box, located next to the baidu search button, used for uploading files
        args:
          file_path: ./tests/data/test.jpeg
      - action: Wait for image upload
      - verify: Verify that the input field displays an open palm/hand icon image
      - action: Enter "How many fingers are in the image?" in the search input box, then press Enter, wait 2 seconds
```

Because LLMs may produce ambiguous or incorrect interpretations, explicit and observable descriptions significantly improve execution stability.
Example Comparison:
| ❌ Incorrect Example | ✅ Correct Example |
|---|---|
| Click dropdown | Click the dropdown below form item A |
| First file parsed successfully | The first file in the list displays file name, file size, parsing status as "parsed successfully", model as "xxx", then the test passes |
| ❌ Incorrect Example | ✅ Correct Example |
|---|---|
| Browser has two tabs open | The "XXX" title is blue |
| Verify page has content "xxx" | Scroll 1000px with mouse wheel, verify page content includes "xxx" |
If there are new popups or new forms, additional steps are needed.
❌ Incorrect Example:
Click "Create" button, enter "xxx" name, click confirm
✅ Correct Example: Break down the task into multiple AI call steps
Click "Create" button
Enter a name with "test" prefix plus 5 random English letters
Click "Submit" button
Click "Confirm" button
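Expressed as Run Mode steps, the correct breakdown above might look like this sketch (the closing verify line is illustrative):

```yaml
- action: Click "Create" button
- action: Enter a name with "test" prefix plus 5 random English letters
- action: Click "Submit" button
- action: Click "Confirm" button
- verify: Verify that a record with a name containing the "test" prefix appears
```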
- Check Report Files: Identify whether failures occur during planning or localization
- Check Previous Steps: Errors may originate earlier
- Planning Step Errors: Too many or too few steps → improve business context description
- Localization Step Errors: Wrong elements or offsets → add more visual or positional details
- Consider switching to a stronger vision-capable model
Basic Operations:

```yaml
- action: Click "Submit" button
- action: Enter "test_user" in the "Username" input box  # When there are multiple input boxes, more detail is needed to guide the model
- action: Clear search box content  # clear
- action: Press "Enter" key         # keyboard input
- action: Wait 5s                   # sleep
```

Element Identification:

```yaml
# For elements without clear text, such as icons: describe the icon's position on the page as precisely as possible, and use other elements with clear text to help the model
- action: Click the second icon from left to right below the input box; the leftmost icon has the text "**"
- action: Upload icon is the image icon below the input box, upload file "test.jpg"
# When there are multiple identical elements
- action: In the middle conversation area of the page, click the first card
```

Scroll/Mouse:

```yaml
- action: Scroll to the bottom of the page  # Frontend pages support window full-page scrolling
- action: Move mouse above the "History" list, scroll down 800px with mouse wheel  # Combine mouse movement with scrolling (recommended)
- action: Move mouse to "xx" node  # In complex drawing/Canvas scenarios, rely on the model's coordinate or semantic movement judgment
```

Page Operations:

```yaml
- action: Click "xx" in the navigation bar, get the newly opened page
- action: Navigate to https://example.com/docs
- action: Go back to previous page
```

Visual Content Confirmation:

```yaml
- verify: Verify that the input box displays "[expected content]"
- verify: After clicking "[button name]", verify that the popup disappears
- verify: Verify the first item in the list displays [field1], [field2], status is [expected status]
```

URL and Path Validation:

```yaml
- verify: Verify page navigation, URL contains "/[path]"
```

Data/Record Validation:

```yaml
- verify: Verify that a record with a name containing the "[keyword]" prefix appears
- verify: Verify that the [first/specific] row in the list contains a record with a name containing "[keyword]"
```

Combined Validation:

```yaml
- verify: Verify current output is complete, and text content at [element position] is "[expected text]", color is [expected color], status is [expected status]
```

- Initialization
```shell
# Create config.yaml in the current directory (default Generate Mode)
webqa-agent init

# Specify output path and filename
webqa-agent init -o myconfig.yaml

# Force overwrite an existing configuration file
webqa-agent init --force

# Create a Run Mode configuration file (generates config_run.yaml by default)
webqa-agent init --mode run
```
- Execute Tests
```shell
# Generate Mode test, auto-discovering the configuration file (prioritizes ./config.yaml or ./config/config.yaml)
webqa-agent gen

# Generate Mode with a specified config file path, executing tests with 4 parallel workers
webqa-agent gen -c /path/to/config.yaml -w 4

# Run Mode with a specified config file path, executing tests with 4 parallel workers
webqa-agent run -c /path/to/config_run.yaml -w 4
```
Run Mode also supports batch execution of YAML files in a directory.
Feature Notes:
- Each YAML file can be configured independently, supporting a different `target.url` per file
- Different `browser_config` (e.g., viewport size) and `ignore_rules` (ignore rules for specific scenarios) can be set per file
- Run Mode automatically loads and aggregates all `cases` from all files for unified execution
```shell
# Specify an execution folder, executing tests with 4 parallel workers
webqa-agent run -c config/case_folder -w 4
```
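For example, a batch directory might be laid out like this (file names are hypothetical):

```
config/case_folder/
├── login_cases.yaml    # its own target.url and browser_config
├── search_cases.yaml   # different viewport and ignore_rules
└── upload_cases.yaml   # additional cases, aggregated with the rest
```

All `cases` from the three files are loaded and executed as one unified run.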