A plain-language explainer for proffesor-for-testing/agentic-qe by Dragan Spiridonov (@proffesor-for-testing) and contributors. An independent walkthrough of the open-source project.

Updated 2026-06-25 · source @ c092dce · v3.11.1 · MIT License

Agentic QE

proffesor-for-testing · agentic-qe

Quality is the product.

Agentic QE Fleet is a free, open-source AI-powered quality engineering platform. It deploys 60 specialized agents and 75 skills that generate tests, find coverage gaps, detect flaky tests, and learn your codebase patterns — across 11 coding agent platforms.

Figure: The Agentic QE Fleet at a glance. One Queen Coordinator at the hub, with 60 agents, 75 skills (49 verified at tier 3), and 11 platforms across 13 domains, served by one MCP server. Domains shown: Test Gen (4 agents), Execution (3), Coverage (2), Quality (4), Defect (4), Security (3), Requirements (2), Code Intel (4), Contracts (2), Visual/A11y (3), Chaos (3).

This explainer is built from the public GitHub repository:

https://github.com/proffesor-for-testing/agentic-qe

All source code, releases, and downloads linked from this page are publicly available under the MIT License. No login required.

01

Why it was built

The problem every team hits: tests that don’t keep up with code.
Figure: Before Agentic QE vs. after.

  • Before (manual QE): tests fall behind because they take days to write by hand; flaky tests erode trust in CI; tool sprawl means 5+ tools for SAST, coverage, and linting; there is no pattern memory, so every test starts from scratch.
  • After (Agentic QE Fleet): 60 AI agents generate tests in seconds, not days; ML-powered flaky test detection and fixes; one fleet covers 13 QE domains through one interface; pattern learning carries across sessions and projects; the shield reads 96%+.

Before: manual testing can't keep up with shipping speed. After: 60 AI agents auto-generate, analyze, and gate quality across the full surface.

Every team ships faster than their tests can follow. New features land daily; test suites fall behind. Coverage gaps hide in the cracks. Flaky tests erode trust until the team stops reading CI results.

Dragan Spiridonov built Agentic QE because existing solutions attacked one slice of the problem — generation or coverage or flakiness — but never the whole surface. A coordinated fleet of AI agents, each a specialist in one QE domain, can cover the full lifecycle: generate, analyze, execute, learn, and gate.

The project is open-source (MIT), free to use, fork, and contribute to. It is based on the Agentic QE Framework created by Dragan Spiridonov.

View the source → github.com/proffesor-for-testing/agentic-qe

02

What problem it solves

Tests fall behind, coverage rots, flaky tests erode trust.
Six problems AQE solves at once, each mapped to a dedicated domain of specialized agents and orchestrated by the Queen Coordinator:

  • Test Generation Gap (4 agents): writing tests by hand takes days. AQE generates them in seconds across 15+ frameworks (Jest, pytest, JUnit...).
  • Coverage Blind Spots (2 agents): line coverage says what ran, not what matters. AQE does risk-weighted analysis to prioritize the most impactful paths.
  • Flaky Tests (4 agents): tests that pass today and fail tomorrow. ML-powered detection, root-cause analysis, and automated stabilization.
  • Pattern Amnesia (4 agents): every new test file starts from scratch. AQE remembers patterns across sessions and projects, improving over time.
  • Tool Sprawl (13 domains): SAST here, coverage there, linter elsewhere. AQE coordinates 60 agents across 13 domains through one interface.
  • AI Cost Overhead (auto-routed): sending everything to the biggest model is expensive. TinyDancer routes to the right tier: Haiku / Sonnet / Opus.

Modern codebases grow faster than any human team can test. Agentic QE attacks the full quality surface at once, not just one slice. Each problem above maps to a dedicated agent domain with specialists trained for that exact task.
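To make "risk-weighted" concrete, here is a minimal sketch of ranking files by risk rather than by raw uncovered line count. It is an illustration under stated assumptions, not AQE's algorithm: the `FileCoverage` shape, the `criticality` weight, the scoring formula, and the `src/utils/format.ts` path are all hypothetical.

```typescript
// Hypothetical scoring, not AQE's algorithm:
// risk = share of uncovered lines x a criticality weight chosen by the team (1 to 5).
interface FileCoverage {
  path: string;
  totalLines: number;
  coveredLines: number;
  criticality: number;
}

function rankByRisk(files: FileCoverage[]): Array<{ path: string; risk: number }> {
  return files
    .map((f) => {
      // Guard against empty files so we never divide by zero.
      const uncoveredShare = f.totalLines === 0 ? 0 : (f.totalLines - f.coveredLines) / f.totalLines;
      return { path: f.path, risk: uncoveredShare * f.criticality };
    })
    .sort((a, b) => b.risk - a.risk); // highest risk first
}

const ranked = rankByRisk([
  { path: "src/utils/format.ts", totalLines: 100, coveredLines: 60, criticality: 1 },
  { path: "src/services/PaymentService.ts", totalLines: 200, coveredLines: 180, criticality: 5 },
]);
console.log(ranked.map((r) => r.path).join(" > "));
// → src/services/PaymentService.ts > src/utils/format.ts
```

The formatting helper has more uncovered lines (40 versus 20), yet the payment service ranks first because its weight is higher. That is the difference between ranking by risk and ranking by line count.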

03

Why now

Coding agents exist — now QE needs agents too.
Three forces converging right now:

  • Coding agents exist: Claude Code, Cursor, and Copilot accelerate dev 10x, so QE must keep pace.
  • MCP standard: the Model Context Protocol lets AI tools share context and call specialized servers.
  • Free local models: Ollama plus open models mean $0 routine test generation, with cloud only when you need it.

aqe init --auto → 60 QE agents on call

Coding agents like Claude Code, Cursor, and Copilot accelerate development. Code output has increased 10x for many teams. But quality engineering hasn’t kept pace. The Model Context Protocol (MCP) standard now lets AI tools share context, and free local models make routine test generation cost $0.

AQE plugs into all three: one aqe init --auto and your coding agent has 60 QE specialists on call. No new IDE, no new workflow. It meets developers where they already work.

04

How it works

A fleet of 60 specialized agents, coordinated by a Queen.
Figure: How the Queen coordinates 60 agents. Your request ("Generate tests for PaymentService with 95% coverage") goes to the Queen Coordinator, which decomposes, spawns, coordinates, and synthesizes. Specialists shown: test-architect (generates unit tests with Jest / Vitest / pytest), coverage-specialist (risk-weighted gaps), property-tester (edge case discovery), quality-gate (anti-sycophancy), and security-scanner (SAST/DAST sweep). Synthesized result: 48 tests / 96.2% coverage / 78% pattern reuse. Grounded, gated, ready to commit.

The Queen Coordinator decomposes your request, fans out to domain specialists running in parallel, then synthesizes results with quality gates.

Domain | Agents | What they do
Test Generation | 4 | Generate tests, test-driven development (TDD) workflows, mutation testing, property testing
Test Execution | 3 | Run tests in parallel, handle retries, integration testing
Coverage Analysis | 2 | Find untested code, prioritize by risk
Quality Assessment | 4 | Go/no-go gates, risk scoring, adversarial review
Defect Intelligence | 4 | Predict bugs, find root causes, fix flaky tests
Requirements | 2 | Validate testability, generate behavior-driven scenarios
Code Intelligence | 4 | Knowledge graphs, semantic search, change impact
Security | 3 | Static & dynamic security analysis, compliance audits, exploit validation
Contracts | 2 | API contracts, GraphQL schema testing
Visual & Accessibility | 3 | Visual regression, accessibility compliance, viewport testing
Chaos & Performance | 3 | Fault injection, load testing, performance validation
Learning | 4 | Cross-project learning, pattern discovery, metrics
Enterprise | 7 | SAP systems, legacy web services, message queues (Kafka, RabbitMQ), enterprise integrations
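
The decompose, fan out, synthesize flow can be sketched in a few lines of TypeScript. This is an illustration only, not AQE's implementation: the `SubTask` and `SubResult` types, the `decompose`, `runSpecialist`, `synthesize`, and `coordinate` functions, and the canned test counts are all hypothetical. The stub specialists also run one after another here for simplicity, where the project describes them running in parallel.

```typescript
// Hypothetical coordinator sketch; names and numbers are illustrative, not AQE's API.
interface SubTask { agent: string; target: string }
interface SubResult { agent: string; target: string; testsWritten: number }

// Step 1: decompose one request into one sub-task per specialist.
function decompose(target: string, agents: string[]): SubTask[] {
  return agents.map((agent) => ({ agent, target }));
}

// Step 2: a stub specialist. A real agent would call a model; this returns canned counts.
const cannedCounts: Record<string, number> = {
  "test-architect": 32,
  "property-tester": 8,
};

function runSpecialist(task: SubTask): SubResult {
  const testsWritten = cannedCounts[task.agent] ?? 0; // agents with no canned count add 0
  return { agent: task.agent, target: task.target, testsWritten };
}

// Step 3: synthesize the partial results into a single summary.
function synthesize(results: SubResult[]): { totalTests: number; agents: string[] } {
  return {
    totalTests: results.reduce((sum, r) => sum + r.testsWritten, 0),
    agents: results.map((r) => r.agent),
  };
}

function coordinate(target: string, agents: string[]): { totalTests: number; agents: string[] } {
  return synthesize(decompose(target, agents).map(runSpecialist));
}

const fleetSummary = coordinate("src/services/PaymentService.ts", [
  "test-architect",
  "property-tester",
  "quality-gate",
]);
console.log(`${fleetSummary.totalTests} tests from ${fleetSummary.agents.join(", ")}`);
// → 40 tests from test-architect, property-tester, quality-gate
```

Keeping the three steps as separate pure functions makes each one easy to test on its own, which is one reasonable way to structure a coordinator of this kind.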
TinyDancer: intelligent model routing. It scores each task's complexity on a 0–100 scale and routes the task to the cheapest model tier that can handle it:

  • Simple (0–20): Haiku, for type additions and refactors ($)
  • Moderate (20–70): Sonnet, for bug fixes and test generation ($$)
  • Critical (70+): Opus, for architecture and security ($$$)

Free-tier mode (AQE_FREE_TIER=1) routes simple tasks to local Ollama at $0 cost.
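
The score ranges above translate into a small routing function. The sketch below is an assumption-laden illustration, not TinyDancer's code: the `Tier` type and `routeTier` function are hypothetical, and because the published ranges overlap at 20 and 70, it assumes those boundary scores go to the higher tier.

```typescript
// Hypothetical sketch of score-based routing; not TinyDancer's actual code.
type Tier = "local-ollama" | "haiku" | "sonnet" | "opus";

// Assumption: ranges are half-open, so a score of 20 counts as moderate and 70 as critical.
function routeTier(score: number, freeTier: boolean = false): Tier {
  if (!isFinite(score) || score < 0 || score > 100) {
    throw new RangeError(`complexity score must be within 0-100, got ${score}`);
  }
  if (score < 20) {
    // Free-tier mode sends simple tasks to a local model instead of Haiku.
    return freeTier ? "local-ollama" : "haiku";
  }
  if (score < 70) return "sonnet";
  return "opus";
}

console.log(routeTier(12));       // → haiku
console.log(routeTier(12, true)); // → local-ollama
console.log(routeTier(45));       // → sonnet
console.log(routeTier(88));       // → opus
```

In this sketch the `freeTier` flag stands in for the AQE_FREE_TIER=1 setting. How the real router reads that variable is not shown in the source.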

05

What solved looks like

Ask for tests and get them — generated, validated, gated.
What "solved" looks like: ask for tests in plain English, get 48 tests at 96% coverage, quality-gated and ready to commit.

Terminal session (claude, agentic-qe):

$ "Generate tests for src/services/PaymentService.ts with 95% coverage"
[queen-coordinator] Decomposing task...
[queen-coordinator] Spawning: test-architect, coverage-specialist, property-tester, quality-gate
[test-architect] Generating unit tests (Jest)...
[coverage-specialist] Analyzing risk-weighted gaps...
[property-tester] Discovering edge cases...
[quality-gate] Anti-sycophancy check: PASS (no tautological assertions)

Generated 48 tests across 4 files:
  - unit/PaymentService.test.ts (32 unit tests)
  - property/PaymentValidation.property.test.ts (8 property tests)
  - integration/PaymentFlow.integration.test.ts (8 integration tests)

Coverage: 96.2% | Pattern reuse: 78% | Quality gate: PASS

A real interaction: the Queen spawns 4 specialists in parallel, each handling their domain. Result: 48 tests at 96.2% coverage, quality-gated.
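
The quality gate's "no tautological assertions" check is worth a concrete picture. A tautological assertion compares a value with itself, so it passes whatever the code does. The sketch below is one simple text-based heuristic for spotting that pattern in Jest-style tests. It is an assumption for illustration, not how AQE's quality-gate agent works, and `findTautologicalAssertions` is a hypothetical name.

```typescript
// Hypothetical heuristic: flag Jest-style assertions whose expected value is
// textually identical to the actual value, e.g. expect(total).toBe(total).
function findTautologicalAssertions(testSource: string): string[] {
  // \1 is a backreference: the matcher argument must repeat the expect() argument exactly.
  const pattern = /expect\((.+?)\)\.(?:toBe|toEqual)\(\1\)/g;
  return testSource.match(pattern) ?? [];
}

const sampleTestSource = [
  "expect(total).toBe(total);        // always passes, proves nothing",
  "expect(applyTax(100)).toBe(110);  // pins down real behaviour",
].join("\n");

console.log(findTautologicalAssertions(sampleTestSource));
// → ["expect(total).toBe(total)"]
```

A regular expression only catches the most literal form of the problem. Assertions that are tautological in less obvious ways would need real code analysis.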

75 Skills, Trust-Tiered

75 skills, rated by trust tier:

  • Tier 3, Verified (49 skills): full evaluation test suite, production-ready
  • Tier 2, Validated (7 skills): has an executable validator
  • Tier 1, Structured (5 skills): JSON output schema
  • Tier 0, Advisory (5 skills)

49 of the 75 skills (65%) are fully verified with test suites. Tier-0 untested skills are excluded by policy.
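
As a rough illustration of how a tier policy like this could be applied, the sketch below filters a skill list by minimum tier and reproduces the 65% figure from the counts above. The tier numbers and labels come from the list above. The `Skill` type, both functions, and the skill names are hypothetical and not taken from AQE.

```typescript
// Tier numbers and labels follow the list above; the skill names below are made up.
type TrustTier = 0 | 1 | 2 | 3;
interface Skill { name: string; tier: TrustTier }

const tierLabels: Record<TrustTier, string> = {
  0: "Advisory",
  1: "Structured",
  2: "Validated",
  3: "Verified",
};

// Keep only skills at or above a minimum tier. The default of 1 drops tier-0 skills.
function allowedSkills(skills: Skill[], minTier: TrustTier = 1): Skill[] {
  return skills.filter((s) => s.tier >= minTier);
}

// Share of skills at the top tier, as a rounded percentage.
function verifiedShare(verified: number, total: number): number {
  return Math.round((verified / total) * 100);
}

const exampleSkills: Skill[] = [
  { name: "example-verified-skill", tier: 3 },
  { name: "example-structured-skill", tier: 1 },
  { name: "example-advisory-skill", tier: 0 },
];

for (const s of allowedSkills(exampleSkills)) {
  console.log(`${s.name}: ${tierLabels[s.tier]}`);
}
console.log(`${verifiedShare(49, 75)}% verified`); // → 65% verified
```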

06

Platform support

One server, 11 coding agent platforms.
One MCP server (npx agentic-qe mcp) connects to 11 coding agent platforms: Claude Code, GitHub Copilot, Cursor, Cline, OpenCode, AWS Kiro, Kilo Code, Roo Code, OpenAI Codex CLI, Windsurf, and Continue.dev. Set up all at once or add platforms one by one.

# Set up all platforms at once
aqe init --auto --with-all-platforms

# Or add a platform later
aqe platform setup cursor
aqe platform list       # show install status
aqe platform verify cursor  # validate config

LLM Providers

AQE’s HybridRouter auto-detects available providers. Set one or more API keys and it picks the best route:

  • Claude (ANTHROPIC_API_KEY) — default
  • OpenAI (OPENAI_API_KEY)
  • Gemini (GOOGLE_AI_API_KEY) — free tier available
  • OpenRouter (OPENROUTER_API_KEY) — 300+ models
  • Ollama — local, free, offline
  • Azure OpenAI, Bedrock
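
Key-based auto-detection of this kind can be sketched as a lookup over environment variables. The variable names below are the ones listed above. Everything else is an assumption for illustration, not HybridRouter's code: the `detectProviders` function is hypothetical, Ollama is simply assumed to be available because it needs no key, and Azure OpenAI and Bedrock are left out because the source does not name their variables.

```typescript
// Hypothetical sketch of key-based provider detection; not HybridRouter's real code.
// Each entry pairs a provider with the environment variable named in the list above.
const providerKeys: Array<[string, string]> = [
  ["Claude", "ANTHROPIC_API_KEY"],
  ["OpenAI", "OPENAI_API_KEY"],
  ["Gemini", "GOOGLE_AI_API_KEY"],
  ["OpenRouter", "OPENROUTER_API_KEY"],
];

// Takes the environment as a plain object so the function is easy to test.
function detectProviders(env: Record<string, string | undefined>): string[] {
  const found = providerKeys
    .filter(([, keyName]) => (env[keyName] ?? "").trim() !== "") // ignore unset or blank keys
    .map(([provider]) => provider);
  // Assumption: Ollama needs no key, so it is always offered as a local fallback.
  return found.concat(["Ollama"]);
}

console.log(detectProviders({ OPENAI_API_KEY: "sk-example" }));
// → ["OpenAI", "Ollama"]
```

In a Node.js program the argument would typically be `process.env`. A real router would also have to check that a local Ollama server is actually running.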

07

How to start

Three commands and you’re running.
  1. Install: npm install -g agentic-qe. Public npm package, free, MIT. Works on macOS, Linux, and Windows.
  2. Initialize: cd your-project, then aqe init --auto. Auto-detects your tech stack and MCP.
  3. Use it: ask "Generate tests for src/". 60 agents are available instantly in your coding agent.

Three commands: install, initialize, use. MCP tools are available immediately in your coding agent.

Quick Start

# Install globally from npm (public, free)
npm install -g agentic-qe

# Initialize your project (auto-detects your tech stack and connects to your coding agent)
cd your-project && aqe init --auto

# That's it — 60 QE agents are available immediately in Claude Code

All packages are public on npm: npmjs.com/package/agentic-qe

Alternative: Claude Code Plugin

# From a local checkout
git clone https://github.com/proffesor-for-testing/agentic-qe.git
claude --plugin-dir ./agentic-qe/plugins/agentic-qe-fleet

Development Setup

git clone https://github.com/proffesor-for-testing/agentic-qe.git
cd agentic-qe && npm install && npm run build && npm test -- --run

Full source on GitHub → github.com/proffesor-for-testing/agentic-qe

08

Use cases

From test generation to chaos engineering to security audits.
Test Generation

Generate a full test suite

qe-test-architect generates unit, integration, property-based, and behavior-driven tests across 15+ frameworks.

"Create tests for PaymentService with 95% coverage"
Coverage Analysis

Find coverage gaps by risk

qe-coverage-specialist finds untested code and ranks it by risk, not line count.

"Find coverage gaps in src/ and prioritize"
Flaky Tests

Hunt and fix flaky tests

qe-flaky-hunter uses ML detection, root-cause analysis, and stabilization fixes.

"Stabilize flaky tests in tests/integration/"
Security

Security audit

qe-security-scanner: static and dynamic security analysis, dependency scanning, API security, and chaos testing.

"Run security scan on the auth module"
TDD

Red-Green-Refactor

qe-tdd-specialist coordinates 5 subagents through the full TDD cycle.

"Implement UserAuth with full TDD cycle"
Chaos Engineering

Fault injection

qe-chaos-engineer injects network partitions, latency spikes, resource exhaustion.

"Inject network partitions into order flow"

09

Get it

All links are public. No login, no paywall, no lock-in.

Everything about Agentic QE is publicly available. No Vercel login, no gated downloads, no auth walls. Every link below goes to a public URL you can access right now.

Command-Line Reference

aqe init [--auto]              # Initialize project
aqe agent list                 # List available agents
aqe fleet status               # Fleet health
aqe learning stats             # Learning statistics
aqe brain export/import        # Portable intelligence
aqe platform list/setup/verify # Manage platforms
aqe health                     # System health
aqe code index src/            # Index codebase
aqe code search "auth"         # Semantic search