Applied Module 12 · The Lora Playbook

The QA Checklist Runner

What you'll learn

~55 min
  • Build a Node.js test harness that runs your Phaser prototypes headlessly
  • Create scripted test scenarios for collision, entity spawning, and gameplay balance
  • Implement game-specific assertions that report pass/fail with expected vs. actual details
  • Generate an HTML report summarizing test results across all systems

What you’re building

You have a combat system, enemy behaviors, a HUD, dialogue, and a level editor. You have been playtesting by running the game, pressing buttons, and eyeballing whether things work. That process has two problems: it is slow (you test one scenario at a time), and it is unreliable (you forget to test edge cases, and “it seemed fine when I played it” is not evidence).

Professional game studios have QA teams. You do not. What you can have is an automated test harness that scripts specific scenarios — “spawn the player at position X, place 3 enemies at positions Y, simulate 60 frames of gameplay, check if the player is still alive” — and reports pass or fail. Run it after every change. If a test that used to pass now fails, you know exactly what you broke.

BloodRayne had plenty of bugs that automated testing would have caught: enemies spawning inside walls, collision boxes that missed at certain angles, difficulty spikes where the player could not survive a room without exploiting a mechanic. Your test harness catches these problems before a player ever sees them.

Forty minutes from now you will have a Node.js test runner that loads Phaser in a headless mode, runs scripted scenarios, checks assertions, records scenario states for playback, and generates an HTML report showing which tests passed and which failed.

Software pattern: Headless simulation with assertion testing

Load the application in a non-visual environment, script interactions, check outcomes. This is the same pattern used in end-to-end web testing (Playwright, Cypress), API testing, and simulation validation. The core concept — run the system automatically and verify results — is universal.
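The core loop is small enough to sketch. Here is a minimal, self-contained version of "run the system automatically and verify results" — all names are illustrative, not the harness API the prompt below generates:

```javascript
// Minimal run-and-verify loop: advance a system N steps, then check outcomes.
// Illustrative names only — not the generated harness API.
function runScenario(system, steps, assertions) {
  for (let i = 0; i < steps; i++) system.step(); // advance one frame
  const failures = assertions
    .map((check) => check(system.state))          // each check returns null or a failure message
    .filter((msg) => msg !== null);
  return { passed: failures.length === 0, failures };
}

// Toy "system": a counter that increments once per step.
const counter = { state: { ticks: 0 }, step() { this.state.ticks++; } };
const result = runScenario(counter, 60, [
  (s) => (s.ticks === 60 ? null : `expected 60 ticks, got ${s.ticks}`),
]);
console.log(result.passed); // true
```

Swap the toy counter for a game instance and the checks for game assertions, and you have the whole architecture in miniature.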


The showcase

Here is what the finished test harness produces:

  • Collision tests: Does the player’s hitbox collide with enemies at the expected positions? Does the player stop at walls? Do projectiles hit targets?
  • Spawn verification: Do enemies appear at the positions defined in the level JSON? Does the player start at the correct spawn point? Are all expected entities present after scene initialization?
  • Balance checks: Can the player survive a room with 3 basic enemies? How many hits does it take to defeat a heavy enemy? Does the health potion restore the expected amount?
  • Scenario recording: Define a scenario as a sequence of inputs (move right for 30 frames, attack, wait 10 frames, dodge). Record the game state at each frame. Play it back to verify identical results.
  • HTML report: A single-page report with a summary (X passed, Y failed, Z skipped), a table of all tests with status, duration, and details, and expandable sections showing failure details with expected vs. actual values.

The prompt

Open your terminal, navigate to your game project folder, start your AI CLI tool, and paste this prompt:

Build a Node.js game test harness for a Phaser 3 action game. The
harness runs scripted scenarios in a headless Phaser instance and
generates an HTML test report. This is for a BloodRayne-inspired
game with combat, enemies, health, and level loading.
PROJECT STRUCTURE:
game-test-harness/
  package.json
  src/
    runner.js      (test runner entry point — runs all test files)
    harness.js     (headless Phaser game bootstrapper)
    scenario.js    (scenario definition and execution engine)
    assertions.js  (game-specific assertion helpers)
    reporter.js    (HTML report generator)
  tests/
    collision.test.js        (collision detection tests)
    spawn.test.js            (entity spawn verification tests)
    balance.test.js          (gameplay balance tests)
    scenario-replay.test.js  (recorded scenario playback tests)
  fixtures/
    test-level.json          (small test level with known entity positions)
    scenario-recording.json  (recorded input sequence for replay)
  output/
    (generated HTML reports go here)
REQUIREMENTS:
1. HARNESS (src/harness.js)
Bootstraps a headless Phaser game instance for testing:
a. HEADLESS SETUP
- Use Phaser's HEADLESS render type (Phaser.HEADLESS)
- No canvas, no rendering — physics and game logic only
- Create a Phaser game instance with Arcade physics
- Expose methods to: add entities, simulate frames, read state
b. GAME API
- createPlayer(x, y) — adds a player entity with health, velocity, hitbox
- createEnemy(x, y, type) — adds an enemy ('basic', 'fast', 'heavy')
with type-appropriate stats (health, damage, speed)
- createWall(x, y, width, height) — adds a static physics body
- createPlatform(x, y, width) — adds a one-way platform
- createHazard(x, y, width, height) — adds a damage zone
- createPickup(x, y, type) — adds a collectible item
- loadLevel(jsonPath) — reads a level JSON (from L17 editor format)
and creates all entities from it
- simulateFrames(n) — advances the game by n frames (16ms each)
- simulateInput(inputs) — applies a sequence of input commands:
[{ frame: 0, action: 'moveRight' }, { frame: 30, action: 'attack' }, ...]
- getState() — returns current game state snapshot:
{ player: { x, y, health, alive }, enemies: [{ x, y, health, alive }],
pickups: [...], frameCount }
- reset() — destroys all entities, resets to clean state
- destroy() — shuts down the Phaser instance
c. ENTITY STATS
- Player: 100 health, 200 speed, 32x32 hitbox, 10 melee damage
- Basic enemy: 30 health, 80 speed, 24x24 hitbox, 10 contact damage
- Fast enemy: 20 health, 150 speed, 20x20 hitbox, 8 contact damage
- Heavy enemy: 80 health, 40 speed, 40x40 hitbox, 20 contact damage
- Health potion: restores 25 health
- Damage: calculated per-frame when hitboxes overlap
2. SCENARIO ENGINE (src/scenario.js)
Defines and executes test scenarios:
a. SCENARIO FORMAT
{
  name: "Player survives 3 basic enemies",
  setup: (harness) => {
    harness.createPlayer(100, 300);
    harness.createEnemy(300, 300, 'basic');
    harness.createEnemy(400, 300, 'basic');
    harness.createEnemy(500, 300, 'basic');
  },
  inputs: [
    { frame: 0, action: 'moveRight' },
    { frame: 20, action: 'attack' },
    { frame: 40, action: 'moveRight' },
    { frame: 60, action: 'attack' },
    { frame: 80, action: 'moveRight' },
    { frame: 100, action: 'attack' }
  ],
  duration: 120,
  assertions: (state, assert) => {
    assert.isAlive('player');
    assert.healthAbove('player', 30);
    assert.allEnemiesDead();
  }
}
b. EXECUTION
- Run setup function to create entities
- Execute simulateInput with the input sequence
- After duration frames, capture state
- Run assertion functions against the captured state
- Return: { name, passed, duration_ms, failures: [] }
c. RECORDING
- Record mode: capture state snapshot at every 10th frame
- Save recording as JSON: { scenario_name, frames: [stateSnapshot, ...] }
- Replay mode: load recording, re-run scenario, compare state at each
recorded frame (within tolerance of 1px position, 1 health point)
3. ASSERTIONS (src/assertions.js)
Game-specific assertion helpers:
- isAlive(entity) — entity exists and health > 0
- isDead(entity) — entity health <= 0
- healthAbove(entity, value) — entity health > value
- healthEquals(entity, value) — entity health === value
- positionNear(entity, x, y, tolerance) — within tolerance pixels
- allEnemiesDead() — all enemy entities have health <= 0
- enemyCount(n) — exactly n enemies exist in the scene
- collisionOccurred(entityA, entityB) — hitboxes overlapped during simulation
- pickupCollected(type) — pickup of that type was removed from scene
- noClipping(entity) — entity never overlapped a wall during simulation
Each assertion returns { passed: bool, message: string, expected, actual }
4. TEST FILES
a. COLLISION TESTS (tests/collision.test.js)
- "Player collides with enemy at same position" — create both at (200,200),
simulate 1 frame, verify player took damage
- "Player does not collide with distant enemy" — player at (100,100),
enemy at (700,500), simulate 30 frames, verify no damage
- "Player stops at wall" — place wall at x:400, player moving right
from x:100, simulate 200 frames, verify player x <= 400
- "Projectile hits enemy" — player attacks at range toward enemy,
verify enemy took damage
b. SPAWN TESTS (tests/spawn.test.js)
- "Level JSON spawns correct enemy count" — load test-level.json,
verify enemy count matches spawn points in JSON
- "Player spawns at level start position" — load level, verify player
at expected coordinates
- "All entities are within level bounds" — load level, verify no entity
position exceeds grid dimensions * tile size
c. BALANCE TESTS (tests/balance.test.js)
- "Player survives 3 basic enemies with attacks" — setup + inputs
from scenario format, verify player alive with health > 0
- "Player cannot survive 3 basic enemies without attacking" — same
setup, inputs are only movement (no attacks), verify player dies
- "Health potion restores 25 health" — damage player to 50 health,
create potion, walk player over it, verify health is 75
- "Heavy enemy requires 8 hits to defeat" — create heavy enemy (80hp),
simulate 8 attacks (10 damage each), verify enemy dead
d. SCENARIO REPLAY (tests/scenario-replay.test.js)
- "Recorded scenario produces identical results" — load
scenario-recording.json, replay, compare final state to recorded state
- "Scenario is deterministic" — run same scenario twice, compare
states at every checkpoint, verify identical within tolerance
5. FIXTURES
a. test-level.json — 10x8 grid with ground floor, 2 walls, 3 enemy spawns,
1 player spawn, 1 exit, 1 hazard pit
b. scenario-recording.json — 120-frame recording of "defeat 3 enemies"
scenario with state snapshots every 10 frames
6. REPORTER (src/reporter.js)
Generates output/test-report.html:
a. SUMMARY BAR
- Total tests, passed (green), failed (red), skipped (gray)
- Overall pass rate as percentage
- Total run time
- Timestamp
b. TEST TABLE
- Columns: Status (icon), Test Name, Category, Duration, Details
- Passed: green checkmark
- Failed: red X with expandable failure details
- Failure details: expected value, actual value, assertion message
- Grouped by category (Collision, Spawn, Balance, Replay)
c. STYLING
- Dark theme: background #09090b, cards #111118, text #e5e5e5
- Pass/fail colors: green #22c55e, red #ef4444
- Font: system monospace for values, system sans-serif for labels
- Printable with white background media query
7. RUNNER (src/runner.js)
Entry point:
- Usage: node src/runner.js [--filter <pattern>] [--verbose]
- Discovers all .test.js files in tests/
- Runs each test file's exported scenarios sequentially
- Passes harness instance to each test
- Collects results, generates HTML report
- Prints summary to terminal (colored with chalk)
- Exit code: 0 if all pass, 1 if any fail
DEPENDENCIES: phaser (^3.80), chalk, glob
The Phaser package includes a headless mode that runs in Node.js.
💡 QA without a QA team

One human playing through a game tests one path per session. An automated harness tests dozens of scenarios in seconds. This is not replacing playtesting — you still need human eyes on the game. But it catches the dumb stuff (collision off by one pixel, enemy spawns inside a wall, health math wrong) so your playtesting time is spent on the things only humans can evaluate: does it feel good?


What you get

After generation:

game-test-harness/
  package.json
  src/
    runner.js
    harness.js
    scenario.js
    assertions.js
    reporter.js
  tests/
    collision.test.js
    spawn.test.js
    balance.test.js
    scenario-replay.test.js
  fixtures/
    test-level.json
    scenario-recording.json
  output/

Fire it up

Terminal window
cd game-test-harness
npm install
node src/runner.js

You should see terminal output:

Game Test Harness — Running all tests...

COLLISION
  ✓ Player collides with enemy at same position (12ms)
  ✓ Player does not collide with distant enemy (8ms)
  ✓ Player stops at wall (15ms)
  ✓ Projectile hits enemy (11ms)
SPAWN
  ✓ Level JSON spawns correct enemy count (6ms)
  ✓ Player spawns at level start position (5ms)
  ✓ All entities within level bounds (7ms)
BALANCE
  ✓ Player survives 3 basic enemies with attacks (22ms)
  ✓ Player cannot survive without attacking (18ms)
  ✓ Health potion restores 25 health (9ms)
  ✓ Heavy enemy requires 8 hits (14ms)
REPLAY
  ✓ Recorded scenario matches (31ms)
  ✓ Scenario is deterministic (28ms)

13 passed, 0 failed (186ms)
Report: output/test-report.html

Open output/test-report.html in your browser to see the full visual report.

If something is off

Problem: Phaser crashes in Node.js
Follow-up prompt: "Phaser's HEADLESS mode needs a canvas polyfill in Node.js. Install the 'canvas' npm package (npm install canvas) and make sure harness.js sets the Phaser config to type: Phaser.HEADLESS. If the canvas package has native build issues, use jsdom as a lightweight DOM mock instead."

Problem: Physics do not simulate in headless mode
Follow-up prompt: "The physics bodies exist but no collisions are detected during simulateFrames(). Make sure the Phaser game loop is being manually stepped with game.step(16) for each frame, and that the physics world is using Arcade physics with gravity and collision detection enabled in the config."

Problem: Tests run but all fail
Follow-up prompt: "All tests report failures. Check that the harness entity creation methods return references that the assertion functions can inspect. getState() should read from the actual Phaser physics bodies, not from stale variables. After simulateFrames(), call getState() to capture the current physics positions and health values."

Problem: HTML report is empty
Follow-up prompt: "The report file is generated but shows no test results. Make sure the reporter receives the array of test results from the runner and passes them into the HTML template. Each result should have: name, category, passed, duration, and failures array. Check that JSON.stringify is used for embedding data in the HTML script tag."

Deep dive

The test harness has four layers, and each one solves a different problem.

The harness is a headless game instance. Phaser supports a HEADLESS render type that runs game logic and physics without creating a visible canvas. This means you can run the game in Node.js, simulate frames programmatically, and inspect the game state — all without opening a browser. It is the same approach that web testing frameworks like Playwright use: run the application in a controlled environment where you can script interactions and inspect outcomes.
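In Phaser 3 terms, going headless is mostly a config choice. A sketch of the shape — `Phaser.HEADLESS` is the real render type, but the scene callbacks and gravity value here are placeholders, not the generated harness code:

```javascript
// Sketch of a headless Phaser 3 config: game logic and physics, no canvas.
// Phaser.HEADLESS is the real option; scene bodies and gravity are placeholders.
const config = {
  type: Phaser.HEADLESS,            // no rendering pipeline at all
  width: 800,
  height: 600,
  physics: {
    default: 'arcade',
    arcade: { gravity: { y: 300 } } // placeholder value — tune to your game
  },
  scene: {
    create() { /* spawn entities here */ },
    update() { /* per-frame game logic */ }
  },
  autoFocus: false                  // no browser window to focus
};
const game = new Phaser.Game(config);
```

With this config the physics world updates and collisions fire, but nothing is ever drawn — which is exactly what a test environment wants.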

Scenarios are reproducible by design. A scenario defines an initial setup (which entities exist and where), an input sequence (what the player does and when), and assertions (what should be true after execution). Given the same setup and inputs, the physics simulation produces the same results every time. This determinism is what makes automated testing possible — if a test passes today but fails tomorrow, something in your game code changed.
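The replay check described in the prompt — states must match within 1px of position and 1 health point — can be sketched as a plain comparison function (names and state shape are assumptions based on the spec above):

```javascript
// Compare two state snapshots within tolerance, as a replay check might.
// Tolerances mirror the spec: 1px on position, 1 point on health.
function statesMatch(a, b, posTol = 1, hpTol = 1) {
  const entityMatch = (e1, e2) =>
    Math.abs(e1.x - e2.x) <= posTol &&
    Math.abs(e1.y - e2.y) <= posTol &&
    Math.abs(e1.health - e2.health) <= hpTol;
  if (!entityMatch(a.player, b.player)) return false;
  if (a.enemies.length !== b.enemies.length) return false;
  return a.enemies.every((e, i) => entityMatch(e, b.enemies[i]));
}

const recorded = { player: { x: 100,   y: 300, health: 80 }, enemies: [{ x: 310.4, y: 300, health: 20 }] };
const replayed = { player: { x: 100.6, y: 300, health: 80 }, enemies: [{ x: 310.9, y: 300, health: 19 }] };
console.log(statesMatch(recorded, replayed)); // true — all differences are within tolerance
```

The tolerance matters: floating-point physics can drift by fractions of a pixel between runs, and an exact-equality check would produce flaky tests.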

Assertions speak game language. Instead of generic expect(value).toBe(other), the assertion helpers use game concepts: isAlive, allEnemiesDead, healthAbove, noClipping. This makes tests readable. “Player survives 3 basic enemies” with assertion assert.isAlive('player') reads like a game design requirement, because it is one.
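A sketch of how such helpers might be built over a getState() snapshot, returning the { passed, message, expected, actual } shape the prompt specifies — the lookup logic and entity shape are assumptions, not the generated code:

```javascript
// Sketch of game-language assertion helpers over a state snapshot.
// Each helper returns the { passed, message, expected, actual } shape from the spec.
function makeAssert(state) {
  // Assumed lookup: 'player' is special, enemies are found by id.
  const lookup = (name) =>
    name === 'player' ? state.player : state.enemies.find((e) => e.id === name);
  return {
    isAlive(name) {
      const e = lookup(name);
      const actual = e ? e.health : 0;
      return { passed: !!e && e.health > 0, message: `${name} should be alive`, expected: '> 0 health', actual };
    },
    healthAbove(name, value) {
      const actual = lookup(name).health;
      return { passed: actual > value, message: `${name} health above ${value}`, expected: `> ${value}`, actual };
    },
    allEnemiesDead() {
      const alive = state.enemies.filter((e) => e.health > 0).length;
      return { passed: alive === 0, message: 'all enemies should be dead', expected: 0, actual: alive };
    },
  };
}

const checks = makeAssert({ player: { health: 45 }, enemies: [{ id: 'e1', health: 0 }] });
console.log(checks.isAlive('player').passed);         // true
console.log(checks.healthAbove('player', 30).passed); // true
console.log(checks.allEnemiesDead().passed);          // true
```

Because every helper returns the same shape, the reporter can render any failure as expected vs. actual without knowing which assertion produced it.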

The HTML report is your CI dashboard. After every code change, run the test harness. The report shows what passed and what broke. If you add a new enemy type and suddenly “Player survives 3 basic enemies” fails, you know your balance changed. The report is evidence, not opinion.
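The summary bar is the simplest piece of the reporter, and a sketch shows the whole idea — the real reporter adds the table, grouping, and styling, and these names are illustrative:

```javascript
// Minimal sketch of the report summary: turn a results array into an HTML fragment.
// Illustrative names only — the generated reporter is richer than this.
function summaryHtml(results) {
  const passed = results.filter((r) => r.passed).length;
  const failed = results.length - passed;
  const rate = results.length ? Math.round((passed / results.length) * 100) : 0;
  return `<div class="summary">` +
    `<span class="pass">${passed} passed</span> ` +
    `<span class="fail">${failed} failed</span> ` +
    `<span class="rate">${rate}%</span>` +
    `</div>`;
}

const html = summaryHtml([
  { name: 'Player stops at wall', passed: true },
  { name: 'Potion restores 25 health', passed: false },
]);
console.log(html); // contains "1 passed", "1 failed", "50%"
```

Everything else in the report is the same move repeated: map result objects into markup, never hand-edit the HTML.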

🔍 Testing balance, not just bugs

The balance tests are the most interesting category. They encode game design decisions as testable assertions:

  • “Player survives 3 basic enemies” means your combat is not impossibly hard.
  • “Player cannot survive without attacking” means the game requires engagement, not just running through.
  • “Heavy enemy requires 8 hits” means damage numbers are calibrated correctly.

When you tweak enemy health, player damage, or attack speed, these tests tell you immediately whether your changes preserved the intended difficulty curve. That is not bug detection — it is design validation. Professional studios call this “balance regression testing,” and it is one of the most valuable things automated testing can do for a game.
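At its simplest, a balance assertion is arithmetic you can encode directly. A sketch using the stat table from the prompt — `hitsToKill` is an illustrative helper, not harness API:

```javascript
// Balance numbers as testable arithmetic: melee hits needed to defeat each enemy.
// Stats mirror the prompt's ENTITY STATS section; hitsToKill is illustrative.
const PLAYER_MELEE_DAMAGE = 10;
const ENEMY_HEALTH = { basic: 30, fast: 20, heavy: 80 };

const hitsToKill = (type) => Math.ceil(ENEMY_HEALTH[type] / PLAYER_MELEE_DAMAGE);

console.log(hitsToKill('basic')); // 3
console.log(hitsToKill('heavy')); // 8
```

If you later buff melee damage to 15, "Heavy enemy requires 8 hits" fails immediately — the test forces you to decide whether the new number is intentional.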


Customize it

Add visual replay

Add a --replay flag to the runner that generates a frame-by-frame
visual replay of failed tests. For each failed test, output a sequence
of PNG images (one per 10 frames) showing entity positions as colored
rectangles on a grid. Stitch them into an animated GIF using the
gif-encoder package. Save to output/replay-{testname}.gif. This lets
you SEE what went wrong, not just read about it.

Add performance benchmarks

Add a "performance" test category. Each test measures: frames per
second at different entity counts (10, 50, 100, 200 enemies), memory
usage before and after 1000 frames, and entity creation time. Report
results as a table in the HTML report with green/yellow/red indicators
based on thresholds (60fps = green, 30-60 = yellow, <30 = red). This
catches performance regressions before they become player-facing.

Add level validation tests

Add a test that loads every JSON level file from a directory and
validates: all spawn points are on walkable ground (not inside walls),
at least one exit exists, the player spawn is reachable from at least
one exit (basic pathfinding check), no entities are placed outside
the grid bounds. This catches level design errors that would cause
soft-locks or invisible walls.

Try it yourself

  1. Paste the main prompt and generate the project.
  2. Run npm install && node src/runner.js and see all tests pass.
  3. Open the HTML report and review the results. Click a test to see its details.
  4. Break something on purpose: edit fixtures/test-level.json and change an enemy spawn to a position outside the grid. Run the tests again. The spawn test should fail with a clear error message.
  5. Write a new balance test: “Player with 50% health survives 2 basic enemies.” Define the scenario setup, inputs, and assertions. Run the harness and check whether your game supports that difficulty claim.
  6. Run with --verbose to see frame-by-frame state output for debugging.

Key takeaways

  • Automated testing catches regressions instantly. Change enemy damage, run tests, see what broke. No manual playtesting required for known scenarios.
  • Headless Phaser runs in Node.js. The same physics engine that runs in the browser runs in your test environment. Results are identical.
  • Balance tests encode design decisions. “Player survives 3 enemies” is not just a test — it is a documented design constraint that breaks loudly if you violate it.
  • Scenario recording ensures determinism. Record a playthrough, save it, replay it later. If the result changes, your code changed. That is the definition of a regression.
  • The HTML report is shareable evidence. Send it to a collaborator, show it to an investor, attach it to a build. “All 13 tests pass” is more convincing than “I played through it and it seemed fine.”
💡 Run tests after every change

The harness takes seconds to run. Make it a habit: edit code, run tests, check report. This loop catches problems when they are small and cheap to fix, not after you have built three more features on top of a broken foundation. That discipline separates hobby projects from shippable products.


What’s next

You have combat, movement, VFX, a HUD, dialogue, a level editor, and automated tests. Every piece exists. None of them are connected. In the next lesson you will build the Playable Demo Assembly — a single cohesive 2-minute playable demo that stitches together the combat loop, movement system, visual effects, HUD, and dialogue into one complete level with a title screen, mission briefing, combat encounters, a boss fight, and a victory screen. That is the artifact that proves the concept works.