CI: Benchmarking task runs during code coverage test causing interference #36


david · 2026-03-19T00:18:55Z

david commented

2026-03-19 00:18:55 +00:00

## Problem

The `mise run ci` task runs `test-cov`, which executes `pytest --cov=engine`. This runs **all** tests, including the `@pytest.mark.benchmark` tests in `tests/test_benchmark.py`.

This causes:

1. Unintended benchmark execution during coverage collection
2. Potential interference with coverage statistics
3. Slower CI runs due to performance test overhead
4. Unreliable benchmark timings when benchmarks are sensitive to system load

## Current State

- `mise.toml` line 53: `ci = { run = "mise run topics-init && mise run lint && mise run test-cov", depends = [...] }`
- `pyproject.toml` registers the `benchmark` marker (applied in tests as `@pytest.mark.benchmark`)
- No explicit exclusion of benchmarks in the `test-cov` command

## Proposed Solution

1. **Add benchmark exclusion to `test-cov`**: Modify `test-cov` to exclude benchmark tests using `-m "not benchmark"`
2. **Create dedicated benchmark task**: Add `benchmark` task to `mise.toml` for explicit benchmark execution
3. **Update CI to run benchmarks separately**: Modify `ci` task to run benchmarks in a separate step after coverage
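
For illustration, the first two steps could look roughly like the fragment below. This is a minimal sketch rather than the repository's actual configuration: the `benchmark` task body is an assumption, and the `uv run` prefix assumes tests are launched through `uv`.

```toml
# mise.toml (sketch; task bodies are assumptions)
[tasks]
# Coverage run: deselect anything marked with @pytest.mark.benchmark.
test-cov = { run = 'uv run pytest --cov=engine -m "not benchmark"' }
# Dedicated benchmark run: select only the marked tests.
benchmark = { run = 'uv run pytest -m benchmark' }
```

With pytest's `-m` option, `"not benchmark"` and `benchmark` select complementary sets of tests, so each test runs in exactly one of the two tasks.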

## Files to Modify

- `mise.toml`: Update `test-cov` and add `benchmark` task
- `pyproject.toml`: Ensure benchmark marker is properly configured
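
As a rough sketch of what "properly configured" could mean here, a custom pytest marker is usually registered under `[tool.pytest.ini_options]`. The description text after the colon is an assumption, not taken from the repository.

```toml
# pyproject.toml (sketch; the marker description is an assumption)
[tool.pytest.ini_options]
markers = [
    "benchmark: performance benchmarks, excluded from coverage runs",
]
```

Registering the marker avoids pytest's unknown-marker warning, and lets `--strict-markers` reject misspelled marker names.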

## Acceptance Criteria

- `mise run test-cov` excludes benchmark tests
- `mise run benchmark` runs only benchmark tests
- `mise run ci` runs coverage, then benchmarks in sequence


david commented

2026-03-19 03:42:49 +00:00

## Root Cause Identified (2026-03-18)

The issue is NOT that benchmarks run during `test-cov`; they are already excluded. Line 13 in `mise.toml`:

```toml
test-cov = { run = "uv run pytest --cov=engine --cov-report=term-missing -m \"not benchmark\"", ... }
```

The `-m "not benchmark"` option already excludes benchmarks from coverage runs.

### The Actual Problem

Line 54 in `mise.toml` has a configuration error:

```toml
ci = { run = "mise run topics-init && mise run lint && mise run test-cov && mise run benchmark", depends = ["topics-init", "lint", "test-cov", "benchmark"] }
```

**This runs everything TWICE:**

1. First, `depends` runs the tasks: `topics-init`, `lint`, `test-cov`, `benchmark`
2. Then, `run` invokes the same tasks again: `mise run topics-init && ...`

### Fix Required

The `ci` task should list these tasks in EITHER `depends` OR `run`, not both. Recommended fix:

```toml
ci = { run = "mise run topics-init && mise run lint && mise run test-cov && mise run benchmark" }
```

(Remove the redundant `depends` list)

Or alternatively:

```toml
ci = { depends = ["topics-init", "lint", "test-cov", "benchmark"] }
```

(But this doesn't print which tasks are running)

We'll implement the first option: use `run` only, without `depends`.
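
If the one-line form becomes hard to read, the same `run`-only fix could also be written as a task table. This is a sketch of an equivalent layout, not the change that was committed; it assumes mise's array form of `run`, where commands are executed in series.

```toml
# mise.toml (sketch; equivalent layout of the run-only option)
[tasks.ci]
# No `depends` here, so each step is invoked exactly once, in order.
run = [
    "mise run topics-init",
    "mise run lint",
    "mise run test-cov",
    "mise run benchmark",
]
```

This form would replace the inline `ci = { ... }` entry rather than sit alongside it, since a task can only be defined once.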

Should we proceed with fixing this in Phase 4 (Benchmarking improvements)?


david referenced this issue

2026-03-19 05:32:36 +00:00

Bug: PIL-based block height estimation causes severe performance degradation #38

david closed this issue

2026-03-19 05:34:10 +00:00

david referenced this issue from a commit

2026-03-19 11:18:53 +00:00

fix(performance): use simple height estimation instead of PIL rendering
