
# Agent Development Guide

## Development Environment

This project uses:
- **mise** (mise.jdx.dev) - tool version manager and task runner
- **hk** (hk.jdx.dev) - git hook manager
- **uv** - fast Python package installer
- **ruff** - linter and formatter
- **pytest** - test runner

### Setup

```bash
# Install dependencies
mise run install

# Or equivalently:
uv sync --all-extras   # includes mic, websocket, sixel support
```

### Available Commands

```bash
mise run test           # Run tests
mise run test-v         # Run tests verbose
mise run test-cov       # Run tests with coverage report
mise run test-browser   # Run e2e browser tests (requires playwright)
mise run lint           # Run ruff linter
mise run lint-fix       # Run ruff with auto-fix
mise run format         # Run ruff formatter
mise run ci             # Full CI pipeline (topics-init + lint + test-cov)
```

### Runtime Commands

```bash
mise run run            # Run mainline (terminal)
mise run run-poetry     # Run with poetry feed
mise run run-firehose   # Run in firehose mode
mise run run-websocket  # Run with WebSocket display only
mise run run-sixel      # Run with Sixel graphics display
mise run run-both       # Run with both terminal and WebSocket
mise run run-client     # Run both + open browser
mise run cmd            # Run C&C command interface
```

## Git Hooks

**At the start of every agent session**, verify hooks are installed:

```bash
ls -la .git/hooks/pre-commit
```

If hooks are not installed, install them with:

```bash
hk init --mise
mise run pre-commit
```

**IMPORTANT**: Always review the hk documentation before modifying `hk.pkl`:
- [hk Configuration Guide](https://hk.jdx.dev/configuration.html)
- [hk Hooks Reference](https://hk.jdx.dev/hooks.html)
- [hk Builtins](https://hk.jdx.dev/builtins.html)

hk is configured in `hk.pkl`:
- **pre-commit**: runs ruff-format and ruff (with auto-fix)
- **pre-push**: runs `ruff check` and the benchmark in hook mode

## Benchmark Runner

Run performance benchmarks:

```bash
mise run benchmark          # Run all benchmarks (text output)
mise run benchmark-json     # Run benchmarks (JSON output)
mise run benchmark-report   # Run benchmarks (Markdown report)
```

### Benchmark Commands

```bash
# Run benchmarks
uv run python -m engine.benchmark

# Run with specific displays/effects
uv run python -m engine.benchmark --displays null,terminal --effects fade,glitch

# Save baseline for hook comparisons
uv run python -m engine.benchmark --baseline

# Run in hook mode (compares against baseline)
uv run python -m engine.benchmark --hook

# Hook mode with custom threshold (default: 20% degradation)
uv run python -m engine.benchmark --hook --threshold 0.3

# Custom baseline location
uv run python -m engine.benchmark --hook --cache /path/to/cache.json
```

### Hook Mode

`--hook` mode compares the current benchmark results against a saved baseline (created with `--baseline`). If performance degrades by more than the threshold (default 20%, adjustable with `--threshold`), the command exits with code 1, which keeps performance regressions in feature branches from going unnoticed.

The pre-push hook runs benchmark in hook mode to catch performance regressions before pushing.
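
The threshold check described above can be sketched as follows. This is a minimal illustration, not the code in `engine.benchmark`: it assumes results are stored as a mapping from benchmark name to frames per second (higher is better), and the names `find_regressions` and `hook_exit_code` are hypothetical.

```python
def find_regressions(
    baseline: dict[str, float],
    current: dict[str, float],
    threshold: float = 0.2,
) -> list[str]:
    """Return names of benchmarks whose FPS dropped by more than `threshold`.

    Assumption: values are frames per second, so a lower number is worse.
    """
    regressions = []
    for name, base_fps in baseline.items():
        fps = current.get(name)
        if fps is None or base_fps <= 0:
            continue  # nothing meaningful to compare against
        degradation = (base_fps - fps) / base_fps
        if degradation > threshold:
            regressions.append(name)
    return regressions


def hook_exit_code(
    baseline: dict[str, float],
    current: dict[str, float],
    threshold: float = 0.2,
) -> int:
    """Mirror the documented behavior: exit code 1 if anything regressed."""
    return 1 if find_regressions(baseline, current, threshold) else 0
```

With a 20% threshold, a benchmark that falls from 50 FPS to 35 FPS (a 30% drop) would fail the hook, while a fall from 100 FPS to 85 FPS (15%) would pass.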

## Workflow Rules

### Before Committing

1. **Always run the test suite** - never commit code that fails tests:
   ```bash
   mise run test
   ```

2. **Always run the linter**:
   ```bash
   mise run lint
   ```

3. **Fix any lint errors** before committing. The pre-commit hook auto-fixes what ruff can fix; anything it cannot fix must be corrected by hand.

4. **Review your changes** using `git diff` to understand what will be committed.

### On Failing Tests

When tests fail, **determine whether it's an out-of-date test or a correctly failing test**:

- **Out-of-date test**: The test was written for old behavior that has legitimately changed. Update the test to match the new expected behavior.

- **Correctly failing test**: The test correctly identifies a broken contract. Fix the implementation, not the test.

**Never** modify a test to make it pass without understanding why it failed.

### Code Review

Before committing significant changes:
- Run `git diff` to review all changes
- Ensure new code follows existing patterns in the codebase
- Check that type hints are added for new functions
- Verify that tests exist for new functionality

## Testing

Tests live in `tests/` and follow the pattern `test_*.py`.

Run all tests:
```bash
mise run test
```

Run with coverage:
```bash
mise run test-cov
```

The project uses pytest with strict marker enforcement. Test configuration is in `pyproject.toml` under `[tool.pytest.ini_options]`.

### Test Coverage Strategy

Current coverage: 56% (433 tests)

Key areas with lower coverage (acceptable for now):
- **app.py** (8%): Main entry point - integration heavy, requires terminal
- **scroll.py** (10%): Terminal-dependent rendering logic
- **benchmark.py** (0%): Standalone benchmark tool, runs separately

Key areas with good coverage:
- **display/backends/null.py** (95%): Easy to test headlessly
- **display/backends/terminal.py** (96%): Uses mocking
- **display/backends/multi.py** (100%): Simple forwarding logic
- **effects/performance.py** (99%): Pure Python logic
- **eventbus.py** (96%): Simple event system
- **effects/controller.py** (95%): Effects command handling

Areas needing more tests:
- **websocket.py** (48%): Network I/O, hard to test in CI
- **ntfy.py** (50%): Network I/O, hard to test in CI
- **mic.py** (61%): Audio I/O, hard to test in CI

Note: Terminal-dependent modules (scroll, layer rendering) are harder to test in CI.

Performance regression tests are in `tests/test_benchmark.py` and are marked with `@pytest.mark.benchmark`.

## Architecture Notes

- **ntfy.py** and **mic.py** are standalone modules with zero internal dependencies
- **eventbus.py** provides thread-safe event publishing for decoupled communication
- **controller.py** coordinates ntfy/mic monitoring and event publishing
- **effects/** - plugin architecture with performance monitoring
- The render pipeline: fetch → render → effects → scroll → terminal output
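
The thread-safe publishing mentioned above can be illustrated with a small sketch. This is not the API of `eventbus.py`: the class shape, the `subscribe`/`publish` method names, and the string event types are all assumptions made for the example.

```python
import threading
from collections import defaultdict
from typing import Any, Callable

Handler = Callable[[Any], None]


class EventBus:
    """Hypothetical minimal event bus; the real eventbus.py may differ."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        with self._lock:
            self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: Any = None) -> None:
        # Copy the handler list while holding the lock, then call handlers
        # outside it so a handler can subscribe or publish without deadlocking.
        with self._lock:
            handlers = list(self._handlers.get(event_type, ()))
        for handler in handlers:
            handler(payload)
```

Publishers such as the ntfy or mic monitors only need a reference to the bus, not to the subscribers, which is what keeps the modules decoupled.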

### Pipeline Architecture

The new Stage-based pipeline resolves dependencies between stages by capability rather than by stage name. Its main components:

- **Stage** (`engine/pipeline/core.py`): Base class for pipeline stages
- **Pipeline** (`engine/pipeline/controller.py`): Executes stages with capability-based dependency resolution
- **StageRegistry** (`engine/pipeline/registry.py`): Discovers and registers stages
- **Stage Adapters** (`engine/pipeline/adapters.py`): Wraps existing components as stages

#### Capability-Based Dependencies

Stages declare capabilities (what they provide) and dependencies (what they need). The Pipeline resolves dependencies using prefix matching:
- `"source"` matches `"source.headlines"`, `"source.poetry"`, etc.
- This allows flexible composition without hardcoding specific stage names
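
A minimal sketch of prefix-based resolution, under stated assumptions: `StageSpec`, `capability_matches`, and `resolve_order` are hypothetical names, and the sketch treats a prefix as matching only on a dotted segment boundary (so `"source"` does not match `"sourcecode"`). The real `Pipeline` class may resolve dependencies differently.

```python
from dataclasses import dataclass, field


@dataclass
class StageSpec:
    """Hypothetical stand-in for a stage's declared metadata."""

    name: str
    capabilities: set[str] = field(default_factory=set)
    dependencies: set[str] = field(default_factory=set)


def capability_matches(required: str, provided: str) -> bool:
    """True if `provided` equals `required` or sits under it as a dotted prefix."""
    return provided == required or provided.startswith(required + ".")


def resolve_order(stages: list[StageSpec]) -> list[str]:
    """Order stage names so each dependency is provided by an earlier stage."""
    available: set[str] = set()
    ordered: list[str] = []
    pending = list(stages)
    while pending:
        ready, waiting = [], []
        for stage in pending:
            satisfied = all(
                any(capability_matches(dep, cap) for cap in available)
                for dep in stage.dependencies
            )
            (ready if satisfied else waiting).append(stage)
        if not ready:
            names = ", ".join(stage.name for stage in waiting)
            raise ValueError(f"unresolvable dependencies for: {names}")
        for stage in ready:
            ordered.append(stage.name)
            available |= stage.capabilities
        pending = waiting
    return ordered
```

Because a stage asks for `"source"` rather than for a specific stage, swapping a headlines source for a poetry source requires no change to the stages downstream.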

#### Sensor Framework

- **Sensor** (`engine/sensors/__init__.py`): Base class for real-time input sensors
- **SensorRegistry**: Discovers available sensors
- **SensorStage**: Pipeline adapter that provides sensor values to effects
- **MicSensor** (`engine/sensors/mic.py`): Self-contained microphone input
- **OscillatorSensor** (`engine/sensors/oscillator.py`): Test sensor for development

Sensors support param bindings to drive effect parameters in real-time.
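
As an illustration of a param binding, the sketch below maps a normalized sensor reading onto an effect parameter range. It assumes sensors emit values in `[0, 1]`; `SineOscillator`, `ParamBinding`, and their methods are hypothetical and are not the classes in `engine/sensors/`.

```python
import math
from dataclasses import dataclass


class SineOscillator:
    """Hypothetical test sensor: emits a sine wave scaled into [0, 1]."""

    def __init__(self, frequency_hz: float = 1.0) -> None:
        self.frequency_hz = frequency_hz

    def read(self, t: float) -> float:
        """Return the sensor value at time `t` (seconds)."""
        return 0.5 + 0.5 * math.sin(2 * math.pi * self.frequency_hz * t)


@dataclass
class ParamBinding:
    """Hypothetical binding from a sensor value to one effect parameter."""

    param: str
    low: float
    high: float

    def apply(self, sensor_value: float, params: dict[str, float]) -> None:
        # Clamp first so a noisy sensor cannot push the parameter out of range.
        clamped = min(1.0, max(0.0, sensor_value))
        params[self.param] = self.low + (self.high - self.low) * clamped
```

Calling `binding.apply(sensor.read(t), effect_params)` once per frame would then drive the parameter in real time.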

### Preset System

Presets are written in TOML and need no external dependencies. They are read from three locations:

- Built-in: `engine/presets.toml`
- User config: `~/.config/mainline/presets.toml`
- Local override: `./presets.toml`

- **Preset loader** (`engine/pipeline/preset_loader.py`): Loads and validates presets
- **PipelinePreset** (`engine/pipeline/presets.py`): Dataclass for preset configuration

Functions:
- `validate_preset()` - Validate preset structure
- `validate_signal_path()` - Detect circular dependencies
- `generate_preset_toml()` - Generate skeleton preset
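
Circular-dependency detection of the kind `validate_signal_path()` is described as doing can be sketched with a depth-first search. This is a generic technique, not the project's implementation; the `find_cycle` name and the adjacency-list input are assumptions.

```python
from typing import Optional


def find_cycle(graph: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one cycle as a list of nodes (first node repeated), or None."""
    unvisited, in_progress, done = 0, 1, 2
    state: dict[str, int] = {}
    path: list[str] = []

    def visit(node: str) -> Optional[list[str]]:
        state[node] = in_progress
        path.append(node)
        for nxt in graph.get(node, ()):
            status = state.get(nxt, unvisited)
            if status == in_progress:
                # Reached a node already on the current path: a cycle.
                return path[path.index(nxt):] + [nxt]
            if status == unvisited:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        state[node] = done
        return None

    for node in graph:
        if state.get(node, unvisited) == unvisited:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

Returning the offending path, rather than a bare boolean, makes it possible to tell the user which stages form the loop.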

### Display System

- **Display abstraction** (`engine/display/`): swap display backends via the Display protocol
  - `display/backends/terminal.py` - ANSI terminal output
  - `display/backends/websocket.py` - broadcasts to web clients via WebSocket
  - `display/backends/sixel.py` - renders to Sixel graphics (pure Python, no C dependency)
  - `display/backends/null.py` - headless display for testing
  - `display/backends/multi.py` - forwards to multiple displays simultaneously
  - `display/__init__.py` - DisplayRegistry for backend discovery

- **WebSocket display** (`engine/display/backends/websocket.py`): real-time frame broadcasting to web browsers
  - WebSocket server on port 8765
  - HTTP server on port 8766 (serves HTML client)
  - Client at `client/index.html` with ANSI color parsing and fullscreen support

- **Display modes** (`--display` flag):
  - `terminal` - Default ANSI terminal output
  - `websocket` - Web browser display (requires websockets package)
  - `sixel` - Sixel graphics in supported terminals (iTerm2, mintty, etc.)
  - `both` - Terminal + WebSocket simultaneously
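
The backend-swapping idea can be sketched with `typing.Protocol`. The single `show()` method and the class bodies below are assumptions for illustration; the real Display protocol in `engine/display/` likely has a different, larger interface.

```python
from typing import Protocol


class Display(Protocol):
    """Hypothetical protocol: anything with a matching show() is a display."""

    def show(self, frame: list[str]) -> None: ...


class NullDisplay:
    """Headless backend that records frames, convenient in tests."""

    def __init__(self) -> None:
        self.frames: list[list[str]] = []

    def show(self, frame: list[str]) -> None:
        self.frames.append(list(frame))


class MultiDisplay:
    """Forwards every frame to each wrapped display."""

    def __init__(self, displays: list[Display]) -> None:
        self._displays = list(displays)

    def show(self, frame: list[str]) -> None:
        for display in self._displays:
            display.show(frame)
```

Because the protocol is structural, backends do not need to inherit from a common base class, and a forwarding backend like the one behind `both` is itself a valid display.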

### Effect Plugin System

- **EffectPlugin ABC** (`engine/effects/types.py`): abstract base class for effects
  - All effects must inherit from EffectPlugin and implement `process()` and `configure()`
  - Runtime discovery via `effects_plugins/__init__.py` using `issubclass()` checks

- **EffectRegistry** (`engine/effects/registry.py`): manages registered effects
- **EffectChain** (`engine/effects/chain.py`): chains effects in pipeline order
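
A minimal sketch of an effect plugin and `issubclass()`-based discovery follows. The method signatures, the `name` attribute, the `UppercaseEffect` example, and the `discover_effects` helper are all hypothetical; only the requirement to implement `process()` and `configure()` comes from the description above.

```python
import inspect
from abc import ABC, abstractmethod


class EffectPlugin(ABC):
    """Hypothetical base class; the real signatures may differ."""

    name = "base"

    @abstractmethod
    def process(self, lines: list[str]) -> list[str]:
        """Transform one frame of text lines."""

    @abstractmethod
    def configure(self, **options) -> None:
        """Apply runtime settings."""


class UppercaseEffect(EffectPlugin):
    """Toy effect used only to demonstrate the plugin shape."""

    name = "uppercase"

    def __init__(self) -> None:
        self.enabled = True

    def configure(self, **options) -> None:
        self.enabled = bool(options.get("enabled", self.enabled))

    def process(self, lines: list[str]) -> list[str]:
        return [line.upper() for line in lines] if self.enabled else list(lines)


def discover_effects(namespace: dict) -> dict[str, type[EffectPlugin]]:
    """Collect concrete EffectPlugin subclasses from a module namespace."""
    found = {}
    for obj in namespace.values():
        if (
            inspect.isclass(obj)
            and issubclass(obj, EffectPlugin)
            and not inspect.isabstract(obj)
        ):
            found[obj.name] = obj
    return found
```

Skipping abstract classes keeps the base class itself out of the registry, so only effects that implement both methods are discovered.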

### Command & Control

- C&C uses separate ntfy topics for commands and responses
- `NTFY_CC_CMD_TOPIC` - commands from cmdline.py
- `NTFY_CC_RESP_TOPIC` - responses back to cmdline.py
- Effects controller handles `/effects` commands (list, on/off, intensity, reorder, stats)

### Pipeline Documentation

The rendering pipeline is documented in `docs/PIPELINE.md` using Mermaid diagrams.

**IMPORTANT**: When making significant architectural changes to the rendering pipeline (new layers, effects, display backends), update `docs/PIPELINE.md` to reflect the changes:
1. Edit `docs/PIPELINE.md` with the new architecture
2. If adding new SVG diagrams, render them manually using an external tool (e.g., Mermaid Live Editor)
3. Commit both the markdown and any new diagram files