
# Agent Development Guide

## Development Environment

This project uses:
- **mise** (mise.jdx.dev) - tool version manager and task runner
- **hk** (hk.jdx.dev) - git hook manager
- **uv** - fast Python package installer
- **ruff** - linter and formatter
- **pytest** - test runner

### Setup

```bash
# Install dependencies
mise run install

# Or equivalently:
uv sync --all-extras   # includes mic, websocket, sixel support
```

### Available Commands

```bash
mise run test           # Run tests
mise run test-v         # Run tests verbose
mise run test-cov       # Run tests with coverage report
mise run test-browser   # Run e2e browser tests (requires playwright)
mise run lint           # Run ruff linter
mise run lint-fix       # Run ruff with auto-fix
mise run format         # Run ruff formatter
mise run ci             # Full CI pipeline (topics-init + lint + test-cov)
```

### Runtime Commands

```bash
mise run run            # Run mainline (terminal)
mise run run-poetry     # Run with poetry feed
mise run run-firehose   # Run in firehose mode
mise run run-websocket  # Run with WebSocket display only
mise run run-sixel      # Run with Sixel graphics display
mise run run-both       # Run with both terminal and WebSocket
mise run run-client     # Run both + open browser
mise run cmd            # Run C&C command interface
```

## Git Hooks

**At the start of every agent session**, verify hooks are installed:

```bash
ls -la .git/hooks/pre-commit
```

If hooks are not installed, install them with:

```bash
hk init --mise
mise run pre-commit
```

**IMPORTANT**: Always review the hk documentation before modifying `hk.pkl`:
- [hk Configuration Guide](https://hk.jdx.dev/configuration.html)
- [hk Hooks Reference](https://hk.jdx.dev/hooks.html)
- [hk Builtins](https://hk.jdx.dev/builtins.html)

The project uses hk configured in `hk.pkl`:
- **pre-commit**: runs ruff-format and ruff (with auto-fix)
- **pre-push**: runs ruff check + benchmark hook

## Benchmark Runner

Run performance benchmarks:

```bash
mise run benchmark         # Run all benchmarks (text output)
mise run benchmark-json    # Run benchmarks (JSON output)
mise run benchmark-report  # Run benchmarks (Markdown report)
```

### Benchmark Commands

```bash
# Run benchmarks
uv run python -m engine.benchmark

# Run with specific displays/effects
uv run python -m engine.benchmark --displays null,terminal --effects fade,glitch

# Save baseline for hook comparisons
uv run python -m engine.benchmark --baseline

# Run in hook mode (compares against baseline)
uv run python -m engine.benchmark --hook

# Hook mode with custom threshold (default: 20% degradation)
uv run python -m engine.benchmark --hook --threshold 0.3

# Custom baseline location
uv run python -m engine.benchmark --hook --cache /path/to/cache.json
```

### Hook Mode

The `--hook` mode compares the current benchmark results against a baseline previously saved with `--baseline`. If performance degrades by more than the threshold (default 20%; `--threshold` takes a fraction, so `0.3` means 30%), the run exits with code 1. This keeps performance regressions out of feature branches.

The pre-push hook runs the benchmark in hook mode, so regressions are caught before they are pushed.
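
The comparison described above can be illustrated with a minimal sketch. This is not the code in `engine.benchmark`. The function names, the result format (a mapping from benchmark name to a score), and the assumption that higher scores are better (for example frames per second) are all hypothetical.

```python
def find_regressions(
    baseline: dict[str, float],
    current: dict[str, float],
    threshold: float = 0.2,
) -> dict[str, float]:
    """Return {benchmark name: fractional drop} for every benchmark
    whose score fell by more than `threshold` relative to the baseline.

    Assumption: scores are "higher is better" (e.g. frames per second).
    """
    regressions: dict[str, float] = {}
    for name, base in baseline.items():
        cur = current.get(name)
        if cur is None or base <= 0:
            continue  # nothing comparable for this benchmark
        drop = (base - cur) / base  # 0.25 means 25% slower than baseline
        if drop > threshold:
            regressions[name] = drop
    return regressions


def hook_exit_code(
    baseline: dict[str, float],
    current: dict[str, float],
    threshold: float = 0.2,
) -> int:
    """Exit code a hook could return: 1 on any regression, else 0."""
    return 1 if find_regressions(baseline, current, threshold) else 0


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    baseline = {"null+fade": 100.0, "terminal+glitch": 50.0}
    current = {"null+fade": 85.0, "terminal+glitch": 35.0}
    for name, drop in find_regressions(baseline, current).items():
        print(f"{name}: {drop:.0%} slower than baseline")
    raise SystemExit(hook_exit_code(baseline, current))
```

A git hook treats any non-zero exit status as failure, which is why returning 1 is enough to stop the push.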

## Workflow Rules

### Before Committing

1. **Always run the test suite** - never commit code that fails tests:
   ```bash
   mise run test
   ```

2. **Always run the linter**:
   ```bash
   mise run lint
   ```

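3. **Fix any lint errors** before committing. `mise run lint-fix` and the pre-commit hook apply ruff's automatic fixes, but errors that ruff cannot fix automatically must be fixed by hand.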

4. **Review your changes** using `git diff` to understand what will be committed.

### On Failing Tests

When tests fail, **determine whether it's an out-of-date test or a correctly failing test**:

- **Out-of-date test**: The test was written for old behavior that has legitimately changed. Update the test to match the new expected behavior.

- **Correctly failing test**: The test correctly identifies a broken contract. Fix the implementation, not the test.

**Never** modify a test to make it pass without understanding why it failed.

### Code Review

Before committing significant changes:
- Run `git diff` to review all changes
- Ensure new code follows existing patterns in the codebase
- Check that type hints are added for new functions
- Verify that tests exist for new functionality

## Testing

Tests live in `tests/` and follow the pattern `test_*.py`.

Run all tests:
```bash
mise run test
```

Run with coverage:
```bash
mise run test-cov
```

The project uses pytest with strict marker enforcement, so any custom marker must be registered before it is used. Test configuration is in `pyproject.toml` under `[tool.pytest.ini_options]`.

## Architecture Notes

- **ntfy.py** and **mic.py** are standalone modules with zero internal dependencies
- **eventbus.py** provides thread-safe event publishing for decoupled communication
- **controller.py** coordinates ntfy/mic monitoring and event publishing
- **effects/** is a plugin architecture with performance monitoring
- The render pipeline runs in this order: fetch → render → effects → scroll → terminal output
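
To make "thread-safe event publishing" concrete, here is a minimal sketch of one common way to build such a bus. It is not the code in **eventbus.py**. The class name, method names, and event-type strings are assumptions for illustration.

```python
import threading
from collections import defaultdict
from typing import Any, Callable

Handler = Callable[[Any], None]


class EventBus:
    """Hypothetical thread-safe publish/subscribe bus."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        """Register a handler for one event type."""
        with self._lock:
            self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: Any = None) -> int:
        """Call every handler for `event_type`; return how many ran."""
        # Copy the handler list under the lock, then call handlers
        # outside it so a handler can subscribe or publish without
        # deadlocking on the same lock.
        with self._lock:
            handlers = list(self._handlers.get(event_type, ()))
        for handler in handlers:
            handler(payload)
        return len(handlers)


if __name__ == "__main__":
    bus = EventBus()
    bus.subscribe("ntfy_message", lambda msg: print("got:", msg))
    bus.publish("ntfy_message", "hello")
```

Publishers and subscribers only share the event-type string, which is what lets modules such as the ntfy and mic monitors stay free of internal dependencies.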

### Display System

- **Display abstraction** (`engine/display.py`): swap display backends via the Display protocol
  - `TerminalDisplay` - ANSI terminal output
  - `WebSocketDisplay` - broadcasts to web clients via WebSocket
  - `SixelDisplay` - renders to Sixel graphics (pure Python, no C dependency)
  - `MultiDisplay` - forwards to multiple displays simultaneously

- **WebSocket display** (`engine/websocket_display.py`): real-time frame broadcasting to web browsers
  - WebSocket server on port 8765
  - HTTP server on port 8766 (serves HTML client)
  - Client at `client/index.html` with ANSI color parsing and fullscreen support

- **Display modes** (`--display` flag):
  - `terminal` - Default ANSI terminal output
  - `websocket` - Web browser display (requires websockets package)
  - `sixel` - Sixel graphics in supported terminals (iTerm2, mintty, etc.)
  - `both` - Terminal + WebSocket simultaneously
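
The backend-swapping idea can be sketched with `typing.Protocol`. This is an illustration only, not the contents of `engine/display.py`. The `show` method name, the frame type (a list of text lines), and `RecordingDisplay` are assumptions; only the class names `Display` and `MultiDisplay` come from the list above.

```python
from typing import Protocol


class Display(Protocol):
    """Anything with a compatible `show` method counts as a display."""

    def show(self, frame: list[str]) -> None: ...


class RecordingDisplay:
    """Hypothetical backend that stores frames instead of drawing them."""

    def __init__(self) -> None:
        self.frames: list[list[str]] = []

    def show(self, frame: list[str]) -> None:
        self.frames.append(list(frame))


class MultiDisplay:
    """Forwards each frame to every wrapped display, in order."""

    def __init__(self, displays: list[Display]) -> None:
        self.displays = list(displays)

    def show(self, frame: list[str]) -> None:
        for display in self.displays:
            display.show(frame)


if __name__ == "__main__":
    first, second = RecordingDisplay(), RecordingDisplay()
    both = MultiDisplay([first, second])
    both.show(["line one", "line two"])
    print(len(first.frames), len(second.frames))
```

Because `Protocol` uses structural typing, a backend does not need to inherit from `Display`. A fan-out wrapper like this also satisfies the protocol itself, so the renderer can treat one display and many displays the same way.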

### Command & Control

- C&C uses separate ntfy topics for commands and responses
- `NTFY_CC_CMD_TOPIC` - commands from cmdline.py
- `NTFY_CC_RESP_TOPIC` - responses back to cmdline.py
- Effects controller handles `/effects` commands (list, on/off, intensity, reorder, stats)
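
As an illustration of how `/effects` commands might be dispatched, the sketch below parses a command string into a structured value. The exact command syntax is not specified in this guide, so the grammar used here (`/effects list`, `/effects stats`, `/effects <name> on|off`, `/effects <name> intensity <value>`, `/effects reorder <a>,<b>`) is an assumption, as are the names `EffectsCommand` and `parse_effects_command`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EffectsCommand:
    """Parsed form of a hypothetical /effects command."""

    action: str                       # list, stats, on, off, intensity, reorder
    effect: Optional[str] = None      # target effect name, if any
    value: Optional[float] = None     # intensity value, if any
    order: Optional[list[str]] = None  # new effect order, if any


def parse_effects_command(text: str) -> EffectsCommand:
    """Parse the assumed grammar; raise ValueError on anything else."""
    parts = text.strip().split()
    if not parts or parts[0] != "/effects":
        raise ValueError("not an /effects command")
    args = parts[1:]
    if args in (["list"], ["stats"]):
        return EffectsCommand(action=args[0])
    if len(args) == 2 and args[0] == "reorder":
        return EffectsCommand(action="reorder", order=args[1].split(","))
    if len(args) == 2 and args[1] in ("on", "off"):
        return EffectsCommand(action=args[1], effect=args[0])
    if len(args) == 3 and args[1] == "intensity":
        return EffectsCommand(
            action="intensity", effect=args[0], value=float(args[2])
        )
    raise ValueError(f"unrecognized /effects command: {text!r}")


if __name__ == "__main__":
    print(parse_effects_command("/effects glitch intensity 0.5"))
```

In a setup like the one described, the text would arrive on the command topic and the result of handling it would be published on the response topic.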