Interactive Lesson

Roadmap & Changelog

Tip

The foundation is complete. The next phase extends the agent carefully: more capability, paired with better observability and control. Every new tool should be easier to observe, constrain, and undo than the last.

Roadmap

Planned Improvements

  • Stronger guardrails around file writes and Python execution.
  • Improved logging so every tool call and result is easy to inspect after the fact.
  • More tests around filesystem tools and the agent loop.
  • Keep provider differences fully isolated from the shared tool layer.
  • Try more local and hosted models, then compare reliability and cost.
  • Add new tools one at a time, only after the existing set is stable.
  • Run the agent on another small codebase after committing the current state.
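As a rough illustration of the logging item above, here is a minimal sketch of a decorator that records each tool call and its result as one JSON line. The names `logged_tool`, `TOOL_LOG`, and `read_note` are hypothetical and are not part of the project; a real agent would more likely append to a log file than to an in-memory list.

```python
import functools
import json
import time

# Hypothetical in-memory log; swap for a file or logger in a real agent.
TOOL_LOG: list = []


def logged_tool(func):
    """Record every call to a tool function as one JSON line."""

    @functools.wraps(func)
    def wrapper(**kwargs):
        started = time.perf_counter()
        entry = {"tool": func.__name__, "args": kwargs,
                 "status": "error", "result_preview": ""}
        try:
            result = func(**kwargs)
            entry["status"] = "ok"
            entry["result_preview"] = str(result)[:200]
            return result
        except Exception as exc:
            # Log the failure, then let it propagate to the caller.
            entry["result_preview"] = f"{type(exc).__name__}: {exc}"
            raise
        finally:
            entry["seconds"] = round(time.perf_counter() - started, 4)
            TOOL_LOG.append(json.dumps(entry, default=str))

    return wrapper


@logged_tool
def read_note(name: str) -> str:
    # Stand-in for a real tool such as a file read.
    return f"contents of {name}"


read_note(name="todo.txt")
print(TOOL_LOG[-1])
```

Because the entry is appended in `finally`, failed calls are logged as well as successful ones, which is what makes the log useful for inspection after the fact.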

Potential Future Tools

| Tool | Description | Priority |
| --- | --- | --- |
| `inspect_git_status` | Show current git status and recent diffs | High |
| `run_tests` | Run a specific test suite and return output | High |
| `format_file` | Run ruff or black to format a Python file | Medium |
| `create_patch` | Generate a diff instead of directly overwriting files | Medium |
| `install_package` | Install a package into the .venv sandbox | Low |
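To make the `create_patch` idea concrete, the sketch below produces a unified diff with the standard library's `difflib` rather than writing the new content to disk. The function signature is an assumption for illustration; the roadmap only names the tool and its purpose.

```python
import difflib


def create_patch(path: str, old_text: str, new_text: str) -> str:
    """Return a unified diff that turns old_text into new_text."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    # An empty string means the two versions are identical.
    return "".join(diff)


patch = create_patch("hello.py", "print('hi')\n", "print('hello')\n")
print(patch)
```

A diff like this can be shown to the user or logged before anything is written, which fits the goal of making each new tool easier to undo.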

Changelog

Concurrent tool execution

Before

Tool calls were dispatched sequentially, one at a time, even when the model issued several independent calls in a single step. Every multi-tool step therefore took as long as the sum of its calls.

After

Both the LMStudio and Gemini agent loops now execute multiple tool calls concurrently using ThreadPoolExecutor. When the model reads three files in one step, all three reads run at the same time instead of back to back.
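The pattern can be sketched as follows. This is an illustration under assumptions, not the project's actual loop: the call format (`{"name": ..., "args": ...}`), the `tools` registry, and the helper names `dispatch` and `run_tool_calls` are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def dispatch(call: dict, tools: dict) -> str:
    """Run one tool call; return an error string instead of raising."""
    try:
        return str(tools[call["name"]](**call["args"]))
    except Exception as exc:
        return f"Error: {type(exc).__name__}: {exc}"


def run_tool_calls(calls: list, tools: dict, max_workers: int = 8) -> list:
    """Execute all calls from one model step concurrently, preserving order."""
    if not calls:
        return []
    workers = min(max_workers, len(calls))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the order of `calls`, not completion order.
        return list(pool.map(lambda call: dispatch(call, tools), calls))


tools = {"shout": lambda text: text.upper()}
calls = [{"name": "shout", "args": {"text": t}} for t in ("a", "b", "c")]
print(run_tool_calls(calls, tools))  # ['A', 'B', 'C']
```

Returning results in call order keeps each result paired with the call that produced it, and catching exceptions inside `dispatch` means one failing tool does not discard the results of the others.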


`run_python_file` uses `sys.executable`

Before

```bash
uv run file.py
```

Required uv to be installed and on PATH at runtime.

After

```python
[sys.executable, abs_file_path]
```

Uses the current environment's interpreter directly. Works in any virtualenv without uv.
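A minimal sketch of this approach, assuming the tool shells out with `subprocess.run`. The `args` and `timeout` parameters and the output format are illustrative assumptions, not the project's confirmed signature.

```python
import subprocess
import sys


def run_python_file(abs_file_path: str, args: tuple = (), timeout: int = 30) -> str:
    """Run a script with the interpreter that is running this process."""
    # sys.executable is the absolute path of the current Python binary,
    # so the child process uses the same virtualenv as the agent.
    command = [sys.executable, abs_file_path, *args]
    completed = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout
    )
    return (
        f"exit={completed.returncode}\n"
        f"STDOUT:\n{completed.stdout}"
        f"STDERR:\n{completed.stderr}"
    )
```

Passing the command as a list avoids shell quoting problems with paths that contain spaces.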


LMStudio runtime configuration

Two new environment variables let you tune the model at runtime without changing code:

| Variable | Default | Effect |
| --- | --- | --- |
| `LMSTUDIO_TEMPERATURE` | 0 | Controls randomness. 0 selects greedy decoding, the most repeatable setting. |
| `LMSTUDIO_MAX_TOKENS` | 800 | Caps output length. Raise for large file writes. |
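Reading these variables might look like the sketch below. The helper name `lmstudio_settings` is hypothetical; the variable names and defaults are the ones listed above.

```python
import os


def lmstudio_settings(env=os.environ) -> dict:
    """Read the two tuning variables, falling back to the documented defaults."""
    return {
        "temperature": float(env.get("LMSTUDIO_TEMPERATURE", "0")),
        "max_tokens": int(env.get("LMSTUDIO_MAX_TOKENS", "800")),
    }


print(lmstudio_settings({}))  # {'temperature': 0.0, 'max_tokens': 800}
print(lmstudio_settings({"LMSTUDIO_MAX_TOKENS": "2000"}))
```

Accepting the environment as a parameter keeps the function easy to test without mutating `os.environ`.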

System prompt condensed

The system prompt in prompts.py was shortened to reduce token overhead per iteration. The agent is now instructed to prefer targeted search and read actions, make small edits, run code to verify, and answer briefly.


`search_files` tool added

The agent now has a `search_files` tool that searches for text across files inside the sandbox.

  • Uses shorter tool schemas to reduce repeated prompt tokens.
  • Walks the requested directory safely within the working directory.
  • Skips noisy folders (.git, .venv, __pycache__, node_modules).
  • Skips common binary and media files.
  • Returns matching file paths, line numbers, and line previews.
  • Caps results with MAX_MATCHES and MAX_LINE_CHARS.
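The behaviour listed above can be sketched roughly as follows. This is not the project's implementation: the function signature, the skipped extensions, and the values of `MAX_MATCHES` and `MAX_LINE_CHARS` are assumptions, and the matching is a plain substring test.

```python
import os

SKIP_DIRS = {".git", ".venv", "__pycache__", "node_modules"}
SKIP_EXTS = {".png", ".jpg", ".gif", ".pdf", ".zip", ".pyc", ".mp4"}  # assumed list
MAX_MATCHES = 50      # assumed value
MAX_LINE_CHARS = 200  # assumed value


def search_files(working_directory: str, query: str, directory: str = ".") -> str:
    """Search text files under `directory` for `query`, staying in the sandbox."""
    root = os.path.realpath(working_directory)
    target = os.path.realpath(os.path.join(root, directory))
    # Refuse any directory that resolves outside the sandbox root.
    if os.path.commonpath([root, target]) != root:
        return f'Error: "{directory}" is outside the working directory'
    matches = []
    for folder, dirnames, filenames in os.walk(target):
        # Prune noisy folders in place so os.walk never descends into them.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for name in sorted(filenames):
            if os.path.splitext(name)[1].lower() in SKIP_EXTS:
                continue
            path = os.path.join(folder, name)
            try:
                with open(path, encoding="utf-8") as handle:
                    for number, line in enumerate(handle, start=1):
                        if query not in line:
                            continue
                        rel = os.path.relpath(path, root)
                        preview = line.strip()[:MAX_LINE_CHARS]
                        matches.append(f"{rel}:{number}: {preview}")
                        if len(matches) >= MAX_MATCHES:
                            return "\n".join(matches)
            except (UnicodeDecodeError, OSError):
                continue  # unreadable or non-text file
    return "\n".join(matches) if matches else "No matches found"
```

Capping both the number of matches and the preview length bounds how many tokens a single search can add to the conversation.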