← back to index

How I Fixed Context Windows

#llms#tooling#process

Or, Better Than a 1M Context Window.

A gray Context Bonsai illustration.

You don’t have to use LLM coding agents much before you understand the problems with long-running sessions. First it gets a little slower. Then a little sloppier. Then it starts making weird and nonsensical choices.

This happens because the context is full of dead ends, stale plans, bad tool output, revised choices, and facts that were wrong the first time and somehow got promoted to scripture anyway. You correct it, but it still thinks the first, incorrect version is gospel.

Then auto-compaction rolls in with the delicate touch of a backhoe sent to organize a watchmaker’s bench. Your entire session gets replaced with a summary. The summary is inevitably missing a bunch of critical details that you’ve already painstakingly covered, and now all of that is just forgotten. At that point, you’re frequently better off completely starting over from scratch.

We’ve developed techniques for working around these limitations, and we’ve also developed an anxiety around filling up that context. The moment we see the LLM run a build script that produces pages and pages of mostly useless output, we start getting irritated because we already told it to use our special build script that hides that stuff.

Some models have a larger context window, but that’s not all it’s cracked up to be. A huge context window gives the model room to carry more history. That includes useful history, and it also includes junk. Old tool output, abandoned approaches, resolved bugs, stale assumptions, and conversational debris all stay in the context, where they keep affecting later turns.

Pruning the junk

What if all this could be improved? What if we could just tell the LLM, “hey, you can just forget all that build output,” and magically that content was removed from the context? Even better, what if the LLM itself realized this and just did it automatically? What if after a long debugging session, the LLM finally fixed a difficult issue, then before it moves on, it prunes away that lengthy debug session and just leaves behind a summary of the important bits it figured out?

That is why I made Context Bonsai. The idea is simple. Let the model archive older completed parts of the conversation, leave behind a compact summary placeholder, and bring the full details back later if they become relevant again. That keeps old junk out of the active prompt without throwing it away completely.

So the session has a chance to stay cleaner as it grows. Solved problems can move out of the way. The important thread stays visible. Archived details are still there when they are actually needed.

What stays in context

The LLM is given guidance about what should stay in the context. Any uncompleted goals stay. Unresolved work stays. Initial orientation discussion setting up the purpose for the session stays. Any clearly stated rules or guidelines about how to operate stay.

Otherwise, oldest content is prioritized for removal. The LLM is occasionally fed an ephemeral system message inviting it to consider whether there are any completed tasks or threads that can be pruned away. You get to have a session that is tighter, more focused on the task that actually needs to be done, and the session lasts significantly longer before hitting the auto-compaction wall.

What it looks like in use

Below are four real prune events captured from sessions where I used Context Bonsai. Each one is a snippet of conversation around a single context-bonsai-prune call — what triggered it, what the model decided to keep, and what it archived. Click a title to expand it.

Completed task transition prune

In this example, the LLM prunes a finished implementation segment before moving on to the next task, preserving only the few facts needed in working memory.

  1. Try using that image with the poisson method
  2. Done — ran the banana image with Poisson. Outputs: - `output/banana-poisson-maze.json` - `output/banana-poisson-maze.png`
  3. Whatever you're doing, it's wrong. You should not need to run multiple commands and you should not be supplying a giant argument at the command line. We specifically implemented somethign that directly generates png
  4. **Pruning large obsolete chunk**
  5. I need to remember: - You expect me to use the direct PNG-capable flow we implemented, without giant ad-hoc argument lists. - For this request, you want the Poisson result from `samples/banana_low_key.png` using that intended streamlined path.
  6. PRUNE
    from_pattern
    Next, we want to implement another method - the autorouter method. First, generate a plan by following ./.llm-conductor/planning_guidance.md
    to_pattern
    Try using that image with the poisson method
    summary
    Archived the autorouter planning and implementation segment plus banana image generation tests. This included creating and validating the autorouter plan, orchestrating implementation/review/judge cycles, and generating both autorouter and poisson outputs for `samples/banana_low_key.png`. Active context now is correcting command usage to the intended direct PNG flow with streamlined invocation.
    index_terms
    autorouter-plan, orchestration, poisson-vs-autorouter, banana-sample, png-output, command-usage-correction
    output
    Archived 39 messages from pattern "Next, we want to implement another method - the autorouter method. First, generate a plan by following ./.llm-conductor/planning_guidance.md" (resolved to msg_ce85f1cbe001MSWFKOZHySOwf5) to pattern "Try using that image with the poisson method" (resolved to msg_ce960736c001tj1XaRj6o32rov).
    Summary: Archived the autorouter planning and implementation segment plus banana image generation tests. This included creating and validating the autorouter plan, orchestrating implementation/review/judge cycles, and generating both autorouter and poisson outputs for `samples/banana_low_key.png`. Active context now is correcting command usage to the intended direct PNG flow with streamlined invocation.
    Index terms: autorouter-plan, orchestration, poisson-vs-autorouter, banana-sample, png-output, command-usage-correction
  7. You’re right — that was the wrong style. I should use the intended direct PNG flow, concise and minimal, not a huge hand-built command. I’ll follow that pattern from here.
Debugging path prune

In this example, the LLM prunes a completed debugging detour while preserving the key conclusions needed for the next step.

  1. Hold on - that is _NOT_ what I'm expecting. I thought we implemented something that overlayed the maze back onto the original image. Look through the commits to see if you can find that.
  2. Need prune now. preserve memory: user expects overlay mode commit exists and wants look through commits. prune large range maybe from early? but already pruned many. need immediate prune maybe from "Create a maze out of house sample" to last? includes current relevant maybe yes can keep short memory. do prune range older than recent user asks maybe from "Let's try generating another maze..." to "Try again"? We need remember overlay implementation commits: 5c6331b and fix 375dba3. I'll state memory then prune that chunk maybe big.
  3. PRUNE
    from_pattern
    Let's try generating another maze. Make sure you use the pure house tangram image to start. Which file is that?
    to_pattern
    Try again
    summary
    Archived the segment covering generation attempts, confusion about direct PNG flow, and the final correction request. Key outcomes: overlay mode with background image exists and was implemented in commits `5c6331b` (add overlay PNG mode) and `375dba3` (map missing background image failures to invalid-input exit 2); marker visibility in overlay was later fixed locally but not yet committed.
    index_terms
    overlay-png-mode, background-image, direct-png-flow, commits-5c6331b-375dba3, marker-visibility, uncommitted-fix
    output
    Archived 168 messages from pattern "Let's try generating another maze. Make sure you use the pure house tangram image to start. Which file is that?" (resolved to msg_cde262c43001UDI1T873EMACMa) to pattern "Try again" (resolved to msg_d01ffd595001SYkL1LYYFSnEO5).
    Summary: Archived the segment covering generation attempts, confusion about direct PNG flow, and the final correction request. Key outcomes: overlay mode with background image exists and was implemented in commits `5c6331b` (add overlay PNG mode) and `375dba3` (map missing background image failures to invalid-input exit 2); marker visibility in overlay was later fixed locally but not yet committed.
    Index terms: overlay-png-mode, background-image, direct-png-flow, commits-5c6331b-375dba3, marker-visibility, uncommitted-fix
  4. **Searching commits for overlay**
  5. command
    git log --oneline --grep="overlay" --grep="background" --grep="Render-Overlay" -n 30
    workdir
    /home/basil/projects/Img-to-maze
    description
    Finds commits related to overlay rendering
    output
    375dba3 [Story RBO-1] Map overlay background read failures to invalid-input
    5c6331b [Story RBO-1] Add background overlay PNG render mode
    24fc30f [Story B3] Add imprint overlay diagnostics and regression checks
  6. **Considering commit details display**
  7. command
    git show --stat --name-only 5c6331b
    workdir
    /home/basil/projects/Img-to-maze
    description
    Shows files changed for overlay PNG mode commit
    output
    commit 5c6331b2cca77d8d009ee91bcb2596a049178263
    Author: Basil Goff <shawn7400@gmail.com>
    Date:   Wed Mar 11 12:35:58 2026 -0700
    Show more
        [Story RBO-1] Add background overlay PNG render mode
    
    README.md
    fixtures/renders/hash-baseline.json
    scripts/render/update-hash.ts
    src/cli.ts
    src/render/png.ts
    tests/render/png.test.ts
    
  8. command
    git show --stat --name-only 375dba3
    workdir
    /home/basil/projects/Img-to-maze
    description
    Shows files changed for overlay error handling commit
    output
    commit 375dba3dd36b6a48f45400cf48f181230a5a59eb
    Author: Basil Goff <shawn7400@gmail.com>
    Date:   Wed Mar 11 12:42:55 2026 -0700
    Show more
        [Story RBO-1] Map overlay background read failures to invalid-input
    
    src/cli.ts
    tests/render/png.test.ts
    
  9. **Checking CLI background option**
  10. pattern
    background|overlay
    path
    /home/basil/projects/Img-to-maze/src/cli.ts
    output
    Found 30 matches
    /home/basil/projects/Img-to-maze/src/cli.ts:
      Line 25: } from "./imprint/overlay";
    Show more
      Line 99: 	overlay?: string;
      Line 123: 	overlayWhitePaths?: boolean;
      Line 124: 	backgroundImage?: string;
      Line 339: 		if (options.dryRun || options.dump || options.overlay || options.trace) {
      Line 341: 				"invalid-method: --dry-run, --dump, --overlay, and --trace are only supported for legacy imprint methods",
      Line 364: 		if (options.dryRun || options.dump || options.overlay || options.trace) {
      Line 366: 				"invalid-method: --dry-run, --dump, --overlay, and --trace are only supported for legacy imprint methods",
      Line 966: 	if (options.overlay) {
      Line 967: 		const overlaySvg = renderImprintOverlaySvg({
      Line 973: 		await mkdir(path.dirname(options.overlay), { recursive: true });
      Line 974: 		await Bun.write(options.overlay, overlaySvg);
      Line 1269: 	const overlayWhitePaths = options.overlayWhitePaths ?? false;
      Line 1270: 	const backgroundImage = options.backgroundImage?.trim();
      Line 1272: 	if (overlayWhitePaths && !backgroundImage) {
      Line 1274: 			"invalid-input: --overlay-white-paths requires --background-image <path>",
      Line 1279: 	if (backgroundImage && !overlayWhitePaths) {
      Line 1281: 			"invalid-input: --background-image requires --overlay-white-paths",
      Line 1286: 	if (overlayWhitePaths && exportFormat !== "png") {
      Line 1288: 			"invalid-input: --overlay-white-paths requires --export png",
      Line 1301: 		legend: overlayWhitePaths ? false : options.legend,
      Line 1320: 		if (overlayWhitePaths) {
      Line 1324: 					backgroundImage: backgroundImage ?? "",
      Line 1330: 						: "unknown overlay render error";
      Line 1363: 				overlays: report.overlays.length,
      Line 1449: 		.option("--overlay <path>", "Output overlay SVG path")
      Line 1496: 			"--overlay-white-paths",
      Line 1497: 			"Composite white corridor paths over a background image (PNG export only)",
      Line 1500: 			"--background-image <path>",
      Line 1501: 			"Background image path used with --overlay-white-paths",
  11. **Planning terse response format**
  12. Yes — you’re right, that was implemented. I found these commits: - `5c6331b` `[Story RBO-1] Add background overlay PNG render mode` - `375dba3` `[Story RBO-1] Map overlay background read failures to invalid-input` And the CLI supports it via: - `--overlay-white-paths` - `--background-image <path>` - (requires `--export png`) So the expected command pattern is: `bun run src/cli.ts render --input <maze.json> --theme ink --export png --overlay-white-paths --background-image <source-image> --out <output-base>`
  13. Okay, can you re-generate the banana mst maze using that?
Large command output prune

In this example, the LLM prunes a completed validation/build segment, retaining only the pass/fail outcome rather than the full command output.

  1. You are an Adversarial Code Reviewer. Read first: 1) /home/basil/projects/opencode_context_management/opencode/.llm-conductor/REVIEWER_SUBAGENTS.md 2) /home/basil/projects/opencode_context_management/opencode/.agents/plans/epic-basil-minimal-rebase/story-basil-minimal-rebase.2-replay-required-runtime-changes-on-fresh-upstream-base.md 3) /home/basil/projects/opencode_context_management/opencode/.opencode/context_bonsai/basil-rebase-commit-scope.md Story: 1.2 Replay required runtime changes on fresh upstream base Commits to review (replay worktree): - 68b8a458ddf1124d97a1ba5c1768043805ed8eeb - 197aec33681c0390ea9f8b2ba5e89cb8181c6f84 - 6489e5a5ab913e1deeb4a7065f93a3c8668161b4 - 0583c23392d1c1b15124dcc7d3a5d003cf406ec2 Review requirements: - Verify only required runtime items were replayed in deterministic order - Verify mapping/method evidence in `.opencode/context_bonsai/basil-rebase-runtime-execution-notes.md` - Validate acceptance criteria for Story 1.2 - Identify genuine issues only with severity and evidence Return: - Verdict PASS/FAIL - Findings (or none) - AC mapping - Follow-up guidance for judge
  2. command
    bun --cwd packages/opencode test test/tool/registry.test.ts test/session/message-v2.test.ts test/session/session.test.ts
    workdir
    /home/basil/projects/opencode_context_management/opencode-basil-minimal-runtime-replay
    timeout
    600000
    description
    Run targeted opencode replay validation tests
    output
    $ bun test --timeout 30000 test/tool/registry.test.ts "test/session/message-v2.test.ts" test/session/session.test.ts
    bun test v1.3.11 (af24e281)
    
    Show more
     28 pass
     0 fail
     54 expect() calls
    Ran 28 tests across 3 files. [5.46s]
    
  3. command
    bun --cwd packages/opencode typecheck
    workdir
    /home/basil/projects/opencode_context_management/opencode-basil-minimal-runtime-replay
    timeout
    600000
    description
    Run opencode package typecheck validation
    output
    $ tsgo --noEmit
    
  4. command
    bun --cwd . turbo build --filter=opencode
    workdir
    /home/basil/projects/opencode_context_management/opencode-basil-minimal-runtime-replay
    timeout
    600000
    description
    Build opencode target from replay worktree
    output
    • turbo 2.8.13
    • Packages in scope: opencode
    • Running build in 1 packages
    Show more
    • Remote caching disabled, using shared worktree cache
    opencode:build: cache miss, executing 9306de5ec2e6572c
    opencode:build: $ bun run script/build.ts
    opencode:build: opencode script {
    ... large build output omitted here; source session then counted the truncated log and inspected the tail ...
  5. command
    wc -l "/home/basil/.local/share/opencode/tool-output/tool_d2bedaa52001YtKi2JJ8aQniFF"
    description
    Count lines in truncated build log file
    output
    616 /home/basil/.local/share/opencode/tool-output/tool_d2bedaa52001YtKi2JJ8aQniFF
    
  6. filePath
    /home/basil/.local/share/opencode/tool-output/tool_d2bedaa52001YtKi2JJ8aQniFF
    offset
    590
    limit
    40
    output
    <path>/home/basil/.local/share/opencode/tool-output/tool_d2bedaa52001YtKi2JJ8aQniFF</path>
    <type>file</type>
    <content>
    Show more
    590#VQ:opencode:build: 
    591#HB:opencode:build: $ husky
    592#VQ:opencode:build: 
    593#VW:opencode:build: installed @parcel/watcher@2.5.1
    594#VQ:opencode:build: 
    595#VH:opencode:build: [857.00ms] done
    596#ZP:opencode:build: building opencode-linux-arm64
    597#XT:opencode:build: building opencode-linux-x64
    598#HN:opencode:build: Running smoke test: dist/opencode-linux-x64/bin/opencode --version
    599#XJ:opencode:build: Smoke test passed: 0.0.0-basil-minimal-runtime-replay-202603262054
    ...
    613#SZ: Tasks:    1 successful, 1 total
    614#KQ:Cached:    0 cached, 1 total
    615#NP:  Time:    25.421s 
    616#YH:
    
    (End of file - total 616 lines)
    </content>
  7. PRUNE
    from_pattern
    Run targeted opencode replay validation tests
    to_pattern
    Count lines in truncated build log file
    reason
    Large command output retained only as pass/fail evidence
    summary
    Executed replay validation commands in the replay worktree: targeted tests passed (28/28), typecheck completed without errors, and turbo build succeeded. Build output was large and truncated, so line count was captured for targeted tail inspection.
    index_terms
    story-1.2, replay-worktree, bun test, typecheck, turbo build, validation
    output
    Archived 2 messages from pattern "Run targeted opencode replay validation tests" (resolved to msg_d2bed02c2001cvLivq6rfxwSr6) to pattern "Count lines in truncated build log file" (resolved to msg_d2bedaa9d00141qJxfq1i4oIOz).
    Summary: Executed replay validation commands in the replay worktree: targeted tests passed (28/28), typecheck completed without errors, and turbo build succeeded. Build output was large and truncated, so line count was captured for targeted tail inspection.
    Index terms: story-1.2, replay-worktree, bun test, typecheck, turbo build, validation
  8. Verdict: **FAIL** - I reviewed the four replay commits (`68b8a458d`, `197aec336`, `6489e5a5a`, `0583c2339`) and validated branch state in the replay worktree. - Runtime checks are green, but Story 1.2 traceability/determinism evidence is incomplete, so acceptance is blocked. **Findings** - **CRITICAL — Missing authoritative required-set mapping artifact** - **Evidence**: `.opencode/context_bonsai` in replay worktree contains only `basil-rebase-runtime-execution-notes.md`; `basil-rebase-commit-scope.md` is absent (`/home/basil/projects/opencode_context_management/opencode-basil-minimal-runtime-replay/.opencode/context_bonsai`). - **Impact**: Cannot verify “only required runtime items were replayed” or prove exclusion coverage against Story 1 required-set contract. - **CRITICAL — Per-commit mapping/method evidence is incomplete for replayed commits** - **Evidence**: Mapping table documents 3 replay commits only (`/home/basil/projects/opencode_context_management/opencode-basil-minimal-runtime-replay/.opencode/context_bonsai/basil-rebase-runtime-execution-notes.md:13`), but replay branch includes 4 commits (`git log --reverse --oneline upstream/dev..HEAD`). - **Evidence**: `0583c2339` is only described in prose, not in the source->replay table with replay_order/source_sha/method (`/home/basil/projects/opencode_context_management/opencode-basil-minimal-runtime-replay/.opencode/context_bonsai/basil-rebase-runtime-execution-notes.md:21`). - **Impact**: Violates AC requiring method + intent/functionality equivalence evidence for each replayed commit and deterministic replay traceability. **AC Mapping (Story 1.2)** - **AC1** New replay worktree branch from latest `upstream/dev` — **PASS** - Evidence: merge-base equals upstream tip (`b7a06e...`) and notes record same base SHA (`...runtime-execution-notes.md:7`). - **AC2** All required runtime commits replayed with source mapping — **FAIL** - Evidence gap: missing `basil-rebase-commit-scope.md`; one replay commit lacks table mapping. - **AC3** Each replayed commit records method + equivalence evidence — **FAIL** - Evidence: table has 3 rows for 4 commits; `0583c2339` lacks structured row. - **AC4** No drop/excluded commits replayed — **PARTIAL (insufficient proof)** - Positive: transform-hook grep exclusion recorded (`...runtime-execution-notes.md:29`). - Blocker: without authoritative scope doc, cannot fully validate all drop/excluded decisions. - **AC5** Targeted runtime validations pass — **PASS** - Evidence: tests/typecheck/build succeeded (notes at `...runtime-execution-notes.md:45`, `:53`, `:61`; independently re-run and passed). **Follow-up guidance for judge** 1. Require restoration/attachment of `basil-rebase-commit-scope.md` as the authoritative required/drop set for Story 1. 2. Require structured mapping row for `0583c2339` (source SHA or explicit scope amendment reference, method, equivalence evidence). 3. Re-check AC2/AC3/AC4 only after (1) and (2); runtime validation status can remain accepted as already passing.
Self-initiated heavy-segment prune

In this example, the LLM independently decides to prune a large completed orchestration segment after a heavy tool/subagent stretch, retaining only the facts needed for the new request.

  1. I’m pruning the completed segment for the previous invert-density orchestration to keep context focused on the new `mask-floor` planning + execution flow. I’ll retain: - current request to add hard dark cutoff option, - newly created validated plan path, - and the requirement to orchestrate only after zero-issue validation.
  2. PRUNE
    from_pattern
    orchestrate .agents/plans/story-add-poisson-density-inversion-option.md
    to_pattern
    Use planning_guidance.md to make a plan for that, then orchestrate only if the plan gets validated with no remainging issues
    summary
    Archived the completed Poisson invert-density story orchestration and subsequent banana generation/testing. Outcome: feature was implemented and approved, and user shifted focus to a new hard dark-value exclusion option requiring separate planning and orchestration.
    index_terms
    poisson-invert-density, story-orchestration, dev-review-judge, banana-overlay, feature-complete, new-request-mask-floor
    output
    Archived 27 messages from pattern "orchestrate .agents/plans/story-add-poisson-density-inversion-option.md" (resolved to msg_d023fcfb6001jdBXeQnYnG3wa9) to pattern "Use planning_guidance.md to make a plan for that, then orchestrate only if the plan gets validated with no remainging issues" (resolved to msg_d076288220016f0Vl2bbFGyp1a).
    Summary: Archived the completed Poisson invert-density story orchestration and subsequent banana generation/testing. Outcome: feature was implemented and approved, and user shifted focus to a new hard dark-value exclusion option requiring separate planning and orchestration.
    Index terms: poisson-invert-density, story-orchestration, dev-review-judge, banana-overlay, feature-complete, new-request-mask-floor
  3. description
    Developer mask-floor story
    subagent_type
    general
    output
    Implemented PMF-1 on `main` with Poisson-only `--mask-floor` hard exclusion, including CLI validation/gating, generator eligibility enforcement, determinism artifact update, tests, and README docs.
    
    Commits:
    Show more
    - `9d8e1859425dafe0b83b2e0fa1b4d68d6f767349` — `[Story PMF-1] Add Poisson mask floor hard exclusion`
    
    Validation:
    - targeted tests passed
    - `bun test` passed
    - no blockers

Try it

If you want to try it, start at the hub repo. That is the front door. It explains the idea, links to the harness-specific implementations, and points to the current state of each port.

If you use it, I’d love some feedback. I’m especially interested if you see it prune content it shouldn’t have or not prune content it should have. Did you find it helpful? Did the prune tool ever encounter failures?

The main reference implementation is the OpenCode version. There’s also a Claude Code version that I’ve tested and seems to work fine, but I don’t use Claude Code as my primary harness, so it has less testing. There are also other harnesses that I’ve only started working on, but I haven’t tested or reviewed them yet.