THE AI OS FOR FOUNDERS · BUILT AT FINCH · USED IN PRODUCTION DAILY

Claude that doesn't forget.

Finch Engine is the operating layer under Claude Code. It makes the agent follow protocols, remember across sessions, and survive context overflow — the same engine my four-person team runs an entire claims-intelligence platform on every day.

  • 30-day free trial
  • Card on file
  • Cancel in one click
  • No setup call required

Built by Finch. We use it every day. We eat our own dog food.

§ 01 · the pain

Four ways your AI agent fails you this week.

FAILURE I

Amnesiac.

Every session starts at zero. Yesterday's two hours of context — gone.

FAILURE II

Skips protocols.

Says it wrote tests. The tests don't exist. Docs untouched.

FAILURE III

Loses threads.

Subagents die mid-task. Reports vanish. Three agents step on each other.

FAILURE IV

Context collapses.

By the end of a long session, it's forgotten the original goal.

Finch Engine fixes all four.

§ 02 · the contrast

Same agent. Different operating layer.

Run the same task in a vanilla Claude Code session, then through Finch Engine. Each failure mode above — caught and corrected by the protocol layer, not the agent.

Without Engine · vanilla Claude Code · no protocol layer
With Engine · protocol layer engaged · hooks armed
fig. 01 · Amnesiac — the agent forgot yesterday
$ continue what we were working on
> I don't have prior context. Could you
> share what we were working on?
✗ context lost · two hours gone
$ engine session rehydrate \
    sessions/2026_05_14_AUTH_AUDIT
✓ rehydrated · 2,847 lines
✓ 6 decisions · 3 open threads
✓ session restored · resume verbatim
fig. 02 · Skipped protocols — proof that wasn't there
> I've written the tests.
$ ls test/
ls: no such file or directory
✗ claimed without proof
▸ phase 4 · proof required
✗ gate blocked · tests-written
   required artifact: test/*.spec.ts
   advance denied · re-do phase 3
✓ phase gate refused to advance
fig. 03 · Lost threads — the subagent vanished
▸ delegating to refactor-agent…
[agent terminated · output truncated]
> what did the refactor say?
I don't have that information.
✗ delegation black-holed
#needs-refactor → #claimed-refactor
   → #done-refactor · pane 3
↘ handoff logged · 247 lines
↘ patch ready for review
✓ tags caught the thread
fig. 04 · Context collapse — the goal evaporated
[conversation compacted · context lost]
> what were we trying to ship?
> I'm not sure — could you remind me?
✗ overflow ate the goal
▸ /dehydrate · checkpoint saved
✓ goal pinned · 4 decisions kept
▸ /rehydrate · context restored
✓ resume · no degradation
✓ overflow handled · goal kept

Same model. Same prompts. The engine is the difference.

§ 03 · try it

Watch /analyze run on real code. No signup.

Paste a snippet — or pick a sample. We run /analyze server-side and stream the engine's output here. Phase by phase. Same protocol your trial runs.

Rate-limited to 5 runs / day per IP. Want unlimited? Start the trial.

ENGINE · /analyze · streaming
Hit "Run /analyze" to see the engine work through phases on your code.

Phase 1 — Context Ingestion
Phase 2 — Research Loop
Phase 3 — Calibration
Phase 4 — Synthesis

Each phase has gates. Each gate requires proof. You'll see them light up as the engine works.

§ 04 · the engine

A protocol layer, not a prompt library.

Six core abstractions. Bash and markdown under the hood. No framework to learn, no new dependencies, no runtime to install beyond the engine itself.

I

Sessions

Stateful conversation with phases + a persistent log. The session is the source of truth, not the chat.
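
The persistent-log idea can be sketched in a few lines of bash. Everything here — the helper name, the paths, the log format — is hypothetical, a shape-of-the-idea sketch rather than the engine's real code:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an append-only session log: every event is
# appended with a timestamp and never edited in place, so the log stays
# auditable after the fact. Illustrative only.
set -euo pipefail

log_event() {
  local session_dir="$1"; shift
  printf '%s · %s\n' "$(date +%H.%M)" "$*" >> "$session_dir/session.log"
}

s="$(mktemp -d)"
log_event "$s" "phase 1 · ingested 6 files"
log_event "$s" "phase 2 · round 1 · hypothesis recorded"
cat "$s/session.log"    # two timestamped lines, oldest first
```

Because writes are append-only, replaying the file top to bottom reconstructs the session in order — which is what makes rehydration possible.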

II

Skills are protocols

Phase-gated workflows with required proof fields. The agent can't skip steps because the tool layer enforces it.

III

Tags

Every artifact carries semantic state — #needs-fix, #claimed-fix, #done-fix. The tags are the state machine.
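
As a sketch of that state machine in bash — illustrative only; the engine's actual enforcement lives in its tool layer, and only the tag names above come from this page:

```shell
#!/usr/bin/env bash
# Illustrative sketch: tags as a state machine. A fix may only advance
# needs -> claimed -> done; any other move is an invalid transition.
set -euo pipefail

next_tag() {
  case "$1" in
    "#needs-fix")   echo "#claimed-fix" ;;
    "#claimed-fix") echo "#done-fix" ;;
    "#done-fix")    echo "terminal state: $1" >&2; return 1 ;;
    *)              echo "unknown tag: $1" >&2; return 1 ;;
  esac
}

next_tag "#needs-fix"    # advances to #claimed-fix
```

The point of encoding state in tags rather than prose: a transition either exists or it doesn't, so a skipped step is a detectable error instead of a plausible-sounding claim.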

IV

Directives

Stacked rules — core, project, directory. The agent reads them automatically when they apply.
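
A minimal sketch of how stacked rule files could resolve, under assumed file names and locations (core, project, directory — none of these paths are the engine's real layout):

```shell
#!/usr/bin/env bash
# Hypothetical sketch of stacked directives: core rules first, then
# project, then directory, so more specific layers are read last.
set -euo pipefail

load_directives() {
  local workdir="$1" f
  for f in "$HOME/.engine/core.md" \
           "$workdir/project.md" \
           "$workdir/src/directory.md"; do
    if [ -f "$f" ]; then
      echo "--- $f"
      cat "$f"
    fi
  done
}

demo="$(mktemp -d)"
mkdir -p "$demo/src"
echo "rule: project-wide" > "$demo/project.md"
echo "rule: this-directory" > "$demo/src/directory.md"
load_directives "$demo"    # prints both layers, project first
```

Reading broad layers before narrow ones is what lets a directory rule refine a project rule without either file knowing about the other.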

V

Delegation

Skills hand off to skills. Async, blocking, or silent. You don't lose threads.

VI

Hooks

Bash hooks fire on every tool call. This is how “discipline” actually works.
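
A hedged sketch of what a proof gate could look like as a bash hook. The gate name and artifact glob echo fig. 02 above; the script itself is hypothetical, not the engine's real hook:

```shell
#!/usr/bin/env bash
# Hypothetical proof-gate hook: refuse to advance a phase unless the
# required artifact actually exists on disk. Illustrative only.
set -euo pipefail

require_artifact() {
  local gate="$1" pattern="$2"
  # expand the glob; an unmatched glob stays literal, so -e fails
  set -- $pattern
  if [ -e "$1" ]; then
    echo "gate passed · $gate"
  else
    echo "gate blocked · $gate · required artifact: $pattern" >&2
    return 1
  fi
}

# demo in a scratch directory so the gate can actually pass
demo="$(mktemp -d)"
mkdir -p "$demo/test"
touch "$demo/test/auth.spec.ts"
cd "$demo"
require_artifact "tests-written" "test/*.spec.ts"    # gate passed
```

The check is deliberately dumb: it inspects the filesystem, not the transcript, so "I've written the tests" only counts if the files exist.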

It reviews the plan, not the PR. The protocol catches bugs before they're written.

§ 05 · the engine, running

What it looks like when discipline is in the tool.

One session in flight. Phases on the rail. Proof gates lighting up only after their conditions are met. Tags moving through the state machine at the bottom. This is what the six abstractions look like when they're alive.

SESSION · sessions/2026_05_15_AUTH_AUDIT/ · SKILL · /expert-system-review · LIVE · phase 3 of 4 · 11.4s
  1. phase 1 · context ingestion · done
    • source-acknowledged
    • symbols-extracted
  2. phase 2 · research loop · 3 rounds · done
    • hypothesis-tested
    • evidence-cited
    • severity-assigned
  3. phase 3 · calibration · active
    • findings-deduplicated
    • severity-bucket-sweep
    • handoff-prepared
  4. phase 4 · synthesis · queued
    • write ANALYSIS.md
    • hook · post-tool-use · log-sync
session log · append-only · auditable
11.21 · phase 1 · ingested 1,840 LOC across 6 files
11.24 · phase 2 · round 1 · hypothesis · plaintext password compare
11.27 · found · SEC-001 · MUST FIX · line 3 of auth.py
11.31 · phase 2 · round 2 · no brute-force guard observed
11.33 · found · SEC-002 · SHOULD FIX · brute-force exposure
11.36 · phase 2 · round 3 · session id not regenerated on login
11.38 · found · SEC-003 · SHOULD FIX · session fixation
11.40 · phase 3 · sweeping severity buckets…
tags
  • #needs-fix 3
  • #claimed-fix 2
  • #done-fix 1
  • #blocked 0
↗ 2 delegations in flight · hook log-sync armed
  1. I · Session · the frame above. Phases + log + tags as one auditable object.
  2. II · Skill · the phases on the rail. Each gates on proof.
  3. III · Tags · the chips at the bottom. They are the state machine.
  4. IV · Directives · the gate rules. Block invalid moves silently.
  5. V · Delegation · the “in flight” counter. Handoffs the engine remembers for you.
  6. VI · Hooks · log-sync, post-tool-use. Discipline fires on every call.

§ 06 · fleet · parallel orchestration

Four agents. One pane of glass.

Fleet runs Engine sessions in parallel — each in its own pane, each with its own skill, each surviving its own context. You watch four problems get solved at once, without losing the thread on any of them. tmux for agents, with the protocol layer keeping score.

FLEET · 4 panes · 4 agents · all phases logged · 12m 04s elapsed · 47 hooks · 0 failures
PANE 1 · analyzer · /analyze · 3 / 4
[11:40] PHASE 3 · calibration
[11:41] ✓ findings-deduped (12 → 8)
[11:42] ▸ severity-bucket-sweep
[11:42]   SEC-001 · MUST FIX
[11:43]   SEC-002 · SHOULD FIX
75%
PANE 2 · researcher · /research · 2 / 3
[11:38] PHASE 2 · gemini deep research
[11:40] query refined · 4 sub-questions
[11:42] ▸ awaiting 19 sources
[11:43]   3 of 19 returned
[11:43]   citations · pinned
55%
PANE 3 · tester · /test · 4 / 4 ✓
[11:39] PHASE 4 · proof
[11:41] generated · auth.test.ts
[11:42] ran · 27 tests
[11:42] ✓ 27 passing · 0 failing
[11:43] hook · post-tool-use · log-sync
done
PANE 4 · shipper · /review · 1 / 3
[11:43] PHASE 1 · load prior session
[11:43] ✓ session memory hydrated
[11:44] ▸ diff loaded · 247 lines
[11:44] ◌ panel of reviewers · queued
[11:44]   10 reviewers · 2 models
20%
↘ shipper depends on analyzer ✓ · tester ✓ · researcher (waiting)
hooks fired · 47 · proof gates passed · 23 · collisions · 0

One terminal. Four contexts. Zero collisions. The orchestrator routes the work; the engine enforces it.

§ 07 · the catalog

14 skills. New ones drop weekly.

Every skill is a multi-phase protocol with proof-gated execution — dogfooded inside Finch for a week before it ships to subscribers. Pro subscribers vote on what gets built next. Each tile has a 60-second screencast so you see the protocol before you buy it.

/expert-system-review · NEW

Ten reviewers, two AI models, on every plan before a line ships.

engineering · v0.4 · dogfood +7d
/brainstorm

Structured ideation that ends in a decision — and a record of what you rejected.

engineering · v0.7 · 14 runs/wk
/research

Gemini Deep Research with a refined-brief protocol. Comes back with citations.

engineering · v0.5 · 22 runs/wk
/review

PR review that loads the original implementation session's memory first.

engineering · v0.9 · 38 runs/wk
/qa

Browser-based E2E QA via Playwright. Real navigation, real assertions.

engineering · v0.3 · 9 runs/wk
/refine

TDD for LLM prompts and schemas. Iterate against measurements, not vibes.

engineering · v0.6 · 12 runs/wk
/standup

Today's standup transcript → a Slack-ready summary in ninety seconds.

ops · v1.2 · 5 runs/wk
/investor-update

Monthly update synthesized across Linear, Slack, standups, and your codebase.

ops · v0.8 · 1 run/mo
/post-call

Sales call transcript → Notion post-call notes with action items.

ops · v0.5 · 8 runs/wk
/prep-call

Pre-call brief generated from your CRM and live web research.

ops · v0.4 · 6 runs/wk
/market-intel

Scan your industry for trigger events worth a sales conversation.

ops · v0.3 · 2 runs/wk
/outbound-research

Deep account research from a URL or CSV → personalized outreach.

ops · v0.4 · 3 runs/wk
/humanize

Strip AI tells from drafts. No more em-dashes, no “elevated” anything.

ops · v0.6 · 19 runs/wk
/learnings

Push micro-learnings to your team's shared library.

ops · v0.3 · 4 runs/wk
/your-skill · VOTE

Submit what you need. Top-voted skill ships within two weeks. You decide what drops next.

request queue · subscribe to vote →

§ 08 · production proof

1,280 sessions. 6 people. Every role.

Pulled from our shared drive this morning — sessions logged by every person at Finch, not just the engineers. Five months. One engine. The number that matters isn't the total; it's who's running it.

6 people.
5 months.
1,280 sessions.
Every role at Finch ran sessions:
Tom · Founder / CEO · 534
Yarik · Engineering · 593
Nathan · GTM · non-technical · 89
Bruno · Engineering · 56
Justin · Customer · account ops · 6
Rob · Design · 2

It's not a coding tool. It's how the whole company operates.

§ 09 · pricing

Simple. Two tiers. Cancel anytime.

Individual$200/mo

Everything in Finch Engine. The full skills marketplace. Fleet. Cloud sync. Dashboard.

  • 14+ skills · weekly drops
  • Fleet · multi-pane parallel orchestration
  • Cloud log sync across all your machines
  • Cross-machine semantic session search
  • Web dashboard at engine.finchclaims.com
  • Skill request queue · vote on what we ship next
  • Priority support · private Discord
  • 30-day free trial · card on file
Start 30-day free trial

Founders Edition: lock this price for life.

Team$150/seat/mo

Everything in Individual, plus team accounts, shared session view, and admin controls.

  • Everything in Individual
  • Team accounts · invites · RBAC
  • Shared session view across the team
  • Team dashboard + admin controls
  • 3-seat minimum · $450/mo floor
  • 30-day free trial · card on file
Equip the team

Talk to us about enterprise plans.

Annual: 20% off. Cancel anytime via Stripe Customer Portal. We don't read your logs.

§ 10 · faq

Plain answers.

What happens at the end of the 30-day trial?
Your card on file is charged $200. We send a “your trial is ending” email on day 28 with a one-click cancel link. No grace-period games, no retention dark patterns, no “email Tom to cancel.” Cancel through the Stripe Customer Portal at any time. If you cancel before day 30, you're never charged.
How is this different from Cursor or Claude Code itself?
Cursor and Claude Code give you the agent. Finch Engine gives the agent discipline. Hooks block tool calls if the protocol's been skipped. Phases require proof to advance. Logs are append-only and auditable. Cursor sells autocomplete; Claude Code sells autonomy; this sells accountability. Run all three together — they're complementary layers.
Do I need a Claude API key?
Yes. Finch Engine runs on top of your existing Claude Code or Cursor subscription — we don't resell inference. Bring your own Claude API key (or Claude Pro / Claude Max subscription). The $200/mo covers the engine, the skills marketplace, cloud sync, and support. Your Anthropic spend is separate and unchanged.
What's actually in the skills marketplace?
Fourteen launch skills (six engineering, eight ops/CEO) — each a multi-phase protocol with proof-gated execution, dogfooded for a week at Finch before it ships to you. New skills drop every Friday. Subscribers get a request queue: submit what you want, others vote, top-voted ships within two weeks.
When do cloud sync, dashboard, and team features ship?
Four weeks from launch — included in your subscription as soon as they're live. Founders Edition subscribers get every future feature included in their $200/mo locked price forever. Until cloud is live, sessions run locally with no degradation.
Who is this NOT for?
If you're not already using Claude Code or Cursor at least weekly, this isn't for you yet. If you want a chat-only AI assistant, get Lindy or a custom GPT. If you want a no-code automation builder, get n8n or Zapier. This is for founders and engineering teams whose AI agent is already part of how they ship, and who want it to behave more like a real teammate.
What if you (Finch) get acquired or stop maintaining this?
Your local engine and locally cached skills keep working — you can run them forever on the version you have. If Finch the company shuts down or pivots away from this product, we commit to open-sourcing the engine and final skill library with 90 days' notice. Written into the terms.

If you're using Claude Code, Cursor, or any agent today — you've felt all four failure modes.

Finch Engine fixes that. Try it for 30 days.