Every capability,
documented.
PilotGentic ships as six complete capability streams — from the SQLite foundation layer all the way to remote phone control. Streams 0–3 are free. Streams 4–5 are paid add-ons for Pro and Max subscribers.
Stream 0
Foundation
The bedrock everything else runs on.
- SQLite behavior database — every action PilotGentic takes is recorded and searchable — Claude can look up what happened, when, and why — full-text search via FTS5
- AX compare — detects when a UI changes between steps — catches broken automations before they fail silently — accessibility tree diffing at the frame level
- Action recorder — captures exactly what you do step-by-step so Claude can learn from it — typed event payloads, not fragile screenshots
- Migration system — the database evolves as PilotGentic grows — schema updates happen automatically, no data loss, no manual steps
Stream 1
Task Engine
Persist, version, and run automation tasks at scale.
- Task store — your automations are saved and versioned — run them again anytime, track every change, nothing gets lost between sessions
- Task runner — watch your task execute live — each step updates in real time so you always know where it is and what it's doing
- Task escalation — Claude pauses and asks you when it hits a decision it can't make — you unblock it, it continues — no babysitting required
- Task tools — create, run, list, and delete tasks directly from Claude — just ask, no separate UI needed — pilotgentic_task MCP suite
Stream 2
Video Upload Path
Show Claude a screen recording — get a working task.
- Frame extractor — record yourself doing a task once — PilotGentic pulls the key moments automatically, no manual annotation needed
- Analyzer — Claude Vision reads every screen state in your recording — understands what each UI element is and what you did with it
- Synthesizer — turns the analysis into a working step-by-step automation — ready to run immediately, no code written
- AX verifier — checks every generated step against your live UI before saving — catches anything that won't work on your actual screen
Stream 3
Swift UI
First-class task management built into the app.
- Task Dashboard — see all your automations at a glance — run, edit, or delete from the sidebar without opening Claude
- Task Wizard — train a new task by walking through it step by step — no code, no prompt engineering, just show PilotGentic what to do
- Progression Meter — tracks how capable your AI setup is getting — L1 to L4 skill level shown live, unlocks as your automation library grows
- MCPClient — the app can call its own tools directly — no round-trip to Claude needed for in-app actions, keeps the UI fast and responsive
Stream 4
ML Layer
Anomaly detection, semantic search, and optimization proposals.
- Anomaly Detector — notices when a task starts taking longer or failing more than usual — flags it automatically so you can fix it before it breaks — Z-score analysis on run history
- Task Optimizer — analyzes your automations and suggests concrete speed improvements — shorter waits, better click order, skipped dialogs — proposes, never auto-applies
- Progression Engine — tracks your automation library as it grows — prerequisite-aware L1–L4 leveling unlocks more advanced capabilities as your skills develop
- Voyage Indexer — find any past action by describing it in plain English — semantic search over everything PilotGentic has ever done, powered by Voyage AI
Stream 5
Remote Add-On
Control your Mac from anywhere. Get notified the moment Claude needs you.
- Relay server — works from anywhere — no port forwarding, no VPN — WebSocket relay on Fly.io bridges your Mac and browser through NAT
- QR pairing — pair a new device in under 30 seconds — scan the QR code in Settings, done — HMAC-signed tokens, 2-min expiry
- Push notifications — get alerted the moment Claude needs your input — Web Push (VAPID) delivered to your phone even when the tab is closed
- Task dispatch — trigger any trained automation from your phone and watch it run step-by-step — no need to be at your Mac
- Supervisor View — manage multiple Macs from one browser dashboard — global escalation queue shows every Mac waiting for input
- Screen streaming — see exactly what's on your Mac screen in real time — JPEG stream at configurable quality, toggle on demand to save bandwidth
- Remote Terminal — open a shell on your Mac from any browser, anywhere — run commands, check logs, start builds — full PTY with color, resize, and no SSH setup needed
- Input forwarding — click, type, and scroll directly in the screen stream — your Mac responds as if you're sitting at it — CGEvent dispatch, not screen sharing lag
- Session revoke — revoke any paired device immediately — OTP-authenticated with email confirmation so only you can do it
Stream 7
Session Mode
Choose how AI acts on your Mac — observable or silent, your call.
- AI Doppler mode — cursor moves visibly to every target — you see exactly what Claude is doing, step by step, in real time
- Background mode — silent AX press with no cursor movement — Claude acts without interrupting your screen, ideal for unattended or parallel workflows
- Hybrid mode — observable for the interactions you care about, silent for the rest — mix visibility and throughput in the same session
- Strict enforcement — hard gate: if a tool call would violate the declared mode, it is blocked and returned as an error — no silent fallbacks
- Flexible enforcement — soft gate: mode violations are logged and allowed — useful while developing or when you want awareness without blocking
- Per-session policy — mode and enforcement level are declared in CLAUDE.md and injected into every session — Claude Code picks them up automatically
Add-On
Voice
Hands-free conversation with Claude — on-device STT and AI voice output.
- Whisper STT — on-device speech-to-text via whisper-cli — no cloud, no API key required for voice input
- ElevenLabs AI voice — natural AI voice output via ElevenLabs streaming API, or on-device AVSpeechSynthesizer fallback
- Hands-free mode — mic auto-re-arms after Claude finishes speaking for a continuous conversation loop
- Voice ring UI — 48-bar animated ring — cyan while listening, electric blue while speaking
Stream 8
Dev Loop
Edit. Reload. Test. 3 seconds, same session — no build cycle, no context loss.
- Hot-reload MCP harness — edit any tool handler in dev source — the MCP server picks it up in ~3 seconds via node --watch, same Claude session continues uninterrupted
- HTTP transport — MCP over HTTP/SSE lets Claude reconnect transparently after a server restart — no session kill, no context loss, no 90-second build cycle for JS changes
- Trajectory-based skill extraction — behavior.db records every action with outcome, timing, and AX labels — external agents can read failure trajectories and propose targeted fixes
- Parallel dev + prod servers — run the dev hot-reload server alongside the stable bundle — test new tool logic on port 7778 while the production bundle stays on stdio
ML Layer
Anomaly detection, semantic search, and optimization proposals.
- Anomaly Detector
- Task Optimizer
- Progression Engine
- Voyage Indexer
Remote Add-On
Control your Mac from anywhere. Get notified the moment Claude needs you.
- Relay server
- QR pairing
- Push notifications
- Task dispatch
- Supervisor View
- Screen streaming
- Remote Terminal
- Input forwarding
- Session revoke
Voice
Hands-free conversation with Claude — on-device STT and AI voice output.
- Whisper STT
- ElevenLabs AI voice
- Hands-free mode
- Voice ring UI
Add-ons are modular and downloadable — the base app stays lean. Compare plans →
What you get
Free, and more when you need it
Start with everything free.
Streams 0–3 ship with the free download. Unlock the ML and Remote add-ons when you are ready to go further.
macOS 14 Sonoma or later · Apple Silicon & Intel