Inside the engine.

How George processes a message, stage by stage.

The pipeline, stage by stage:

Message In
Raw text arrives.
The Gate
Deterministic, local. 14 questions in <5ms. No API call. No cost. Real names replaced before anything reaches the cloud.
Classify
14 local + 8 deferred rules; the routing decision happens here.
Sort
Classify + plan.
Recall
Memory retrieval.
Fetch
External data from the internet, pulled before the model thinks, not after.
Think
Generates the response. Skipped on DEEP_LITE: simple messages skip the expensive reasoning call.
Shape
Voice + format.
Response Out
Pseudonyms rehydrated back to real values.

The gate.

A message comes in. Before anything reaches the cloud, it hits a local machine. The gate asks 22 yes-or-no questions — is this a greeting? Does it contain a name? Is it asking for something complex? Fourteen get answered by pattern matching in under 5 milliseconds. No AI involved. The other eight need language understanding, so those get deferred to the first cloud call.

A greeting never leaves the building. A real question has already been classified, scored for complexity, and routed — before the cloud has seen a word.

TIER 1 — Local
14 binary rules. Regex and pattern matching. <5ms. Zero API cost.
TIER 2 — Cloud
8 complex rules. Requires language understanding. Deferred to first cloud call.
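The two tiers can be sketched as a small rule table: tier-1 questions are plain regexes answered locally in microseconds, tier-2 questions are only flagged for the first cloud call. The rule names and patterns below are illustrative, not George's actual 22 rules.

```python
import re

# Tier 1: deterministic regex rules, answered locally. (Illustrative
# subset -- the real gate has 14 such rules.)
TIER1_RULES = {
    "is_greeting":      re.compile(r"^\s*(hi|hey|hello|good (morning|evening))\b", re.I),
    "has_phone_number": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "has_email":        re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "is_question":      re.compile(r"\?\s*$"),
}

# Tier 2: rules that need language understanding. Flagged here,
# answered by the first cloud call. (Names assumed.)
TIER2_RULES = ["is_complex_request", "is_emotionally_charged", "needs_external_data"]

def run_gate(text: str) -> dict:
    """Answer tier-1 questions locally; mark tier-2 as deferred."""
    answers = {name: bool(rx.search(text)) for name, rx in TIER1_RULES.items()}
    answers.update({name: "deferred" for name in TIER2_RULES})
    return answers

print(run_gate("hey, can you call 403-555-0192?"))
```

Because tier 1 is pure pattern matching, it costs nothing and cannot stall: a greeting is recognized before any network request exists.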

Privacy.

While the gate classifies, a scanner looks for anything personal. Names, phone numbers, emails, addresses. When it finds them, it doesn't redact — it replaces. Real names become realistic pseudonyms. Real phone numbers become different real-looking numbers. The cloud gets a message that reads like natural language, just with someone else's identity. After the cloud responds, the local machine swaps everything back. The real data never left.

LOCAL (Raspberry Pi) ─────────────────── CLOUD (sanitized only)

Real data stays here: ─── sanitized text ───▸ Cloud sees only:
Phil Henderson ──────────────────── Michael Chen
403-555-0192 ───────────────────── 403-555-0147
[email protected] ──────────────────── [EMAIL_1]

◄──────────────── RESTORE ────────────────
rehydrate pseudonyms back to real values
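The swap-and-restore flow above can be sketched in a few lines. The pseudonym pool, function names, and simple substring matching are assumptions for illustration; a real scrubber would also detect unknown names, phone numbers, and emails on its own.

```python
# Illustrative pseudonym pool -- not the actual implementation.
PSEUDONYMS = ["Michael Chen", "Sarah Patel", "David Kim"]

def sanitize(text: str, known_names: list) -> tuple:
    """Swap real names for realistic pseudonyms; return the reverse map."""
    mapping = {}
    for i, name in enumerate(known_names):
        if name in text:
            alias = PSEUDONYMS[i % len(PSEUDONYMS)]
            mapping[alias] = name          # remember how to undo the swap
            text = text.replace(name, alias)
    return text, mapping

def rehydrate(text: str, mapping: dict) -> str:
    """Swap pseudonyms in the cloud's reply back to real values, locally."""
    for alias, real in mapping.items():
        text = text.replace(alias, real)
    return text

clean, m = sanitize("Call Phil Henderson tomorrow", ["Phil Henderson"])
print(clean)                               # the cloud only ever sees this
print(rehydrate("Reminder set to call Michael Chen.", m))
```

The key property is that `mapping` never leaves the local machine: the cloud reads and writes natural-sounding text about "Michael Chen" while the real identity exists only in local memory.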

Memory.

Most AI remembers by keeping a log of everything and hoping the model finds the relevant part. This system doesn't do that. Every fact gets ten numerical scores: how important it is, how recent, what domain it belongs to, how sensitive it is, how confident the source was. A question triggers math against those scores. Only what's relevant gets pulled. Nothing more.
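Score-based recall can be sketched as a weighted ranking. The score names, weights, and three-score `Fact` are assumptions for illustration; the source only says each fact carries ten numerical scores.

```python
from dataclasses import dataclass

@dataclass
class Fact:
    text: str
    importance: float   # 0..1, illustrative subset of the ten scores
    recency: float      # 0..1, higher = more recent
    domain: str

def recall(facts: list, query_domain: str, limit: int = 2) -> list:
    """Rank facts against the query's domain; return only the top few."""
    def score(f: Fact) -> float:
        domain_match = 1.0 if f.domain == query_domain else 0.0
        # Assumed weights -- the point is math over scores, not a log scan.
        return 0.5 * domain_match + 0.3 * f.importance + 0.2 * f.recency
    return sorted(facts, key=score, reverse=True)[:limit]

facts = [
    Fact("Phil prefers morning meetings", 0.9, 0.2, "work"),
    Fact("Phil's sister visits in June", 0.6, 0.9, "family"),
    Fact("The garage code changed last week", 0.4, 0.8, "home"),
]
print(recall(facts, query_domain="work", limit=1))
```

Retrieval cost stays flat as memory grows: the model never reads the whole log, only the handful of facts that win the ranking.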

Four paths.

Not every question needs the same amount of work. The system picks a path from the gate scores, not from how long the message is.

DEEP
Full pipeline. Classification, memory retrieval, external data, reasoning, and formatting across multiple models that check each other's work.
DEEP_LITE
Same pipeline, skips the expensive reasoning call. The first call classifies and drafts the response in one pass.
QUICK
One fast cloud call. Casual questions, low-stakes lookups.
SNAP
Zero API calls. Pattern-matched. Greetings, confirmations. Instant. Free.
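The four-way routing can be sketched as a cascade over gate outputs. Only the path names come from the source; the score names and thresholds here are assumptions.

```python
def choose_path(gate: dict) -> str:
    """Pick a processing path from gate scores, not message length."""
    if gate.get("is_greeting") or gate.get("is_confirmation"):
        return "SNAP"        # zero API calls; pattern-matched reply
    complexity = gate.get("complexity", 0.0)   # assumed 0..1 gate score
    if complexity >= 0.7:
        return "DEEP"        # full multi-model pipeline
    if complexity >= 0.4 or gate.get("needs_memory"):
        return "DEEP_LITE"   # same pipeline, reasoning call skipped
    return "QUICK"           # one fast cloud call

print(choose_path({"is_greeting": True}))   # SNAP
print(choose_path({"complexity": 0.9}))     # DEEP
```

Routing on scores rather than length means a short but loaded question ("should I take the job?") still gets the full pipeline, while a long pasted greeting stays free.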

The modules.

Each piece does one thing. Green border means local execution — no API cost. Amber means an external model call.