A floating AI orb for macOS — voice-first, privacy-focused, built entirely in Swift + Python. Runs a local LLM on-device via Apple MLX, cascades to cloud APIs only when needed, plays chess, plays Arkanoid autonomously via reinforcement learning, and learns your preferences over time.
George uses a two-tier intelligence model: a local on-device LLM running on Apple Silicon GPU via mlx_lm for fast, private answers, and a cascade of cloud APIs for questions requiring real-time data.
| File | Role |
|---|---|
| `GeorgeController.swift` | Main Swift controller — mic, speech recognition, screen capture, TTS, all AI calls |
| `server.py` | Python HTTP server on `localhost:8765` — LLM routing, local inference, cloud API calls |
| `Agent.swift` | Reinforcement learning layer — learns preferences and improves routing decisions over time |
| `Memory.swift` | Persistent memory of facts, conversation history, and user profile |
Step 1 — Fast Pattern Matching — Before invoking any model, server.py runs the query through compiled regex patterns. Zero tokens, zero API calls.
```python
# server.py — fast_route()
def fast_route(user_input, has_cloud):
    if PERSONAL_LOCAL_PATTERNS.search(user_input): return 'LOCAL'
    if not has_cloud: return 'LOCAL'
    if FORCE_CLOUD_PATTERNS.search(user_input): return 'CLOUD'
    if CLOUD_PATTERNS.search(user_input): return 'CLOUD'
    if LOCAL_PATTERNS.search(user_input): return 'LOCAL'
    return None  # ambiguous — go to Step 2
```
| Pattern | Triggers |
|---|---|
| `PERSONAL_LOCAL` | Questions about you, your family, health, schedule — always local, never sent to cloud |
| `FORCE_CLOUD` | "use the internet", "search online", "google it" — bypasses the local model entirely |
| `CLOUD_PATTERNS` | News, scores, stock prices, current events, today's weather |
| `LOCAL_PATTERNS` | Definitions, jokes, stories, math, general knowledge — anything timeless |
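The pattern tables above can be sketched as compiled regexes. These are illustrative examples only — the real pattern lists in `server.py` are much longer:

```python
import re

# Illustrative subset of George's routing patterns (not the full production set).
PERSONAL_LOCAL_PATTERNS = re.compile(r"\bmy (wife|husband|kids?|doctor|health|schedule)\b", re.I)
FORCE_CLOUD_PATTERNS = re.compile(r"\b(use the internet|search online|google it)\b", re.I)
CLOUD_PATTERNS = re.compile(r"\b(news|scores?|stock price|current events|today'?s weather)\b", re.I)
LOCAL_PATTERNS = re.compile(r"\b(define|tell me a joke|story|what is \d)\b", re.I)
```

Compiling once at import time is what makes Step 1 effectively free — each route decision is just a handful of regex scans.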
Step 2 — LLM Router — If pattern matching returns `None`, George uses its own local LLM to classify the query: a 4-token inference call at `temperature=0` that returns only the word `LOCAL` or `CLOUD`.
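Since small models occasionally return more than the requested single word, the router's reply needs strict parsing. A minimal sketch of that parsing step — the prompt wording and the privacy-preserving default to `LOCAL` are assumptions, not necessarily `server.py`'s exact logic:

```python
# Hypothetical router prompt — the real wording in server.py may differ.
ROUTER_PROMPT = (
    "Classify this query. Reply with exactly one word, LOCAL or CLOUD.\n"
    "Query: {query}\nAnswer:"
)

def parse_route(raw: str) -> str:
    """Normalise the model's short reply; default to LOCAL when it's unclear."""
    words = raw.strip().upper().split()
    word = words[0] if words else ""
    return word if word in ("LOCAL", "CLOUD") else "LOCAL"
```

Defaulting ambiguous replies to `LOCAL` errs on the side of privacy: a misroute costs answer freshness, never data exposure.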
Step 3 — Cloud Cascade — When a query needs the internet, George tries providers in order:
| Priority | Provider | Notes |
|---|---|---|
| 0 | wttr.in | Weather queries bypass all LLMs — free, no API key |
| 1 | OpenRouter | Tries gpt-4o-mini, gemma-3, llama-3 in order |
| 2 | OpenAI | Direct fallback if OpenRouter fails |
| 3 | Gemini | Google Gemini 2.0 Flash as final fallback |
| 4 | Local fallback | Answers locally, with a caveat, if every cloud provider fails |
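The cascade reduces to trying an ordered list of provider callables until one succeeds. A sketch of that control flow — the provider functions here are stand-ins, and `answer_locally` is a stub for the on-device model:

```python
def answer_locally(query):
    # Stand-in for the on-device MLX model.
    return f"Best local guess for: {query}"

def cascade(query, providers):
    """Try each (name, fn) provider in priority order.

    `providers` is a list of (name, callable) pairs, e.g. OpenRouter, then
    OpenAI, then Gemini. Each callable returns an answer string or raises.
    Falls back to a caveated local answer if every provider fails.
    """
    for name, ask in providers:
        try:
            return ask(query), name
        except Exception:
            continue  # provider down, rate-limited, or misconfigured — try the next
    return answer_locally(query) + " (Note: I couldn't reach the internet.)", "local"
```

Catching broadly here is deliberate: any provider failure, for any reason, should degrade gracefully rather than surface an error to a voice user.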
{"sentence": "Here is the first sentence.", "source": "local"}
{"sentence": "And here is the second.", "source": "cloud"}
{"done": true, "full": "Complete text", "source": "cloud"}json
George can look at your screen on demand. Say "Hey George, what's on my screen?" and George takes a screenshot, encodes it, and sends it to a cloud vision model.
Trigger phrases: "what's on my screen" · "describe my screen" · "what do you see" · "look at my screen" · "read the screen" · "summarize this page"
```swift
// GeorgeController.swift — handleScreenVision()
let path = "/tmp/george_screen.png"
_ = await shell("screencapture -x " + path)
```

The captured PNG is base64-encoded into an OpenAI-style vision payload:

```json
{ "type": "image_url", "image_url": { "url": "data:image/png;base64,..." } }
```
George integrates with macOS hardware through official Apple APIs and standard Unix tools — no kernel extensions, no custom drivers.
| Framework | Role |
|---|---|
| `AVAudioEngine` | Captures raw audio at 44.1 kHz and streams buffers to the speech recognizer |
| `SFSpeechRecognizer` | On-device speech recognition; a second instance listens for "stop" / "shut up" while George is speaking |
The local LLM runs on Apple Silicon GPU (M1/M2/M3/M4) using Apple's MLX framework via mlx_lm. Loaded into GPU memory at startup.
| Data | Command |
|---|---|
| CPU usage | `ps -A -o %cpu \| awk` |
| RAM total / free | `sysctl -n hw.memsize` / `vm_stat` |
| Battery | `pmset -g batt` |
| Disk usage | `df -h /` |
| Network interface | `route get default` |
| Fan / CPU temp | `istats fan speed` / `istats cpu temp` (optional) |
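Turning those command outputs into numbers is simple string work. For example, summing total CPU from `ps -A -o %cpu` (a sketch; the parsing function and sample output are illustrative, not George's exact code):

```python
def total_cpu(ps_output: str) -> float:
    """Sum the %CPU column from `ps -A -o %cpu` output (header on line 1)."""
    lines = ps_output.strip().splitlines()[1:]  # skip the "%CPU" header row
    return sum(float(v) for v in lines if v.strip())
```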
George's memory system is what makes it feel like a real companion rather than a stateless chatbot. Everything lives in ~/.george/memory.json — never sent to the cloud.
{ "persona": { "name": "George", "traits": ["curious","warm","witty","direct","honest"] },
"history": [ /* last 80 conversation turns */ ],
"facts": [ /* up to 300 plain-English facts about you */ ] }json
Every user message is scanned by a regex-based fact extractor before it reaches the AI. Categories detected:
Name · Spouse/Partner · Children · Extended family · Job · Location · Age · Pets · Hobbies (21 types) · Music genres (27 types)
If you say "correction", "remember I...", "actually I...", or "I don't listen to X", George removes the conflicting fact and stores the corrected version.
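Correction handling can be sketched as: detect a correction phrase, drop stored facts that mention the same subject, then store the corrected version. The regex and the substring-matching rule below are illustrative, not `server.py`'s exact logic:

```python
import re

# Illustrative correction triggers, mirroring the phrases listed above.
CORRECTION = re.compile(r"\b(correction|remember i|actually i|i don'?t listen to)\b", re.I)

def apply_correction(facts, user_msg, new_fact, subject):
    """If user_msg is a correction, remove facts mentioning `subject`, then add new_fact."""
    if not CORRECTION.search(user_msg):
        return facts
    kept = [f for f in facts if subject.lower() not in f.lower()]
    return kept + [new_fact]
```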
| Rule |
|---|
| Voice-only output — no markdown, no asterisks, no bullet points |
| Keep replies to 1–3 sentences unless asked for detail |
| Naturally weave in past facts when they enrich the answer |
| Never invent facts, news, weather, or current events without real data |
| Never mention Claude, Anthropic, LLaMA, or that you're an AI — you ARE George |
The floating orb is a real-time status display built in pure SwiftUI with 11 distinct visual styles.
| State | Visual Behavior |
|---|---|
| `.booting` | Scale pulls in (0.92×); three animated dots cycle below, showing loading progress |
| `.idle` | Slow breathing — scale oscillates 1.0 ↔ 1.07 over 3 seconds |
| `.listening` | Orb shrinks to 82%; a ripple ring expands and fades; live transcription displayed |
| `.thinking` | Subtle nudge to 1.02×; 6 particles orbit the perimeter |
| `.speaking` | Rapid pulse between 1.0 and 1.26 scale — fast heartbeat effect |
George includes a complete, self-contained chess implementation in Swift with no external libraries — full rules, minimax search, alpha-beta pruning, and natural-language move commentary.
Pieces encoded as raw integers 0–12, stored as a flat 64-element array. ChessBoard is a Swift struct (value type) — every speculative move creates an independent copy.
```swift
struct ChessBoard {
    var squares: [ChessPiece] = Array(repeating: .empty, count: 64)
    var whiteKingsideCastle = true
    var whiteQueensideCastle = true
    var blackKingsideCastle = true
    var blackQueensideCastle = true
    var enPassantSquare: Int = -1
    var whiteToMove = true
}
```
| Piece | Value | Reasoning |
|---|---|---|
| Pawn | 100 | Baseline unit |
| Knight | 320 | Slightly less than bishop |
| Bishop | 330 | Bishop pair advantage |
| Rook | 500 | Worth ~5 pawns |
| Queen | 900 | Worth ~9 pawns |
| King | 20,000 | Sentinel — game ends before capture |
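The table translates directly into material evaluation over the flat 64-square array. A Python sketch of the idea using a character-per-square board (the real engine is Swift, uses the integer encoding described above, and also scores position, not just material):

```python
# Centipawn values from the table above.
VALUES = {"P": 100, "N": 320, "B": 330, "R": 500, "Q": 900, "K": 20_000}

def material(squares):
    """squares: 64 chars, uppercase = white, lowercase = black, '.' = empty.

    Returns the material balance in centipawns; positive favours white.
    """
    score = 0
    for sq in squares:
        piece = sq.upper()
        if piece in VALUES:
            score += VALUES[piece] if sq.isupper() else -VALUES[piece]
    return score
```

On the starting position the two sides cancel exactly, which is a handy sanity check for any evaluation function.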
Searches 4 moves deep. Without pruning: ~810,000 positions (30⁴). Alpha-beta reduces the effective branching factor from ~30 to ~7 — responses under one second on Apple Silicon.
```swift
// Alpha-beta search (abridged)
private static func alphabeta(_ board: ChessBoard, depth: Int,
                              alpha: Int, beta: Int, maximising: Bool) -> Int {
    if depth == 0 { return board.evaluate() }
    var alpha = alpha, beta = beta
    for move in board.legalMoves() {
        // ... recurse on a copied board, tightening alpha (white) or beta (black) ...
        if beta <= alpha { break } // beta cutoff — prune remaining branches
    }
    // ...
}
```
George can play Arkanoid autonomously using a Q-learning agent. It watches the game screen via screenshot, infers state, chooses actions, and sends real keyboard inputs to control the paddle.
```python
# Bellman update — GamePlayer.swift updateQ() (pseudocode)
best_next = max(qTable[nextState][a] for a in actions)
updated = current + α * (reward + γ * best_next - current)
qTable[state][action] = updated
```
| Hyperparameter | Value | Reasoning |
|---|---|---|
| Learning rate (α) | 0.25 | Conservative for noisy screen-based state |
| Discount factor (γ) | 0.95 | Longer horizon — brick hits are delayed by many frames |
| Epsilon (ε) | 0.3 → 0.1 | Explore freely at first, then mostly exploit learned strategy |
| Frame rate | 50ms / loop | 20 decisions per second |
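Action selection with those hyperparameters is standard ε-greedy with decay. A sketch of both pieces — the decay schedule is an assumption, since the table only gives the 0.3 → 0.1 endpoints:

```python
import random

def choose_action(q_row, epsilon, rng=random):
    """q_row: {action: q_value}. Explore with probability ε, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(list(q_row))      # explore: random action
    return max(q_row, key=q_row.get)        # exploit: best-known action

def decay_epsilon(epsilon, floor=0.1, rate=0.995):
    """Multiplicative decay per episode, clamped at the exploration floor."""
    return max(floor, epsilon * rate)
```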
| Reward | Trigger |
|---|---|
| +2.0 | Ball moving toward paddle AND paddle within 15% of ball |
| +1.5 | A brick was destroyed this frame |
| +1.0 | Ball velocity moving toward paddle (correct anticipation) |
| -0.5 | Ball moving toward paddle but paddle is far away |
| -2.0 | Ball reached the bottom edge (ball lost) |
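The reward table can be sketched as a shaping function over the inferred frame state. The field names are illustrative, and how the rows combine within a single frame (simple addition here) is an assumption:

```python
def reward(state):
    """Reward shaping per the table above.

    state: dict with ball_to_paddle, paddle_close (within 15%),
    brick_destroyed, ball_lost — all booleans inferred from the screenshot.
    """
    if state["ball_lost"]:
        return -2.0                     # ball reached the bottom edge
    r = 0.0
    if state["brick_destroyed"]:
        r += 1.5                        # a brick went down this frame
    if state["ball_to_paddle"]:
        r += 2.0 if state["paddle_close"] else -0.5   # positioned vs. far away
        r += 1.0                        # correct anticipation bonus
    return r
```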
The Q-table persists in `~/.george/qtable.json`, so George gets smarter across sessions. On startup: "Starting Arkanoid. I have 847 states learned."

`main.swift` is 46 lines. `AppDelegate.swift` is 42 lines. Small files, enormous consequences.
```swift
app.setActivationPolicy(.accessory)
// No Dock icon · Not in Cmd+Tab · No menu bar
// George is ambient — always present, never intrusive
```
```bash
./setup.sh
~/Desktop/George.command   # double-click launcher
```
8 steps: Verify prerequisites → Create ~/.george/ → Copy server.py → Enter API keys → Install mlx-lm → Install yt-dlp → swift build -c release → Download Llama 3.2 3B (~1.8 GB)
A self-contained single-file web app served by server.py at GET /. Open it on any phone, tablet, or computer — no install required. Dark space theme, CSS-only animated orb, streaming text, voice input via Web Speech API.