Project 01

George AI

A floating AI orb for macOS — voice-first, privacy-focused, built entirely in Swift + Python. Runs a local LLM on-device via Apple MLX, cascades to cloud APIs only when needed, plays chess, plays Arkanoid autonomously via reinforcement learning, and learns your preferences over time.

UI & App
Swift / SwiftUI / AppKit
Local AI
MLX · Llama 3.2 3B 4-bit
Cloud AI
OpenRouter → OpenAI → Gemini
Backend
Python Flask · localhost:8765
Speech In
SFSpeechRecognizer + AVAudioEngine
Memory
Local JSON · never sent to cloud

Media George AI in Action

Demo
George Interactive
George, my local LLM, conversing with me. He can see the screen and the desktop: he notices my flying bird and comments on it. Long-term memory is stored in a JSON file, so he remembers things across sessions.

Chapter 01 System Architecture & AI Routing

George uses a two-tier intelligence model: a local on-device LLM running on Apple Silicon GPU via mlx_lm for fast, private answers, and a cascade of cloud APIs for questions requiring real-time data.

File — Role
GeorgeController.swift — Main Swift controller: mic, speech recognition, screen capture, TTS, all AI calls
server.py — Python HTTP server on localhost:8765: LLM routing, local inference, cloud API calls
Agent.swift — Reinforcement learning layer: learns preferences and improves routing decisions over time
Memory.swift — Persistent memory of facts, conversation history, and user profile

Smart 3-Step Routing

Step 1 — Fast Pattern Matching — Before invoking any model, server.py runs the query through compiled regex patterns. Zero tokens, zero API calls.

# server.py — fast_route()
def fast_route(user_input, has_cloud):
    if PERSONAL_LOCAL_PATTERNS.search(user_input): return 'LOCAL'
    if not has_cloud: return 'LOCAL'
    if FORCE_CLOUD_PATTERNS.search(user_input): return 'CLOUD'
    if CLOUD_PATTERNS.search(user_input): return 'CLOUD'
    if LOCAL_PATTERNS.search(user_input): return 'LOCAL'
    return None  # ambiguous — go to Step 2
Pattern — Triggers
PERSONAL_LOCAL — Questions about you, your family, health, schedule: always local, never sent to cloud
FORCE_CLOUD — "use the internet", "search online", "google it": bypasses the local model entirely
CLOUD_PATTERNS — News, scores, stock prices, current events, today's weather
LOCAL_PATTERNS — Definitions, jokes, stories, math, general knowledge: anything timeless
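The pattern tables above can be sketched in Python. The regexes below are illustrative stand-ins (the real patterns in server.py are more extensive), but the routing order mirrors fast_route() exactly:

```python
import re

# Illustrative stand-ins, not the real server.py patterns
PERSONAL_LOCAL_PATTERNS = re.compile(r"\b(my (wife|husband|kids?|doctor|schedule)|remember I)\b", re.I)
FORCE_CLOUD_PATTERNS = re.compile(r"\b(use the internet|search online|google it)\b", re.I)
CLOUD_PATTERNS = re.compile(r"\b(news|scores?|stock|current events|today'?s weather)\b", re.I)
LOCAL_PATTERNS = re.compile(r"\b(define|joke|story|what is \d|capital of)\b", re.I)

def fast_route(user_input: str, has_cloud: bool):
    """Regex-only routing: zero tokens, zero API calls."""
    if PERSONAL_LOCAL_PATTERNS.search(user_input): return 'LOCAL'
    if not has_cloud: return 'LOCAL'
    if FORCE_CLOUD_PATTERNS.search(user_input): return 'CLOUD'
    if CLOUD_PATTERNS.search(user_input): return 'CLOUD'
    if LOCAL_PATTERNS.search(user_input): return 'LOCAL'
    return None  # ambiguous, fall through to the LLM router
```

Personal questions short-circuit first, so they can never reach a cloud check even when a cloud key is configured.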

Step 2 — LLM Router — If pattern matching returns None, George uses its own local LLM to classify the query. A 4-token inference call at temperature=0 returns only the word LOCAL or CLOUD.
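A minimal sketch of the router step. The prompt wording and the default-to-LOCAL fallback are my assumptions; the real call goes to the already-loaded mlx_lm model with max_tokens=4 and temperature=0, and here only the prompt and reply parsing are shown:

```python
ROUTER_PROMPT = (
    "Classify the user question as LOCAL (answerable from general knowledge) "
    "or CLOUD (needs live internet data). Answer with exactly one word.\n"
    "Question: {q}\nAnswer:"
)

def parse_route(raw: str) -> str:
    """Normalise the model's short reply; default to LOCAL on anything odd."""
    stripped = raw.strip()
    word = stripped.upper().split()[0] if stripped else ""
    return word if word in ("LOCAL", "CLOUD") else "LOCAL"
```

Defaulting an unparseable reply to LOCAL keeps the privacy-preserving path as the safe fallback.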

Step 3 — Cloud Cascade — When a query needs the internet, George tries providers in order:

Priority — Provider — Notes
0 — wttr.in — Weather queries bypass all LLMs: free, no API key
1 — OpenRouter — Tries gpt-4o-mini, gemma-3, llama-3 in order
2 — OpenAI — Direct fallback if OpenRouter fails
3 — Gemini — Google Gemini 2.0 Flash as final fallback
4 — Local fallback — Answers locally with a caveat if all cloud providers fail
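The cascade can be sketched as a simple priority loop. The provider callables here are hypothetical stand-ins; in the real server, wttr.in is only consulted for weather queries and each tier makes an actual API call:

```python
def cascade(query, providers):
    """Try cloud providers in priority order; fall back to local with a caveat.

    `providers` maps provider name -> callable returning an answer string
    or raising on failure. Callables are stand-ins for real API calls.
    """
    order = ["wttr.in", "OpenRouter", "OpenAI", "Gemini"]
    for name in order:
        fn = providers.get(name)
        if fn is None:
            continue  # provider not configured (e.g. no API key)
        try:
            return name, fn(query)
        except Exception:
            continue  # provider failed, try the next tier
    return "local", "I couldn't reach the internet, but here's my best offline answer."
```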
🟡
When any cloud provider answers, the orb displays a yellow ring — you always know when data left your Mac.

Streaming Response Format

{"sentence": "Here is the first sentence.", "source": "local"}
{"sentence": "And here is the second.",      "source": "cloud"}
{"done": true, "full": "Complete text",      "source": "cloud"}json

Chapter 02 Screen Vision

George can look at your screen on demand. Say "Hey George, what's on my screen?" and George takes a screenshot, encodes it, and sends it to a cloud vision model.

Trigger phrases: "what's on my screen" · "describe my screen" · "what do you see" · "look at my screen" · "read the screen" · "summarize this page"

// GeorgeController.swift — handleScreenVision()
let path = "/tmp/george_screen.png"
_ = await shell("screencapture -x " + path)
// The PNG is base64-encoded and attached to the cloud request as:
// { "type": "image_url", "image_url": { "url": "data:image/png;base64,..." } }
🔒
George never passively monitors your screen. Screen vision is triggered only when you explicitly ask. Screen Recording permission must be granted in System Settings → Privacy & Security → Screen Recording.

Chapter 03 Hardware Integration

George integrates with macOS hardware through official Apple APIs and standard Unix tools — no kernel extensions, no custom drivers.

Microphone

Framework — Role
AVAudioEngine — Captures raw audio at 44.1 kHz, streams buffers to the speech recognizer
SFSpeechRecognizer — On-device speech recognition; a second instance listens for "stop" / "shut up" while George is speaking

GPU — Local AI Inference

The local LLM runs on Apple Silicon GPU (M1/M2/M3/M4) using Apple's MLX framework via mlx_lm. Loaded into GPU memory at startup.

Hardware Stats

Data — Command
CPU usage — ps -A -o %cpu | awk
RAM total / free — sysctl -n hw.memsize / vm_stat
Battery — pmset -g batt
Disk usage — df -h /
Network interface — route get default
Fan / CPU temp — istats fan speed / istats cpu temp (optional)

Chapter 04 Memory Engine

George's memory system is what makes it feel like a real companion rather than a stateless chatbot. Everything lives in ~/.george/memory.json — never sent to the cloud.

{ "persona": { "name": "George", "traits": ["curious","warm","witty","direct","honest"] },
  "history": [ /* last 80 conversation turns */ ],
  "facts":   [ /* up to 300 plain-English facts about you */ ] }json

Proactive Fact Mining

Every user message is scanned by a regex-based fact extractor before it reaches the AI. Categories detected:

Name · Spouse/Partner · Children · Extended family · Job · Location · Age · Pets · Hobbies (21 types) · Music genres (27 types)
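A minimal sketch of such a regex extractor, with two illustrative categories only (the real extractor covers all the categories listed above):

```python
import re

# Two illustrative categories; the real extractor handles many more
# (pets, hobbies, music genres, family, job, age, ...).
FACT_PATTERNS = {
    "name":     re.compile(r"\bmy name is (\w+)", re.I),
    "location": re.compile(r"\bI live in ([A-Za-z ]+)", re.I),
}

def extract_facts(message):
    """Regex pass run on every user message before it reaches the AI."""
    facts = []
    for category, pattern in FACT_PATTERNS.items():
        m = pattern.search(message)
        if m:
            facts.append(f"{category}: {m.group(1).strip()}")
    return facts
```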

Correction System

If you say "correction", "remember I...", "actually I...", or "I don't listen to X", George removes the conflicting fact and stores the corrected version.

Rule
Voice-only output — no markdown, no asterisks, no bullet points
Keep replies to 1–3 sentences unless asked for detail
Naturally weave in past facts when they enrich the answer
Never invent facts, news, weather, or current events without real data
Never mention Claude, Anthropic, LLaMA, or that you're an AI — you ARE George

Chapter 05 Orb Visual System

The floating orb is a real-time status display built in pure SwiftUI with 11 distinct visual styles.

The Five Orb States

State — Visual Behavior
.booting — Scale pulls in (0.92×); three animated dots cycle below showing loading progress
.idle — Slow breathing: scale oscillates 1.0 ↔ 1.07 over 3 seconds
.listening — Orb shrinks to 82%; a ripple ring expands and fades; live transcription displayed
.thinking — Subtle nudge to 1.02×; 6 particles orbit the perimeter
.speaking — Rapid pulse between 1.0 and 1.26 scale: fast heartbeat effect

The 11 Orb Styles

00 — Classic Blue — Radial gradient sphere with blue glow and orbiting particles
01 — Blue Flower — 8 rotating ellipse petals orbit the outside of the orb
02 — Plasma Arc — 3 partial arcs rotate at 120° offsets: electric effect
03 — Crystal Gem — 6 rotating diamond shapes with ice-white facets
04 — Deep Nebula — 4 rotating gradient nebula clouds + 12 star dots
05 — Matrix Rain — Green globe with orbiting 0, 1, ▓, ░ characters
06 — Lava Lamp — 5 blurred orange/red blobs float and merge inside
07 — Ghost Light — Three concentric rings breathing slowly around a white core
08 — Aurora — 5 blurred gradient bands rotating like aurora curtains
09 — Neon Pulse — Hot pink globe with 3 concentric neon rings pulsing outward
10 — Vortex — 6 arc segments colored by hue rotation spin together
💡
Double-tap the orb to quit George. Single taps are ignored — voice is the primary input.

Chapter 06 Chess Engine

George includes a complete, self-contained chess implementation in Swift with no external libraries — full rules, minimax search, alpha-beta pruning, and natural-language move commentary.

Board Representation

Pieces encoded as raw integers 0–12, stored as a flat 64-element array. ChessBoard is a Swift struct (value type) — every speculative move creates an independent copy.

struct ChessBoard {
    var squares: [ChessPiece] = Array(repeating: .empty, count: 64)
    var whiteKingsideCastle  = true
    var whiteQueensideCastle = true
    var blackKingsideCastle  = true
    var blackQueensideCastle = true
    var enPassantSquare: Int = -1
    var whiteToMove = true
}

Material Values

Piece — Value — Reasoning
Pawn — 100 — Baseline unit
Knight — 320 — Slightly less than a bishop
Bishop — 330 — Bishop pair advantage
Rook — 500 — Worth ~5 pawns
Queen — 900 — Worth ~9 pawns
King — 20,000 — Sentinel: the game ends before capture
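These values translate directly into a material evaluator. This sketch uses piece letters for readability; the real ChessBoard stores raw integers 0 through 12:

```python
# Material values from the table above; sign convention: positive favours White.
VALUES = {"P": 100, "N": 320, "B": 330, "R": 500, "Q": 900, "K": 20000}

def material(board):
    """Sum material for a board given as piece letters.

    Uppercase = White, lowercase = Black; anything else is ignored.
    """
    score = 0
    for piece in board:
        value = VALUES.get(piece.upper())
        if value is not None:
            score += value if piece.isupper() else -value
    return score
```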

Minimax with Alpha-Beta Pruning

Searches 4 moves deep. Without pruning: ~810,000 positions (30⁴). Alpha-beta reduces the effective branching factor from ~30 to ~7 — responses under one second on Apple Silicon.

private static func alphabeta(_ board: ChessBoard, depth: Int,
                                alpha: Int, beta: Int, maximising: Bool) -> Int {
    if depth == 0 { return board.evaluate() }
    var alpha = alpha, beta = beta
    for move in board.legalMoves() {
        // ...recurse one ply deeper, tightening alpha (maximising) or beta (minimising)...
        if beta <= alpha { break }  // cutoff — prune the remaining branches
    }
    // ...return the best score found...
}
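The same search can be shown as a runnable sketch over an abstract game tree, where children() and evaluate() stand in for ChessBoard's legal-move generator and evaluate():

```python
def alphabeta(node, depth, alpha, beta, maximising, children, evaluate):
    """Minimax with alpha-beta cutoffs over an abstract game tree."""
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    if maximising:
        best = float("-inf")
        for child in kids:
            best = max(best, alphabeta(child, depth - 1, alpha, beta, False, children, evaluate))
            alpha = max(alpha, best)
            if beta <= alpha:
                break  # beta cutoff: remaining siblings cannot affect the result
        return best
    else:
        best = float("inf")
        for child in kids:
            best = min(best, alphabeta(child, depth - 1, alpha, beta, True, children, evaluate))
            beta = min(beta, best)
            if beta <= alpha:
                break  # alpha cutoff
        return best
```

With a nested-list tree, leaves are evaluated and the pruned result matches plain minimax.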

Chapter 07 Reinforcement Learning Game Player

George can play Arkanoid autonomously using a Q-learning agent. It watches the game screen via screenshot, infers state, chooses actions, and sends real keyboard inputs to control the paddle.

Q-Learning Algorithm

# Bellman equation — GamePlayer.swift updateQ()
bestNext = max(qTable[nextState][action] for action in actions)
updated  = current + α × (reward + γ × bestNext − current)
qTable[state][action] = updated
Hyperparameter — Value — Reasoning
Learning rate (α) — 0.25 — Conservative for noisy screen-based state
Discount factor (γ) — 0.95 — Longer horizon: brick hits are delayed by many frames
Epsilon (ε) — 0.3 → 0.1 — Explore freely at first, then mostly exploit the learned strategy
Frame rate — 50 ms/loop — 20 decisions per second
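The update rule with these hyperparameters, written out as runnable Python (the dictionary-of-dictionaries Q-table layout is my assumption about the Swift implementation):

```python
def update_q(q_table, state, action, reward, next_state, actions,
             alpha=0.25, gamma=0.95):
    """One Q-learning step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).

    Missing entries default to 0.0, so new states start neutral.
    """
    current = q_table.get(state, {}).get(action, 0.0)
    best_next = max((q_table.get(next_state, {}).get(a, 0.0) for a in actions),
                    default=0.0)
    q_table.setdefault(state, {})[action] = current + alpha * (reward + gamma * best_next - current)
    return q_table[state][action]
```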

Reward Function

Reward — Trigger
+2.0 — Ball moving toward paddle AND paddle within 15% of ball
+1.5 — A brick was destroyed this frame
+1.0 — Ball velocity moving toward paddle (correct anticipation)
-0.5 — Ball moving toward paddle but paddle is far away
-2.0 — Ball reached the bottom edge (ball lost)
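A sketch of the shaping logic, assuming normalised 0 to 1 coordinates and that positive y-velocity means toward the paddle. How the overlapping rows combine is my guess, so the +1.0 anticipation bonus is omitted here:

```python
def reward(ball_y_vel, paddle_x, ball_x, brick_destroyed, ball_lost, width=1.0):
    """Shaped reward for the Arkanoid agent (simplified from the table above)."""
    if ball_lost:
        return -2.0  # terminal penalty dominates everything else
    r = 0.0
    if brick_destroyed:
        r += 1.5
    toward = ball_y_vel > 0               # assumption: +y is toward the paddle
    close = abs(paddle_x - ball_x) <= 0.15 * width
    if toward and close:
        r += 2.0                          # well positioned for the incoming ball
    elif toward:
        r += -0.5                         # ball incoming but paddle far away
    return r
```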
💾
The Q-table is saved to ~/.george/qtable.json. George gets smarter across sessions. On startup: "Starting Arkanoid. I have 847 states learned."

Chapter 08 App Entry Point

main.swift is 46 lines. AppDelegate.swift is 42 lines. Small files, enormous consequences.

Activation Policy

app.setActivationPolicy(.accessory)
// No Dock icon · Not in Cmd+Tab · No menu bar
// George is ambient — always present, never intrusive

Boot Sequence

  1. OS loads George binary → main.swift executes
  2. installCrashHandlers() registers SIGBUS, SIGSEGV, SIGABRT, SIGILL, SIGFPE
  3. NSApplication.shared creates the app singleton
  4. AppDelegate() instantiated and wired to app
  5. setActivationPolicy(.accessory) — hidden from Dock
  6. app.run() — main run loop starts, never returns
  7. applicationDidFinishLaunching fires
  8. NSPanel created and configured (.borderless, .nonactivatingPanel)
  9. Panel positioned bottom-right using NSScreen.main.visibleFrame
  10. GeorgeController() created — Memory + Agent loaded from disk
  11. OrbView SwiftUI view tree created
  12. NSHostingView bridges SwiftUI into AppKit
  13. panel.makeKeyAndOrderFront — orb appears on screen
  14. Task { await george.boot() } scheduled async
  15. Boot: requests mic + speech recognition permissions
  16. Boot: launches server.py as child process
  17. Boot: pings localhost:8765 until server responds
  18. Boot: server.py loads local LLM into Apple Silicon GPU
  19. Boot: state → .idle — breathing animation begins
  20. George speaks a random greeting — fully operational
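Boot step 17, polling until server.py answers, can be sketched as a small health-check loop. The endpoint path and timeout values are assumptions; the real Swift code may use a different route:

```python
import time
import urllib.request

def wait_for_server(url="http://127.0.0.1:8765/", timeout=60.0, interval=0.5):
    """Poll the server until it responds, or give up after `timeout` seconds."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=1):
                return True  # server is up
        except OSError:
            time.sleep(interval)  # not ready yet, retry shortly
    return False
```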

Chapter 09 Setup & Mobile UI

setup.sh — One-Command Installation

./setup.sh
~/Desktop/George.command   # double-click launcher

8 steps: Verify prerequisites → Create ~/.george/ → Copy server.py → Enter API keys → Install mlx-lm → Install yt-dlp → swift build -c release → Download Llama 3.2 3B (~1.8 GB)

Mobile Web UI

A self-contained single-file web app served by server.py at GET /. Open it on any phone, tablet, or computer — no install required. Dark space theme, CSS-only animated orb, streaming text, voice input via Web Speech API.

📱
You can be in another room, send George a query from your phone, and hear the response come out of your Mac speakers. The phone is just a remote control.