Mac Studio M3 Ultra 96GB
Complete Setup Guide
Every command, every tool, every model — the full runbook for transforming a brand-new Mac Studio into a 24/7 AI powerhouse
Hardware
Mac Studio M3 Ultra Specifications
Software Bundle
Complete AI Software Inventory
Every tool, model, and framework installed on this Mac Studio. Filter by category to explore the full stack.
Full Runbook
From unboxing to always-on AI station
Unbox & Inspect
Unbox the Mac Studio, check for physical damage. Connect to a display via HDMI or Thunderbolt.
Peripherals & Network
Connect keyboard and mouse (wired recommended for first setup). Connect Ethernet cable — preferred over Wi-Fi for stability during large model downloads.
Credentials Ready
Have these ready before starting:
- Apple ID credentials
- Anthropic API key (sk-ant-...)
- OpenAI API key (sk-...)
- GitHub Personal Access Token (ghp_...)
- Discord Bot Token(s) (MTI...)
- Discord Server ID
First Boot Wizard
Power on, select language & region (Hong Kong). Skip Migration Assistant. Sign in with Apple ID. Create admin account with a strong password (16+ chars).
Post-Wizard Configuration
Show commands
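A plausible baseline for the collapsed commands, assuming standard `defaults`, `softwareupdate`, and `scutil` tooling; the hostname is an example, pick your own.

```shell
# Show file extensions and the path bar in Finder
defaults write NSGlobalDomain AppleShowAllExtensions -bool true
defaults write com.apple.finder ShowPathbar -bool true

# Keep macOS checking for updates automatically
sudo softwareupdate --schedule on

# Name the machine for SSH / Bonjour (example name)
sudo scutil --set ComputerName "mac-studio-ai"
sudo scutil --set LocalHostName "mac-studio-ai"
```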
FileVault Disk Encryption
Encrypt the entire disk at rest. Record the recovery key offline in a safe location.
Show command
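The standard way to do this from the terminal:

```shell
# Enable FileVault; this prints a recovery key -- record it offline
sudo fdesetup enable

# Confirm encryption is on (or still in progress)
fdesetup status
```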
Firewall & Stealth Mode
Show commands
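The application firewall is controlled via `socketfilterfw`:

```shell
FW=/usr/libexec/ApplicationFirewall/socketfilterfw
sudo $FW --setglobalstate on    # turn the application firewall on
sudo $FW --setstealthmode on    # don't respond to probes (ping, port scans)
sudo $FW --setloggingmode on    # log blocked connections
sudo $FW --getglobalstate       # verify
```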
Energy Settings (24/7 Operation)
Prevent the Mac from sleeping — critical for always-on AI workloads.
Show commands
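These `pmset` settings keep the machine awake indefinitely while letting the display sleep:

```shell
sudo pmset -a sleep 0           # never sleep the system
sudo pmset -a disksleep 0       # never spin down disks
sudo pmset -a displaysleep 10   # display may sleep; workloads keep running
sudo pmset -a autorestart 1     # reboot automatically after a power failure
sudo pmset -a womp 1            # Wake-on-LAN
pmset -g                        # review the effective settings
```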
SSH Remote Access
Enable secure remote management with key-only authentication.
Show commands
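A sketch of key-only SSH setup; the sshd_config edits are shown as comments because they must be made in the file, not on the command line.

```shell
# Enable Remote Login (sshd)
sudo systemsetup -setremotelogin on

# From another machine, install your public key:
#   ssh-copy-id user@mac-studio.local

# Then disable password logins in /etc/ssh/sshd_config:
#   PasswordAuthentication no
#   KbdInteractiveAuthentication no

# Reload sshd to apply
sudo launchctl kickstart -k system/com.openssh.sshd
```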
Xcode Command Line Tools
Show command
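Required before Homebrew and any compiler work:

```shell
xcode-select --install   # opens a GUI prompt
xcode-select -p          # verify: /Library/Developer/CommandLineTools
```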
Homebrew
The essential package manager for macOS.
Show commands
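The official installer, plus the shell setup Apple Silicon needs (Homebrew lives in /opt/homebrew):

```shell
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Put brew on the PATH for zsh
echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> ~/.zprofile
eval "$(/opt/homebrew/bin/brew shellenv)"

brew doctor   # sanity check
```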
Core CLI Tools
Install all essential command-line tools in one batch.
Show command
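One batch covering the CLI tools listed in the software inventory at the end of this guide:

```shell
brew install git node python@3.12 wget curl htop jq tmux \
  ripgrep fd bat fzf gh tree cmake pkg-config
```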
Git & GitHub Configuration
Show commands
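Substitute your own name and email; `gh auth login` will prompt for the ghp_... token prepared earlier.

```shell
git config --global user.name  "Your Name"
git config --global user.email "you@example.com"
git config --global init.defaultBranch main

# Authenticate the GitHub CLI
gh auth login
```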
Node.js & Global npm Packages
Show commands
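Node itself comes from the Homebrew batch above; the globals below are the ones this guide's software list names.

```shell
node --version && npm --version
npm install -g pm2 @anthropic-ai/claude-code openclaw
```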
Python Setup
Show commands
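A minimal setup assuming the Homebrew `python@3.12` install; the venv path is a suggestion.

```shell
python3.12 --version

# Keep project dependencies isolated in a virtual environment
python3.12 -m venv ~/venvs/ai
source ~/venvs/ai/bin/activate
pip install --upgrade pip
```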
Ollama (Local LLM Engine)
The core runtime that powers all local AI models. Optimised for Apple Silicon.
Show commands
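Install and run Ollama as a login service, then confirm the API answers on its default port:

```shell
brew install ollama
brew services start ollama    # auto-starts on login
ollama --version
curl -s http://localhost:11434/api/version   # API sanity check
```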
Ollama Optimisation for M3 Ultra
Tune Ollama for maximum performance on 96GB unified memory.
Show configuration
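These are real Ollama environment variables; the values are suggestions for a 96 GB machine, not measured optima.

```shell
# Suggested values for 96 GB unified memory -- tune to your workload
launchctl setenv OLLAMA_FLASH_ATTENTION 1     # faster attention on Apple Silicon
launchctl setenv OLLAMA_KV_CACHE_TYPE q8_0    # quantised KV cache, roughly halves cache RAM
launchctl setenv OLLAMA_KEEP_ALIVE 24h        # keep the primary model resident
launchctl setenv OLLAMA_MAX_LOADED_MODELS 2   # primary model + one embedder
brew services restart ollama
```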
LM Studio (GUI)
Visual interface for running and testing LLMs. Provides an OpenAI-compatible API at localhost:1234.
Show command
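Install via the cask; the API check works once the local server is enabled in LM Studio's UI.

```shell
brew install --cask lm-studio
curl -s http://localhost:1234/v1/models   # after enabling the local server
```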
Whisper (Speech-to-Text)
Show commands
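A sketch using the Homebrew `whisper-cpp` formula; note the installed binary name has varied across versions (`whisper-cli` in newer releases, `whisper-cpp` in older ones).

```shell
brew install whisper-cpp

# Download a model (base.en, ~150 MB)
curl -LO https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin

# Transcribe a 16 kHz WAV file
whisper-cli -m ggml-base.en.bin -f audio.wav
```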
Stable Diffusion (Image Generation)
Show command
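Per the software inventory later in this guide, DiffusionBee is the image-generation app for this build:

```shell
brew install --cask diffusionbee
```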
Primary Model: Qwen3.5-122B-A10B (MoE)
The flagship local model for this build: 122 billion total parameters, with only about 10 billion active per token thanks to its Mixture-of-Experts architecture, so you get 122B-class capability at roughly 10B-class speed. Supports 256K context natively and outperforms GPT-5 mini on tool-use benchmarks (72.2 vs 55.5 on BFCL-V4).
Show commands
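A sketch using the model tag as this guide writes it (`qwen3.5:122b-a10b`); verify the exact tag in the Ollama library before pulling.

```shell
ollama pull qwen3.5:122b-a10b   # ~77 GB download -- use Ethernet
ollama run qwen3.5:122b-a10b "Summarise this machine's role in one sentence."
ollama ps                        # confirm it is loaded and running on GPU
```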
Specialist Models (swap in as needed)
These cannot run alongside the 122B model — Ollama will unload the 122B to free RAM. Keep them downloaded for on-demand use.
Show commands
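The swap-in models from this guide's model list; these tags exist in the Ollama library (the Qwen 3.5 9B/27B tags should be verified against the library before pulling).

```shell
ollama pull qwen2.5-coder:32b     # code generation
ollama pull deepseek-r1:32b       # reasoning
ollama pull llama3.2-vision:11b   # image understanding
ollama pull nomic-embed-text      # RAG embeddings
ollama pull mxbai-embed-large     # higher-quality embeddings
ollama list
```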
96GB RAM Budget — Interactive Model Explorer
Each Qwen 3.5 model fits the 96 GB of unified memory differently; the sections below break down RAM allocation, context-window boundaries, and architecture.
Context Window Boundary
| Context Length | KV Cache RAM | Free for Agents | Verdict |
|---|---|---|---|
Architecture Diagram
When to Use Which Context Size
| Use Case | Context | Config |
|---|---|---|
Ollama Context Configuration
Create persistent Modelfiles with fixed context sizes for different use cases.
Show recommended Ollama context configuration
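One example of the pattern: a 32K-context variant for coding sessions. `FROM` and `PARAMETER num_ctx` are standard Modelfile syntax; the base tag is the one this guide uses.

```shell
# Build a fixed-context variant of the primary model
cat > /tmp/Modelfile.32k <<'EOF'
FROM qwen3.5:122b-a10b
PARAMETER num_ctx 32768
EOF

if command -v ollama >/dev/null; then
  ollama create qwen-32k -f /tmp/Modelfile.32k
fi
```

Repeat with different `num_ctx` values (e.g. 8192 for chat, 131072 for long-document RAG) to create one named variant per use case.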
Docker Desktop
Required for running Open WebUI, n8n, and other containerised services.
Show command
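Install the cask, then launch it once so the daemon finishes first-run setup:

```shell
brew install --cask docker
open -a Docker       # first launch completes setup and starts the daemon
docker --version
```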
Open WebUI (ChatGPT-like Interface)
A beautiful web interface for chatting with your local LLMs — like ChatGPT, but 100% private.
Show command
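The standard single-container deployment from the Open WebUI docs; `--add-host` lets the container reach Ollama on the host.

```shell
docker run -d --name open-webui -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --restart always \
  ghcr.io/open-webui/open-webui:main
# Then browse to http://localhost:3000
```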
PrivateGPT (Document AI / RAG)
Upload documents and ask questions — entirely local, no cloud.
Show commands
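A sketch following the PrivateGPT project's Ollama profile; extras and profile names change between releases, so check the project README.

```shell
git clone https://github.com/zylon-ai/private-gpt
cd private-gpt
poetry install --extras "ui llms-ollama embeddings-ollama vector-stores-qdrant"
PGPT_PROFILES=ollama poetry run make run   # serves on http://localhost:8001
```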
AnythingLLM (Simple RAG Alternative)
Show command
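Per the software inventory later in this guide:

```shell
brew install --cask anythingllm
```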
Initialise OpenClaw Workspace
Show commands
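OpenClaw comes from this guide's npm globals; the workspace path and `init` subcommand below are assumptions, so check `openclaw --help` for the actual interface.

```shell
npm install -g openclaw
mkdir -p ~/openclaw && cd ~/openclaw
openclaw init   # assumed subcommand -- verify with `openclaw --help`
```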
Configure API Keys
Show commands
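One reasonable pattern, keeping secrets out of shell history and dotfile repos; the file path is a suggestion. Replace the placeholder values with the real keys from the credentials step.

```shell
mkdir -p ~/.config/ai
cat > ~/.config/ai/env <<'EOF'
export ANTHROPIC_API_KEY="sk-ant-REPLACE_ME"
export OPENAI_API_KEY="sk-REPLACE_ME"
export GITHUB_TOKEN="ghp_REPLACE_ME"
export DISCORD_BOT_TOKEN="MTI_REPLACE_ME"
EOF
chmod 600 ~/.config/ai/env
echo 'source ~/.config/ai/env' >> ~/.zshrc
```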
SOUL.md — Agent Identity
Show example
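A sketch of what an identity file might contain, using the BizManager bot named later in this guide; the exact fields OpenClaw expects are not documented here.

```markdown
# SOUL.md — who this agent is

You are BizManager, the operations agent for this Mac Studio.

- Tone: concise and direct, no filler
- Always answer in the user's language
- Never take destructive actions without explicit confirmation
- Escalate anything involving payments or credentials to a human
```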
HEARTBEAT.md — Periodic Tasks
Show example
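An illustrative periodic-task file, referencing the Discord channels this guide's bots use; adapt the cadence to your own workflows.

```markdown
# HEARTBEAT.md — runs on every periodic wake

- Check Discord for unanswered messages older than 15 minutes
- Summarise overnight GitHub activity into #dev-agent
- Verify Ollama is responding; post to #alerts if not
- Once daily at 09:00: post a status digest
```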
Bot Routing Configuration
Show configuration
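A hypothetical routing map pairing each bot with a model from this build; the JSON schema (field names, channel format) is invented for illustration and will not match OpenClaw's actual config format.

```json
{
  "bots": {
    "BizManager": { "model": "qwen3.5:122b-a10b", "channels": ["#biz"] },
    "DevAgent":   { "model": "qwen2.5-coder:32b", "channels": ["#dev-agent"] },
    "AlertBot":   { "model": "qwen3.5:9b",        "channels": ["#alerts"] },
    "ContentBot": { "model": "qwen3.5:122b-a10b", "channels": ["#content"] }
  }
}
```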
Deploy All Agents
Show commands
Create Discord Application
For each bot (BizManager, DevAgent, AlertBot, ContentBot):
- Go to discord.com/developers/applications
- Click New Application → name it
- Bot tab → Add Bot → copy Token
- Enable: Message Content Intent, Server Members Intent, Presence Intent
- OAuth2 → Scopes: bot, applications.commands
- Permissions: Send/Read/Manage Messages, Embed Links, Attach Files, Read History, Add Reactions
- Copy invite URL → open in browser → authorise to server
Discord Server Channel Structure
Claude Code CLI
Show commands
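Install globally and run from a project directory; it reads `ANTHROPIC_API_KEY` from the environment.

```shell
npm install -g @anthropic-ai/claude-code
cd ~/projects/my-app
claude   # starts an interactive coding session
```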
VS Code + Extensions
Show commands
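Install the editor, then the extensions from the command line (Continue is configured in the next step):

```shell
brew install --cask visual-studio-code
code --install-extension Continue.continue
code --install-extension ms-python.python
code --install-extension dbaeumer.vscode-eslint
```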
Continue.dev (Local AI Code Assistant)
Open-source code assistant that uses your local Ollama models.
Show configuration
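A sketch of `~/.continue/config.json` pointing at the local Ollama models from this build. This is Continue's older JSON config format; newer releases use `config.yaml`, so check the Continue docs for your version.

```json
{
  "models": [
    { "title": "Qwen Coder (local)", "provider": "ollama", "model": "qwen2.5-coder:32b" }
  ],
  "tabAutocompleteModel": {
    "title": "Autocomplete", "provider": "ollama", "model": "qwen2.5-coder:32b"
  },
  "embeddingsProvider": { "provider": "ollama", "model": "nomic-embed-text" }
}
```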
Terminal Enhancements
Show commands
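A common combination: iTerm2, Oh My Zsh, and fzf keybindings (all already installed via Homebrew above except Oh My Zsh):

```shell
brew install --cask iterm2

# Oh My Zsh: prompt, plugins, sane zsh defaults
sh -c "$(curl -fsSL https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"

# fzf keybindings (Ctrl-R fuzzy history search)
$(brew --prefix)/opt/fzf/install --key-bindings --completion --no-update-rc
```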
n8n (Workflow Automation)
Visual workflow builder — connects Discord, GitHub, Ollama, email, and more.
Show command
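The standard Docker deployment from the n8n docs, with a named volume so workflows survive container upgrades:

```shell
docker run -d --name n8n --restart always \
  -p 5678:5678 \
  -v n8n_data:/home/node/.n8n \
  docker.n8n.io/n8nio/n8n
# Browse to http://localhost:5678
```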
PM2 Process Manager
Keeps all agents running 24/7 with auto-restart on crash or reboot.
Show ecosystem config
Auto-Start Verification
After a reboot, all services should come back automatically:
Show verification commands
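A quick post-reboot sweep; the `healthcheck` label assumes the launchd job configured in the next step.

```shell
pm2 list                            # all four agents "online"
docker ps                           # open-webui and n8n containers up
brew services list                  # ollama "started"
ollama ps                           # primary model loaded
launchctl list | grep healthcheck   # health-check job registered
```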
Health Check Script (launchd)
Runs every 5 minutes to monitor and auto-restart failed services.
Show script
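A minimal sketch; the install path, log path, and which services to watch are assumptions. It restarts Ollama and the Open WebUI container if their endpoints stop answering.

```shell
#!/bin/bash
# Suggested install path: /usr/local/bin/macai-healthcheck.sh
LOG=/tmp/macai-healthcheck.log

check_service() {   # prints "up" or "down" for an HTTP endpoint
  if curl -sf --max-time 5 "$1" >/dev/null; then echo up; else echo down; fi
}

if [ "$(check_service http://localhost:11434/api/version)" = down ]; then
  echo "$(date) ollama down, restarting" >> "$LOG"
  if command -v brew >/dev/null; then brew services restart ollama >>"$LOG" 2>&1; fi
fi

if [ "$(check_service http://localhost:3000)" = down ]; then
  echo "$(date) open-webui down, restarting" >> "$LOG"
  if command -v docker >/dev/null; then docker restart open-webui >>"$LOG" 2>&1; fi
fi
```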
Show launchd plist
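A standard launchd job definition running the script every 300 seconds; the label and paths are suggestions matching the script above. Save as `~/Library/LaunchAgents/com.macai.healthcheck.plist` and load with `launchctl load` (or `launchctl bootstrap gui/$UID` on newer macOS).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key><string>com.macai.healthcheck</string>
  <key>ProgramArguments</key>
  <array>
    <string>/bin/bash</string>
    <string>/usr/local/bin/macai-healthcheck.sh</string>
  </array>
  <key>StartInterval</key><integer>300</integer>
  <key>StandardErrorPath</key><string>/tmp/macai-healthcheck-err.log</string>
</dict>
</plist>
```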
Monitoring Commands
Show commands
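A monitoring toolkit drawn from the stack installed above; the log path matches the health-check sketch.

```shell
ollama ps                                      # loaded models, RAM, CPU/GPU split
pm2 monit                                      # live agent CPU and memory
docker stats --no-stream                       # container resource usage
memory_pressure                                # macOS memory headroom
sudo powermetrics --samplers gpu_power -n 1    # GPU utilisation snapshot
tail -f /tmp/macai-healthcheck.log             # health-check activity
```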
Regular Maintenance Schedule
| Frequency | Task | Command |
|---|---|---|
| Weekly | Update Homebrew packages | brew update && brew upgrade && brew cleanup |
| Weekly | Update Ollama models | ollama pull qwen3.5:122b-a10b |
| Weekly | Update npm globals | npm update -g |
| Monthly | Clean up Docker | docker system prune -f |
| Monthly | Clear old logs | rm -f /tmp/macai-healthcheck*.log |
| Monthly | macOS updates | softwareupdate --list |
Complete Software List
Everything installed on this machine
CLI Tools (via Homebrew)
| Software | Command | Purpose |
|---|---|---|
| Git | brew install git | Version control |
| Node.js | brew install node | JavaScript runtime |
| Python 3.12 | brew install python@3.12 | Python runtime |
| wget / curl | brew install wget curl | Download tools |
| htop | brew install htop | Process monitor |
| jq | brew install jq | JSON processor |
| tmux | brew install tmux | Terminal multiplexer |
| ripgrep | brew install ripgrep | Fast search |
| fd | brew install fd | Fast find |
| bat | brew install bat | Better cat |
| fzf | brew install fzf | Fuzzy finder |
| gh | brew install gh | GitHub CLI |
| tree | brew install tree | Directory viewer |
| cmake / pkg-config | brew install cmake pkg-config | Build tools |
AI Runtimes & Applications
| Software | Install Method | Purpose |
|---|---|---|
| Ollama | brew install ollama | Local LLM engine |
| LM Studio | brew install --cask lm-studio | LLM GUI + API |
| whisper.cpp | brew install whisper-cpp | Speech-to-text |
| DiffusionBee | brew install --cask diffusionbee | Image generation |
| Open WebUI | Docker container | ChatGPT-like UI |
| PrivateGPT | Git clone + Poetry | Document Q&A (RAG) |
| AnythingLLM | brew install --cask anythingllm | Simple RAG |
| n8n | Docker container | Workflow automation |
LLM Models (via Ollama)
| Model | Size | Purpose |
|---|---|---|
| Qwen3.5-122B-A10B | ~77 GB | Primary model — 122B MoE, 10B active |
| Qwen3.5 9B | ~5 GB | Fast fallback, same family |
| Qwen3.5 27B | ~16 GB | Dense bilingual (swap-in) |
| Qwen 2.5 Coder 32B | ~18 GB | Code generation (swap-in) |
| DeepSeek R1 32B | ~18 GB | Reasoning (swap-in) |
| Llama 3.2 Vision 11B | ~7 GB | Image understanding (swap-in) |
| Nomic Embed Text | ~274 MB | Document embeddings (RAG) |
| MxBai Embed Large | ~670 MB | High-quality embeddings (RAG) |
Development Tools
| Software | Install Method | Purpose |
|---|---|---|
| VS Code | brew install --cask visual-studio-code | Code editor |
| Claude Code | npm install -g @anthropic-ai/claude-code | AI coding CLI |
| OpenClaw | npm install -g openclaw | Agent orchestration |
| PM2 | npm install -g pm2 | Process manager |
| Docker Desktop | brew install --cask docker | Container runtime |
| iTerm2 | brew install --cask iterm2 | Better terminal |
| Rectangle | brew install --cask rectangle | Window management |
| Stats | brew install --cask stats | Menu bar monitor |
Quick Start
One-liner install (after Homebrew)
Copy and paste this to install the entire stack in one batch.
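Assembled from the software tables above (formulas, casks, npm globals), assuming Homebrew is already installed:

```shell
brew install git node python@3.12 wget curl htop jq tmux ripgrep fd bat fzf gh tree cmake pkg-config ollama whisper-cpp && \
brew install --cask lm-studio diffusionbee anythingllm visual-studio-code docker iterm2 rectangle stats && \
npm install -g @anthropic-ai/claude-code openclaw pm2 && \
brew services start ollama
```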
Want us to set this up for you?
Book a setup session and we'll have your Mac Studio running AI agents in under 2 hours.