
Mac Studio M3 Ultra 96GB
Complete Setup Guide

Every command, every tool, every model — the full runbook for transforming a brand-new Mac Studio into a 24/7 AI powerhouse

15 sections · 40+ tools · 10+ LLM models · Setup time: ~2 hours

Hardware

Mac Studio M3 Ultra Specifications

  • Chip: M3 Ultra (Apple Silicon)
  • Memory: 96GB unified
  • Storage: 8TB SSD
  • GPU: 60-core

Software Bundle

Complete AI Software Inventory

Every tool, model, and framework installed on this Mac Studio — the full inventory appears in the Complete Software List at the end of this guide.

Full Runbook

From unboxing to always-on AI station

1 Pre-Setup Checklist Before powering on
Unbox & Inspect

Unbox the Mac Studio and check it for physical damage, then connect it to a display via HDMI or Thunderbolt.

Peripherals & Network

Connect keyboard and mouse (wired recommended for first setup). Connect Ethernet cable — preferred over Wi-Fi for stability during large model downloads.

Credentials Ready

Have these ready before starting:

  • Apple ID credentials
  • Anthropic API key (sk-ant-...)
  • OpenAI API key (sk-...)
  • GitHub Personal Access Token (ghp_...)
  • Discord Bot Token(s) (MTI...)
  • Discord Server ID
2 macOS Initial Setup ~10 min
First Boot Wizard

Power on, select language & region (Hong Kong). Skip Migration Assistant. Sign in with Apple ID. Create admin account with a strong password (16+ chars).

Privacy-first: Disable Analytics, Siri, and Screen Time. Enable Location Services for timezone only. Choose Dark appearance for always-on station.
Post-Wizard Configuration
# Check macOS version
sw_vers

# Update to latest macOS
softwareupdate --list
sudo softwareupdate --install -a

# Set hostname
sudo scutil --set HostName mac-studio-ai
sudo scutil --set LocalHostName mac-studio-ai
sudo scutil --set ComputerName "Mac Studio AI"

# Set timezone
sudo systemsetup -settimezone Asia/Hong_Kong

# Show hidden files and extensions
defaults write NSGlobalDomain AppleShowAllExtensions -bool true
defaults write com.apple.finder AppleShowAllFiles -bool true
defaults write com.apple.desktopservices DSDontWriteNetworkStores -bool true
killall Finder
3 System Security ~10 min
FileVault Disk Encryption

Encrypt the entire disk at rest. Record the recovery key offline in a safe location.

sudo fdesetup enable
# SAVE the recovery key that is displayed!
Firewall & Stealth Mode
# Enable firewall
sudo /usr/libexec/ApplicationFirewall/socketfilterfw --setglobalstate on

# Enable stealth mode (don't respond to pings)
sudo /usr/libexec/ApplicationFirewall/socketfilterfw --setstealthmode on

# Verify
sudo /usr/libexec/ApplicationFirewall/socketfilterfw --getglobalstate
Energy Settings (24/7 Operation)

Prevent the Mac from sleeping — critical for always-on AI workloads.

sudo pmset -a displaysleep 10   # Display sleeps after 10 min
sudo pmset -a sleep 0           # Computer never sleeps
sudo pmset -a disksleep 0       # Disk never sleeps
sudo pmset -a womp 1            # Wake on LAN
sudo pmset -a autorestart 1     # Auto-restart after power failure
sudo pmset -a powernap 1        # Background tasks during nap

# Verify
pmset -g
SSH Remote Access

Enable secure remote management with key-only authentication.

# Enable SSH
sudo systemsetup -setremotelogin on

# Generate SSH key
ssh-keygen -t ed25519 -C "[email protected]"

# Harden SSH — disable password auth
sudo tee -a /etc/ssh/sshd_config.d/hardened.conf << 'EOF'
PasswordAuthentication no
ChallengeResponseAuthentication no
PermitRootLogin no
EOF

sudo launchctl stop com.openssh.sshd
sudo launchctl start com.openssh.sshd
4 Developer Foundations ~15 min
Xcode Command Line Tools
xcode-select --install
Homebrew

The essential package manager for macOS.

# Install Homebrew
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Add to PATH (Apple Silicon)
echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> ~/.zprofile
eval "$(/opt/homebrew/bin/brew shellenv)"

# Verify
brew --version
Core CLI Tools

Install all essential command-line tools in one batch.

brew install \
  git node python@3.12 wget curl \
  htop jq tree tmux \
  ripgrep fd bat fzf gh \
  cmake pkg-config
Git & GitHub Configuration
git config --global user.name "MacAI HK"
git config --global user.email "[email protected]"
git config --global init.defaultBranch main
git config --global pull.rebase true

# Authenticate GitHub CLI
gh auth login
Node.js & Global npm Packages
npm install -g \
  pnpm yarn typescript ts-node \
  nodemon pm2 openclaw
Python Setup
pip3 install --upgrade pip setuptools wheel
pip3 install virtualenv pipx httpx requests python-dotenv pyyaml
pipx ensurepath
5 AI Runtime Stack ~15 min (excl. downloads)
Ollama (Local LLM Engine)

The core runtime that powers all local AI models. Optimised for Apple Silicon.

# Install & start as background service
brew install ollama
brew services start ollama

# Verify
curl http://localhost:11434/api/tags
Ollama Optimisation for M3 Ultra

Tune Ollama for maximum performance on 96GB unified memory.

# Add to ~/.zshrc
export OLLAMA_HOST=0.0.0.0:11434    # Listen on all interfaces
export OLLAMA_NUM_PARALLEL=4        # 4 parallel requests
export OLLAMA_MAX_LOADED_MODELS=3   # Keep 3 models in RAM
export OLLAMA_KEEP_ALIVE=24h        # Models stay loaded 24h
export OLLAMA_FLASH_ATTENTION=1     # Flash attention for speed

# Apply to the current shell
source ~/.zshrc

# Note: the brew-managed launchd service does not read ~/.zshrc.
# Mirror each setting with `launchctl setenv` so the service sees it, e.g.:
launchctl setenv OLLAMA_NUM_PARALLEL 4
launchctl setenv OLLAMA_FLASH_ATTENTION 1
brew services restart ollama
LM Studio (GUI)

Visual interface for running and testing LLMs. Provides an OpenAI-compatible API at localhost:1234.

brew install --cask lm-studio

# After launch: Settings → Server → Start server on launch
# OpenAI-compatible API at http://localhost:1234/v1
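Once the server is enabled, any OpenAI-style client can talk to it. A minimal smoke test, assuming the default port 1234; the guard keeps it harmless when the server is not running:

```shell
# List available models via LM Studio's OpenAI-compatible API.
# Falls back to a notice when the server is not reachable.
if curl -sf http://localhost:1234/v1/models > /dev/null 2>&1; then
  curl -s http://localhost:1234/v1/models
else
  echo "LM Studio server not reachable on localhost:1234"
fi
```

The same base URL works anywhere an "OpenAI-compatible" endpoint is accepted, e.g. as `apiBase` in client configs.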
Whisper (Speech-to-Text)
# Pre-built via Homebrew
brew install whisper-cpp

# Or build from source with CoreML (Neural Engine acceleration)
cd ~/Developer
git clone https://github.com/ggerganov/whisper.cpp.git
cd whisper.cpp
make clean
WHISPER_COREML=1 make -j

# Download models
bash ./models/download-ggml-model.sh base.en
bash ./models/download-ggml-model.sh large-v3
Stable Diffusion (Image Generation)
# GUI option (easiest)
brew install --cask diffusionbee
# Or Draw Things from the Mac App Store (free, Apple Silicon optimised)

# CLI option with Metal GPU acceleration
cd ~/Developer
git clone https://github.com/leejet/stable-diffusion.cpp.git
cd stable-diffusion.cpp
mkdir build && cd build
cmake .. -DGGML_METAL=ON
cmake --build . --config Release -j
6 LLM Models ~2–4 hours (download)
Primary Model: Qwen3.5-122B-A10B (MoE)

The flagship local model for this build. 122 billion total parameters, but only activates 10 billion per request thanks to Mixture-of-Experts architecture — meaning you get 122B-class intelligence at 10B-class speed. Supports 256K context natively. Outperforms GPT-5 mini on tool-use benchmarks (72.2 vs 55.5 on BFCL-V4).

# Primary — Qwen3.5 122B MoE (Q4_K_M, ~77GB) — our main model
ollama pull qwen3.5:122b-a10b

# Fast fallback for simple tasks when the 122B is busy (~5GB, same family)
ollama pull qwen3.5:9b

# Embedding model — required for RAG / document search (~274MB)
ollama pull nomic-embed-text
Specialist Models (swap in as needed)

These cannot run alongside the 122B model — Ollama will unload the 122B to free RAM. Keep them downloaded for on-demand use.

# Coding-focused (~18GB — excellent for code)
ollama pull qwen2.5-coder:32b

# Chinese + English bilingual (~16GB — dense, fast)
ollama pull qwen3.5:27b

# Reasoning (~18GB — strong reasoning)
ollama pull deepseek-r1:32b

# Vision / multimodal (~7GB — image understanding)
ollama pull llama3.2-vision:11b

# Additional embedding model (~670MB)
ollama pull mxbai-embed-large

# Verify all
ollama list
96GB RAM Budget

How the primary Qwen 3.5 model fits in 96GB of unified memory:


  • Architecture: MoE
  • Total params: 122B
  • Active per token: 10B
  • VRAM (Q4_K_M): 77 GB
  • Expert config: 60 experts / 4 active
  • Fits 96GB? Tight fit
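The "tight fit" verdict follows from simple arithmetic. A back-of-envelope sketch, taking the document's ~77GB weights and ~1.6GB-at-32K KV cache figures as given; the OS reserve is an assumption, not a measured number:

```shell
# Rough RAM budget for the 122B model on a 96GB machine (figures in GB).
TOTAL=96
WEIGHTS=77      # Q4_K_M quantised weights (from the model card above)
KV_CACHE=2      # ~32K context with q8_0 KV cache, rounded up
OS_RESERVE=8    # macOS + background services (assumption)
FREE=$((TOTAL - WEIGHTS - KV_CACHE - OS_RESERVE))
echo "Left for agents and apps: ${FREE}GB"   # → Left for agents and apps: 9GB
```

A single-digit-GB margin is why the specialist models above must swap in rather than run alongside the 122B.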

For 128K+ content: Don't increase context — use RAG (Retrieval-Augmented Generation) via PrivateGPT or AnythingLLM instead. RAG chunks documents into embeddings stored on disk, then retrieves only relevant snippets into a small context window. This keeps RAM stable while handling unlimited document sizes.
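The chunking step described above can be sketched in the shell. This only illustrates the mechanics on a hypothetical 1,000-word document; in practice PrivateGPT or AnythingLLM does the chunking, embedding, and retrieval for you:

```shell
# Create a stand-in 1,000-word document, then split it into 200-word chunks.
# Each chunk would then be embedded (e.g. with nomic-embed-text) and indexed,
# so only the most relevant chunks enter the model's context at query time.
mkdir -p /tmp/rag-demo
printf 'word %.0s' $(seq 1 1000) > /tmp/rag-demo/doc.txt
tr ' ' '\n' < /tmp/rag-demo/doc.txt | split -l 200 - /tmp/rag-demo/chunk-
ls /tmp/rag-demo/chunk-* | wc -l   # 5 chunks of 200 words each
```

However long the source document grows, only a handful of retrieved chunks occupy the context window, which is what keeps RAM usage flat.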
Ollama Context Configuration

Create persistent Modelfiles with fixed context sizes for different use cases.

# Inside an interactive session you can set context per run:
#   ollama run qwen3.5:122b-a10b
#   >>> /set parameter num_ctx 16384

# Create a persistent Modelfile with a fixed 16K context
# (balanced for agents + model)
cat > ~/Modelfile-qwen122b << 'EOF'
FROM qwen3.5:122b-a10b
PARAMETER num_ctx 16384
PARAMETER num_gpu 999
EOF
ollama create qwen122b-16k -f ~/Modelfile-qwen122b

# For long-document mode (stop agents first!)
cat > ~/Modelfile-qwen122b-long << 'EOF'
FROM qwen3.5:122b-a10b
PARAMETER num_ctx 65536
PARAMETER num_gpu 999
EOF
ollama create qwen122b-64k -f ~/Modelfile-qwen122b-long

# Enable KV cache quantisation to squeeze in more context
# (halves KV cache RAM — 32K context uses only ~1.6GB).
# This is a server-wide environment setting, not a Modelfile
# parameter, and it requires flash attention:
launchctl setenv OLLAMA_FLASH_ATTENTION 1
launchctl setenv OLLAMA_KV_CACHE_TYPE q8_0
brew services restart ollama

# 32K variant to pair with the quantised cache
cat > ~/Modelfile-qwen122b-32k << 'EOF'
FROM qwen3.5:122b-a10b
PARAMETER num_ctx 32768
PARAMETER num_gpu 999
EOF
ollama create qwen122b-32k-q8kv -f ~/Modelfile-qwen122b-32k
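The "~1.6GB at 32K" figure also lets you project cache sizes for other context lengths. A quick sanity check that takes the document's number as its only input (the per-token cost and 64K projection are derived, not measured):

```shell
# Derive per-token KV cache cost from the ~1.6GB @ 32K (q8_0) figure,
# then project the cache size for a 64K context at the same quantisation.
awk 'BEGIN {
  per_token_kb = 1.6 * 1024 * 1024 / 32768          # KB per token of context
  cache_64k_gb = per_token_kb * 65536 / 1024 / 1024  # scale linearly to 64K
  printf "per-token: %.0f KB, 64K cache: %.1f GB\n", per_token_kb, cache_64k_gb
}'
```

KV cache grows linearly with context length, so doubling the window doubles the cache; that linearity is why the budget tightens quickly past 64K.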
7 AI Applications & GUIs ~15 min
Docker Desktop

Required for running Open WebUI, n8n, and other containerised services.

brew install --cask docker
# Launch Docker Desktop, then enable "Start on login" in settings
Open WebUI (ChatGPT-like Interface)

A beautiful web interface for chatting with your local LLMs — like ChatGPT, but 100% private.

docker run -d \
  --name open-webui \
  -p 3000:8080 \
  -v open-webui:/app/backend/data \
  --add-host=host.docker.internal:host-gateway \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  --restart always \
  ghcr.io/open-webui/open-webui:main

# Access at http://localhost:3000
# Create admin account on first visit
PrivateGPT (Document AI / RAG)

Upload documents and ask questions — entirely local, no cloud.

cd ~/Developer
git clone https://github.com/zylon-ai/private-gpt.git
cd private-gpt

# Install with Ollama support (requires Poetry, e.g. `pipx install poetry`)
poetry install --extras "ui llms-ollama embeddings-ollama vector-stores-qdrant"

# Start
PGPT_PROFILES=local make run
# Access at http://localhost:8001
AnythingLLM (Simple RAG Alternative)
brew install --cask anythingllm

# After launch:
# 1. Set LLM provider → Ollama (localhost:11434)
# 2. Set Embedding → Ollama (nomic-embed-text)
# 3. Create workspace → upload documents
8 Agent Framework (OpenClaw) ~20 min
Initialise OpenClaw Workspace
openclaw init --workspace ~/Documents/GitHub/MacAI

# Creates:
# ├── SOUL.md               — Agent identity
# ├── HEARTBEAT.md          — Periodic tasks
# ├── .openclaw/
# │   ├── config.yml        — Main configuration
# │   ├── routing.yml       — Bot-to-bot communication
# │   └── secrets.enc       — Encrypted API keys
# └── agents/
#     ├── biz-manager/SOUL.md
#     ├── dev-agent/SOUL.md
#     └── alert-bot/SOUL.md
Configure API Keys
openclaw config set ANTHROPIC_API_KEY "sk-ant-api03-..."
openclaw config set OPENAI_API_KEY "sk-..."
openclaw config set GITHUB_TOKEN "ghp_..."
openclaw config set DISCORD_BOT_TOKEN "MTI..."
openclaw config set DISCORD_SERVER_ID "123456789012345678"

# Verify
openclaw config list
SOUL.md — Agent Identity
# SOUL.md — MacAI Assistant

Name: MacAI Assistant
Role: Business operations AI for Hong Kong SME
Language: Bilingual (English / 繁體中文)
Tone: Professional, concise, helpful

Boundaries:
- NEVER share confidential client data
- NEVER execute financial transactions without human approval
- ALWAYS escalate legal questions to human
- ALWAYS log all actions for audit trail
HEARTBEAT.md — Periodic Tasks
# HEARTBEAT.md — Automated Check-ins

interval: 15m

tasks:
  - Check Discord #alerts for new messages
  - Review pending GitHub PRs
  - Monitor system health (RAM, CPU, disk)

daily (09:00 HKT):
  - Generate morning briefing → #daily-briefing
  - Summarise overnight Discord activity

weekly (Monday 09:00):
  - Generate weekly summary report
  - Check for model updates (ollama)
Bot Routing Configuration
# .openclaw/routing.yml
peers:
  BizManager:
    can_send_to: [DevAgent, AlertBot, ContentBot]
    channels: [ai-assistant, daily-briefing]
    tools: [discord, github-issues]
  DevAgent:
    can_send_to: [BizManager, AlertBot]
    channels: [macai-dev]
    tools: [github, claude-code, terminal]
  AlertBot:
    can_send_to: [BizManager]
    channels: [alerts]
  ContentBot:
    can_send_to: [BizManager]
    channels: [content-drafts]
Deploy All Agents
openclaw agent start --name "BizManager" --soul agents/biz-manager/SOUL.md
openclaw agent start --name "DevAgent" --soul agents/dev-agent/SOUL.md
openclaw agent start --name "AlertBot" --soul agents/alert-bot/SOUL.md
openclaw agent start --name "ContentBot" --soul agents/content-bot/SOUL.md

# Check running agents
openclaw agent list
9 Discord Bot Setup ~15 min
Create Discord Application

For each bot (BizManager, DevAgent, AlertBot, ContentBot):

  • Go to discord.com/developers/applications
  • Click New Application → name it
  • Bot tab → Add Bot → copy Token
  • Enable: Message Content Intent, Server Members Intent, Presence Intent
  • OAuth2 → Scopes: bot, applications.commands
  • Permissions: Send/Read/Manage Messages, Embed Links, Attach Files, Read History, Add Reactions
  • Copy invite URL → open in browser → authorise to server
Discord Server Channel Structure
📋 OPERATIONS
├── #daily-briefing — Morning reports (BizManager)
├── #ai-assistant — General client queries
├── #task-board — Active task tracking
└── #alerts — System alerts (AlertBot)

💻 DEVELOPMENT
├── #macai-dev — Dev tasks (DevAgent)
├── #code-review — Automated PR reviews
└── #deployments — Deployment notifications

📝 CONTENT
├── #content-drafts — Draft content (ContentBot)
└── #content-approved — Approved content

⚙️ ADMIN
├── #bot-logs — Bot activity logs
└── #system-health — RAM/CPU/disk monitoring
10 Coding & Development Tools ~10 min
Claude Code CLI
npm install -g @anthropic-ai/claude-code

# Enable team agent mode
echo 'export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1' >> ~/.zshrc
source ~/.zshrc

# Verify
claude --version
VS Code + Extensions
brew install --cask visual-studio-code

# Essential extensions
code --install-extension GitHub.copilot
code --install-extension GitHub.copilot-chat
code --install-extension ms-python.python
code --install-extension dbaeumer.vscode-eslint
code --install-extension esbenp.prettier-vscode
code --install-extension eamodio.gitlens
code --install-extension continue.continue
Continue.dev (Local AI Code Assistant)

Open-source code assistant that uses your local Ollama models.

// ~/.continue/config.json
{
  "models": [
    {
      "title": "Qwen Coder 32B",
      "provider": "ollama",
      "model": "qwen2.5-coder:32b",
      "apiBase": "http://localhost:11434"
    },
    {
      "title": "Qwen3.5 122B MoE",
      "provider": "ollama",
      "model": "qwen3.5:122b-a10b",
      "apiBase": "http://localhost:11434"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Qwen Coder 7B",
    "provider": "ollama",
    "model": "qwen2.5-coder:7b"
  }
}
Terminal Enhancements
brew install --cask iterm2

# Oh My Zsh
sh -c "$(curl -fsSL https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"

# Powerlevel10k theme
git clone --depth=1 https://github.com/romkatv/powerlevel10k.git \
  ${ZSH_CUSTOM:-$HOME/.oh-my-zsh/custom}/themes/powerlevel10k

# Plugins
git clone https://github.com/zsh-users/zsh-autosuggestions \
  ${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/zsh-autosuggestions
git clone https://github.com/zsh-users/zsh-syntax-highlighting.git \
  ${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/zsh-syntax-highlighting
11 Automation & Workflows ~10 min
n8n (Workflow Automation)

Visual workflow builder — connects Discord, GitHub, Ollama, email, and more.

docker run -d \
  --name n8n \
  -p 5678:5678 \
  -v n8n_data:/home/node/.n8n \
  --restart always \
  n8nio/n8n

# Access at http://localhost:5678
PM2 Process Manager

Keeps all agents running 24/7 with auto-restart on crash or reboot.

// ecosystem.config.js
module.exports = {
  apps: [
    {
      name: 'openclaw-biz',
      script: 'openclaw',
      args: 'agent start --name BizManager --soul agents/biz-manager/SOUL.md',
      restart_delay: 5000,
      max_restarts: 10,
      autorestart: true
    },
    {
      name: 'openclaw-dev',
      script: 'openclaw',
      args: 'agent start --name DevAgent --soul agents/dev-agent/SOUL.md',
      restart_delay: 5000,
      autorestart: true
    },
    {
      name: 'openclaw-alert',
      script: 'openclaw',
      args: 'agent start --name AlertBot --soul agents/alert-bot/SOUL.md',
      restart_delay: 5000,
      autorestart: true
    }
  ]
};

Then, in the shell — start all, save, and enable boot startup:

pm2 start ecosystem.config.js
pm2 save
pm2 startup
12 Always-On Configuration ~10 min
Auto-Start Verification

After a reboot, all services should come back automatically:

# Verify after reboot
brew services list    # Ollama → started
docker ps             # open-webui, n8n → running
pm2 list              # All agents → online
ollama list           # Should respond
curl localhost:3000   # Open WebUI → 200 OK
curl localhost:5678   # n8n → 200 OK
Health Check Script (launchd)

Runs every 5 minutes to monitor and auto-restart failed services.

#!/bin/bash
# healthcheck.sh — runs every 5 min via launchd

TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')

# Check Ollama
if curl -s http://localhost:11434/api/tags > /dev/null 2>&1; then
  echo "[$TIMESTAMP] ✓ Ollama: Running"
else
  echo "[$TIMESTAMP] ✗ Ollama: DOWN — restarting"
  brew services restart ollama
fi

# Check Docker containers
if curl -s http://localhost:3000 > /dev/null 2>&1; then
  echo "[$TIMESTAMP] ✓ Open WebUI: Running"
else
  docker restart open-webui
fi

# Check PM2
if pm2 jlist > /dev/null 2>&1; then
  echo "[$TIMESTAMP] ✓ PM2: Running"
else
  pm2 resurrect
fi
~/Library/LaunchAgents/com.macai.healthcheck.plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.macai.healthcheck</string>
    <key>ProgramArguments</key>
    <array>
        <string>/bin/bash</string>
        <string>/Users/admin/scripts/healthcheck.sh</string>
    </array>
    <key>StartInterval</key>
    <integer>300</integer>
    <key>RunAtLoad</key>
    <true/>
</dict>
</plist>

Load it with:

launchctl load ~/Library/LaunchAgents/com.macai.healthcheck.plist
13 Monitoring & Maintenance Ongoing
Monitoring Commands
htop                        # Real-time process monitor
memory_pressure             # RAM pressure level
df -h                       # Disk space
du -sh ~/.ollama/models/*   # Model sizes
ollama list                 # Installed models
ollama ps                   # Currently loaded models
docker stats                # Container resources
pm2 monit                   # Agent dashboard
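These can be combined into one-liners; jq (installed earlier) is handy for pulling fields out of the Ollama API. Shown here against a canned response so it works offline — swap the `echo` for `curl -s localhost:11434/api/tags` on the live machine:

```shell
# Extract installed model names from an /api/tags-style JSON response.
echo '{"models":[{"name":"qwen2.5-coder:32b"},{"name":"nomic-embed-text"}]}' \
  | jq -r '.models[].name'
# → qwen2.5-coder:32b
#   nomic-embed-text
```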
Regular Maintenance Schedule
Frequency  Task                      Command
Weekly     Update Homebrew packages  brew update && brew upgrade && brew cleanup
Weekly     Update Ollama models      ollama pull qwen3.5:122b-a10b
Weekly     Update npm globals        npm update -g
Monthly    Clean up Docker           docker system prune -f
Monthly    Clear old logs            rm -f /tmp/macai-healthcheck*.log
Monthly    macOS updates             softwareupdate --list

Complete Software List

Everything installed on this machine

CLI Tools (via Homebrew)

Software            Command                        Purpose
Git                 brew install git               Version control
Node.js             brew install node              JavaScript runtime
Python 3.12         brew install python@3.12       Python runtime
wget / curl         brew install wget curl         Download tools
htop                brew install htop              Process monitor
jq                  brew install jq                JSON processor
tmux                brew install tmux              Terminal multiplexer
ripgrep             brew install ripgrep           Fast search
fd                  brew install fd                Fast find
bat                 brew install bat               Better cat
fzf                 brew install fzf               Fuzzy finder
gh                  brew install gh                GitHub CLI
tree                brew install tree              Directory viewer
cmake / pkg-config  brew install cmake pkg-config  Build tools

AI Runtimes & Applications

Software      Install Method                    Purpose
Ollama        brew install ollama               Local LLM engine
LM Studio     brew install --cask lm-studio     LLM GUI + API
whisper.cpp   brew install whisper-cpp          Speech-to-text
DiffusionBee  brew install --cask diffusionbee  Image generation
Open WebUI    Docker container                  ChatGPT-like UI
PrivateGPT    Git clone + Poetry                Document Q&A (RAG)
AnythingLLM   brew install --cask anythingllm   Simple RAG
n8n           Docker container                  Workflow automation

LLM Models (via Ollama)

Model                 Size     Purpose
Qwen3.5-122B-A10B     ~77 GB   Primary model — 122B MoE, 10B active
Qwen3.5 9B            ~5 GB    Fast fallback, same family
Qwen3.5 27B           ~16 GB   Dense bilingual (swap-in)
Qwen 2.5 Coder 32B    ~18 GB   Code generation (swap-in)
DeepSeek R1 32B       ~18 GB   Reasoning (swap-in)
Llama 3.2 Vision 11B  ~7 GB    Image understanding (swap-in)
Nomic Embed Text      ~274 MB  Document embeddings (RAG)
MxBai Embed Large     ~670 MB  High-quality embeddings (RAG)

Development Tools

Software        Install Method                            Purpose
VS Code         brew install --cask visual-studio-code    Code editor
Claude Code     npm install -g @anthropic-ai/claude-code  AI coding CLI
OpenClaw        npm install -g openclaw                   Agent orchestration
PM2             npm install -g pm2                        Process manager
Docker Desktop  brew install --cask docker                Container runtime
iTerm2          brew install --cask iterm2                Better terminal
Rectangle       brew install --cask rectangle             Window management
Stats           brew install --cask stats                 Menu bar monitor

Quick Start

One-liner install (after Homebrew)

Copy and paste this to install the entire stack in one batch.

# CLI tools
brew install git node python@3.12 wget htop jq tree tmux ripgrep fd bat fzf gh cmake pkg-config

# AI stack + apps
brew install ollama
brew install --cask docker lm-studio diffusionbee anythingllm visual-studio-code iterm2 rectangle stats

# Dev tools
npm install -g pnpm typescript ts-node nodemon pm2 openclaw @anthropic-ai/claude-code n8n

# Start Ollama
brew services start ollama

# Download primary model (Qwen3.5 122B MoE — ~77GB, takes ~1–2 hours)
ollama pull qwen3.5:122b-a10b && ollama pull qwen3.5:9b && ollama pull nomic-embed-text

Want us to set this up for you?

Book a setup session and we'll have your Mac Studio running AI agents in under 2 hours.