Personal AI agent for Mac

Scout

Frontier Agent On Your MacBook

FREE LOCAL-FIRST AGENTS

Utilize Your Hardware for Free AI

Scout is a local-first agent that can run air-gapped on your MacBook: chats, memories, and personal data stay on your machine instead of in someone else's cloud. It gives you a chat workspace for research, writing, filesystem automation, and more, with dedicated apps for specific use cases.

Scout runs on the same engine as the Courier API Platform — tool calling, flex models, analytics, OpenAI-compatible endpoints — but ships with a polished chat interface, OS workspace, and a personal-use license. The full API surface is there if you need it.

What Scout Does

An Agent That Lives On Your Mac

Chat, research, automate files, test APIs, and extend with MCP apps — all from a macOS-style workspace powered by Gemma 4 running locally on your Mac, or by Courier Cloud for the heavy models when you don't have the memory.

Agentic Chat Interface

Streaming conversations with tool calls, a model picker, and modes for research, writing, and deep workflows.

Pathfinder Semantic Search

Indexes your home folder with embeddings and descriptions so Scout can find documents, code, and notes instantly.
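
As a rough illustration of how embedding-based search ranks files, the sketch below scores a query vector against a few indexed entries using cosine similarity. The paths and three-dimensional vectors are made up for the example; a real index would store vectors produced by an embedding model, and nothing here describes Pathfinder's actual implementation.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def search(query_vec, index, top_k=2):
    """Return the top_k (path, score) pairs, highest similarity first."""
    scored = [(path, cosine_similarity(query_vec, vec)) for path, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

# Hypothetical toy index: path -> embedding (real embeddings have far more dimensions).
TOY_INDEX = {
    "~/notes/tax-2024.md": [0.9, 0.1, 0.0],
    "~/code/server.py": [0.0, 0.2, 0.9],
    "~/docs/receipts.pdf": [0.8, 0.3, 0.1],
}

if __name__ == "__main__":
    # The query vector stands in for an embedded search phrase.
    for path, score in search([1.0, 0.2, 0.0], TOY_INDEX):
        print(f"{score:.3f}  {path}")
```

Ranking by cosine similarity compares direction rather than magnitude, which is why it is a common default for comparing text embeddings.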

Shadow Shell

Shadow Shell is our novel sandboxing technology that protects your filesystem from agent mistakes. Everything the OS agent changes is staged in a Shadow Shell for your approval.

OS Agent

Scout can operate on your OS as a powerful agent, navigating your filesystem and reading and writing files on your desktop, all safely thanks to Shadow Shell.

MCPost API Testing

Import Postman or Insomnia collections and let Scout run, modify, and chain API requests agent-first.

MCP App Ecosystem

Connect MCP servers as first-class Scout apps alongside Assistant, OS, and Settings modes.

Settings Agent

A dedicated agent that can help you configure anything within Courier OS — models, integrations, preferences, and workspace setup.

Recommended Scout Stack

Gemma 4 26B A4B — main Scout agent for chat, tools, and workflows
Gemma 4 E2B — lightweight sub-agent for routing and fast tasks
Qwen3 Embedding 0.6B — Pathfinder semantic search index

Local or Cloud, One Click

Scout always runs locally on your Mac, but the models it calls can run either way. Use Gemma locally when your Mac has the memory, or route inference through Courier Cloud's free tier — same agent experience either way.

Scout modes include Assistant for everyday chat, OS for filesystem automation, Settings for configuration help, and MCP apps you connect yourself.

Find the Right Mac for Your Use Case

Start from a preset or build your own flex stack, then see which Apple Silicon Mac fits your models.

Model Count

Select multiple models for different tasks (e.g., coding, vision, and general chat). As your user base grows, relying on a single model for everything increases latency and degrades the user experience.

1-3 Models: Focused Setup
4-10 Models: Versatile Setup
10+ Models: Full Ecosystem

Throughput - Quantization & VRAM

Performance depends on model quantization and available VRAM (video memory; on Apple Silicon this is the unified memory shared with the CPU). Reasoning quality diminishes as quantization precision drops, which can lead to hallucinations and other unintended side effects.

4-bit: Maximum speed, lower VRAM
8-bit: Balanced speed and logic
16-bit: Maximum reasoning capability
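
For a rough sense of how quantization changes memory needs, weight memory is approximately the parameter count times the bits per weight, divided by eight to get bytes. The sketch below applies that arithmetic. It ignores the KV cache, activations, and runtime overhead, so real requirements are higher, and the 8-billion-parameter model is only an example size.

```python
def weight_memory_gb(params_billions, bits_per_weight):
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

if __name__ == "__main__":
    # Example only: an 8-billion-parameter model at the three precisions listed above.
    for bits in (4, 8, 16):
        print(f"8B parameters at {bits}-bit: ~{weight_memory_gb(8, bits):.0f} GB")
```

Halving the bit width halves the weight footprint, which is the trade the list above describes: less memory and more speed in exchange for some reasoning quality.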

Model Size - Parameters

Parameters are the internal variables the AI learns during training. More parameters mean a larger model in memory: at the same quantization, a 30GB model has more parameters, and so more "knowledge," than an 8GB model.

Lite (1GB - 14GB): Fast, Efficient
Balanced (15GB - 50GB): Versatile, Strong
Frontier (70GB+): Advanced Reasoning

Context Window - Memory

The context window is the amount of text (tokens) the AI can "remember" during a conversation or process in a single request.

32k tokens: ~50 pages of text
128k tokens: Full book length
1M+ tokens: Entire codebases
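
The page estimates above can be approximated with simple arithmetic. The sketch below assumes about 0.75 English words per token and about 500 words per page. Both figures are rules of thumb chosen for illustration, not values from Courier, and actual ratios vary by tokenizer, language, and layout.

```python
WORDS_PER_TOKEN = 0.75  # assumption: common rule of thumb for English text
WORDS_PER_PAGE = 500    # assumption: a dense, single-spaced page

def tokens_to_pages(tokens):
    """Estimate how many pages of text fit in a given number of tokens."""
    return tokens * WORDS_PER_TOKEN / WORDS_PER_PAGE

if __name__ == "__main__":
    for tokens in (32_000, 128_000, 1_000_000):
        print(f"{tokens:>9,} tokens is roughly {tokens_to_pages(tokens):,.0f} pages")
```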

Dynamic Memory Management

Courier offers two model-serving options to maximize memory efficiency: Flex and Static.

Flex Models

Flex models load into memory upon request and unload after 5 minutes of inactivity.

• Enables running multiple large models on limited hardware

• Dynamic memory allocation

• Only the largest flex model counts towards VRAM requirements
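
To make the load-on-request, unload-when-idle behavior concrete, here is a minimal sketch of an idle-timeout pool. The class name, method names, and model name are hypothetical, and the pool only tracks timestamps instead of loading real weights; it illustrates the idea and is not Courier's implementation.

```python
import time

class FlexModelPool:
    """Toy registry that loads a model on request and unloads it after an idle timeout."""

    def __init__(self, idle_timeout_s=300, clock=time.monotonic):
        self.idle_timeout_s = idle_timeout_s  # 300 s = the 5 minutes described above
        self._clock = clock
        self._last_used = {}  # model name -> time of most recent request

    def request(self, name):
        """Evict idle models, then load this one if needed and record its use."""
        self.evict_idle()
        was_loaded = name in self._last_used
        self._last_used[name] = self._clock()
        return "already loaded" if was_loaded else "loaded on demand"

    def evict_idle(self):
        """Unload every model idle for at least the timeout; return their names."""
        now = self._clock()
        idle = [n for n, t in self._last_used.items() if now - t >= self.idle_timeout_s]
        for name in idle:
            del self._last_used[name]
        return idle

    def loaded(self):
        """Names of models currently held in memory."""
        return sorted(self._last_used)

if __name__ == "__main__":
    fake_now = [0.0]  # simulated clock so the example runs instantly
    pool = FlexModelPool(clock=lambda: fake_now[0])
    print(pool.request("chat-model"))  # first request loads the model
    fake_now[0] = 120.0
    print(pool.request("chat-model"))  # reused within the timeout
    fake_now[0] = 420.0
    print(pool.evict_idle())           # idle for 5 minutes, so it is unloaded
```

Injecting the clock keeps the timeout logic testable without waiting five real minutes.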

Static Models

Static models stay loaded in memory at all times, providing instant response.

• Instant availability, no load time

• Continuous memory occupancy

• Each static model adds directly to total VRAM requirements
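
The two accounting rules above combine into one estimate: add up every static model, then add only the largest flex model. The sketch below applies that rule. The function name and the model sizes are hypothetical examples, and the result covers model memory only.

```python
def total_vram_gb(static_models, flex_models):
    """Sum of all static model sizes plus the single largest flex model size (GB)."""
    static_total = sum(static_models.values())
    largest_flex = max(flex_models.values(), default=0)
    return static_total + largest_flex

if __name__ == "__main__":
    # Hypothetical sizes in GB, for illustration only.
    static = {"embedder": 1}
    flex = {"chat": 16, "router": 3}
    print(f"Estimated VRAM: {total_vram_gb(static, flex)} GB")
```

With these example sizes the estimate is 17 GB instead of the 20 GB needed if all three models were static, which is the saving flex serving is meant to provide.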

Feeling overwhelmed or unsure what to choose?

Let us help you figure it out.

Start With a Use Case

Pre-configured flex stacks — memory is calculated from the largest flex model loaded at once.

Scout

Chat agent, lightweight sub-agent, and embeddings for Pathfinder

All models use flex APIs

Coding Agent

Planner + implementer stack for agentic coding workflows

All models use flex APIs

Production Server

General-purpose production API with Gemma 4

All models use flex APIs

What do you need AI for?

Select the primary functions for your self-hosted AI setup

Agent

Tool-calling agents, chat, and multi-modal workflows

Image Generation

Generate images from text prompts

Embeddings/RAG

Semantic search, retrieval, and memory

Select Your Models

Choose the AI models to include in your platform (filtered by your use cases).

No models selected. Add models to your platform to continue.

Hardware Recommendation

Infrastructure Requirements

Based on your model selection

Total VRAM Required: 7 GB

Recommended Hardware: Mac Mini (16GB)

Need multi-device clustering or a custom setup? Book a free consultation

Ready to ask Scout?

Deploy Gemma, open Courier OS, and start chatting with an agent that understands your files, tools, and APIs.
