Local AI

Aurora fully supports local AI workflows using both Nvidia and AMD GPUs. Acceleration packages for both vendors are included by default and do not require manual intervention to get working properly.

Use the robots to your advantage!

AI Tools

The following AI-focused command-line tools are available via Homebrew. Install them individually, or run ujust bbrew and choose the ai menu option to install them all:

Name              Description
aichat            All-in-one AI-Powered CLI Chat & Copilot
block-goose-cli   CLI for Goose, Block's open-source AI agent
claude-code       Claude coding agent with desktop integration
codex             OpenAI's coding agent that runs in your terminal
copilot-cli       GitHub Copilot CLI for terminal assistance
crush             AI coding agent for the terminal, from charm.sh
gemini-cli        Command-line interface for Google's Gemini API
kimi-cli          CLI for Moonshot AI's Kimi models
llm               Access large language models from the command line
lm-studio         Desktop app for running local LLMs
mistral-vibe      CLI for Mistral AI models
mods              AI on the command-line, from charm.sh
opencode          AI coding agent for the terminal
qwen-code         CLI for Qwen3-Coder models
ramalama          Manage and run AI models locally with containers
whisper-cpp       High-performance inference of OpenAI's Whisper model
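
For example, to install just one of these tools with Homebrew (any name from the table works the same way):

brew install aichat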

Ramalama

Install Ramalama via brew install ramalama to manage local models; this is the preferred default experience. It's for people who work with local models frequently and need advanced features. It can pull models from Hugging Face, Ollama, and any container registry. By default it pulls from ollama.com; check the Ramalama documentation for more information.
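
Because models can come from several sources, the registry can be given explicitly with a transport prefix. A minimal sketch, assuming the huggingface:// and oci:// prefixes described in the Ramalama documentation (the model paths below are placeholders):

ramalama pull huggingface://<org>/<repo>/<model-file>.gguf
ramalama pull oci://quay.io/<namespace>/<model-image>:latest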

Ramalama's command-line experience is similar to Podman's. Aurora sets rl as an alias for ramalama, for brevity. Examples include:

rl pull llama3.2:latest
rl run llama3.2
rl run deepseek-r1

You can also serve the models locally:

rl serve deepseek-r1

Then go to http://127.0.0.1:8080 in your browser.

Ramalama automatically pulls in everything your host needs to run the workload. The images are stored in the same container storage as your other containers, which allows centralized management of models alongside your other Podman images:

❯ podman images
REPOSITORY             TAG     IMAGE ID      CREATED     SIZE
quay.io/ramalama/rocm  latest  8875feffdb87  5 days ago  6.92 GB

Integrating with Existing Tools

ramalama serve exposes an OpenAI-compatible endpoint at http://0.0.0.0:8080; you can use this to configure tools that do not support Ramalama directly:

Newelle
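
Any OpenAI-compatible client can also talk to the endpoint directly. A minimal sketch with curl, assuming the standard /v1/chat/completions route and that the model name matches whatever you served:

curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "deepseek-r1", "messages": [{"role": "user", "content": "Hello"}]}'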

Alpaca

To get a more graphical experience managing and interacting with your models, you can use the graphical client Alpaca. It runs an embedded Ollama engine and features a beautiful graphical interface to answer your most burning questions.
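
If Alpaca is not already installed, it is available on Flathub; assuming the application ID com.jeffser.Alpaca (matching the plugin ID below), it can be installed with:

flatpak install flathub com.jeffser.Alpaca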

To get proper AMD GPU acceleration support, install the com.jeffser.Alpaca.Plugins.AMD plugin via the CLI
(flatpak install com.jeffser.Alpaca.Plugins.AMD) or from Discover by searching for it. This add-on requires a compatible ROCm AMD GPU, such as a Radeon 7900 XTX or Radeon 6900 XT.

Alpaca Client