Claude Code is one of the best AI coding assistants out there. But if you're using it heavily, API costs can add up fast. What if you could route Claude Code through affordable open models — GLM-5, Kimi K2.5, MiniMax M2.5, and MiniMax M2.7 — for just $10/month?

That's exactly what OpenCode Go + LiteLLM Proxy makes possible. In this guide, I'll walk you through the full setup: subscribing to OpenCode Go, configuring LiteLLM as a local proxy, and launching Claude Code against these models. I'll cover both a native macOS approach and a Docker approach (LiteLLM in Docker, Claude Code running natively).

What is OpenCode Go?

OpenCode Go is a low-cost subscription from OpenCode that gives you reliable access to curated open coding models. It costs $5 for the first month, then $10/month.

The models are hosted across the US, EU, and Singapore for stable global access. OpenCode's team has tested and benchmarked each model specifically for coding agent use cases.

The models work seamlessly with the OpenCode CLI, but I wanted to use them with Claude Code.

Available Models


+---------------+--------------+-------------------+
| Model         | Provider     | API Compatibility |
+---------------+--------------+-------------------+
| GLM-5         | Zhipu AI     | OpenAI            |
| Kimi K2.5     | Moonshot AI  | OpenAI            |
| MiniMax M2.5  | MiniMax      | Anthropic         |
| MiniMax M2.7  | MiniMax      | Anthropic         |
+---------------+--------------+-------------------+

Usage Limits

Limits are dollar-value based, meaning cheaper models give you more requests:

+-------------+---------+------------+--------------+--------------+
| Requests    | GLM-5   | Kimi K2.5  | MiniMax M2.7 | MiniMax M2.5 |
+-------------+---------+------------+--------------+--------------+
| Per 5 hours |   1,150 |      1,850 |       14,000 |       20,000 |
| Per week    |   2,880 |      4,630 |       35,000 |       50,000 |
| Per month   |   5,750 |      9,250 |       70,000 |      100,000 |
+-------------+---------+------------+--------------+--------------+

Estimates are based on observed average request patterns:

  • GLM-5: 700 input, 52,000 cached, 150 output tokens per request
  • Kimi K2.5: 870 input, 55,000 cached, 200 output tokens per request
  • MiniMax M2.7/M2.5: 300 input, 55,000 cached, 125 output tokens per request

MiniMax M2.5 at 100,000 requests/month is incredibly generous for $10.

The Architecture

Here's how the pieces fit together:

Claude Code  →  LiteLLM Proxy (local)  →  OpenCode Go API  →  Open Models

Why LiteLLM? Claude Code expects an Anthropic-compatible API. LiteLLM is an open-source proxy that translates OpenAI/Anthropic-style API calls into the correct format for any provider. It acts as a local gateway — Claude Code talks to LiteLLM on localhost, and LiteLLM forwards requests to OpenCode Go's endpoints with the right authentication and parameters.

API Compatibility: MiniMax M2.5 and M2.7 are natively Anthropic-compatible, so LiteLLM passes Claude Code's requests through with minimal translation. GLM-5 and Kimi K2.5 use the OpenAI API format, so LiteLLM translates Claude Code's Anthropic-style requests into OpenAI-format calls for those providers and converts the responses back. This is handled automatically by LiteLLM's model translation layer: you just specify the model name and LiteLLM takes care of the rest.

Prerequisites

  • An OpenCode Go subscription and API key
  • Python 3.9+ (for native setup) or Docker (for Docker setup)
  • Claude Code installed:
curl -fsSL https://claude.ai/install.sh | bash

Step 1: Subscribe to OpenCode Go

  1. Head to OpenCode Go
  2. Sign in and subscribe
  3. Copy your API key — you'll need it in the next step

Step 2: Configure LiteLLM Proxy

Create a config.yaml file with the following content. Replace your_api_key_here with your actual OpenCode Go API key:

general_settings:
  master_key: dummy

model_list:

  - model_name: glm-5
    litellm_params:
      model: zai/glm-5
      api_base: https://opencode.ai/zen/go/v1
      api_key: "your_api_key_here"
      drop_params: true
  - model_name: kimi-k2.5
    litellm_params:
      model: moonshot/kimi-k2.5
      api_base: https://opencode.ai/zen/go/v1
      api_key: "your_api_key_here"
      drop_params: true
  - model_name: minimax-m2.7
    litellm_params:
      model: minimax/minimax-m2.7
      api_base: https://opencode.ai/zen/go
      api_key: "your_api_key_here"
      drop_params: true
  - model_name: minimax-m2.5
    litellm_params:
      model: minimax/minimax-m2.5
      api_base: https://opencode.ai/zen/go
      api_key: "your_api_key_here"
      drop_params: true

A few things to note:

  • master_key: dummy — LiteLLM requires a master key, but since this is a local proxy, we use a placeholder.
  • drop_params: true — This is essential. It tells LiteLLM to silently strip any parameters the upstream provider doesn't support, instead of throwing errors.
  • Different API base paths — GLM-5 and Kimi K2.5 use /zen/go/v1, while MiniMax models use /zen/go (no /v1 suffix). Getting this wrong will cause 404 errors.
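If you'd rather not hardcode the key in the file, LiteLLM can also read it from an environment variable using its os.environ/ syntax. A minimal sketch, assuming you call the variable OPENCODE_API_KEY:

export OPENCODE_API_KEY="your_api_key_here"

and in each model entry of config.yaml:

      api_key: os.environ/OPENCODE_API_KEY

Keep in mind that with the Launch Agent setup below, the variable also has to be visible to launchd (for example via an EnvironmentVariables key in the plist), so hardcoding the key can be simpler there.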

Approach A: Native macOS Setup

This approach runs LiteLLM directly on your Mac as a persistent background service.

Install LiteLLM

pip3 install 'litellm[proxy]'
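If you'd rather keep LiteLLM out of your system Python, a virtual environment works just as well. A hedged sketch (the venv location is just an example):

python3 -m venv ~/.venvs/litellm
~/.venvs/litellm/bin/pip install 'litellm[proxy]'

The litellm binary then lives at ~/.venvs/litellm/bin/litellm, which is the path you'd point the Launch Agent at later.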

Test it manually

litellm --config /path/to/config.yaml --port 38765

If everything works, you should see LiteLLM start up and list the configured models. Try a quick curl:

curl http://localhost:38765/v1/models

If the litellm command isn't found, your Python bin directory is probably not on your PATH.
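A quick way to locate and add it (the exact directory depends on your Python installation, so treat the path below as an example):

# Find your user-level Python base directory
python3 -m site --user-base
# Add its bin subdirectory to PATH (e.g. in ~/.zshrc)
export PATH="$HOME/Library/Python/3.9/bin:$PATH"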

Set up as a macOS Launch Agent

To keep LiteLLM running in the background and auto-start on login, create a Launch Agent.

Save the following as ~/Library/LaunchAgents/io.litellm.proxy.plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>io.litellm.proxy</string>

    <key>ProgramArguments</key>
    <array>
        <string>/path/to/litellm</string>
        <string>--config</string>
        <string>/path/to/config.yaml</string>
        <string>--port</string>
        <string>38765</string>
    </array>

    <key>RunAtLoad</key>
    <true/>

    <key>KeepAlive</key>
    <true/>

    <key>StandardOutPath</key>
    <string>/Users/you/Library/Logs/litellm.log</string>

    <key>StandardErrorPath</key>
    <string>/Users/you/Library/Logs/litellm.error.log</string>

    <key>WorkingDirectory</key>
    <string>/path/to/config/directory</string>
</dict>
</plist>

Replace the paths with your actual locations. To find where litellm is installed:

which litellm

A good convention is to keep your config at ~/.config/litellm/config.yaml.
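For example, you could create that directory and move your config into it:

mkdir -p ~/.config/litellm
mv /path/to/config.yaml ~/.config/litellm/config.yaml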

Load the Launch Agent

launchctl load ~/Library/LaunchAgents/io.litellm.proxy.plist

LiteLLM will now start automatically on login and restart if it crashes (KeepAlive: true). Check the logs at ~/Library/Logs/litellm.log if anything goes wrong.
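To check that the agent is actually running and follow its output:

launchctl list | grep litellm
tail -f ~/Library/Logs/litellm.log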

To stop it:

launchctl unload ~/Library/LaunchAgents/io.litellm.proxy.plist

Approach B: Docker Setup

If you'd rather not install LiteLLM natively, you can run it in Docker while keeping Claude Code on your host machine.

Start LiteLLM

docker run -d \
  --name litellm \
  -p 38765:38765 \
  -v /path/to/config.yaml:/app/config.yaml:ro \
  docker.litellm.ai/berriai/litellm:main-stable \
  --config /app/config.yaml --port 38765

Replace /path/to/config.yaml with the absolute path to your config file.

LiteLLM is now available at http://localhost:38765 on your host. Claude Code runs natively and connects to it just like the macOS approach.
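To confirm the container came up cleanly, check its logs and hit the models endpoint, just like in the native setup:

docker logs litellm
curl http://localhost:38765/v1/models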

Step 3: Launch Claude Code with the Proxy

Native macOS

With LiteLLM running as a Launch Agent on port 38765, launch Claude Code like this:

ANTHROPIC_BASE_URL=http://localhost:38765 \
ANTHROPIC_API_KEY=dummy \
claude --model glm-5

You can create shell aliases for convenience:

# Add to your ~/.zshrc or ~/.bashrc
alias claude-glm='ANTHROPIC_BASE_URL=http://localhost:38765 ANTHROPIC_API_KEY=dummy claude --model glm-5'
alias claude-kimi='ANTHROPIC_BASE_URL=http://localhost:38765 ANTHROPIC_API_KEY=dummy claude --model kimi-k2.5'
alias claude-minimax27='ANTHROPIC_BASE_URL=http://localhost:38765 ANTHROPIC_API_KEY=dummy claude --model minimax-m2.7'
alias claude-minimax25='ANTHROPIC_BASE_URL=http://localhost:38765 ANTHROPIC_API_KEY=dummy claude --model minimax-m2.5'

Now you can just type claude-kimi and start coding.
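If you'd rather not maintain one alias per model, a small shell function works too. A sketch (the function name is just a suggestion):

# Add to your ~/.zshrc or ~/.bashrc
claude-open() {
  local model="${1:-glm-5}"
  [ $# -gt 0 ] && shift
  ANTHROPIC_BASE_URL=http://localhost:38765 \
  ANTHROPIC_API_KEY=dummy \
  claude --model "$model" "$@"
}

Then claude-open kimi-k2.5 starts Claude Code against Kimi K2.5, claude-open with no argument defaults to GLM-5, and any extra arguments are passed through to claude.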

Testing the Setup

Before using Claude Code, you can verify the proxy works with a simple Python script using the OpenAI client:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:38765",
    api_key="dummy"
)

response = client.chat.completions.create(
    model="kimi-k2.5", # or glm-5 or minimax-m2.7
    messages=[{"role": "user", "content": "Hello, how are you?"}],
    max_tokens=1000
)

print(response.choices[0].message.content)

If you get a response, your proxy is working correctly and Claude Code will too.
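You can also exercise the Anthropic-style route that Claude Code itself uses. A minimal sketch, assuming LiteLLM serves the /v1/messages endpoint on this port (which is what the ANTHROPIC_BASE_URL setup relies on):

curl http://localhost:38765/v1/messages \
  -H "x-api-key: dummy" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model": "minimax-m2.5", "max_tokens": 256,
       "messages": [{"role": "user", "content": "Hello, how are you?"}]}'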

Tips and Gotchas

  1. drop_params: true is non-negotiable. Claude Code sends parameters that open models don't support. Without this flag, you'll get cryptic errors from the upstream API.
  2. Watch the API base paths. GLM-5 and Kimi K2.5 use https://opencode.ai/zen/go/v1 while MiniMax models use https://opencode.ai/zen/go (no /v1). Mixing these up causes 404s.
  3. Usage limits are dollar-based, not request-based. The limits may sound low, but at MiniMax M2.5's pricing, $10/month buys you ~100,000 requests — more than most developers will use in a month.
  4. Pick your model based on what you value. GLM-5 and Kimi K2.5 are better for quality-sensitive tasks. MiniMax M2.5 is the volume play — use it when you want maximum throughput and raw request count matters more than model capability.

Conclusion

For $10/month, you get four competitive open models through the best AI coding interface available. Once the LiteLLM proxy is running as a background service, the setup is completely invisible — you just code.

Links: