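If your lab has a shared server that can run large models, OpenClaw token costs stop being a concern: inference is free. This post walks through the setup in detail.

Option 1:

Deploy on a managed GPU cloud product, for example by following the Volcano Engine guide "GPU-部署DeepSeek-V3.2模型--GPU云服务器-火山引擎" (deploying the DeepSeek-V3.2 model on a GPU cloud server), or use the equivalent deployment product from another cloud vendor.

Option 2: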

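I. Hardware and model recommendations:

  • Hardware: 2-4 A100/H100 cards or 8 RTX 4090 cards are recommended (at least 64GB of GPU memory in total, and every card must support BF16). Treat 64GB as a floor: larger models need correspondingly more.
  • Software: Python 3.8-3.11, CUDA 11.8 or 12.1.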
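As a rough sanity check on the hardware sizing above, the sketch below estimates the memory needed for model weights alone as parameters × bytes per parameter. The 671B parameter count is the figure this post gives for DeepSeek V3.2; the 80GB card size and the 0.9 usable fraction are assumptions for illustration. KV cache and activations are ignored, so the result is only a lower bound.

```python
import math


def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    # (params_billion * 1e9 params) * bytes_per_param / (1e9 bytes per GB)
    return params_billion * bytes_per_param


def min_gpus_for_weights(params_billion: float, bytes_per_param: float,
                         gpu_memory_gb: float, usable_fraction: float = 0.9) -> int:
    """Lower bound on the GPU count: weights only, no KV cache or activations."""
    usable_per_gpu = gpu_memory_gb * usable_fraction
    return math.ceil(weight_memory_gb(params_billion, bytes_per_param) / usable_per_gpu)


if __name__ == "__main__":
    # 2 bytes per parameter for BF16, 1 byte for FP8.
    for label, nbytes in (("BF16", 2), ("FP8", 1)):
        size = weight_memory_gb(671, nbytes)
        gpus = min_gpus_for_weights(671, nbytes, gpu_memory_gb=80)
        print(f"{label}: about {size:.0f} GB of weights, at least {gpus} x 80GB GPUs")
```

Running the same functions with the parameter count of a smaller model shows quickly whether it fits on the cards you actually have.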

You need one server capable of running large models.

Suitable models include MiniMax-M2.5, Kimi-2.5, GLM-5, DeepSeek V3.2, and other models with agent/terminal capabilities.

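II. Deploying the model service with vLLM or SGLang

This post uses SGLang as the example.

1. Pull the SGLang image and start a container

References: the lmsysorg/sglang image on Docker Hub, and the "Install SGLang" page of the SGLang documentation.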

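# Start an interactive container with all GPUs visible and port 30000 published.
# The remaining steps are run from the shell inside this container.
docker run -it --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    bash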
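2. Download the model weights

The DeepSeek V3.2 weights are very large: a 671B-parameter checkpoint takes hundreds of GB even at 8 bits per parameter, well beyond the roughly 120GB sometimes quoted. Check the total file size on the model page and make sure you have enough disk space.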
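The disk space requirement can be checked before the download starts. The sketch below uses only the Python standard library; the target directory and the 700 GB threshold are hypothetical placeholders (roughly 671B parameters at one byte each plus headroom), so substitute the total size shown on the model page.

```python
import shutil


def free_space_gb(path: str) -> float:
    """Free space on the filesystem containing `path`, in GB (1 GB = 1e9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_space(path: str, required_gb: float) -> bool:
    """True if the filesystem containing `path` has at least `required_gb` free."""
    return free_space_gb(path) >= required_gb


if __name__ == "__main__":
    target_dir = "."      # hypothetical: where the weights will be downloaded
    required_gb = 700.0   # hypothetical placeholder; use the real checkpoint size
    free = free_space_gb(target_dir)
    if has_enough_space(target_dir, required_gb):
        print(f"OK: {free:.0f} GB free, {required_gb:.0f} GB required")
    else:
        print(f"Not enough space: {free:.0f} GB free, {required_gb:.0f} GB required")
```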

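# Install huggingface_hub
pip install huggingface_hub
# Download the model
# (recent huggingface_hub releases deprecate --local-dir-use-symlinks; drop the flag if your CLI rejects it)
huggingface-cli download deepseek-ai/DeepSeek-V3.2 --local-dir ./deepseek-v3.2 --local-dir-use-symlinks False

You can also use a mirror inside mainland China (such as ModelScope) or download the shard files manually.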

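3. Start the SGLang inference server

DeepSeek V3.2 is a 671B-parameter MoE model, so it must be sharded with tensor parallelism (TP) and expert parallelism (EP). The command below assumes either 2 machines with 4 GPUs each or a single machine with 8 GPUs: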

# Single-node multi-GPU launch example (8 GPUs, TP=4, EP=2)
# Flag names vary between SGLang releases; confirm with: python -m sglang.launch_server --help
python -m sglang.launch_server \
    --model-path /workspace/deepseek-v3.2 \
    --tp-size 4 \
    --ep-size 2 \
    --dtype bfloat16 \
    --trust-remote-code \
    --host 0.0.0.0 \
    --port 30000 \
    --mem-fraction-static 0.9 \
    --enable-deepep-moe \
    --deepep-mode auto

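Key parameters:

  • --tp-size 4: tensor parallel degree, i.e. how many GPUs each layer's weights are sharded across; usually set to the number of GPUs on one machine.
  • --ep-size 2: expert parallel degree; distributes the MoE experts across devices.
  • --enable-deepep-moe: enables the DeepEP optimization, which can substantially improve MoE throughput.
  • --trust-remote-code: DeepSeek models need to load custom architecture code.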
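To make the idea behind expert parallelism concrete, here is a toy illustration of splitting MoE experts across EP ranks. The contiguous, even split and the expert count of 8 are assumptions chosen for readability; this is not SGLang's actual placement logic.

```python
from typing import Dict, List


def assign_experts(num_experts: int, ep_size: int) -> Dict[int, List[int]]:
    """Map each EP rank to a contiguous, equally sized slice of expert ids."""
    if ep_size <= 0 or num_experts % ep_size != 0:
        raise ValueError("num_experts must be a positive multiple of ep_size")
    per_rank = num_experts // ep_size
    return {
        rank: list(range(rank * per_rank, (rank + 1) * per_rank))
        for rank in range(ep_size)
    }


if __name__ == "__main__":
    # Hypothetical toy model with 8 experts, split across ep_size=2 ranks.
    for rank, experts in assign_experts(8, 2).items():
        print(f"EP rank {rank} holds experts {experts}")
```

Each rank stores only its own experts, which is why raising the EP degree lowers the per-device memory taken by expert weights at the cost of more cross-device token routing.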

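Note: for a multi-node deployment, run sglang.launch_server on every node with --nnodes, --dist-init-addr, and --node-rank set; see the distributed launch section of the official SGLang documentation for details.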

4. Client test

Once the server is up (the log shows "The server is fired up and ready to roll!"), test it with curl:

curl -X POST http://localhost:30000/generate \
  -H "Content-Type: application/json" \
  -d '{
    "text": "The capital of France is",
    "sampling_params": {
      "temperature": 0.1,
      "max_new_tokens": 100
    }
  }'
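The same request can be sent from Python, which is handy inside scripts. This is a minimal sketch using only the standard library; it mirrors the curl payload above, and the helper names build_generate_payload and generate are made up for this example.

```python
import json
import urllib.request


def build_generate_payload(prompt: str, temperature: float = 0.1,
                           max_new_tokens: int = 100) -> dict:
    """Build the same JSON body as the curl example."""
    return {
        "text": prompt,
        "sampling_params": {
            "temperature": temperature,
            "max_new_tokens": max_new_tokens,
        },
    }


def generate(base_url: str, prompt: str, timeout: float = 60.0) -> dict:
    """POST the payload to <base_url>/generate and return the parsed JSON reply."""
    body = json.dumps(build_generate_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (requires the server from step 3 to be running):
#   print(generate("http://localhost:30000", "The capital of France is"))
```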

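The key to deploying DeepSeek V3.2 with SGLang is choosing a parallelism strategy (TP+EP) that fits its very large parameter count. Start with a small single-machine test on 2-4 GPUs, then scale up step by step to a multi-machine deployment.

III. Configuring OpenClaw

Next, run the onboarding wizard and answer the prompts as shown: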

openclaw onboard
│
◇  I understand this is personal-by-default and shared/multi-user use requires lock-down. Continue?
│  Yes
│
◇  Onboarding mode
│  QuickStart
│
◇  Existing config detected ────────────╮
│                                       │
│  workspace: ~/.openclaw/workspace     │
│  model: vllm//models/Qwen3.5-35B-A3B  │
│  gateway.mode: local                  │
│  gateway.port: 18789                  │
│  gateway.bind: loopback               │
│                                       │
├───────────────────────────────────────╯
│
◇  Config handling
│  Reset
│
◇  Reset scope
│  Full reset (config + creds + sessions + workspace)
Moved to Trash: ~/.openclaw/openclaw.json
Moved to Trash: ~/.openclaw/agents/main/sessions
Moved to Trash: ~/.openclaw/workspace
│
◇  QuickStart ─────────────────────────╮
│                                      │
│  Gateway port: 18789                 │
│  Gateway bind: Loopback (127.0.0.1)  │
│  Gateway auth: Token (default)       │
│  Tailscale exposure: Off             │
│  Direct to chat channels.            │
│                                      │
├──────────────────────────────────────╯
│
◇  Model/auth provider
│  SGLang
│
◇  SGLang base URL
│  http://<host>:<port>/v1  # model service URL, e.g. http://localhost:30000/v1
│
◇  SGLang API key
│  xxx  # API key
│
◇  SGLang model
│  xxx  # model name, e.g. the --model-path passed to SGLang, such as /workspace/deepseek-v3.2
│
◇  Model configured ──────────────────────────────────╮
│                                                     │
│  Default model set to //xxx  │
│                                                     │
├─────────────────────────────────────────────────────╯
│
◇  Filter models by provider
│  All providers
│
◇  Default model
│  Keep current (xxxx)
│
◇  Model check ────────────────────────────────────────────────────────────────────────╮
│                                                                                      │
│  Model not found: xxxxx. Update agents.defaults.model or run  │
│  /models list.                                                                       │
│                                                                                      │
├──────────────────────────────────────────────────────────────────────────────────────╯
│
◇  Channel status ────────────────────────────╮
│                                             │
│  Telegram: needs token                      │
│  WhatsApp (default): not linked             │
│  Discord: needs token                       │
│  Slack: needs tokens                        │
│  Signal: needs setup                        │
│  signal-cli: missing (signal-cli)           │
│  iMessage: needs setup                      │
│  imsg: missing (imsg)                       │
│  IRC: not configured                        │
│  Google Chat: not configured                │
│  Feishu: install plugin to enable           │
│  Google Chat: install plugin to enable      │
│  Nostr: install plugin to enable            │
│  Microsoft Teams: install plugin to enable  │
│  Mattermost: install plugin to enable       │
│  Nextcloud Talk: install plugin to enable   │
│  Matrix: install plugin to enable           │
│  BlueBubbles: install plugin to enable      │
│  LINE: install plugin to enable             │
│  Zalo: install plugin to enable             │
│  Zalo Personal: install plugin to enable    │
│  Synology Chat: install plugin to enable    │
│  Tlon: install plugin to enable             │
│                                             │
├─────────────────────────────────────────────╯
│
◇  How channels work ───────────────────────────────────────────────────────────────────────╮
│                                                                                           │
│  DM security: default is pairing; unknown DMs get a pairing code.                         │
│  Approve with: openclaw pairing approve <channel> <code>                                  │
│  Public DMs require dmPolicy="open" + allowFrom=["*"].                                    │
│  Multi-user DMs: run: openclaw config set session.dmScope "per-channel-peer" (or          │
│  "per-account-channel-peer" for multi-account channels) to isolate sessions.              │
│  Docs: channels/pairing              │
│                                                                                           │
│  Telegram: simplest way to get started — register a bot with @BotFather and get going.    │
│  WhatsApp: works with your own number; recommend a separate phone + eSIM.                 │
│  Discord: very well supported right now.                                                  │
│  IRC: classic IRC networks with DM/channel routing and pairing controls.                  │
│  Google Chat: Google Workspace Chat app with HTTP webhook.                                │
│  Slack: supported (Socket Mode).                                                          │
│  Signal: signal-cli linked device; more setup (David Reagans: "Hop on Discord.").         │
│  iMessage: this is still a work in progress.                                              │
│  Feishu: 飞书/Lark enterprise messaging with doc/wiki/drive tools.                        │
│  Nostr: Decentralized protocol; encrypted DMs via NIP-04.                                 │
│  Microsoft Teams: Bot Framework; enterprise support.                                      │
│  Mattermost: self-hosted Slack-style chat; install the plugin to enable.                  │
│  Nextcloud Talk: Self-hosted chat via Nextcloud Talk webhook bots.                        │
│  Matrix: open protocol; install the plugin to enable.                                     │
│  BlueBubbles: iMessage via the BlueBubbles mac app + REST API.                            │
│  LINE: LINE Messaging API bot for Japan/Taiwan/Thailand markets.                          │
│  Zalo: Vietnam-focused messaging platform with Bot API.                                   │
│  Zalo Personal: Zalo personal account via QR code login.                                  │
│  Synology Chat: Connect your Synology NAS Chat to OpenClaw with full agent capabilities.  │
│  Tlon: decentralized messaging on Urbit; install the plugin to enable.                    │
│                                                                                           │
├───────────────────────────────────────────────────────────────────────────────────────────╯
│
◇  Select channel (QuickStart)
│  Skip for now
Updated ~/.openclaw/openclaw.json
Workspace OK: ~/.openclaw/workspace
Sessions OK: ~/.openclaw/agents/main/sessions
│
◇  Skills status ─────────────╮
│                             │
│  Eligible: 3                │
│  Missing requirements: 48   │
│  Unsupported on this OS: 0  │
│  Blocked by allowlist: 0    │
│                             │
├─────────────────────────────╯
│
◇  Configure skills now? (recommended)
│  No
│
◇  Hooks ──────────────────────────────────────────────────────────────────╮
│                                                                          │
│  Hooks let you automate actions when agent commands are issued.          │
│  Example: Save session context to memory when you issue /new or /reset.  │
│                                                                          │
│  Learn more: https://docs.openclaw.ai/automation/hooks                   │
│                                                                          │
├──────────────────────────────────────────────────────────────────────────╯
│
◇  Enable hooks?
│  Skip for now
Config overwrite: /Users/xxx/.openclaw/openclaw.json)
│
◇  Gateway service runtime ────────────────────────────────────────────╮
│                                                                      │
│  QuickStart uses Node for the Gateway service (stable + supported).  │
│                                                                      │
├──────────────────────────────────────────────────────────────────────╯
│
◇  Gateway service already installed
│  Restart
│
◑  Restarting Gateway service…..Restarted LaunchAgent: gui/501/ai.openclaw.gateway
◇  Gateway service restarted.
│
◇  
Agents: main (default)
Heartbeat interval: 30m (main)
Session store (main): /Users/.openclaw/agents/main/sessions/sessions.json (1 entries)
- agent:main:main (5m ago)
│
◇  Optional apps ────────────────────────╮
│                                        │
│  Add nodes for extra features:         │
│  - macOS app (system + notifications)  │
│  - iOS app (camera/canvas)             │
│  - Android app (camera/canvas)         │
│                                        │
├────────────────────────────────────────╯
│
◇  Control UI ─────────────────────────────────────────────────────────────────────╮
│                                                                                  │
│  Web UI: http://127.0.0.1:18789/                                                 │
│  Web UI (with token):                                                            │
│  http://127.0.0.1:18789/#token=xxxx │
│  Gateway WS: ws://127.0.0.1:18789                                                │
│  Gateway: reachable                                                              │
│  Docs: https://docs.openclaw.ai/web/control-ui                                   │
│                                                                                  │
├──────────────────────────────────────────────────────────────────────────────────╯
│
◇  Start TUI (best option!) ─────────────────────────────────╮
│                                                            │
│  This is the defining action that makes your agent you.    │
│  Please take your time.                                    │
│  The more you tell it, the better the experience will be.  │
│  We will send: "Wake up, my friend!"                       │
│                                                            │
├────────────────────────────────────────────────────────────╯
│
◇  Token ─────────────────────────────────────────────────────────────────────────────────╮
│                                                                                         │
│  Gateway token: shared auth for the Gateway + Control UI.                               │
│  Stored in: ~/.openclaw/openclaw.json (gateway.auth.token) or OPENCLAW_GATEWAY_TOKEN.   │
│  View token: openclaw config get gateway.auth.token                                     │
│  Generate token: openclaw doctor --generate-gateway-token                               │
│  Web UI stores a copy in this browser's localStorage (openclaw.control.settings.v1).    │
│  Open the dashboard anytime: openclaw dashboard --no-open                               │
│  If prompted: paste the token into Control UI settings (or use the tokenized dashboard  │
│  URL).                                                                                  │
│                                                                                         │
├─────────────────────────────────────────────────────────────────────────────────────────╯
│
◇  How do you want to hatch your bot?
│  Open the Web UI
│


And that's it, you're ready to go!
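If the Model check step reports "Model not found" as in the transcript above, one thing worth checking is which model id the server itself advertises. The sketch below queries the OpenAI-compatible /v1/models endpoint with the standard library. Whether OpenClaw's check compares against this same list is an assumption, and the function names are hypothetical.

```python
import json
import urllib.request
from typing import List


def extract_model_ids(models_response: dict) -> List[str]:
    """Pull the model ids out of an OpenAI-style model list response."""
    return [entry["id"] for entry in models_response.get("data", [])]


def list_model_ids(base_url: str, api_key: str = "", timeout: float = 10.0) -> List[str]:
    """GET <base_url>/models, where base_url already ends in /v1."""
    headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}
    request = urllib.request.Request(f"{base_url}/models", headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_model_ids(json.loads(response.read().decode("utf-8")))


# Example (requires the SGLang server to be running):
#   print(list_model_ids("http://localhost:30000/v1"))
```

The id printed here is the value to enter at the "SGLang model" prompt.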
