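Background

The problem started with a "lobster suicide": when I asked OpenClaw to update some configuration, it ran openclaw gateway stop, which shut down the OpenClaw service itself. I was left staring at the screen, waiting like a fool; it did not even say goodbye…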

openclaw gateway stop

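Solution

Use keepalived to keep the OpenClaw service running: keepalived runs a check script periodically, and the script starts the service again whenever it finds it stopped. It is also a chance to learn how keepalived is used.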

Check the IP

ifconfig

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 172.16.0.9  netmask 255.255.0.0  broadcast 172.16.255.255
        inet6 fe80::f816:3eff:fedc:b5a8  prefixlen 64  scopeid 0x20<link>
        ether fa:16:3e:dc:b5:a8  txqueuelen 1000  (Ethernet)
        RX packets 67391273  bytes 27877705315 (27.8 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 110078632  bytes 24769427866 (24.7 GB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 2288521  bytes 373008683 (373.0 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2288521  bytes 373008683 (373.0 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
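
On minimal installs ifconfig may be missing, since it ships with net-tools. A rough equivalent, assuming iproute2 is installed, is to filter the one-line output of ip -o -4 addr show. The helper name ipv4_cidrs is hypothetical and only exists for this sketch; the address with its prefix length is useful when picking a VIP in the same subnet.

```shell
#!/bin/sh
# ipv4_cidrs (hypothetical helper): read `ip -o -4 addr show` output on stdin
# and print every "address/prefix" that follows the word "inet".
ipv4_cidrs() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "inet") print $(i + 1) }'
}

# Usage on the server:
#   ip -o -4 addr show dev eth0 | ipv4_cidrs
```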

Health check script

The check_openclaw.sh script checks whether the OpenClaw service is running and tries to start it if it is not:

#!/bin/bash
export XDG_RUNTIME_DIR="/run/user/0"

if systemctl --user is-active openclaw-gateway.service > /dev/null 2>&1; then
    if curl -s -f -o /dev/null http://127.0.0.1:18789/health 2>/dev/null; then
        #logger -t openclaw-health "Health check PASSED"
        exit 0
    else
        logger -t openclaw-health "Health check FAILED - service not responding"
        exit 1
    fi
else
    logger -t openclaw-health  "Health check FAILED - service is not active"

    # Try to start the service
    logger -t openclaw-health "Attempting to start service"
    systemctl --user start openclaw-gateway.service
    sleep 3

    # Check whether the start succeeded
    if systemctl --user is-active openclaw-gateway.service > /dev/null 2>&1; then
        logger -t openclaw-health  "Service started successfully"
        exit 0
    else
        logger -t openclaw-health  "Failed to start service"
        exit 1
    fi
fi
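
The script above only starts the service when the unit is inactive. When the unit is active but /health does not answer, it exits 1 without doing anything. A possible variant, sketched here as an assumption about what you might want rather than as the author's method, restarts the unit in both cases and leaves verification to the next run. The function names (unit_active, probe_ok, restart_unit, log, check_once) and the 3-second curl limit are made up for this sketch.

```shell
#!/bin/sh
# Hypothetical variant of check_openclaw.sh: restart on any failed check.
UNIT="openclaw-gateway.service"
HEALTH_URL="http://127.0.0.1:18789/health"

unit_active()  { systemctl --user is-active --quiet "$UNIT"; }
# --max-time keeps the probe shorter than keepalived's 5-second script timeout.
probe_ok()     { curl -s -f -o /dev/null --max-time 3 "$HEALTH_URL"; }
restart_unit() { systemctl --user restart "$UNIT"; }
log()          { logger -t openclaw-health "$1"; }

check_once() {
    if unit_active && probe_ok; then
        return 0
    fi
    log "Health check FAILED - restarting $UNIT"
    restart_unit
    # Report failure now; the next scheduled run verifies the restart.
    return 1
}

# In the real script, export XDG_RUNTIME_DIR as before and finish with:
#   check_once; exit $?
```

Returning 1 right after the restart avoids the sleep 3, so a single run stays well inside the timeout configured in keepalived.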

keepalived configuration

The keepalived configuration lives in /etc/keepalived/keepalived.conf. There is only one server, so state is set to MASTER:

global_defs {
    router_id OPENCLAW_MONITOR
    script_user root
    enable_script_security
}

vrrp_script chk_openclaw {
    script "/usr/local/bin/check_openclaw.sh"
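    interval 10      # run the check every 10 seconds
    timeout 5        # treat the script as failed if it runs longer than 5 seconds
    weight -20       # lower the priority by 20 while the check is failing
    fall 2           # 2 consecutive failures mark the check as failed
    rise 1           # 1 success marks it as recovered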
}

vrrp_instance OPENCLAW_MONITOR {
    state MASTER                     # single node, so start as MASTER
    interface eth0                   # use eth0
    virtual_router_id 51
    priority 100                     # priority
    advert_int 2                     # VRRP advertisement (heartbeat) interval: 2 seconds

    # Virtual IP configuration
    virtual_ipaddress {
        172.16.0.100/16 dev eth0     # bind the VIP to eth0
    }

    track_script {
        chk_openclaw
    }

}
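
To get a feel for the numbers in this configuration, the arithmetic can be sketched as below. The helper names are hypothetical. The estimate assumes the check is marked failed after fall consecutive failing runs spaced interval seconds apart, and that a non-zero weight is added to the priority while the check is failing.

```shell
#!/bin/sh
# Rough upper bound, in seconds, before keepalived marks the check as failed:
# `fall` failing runs, one every `interval` seconds (script runtime ignored).
detect_seconds() {        # args: interval fall
    echo $(( $1 * $2 ))
}

# Priority while the check is failing: base priority plus the (negative) weight.
effective_priority() {    # args: priority weight
    echo $(( $1 + $2 ))
}

detect_seconds 10 2          # interval 10, fall 2
effective_priority 100 -20   # priority 100, weight -20
```

With a single node there is no peer to take over, so the lowered priority mainly matters if a second server is added later. In this setup the restart is done by the check script itself.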

Start the keepalived service

# Restart keepalived
systemctl restart keepalived
# Enable keepalived at boot
systemctl enable keepalived
# Check keepalived status
systemctl status keepalived
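
A typo in keepalived.conf is easier to catch before the restart than after it. The sketch below assumes a keepalived 2.x build that supports the --config-test option. The function name reload_keepalived is hypothetical.

```shell
#!/bin/sh
# reload_keepalived (hypothetical helper): restart keepalived only if the
# configuration file passes keepalived's own syntax check.
reload_keepalived() {
    conf=${1:-/etc/keepalived/keepalived.conf}
    if ! keepalived --config-test --use-file="$conf"; then
        echo "keepalived config test failed: $conf" >&2
        return 1
    fi
    systemctl restart keepalived
}

# Usage (as root):
#   reload_keepalived
```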

Check the VIP (virtual IP)

ip addr show eth0 | grep 172.16.0.100
inet 172.16.0.100/16 scope global secondary eth0

Failure drill

# Stop the OpenClaw service
openclaw gateway stop

Watch the logs with journalctl -t openclaw-health -f:

Apr 02 16:06:24 lavm-0sdc09108n openclaw-health[1567604]: Attempting to start service
Apr 02 16:06:27 lavm-0sdc09108n openclaw-health[1567623]: Service started successfully
Apr 02 16:06:34 lavm-0sdc09108n openclaw-health[1567638]: Health check FAILED - service not responding

Verify the port with nc -zv localhost 18789 (the -v flag makes nc print the result):

Connection to localhost (127.0.0.1) 18789 port [tcp/*] succeeded!
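
During a drill it can be handy to wait for the port instead of probing by hand. The loop below is a minimal sketch that polls with nc -z once per second. The function name wait_for_port and the attempt count are assumptions made for illustration.

```shell
#!/bin/sh
# wait_for_port (hypothetical helper): poll a TCP port with `nc -z` until it
# accepts connections or the given number of attempts is used up.
wait_for_port() {         # args: host port attempts
    host=$1
    port=$2
    attempts=$3
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if nc -z "$host" "$port" 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Usage: wait up to roughly 30 seconds for the gateway to come back.
#   wait_for_port localhost 18789 30 && echo "gateway is listening"
```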

You can also run openclaw gateway status to check the service state:

🦞 OpenClaw 2026.3.24 (cff6dc9)
   I can grep it, git blame it, and gently roast it—pick your coping mechanism.

│
◇
Service: systemd (enabled)
File logs: /tmp/openclaw/openclaw-2026-04-02.log
Command: /root/.nvm/versions/node/v22.22.1/bin/node /root/.pnpm-global/5/.pnpm/openclaw@2026.3.24_@napi-rs+canvas@0.1.97/node_modules/openclaw/dist/index.js gateway --port 18789
Service file: ~/.config/systemd/user/openclaw-gateway.service
Service env: OPENCLAW_GATEWAY_PORT=18789

Service config looks out of date or non-standard.
Service config issue: Gateway service PATH includes version managers or package managers; recommend a minimal PATH. (/root/.nvm/versions/node/v22.22.1/bin)
Service config issue: Gateway service uses Node from a version manager; it can break after upgrades. (/root/.nvm/versions/node/v22.22.1/bin/node)
Service config issue: System Node 22 LTS (22.14+) or Node 24 not found; install it before migrating away from version managers.
Recommendation: run "openclaw doctor" (or "openclaw doctor --repair").
Config (cli): ~/.openclaw/openclaw.json
Config (service): ~/.openclaw/openclaw.json

Gateway: bind=loopback (127.0.0.1), port=18789 (service args)
Probe target: ws://127.0.0.1:18789
Dashboard: http://127.0.0.1:18789/
Probe note: Loopback-only gateway; only local clients can connect.

Runtime: running (pid 1567606, state active, sub running, last exit 0, reason 0)
RPC probe: ok

Listening: 127.0.0.1:18789
Troubles: run openclaw status
Troubleshooting: https://docs.openclaw.ai/troubleshooting

openclaw-gateway.service

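There is an even simpler approach: set Restart=always directly in /etc/systemd/system/openclaw-gateway.service, and systemd will restart the service whenever its process exits. One caveat: systemd does not apply Restart= to a unit that was stopped explicitly with systemctl stop, so this only helps when the gateway process dies some other way. Whether openclaw gateway stop goes through systemctl stop was not verified here.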

[Unit]
Description=OpenClaw Gateway
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
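# systemd does not expand $(...); replace each placeholder below with its literal value.
#User=$(whoami)
User=root
EnvironmentFile=/opt/openclaw/.env
# With User=root the home directory is normally /root, not /home/root.
WorkingDirectory=/home/$(whoami)/.openclaw
# ExecStart needs the absolute path that `which openclaw` prints.
ExecStart=$(which openclaw) gateway --force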
Restart=always
RestartSec=2

[Install]
WantedBy=multi-user.target
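
The $(whoami) and $(which openclaw) placeholders in the unit above are shell syntax, which systemd does not evaluate. One way to fill them in, sketched here under the assumption that the unit is generated from a shell heredoc, is to let the shell expand the values before the file is written. The function name render_unit is hypothetical.

```shell
#!/bin/sh
# render_unit (hypothetical helper): print the unit file with the user, home
# directory and openclaw path filled in as literal values.
render_unit() {           # args: user home openclaw_path
    user=$1
    home=$2
    bin=$3
    cat <<EOF
[Unit]
Description=OpenClaw Gateway
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=$user
EnvironmentFile=/opt/openclaw/.env
WorkingDirectory=$home/.openclaw
ExecStart=$bin gateway --force
Restart=always
RestartSec=2

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (as root):
#   render_unit "$(whoami)" "$HOME" "$(command -v openclaw)" \
#       > /etc/systemd/system/openclaw-gateway.service
#   systemctl daemon-reload
#   systemctl enable --now openclaw-gateway.service
```

Using $HOME instead of /home/$(whoami) keeps the path correct for root, whose home directory is usually /root.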