AI Agents Too Expensive and Leaky? This Open-Source Plugin Slashes Costs by Nearly 60% While Keeping Sensitive Data On-Premises


MLNLP Community is a renowned machine learning and natural language processing community both domestically and internationally, serving graduate students, university faculty, and enterprise researchers in the NLP field. The community's vision is to foster communication and advancement among academia, industry, and enthusiasts in natural language processing and machine learning, with special emphasis on supporting beginners' growth.

Source | AI Technology Review

AI Agents are integrating into our workflows in unprecedented ways. However, as we enthusiastically deploy them in practical scenarios, current Agent usage patterns reveal three critical issues:

Cloud: "Too Risky to Use": Want an Agent to analyze a customer data spreadsheet? Great idea, but customer names, phone numbers, ID numbers... all these sensitive details get sent to third-party cloud servers along with the context. A single data analysis task could mean a serious privacy breach. This is a risk we simply cannot afford.

Cloud: "Too Expensive to Use": For numerous simple tasks, such as using grep to locate a function call or generating a brief text summary, Agents indiscriminately invoke the most expensive top-tier models. The majority of tokens are wasted on simple tasks that could be handled by cheaper models—using a sledgehammer to crack a nut.

Local: "Doesn't Work Well": Running models locally is secure and cost-effective, but performance often falls short of expectations due to limited computational power and parameter scale on edge devices. While format conversion and data aggregation work fine, models "crash" when faced with multi-file cross-analysis or complex anomaly detection, proving inadequate for high-difficulty tasks.

The cloud is too dangerous, while local deployment is too underwhelming—must developers choose between the two?

Of course not. Why choose when you can have both? The optimal solution is cloud-edge collaboration:

Let lightweight local models handle privacy-sensitive data and simple tasks, while delegating complex "tough nuts to crack" to powerful cloud models. The key to enabling this lies in the "intelligent traffic controller" that routes each request to the most appropriate path—the routing mechanism.

THUNLP Laboratory at Tsinghua University, Renmin University of China, AI9Stars, ModelBest, and OpenBMB have jointly released and open-sourced ClawXRouter, specifically designed to solve this problem!

ClawXRouter is a cloud-edge collaborative AI Agent routing plugin that seamlessly adapts to the OpenClaw ecosystem, originating from the EdgeClaw cloud-edge collaborative Agent framework.

EdgeClaw natively incorporates comprehensive cloud-edge collaboration capabilities including three-tier privacy routing, cost-performance-aware routing, intelligent de-identification forwarding, and dual-track memory. ClawXRouter packages these core routing capabilities into an independent plugin for easy integration into the OpenClaw ecosystem.

Developers can enable AI Agents to automatically achieve the following without modifying a single line of business code:

▪ Public data analysis on the cloud

▪ Sensitive data de-identification before cloud upload

▪ Private data processing locally


One plugin seamlessly enables cloud-edge collaboration, resolving developers' three major challenges: "too risky to use," "too expensive to use," and "doesn't work well."

GitHub open-source link: https://github.com/OpenBMB/ClawXRouter

ClawHub link: https://clawhub.ai/plugins/clawxrouter

ClawXRouter: Automatically Delivering the "Optimal Solution" for Every Request

Three-Tier Privacy Routing: Solving "Too Risky to Use"

Even routine tasks like Code Review can accidentally expose API Keys or database passwords to cloud models. ClawXRouter employs hooks to automatically scan every message, tool call, and Agent output like a security checkpoint, classifying them into three tiers:

S3 (Private): SSH private keys, hardcoded passwords, payroll data. These are physically isolated, with requests processed entirely offline by local models, completely invisible to the cloud. Private information never leaves the device.

S2 (Sensitive): Alert logs containing internal network IPs, contact lists with phone numbers. ClawXRouter automatically identifies and intelligently de-identifies such data (e.g., replacing "Wang Xiaoer" with "[REDACTED:NAME]") before forwarding to cloud models.

S1 (Safe): General queries like "What's the difference between HTTP 403 and 401?" are sent directly to the cloud to leverage its full capabilities.

Behind this lies a dual-detection engine that combines rules with a model, balancing speed and accuracy so that sensitive data does not slip through.
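As a rough illustration of how the rule-based half of such a three-tier classifier could work, here is a minimal sketch. The S1/S2/S3 tier names and the `[REDACTED:NAME]`-style labels follow the article; the regex patterns, function names, and routing decisions are illustrative assumptions, not ClawXRouter's actual implementation.

```python
import re

# S3 (private): request must stay on-device with the local model.
S3_PATTERNS = [
    re.compile(r"-----BEGIN (?:RSA |OPENSSH )?PRIVATE KEY-----"),  # SSH/TLS keys
    re.compile(r"(?i)\bpassword\s*[:=]\s*\S+"),                    # hardcoded passwords
]

# S2 (sensitive): may go to the cloud, but only after de-identification.
S2_PATTERNS = {
    "IP": re.compile(r"\b10\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"),        # internal-network IPs
    "PHONE": re.compile(r"\b1[3-9]\d{9}\b"),                       # mobile numbers
}

def classify(text: str) -> str:
    """Return the privacy tier: S3 (private), S2 (sensitive), or S1 (safe)."""
    if any(p.search(text) for p in S3_PATTERNS):
        return "S3"
    if any(p.search(text) for p in S2_PATTERNS.values()):
        return "S2"
    return "S1"

def deidentify(text: str) -> str:
    """Replace each S2 match with a [REDACTED:<TYPE>] placeholder."""
    for label, pattern in S2_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text

def route(text: str) -> tuple[str, str]:
    """Decide the destination and the (possibly redacted) payload."""
    tier = classify(text)
    if tier == "S3":
        return "local", text            # private data never leaves the device
    if tier == "S2":
        return "cloud", deidentify(text)
    return "cloud", text                # S1: send as-is
```

For example, `route("alert from 10.0.3.7")` forwards `"alert from [REDACTED:IP]"` to the cloud, while anything containing a hardcoded password is pinned to the local model. A production engine would pair such rules with a model-based detector, as the article describes, to catch sensitive content that regexes miss.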

Cost-Performance-Aware Routing: Solving "Too Expensive to Use"

Why use a "space-grade" model for a "screw-tightening" task? ClawXRouter includes a local small model acting as a "Task Evaluator" (LLM-as-Judge). It rapidly assesses task complexity and dispatches each request to the most appropriate model.
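The dispatch logic can be sketched as follows. In ClawXRouter the judge is a local small model; here a keyword heuristic stands in so the control flow is runnable, and the model names, prices, and thresholds are made-up assumptions.

```python
from dataclasses import dataclass

@dataclass
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # hypothetical prices for illustration

CHEAP = ModelTier("local-small", 0.0)
MID = ModelTier("cloud-mini", 0.15)
TOP = ModelTier("cloud-flagship", 3.00)

def judge_complexity(task: str) -> int:
    """Stub judge returning 1 (trivial) .. 5 (hard).
    A real LLM-as-Judge would prompt a local small model to score the task."""
    hard = ("cross-file", "anomaly", "refactor", "prove")
    simple = ("grep", "summarize", "rename", "format")
    text = task.lower()
    if any(k in text for k in hard):
        return 5
    if any(k in text for k in simple):
        return 1
    return 3

def dispatch(task: str) -> ModelTier:
    """Route the request to the cheapest model the task can tolerate."""
    score = judge_complexity(task)
    if score <= 2:
        return CHEAP
    if score <= 4:
        return MID
    return TOP
```

Under this scheme a `grep` lookup lands on the free local model and only genuinely hard tasks pay for the flagship, which is how the bulk of the token spend gets trimmed.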


How effective is it? It was tested on PinchBench, a suite of 23 OpenClaw Agent benchmarks.


The conclusion is clear: 58% cost savings with a 6.3% performance improvement!

Dual-Track Memory and Intelligent De-identification: Solving "Doesn't Work Well"

What if a task involves both sensitive information and requires the cloud model's powerful reasoning capabilities? This is where ClawXRouter's intelligent de-identification mechanism shines.

For complex tasks involving sensitive information where local models fall short, there's no need to "force it": ClawXRouter automatically identifies and intelligently de-identifies sensitive information before securely delegating the task to the cloud for processing.

Simultaneously, ClawXRouter cleverly maintains dual-track memory and dual-track session mechanisms: cloud models only see de-identified conversation history (`MEMORY.md`), while complete information is retained locally (`MEMORY-FULL.md`). This protects privacy without letting local model limitations bottleneck workflows, fundamentally eliminating the risk of private data leaking to third-party services through context windows.
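A minimal sketch of the dual-track idea: every turn is appended in full to the local file and in redacted form to the file the cloud model may read. The file names `MEMORY-FULL.md` and `MEMORY.md` follow the article; the single phone-number redaction rule and the function name are placeholder assumptions.

```python
import re
from pathlib import Path

# Placeholder redaction rule; a real system would reuse the full
# de-identification engine from the privacy router.
PHONE = re.compile(r"\b1[3-9]\d{9}\b")

def append_turn(workdir: Path, role: str, text: str) -> None:
    """Append one conversation turn to both memory tracks."""
    entry = f"**{role}**: {text}\n"
    # Local track: complete history, never uploaded.
    with open(workdir / "MEMORY-FULL.md", "a", encoding="utf-8") as f:
        f.write(entry)
    # Cloud-visible track: same entry with sensitive spans redacted.
    redacted = PHONE.sub("[REDACTED:PHONE]", entry)
    with open(workdir / "MEMORY.md", "a", encoding="utf-8") as f:
        f.write(redacted)
```

Because the cloud model's context is only ever built from the redacted track, private details cannot leak through the context window even across long sessions.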

Composable Pipelines and Visual Dashboard

Every developer and team has unique needs. To address this, ClawXRouter provides:

Composable Routing Pipelines: Privacy routing and cost-performance-aware routing operate within the same pipeline, following the safety-first principle. The privacy router runs first with high priority, directly short-circuiting upon detecting sensitive data; only after passing safety checks does the cost-performance router activate to optimize expenses. The entire pipeline covers the complete lifecycle from model selection to session termination through 10 Hooks, non-invasively taking over OpenClaw's original workflow.
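The safety-first pipeline described above can be sketched as an ordered list of routers where the first one to return a decision short-circuits the rest. The `Decision` shape, router names, and fallback policy here are illustrative assumptions, not OpenClaw's hook API.

```python
from typing import Callable, Optional

Decision = str  # e.g. "local", "cloud", "cloud-cheap"
Router = Callable[[str], Optional[Decision]]

def privacy_router(msg: str) -> Optional[Decision]:
    """Highest priority: force local execution on anything private."""
    return "local" if "PRIVATE KEY" in msg else None  # toy detector

def cost_router(msg: str) -> Optional[Decision]:
    """Only reached after privacy checks pass; picks the cheaper path."""
    return "cloud-cheap" if len(msg) < 200 else "cloud"

# Order encodes priority: privacy first, cost optimization second.
PIPELINE: list[Router] = [privacy_router, cost_router]

def route(msg: str) -> Decision:
    for router in PIPELINE:
        decision = router(msg)
        if decision is not None:
            return decision        # short-circuit on the first decision
    return "local"                 # conservative fallback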

Visual Dashboard: Supports both Chinese and English, featuring five panels for usage overview, session records, detection logs, routing rule configuration, and model configuration. All changes take effect immediately without requiring restarts, allowing users to flexibly adjust according to their needs.


Quick Start Guide

```bash
# Prerequisite: OpenClaw installed

# Install via pnpm (recommended)
pnpm add -w @openbmb/clawxrouter

# Or install via ClawHub
openclaw plugins install clawhub:clawxrouter

# (Optional) Install a local inference backend
ollama pull openbmb/minicpm4.1
ollama serve

# Start the gateway
openclaw gateway

# Dashboard → http://127.0.0.1:18789/plugins/clawxrouter/stats
```

The cloud is too risky and too expensive; local models don't work well enough. ClawXRouter's answer is that there is no need to choose: let cloud and edge each play to their strengths. The project will continue to iterate in the open, and developers and industry partners are welcome to contribute and jointly build a secure, efficient cloud-edge collaborative Agent ecosystem.

About Us

The MLNLP Community is a grassroots academic community jointly built by machine learning and natural language processing scholars from around the world. It has grown into a renowned ML and NLP community domestically and internationally, aiming to promote progress among academia, industry, and enthusiasts in machine learning and natural language processing.

The community provides an open exchange platform for professionals' advanced studies, employment, and research. We welcome everyone to follow and join us.

