OpenClaw open-source contribution and memory-system rebuild

Problem

Upstream memory is single-layer vector retrieval: no forgetting mechanism, no proactive extraction, and when the embedding service fails, the whole memory system fails with it. Long conversations also blow past the token budget, and upstream underestimates CJK token counts by roughly 40%, which makes Chinese-language scenarios noticeably worse.

Approach

Start by earning trust in the small: submit a PR fixing a misconfigured MiniMax API endpoint, which has already been merged upstream. Then fork and customize deeply:

(1) Rebuild the single-layer vector retrieval into a three-tier cognitive memory architecture (retrieval-engine layer / cognitive-memory layer / scheduling layer), adding forgetting and proactive extraction.
(2) Add a four-tier retrieval fallback chain: embedding → fallback provider → keyword-only → SQL LIKE multi-token scoring, so when any tier fails the next one catches it.
(3) Add four-layer context management: entry truncation (60% head + 30% tail, with full content persisted to disk and readable on demand) → three-stage progressive trimming → persisted-session cleanup (atomic writes that replace digested, redundant tool output) → CJK-aware token budgeting.

What this project proves

Writing a project from scratch makes it easy to stay in your own comfort zone. The OpenClaw work is hard in a different way: it means reading the internal architecture of an active open-source Agent project, identifying its structural problems, and then fixing them.

What I changed

Memory: from one tier to three. The upstream version is essentially "stuff it into the vector store, pull it back by similarity", with no forgetting and no proactive extraction. After the rebuild it splits into a retrieval-engine layer, a cognitive-memory layer, and a scheduling layer, so memory finally has a lifecycle.
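To make "forgetting" concrete, here is a minimal Python sketch of one common policy: a retention score that decays exponentially since the last access, with entries dropped once they fall below a threshold. This write-up does not say which policy the fork uses, so everything here (MemoryItem, retention, forget, the one-week half-life, the 0.1 threshold) is a hypothetical illustration rather than the project's actual code.

```python
from dataclasses import dataclass
from typing import List

WEEK_SECONDS = 7 * 24 * 3600  # assumed half-life; not taken from the project


@dataclass
class MemoryItem:
    text: str
    importance: float   # assumed to be in the range 0..1
    last_access: float  # Unix timestamp in seconds


def retention(item: MemoryItem, now: float, half_life_s: float = WEEK_SECONDS) -> float:
    """Importance halved once per half-life elapsed since the last access."""
    age = max(0.0, now - item.last_access)
    return item.importance * 0.5 ** (age / half_life_s)


def forget(items: List[MemoryItem], now: float, threshold: float = 0.1,
           half_life_s: float = WEEK_SECONDS) -> List[MemoryItem]:
    """Keep only the items whose retention score is still at or above the threshold."""
    return [m for m in items if retention(m, now, half_life_s) >= threshold]
```

In a layout like the one described, a scheduling layer would be the natural place to call something like forget() periodically, and reading a memory would refresh its last_access so that frequently used entries survive.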

Availability: four-tier retrieval fallback. In the upstream version, the moment the embedding service goes down, memory is entirely unusable. After the rebuild: embedding → fallback provider → keyword-only → SQL LIKE multi-token scoring, each tier catching the one above. (Look familiar? The "real AI → demo mode" fallback in this site's chat path is the same design philosophy.)
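The chain above can be sketched in Python as an ordered list of retrievers, where a tier that raises simply hands the query to the next one. The last tier is shown as SQL LIKE multi-token scoring over SQLite. The function names, the memories(id, content) table, and the scoring rule (one point per matching token) are assumptions made for illustration, not the fork's real schema or API.

```python
import sqlite3
from typing import Callable, List, Optional, Sequence, Tuple

Retriever = Callable[[str], List[str]]


def retrieve_with_fallback(
    query: str, tiers: Sequence[Tuple[str, Retriever]]
) -> Tuple[Optional[str], List[str]]:
    """Try each tier in order; return (tier_name, results) from the first that does not raise."""
    for name, retriever in tiers:
        try:
            return name, retriever(query)
        except Exception:
            continue  # this tier is down, fall through to the next one
    return None, []


def like_search(conn: sqlite3.Connection, query: str, limit: int = 5) -> List[str]:
    """Last-resort tier: score each row by how many query tokens it contains."""
    tokens = query.split()
    if not tokens:
        return []
    # Each "(content LIKE ?)" evaluates to 0 or 1 in SQLite, so the sum is the match count.
    # Note: '%' and '_' inside a token act as wildcards; a real implementation would escape them.
    score = " + ".join("(content LIKE ?)" for _ in tokens)
    sql = (
        f"SELECT content FROM (SELECT id, content, {score} AS score FROM memories) "
        "WHERE score > 0 ORDER BY score DESC, id LIMIT ?"
    )
    params = [f"%{t}%" for t in tokens] + [limit]
    return [row[0] for row in conn.execute(sql, params)]
```

A caller would pass the tiers in priority order, for example [("embedding", ...), ("fallback", ...), ("keyword", ...), ("like", lambda q: like_search(conn, q))]. Returning the tier name alongside the results makes it easy to log how often the system is running degraded.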

Context: four-layer management. Entry truncation keeps the head 60% and tail 30% while persisting full content to disk to read on demand; three-stage progressive trimming; persisted-session cleanup that uses atomic writes to replace digested, redundant tool output; and finally CJK-aware token budgeting. Upstream's ~40% underestimate for Chinese throws off every strategy that depends on the token count, so this correction is the prerequisite for the whole thing being usable in Chinese.
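Two of these layers are small enough to sketch in Python: head/tail entry truncation and a CJK-aware token estimate. The sketch assumes the 60% and 30% are fractions of a per-entry character budget, and the ratios of about 4 characters per token for non-CJK text and about 1 token per CJK character are rough illustrative guesses, not values measured from any tokenizer or taken from the fork. All names here are hypothetical.

```python
import math

TRUNCATION_MARKER = "\n[... truncated; full content persisted to disk ...]\n"


def truncate_entry(text: str, max_chars: int = 1000,
                   head_ratio: float = 0.6, tail_ratio: float = 0.3) -> str:
    """Keep the head and tail of an oversized entry and drop the middle."""
    if len(text) <= max_chars:
        return text
    head = int(max_chars * head_ratio)
    tail = int(max_chars * tail_ratio)
    # len(text) - tail (rather than -tail) stays correct when tail == 0
    return text[:head] + TRUNCATION_MARKER + text[len(text) - tail:]


def is_cjk(ch: str) -> bool:
    """True for common CJK ideographs, kana, and Hangul syllables (not exhaustive)."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
        or 0x3400 <= cp <= 0x4DBF   # CJK Extension A
        or 0x3040 <= cp <= 0x30FF   # Hiragana and Katakana
        or 0xAC00 <= cp <= 0xD7AF   # Hangul syllables
    )


def estimate_tokens(text: str, chars_per_token: float = 4.0,
                    tokens_per_cjk_char: float = 1.0) -> int:
    """Estimate tokens by counting CJK characters separately from everything else."""
    cjk = sum(1 for ch in text if is_cjk(ch))
    other = len(text) - cjk
    return math.ceil(cjk * tokens_per_cjk_char + other / chars_per_token)
```

A flat characters-divided-by-four heuristic would count a four-character Chinese phrase as a single token, which is the kind of undercount that lets a Chinese conversation overrun a budget that looks safe on paper.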

It started with a merged PR

The PR fixed a misconfigured MiniMax API endpoint. It was a tiny change, but it went through the full open-source collaboration loop: spot the problem, locate it, open the PR, pass review, get merged.