At a glance
Target role: LLM / Agent application development (internship + full-time)
Education: Monash AI master's · graduating 2026.07
Base: Chengdu · open to remote
Availability: Remote internship now → full-time on graduation
Résumé ↓

AI / Agent application developer · Monash AI master's · Class of 2026

Huang Yihang

Turning LLMs into Agents that actually get things done.

From multi-agent cross-validation to desktop-grade autonomous execution — memory architecture, safety sandboxes, tool calls, all the way to production.

Ask what he's built…

AI twin · tap to start chatting

The headlines first.


Flagship open-source project · solo build · MIT, free and open

NoWorries

Say it once. Leave the rest to it.

NoWorries
Turn this month's invoices into an Excel report
Excel work: Add SUM formulas · Analyze data + build charts · VLOOKUP matching · Pivot tables
Word documents: Draft a formal report · Write a weekly update · Take meeting minutes · Polish your prose
Cross-file workflows: Excel data → analysis report · PDF tables → Excel · Batch-generate payslips
Everyday office: Write business emails · Summarize documents · Extract contract clauses
1. Plan the task: Read invoices → parse fields → build the report
2. Tool Calling: excel_writer.create("invoice_summary.xlsx")
Done: Report built, originals backed up automatically
invoice_summary.xlsx · built · one-click rollback

Electron + TypeScript + Python, built from scratch by one person.

One sentence — it plans, calls tools, and executes across multiple steps on its own.

Files never leave your machine. Every step can be rolled back.

Memory system

Three tiers of memory. The more you use it, the better it knows you.

Local SQLite · vector embeddings + semantic search · incremental summarization · emotional tagging

Instant memory: The current conversation, saved in real time
Episodic memory: Tasks and events, archived by month
Core memory: Preferences and habits, semantically searchable

Autonomous task planning: One sentence → break it into steps → decide the order and timing of calls itself, executing across multiple steps autonomously
Function calling: Discovers and calls Excel / Word / PPT / file-system tool modules dynamically at runtime
Cross-file workflows: Excel data → analysis report, PDF tables → Excel, batch-generated payslips, with multiple files working in concert
Desktop-grade, shipped: Installed locally, files never leave your machine, running on your own model key (BYOK)

Safety sandbox

Worst case, you lose a couple of minutes.

Workspace isolation, automatic backups before any change, dangerous-command blocking, sensitive-path protection, a full operation log, and one-click undo. Deleted something? There's a backup. Changed the wrong thing? It was working on a copy. Even first-timers can hand it real work.
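
To make "automatic backups before any change" and "one-click undo" concrete, here is a minimal sketch of the idea, not NoWorries' actual code. The function names (`inside_workspace`, `safe_write`, `rollback`), the `.backups` folder, and the `.bak` suffix are all assumptions made for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def inside_workspace(path: Path, workspace: Path) -> bool:
    """True only if path resolves to somewhere under workspace."""
    try:
        path.resolve().relative_to(workspace.resolve())
        return True
    except ValueError:
        return False


def safe_write(target: Path, new_text: str, workspace: Path) -> Path:
    """Back up the original, then overwrite it. Returns the backup path."""
    if not inside_workspace(target, workspace):
        raise PermissionError(f"{target} is outside the workspace")
    backup_dir = workspace / ".backups"  # hypothetical backup location
    backup_dir.mkdir(exist_ok=True)
    backup = backup_dir / (target.name + ".bak")
    shutil.copy2(target, backup)  # copy first, change second
    target.write_text(new_text, encoding="utf-8")
    return backup


def rollback(target: Path, backup: Path) -> None:
    """Restore the file from its backup."""
    shutil.copy2(backup, target)


# Demo in a throwaway workspace.
workspace = Path(tempfile.mkdtemp())
report = workspace / "report.txt"
report.write_text("original", encoding="utf-8")

backup = safe_write(report, "edited", workspace)
edited_text = report.read_text(encoding="utf-8")
rollback(report, backup)
restored_text = report.read_text(encoding="utf-8")
```

The ordering is the point: the copy happens before the write, so a wrong edit costs one `rollback` call rather than the file.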

Skills system

Learning a new skill is as easy as installing an app.

Skills are plug-and-play, community-shared, custom-buildable, and versioned. They range from professional Excel / Word / PPT handling to automating internal company workflows, and are discovered and registered automatically at runtime.
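
One common way to get "registered automatically at runtime" is a decorator-based registry. The sketch below shows that pattern only; it is an assumption about the general shape, not the NoWorries skills API. `SKILLS`, `skill`, `call_skill`, and the skill name `excel.sum` are hypothetical.

```python
from typing import Callable, Dict

# Hypothetical in-memory registry: skill name -> callable.
SKILLS: Dict[str, Callable[..., object]] = {}


def skill(name: str):
    """Decorator that registers a function as a skill when its module loads."""
    def register(fn: Callable[..., object]) -> Callable[..., object]:
        SKILLS[name] = fn
        return fn
    return register


@skill("excel.sum")
def excel_sum(values):
    """Toy skill: add up a column of numbers."""
    return sum(values)


def call_skill(name: str, **kwargs):
    """Look a skill up by name and call it, as an agent's tool call would."""
    if name not in SKILLS:
        raise KeyError(f"unknown skill: {name}")
    return SKILLS[name](**kwargs)


total = call_skill("excel.sum", values=[1, 2, 3])
```

With this shape, installing a skill amounts to importing its module: the decorator runs at import time and the skill becomes callable by name.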

Use your own key, no subscription fee — Zhipu GLM · Tongyi Qianwen · Doubao · Kimi · DeepSeek · Ollama (local, offline)

“The point of technology isn't to make people busier — it's to earn people the right to have no worries.”

Open-source contribution · OpenClaw

Understood OpenClaw's memory system, then rewrote the whole thing.

Upstream used single-layer vector retrieval — one embedding failure and it all collapses. I rebuilt it into three-tier cognitive memory + a four-tier fallback, so it can forget and always recover.

01 · embedding retrieval · vector semantic match
02 · fallback provider · backup embedding service
03 · keyword-only · keyword retrieval
04 · SQL LIKE scoring · multi-token fallback

Degrade tier by tier — always an answer.
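
The four tiers above can be sketched as an ordered chain where each tier is tried only if the previous one fails or comes back empty. This is a minimal illustration under stated assumptions, not the code in the OpenClaw rewrite: the embedding tiers are stand-ins that simulate outages, and the table, notes, and function names are hypothetical.

```python
import sqlite3
from typing import Callable, List, Optional, Tuple

NOTES = ["user prefers weekly Excel reports", "meeting moved to Friday"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")
conn.executemany("INSERT INTO notes VALUES (?)", [(n,) for n in NOTES])


def embedding_retrieval(query: str) -> List[str]:
    raise ConnectionError("primary embedding service down")  # simulated outage


def fallback_provider(query: str) -> List[str]:
    raise ConnectionError("backup embedding service down")  # simulated outage


def keyword_only(query: str) -> List[str]:
    # Whole-phrase match; misses when the word order differs.
    return [n for n in NOTES if query.lower() in n.lower()]


def sql_like_scoring(query: str) -> List[str]:
    # Score each row by how many query tokens it contains.
    scores = {}
    for token in query.lower().split():
        rows = conn.execute(
            "SELECT body FROM notes WHERE lower(body) LIKE ?", (f"%{token}%",)
        )
        for (body,) in rows:
            scores[body] = scores.get(body, 0) + 1
    return sorted(scores, key=scores.get, reverse=True)


TIERS: List[Tuple[str, Callable[[str], List[str]]]] = [
    ("embedding_retrieval", embedding_retrieval),
    ("fallback_provider", fallback_provider),
    ("keyword_only", keyword_only),
    ("sql_like_scoring", sql_like_scoring),
]


def retrieve(query: str) -> Tuple[Optional[str], List[str]]:
    """Return (tier name, hits) from the first tier that yields results."""
    for name, tier in TIERS:
        try:
            hits = tier(query)
        except Exception:
            continue  # this tier failed; degrade to the next one
        if hits:
            return name, hits
    return None, []


tier_used, hits = retrieve("reports excel")
```

In this run both embedding tiers fail and the phrase match misses, so the answer comes from the last tier.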

PR merged upstream · fixed the MiniMax API endpoint config

Three-tier cognitive memory

Rebuilt upstream's single-layer vector retrieval into three tiers — retrieval engine / cognitive memory / scheduling — so memory can forget and actively surface what matters.

Four-layer context management

Entry-point truncation, three-stage progressive trimming, persistent-session cleanup, plus a CJK-aware token budget — correcting roughly 40% token underestimation in Chinese-language cases.
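
Why Chinese text gets underestimated is easy to show: a "characters divided by 4" rule is tuned for English, while a CJK character often costs about one token on its own. The sketch below illustrates a CJK-aware estimate. The per-character weights and Unicode ranges are assumptions chosen for illustration, not the values used in the OpenClaw change, and it does not reproduce the 40% figure.

```python
import math


def is_cjk(ch: str) -> bool:
    """Rough check for common CJK, kana, and hangul code points."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
        or 0x3400 <= cp <= 0x4DBF   # CJK Extension A
        or 0x3040 <= cp <= 0x30FF   # Hiragana and Katakana
        or 0xAC00 <= cp <= 0xD7AF   # Hangul syllables
    )


def naive_estimate(text: str) -> int:
    """English-tuned heuristic: about 4 characters per token."""
    return math.ceil(len(text) / 4)


def cjk_aware_estimate(
    text: str,
    cjk_tokens_per_char: float = 1.0,  # assumed weight
    chars_per_token: float = 4.0,      # assumed weight for non-CJK text
) -> int:
    """Count CJK characters separately from everything else."""
    cjk = sum(1 for ch in text if is_cjk(ch))
    other = len(text) - cjk
    return math.ceil(cjk * cjk_tokens_per_char + other / chars_per_token)


english = "hello world!"
chinese = "你好世界"
```

For the English string both estimates agree; for the Chinese string the naive rule reports far fewer tokens than the CJK-aware one, which is the gap a context-trimming budget has to correct for.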

Open framework · designed solo

Sprout — a task tree that grows itself.

Every Agent decides its own splits — a tree topology, unbounded depth, truly recursive.

Every Agent decides for itself whether to split.

Stuck? Detect it, cancel it, re-split.

Done nodes die, results bubble up.

Same task (write 4 independent Python modules), under a capped single-call budget: a single Agent scores just 25, while Sprout takes the full 100.

Scored automatically by a programmatic rubric, out of 100, reproducible. The single Agent exhausts its budget before finishing and is truncated by the token limit; after Sprout splits, each child node gets its own budget, and all 4 modules ship. The core value isn't parallel speedup; it's getting around the token / attention bottleneck of a single call.

Decision

Two-phase Worker

analyze() uses a lightweight call to first judge “should this split?”, then execute() does the real work. Separating analysis from execution makes the split decision sharper.

Emergence

Approach injection

When a parent splits, it generates a methodology and focus for each subtask and injects them into the child Agent's system prompt — roles emerge from the task rather than being predefined.

Boundaries

Safety bounds

max_depth · max_children · max_total_nodes · max_total_tokens — four ceilings to keep the tree from exploding; 24 unit tests cover the core modules.
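
A minimal sketch of how the four ceilings could gate a split decision. Only the four limit names come from Sprout as described above; the default values, `TreeState`, and `can_split` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class TreeLimits:
    max_depth: int = 3             # assumed default
    max_children: int = 4          # assumed default
    max_total_nodes: int = 20      # assumed default
    max_total_tokens: int = 50_000  # assumed default


@dataclass
class TreeState:
    total_nodes: int = 1   # the root counts as one node
    total_tokens: int = 0  # tokens spent so far across the tree


def can_split(
    depth: int,
    n_children: int,
    est_tokens: int,
    state: TreeState,
    limits: TreeLimits,
) -> Tuple[bool, str]:
    """Check a proposed split against every ceiling; name the one that blocks it."""
    if depth + 1 > limits.max_depth:
        return False, "max_depth"
    if n_children > limits.max_children:
        return False, "max_children"
    if state.total_nodes + n_children > limits.max_total_nodes:
        return False, "max_total_nodes"
    if state.total_tokens + est_tokens > limits.max_total_tokens:
        return False, "max_total_tokens"
    return True, "ok"


limits = TreeLimits()
state = TreeState()
allowed = can_split(depth=0, n_children=4, est_tokens=1_000, state=state, limits=limits)
too_deep = can_split(depth=3, n_children=2, est_tokens=1_000, state=state, limits=limits)
too_wide = can_split(depth=0, n_children=5, est_tokens=1_000, state=state, limits=limits)
```

Returning the name of the limit that blocked the split makes it easy to log why a node had to do the work itself instead of delegating.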

ContractLens · built with a practising lawyer

Read a 348-page contract down to one clear page.

AI review of Victorian (Australia) property contracts — a 10-stage pipeline, 7 AI analysts in parallel.

Contract of Sale · 348 pages · OCR
Particulars
Special Conditions
Section 32
Title & Plan
OC certificates
Council / Water
Lease
Review report · 3 Graham Rd · Complete
SC 5.1 · Vendor waives all lease warranties
Certificates overdue · 6 certificates need renewal
Section 32 · Mandatory disclosure missing, with source page numbers
91 findings · each cited to the source · ~$1 per contract

Upload the contract PDF; the rule engine splits it automatically.

7 AI analysts get to work in parallel.

A one-page report — every conclusion cited to the source.

Anti-hallucination

Every conclusion holds up to scrutiny.

Mandatory source citations + rapidfuzz fuzzy-match verification + targeted retries only on failed citations; output then passes two compliance gates (regex + AI semantic review), ruling out “AI-lawyer” overreach.
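
The citation check can be sketched as follows: each finding must quote the page it cites, the quote is fuzzy-matched against that page's text, and only the findings that fail are sent back for retry. ContractLens uses rapidfuzz; this sketch swaps in the standard library's `difflib.SequenceMatcher` so it runs without extra installs. The threshold, field names, and sample data are assumptions.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so layout differences don't matter."""
    return " ".join(text.lower().split())


def best_match_ratio(quote: str, page_text: str) -> float:
    """Best similarity between the quote and any same-length window of the page."""
    quote, page = normalize(quote), normalize(page_text)
    if not quote or not page:
        return 0.0
    if len(quote) >= len(page):
        return SequenceMatcher(None, quote, page).ratio()
    window = len(quote)
    return max(
        SequenceMatcher(None, quote, page[i:i + window]).ratio()
        for i in range(len(page) - window + 1)
    )


def failed_citations(
    findings: List[dict], pages: Dict[int, str], threshold: float = 0.85
) -> List[dict]:
    """Return only the findings whose quote cannot be found on the cited page."""
    return [
        f for f in findings
        if best_match_ratio(f["quote"], pages.get(f["page"], "")) < threshold
    ]


# Page text with a simulated OCR error ("Vend0r").
pages = {12: "The Vend0r waives all warranties in relation to the lease."}
findings = [
    {"id": "A", "page": 12, "quote": "vendor waives all warranties"},
    {"id": "B", "page": 12, "quote": "purchaser must pay stamp duty within 30 days"},
]
to_retry = failed_citations(findings, pages)
```

Finding A survives the OCR typo because the match is fuzzy; finding B quotes text that is not on the page, so it alone is queued for a targeted retry.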

Calibrated on reality

Benchmarked against real lawyer reports.

4 real contracts (104–348 pages, including scanned OCR and a 5-address mixed-title case) ran end to end and were compared line by line against a practising lawyer's review. About 30 findings matched the lawyer's report, and the pipeline also surfaced 2 ACN discrepancies that the lawyer's report had not itemized, each confirmed line by line with the lawyer.

Next.js 15 · FastAPI · LangGraph · Claude tiered calls (Opus / Sonnet / Haiku) · Supabase · PyMuPDF + Tesseract OCR

Proven in production.

Two internships, putting Agents into real business.

Sugon · Agent Development Intern

2025.12 – 2026.02

Agent application development: Built an intelligent HR Agent and enterprise RAG Q&A on Dify, with vector-store integration and chunking-strategy tuning, reaching 85%+ Q&A accuracy.

Multi-agent cross-validation: Three specialized review Agents independently assess SFT training items, with structured scoring + conflict arbitration for joint decisions, cutting manual review cost significantly.

SFT data engineering and QC: An automated QC workflow on Feishu validates 500+ items a day, cutting manual effort from 3 hours to 10 minutes; reviewed Tool Calling and CoT correctness line by line.

Fantuan · AI Product Intern

2024.11 – 2025.02

Product iteration and validation: Drove 4 versions of the AceEssay AI-reduction tool, with a dual Turnitin / GPTZero evaluation framework, bringing the AI-detection rate from 100% down to 10–20%.

Content growth and SEO: 60+ pieces of content drove 75K site visits and grew followers from 0 to nearly 30K; core-keyword ranking went from 48 to 9, with organic traffic up roughly 3× month over month.

Easter egg · this very website

This keynote is itself an exhibit.

Read the full build log (in Chinese) ›

This site was built with Claude Code, and every step from one vague request to launch is on the record: the real prompts, the time each stage took, and every time the AI went off the rails and how I corrected it. All of it is public in the build log (written in Chinese).

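001 · Starting point: one vague request, and a deep-research run with 104 agents
Starting from "I want to build a website that introduces me," a multi-agent research workflow mapped how this is done in the Chinese- and English-speaking worlds and produced 12 cross-validated conclusions.
2026-06-11 · about 25 minutes (of which the research workflow ran automatically for about 12 minutes)

002 · Plan: four decision questions, and the counterintuitive choice not to use RAG
The human made the four key decisions and the AI produced the full implementation plan; the most important architectural decision was that v1 deliberately uses no RAG.
2026-06-11 · about 20 minutes

003 · Scaffolding: the ecosystem outran the plan, and the AI SDK installed straight at v6
Started the project with create-next-app; the plan guarded against outdated v5 patterns, but what got installed was v6. Resolved by "read the type definitions first, then write the code."
2026-06-11 · about 15 minutes

004 · Content architecture: one data source feeds both the pages and the AI
content/ drives both page rendering and the system prompt; a single check:content command catches every leftover placeholder.
2026-06-11 · about 25 minutes

005 · Chat end to end: a four-scenario verification matrix, and a textbook Node pitfall
Real-AI and demo modes stream over the same protocol, with fallback working end to end; in the mock server, one req.on('close') attached in the wrong place kept the response from sending a single byte.
2026-06-11 · about 50 minutes (about 15 of them debugging the mock hang)

006 · Pages, the architecture page, and full-site verification: letting the site prove itself
Five pages plus a prompt showcase page driven by the same source all landed; lint, type checks, and build all green; chat interaction and mobile tested in a real browser, with zero console errors.
2026-06-11 · about 50 minutes

007 · Wiring in real AI: DeepSeek v4 is a reasoning model and nearly ate the whole token budget
Swapped demo mode for a real LLM; found that v4-flash reasons by default, so all 30 tokens went to reasoning and the answer body came back empty. Turned it off with the thinking parameter and exposed vendor-specific fields through a generic env passthrough.
2026-06-12 · about 30 minutes

008 · Decision change: the source code stays private for now
After re-weighing, decided not to open-source this site's code for now; under the "the website must not lie" principle, changed the wording site-wide to "design is public, source available on request."
2026-06-12 · about 10 minutes

009 · Visual redesign: escaping the "AI default aesthetic," and giving up a new feature for SEO
The site owner named the problem: the default style instantly reads as "generated by AI from one sentence." Switched to an Apple-keynote style with zero-dependency motion; midway, found scroll-driven animations unreliable for crawlers and dropped them for IntersectionObserver.
2026-06-12 · about 70 minutes

010 · Keynote redesign: the fifth product is the site itself, and a run-in with React 19 hydration
The site owner signed off on an Apple-keynote-style homepage using a single-file prototype; the debate over where the chat island lives ended with "an input pill in the hero + the real window in the finale." During implementation, hit a hard pitfall where React 19 hydration wipes out DOM changes made by inline scripts, and along the way uncovered the same latent bug in the old reveal system.
2026-06-13 · about 2 hours (about 30 minutes of design discussion + about 90 minutes of implementation and verification)

011 · Recruiter-friendly features: research how recruiters actually behave first, then decide what to build
The site owner wanted HR-friendly features but admitted not knowing the territory, so first ran a multi-agent research round on real recruiter behavior, privacy red lines, and the Chinese context, then shipped a batch based on it: an at-a-glance hiring bar above the fold, HR-screening chips for the AI twin, one-click copy of a candidate summary, chat export, .vcf + QR code, and a zero-tracking privacy page. All of it fits the zero-database approach and deliberately avoids the red line of tracking recruiter identity.
2026-06-14 · about 1.5 hours (about 10 minutes of multi-agent research + about 80 minutes of implementation and verification)

012 · Back to pure system fonts: correcting a case of "the docs say one thing, the code does another"
A multi-agent review found that the site claimed "system font stack, nothing embedded," while the code self-hosted Geist through next/font and blocking-preloaded two woff2 files, holding up the LCP of the large hero text. Removed Geist and went back to a pure system stack: a line-level change that removed both the contradiction and one font round trip.
2026-06-14 · about 10 minutes

013 · English version launch: an RSC serialization lesson where types passed and only the build failed
Added Chinese/English switching to the main site: a single layout + /en route + lang wrapper (no i18n routing middleware), getContent(lang) as the single data source, and every component taking a content prop. The most painful pitfall: a function was tucked into the UI copy; tsc passed cleanly, and only next build prerendering blew up with "functions cannot cross the RSC → client boundary." The AI twin answers on /en with an English system prompt, and the four-scenario verification matrix is all green.
2026-06-14 · about 1.5 hours

014 · From "zero logging" to "lightweight logging": making visitor analytics a compliant design with no gaps
The site owner wanted to know which recruiters had visited and what they asked. A multi-agent workflow first established that reverse IP-to-company lookup is largely useless for small companies in China and a GDPR minefield overseas; on that basis the plan narrowed to "voluntarily submitted contact details for the 'who' + coarse city as a weak signal," followed by a multi-jurisdiction compliance review. The review caught three real problems: a "same deployment" hole where the notice and the privacy page said opposite things, a weak default salt that allowed IPs to be reverse-derived, and a retention period that did not actually delete data. All fixed, with an LIA + PIPL self-assessment attached.
2026-06-14 · about 2 hours

015 · Chat "expands downward in place" + Markdown rendering for answers: three versions to get it smooth
Originally, tapping "Ask the AI twin" was an anchor jump to the finale at the bottom, skipping the six screens in between. It took three versions: (1) a centered modal, rejected by the site owner (it should open in place, not as a popup); (2) exit the theater + scroll to top and expand in place, rejected (it jumps around); (3) final: a dark chat card anchored with fixed positioning at the tapped pill, growing smoothly from the top down via clip-path, with scroll locked and no jump. Separately, Markdown in the twin's answers had been displayed as plain text all along; wired in react-markdown.
2026-06-17 · about 2.5 hours

016 · Chat, version four: a genuinely in-flow, gentle expansion + a light card redesign
The site owner found v3's fixed overlay card animation stiff: it should be gentle, expand in place within the page, and push the content below down automatically. Rebuilt as true in-flow insertion + a gentle grid-rows expansion. The second acceptance round made three more cuts: a black card on the white hero looked bad → fixed light card variant; the short card folded its options → taller card, smaller text; the two-line disclaimer → removed (the compliance notice stays in the finale).
2026-07-04 · about 1.5 hours (about 1 hour on the expansion rebuild + about 30 minutes on color and information density)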

Tech specs.

Agents & LLMs: Agent architecture design · multi-agent orchestration · Tool / Function Calling · RAG and vector retrieval · Prompt Engineering / CoT · Dify workflow orchestration
Model APIs: OpenAI · Claude · Gemini · Zhipu · DeepSeek
Languages & frameworks: Python · TypeScript · JavaScript · Java · Electron · Flask · PyTorch · Transformers
Engineering: Git / GitHub open-source collaboration · Feishu Open Platform API · MySQL / SQLite
Education:
Monash University (QS 37) · Master of Artificial Intelligence · 2024.07 – 2026.07
Tianjin University of Technology · BSc, Data Science and Big Data Technology · 2019.09 – 2023.07
Languages: Chinese (native) · professional working English for communication and technical writing (PTE 61)

One more thing.

This one is alive.

Every product demo above was an animation. This window is the real thing — my AI twin is live: it answers from a real résumé, and says so plainly when something's outside what it knows. How it works ›

Hi — I'm the AI twin of the person behind this site. I answer from their real résumé and project history, and I'll tell you straight when something isn't in my notes. Start with one of these:

Tell me about yourself

I'm Huang Yihang's AI twin. He's an AI master's student at Monash University (graduating July 2026), targeting LLM / Agent application development roles, based in Chengdu and open to remote. He did an Agent development internship at Sugon, independently built the open-source desktop AI assistant NoWorries, landed a merged PR on the open-source project OpenClaw, and wrote his own multi-agent framework, Sprout. Want the full picture? See the Projects and About pages.

When can you start?

He can start a remote internship right now; for full-time, he's available as soon as he finishes his master's in July 2026. He's based in Chengdu, comfortable working remotely, and the overseas degree certification doesn't slow down a remote start. In short: internship anytime, full-time the moment he graduates.

Are you open to remote work?

Yes—remote-friendly, and happy to come on-site or travel when it matters. He's based in Chengdu and open to both internship and full-time roles. The remote toolchain (Feishu, Git, async communication) is something he's actually run, both during his Sugon internship and while building this site collaboratively.

What sets you apart?

In one line: building from zero, reading and refactoring someone else's system, and shipping Agents into real enterprise workflows—he has verifiable work in all three. NoWorries is an open-source desktop Agent he built solo (three-tier memory + safety sandbox). OpenClaw was about understanding an active official project and refactoring its memory system, with the PR merged. At Sugon, he landed multi-agent cross-validation inside a real SFT quality-control pipeline. Most candidates can show one of the three. He has the real thing in all three.

Pick a project and tell me about a hard bug

Here's a real one: while refactoring OpenClaw, he found the official token estimate for Chinese (CJK) was off by roughly 40% on the low side—which threw off every upstream context-trimming strategy and blew the token budget constantly in Chinese scenarios. He added a CJK-aware token-budget correction layer to stabilize it. More wrong turns and corrections (including the ones the AI itself made) are all laid out in the build log—ask me about any detail and I'll take it down to the mechanism level.

How do I reach you?

Email: 1653120857@qq.com, GitHub: github.com/hlbbbbbbb. You'll also find every contact method on the About page, or you can download the résumé PDF / save the digital business card. And if you're a recruiter—he replies a lot faster than I do :)

AI-generated answers can be off — for anything that matters, go by the résumé and a conversation with me directly. Chats may be logged to improve the twin.

Fallback chain: real AI → FAQ demo → static page · fully readable with no JS

Let's build something that thinks.

Graduating with my master's in 2026.07, looking for a full-time LLM / Agent application development role (open to interning now). Send an email — I'll reply fast.

For recruiters · grab it in one click

Save contact (.vcf)
Scan to open on your phone