
Titans Part 3: HOPE Architecture - From Paper to Reality (and Back)

This is Part 3 of my Titans implementation series. Part 1 covered the basic memory mechanism. Part 2 focused on performance optimization. This post tackles the full HOPE (Hierarchical Optimized Parallel Encoding) architecture - the multi-level memory system that makes Titans truly interesting - and the hard lessons learned when evaluation results didn’t match expectations.

Implementing Titans - Learning to Memorize at Test Time

Context engineering has become extremely intriguing to me recently as I've been hands-on building a couple of agentic platform projects. One aspect of context engineering is memory and continuous learning; Manus shared an excellent write-up on this from the harness perspective, Context Engineering for AI Agents. Then Google Research dropped a paper in January 2025 that caught my attention: Titans: Learning to Memorize at Test Time. The core idea is elegant - give transformers a learnable memory that updates during inference, not just training. I decided to implement it from scratch to understand it deeply.
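To make "memorize at test time" concrete, here is a minimal PyTorch sketch of that core idea: a small memory module whose weights are nudged by gradient descent during inference. The class name, the two-layer MLP memory, and the hyperparameters are illustrative choices of mine, not the paper's exact formulation (which adds momentum-based surprise and data-dependent gates).

```python
# Minimal sketch of the Titans core idea: a memory whose weights are updated
# by gradient descent at inference time. Names and hyperparameters here are
# illustrative, not the paper's exact setup.
import torch
import torch.nn as nn

class NeuralMemory(nn.Module):
    def __init__(self, dim: int, lr: float = 0.01, decay: float = 0.001):
        super().__init__()
        # The memory itself is a small MLP mapping keys to values.
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, dim))
        self.lr = lr        # test-time learning rate
        self.decay = decay  # weight decay acts as a forgetting term

    @torch.enable_grad()
    def write(self, key: torch.Tensor, value: torch.Tensor) -> None:
        """Update the memory weights from one (key, value) pair at inference."""
        pred = self.mlp(key)
        # "Surprise" signal: how badly the memory currently reconstructs the value.
        loss = (pred - value).pow(2).mean()
        grads = torch.autograd.grad(loss, list(self.mlp.parameters()))
        with torch.no_grad():
            for p, g in zip(self.mlp.parameters(), grads):
                p.mul_(1 - self.decay)      # forget a little
                p.add_(g, alpha=-self.lr)   # learn from the surprise

    @torch.no_grad()
    def read(self, query: torch.Tensor) -> torch.Tensor:
        return self.mlp(query)

# The memory keeps learning as tokens stream in, with no training loop.
mem = NeuralMemory(dim=64)
k, v = torch.randn(1, 64), torch.randn(1, 64)
mem.write(k, v)
print(mem.read(k).shape)  # torch.Size([1, 64])
```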

Building SGREP - Part Two

After shipping the first version of sgrep with ColBERT late interaction, I thought the hard work was done. The search accuracy was good - MRR of 0.70 on my test queries, significantly better than plain semantic search. But when I started using it on larger codebases, problems emerged.
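For context, "late interaction" in the ColBERT sense means keeping one embedding per token and matching each query token against its best document token (MaxSim), rather than collapsing everything into a single vector. A rough NumPy sketch of that scoring, with placeholder random embeddings standing in for a real encoder and index (sgrep's actual pipeline differs):

```python
# ColBERT-style late interaction (MaxSim) scoring sketch.
# Embeddings here are random placeholders; a real system would use an encoder.
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Score = sum over query tokens of the max cosine similarity to any doc token."""
    # Normalize so dot products are cosine similarities.
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sim = q @ d.T                          # (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())    # best doc token for each query token

# Rank document chunks by their late-interaction score.
query = np.random.randn(5, 128)                        # 5 query tokens, 128-dim
docs = [np.random.randn(40, 128) for _ in range(3)]    # token embeddings per chunk
ranked = sorted(range(len(docs)), key=lambda i: maxsim_score(query, docs[i]), reverse=True)
print(ranked)
```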

Building SGREP

Recently, the mxbread team published a great tool called mgrep, which helps address the issue that LLM harnesses such as Claude Code, Codex, and Amp spend unnecessary time retrieving useless tokens when doing search. Here's what mgrep claims: