
Titans Part 3: HOPE Architecture - From Paper to Reality (and Back)

This is Part 3 of my Titans implementation series. Part 1 covered the basic memory mechanism. Part 2 focused on performance optimization. This post tackles the full HOPE (Hierarchical Optimized Parallel Encoding) architecture - the multi-level memory system that makes Titans truly interesting - and the hard lessons learned when evaluation results didn’t match expectations.

Implementing Titans - Learning to Memorize at Test Time

Context engineering has become extremely intriguing to me recently as I've been hands-on building a couple of agentic platform projects. One aspect of context engineering is memory and continuous learning; Manus shared an excellent write-up on this from the harness perspective, Context Engineering for AI Agents. Then Google Research dropped a paper in January 2025 that caught my attention: Titans: Learning to Memorize at Test Time. The core idea is elegant - give transformers a learnable memory that updates during inference, not just training. I decided to implement it from scratch to understand it deeply.
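To make "memorize at test time" concrete, here is a minimal PyTorch sketch of that core idea: a small memory module whose weights are nudged by gradient descent during inference. The class name, the two-layer MLP memory, and the hyperparameters are illustrative choices of mine, not the paper's exact formulation (which adds momentum-based surprise and data-dependent gates).

```python
# Minimal sketch of the Titans core idea: a memory whose weights are updated
# by gradient descent at inference time. Names and hyperparameters here are
# illustrative, not the paper's exact setup.
import torch
import torch.nn as nn

class NeuralMemory(nn.Module):
    def __init__(self, dim: int, lr: float = 0.01, decay: float = 0.001):
        super().__init__()
        # The memory itself is a small MLP mapping keys to values.
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, dim))
        self.lr = lr        # test-time learning rate
        self.decay = decay  # weight decay acts as a forgetting term

    @torch.enable_grad()
    def write(self, key: torch.Tensor, value: torch.Tensor) -> None:
        """Update the memory weights from one (key, value) pair at inference."""
        pred = self.mlp(key)
        # "Surprise" signal: how badly the memory currently reconstructs the value.
        loss = (pred - value).pow(2).mean()
        grads = torch.autograd.grad(loss, list(self.mlp.parameters()))
        with torch.no_grad():
            for p, g in zip(self.mlp.parameters(), grads):
                p.mul_(1 - self.decay)      # forget a little
                p.add_(g, alpha=-self.lr)   # learn from the surprise

    @torch.no_grad()
    def read(self, query: torch.Tensor) -> torch.Tensor:
        return self.mlp(query)

# The memory keeps learning as tokens stream in, with no training loop.
mem = NeuralMemory(dim=64)
k, v = torch.randn(1, 64), torch.randn(1, 64)
mem.write(k, v)
print(mem.read(k).shape)  # torch.Size([1, 64])
```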

Building SGREP - Part Two

After shipping the first version of sgrep with ColBERT late interaction, I thought the hard work was done. The search accuracy was good - MRR of 0.70 on my test queries, significantly better than plain semantic search. But when I started using it on larger codebases, problems emerged.
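For context, "late interaction" in the ColBERT sense means keeping one embedding per token and matching each query token against its best document token (MaxSim), rather than collapsing everything into a single vector. A rough NumPy sketch of that scoring, with placeholder random embeddings standing in for a real encoder and index (sgrep's actual pipeline differs):

```python
# ColBERT-style late interaction (MaxSim) scoring sketch.
# Embeddings here are random placeholders; a real system would use an encoder.
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Score = sum over query tokens of the max cosine similarity to any doc token."""
    # Normalize so dot products are cosine similarities.
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sim = q @ d.T                          # (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())    # best doc token for each query token

# Rank document chunks by their late-interaction score.
query = np.random.randn(5, 128)                        # 5 query tokens, 128-dim
docs = [np.random.randn(40, 128) for _ in range(3)]    # token embeddings per chunk
ranked = sorted(range(len(docs)), key=lambda i: maxsim_score(query, docs[i]), reverse=True)
print(ranked)
```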

Building SGREP

Recently, the mxbread team published a great tool called mgrep, which helps address the issue that LLM harnesses such as Claude Code, Codex, and Amp spend unnecessary time retrieving useless tokens when doing search. Here's what mgrep claims: