Titans: Learning to Memorize at Test Time

(arxiv.org)

Comments

gwern 17 January 2025
cs702 13 January 2025
Interesting. I like the idea of a meta-mechanism that learns to update an associative memory based on how surprising the data is. The other stuff, reading memory via keys and values and selectively erasing it with gating, looks pretty conventional at first glance. Thank you for sharing this on HN. I've added it to my reading list.

EDIT: I'm reminded of this other type of associative memory: https://github.com/glassroom/heinsen_routing. The idea there is to compute a mixture of memories that best predicts the given input sequence. Quite frankly, I don't remember how the whole thing works, but I do remember that it works. It's been a while since I used it, so YMMV. In any case, it may be of interest to you.
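
For anyone who wants the surprise-driven update in concrete form, here is a rough sketch as I understand it from the paper, not the authors' code. It assumes a plain linear memory matrix instead of the paper's deep memory module, and fixed scalar `lr`, `momentum`, and `forget` values instead of learned, data-dependent gates. The class and parameter names are made up for illustration. "Surprise" is the squared error of the memory's prediction for a key, and the write step is one gradient step on that error with momentum and a forgetting gate.

```python
def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]


class SurpriseMemory:
    """Toy linear associative memory updated by gradient 'surprise'.

    Assumption: memory is a single dim x dim matrix and all gates are
    fixed scalars; both are simplifications for illustration only.
    """

    def __init__(self, dim, lr=0.5, momentum=0.0, forget=0.0):
        self.M = [[0.0] * dim for _ in range(dim)]  # memory weights
        self.S = [[0.0] * dim for _ in range(dim)]  # momentum of past surprise
        self.lr, self.momentum, self.forget = lr, momentum, forget

    def read(self, key):
        # Retrieval is a plain forward pass: value ~ M @ key.
        return matvec(self.M, key)

    def write(self, key, value):
        # Prediction error of the current memory for this key.
        err = [p - v for p, v in zip(self.read(key), value)]
        surprise = sum(e * e for e in err)  # loss ||M k - v||^2
        for i, e in enumerate(err):
            for j, k in enumerate(key):
                grad = 2.0 * e * k  # d(loss) / d(M[i][j])
                # Momentum carries past surprise; the gradient is the new surprise.
                self.S[i][j] = self.momentum * self.S[i][j] - self.lr * grad
                # The forget gate decays old memory before adding the update.
                self.M[i][j] = (1.0 - self.forget) * self.M[i][j] + self.S[i][j]
        return surprise


mem = SurpriseMemory(dim=2)
key, value = [1.0, 0.0], [0.5, -0.5]
first = mem.write(key, value)   # unseen pair: large surprise, large update
second = mem.write(key, value)  # same pair again: nothing left to learn
recalled = mem.read(key)
```

With a unit-norm key and `lr=0.5`, one write stores the pair exactly, so the second write reports zero surprise and leaves the memory unchanged. Setting `forget` above zero makes older associations decay on every write, which is the "selectively erasing" part.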