AutoMem: Automated Learning of Memory as a Cognitive Skill

Memory expertise is a learned skill: knowing what to encode, when to retrieve, and how to organize knowledge, a capacity known in cognitive science as metamemory. We bring this perspective to LLMs by treating memory management as a trainable skill. We promote file-system operations to first-class memory actions alongside task actions, letting the model itself decide how to manage its memory. This memory skill improves along two axes: the structure that supports it (prompts, file schemas, action vocabulary) and the proficiency of the model exercising it. Both axes resist manual optimization: episodes in long-horizon tasks run for thousands of steps, and a single memory mistake can stay hidden long before it surfaces, making human review of full trajectories impractical. We introduce AutoMem, a framework that automates both axes. In the first loop, a strong LLM reviews complete agent trajectories and iteratively revises the memory structure that shapes how the agent interacts with its memory files. In the second loop, the agent's own good memory decisions are identified across many episodes and used as a training signal to sharpen the model's memory proficiency directly. Across three procedurally generated long-horizon games (Crafter, MiniHack, and NetHack), optimizing memory alone, without modifying the model's task-action behavior, improved the base agent's performance by roughly 2x to 4x, making a 32B open-weight model competitive with frontier systems such as Claude Opus 4.5 and Gemini 3.1 Pro Thinking. Our results show that memory management is an independently learnable skill and a high-leverage objective that yields large gains on long-horizon tasks.
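
The idea of placing file-system operations in the same action space as task actions can be illustrated with a small sketch. This is not AutoMem's implementation: the class names (`TaskAction`, `MemoryAction`, `MemoryStore`), the three operations (`write`, `append`, `read`), the file path, and the in-memory dictionary standing in for real files are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass(frozen=True)
class TaskAction:
    """An ordinary environment action, e.g. a game move (hypothetical)."""
    name: str


@dataclass(frozen=True)
class MemoryAction:
    """A file operation exposed as an action (hypothetical vocabulary)."""
    op: str          # one of "write", "append", "read"
    path: str
    text: str = ""


Action = Union[TaskAction, MemoryAction]


@dataclass
class MemoryStore:
    """In-memory stand-in for the agent's memory files."""
    files: Dict[str, str] = field(default_factory=dict)

    def apply(self, action: MemoryAction) -> str:
        if action.op == "write":
            self.files[action.path] = action.text
            return "ok"
        if action.op == "append":
            self.files[action.path] = self.files.get(action.path, "") + action.text
            return "ok"
        if action.op == "read":
            return self.files.get(action.path, "")
        raise ValueError(f"unknown memory op: {action.op}")


def step(action: Action, memory: MemoryStore, env_log: List[str]) -> str:
    """Route one action: memory actions touch files, task actions go to the env."""
    if isinstance(action, MemoryAction):
        return memory.apply(action)
    env_log.append(action.name)  # stand-in for a real environment step
    return f"env:{action.name}"


def run_demo():
    """Run a short mixed trajectory of task and memory actions."""
    memory = MemoryStore()
    env_log: List[str] = []
    trajectory = [
        TaskAction("move_north"),
        MemoryAction("write", "notes/map.md", "tree at north\n"),
        TaskAction("collect_wood"),
        MemoryAction("append", "notes/map.md", "wood collected\n"),
        MemoryAction("read", "notes/map.md"),
    ]
    results = [step(a, memory, env_log) for a in trajectory]
    return results, env_log, memory.files
```

In this sketch the agent picks from a single action type, so deciding when to write or read a note is the same kind of decision as choosing a game move. Only task actions reach the environment log.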

5 min read C1

Useful phrases from this story

is a learned skill (Collocation)

Is a skill that is learned.

From the story: Memory expertise is a learned skill: knowing what to encode, when to retrieve, and how to organize knowledge, a capacity known in cognitive science as metamemory.

knowing what to encode (Collocation)

Knowing what to encode.

From the story: Memory expertise is a learned skill: knowing what to encode, when to retrieve, and how to organize knowledge, a capacity known in cognitive science as metamemory.

bring this perspective to LLMs (Collocation)

Bring this perspective to LLMs.

From the story: We bring this perspective to LLMs by treating memory management as a trainable skill.

treating memory management as a (Collocation)

Memory management.

From the story: We bring this perspective to LLMs by treating memory management as a trainable skill.

letting the model itself decide (Collocation)

Let the model itself decide.

From the story: We promote file-system operations to first-class memory actions alongside task actions, letting the model itself decide how to manage its memory.
