Yujin Tang (湯 聿津)

I'm a research scientist at Sakana AI. Before that, I was a research scientist at Google DeepMind (formerly Google Brain) based in Tokyo. I received my Ph.D. in Computer Science from the University of Tokyo, my M.S. from Waseda University, and my B.S. from Shanghai Jiao Tong University. My research interests are in reinforcement learning, robotics, evolutionary algorithms, and generative models.

Google Scholar · LinkedIn · X

Recent Publications

L2D

Large Language Models to Diffusion Finetuning [ICML 2025]

We introduce a finetuning method that enables large language models to scale test-time compute using the diffusion framework. Accuracy improves with more diffusion steps, and the model can adaptively allocate compute via ODE solvers and guided generation. Our method applies to any cross-entropy–trained model without altering original weights, complements standard finetuning and search, and bridges autoregressive and diffusion paradigms.
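As a rough sketch of how test-time compute scales with diffusion steps, the toy loop below runs an Euler-style integration of a probability-flow ODE over token embeddings. The `denoise_step` function is a placeholder standing in for the L2D-finetuned model, and the linear schedule is an assumption for illustration, not the paper's exact formulation.

```python
# A toy diffusion-style sampler over token embeddings. The number of steps
# controls how much test-time compute is spent; denoise_step is a stand-in
# for the finetuned LLM, not the actual L2D model.
import numpy as np

def denoise_step(x_t, t):
    """Placeholder for the model's prediction of the clean embedding x_0."""
    return x_t * (1.0 - t)  # illustrative dynamics only

def sample(x_T, num_steps):
    """Euler-style ODE integration: more steps, more compute, finer samples."""
    x = x_T
    ts = np.linspace(1.0, 0.0, num_steps + 1)
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        x0_pred = denoise_step(x, t_cur)
        # Step along the flow from the current iterate toward the prediction.
        x = x + (x0_pred - x) * (t_cur - t_next) / max(t_cur, 1e-8)
    return x

noisy = np.random.randn(16, 512)   # (sequence length, embedding dim)
out = sample(noisy, num_steps=8)   # raise num_steps to spend more compute
print(out.shape)
```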

ASAL

Automating the Search for Artificial Life with Foundation Models [ALIFE 2025]

We present ASAL, the first method to apply vision-language foundation models to Artificial Life (ALife). ASAL automates the discovery of lifelike simulations by finding target behaviors, generating open-ended novelty, and illuminating diverse phenomena. It works across multiple ALife substrates—Boids, Lenia, Game of Life, and more—and has led to the discovery of previously unseen lifeforms. By enabling human-aligned, scalable exploration, ASAL introduces a powerful new paradigm for accelerating ALife research beyond manual trial-and-error.
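For a concrete feel of the supervised-target search, the sketch below scores candidate simulation parameters by embedding a rollout frame and measuring similarity to a text prompt. The substrate and both encoders are toy stand-ins; a real setup would use an actual ALife simulator and a vision-language model such as CLIP.

```python
# Minimal sketch of ASAL-style target search: roll out a simulation, embed
# the result, and keep the parameters whose rollout best matches the prompt.
# run_simulation, embed_image, and embed_text are all illustrative stand-ins.
import numpy as np

rng = np.random.default_rng(0)

def run_simulation(params):
    """Stand-in substrate: returns a final frame as an image array."""
    return np.tanh(params.sum()) * np.ones((32, 32))

def embed_image(frame):
    """Stand-in for a VLM image encoder (e.g., CLIP)."""
    v = np.array([frame.mean(), frame.std(), frame.max()])
    return v / (np.linalg.norm(v) + 1e-8)

def embed_text(prompt):
    """Stand-in for the matching text encoder (ignores the prompt here)."""
    v = rng.standard_normal(3)
    return v / np.linalg.norm(v)

target = embed_text("a self-replicating organism")
best_score, best_params = -np.inf, None
for _ in range(256):  # simple random search over simulation parameters
    params = rng.standard_normal(8)
    score = embed_image(run_simulation(params)) @ target  # cosine similarity
    if score > best_score:
        best_score, best_params = score, params
print(best_score)
```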

Transformer²

Transformer²: Self-Adaptive LLMs [ICLR 2025]

We present Transformer², a self-adaptive framework that enables large language models to handle unseen tasks in real time by adjusting only the singular components of their weight matrices. Using a two-pass inference process with a task dispatcher and RL-trained expert vectors, Transformer² outperforms methods like LoRA with fewer parameters and greater efficiency. It generalizes across architectures and modalities, offering a scalable path toward dynamic, self-organizing AI systems.
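A minimal numpy sketch of the core operation, assuming nothing beyond the abstract above: a frozen weight matrix is decomposed once, and an expert vector rescales its singular values at inference time. The values in `z` are made up for illustration, and the dispatcher that picks an expert per task is omitted.

```python
# Adapting a weight matrix through its singular values, in the spirit of
# Transformer²'s expert vectors. z is a toy expert vector; the first-pass
# task dispatch that selects z is not shown.
import numpy as np

W = np.random.randn(64, 64)           # a pretrained weight matrix (frozen)
U, s, Vt = np.linalg.svd(W, full_matrices=False)

z = np.ones_like(s)
z[:8] = 1.5                           # amplify a few singular components
W_adapted = U @ np.diag(s * z) @ Vt   # the original W is never overwritten

# At inference, the second pass runs the model with W_adapted in place of W.
print(np.linalg.norm(W_adapted - W))
```

Because only `z` is task-specific, each expert stores one scalar per singular value, which is consistent with the blurb's claim of using fewer parameters than low-rank adapters like LoRA.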

CycleQD

Agent Skill Acquisition for LLMs via CycleQD [ICLR 2025]

We introduce CycleQD, a skill-focused training method for language models that applies Quality Diversity with cyclic task adaptation, model merging–based crossover, and SVD-based mutation. By rotating the focus across tasks, CycleQD avoids data imbalance issues and simplifies objective design. Applied to Llama3-8B-Instruct, it outperforms traditional fine-tuning on coding, OS, and database tasks, matching GPT-3.5-Turbo’s performance while preserving strong language abilities. The method is general and extends beyond language to domains like image segmentation.
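The toy loop below sketches the cyclic Quality Diversity scheme: each generation promotes a different task's metric to the quality objective, while the remaining metrics define cells in a behavior archive. Candidates here are plain vectors with synthetic scores; averaging two elites stands in for model merging crossover, and Gaussian noise for the SVD-based mutation.

```python
# Cyclic Quality Diversity on toy candidates. Real CycleQD operates on model
# weights, merging elites as crossover and perturbing SVD components as
# mutation; the scores here are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(0)
NUM_TASKS = 3

def evaluate(x):
    """Stand-in per-task scores for candidate x."""
    return np.array([-(x - k).dot(x - k) for k in range(NUM_TASKS)])

archive = {}  # behavior cell -> (candidate, quality)
population = [rng.standard_normal(4) for _ in range(8)]

for gen in range(30):
    quality_task = gen % NUM_TASKS  # cycle the optimization target
    for x in population:
        scores = evaluate(x)
        quality = scores[quality_task]
        behavior = tuple(np.round(np.delete(scores, quality_task), 1))
        if behavior not in archive or quality > archive[behavior][1]:
            archive[behavior] = (x, quality)
    # Average two elites (model-merging analogue), then add Gaussian
    # noise (mutation analogue) to form the next population.
    elites = [v[0] for v in archive.values()]
    population = [
        (elites[rng.integers(len(elites))] + elites[rng.integers(len(elites))]) / 2
        + 0.1 * rng.standard_normal(4)
        for _ in range(8)
    ]
print(len(archive), "archive cells filled")
```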

NAMM

An Evolved Universal Transformer Memory [ICLR 2025]

We introduce Neural Attention Memory Models (NAMMs), a learned memory management system that enhances both the efficiency and performance of transformers by selectively focusing on relevant context. Unlike prior rule-based methods, NAMMs evolve atop pre-trained models and condition only on attention matrices, making them broadly applicable. Trained on a small set of tasks, NAMMs improve performance across long-context benchmarks while drastically reducing input size, and they generalize zero-shot across architectures and modalities—including vision and reinforcement learning.
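As a small illustration of memory management conditioned only on attention, the sketch below derives per-token features from an attention matrix and prunes the KV cache with a linear scorer. The scorer's weights and the 50% retention threshold are placeholders; actual NAMMs use evolved parameters and richer attention features.

```python
# Attention-conditioned KV-cache pruning in the spirit of NAMMs. The linear
# scorer is a fixed stand-in for the evolved memory model.
import numpy as np

rng = np.random.default_rng(0)
seq_len, kv_len = 16, 64
attn = rng.random((seq_len, kv_len))
attn /= attn.sum(axis=1, keepdims=True)  # rows are attention distributions

# Per-cached-token features derived only from the attention matrix.
features = np.stack([attn.mean(axis=0), attn.max(axis=0)], axis=1)

w = np.array([1.0, 0.5])                  # placeholder scorer weights
scores = features @ w
keep = scores > np.quantile(scores, 0.5)  # retain the top half of the cache

print(f"kept {keep.sum()} of {kv_len} cached tokens")
```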