neural networks research group
Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning (2026)
Xin Qiu, Yulu Gan, Conor F. Hayes, Qiyao Liang, Yinggan Xu, Roberto Dailey, Elliot Meyerson, Babak Hodjat, Risto Miikkulainen
Fine-tuning large language models (LLMs) for downstream tasks is an essential stage of modern AI deployment. Reinforcement learning (RL) has emerged as the dominant fine-tuning paradigm, underpinning many state-of-the-art LLMs. In contrast, evolution strategies (ES) has largely been overlooked due to the widespread belief that it does not scale to modern model sizes. This paper overturns this assumption by demonstrating the first successful application of ES to full-parameter fine-tuning of LLMs at the billion-parameter scale, without dimensionality reduction. ES can indeed search over extremely high-dimensional parameter spaces and outperform established RL implementations across multiple axes, including improved tolerance to long-horizon and delayed rewards, robustness across diverse base LLMs, reduced susceptibility to reward hacking, and improved training stability. These findings suggest that ES is not merely a viable alternative to RL, but a fundamentally different and powerful backpropagation-free post-training paradigm that opens a new direction for LLM fine-tuning beyond current RL-based approaches.
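To make the backpropagation-free idea concrete, here is a minimal sketch of a generic evolution strategy on a toy problem. It is not the paper's implementation: the function names, the quadratic toy reward, and all hyperparameter values are illustrative assumptions. The loop perturbs the parameters with Gaussian noise, scores each perturbed copy with a black-box reward, and moves the parameters toward the better-scoring perturbations, without computing any gradient of the reward.

```python
import numpy as np


def evolution_strategy(reward_fn, theta, sigma=0.1, alpha=0.05,
                       population=50, iterations=300, seed=0):
    """Generic ES loop (illustrative sketch, not the paper's method).

    reward_fn: black-box function mapping a parameter vector to a scalar.
    theta:     1-D initial parameter vector.
    sigma:     standard deviation of the Gaussian perturbations.
    alpha:     step size (the usual 1/sigma factor is folded into it,
               since rewards are z-scored before the update).
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float).copy()
    for _ in range(iterations):
        # Sample one Gaussian perturbation per population member.
        noise = rng.standard_normal((population, theta.size))
        # Evaluate each perturbed parameter vector; only forward calls.
        rewards = np.array([reward_fn(theta + sigma * eps) for eps in noise])
        # Z-score the rewards so the update is invariant to reward scale.
        std = rewards.std()
        if std > 0:
            advantages = (rewards - rewards.mean()) / std
        else:
            advantages = np.zeros_like(rewards)
        # Move toward perturbations that scored above average.
        theta = theta + alpha * (noise.T @ advantages) / population
    return theta


# Hypothetical toy objective: reward is highest when params equal target.
target = np.array([1.0, -2.0, 0.5])


def toy_reward(params):
    return -np.sum((params - target) ** 2)


theta = evolution_strategy(toy_reward, np.zeros(3))
print(theta)  # ends up close to target
```

In an LLM setting, `theta` would stand for the model weights and `reward_fn` for a task-level score of the model's outputs; how the paper handles this at billion-parameter scale is described in the PDF, not in this sketch.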
View: PDF
Citation: In Proceedings of the 43rd International Conference on Machine Learning, 2026. (Also arXiv:2509.24372).
Bibtex:
@inproceedings{qiu:icml26,
  title={Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning},
  author={Xin Qiu and Yulu Gan and Conor F. Hayes and Qiyao Liang and Yinggan Xu and Roberto Dailey and Elliot Meyerson and Babak Hodjat and Risto Miikkulainen},
  booktitle={Proceedings of the 43rd International Conference on Machine Learning},
  month={ },
  note={(Also arXiv:2509.24372)},
  url="http://nn.cs.utexas.edu/?qiu:icml26",
  year={2026}
}
People
Babak Hodjat
Collaborator
babak [at] cognizant com
Elliot Meyerson
Ph.D. Alumni
ekm [at] cs utexas edu
Risto Miikkulainen
Faculty
risto [at] cs utexas edu
Xin Qiu
Collaborator
xin qiu [at] cognizant com
Areas of Interest
Evolutionary Computation
Neuroevolution