An Energy Landscape Approach to Miniaturizing Enzymes using Protein Language Model Embeddings
Lala, J.; Agrawal, H.; Dong, F.; Wells, J.; Angioletti-Uberti, S.
Abstract: We present a general approach to find amino acid sequences corresponding to the most compact enzyme likely to retain the structure of a given catalytic site. Our approach uses Monte Carlo (MC) simulations to sample an energy landscape whose minima correspond, by construction, to sequences with these properties. Building on previous work (Wu et al., 2025) and using the BAGEL package (Lala et al., 2025), we implement a route to this goal that relies only on information extracted from a protein language model (PLM), without structural information. After generating a set of candidate sequences with this PLM-guided BAGEL optimization, we filter candidates for downstream experimental validation using a two-stage protocol. First, deep-learning-based structure prediction models (ESMFold, Chai-1, Boltz-2) are used to identify a structural consensus among designs with highly conserved active-site geometries, yielding many candidates with active-site RMSD below a few angstroms relative to the wild type and pLDDT scores above 80. Second, molecular dynamics simulations are performed on a filtered subset of sequences (selected by active-site RMSD and SolubleMPNN log-likelihoods) to evaluate active-site stability under thermal fluctuations. For the most promising enzymes, these simulations yield active-site RMSF values below 1.0 Å and an active-site RMSD drift between 0.5 and 1.5 Å, making these mini-variants comparable to the wild type, though outcomes vary across enzymes. Given the protocol's generality, we believe these results represent a step forward in AI-guided enzyme design. To facilitate rapid experimental validation by the broader community, we open-source all sequences generated by our computational pipeline. These include designs for the four representative enzymes of this study: PETase, subtilisin Carlsberg (a serine protease), Taq DNA polymerase, and VioA.
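The core idea in the abstract, Monte Carlo sampling of an energy landscape defined over sequences, can be illustrated with a minimal Metropolis sketch. This is not the BAGEL API and not the authors' energy function: `toy_energy`, the contiguous motif `"SDH"`, the weights, the move set and the temperature are all hypothetical stand-ins chosen so the example runs without a protein language model. In the paper's setting the energy terms would instead be derived from PLM embeddings.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_energy(seq, motif="SDH", length_weight=0.1, motif_penalty=10.0):
    """Hypothetical energy: lower for shorter sequences that keep the motif.

    This is an illustrative placeholder, not the PLM-based energy of the paper.
    """
    energy = length_weight * len(seq)
    if motif not in seq:
        energy += motif_penalty  # penalize losing the (assumed) catalytic motif
    return energy


def propose(seq, rng):
    """Propose a move: delete one residue or substitute one residue."""
    pos = rng.randrange(len(seq))
    if rng.random() < 0.5 and len(seq) > 1:
        return seq[:pos] + seq[pos + 1:]  # deletion shrinks the sequence
    return seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]  # substitution


def metropolis(seq, energy_fn, n_steps=2000, temperature=0.05, seed=0):
    """Metropolis sampling over sequences; returns the lowest-energy state seen."""
    rng = random.Random(seed)
    current, e_current = seq, energy_fn(seq)
    best, e_best = current, e_current
    for _ in range(n_steps):
        candidate = propose(current, rng)
        e_candidate = energy_fn(candidate)
        delta = e_candidate - e_current
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, e_current = candidate, e_candidate
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best


if __name__ == "__main__":
    start = "MKTAYIAKQRSDHGLVNWEPTCFA"  # made-up starting sequence
    best, e_best = metropolis(start, toy_energy)
    print(len(start), "->", len(best), best, round(e_best, 2))
```

Tracking the best state separately from the current state is a common convenience when the sampler is used for optimization rather than for estimating averages; whether the authors do the same is not stated in the abstract.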
This AI tool acts like a shrink ray for proteins. It takes bulky, complex enzymes and designs "Mini-Me" versions that are much smaller but are predicted, by structure models and simulations, to keep their active sites intact. The designs have not yet been tested experimentally; all sequences are openly available for anyone to download and test in the lab.
Protein designers and BAGEL fans went wild sharing the open-sourced mini-sequences, posted by Jakub Lála (@jakublala)