Virtual Cells Need Context, Not Just Scale
Dibaeinia, P.; Babu, S.; Knudson, M.; ElSheikh, A.; Wen, Y.; Liu, H.; Perera, J.; Khan, A. A.
The intersection of AI and biology has entered a phase of explosive growth, driven by the ambition to build "Virtual Cells": computational models capable of predicting cellular responses to any perturbation. Following successes in structural biology (e.g., AlphaFold) and large language models, the field has converged on training massive, high-capacity models on large-scale single-cell data. This position paper argues that scaling model capacity is insufficient to solve the Virtual Cell problem, because the primary failure mode is inadequate coverage of diverse biological contexts, not insufficient model expressivity. We support this claim by reviewing recent studies showing that simple baselines perform on par with sophisticated architectures within a given biological context, and that current models fail to generalize consistently across contexts. We connect this finding to the causal inference literature on transportability and contrast it with domains where scaling has succeeded. We substantiate our argument by analyzing a state-of-the-art model on a 22-million-cell immunology dataset. We conclude that the community faces a causal transport problem that cannot be solved by accumulating more data from the same distributions. Instead, we argue that contextual diversity and causal representation learning deserve increased emphasis, complementing the ongoing scaling of model capacity and data volume.
Peer review in progress...