On the frontiers of research at the Lux AI Summit
Kyunghyun Cho and Shirley Ho talk about the convergence of frontier science and artificial intelligence
Last week, we convened about 300 AI engineers, scientists, researchers and founders in New York City to discuss the frontiers of the field under the banner of “the AI canvas.” The idea was to move the conversation away from what can be built, to what should be built and why. AI tools have made extraordinary progress since the launch of ChatGPT in late 2022, and we are still just figuring out all of the ways we can use these miraculous correlation machines.
Even so, prodigious work remains on the research frontiers of artificial intelligence: identifying ways to improve model performance, merging models, and keeping training and inference costs as low as possible. To that end, we brought together two stars of the science world to talk more about the future of AI.
Kyunghyun Cho is a computer science professor at New York University and executive director of frontier research at the Prescient Design team within Genentech Research & Early Development (gRED). Shirley Ho is Group Leader of Cosmology X Data Science at the Flatiron Institute of the Simons Foundation as well as Research Professor in Physics at New York University.
They join me and Laurence Pevsner to talk about the state of the art in AI today, how scientific discovery might be automated with AI, whether PhDs are a thing of the past, and the future of universities in a time of funding cuts and endowment taxes.
This transcript has been edited for length and clarity. For a full version of the discussion, please visit our podcast.
Danny Crichton:
Thank you so much for joining us here at the Lux AI Summit today. Your panel was about the technology frontiers of artificial intelligence. Give us a little summary of what you just discussed on stage.
Shirley Ho:
There was a huge range of topics, but the most important was how we get from where we are now — which is using a lot of specialized tools and LLMs — to true scientific intelligence. What are the roadblocks? How do we make it systematic? How do we actually create the data set we need?
Kyunghyun Cho:
Yeah, absolutely. We talked about needing to distinguish between solving the problems we know and just need to solve more efficiently versus solving the problem of discovery. Scientific discovery is unique in the sense that it’s all about finding something that has not been found before. How do we make AI internalize this whole process?
Laurence Pevsner:
Right now, if you tell an AI, “I want to get better at discovery,” what does it do? Where does it fail? How is it helpful?
Kyunghyun Cho:
At the moment, a lot of the focus has been on how these systems are trained. What they are often encouraged to do is answer questions based on known knowledge. Sometimes we let AI run small experiments, particularly with coding agents, but these are extremely narrow. What we actually want is for AI systems to understand the process of discovery and to suggest what we need to do to gather more data, so that information gain is maximized and, eventually, future success is maximized as well.
Danny Crichton:
We talk about AI for science in this general format — as if there’s some sort of generalized “scientist.” Yet when we look at human society, we see the complete opposite. People get narrow PhDs, postdocs get narrower and narrower and narrower. So where do you stand on general compute versus more specialized LLMs and data sets?
Shirley Ho:
There are multiple answers to that. The first answer is that I think we can get AI to become more generalist. Humans are doing all these specialized things. AI could learn everything, but it has to learn not just from the literature, but also from real data. I think that’s the piece that’s missing right now: AI has all the literature, but that’s not how you get discovery. If you talk to any scientist, it’s not that you just read all the papers from before. You actually need to go create new experiments, get more data, analyze it and come back.
A lot of our research has been around creating a polymathic model, which can go across many different areas — from physics to chemistry to biology and astrophysics to everything all together. And I think that actually helps the system become more generalist in a way that can bring new ideas from physics to chemistry, from chemistry to biology and maybe to something else, like health. So that’s my hope in the near term.
Further down the line, I will want to see whether AI can come up with new ideas without a human’s help. But that’s definitely more futuristic.
Kyunghyun Cho:
When we think about AI or AGI, we are very much constrained by what we think we know how to do, when in fact, of course, AI is built in a very different way from us. We have to specialize as individuals and then form a society in order to cover different aspects of the very difficult problems people face. But AI doesn’t really have that particular constraint. In fact, AI systems are often worse at a lot of things we know well, but can solve problems we just don’t even know how to approach.
One example I can give you is in drug discovery. In that field, there are people who specialize in trying to figure out human physiology. There are people who specialize in a particular molecular modality, designing, developing and optimizing drugs. There are people working on the commercial side, and people working on the clinical trials. All of those people are extremely specialized, and it’s really difficult to see what happens across these stages and make connections. This lack of connection is one of the reasons the end-to-end success rate is so low.
But AI doesn’t have those constraints. It can, in fact, look at all the data coming out of all the different stages and identify a tiny bit of correlation across them. And then that correlation becomes something we can use in order to improve the success rate. So what that means is that the distinction of the generalist versus specialist applies to us, but not necessarily to the large-scale models we are building now and will build in the future.
Laurence Pevsner:
It’s like that famous metaphor about the blind men trying to understand what an elephant is. They’re all feeling different pieces. Hopefully a generalist polymathic AI can actually see the whole elephant at once. I’m wondering: Is polymathic AI used right now? Is it helping in your research currently?
Shirley Ho:
We recently built a model. It’s a fluid dynamics model that captures everything from blood flowing through an artificial heart all the way to oceanography and astrophysics. So fluids across all the different scales, from smallest to the largest.
And you’re like, “Why did you build such a model?” Well, you can take this model and fine-tune it — give it a little extra data on something it’s never seen before. In this case, something that explodes: an exploding star. There are only five simulations of exploding stars in the entire world; they’re very expensive to run. But now I can make predictions with just one example, which was impossible before. Before, you needed 10,000 examples, a million examples. Now I need just one, and I can already make predictions.
That is a huge step, and this is the great thing about foundation models: data from nearby disciplines can boost performance or make the previously impossible possible.
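To make the pattern Ho describes concrete, here is a minimal sketch of fine-tuning a pretrained surrogate model on a single new trajectory and then rolling it forward to make predictions. Everything in it — the architecture, the checkpoint name, the data shapes — is a hypothetical stand-in, not the Polymathic AI team’s actual code.

```python
# Minimal sketch: adapt a pretrained surrogate to one unseen simulation.
import torch
import torch.nn as nn

# Hypothetical pretrained surrogate: maps a flattened fluid state at time t to t+1.
pretrained = nn.Sequential(
    nn.Linear(1024, 512), nn.GELU(),
    nn.Linear(512, 512), nn.GELU(),
    nn.Linear(512, 1024),
)
# pretrained.load_state_dict(torch.load("foundation_model.pt"))  # assumed checkpoint

# One new example: a single supernova-like trajectory of shape (timesteps, features).
# Random data stands in for the real simulation here.
trajectory = torch.randn(64, 1024)
inputs, targets = trajectory[:-1], trajectory[1:]

# Freeze most of the network; a single example can only support adapting a small head.
for p in pretrained.parameters():
    p.requires_grad = False
head = pretrained[-1]
for p in head.parameters():
    p.requires_grad = True

opt = torch.optim.Adam(head.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()
for step in range(200):
    opt.zero_grad()
    loss = loss_fn(pretrained(inputs), targets)
    loss.backward()
    opt.step()

# Roll the fine-tuned model forward to predict beyond the single example.
with torch.no_grad():
    state = trajectory[-1:]
    for _ in range(10):
        state = pretrained(state)
```

The design choice that makes one example workable in this sketch is the freezing: the pretrained weights carry the physics shared across regimes, and only a small head adapts to the new one.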
Danny Crichton:
In the world of science, it seems like we have a bunch of stuff happening all at once. On the one hand, we have terrifying cuts to federal funding and challenging finances from the endowment side because of endowment taxes. On the other hand, we have a rebuilding of the pipeline for scientists. The idea that you would have AI biologists didn’t really exist 10–15 years ago. Now, it seems like AI is the first step to becoming a frontier biologist or astrophysicist.
How would you start to rethink the future of research careers in terms of skill building, in terms of what you’d be focusing on? Does it change radically?
Kyunghyun Cho:
I think it should stay the same. The existing system has a huge amount of merit. We want to be able to choose, as a society, to do work that is not blinded, or narrowed, by profit motives. That’s how we originally decided taxpayers would essentially outsource this selection and execution to the federal government. And the federal government outsources this research to the universities and other non-profit research organizations.
I think this is probably the only way in which we can actually continue to invest in the future of research, not the research that’s going to be developed into a product today or tomorrow or next year. Those things are going to be done by industry.
The issue here is not that this system is wrong or outdated, but unfortunately this whole system started to be somewhat overshadowed by shiny new tools in a shiny new industry with a shiny new amount of money.
If you look at some of the National Science Foundation solicitations in the computer science field over the past few years, it’s really difficult to tell whether the solicitation is for research or for developing products. And that’s a mistake by the federal government, because the federal government was supposed to take the taxpayer money and try to invest in the things that companies would have overlooked so that we can protect our future.
What I think is really important is making sure the old system works by ensuring that the choice and execution of topics is influenced as little as possible by what is being productized today or in the next few years.
Shirley Ho:
Yeah, this is interesting. There’s this idea that you should fund frontier research, and the government should not decide where the frontier is. But I think you were also asking about another issue, which is how you prepare future generations to make the best use of all these shiny new tools. Should they just think, like, forget it, don’t learn software engineering because there are no more software engineering jobs?
I would say that people should do what they love, because then you have the most motivation and creativity pouring in that direction. But on top of that, having some idea of what tools are available and what tools might be upcoming will be super useful.
Kyunghyun Cho:
When it comes to programming, even though we have amazing coding assistants and whatnot, I think we have to teach programming to everyone as early as possible. In my view, programming is amazing not only because it gets you better products and improves your productivity, but also because it is a good way to test the logic of your thoughts. You learn how to think about things, you learn mathematics, history, social science, all those courses, but you never actually get to execute. You never get to use your logical thinking and then see the consequence. Programming is the only place where you can.
Laurence Pevsner:
Yeah, when we were talking about polymathic AI, I was wondering if one of the benefits will be that we humans, too, get to evolve. In a future in which an AI can really go deep, we can also become more generalist, more interdisciplinary. We’ll all have a bit of programming. We’ll all have a bit of astrophysics, we’ll all have a bit of biology. That way, we can actually work with the AI models too and be able to think this way.
Shirley Ho:
I do love it as an educational tool, but I want to be slightly controversial. I’m just thinking that programming is not the only place you can test your logic. In a lot of lab experiments — physics, chemistry — you can actually hypothesize and then go test it out. But programming is probably the fastest and easiest.
Danny Crichton:
We’re coming towards the end of 2025, and one of the most important AI conferences, NeurIPS, is coming up. I heard you have quite a few papers accepted, so congratulations. I have two questions: One, what were some of the developments over 2025 that you think were overlooked? And two, when you look forward, either to NeurIPS in a few weeks or into 2026, where does the AI research agenda go next?
Shirley Ho:
I’m not going to talk about our papers! I want to talk about some other people’s papers! There is a huge amount of literature on how to merge models, how to understand models, and how to steer models. All of that is usually classified as “interpretability.” And as a scientist, I think it’s great if we can understand the models a little better, because if you tell someone, “I found a new fundamental rule about the universe, but I have no idea how it came about,” no one’s going to believe you. So can you somewhat interpret the model? I think that’s super useful.
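As one concrete illustration of the model-merging line of work Ho mentions, a simple baseline is to average the parameters of two checkpoints that share an architecture. The sketch below is illustrative only; the toy models and the 50/50 weighting are assumptions, not any particular paper’s method.

```python
# Illustrative baseline for model merging: parameter averaging of two checkpoints.
import torch
import torch.nn as nn

def make_model() -> nn.Module:
    # Toy architecture; stands in for two fine-tuned checkpoints of the same network.
    return nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

model_a, model_b = make_model(), make_model()
sd_a, sd_b = model_a.state_dict(), model_b.state_dict()

# Average corresponding parameters; the 0.5/0.5 weighting is an arbitrary choice.
merged_state = {name: 0.5 * sd_a[name] + 0.5 * sd_b[name] for name in sd_a}

merged = make_model()
merged.load_state_dict(merged_state)
print(merged(torch.randn(1, 16)))  # the merged model runs like either parent
```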
Kyunghyun Cho:
What I’ve seen, not necessarily this year, but over the past few years, is the trend of acknowledging that we have built this amazing correlation machine. You feed in as much data as you want and it’s going to capture every single statistical correlation that exists within this data.
And now we are actually going to the next level up. So if you think about learning or inference, what is the next step? After correlation is causation. And causation is really important. It connects to what Shirley just said because, if we want to be able to control or steer any of these systems, we have to have a causal understanding of the system. Otherwise, we’re going to make all those random mistakes.
So how we get to large-scale, high-dimensional causal analysis is a big trend. It’s all about framing the problem in a way that maximally benefits from this amazing correlation machine. We’re going more into meta-learning or meta-inference, and that’s what I’m interested in. It naturally connects, of course, to discovery. Without causation, you cannot actually discover things very easily.