For all the hate that Google (rightly) gets for some of their work in other domains, I appreciate that they continue to put major resources behind using AI to try and save lives in medicine and autonomous driving.
Easy to take for granted, but their peer companies are not doing this type of long term investment.
I think it's important that people know this. Despite what the other AI companies claim or put out as occasional PR, they have absolutely no real interest (through internal work or funding external researchers) in using AI to benefit science and humanity as a whole. They just want their digital god. As such, there is simply not enough funding for AI research with scientific applications. Consequently, many people in machine learning are not working in scientific applications, even though they really want to.
Someone has to do it. Big pharma has a lot of money, and if AI can reduce their human-resource costs, they will be willing to put some of their profits aside to further research in the AI space.
Money wells are drying up across the tech industry, and AI companies will have to look for funds from adjacent industries like biotech and medicine.
It doesn't compensate for all the other bad stuff Google's doing. Long gone are the times when I looked up to Google as a champion of good technologies. When I read this, I'm just sad that such an important step in humankind's technological future is in the hands of evil (or at least evil-ish) companies.
Sometimes it feels like Google are so far ahead in AI, but all we get to see are mediocre LLMs from OpenAI. Like they're not sharing the really good stuff with everyone.
I think I believe OpenAI's claim that they have better models that are too expensive to serve up to people.
I think Google have only trained what they feel they need, and not a megamodel, but I can't justify this view other than as some kind of general feeling. They obviously know enough to make excellent models, though, so I doubt they're behind in any meaningful sense.
This has been said by enough people in the know to be considered true by now. Not just from OpenAI; Anthropic and Meta have said this before too. You train the best of the best, and then use it to distill/curate/inform the next training run, on something that makes sense to serve at scale. That's how you get from GPT-4 / o3 prices ($80/$60 per Mtok) to GPT-5 prices ($10 per Mtok) to GPT-5-mini ($2 per Mtok).
Then you use a combination of the best models to amplify your training set, and enhance it for the next iteration. And then repeat the process at gen n+1.
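A minimal sketch of that loop, with toy stand-in classes (nothing here is any lab's real API or training stack):

    # Illustrative distill/curate loop. FrontierModel and SmallModel are
    # toy placeholders; real pipelines train neural networks at scale.
    class FrontierModel:
        # Expensive teacher: best quality, too costly to serve at scale.
        def generate(self, prompt):
            return "high-quality answer to: " + prompt  # placeholder output

    class SmallModel:
        # Cheap student: the model that actually gets served.
        def __init__(self):
            self.memory = {}
        def fit_step(self, prompt, target):
            self.memory[prompt] = target  # real training would take gradient steps

    prompts = ["explain MHC-I", "summarize CK2 biology"]
    teacher, student = FrontierModel(), SmallModel()
    for p in prompts:  # the teacher amplifies/curates the training set
        student.fit_step(p, teacher.generate(p))
    # Gen n+1: the improved student (plus curation) seeds the next run.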
When they do, someone like OpenAI comes along and runs the VC enshittification playbook on it.
From what I understand, the model was used to broaden a search that was already conducted by humans. It's not like the model has devised new knowledge. Kind of a low-hanging fruit. But the question is: how many of these can be reaped? Hopefully a lot!
("low hanging fruit", well, not the right way to put it, Google's model are not exactly dumb technology)
> What made this prediction so exciting was that it was a novel idea. Although CK2 has been implicated in many cellular functions, including as a modulator of the immune system, inhibiting CK2 via silmitasertib has not been reported in the literature to explicitly enhance MHC-I expression or antigen presentation. This highlights that the model was generating a new, testable hypothesis, and not just repeating known facts.
ah ok, my bad. My worst post of the week :-)
This just in: "Excel helped discover a potential cancer cure using pivot tables"
Reading comments around AI is always fun
> It's not like the model has devised new knowledge. Kind of a low hanging fruit.
Just keep moving goalposts.
Look through any of the single cell atlas papers that are published all the time and you'll see slightly different methods, called machine learning in the past and AI now, used to achieve exactly the same thing. Every form of AI has continuously produced new results like this throughout the history of genomics.
The reason you are reading about this is because 1) Gemma has a massive, massive PR budget, whereas scientists have zero PR budget, and 2) it's coming from Google rather than traditional scientists, and when Google publishes something new, it makes it to HN.
I don't see any reason to be excited by the result here. It is a workaday result, using a new tool. I'm usually excited by new tools, but given the source, it's going to take a lot of digging to get past the PR spin, so that extra needless work seems exhausting.
Facebook's protein language modelling, followed by Google's AlphaFold, did have really new and interesting methods! But single cell RNA models have been underwhelming because there's no easy "here is the ground truth" out there like there is for proteins. We won't know if this is a significant advancement until years of study show which of the many scRNA foundation models make better predictions. And there was a paper about a year ago that poured a ton of cold water on the whole field: replacing the models with randomly initialized weights barely changed the results on the very limited evaluation sets that we have.
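That ablation is easy to sketch; here is a toy version (toy data and a toy encoder standing in for the actual scRNA foundation models the paper tested):

    # Linear-probe the embeddings of a "trained" encoder, then wipe the
    # weights and probe again. The cold-water finding was that the two
    # scores end up close on the available evaluation sets.
    import torch
    import torch.nn as nn
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    torch.manual_seed(0)
    X = torch.randn(500, 2000)        # toy cells-x-genes expression matrix
    y = (X[:, 0] > 0).long().numpy()  # toy cell-type labels

    encoder = nn.Sequential(nn.Linear(2000, 256), nn.ReLU(), nn.Linear(256, 64))

    def probe_score(enc):
        with torch.no_grad():
            Z = enc(X).numpy()
        return cross_val_score(LogisticRegression(max_iter=1000), Z, y, cv=5).mean()

    before = probe_score(encoder)  # stand-in for the pretrained model
    for m in encoder.modules():    # the ablation: random re-initialization
        if isinstance(m, nn.Linear):
            nn.init.normal_(m.weight, std=0.02)
            nn.init.zeros_(m.bias)
    after = probe_score(encoder)
    print(before, after)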
What goalposts do you think were moved? Please elaborate.
Every AI post is filled with negative comments stating that AI can just regurgitate stuff. This states otherwise, and the comment I replied to tries to downplay that.
Remarkably, some claim AI has now discovered a new drug candidate on its own. Reading the preprint (https://www.biorxiv.org/content/10.1101/2025.04.14.648850v2....), it appears the model was targeted at just one very specific task, and other models weren't evaluated on the same task. I know nothing about genes, and even I can see that this is an important advance. However, it seems a bit headline-grabbing to claim victory for one model without comparing against others using the same process.
If someone discovers anything, it does not change anything if someone else could have discovered it theoretically as well?
If a simple majority classifier has the same performance as a fancy model with 58 layers of transformers, and you use your fancy model instead of the majority classifier, is it the model that's doing the discovery, or is it the operator who chose to look in a particular place?
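Concretely, the comparison would look something like this (synthetic data; the logistic regression is just a stand-in for the fancy model):

    # If the big model can't beat DummyClassifier, the "discovery" lives
    # in the operator's choice of where to look, not in the model.
    from sklearn.datasets import make_classification
    from sklearn.dummy import DummyClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
    baseline = DummyClassifier(strategy="most_frequent")  # ~0.90 from class balance alone
    model = LogisticRegression(max_iter=1000)             # stand-in for 58 transformer layers
    print(cross_val_score(baseline, X, y, cv=5).mean())
    print(cross_val_score(model, X, y, cv=5).mean())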
I am all for crediting humans and I don't particularly fancy all the anthropomorphising myself. However rubbing it in now feels similarly pointless as suggesting the US should switch to metric.
Well, it's important, because this particular new lead for drug targeting is not super valuable; such leads are a dime a dozen, easier to find than a startup idea. Actually driving a successful drug development program is an entirely different matter that can only be established with $10-$100M of early exploration, with a successful drug costing much more to get to market.
It could also be that the particular prioritization method that uses Gemma is useful in its own right, but we won't know that unless it is somehow benchmarked against the many alternatives that have been used up until now. And in other benchmark settings, these cell sentence methods have not been that impressive.
This is so awesome. Hoping those in the biology field can comment on the significance.
It is awesome.
But what I’ll say is, ideally they would demonstrate whether this model can perform any better than simple linear models for predicting gene expression interactions.
We’ve seen that some of the single cell “foundation” models aren’t actually the best at in silico perturbation modeling. Simple linear models can outperform them.
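For reference, the kind of simple linear baseline meant here, on toy data (real evaluations use perturb-seq style datasets):

    # Predict post-perturbation expression as a linear map of baseline
    # expression; a foundation model should at least beat this R^2.
    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.standard_normal((800, 100))                 # baseline expression
    W = 0.1 * rng.standard_normal((100, 100))
    Y = X @ W + 0.05 * rng.standard_normal((800, 100))  # "perturbed" expression

    X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)
    print(Ridge(alpha=1.0).fit(X_tr, Y_tr).score(X_te, Y_te))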
So this article makes me wonder: if we take this dataset they’ve acquired, and run very standard single cell RNA seq analyses (including pathway analyses), would this published association pop out?
My guess is that yes… it would. You’d just need the right scientist, right computational biologist, and right question.
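Those standard analyses amount to a short scanpy script; the file name, grouping column, and group label below are hypothetical:

    # Standard scRNA-seq differential expression. If the silmitasertib/
    # MHC-I association is robust, it should surface in a ranking like this.
    import scanpy as sc

    adata = sc.read_h5ad("experiment.h5ad")  # hypothetical dataset
    sc.pp.normalize_total(adata, target_sum=1e4)
    sc.pp.log1p(adata)
    sc.tl.rank_genes_groups(adata, groupby="treatment", method="wilcoxon")
    print(sc.get.rank_genes_groups_df(adata, group="silmitasertib").head())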
However, I don’t say this to discredit the work in TFA. We are still in the early days of scSeq foundation models, and I am excited about their potential.
Cellular-level computational simulation has existed for a very long time, and it's more impressive by the day because of the large collections of experimental datasets available.
However, to infer or predict cellular activities you need a ton of domain knowledge and expertise about particular cell types, biological processes, and specific environments. Typically the successful ones are human-curated and validated (e.g. large interaction networks based on literature).
In cancer it's even more unpredictable because of the lack of good (experimental) models, in vivo or in vitro, representing what actually happens clinically and biologically underneath. At single cell resolution, the uncertainty is amplified further by how heterogeneous tumours are, both inter- and intra-tumour.
Having said that, a foundation model is definitely the future for further development. But as with all of these things, the bigger the model, the harder the validation process.
Meanwhile OpenAI going into the porn business
If you can have porn without the human trafficking and exploitation associated with the porn industry, that's a big win too
OTOH... time for a wall-e rewatch
Their research is pivoting to STDs
Both are equally important, just in another dimension.
Let's go !!!
Hell yeah
Other potential cancer treatment methods that 2.5 Pro (a different model from the one referenced in the article) has confirmed as potentially viable when prompted by an amateur cancer researcher:
- EPS3.9: a polysaccharide (a fermentable deep-sea bacterial sugar that induces IFN-1). The proposed chain: EPS3.9 causes pyroptosis, which causes IFN-1, which causes epitope spreading (an amplifying effect), which causes an anti-cancer response.
- CPMV (Cowpea Mosaic Virus): a plant virus that doesn't infect humans but triggers an IFN-1 (IFN-alpha and a lot of IFN-beta) anti-cancer response in humans. Cowpea consumption was probably even more prevalent in humans before modern agriculture; cowpeas may have been treating cancer in humans for thousands of years at least.
I emailed these potential new treatments to various researchers with a fair disclaimer; but IDK whether anything has been invested in developing a treatment derived from or informed by knowledge of the relevant pathways affected by EPS3.9 or CPMV.
There are RNA and mRNA cancer vaccines in development.
Without a capsid, RNA is destroyed before arrival. So RNA vaccines are usually administered intramuscularly.
AFAIU, as a general bioengineering platform, CPMV could also be used like a capsid to package, for example, an RNA cancer vaccine.
AFAIU, CSC3.9 (which produces the "potent anti-cancer" EPS3.9 marine Spongiibacter polysaccharide) requires deep-sea pressure; but it's probably possible to bioengineer an alternative to CSC3.9 that produces EPS3.9 in conditions closer to ambient temperature and pressure?
> Would there be advantages to (CPMV + EPS3.9) + (CPMVprime + mRNA)? (for cancer treatment)
I am concerned about this kind of technology being used to circumvent traditional safeguards and international agreements that prevent the development of biological weapons.
Well you might be pleased to know that there are large safety teams working at all frontier model companies worried about the same thing! You could even apply if you have related skills.
Are these safety teams subject to the oversight of more established international agreements and safeguards?
> the oversight of more established international agreements and safeguards?
“Unlike the chemical or nuclear weapons regimes, the [Biological Weapons Convention] lacks both a system to verify states' compliance with the treaty and a separate international organization to support the convention's effective implementation” [1].
[1] https://en.wikipedia.org/wiki/Biological_Weapons_Convention
The UN has a page on Biological Weapons.
https://disarmament.unoda.org/en/our-work/weapons-mass-destr...
You can read more about the efforts for disarmament there.
Where is there oversight?
Biological weapons compliance is entirely voluntary. We don’t have international monitors watching America and Russia’s smallpox stockpiles. That’s left to each nation.
Your angle is "there is no oversight, why are you asking for it?". It's the same overall angle as the guy who was spreading covid rumours here in this thread.
There are efforts at establishing it, though. And it's hard and expensive for wet labs, but it could be much simpler for things like simulating biological pathways.
One could also see your response as "other nations are developing threats, we should race", which I personally think is misguided.
Instead of these petty armchair discussions, we should focus on being more serious about it.
I mean.. they work within the legal frameworks of very large corporations with nation state engagement. It's not like they're autonomous anonymous DAOs
Hi! I work directly on these teams as a model builder and have talked to my colleagues at the other labs as well.
All our orgs have openings, and you could also consider working for organizations such as the UK AISI team and other independent organizations that are assessing these models. It's a critical field and there is a need for motivated folks.
That does not answer my question.
I thought OpenAI gave up on safety when Anthropic splintered off, as well as when they engaged Scale AI to traumatize people for RLHF?
Or Google when they fired Timnit?
The GPT5 system card is 59 pages. Pages 5 to 56 address safety in various forms.
https://cdn.openai.com/gpt-5-system-card.pdf
Seems like no matter how positive the headline about the technology is, there is invariably someone in the comments pointing out a worst case hypothetical. Is there a name for this phenomenon?
Performative cynicism?
Rational discourse? Not working for a marketing team? Realism?
Not believing everything you read on the internet? Being jaded from constant fluff and lies? Not having Gell-Mann amnesia?
I get your sentiment of "why you gotta bring down this good thing" but the answer to your actual question is battle scars from the constant barrage of hostile lies and whitewashing we are subject to. It's kind of absurd (and mildly irresponsible) to think "THIS time will be the time things only go well and nobody uses the new thing for something I don't want".
Pessimism?
We've just had a virus - specifically engineered to be highly infectious to humans - escape a lab (which was running at a very lax safety level - BSL2 instead of the required BSL4), killing millions and shutting down half the globe. So I'm wondering what safeguards and prevention you're talking about :)
You're trying to deflect the discussion into a polemic tarpit. That's not going to work.
I do not endorse the view that covid was engineered. Also, I consider it to be unrelated to what I am concerned about, and I will kindly explain it to you:
Traditional labs work with the wet stuff. And there are a lot of safeguards (the levels you mentioned didn't come out of thin air). Of course I am in favor of enforcing the existing safeguards to the most ethical levels possible.
However, when I say that I am concerned about AI being used to circumvent international agreements, I am talking about loopholes that could allow progress in the development of bioweapons without the use of wet labs. For example, by carefully weaving around international rules and doing the development using simulations, which can bypass outdated assumptions that didn't foresee that this could be possible when they were conceived.
This is not new. For example, many people were concerned about research on fusion energy related to compressing fuel pellets, which could be seen as a way of weaving around international treaties on the development of precursor components to more powerful nuclear weapons (better triggers, smaller warheads, all kinds of nasty things).
> For example, by carefully weaving around international rules and doing the development using simulations, which can bypass outdated assumptions that didn't foresee that this could be possible when they were conceived.
Covid development in Wuhan was exactly such careful weaving - by means of laundering through EcoHealth - around the official rule of "no such dangerous GoF research on US soil". Whether such things are weaved away offshore or into virtual space is just a minor detail of implementation.
Still irrelevant to what I brought up.
Don't spread misinformation. This myth is widely believed only by Americans.
https://en.wikipedia.org/wiki/COVID-19_misinformation#Virus_...
He says and quotes Wikipedia.
This myth is documented in the EcoHealth Alliance publicly available NIH and DARPA grants documents among others. Wrt your link - Wikipedia unfortunately isn’t subject to the law like those grants.
Covid is irrelevant to the discussion I opened. You're trying to steer the discussion into a place that will lead us nowhere, because there are too many artificial polemics around it.
The only thing to be said about it that resonates with what I'm concerned with is that anyone that is good in the head wants better international oversight on potential bioweapons development.