FWIW - I used to do research in this area - PINNs are a terribly overhyped idea.
See for example https://www.nature.com/articles/s42256-024-00897-5
Classical solvers are very, very good at solving PDEs. In contrast, PINNs solve PDEs by... training a neural network. Not once, producing something that can be reused later, but every single time you solve a new PDE!
You can vary this idea to try to fix it, but it's still really hard to make it better than any classical method.
As such, the main use cases for PINNs -- they do have them! -- are awkward problems like high-dimensional PDEs or nonlocal operators. There it's not that the PINNs get any better; it's that all the classical solvers fall off a cliff.
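To make the "one training run per PDE" point concrete, here's a minimal sketch of the vanilla PINN recipe (assuming PyTorch; the toy 1D heat equation, network size, and collocation counts are my own illustrative choices, not anything from the article). The PDE residual and initial condition go into a loss, and "solving" means optimising the network weights against it:

```python
import torch

# Illustrative toy problem: u_t = u_xx on (x, t) in [0, 1] x [0, 1], with u(x, 0) = sin(pi x).
net = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def pde_residual(xt):
    xt = xt.requires_grad_(True)
    u = net(xt)
    du = torch.autograd.grad(u.sum(), xt, create_graph=True)[0]
    u_x, u_t = du[:, 0:1], du[:, 1:2]
    u_xx = torch.autograd.grad(u_x.sum(), xt, create_graph=True)[0][:, 0:1]
    return u_t - u_xx                      # ~0 wherever the network satisfies the PDE

for step in range(5000):                   # the entire "solve" is this training loop
    xt = torch.rand(256, 2)                # random collocation points in the domain
    x0 = torch.cat([torch.rand(64, 1), torch.zeros(64, 1)], dim=1)   # points at t = 0
    loss = (pde_residual(xt) ** 2).mean() + \
           ((net(x0) - torch.sin(torch.pi * x0[:, 0:1])) ** 2).mean()  # initial condition
    opt.zero_grad(); loss.backward(); opt.step()
# (Boundary terms omitted for brevity.) Change the PDE, the domain, or the initial
# condition and the whole training run starts again from scratch.
```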
---
Importantly -- none of the above applies to stuff like neural differential equations or neural closure models. These are genuinely really cool and have wide-ranging applications! The difference is that PINNs are numerical solvers, whilst NDEs/NCMs are techniques for modelling data.
/rant ;)
I concur. As a postdoc for many years adjacent to this work, I was similarly unimpressed.
The best part about PINNs is that since there are so many parameters to tune, you can get several papers out of the same problem. Then these researchers get more publications, hence better job prospects, and go on to promote PINNs even more. Eventually they’ll move on, but not before having sucked the air out of more promising research directions.
—a jaded academic
I believe a lot of this hype is attributable to Karniadakis and to how bad a lot of the methods in many areas of engineering are. The methods coming out of CRUNCH (PINNs chief among them) seem more intelligent in comparison, even if they aren't actually, since engineers are happy to take a pure brute-force solution to inverse or model-selection problems as "innovative", haha.
The general rule of thumb to go by is that whatever Karniadakis proposes doesn't actually work outside of his benchmarks. PINNs don't really work, and _his flavor_ of neural operators doesn't really work either.
PINNs have serious problems with the way the "PDE component" of the loss function needs to be posed, and outside of throwing tons of (often Chinese) PhD students and postdocs at them, they usually don't work for actual problems. This is mostly owing to the instability of higher-order automatic derivatives, at which point PINN people go through a cascade of alternative approaches for obtaining those higher-order derivatives. But these are all just hacks.
I love Karniadakis' energy. I invited him to give a talk at my research center, and his talk was fun and really targeted at physicists who understand numerical computing. He gave a good sell and was highly opinionated, which was super welcome. His main argument was that these are just other ways to arrive at an optimisation, and that they worked very quickly with only a bit of data. I am sure he would correct me greatly at this point. I'm not an expert on this topic, but he knew the field very well and talked at length about the differences between one iterative method he developed and the method that Yao Lai at Stanford developed; I had her work on my mind because she spoke at an AI conference I organised in Oslo. I liked that he seemed willing to disagree with people simply because he believed he is correct.
Edit: this is the Yao Lai paper I'm talking about:
https://www.sciencedirect.com/science/article/pii/S002199912...
What do you do now?
https://kidger.site/about/
I work on a team that has actually deployed NN-based surrogate models into production in industry. We don't use PINNs for the simple reason that many industrial-scale solvers are solving significantly more complex systems than a single global PDE (at least in CFD; perhaps other areas are simpler). For instance, close to the boundaries, the solver our engineers use relies on an approximation that does not satisfy conservation of mass and momentum. So when we try to impose physical constraints, our accuracy goes down. Even in the cases where we could technically use PINNs, we find them underwhelming, and spending time on crafting better training data sets has always been a better option for us.
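For anyone wondering what the data-driven alternative looks like, here is a minimal sketch (PyTorch; the parameter/output dimensions are arbitrary and the random tensors are placeholders for data exported from an in-house solver): a plain regression surrogate with no physics term in the loss, where all the effort goes into the dataset.

```python
import torch

# Purely data-driven surrogate: regress solver outputs from design parameters.
# No physics term in the loss -- accuracy comes from the quality of the dataset.
params = torch.rand(10_000, 8)    # placeholder: geometry / boundary-condition parameters
outputs = torch.rand(10_000, 3)   # placeholder: quantities of interest from the solver

surrogate = torch.nn.Sequential(
    torch.nn.Linear(8, 128), torch.nn.ReLU(),
    torch.nn.Linear(128, 128), torch.nn.ReLU(),
    torch.nn.Linear(128, 3),
)
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)

for step in range(1000):
    idx = torch.randperm(params.shape[0])[:256]          # random mini-batch
    loss = torch.nn.functional.mse_loss(surrogate(params[idx]), outputs[idx])
    opt.zero_grad(); loss.backward(); opt.step()
```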
A friend and I were discussing PINNs, and he made an argument against them: the Bitter Lesson. PINNs are a way of incorporating domain knowledge into ML models. The Bitter Lesson, for those unaware, is a famous essay by Rich Sutton arguing that the history of AI is full of methods guided by domain/expert knowledge, but that ultimately all such methods were overtaken by methods that simply scaled data and/or computation.
http://www.incompleteideas.net/IncIdeas/BitterLesson.html
I would love to hear HN's take on this argument.
Computer science is currently subservient to an economic climate in which the only viable business is one that scales revenue without scaling labor. That's the bitter lesson.
But isn't that the story of technology and civilization? Hunter-Gatherers produced no surplus and couldn't support a complex and unequal society. Early agriculture could support a pyramid, but not very high.
It used to be that almost everyone worked in agriculture; now about 1% does, so the others are free to do something else. Prior to the microprocessor, making a computer required manual assembly of thousands of parts. Early microprocessors contained thousands of parts manufactured by a small number of photographic and chemical steps, and the number of parts has since grown into the billions without the number of steps expanding millions of times.
Improving labour productivity is good, actually
The Bitter Lesson only applies when you don't know the function you are modeling (e.g. perfect chess strategy or text-to-speech).
On the contrary, neural networks will never give you a better way to convert to the frequency domain than the Fourier Transform. At best, they might approximate it.
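Concretely (NumPy; a small N just for illustration): the DFT is a fixed, exactly known linear map, so a network trained on (signal, spectrum) pairs can only try to re-learn a matrix we already have, while the FFT evaluates it exactly in O(N log N).

```python
import numpy as np

# The DFT is a known linear map: X = F x, with F[j, k] = exp(-2i*pi*j*k/N).
N = 8
x = np.random.randn(N)
F = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N)
assert np.allclose(F @ x, np.fft.fft(x))   # exact, no learning required
```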
Right. And the use-case for PINNs is, e.g., modeling a known function that is not solvable analytically?
>And the use-case for PINNs is, e.g., modeling a known function that is not solvable analytically?
No. That is the use case for numerical analysis, where you can develop a high performance, accurate algorithm based on the mathematical analysis of the problem.
It is really kind of silly to want to encode the solution in some neural network.
So, what's the use-case for PINNs, then?
Allegedly you can use it to solve PDEs, although that does not seem to work well.
Maybe there are some very special problems, without well-developed solvers, where they work.
Scaling data is not always possible. It's really hard to get your hands on good labelled medical imaging data, for instance. Maybe it makes sense to try to incorporate insights from biology and physiology instead of hoping that the neural net will "get it" from seeing enough data.
>Maybe it makes sense to try to incorporate insights from biology and physiology instead of hoping that the neural net will "get it" from seeing enough data.
For medical imaging your solver is most likely "physics based" in any case. PINNs want to encode the physics inside a neural network, instead of developing an appropriate algorithm (which obviously has to also incorporate the physics) which solves the problem and which can be analyzed mathematically.
It's not really an argument, it's more of an observation; and certainly not meant to be taken as a law, as in "thou shalt not use background knowledge even if you can".
See, the reason the Bitter Lesson is a lesson is that neural nets, which Sutton is mainly writing about, are pretty crap at representing background knowledge. The only way you can store expert knowledge in a neural net is to modify its structure and its weights. The weights you can only modify, in practice, by some kind of learning procedure like backprop. Very limited forms of background knowledge, like convolutions, can be encoded in a neural net's structure, but imagine trying to represent, I don't know, the last ten lines of code you wrote today as a bunch of neural net connections. Continuous functions are just not the right kind of notation for that sort of thing.
If neural nets were any better at encoding background knowledge, they would use it, but they can't, so they have to rely on data. And that's why they need so much of it. Background knowledge functions as a strong inductive bias: it directs the search towards hypotheses that we know make sense (again, think of convolutions). Without background knowledge, or with only a little of it, you need tons of examples to learn anything useful.
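A tiny illustration of the convolution point (PyTorch; the sizes are arbitrary): a convolution bakes locality and translation equivariance into the architecture, which is exactly the kind of background knowledge a dense layer would otherwise have to learn from data.

```python
import torch

# Same 1000-sample signal, two ways of mapping it to 1000 outputs:
dense = torch.nn.Linear(1000, 1000)                      # no structural assumptions
conv = torch.nn.Conv1d(1, 1, kernel_size=5, padding=2)   # locality + weight sharing baked in

print(sum(p.numel() for p in dense.parameters()))  # 1_001_000 free parameters
print(sum(p.numel() for p in conv.parameters()))   # 6 free parameters
```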
So the Bitter Lesson is basically making a virtue out of necessity. In any case, it's not a prescriptive thing, only descriptive.
Thanks for the thought-out reply. That said, we disagree on quite a few points. A 'lesson' is something to be learned from, and Rich Sutton explicitly mentions in his conclusion what we should learn from this lesson. But it's indeed not law. His argument is also not limited to ANNs, nor even ML. His first example concerns state space search and chess. In general, I think the field is moving further away from expert/domain knowledge and towards reliance on copious amounts of data and computation. End-to-end learning embodies this. LLMs incorporate essentially no domain knowledge about human language (processing). But of course it remains a matter of degree — certain domain knowledge remains very useful.
Yes, I remember the argument. In computer chess the first AI system to dominate humans, IBM's Deep Blue, was based entirely on search and an opening book, so search + domain knowledge. Then AlphaGo, which dominated in Go, was based on search, an opening book, and a pair of self-playing deep neural nets, and was equipped with the rules of Go. AlphaZero, which followed, dropped the opening book but was still based on search and self-playing neural nets and was still given the rules of Go (and chess and Shogi, if memory serves). And MuZero, which followed that, dropped everything but the search and the self-playing neural nets.
Now, Sutton argues that at least some of those systems did not rely on domain knowledge. They all did: Monte Carlo Tree Search, used for board-game-playing AI agents, is nothing but an encoding of domain knowledge -- specifically, knowledge about the structure of two-player, complete-information games. It's the same domain knowledge that was used to create the minimax-based Deep Blue software that won against Kasparov.
Neural nets have still not managed to win against human players in board games without incorporating such a strong, knowledge-dependent component as game-tree search.
So Sutton is fudging the details. Not on purpose. He's an RL person. In RL, as in planning and other disciplines, folks tend to forget all the knowledge they put into their systems in the form of inductive biases, or auxiliary (but can't-do-without) algorithms like MCTS. He's like the proverbial fish that doesn't know what water is, because it swims in it.
It works when data is cheap, and model size is not an issue.
This article seems like it was at least partially written by AI. Lots of fluff and no clear explanation of what PINNs are and how they work (other than the code).
Agreed. Stylistic hints include the heavy use of bulleted lists with bold headings, and the general lack of concern with justifying any vaguely plausible-sounding assertion ("The PINN approach ensures physical consistency, efficient computation, and accurate generalization from limited data.")
I think someone who cared about the specific content would at least note that linear PDEs like the heat equation often have closed-form solutions and/or efficient algorithms for solving any particular problem, so aren't likely to be usefully solved with PINNs.
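For reference, here's roughly what "efficient algorithms for any particular problem" means for the heat equation: a minimal explicit finite-difference scheme (NumPy; the grid sizes are arbitrary and real codes would use something more sophisticated) that runs in milliseconds and has a well-understood stability condition.

```python
import numpy as np

# Explicit finite differences for u_t = alpha * u_xx on [0, 1], with u = 0 at the boundaries.
alpha, nx, nt = 1.0, 101, 20_000
dx, dt = 1.0 / (nx - 1), 1.0 / nt
assert alpha * dt / dx**2 <= 0.5       # classical stability condition for the explicit scheme

x = np.linspace(0.0, 1.0, nx)
u = np.sin(np.pi * x)                  # initial condition u(x, 0) = sin(pi x)
for _ in range(nt):
    u[1:-1] += alpha * dt / dx**2 * (u[2:] - 2 * u[1:-1] + u[:-2])
# The exact solution for this problem is exp(-pi^2 * alpha * t) * sin(pi * x),
# so the accuracy of the scheme is easy to verify.
```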
I agree with the wave of criticisms surrounding PINNs.
As a researcher currently working on a project involving PINNs for cardiac biomedical applications, I can say that industry players in my field generally steer clear of PINNs. The core issue lies in the approach of approximating physical laws through a loss function.
Basically: instead of leveraging mathematically robust solvers, PINNs attempt to encode the underlying physics (e.g., conservation laws) as constraints within the loss function of a NN. Clearly this is a problem for systems requiring high fidelity in physical accuracy...
PINNs certainly do not fulfill their promises; the idea of outperforming a traditional PDE (or even just ODE) solver by training a neural network is, on its face, ridiculous.
It is very telling that the motivating example in https://physicsbaseddeeplearning.org/intro-teaser.html totally fails. If you read further you will notice that all the other examples also fail, and I mean "fail" as in the result is literally useless. (Reading this was genuinely upsetting.)
The one legitimate area where I can see them being used is interpolation in complex engineering/physics applications. Surrogate models actually can be a worthwhile endeavor if you need to solve a complex problem over a big parameter space. But of course this is fundamentally different from training a neural network to solve a single PDE.
Even the idea is fundamentally flawed: ODE solvers are well developed and based on sound mathematical theory. The idea that you can outperform them with a neural network is, on its face, pretty silly.
It is quite funny to me that the author first states that high computational cost is among the main challenges of solving PDEs, and then carries on by modeling the heat equation using PINNs.
Can PINNs go away? I do CFD, and they show zero promise or applicability for real problems. Just a waste of money.