Games don't need lifetime management at object granularity.
The code snippets were very cool.
I'll check out hoot!
Overall I don't think Wasm GC is for realtime graphics.
Wouldn't the garbage collector cause a framerate drop when rendering? Usually garbage collectors are not used for graphics heavy apps.
My impression was that WasmGC was perfect for writing business logic in Go, C#, or other garbage-collected languages that previously didn't run well in Wasm.
You can absolutely do graphics rendering with GC'd languages. The trick is the same as with manual memory management: minimize allocation. You mention C#, which has been used very successfully in the games industry.
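A minimal sketch of that trick (all names and numbers here are illustrative, not from the article): preallocate once, mutate in place, and upload from the same buffer each frame, so the hot loop produces no garbage.

```ts
// Preallocate once; the render loop never creates new objects or arrays.
const MAX_PARTICLES = 10_000;
const positions = new Float32Array(MAX_PARTICLES * 2); // x,y pairs, reused every frame
let count = 0;

function update(dt: number): void {
  for (let i = 0; i < count; i++) {
    positions[i * 2] += dt * 10;     // mutate in place instead of allocating
    positions[i * 2 + 1] += dt * 5;
  }
}

function upload(gl: WebGL2RenderingContext, buffer: WebGLBuffer): void {
  gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
  // WebGL2 overload: upload only the live portion of the preallocated array
  gl.bufferSubData(gl.ARRAY_BUFFER, 0, positions, 0, count * 2);
}
```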
> Unsatisfying workarounds [...] Use linear memory for bytevectors
It never makes sense to use GC for leaf memory if you're in a language that offers both, since mere refcounting (or a GC'ed object containing a unique pointer) is trivial to implement.
There are a lot of languages where it's expensive to make the mistake this post is making. (I don't know much about WASM in particular; it may still have other errors).
> This would require a little malloc/free implementation and a way to reclaim memory for GC'd bytevectors.
Allocation is always expensive in the hot path. Arenas, pools, etc. are a thing even in C engines for a reason. If it's possible to mix GC and linear, that's ridiculously powerful.
In Scheme what you want to do is allocate a big bytevector and use it over and over. This is what I already do outside of Wasm. I don't want or need linear memory involved, I just want access to my (array i8) from JS.
Well I just thought of something obvious... Have a function that lets you pass in an ArrayBuffer, which brings it into the virtual address space of the WASM program. The function would return the virtual address assigned to that ArrayBuffer. From there, you call into WASM again with that pointer, and the program can take action.
Then there would be another function to relinquish ownership of the ArrayBuffer.
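For comparison, a sketch of what you can do today without such a remapping API, assuming the Wasm module exports `memory`, `malloc`, `free`, and some `process` function (all names are placeholders): copy the ArrayBuffer's bytes into linear memory, hand the pointer to Wasm, then free it. The copy is exactly the overhead the proposed function would avoid.

```ts
function processBuffer(instance: WebAssembly.Instance, input: ArrayBuffer): void {
  // Assumed exports of the module; real names depend on the toolchain.
  const { memory, malloc, free, process } = instance.exports as {
    memory: WebAssembly.Memory;
    malloc(size: number): number;
    free(ptr: number): void;
    process(ptr: number, len: number): void;
  };
  const len = input.byteLength;
  const ptr = malloc(len);
  // Copy the caller's bytes into Wasm linear memory.
  new Uint8Array(memory.buffer, ptr, len).set(new Uint8Array(input));
  process(ptr, len); // the Wasm side reads/writes the bytes in place
  free(ptr);
}
```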
There is, but then you'd need to declare the entire WASM heap as a single SharedArrayBuffer. It only makes sense for shared-memory multithreading (note that SharedArrayBuffer support only works in 'cross-origin isolated' contexts).
Better, but WebGPU and WebGL aren't going to win any performance prizes either, and tooling is pretty much non-existent.
Nothing like Pix, Instruments or Renderdoc, SpectorJS is the only thing you get after almost 15 years since WebGL 1.0.
And at the hardware level they support, it's about PlayStation 3 kind of graphics, assuming the browser doesn't block the GPU or select the integrated one instead of the dedicated one.
You are left with shaders as the only way to actually push the hardware.
True, it's just that the topic of this post seemed strange to me, since you wouldn't use a programming language with GC for a high-intensity graphics app natively either, hmmm.
You can't GC together with the host environment if you do a custom GC (i.e. a wasm object and a JS object in a cycle wouldn't have any way to ever be GC'd).
Not really. As I understand it, WASM offers no facility to walk the stack, which tracing garbage collectors need in order to find roots. The only solution here is to manually maintain your own shadow stack on the heap and force all data to live there instead of in registers. This is a huge performance penalty.
It's sort of baffled me that people appear to be shipping real code using WasmGC since the limitations described in this post are so severe. Maybe it's fine because they're just manipulating DOM nodes? Every time I've looked at WasmGC I've gone "there's no way I could use this yet" and decided to check back a year later and see if it's There Yet.
Hopefully it gets there. The uint8array example from this post was actually a surprise to me, I'd just assumed it would be efficient to access a typed array via WasmGC!
Beyond the limitations in this post there are other things needed to be able to target WasmGC with existing stuff written in other languages, like interior references or dependent handles. But that's okay, I think; it can be worthwhile for it to exist as-is even if it can't support e.g. existing large-scale apps in memory-safe languages. It's a little frustrating though.
>> The uint8array example from this post was actually a surprise to me, I'd just assumed it would be efficient to access a typed array via WasmGC!
The problem is that the Scheme i8 array is not actually a Uint8Array with WasmGC. It's a separate heap-allocated object that is opaque to the JS runtime.
In the linear memory Wasm model, the Scheme i8 array is allocated in the Wasm memory, and so one can create a Uint8Array view that maps exactly to the same bytes in the linear memory buffer. This isn't possible (yet?) with the opaque WasmGC object type.
Yes, that's right. I'm hoping there will be a way to do this in a future revision of Wasm GC.
I'm a happy wasm user. Last week I was looking at adding support for PSD files in my app [1]; the choice was between using a full JS lib that is almost 1MB, doesn't work very well, and is slow, OR leveraging some C code. Without any kind of optimisation, the wasm version is 10 times faster and 5 times smaller, and that's before digging into SIMD. There will always be people complaining about a lot of things, but wasm is already good enough for a lot of use cases.
[1] https://github.com/mickael-kerjean/filestash
Hope my post didn't come across as complaining because I agree! Wasm is great right now for lots of things, just wanted to highlight a use case that isn't great yet.
I've been shipping a Flutter app that uses it for months. Pretty heavy stuff, it's doing everything from LLM inference to model inference to maintaining a vector store and IndexedDB in your browser.
Frame latency feels like it's gone, there's 100% a significant decrease in perceived latency.
I did have frustrating performance issues with 3rd party code doing "source code parsing" via RegEx; I thought it was either the library's or Flutter's fault, but from the article content, it sounds like it was WASM GC. (I saw a ton of time spent converting objects between JS<->WASM on a 50 KLOC file)
From that perspective, the article sounds a bit maximalist in its claims, but only from my perspective.
I think if you read "real time graphics" as "3d game" it gives a better understanding of where it's at, my anecdata aside.
When you said "jump in perceived latency", did you mean perceived latency went up or down?
Down, significantly
What's the name of the app I want to try it out
Which libraries caused these problems for you?
Don't wanna name names, because it's on me, it's a miracle it exists, and works.
I don't think there's a significant # of alternatives, so hopefully Flutter syntax highlighting library, as used in a package for making markdown columns, is enough to be helpful.
Problem was some weird combo of lots of regex and an absolutely huge amount of code. It's one of those problems it's hard for me to draw many conclusions from:
- Flutter may be using browser APIs for regex, so there's some sort of JS/WASM barrier copying cost
- The markdown column renderer is doing nothing at all to handle this situation lazily, i.e. if any portion of the column is displayed, syntax highlighting must be done on the complete markdown input
- Each different color of text, so pretty much every word, gets its own object in the view hierarchy, tens if not hundreds of thousands in this case. Can't remember if this is due to the syntax highlighting library or the markdown package
- Regex is used to parse the code, and for all I know one of them has pathological performance, like unintentional backtracking.
Definitely a lot is missing, yeah, and adding more will take time. But it works well already for pure computational code. For example, Google Sheets uses WasmGC for Java logic:
https://web.dev/case-studies/google-sheets-wasmgc#the_final_...
Really liked the NaCl (and PNaCl) idea, which allowed running arbitrary code, sandboxed, at ~90% of native execution speed. Playing the Bastion game in the browser was refreshing. Unfortunately, communication with JS code and bootstrap issues (couldn't run code without a plugin; no one except Chrome supported it) ruined that tech.
WASM nowadays has become quite the monstrosity compared to NaCl/PNaCl. Just look at this WASM GC spaghetti: trying to compile a GC'd language but hooking it up to V8/JavaScriptCore's GC, while upholding a strict security model... That sounds like it won't cause any problems whatsoever!
Sometimes I wonder if the industry would have been better off with NaCl as a standard. Old, mature tooling would by and large still be applicable (it's still your ordinary x86/ARM machine code) instead of the nascent and buggy ecosystem we have now. I don't know why, but the JS folks just keep reinventing everything all the time.
> Old, mature tooling would by and large still be applicable (it's still your ordinary x86/ARM machine code)
It wasn't, though. Since NaCl ran code in the same process as the renderer, it depended upon a verifier for security, and required the generated code to follow some unusual constraints to support that verification. For example, on x86, all branch targets were required to be 32-byte aligned, and all indirect branches were required to use a specific instruction sequence to enforce that alignment. Generating code to meet these constraints required a modified compiler, and reduced code density and speed.
In any case, NaCl would have run into the exact same GC issues if it had been used more extensively. The only reason it didn't was that most of the applications it saw were games which barely interacted with the JS/DOM "world".
I simplified in my comment. It was a much better story for tooling, since you could reuse large parts of existing backends/codegen, optimization passes, and debugging. The mental model of execution would remain too, rather than being a weird machine code for a weird codesize-optimized stack machine.
I would wager the performance implications of NaCl code, even for ARM which required many more workarounds than x86 (whose NaCl impl has a "one weird trick" aura), were much better than for modern WASM.
It's hard to say if it would've run into the same issues. For one, it would've been easier to port native GCs: they don't run afoul of W^X rules, they mostly just read memory, which you can do performantly in NaCl on x86 thanks to the segment trick. I also suspect the culture could've more easily evolved towards shared objects, where you would be able to download/parse/verify a stdlib once and then keep using it.
I agree it was because the applications were games, but for another second-order reason: they were by and large C/C++ codebases where memory was refcounted manually. Java was probably the second choice, but those were the days when Java applets were still auto-loading, so there was likely no need for anybody to try.
> It's hard to say if it would've run into the same issues. For one, it would've been easier to port native GCs...
WASM GC isn't just about memory management for the WASM world; it's about managing references (including cyclical refs!) which cross the boundary into the non-WASM world. Being able to write a GC within the WASM (or NaCl) world doesn't get you that functionality.
> WASM nowadays has become quite the monstrosity compared to NaCl/PNaCl
It's really the other way around, NaCl/PNaCl was quite the monstrosity that didn't fit into the browser runtime environment at all and required completely separate APIs to access 'platform features' - while access to existing web APIs had to go through an incredibly slow and cumbersome messaging layer - e.g. the people complaining today that WASM doesn't allow direct DOM access would have a complete mental breakdown with NaCl/PNaCl ;)
In a way, WASM is also an evolution of PNaCl (which was also a CPU-agnostic bytecode, but one that couldn't be standardized because it was an adhoc subset of LLVM IR).
I'm reminded of writing JavaScript way back in the old Internet Explorer days (6 and to a lesser extent 7), when you had to manually null out any references to DOM elements if you were done with them, or else the JS and the DOM nodes wouldn't get garbage collected because IE had two different garbage collectors and cycles between them didn't get collected immediately.
> I don't know why, but the JS folks just keep reinventing everything all the time.
It's because they only know the web. They have never seen what real programmers actually do. They only live in their stupid web bubble thinking it's all there is.
What does this have to do with wasm gc?
Same here, and the irony is that Mozilla opposing it hardly matters nowadays given Firefox's browser market share; it is Google driving where WebAssembly goes.
Remember that the NaCl and PNaCl SDKs came with support for C, C++ and OCaml, the latter being an example of a GC'd language.
Wasn't WASM GC a prerequisite for getting direct DOM access from WASM? Does progress for WASM GC mean progress for DOM access as well?
Every time I check back on that the initiative seems to run under a different name. What is the best way to track progress on that front?
It’s not a prerequisite for using the DOM from wasm.
See, for example, the rust web frameworks of leptos and dioxus. They’re honestly great, and usable today as replacements for react and friends. (With the single caveat that wasm bundle size is a bit bigger than .js size).
They work by exposing a number of browser methods through to wasm, and then calling them through a custom wasm/JS API bridge. All rust objects and DOM objects are completely isolated. Rust objects are allocated via an embedded malloc implementation and JS objects are managed by V8 (or whatever). But the DOM can still be manipulated via (essentially) message passing over an RPC-like interface.
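A stripped-down sketch of what such a bridge can look like on the JS side (the function and file names here are made up for illustration, this is not the actual leptos/dioxus glue): the Wasm module only ever sees integer handles and (pointer, length) pairs into its own linear memory.

```ts
const handles: Node[] = [document.body]; // handle 0 is the document body
let memory: WebAssembly.Memory;          // assigned right after instantiation
const decoder = new TextDecoder();

const imports = {
  env: {
    // (tagPtr, tagLen) -> element handle
    create_element(tagPtr: number, tagLen: number): number {
      const tag = decoder.decode(new Uint8Array(memory.buffer, tagPtr, tagLen));
      return handles.push(document.createElement(tag)) - 1;
    },
    // append a child handle to a parent handle
    append_child(parent: number, child: number): void {
      handles[parent].appendChild(handles[child]);
    },
  },
};

const { instance } = await WebAssembly.instantiateStreaming(fetch("app.wasm"), imports);
memory = instance.exports.memory as WebAssembly.Memory;
(instance.exports.main as () => void)(); // Wasm drives the DOM through the imports above
```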
But the rust code needs to compile malloc specially for wasm. This is ok in rust - malloc is 75kb or something. But in languages like C#, Go or Python, the runtime GC is much bigger and harder to fit in a little wasm bundle.
The upside of wasm-gc is that this divide goes away. Objects are just objects, shared between both languages. So wasm bundles can use & reference JS/DOM objects directly. And wasm programs can piggyback on V8’s GC without needing to ship their own. This is good in rust, and great in GC languages. I saw an example with blazor where a simple C# wasm todo app went from 2mb or something to 10kb when wasmgc was used.
TLDR: wasm-gc isn’t strictly needed. You can use DOM from wasm today. It just makes wasm bundles smaller and wasm-dom interaction easier (and theoretically faster).
That's why I wrote direct DOM access above. Sure, we can load extra JS in addition to WASM and funnel everything through JS. Some say it does not matter.
I think it does, but it is hard to track the initiatives that tackle this. That's why I'm asking.
WASM-GC is essentially a way to hook into an externally provided garbage collector, it doesn't help much with calling into web APIs.
The DOM has been designed as a JS API (for better or worse), accessing that from WASM will always require to go through some FFI layer (this layer may be hidden and automatically created at runtime, but it still needs to exist).
The question is just how much marshalling needs to happen in that FFI layer. Making the DOM (or other Web APIs) actually WASM friendly would require an alternative DOM API which looks more like a very low-level C API - e.g. not using Javascript objects/strings at all, but only numbers, some of them 'pointers' into an ArrayBuffer (e.g. the WASM heap).
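Purely as an illustration of that point (these functions do not exist anywhere, they are hypothetical), such an API surface might look like this: integer handles instead of JS objects, and strings passed as (pointer, length) pairs into the Wasm heap.

```ts
// Hypothetical "Wasm-friendly" DOM surface: numbers only, no JS objects or strings.
declare function dom_create_element(tagPtr: number, tagLen: number): number; // returns an element handle
declare function dom_set_attribute(el: number, namePtr: number, nameLen: number,
                                   valPtr: number, valLen: number): void;
declare function dom_append_child(parent: number, child: number): void;
declare function dom_release(handle: number): void;                          // explicit lifetime management
```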
There's also a middle way of adding such 'garbage free' functions to web APIs which can be called with less overhead in the JS engine; for instance WebGPU has such functions, and they were specifically added to reduce marshalling and GC overhead when called from WASM.
E.g. GC-free Web APIs would be much more useful than GC support in WASM for interacting with the browser side. GC support in WASM is mainly useful for languages that depend on garbage collection, because those can delegate the GC runtime tasks to the JS engine's garbage collector.
If I remember correctly, one reason the direct DOM access initiative was put on hold was that it depended on WASM GC being completed.
"The DOM has been designed as a JS API (for better or worse), accessing that from WASM will always require to go through some FFI layer (this layer may be hidden and automatically created at runtime, but it still needs to exist)."
I don't see that. The DOM is the underlying data structure and what we need is direct access to it from WASM, without FFI or going through JS. WASM should be a first-class citizen next to JS.
That would require a "C ABI friendly" alternative DOM API. That's hardly worth the trouble since the performance problems lurk in the actual DOM design and implementation, not in the JS shim. It would make more sense to go that way for lower level web APIs like WebGL or WebGPU.
Also if we're talking big changes like this it would actually make more sense to implement the DOM on top of the 3D APIs, and not the 3D APIs as an appendix of the DOM ;)
> The DOM has been designed as a JS API
My understanding is that it originated as a straight binding to Netscape Navigator's C++ API, so it's actually originally a C++ API.
Calling an "idiomatic" C++ API from WASM would also require a (COM like) shim though since C++ has no standardized ABI. E.g. as soon as your API uses things like std::string it already gets hairy.
There already is browser.wit, that is what a direct interface could look like.
But this is what flohofwoe was talking about, WIT is basically COM/CORBA/RMI/Remoting for WebAssembly.
Yes, but in the end this just automates the bindings generation, runtime overhead for marshalling will be the same.
Wasm GC also solved the problem of reference cycles between objects in disparate heap managers leading to memory leaks. It's not just a performance or size win: it's a correctness win.
Out of curiosity, why is malloc 75kb? That seems like a crazy amount of code (is this after linking and dead-code removal for platform-specific magic?)
Malloc can indeed be implemented in a handful of bytes, but that's not going to perform well.
Doing malloc well is actually quite a bit of work. You need to group allocations by size, manage alignment, request and release pages from the OS/browser, implement realloc, etc. A typical malloc implementation is actually a combination of several different allocation strategies.
The best solution is to reduce the amount of alloc/free calls in your code, then you can just as well use a slow-and-small allocator like emmalloc since allocator overhead doesn't matter anymore: https://github.com/emscripten-core/emscripten/blob/main/syst...
(e.g. if memory management overhead shows up in the profiler, the proper solution is not to go looking for a faster general-purpose allocator, but to reduce the amount of calls into the allocator)
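To make the "handful of bytes" end of that spectrum concrete, here is a toy bump allocator over a Wasm memory (a sketch, not emmalloc): it only ever moves a pointer forward, which is exactly why a real malloc, with free, size classes, and reuse, is so much more code.

```ts
const memory = new WebAssembly.Memory({ initial: 16 }); // 16 pages = 1 MiB
let top = 0;

function bumpAlloc(size: number, align = 8): number {
  const ptr = (top + align - 1) & ~(align - 1); // round up to the requested alignment
  if (ptr + size > memory.buffer.byteLength) {
    // grow by whole 64 KiB pages when we run out of space
    memory.grow(Math.ceil((ptr + size - memory.buffer.byteLength) / 65536));
  }
  top = ptr + size;
  return ptr; // note: there is no free(); memory is only reclaimed by resetting `top`
}
```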
> This is good in rust
Is it actually going to be feasible to use WasmGC in Rust? I haven't yet found anything on this.
To be fair, neither are WebGL and WebGPU, versus their native API counterparts; the best you can get are shadertoy demos and product visualisation on ecommerce sites.
Due to tooling, sandboxing, and not having any control over what GPU gets selected, or why the browser blacklists it and switches to software rendering.
That's kinda untrue - there are games (not necessarily high-end ones, but quite sophisticated ones nonetheless).
The biggest issue is the API limitations that restrict you from doing things a certain modern way, and you have to use more mid 2000s techniques.
Here's a game that uses an Electron powered engine that uses Js and WebGL:
https://store.steampowered.com/app/1210800/Rum__Gun/
I would be impressed if it actually was available on a browser, without native help.
And that is Flash 3D quality, we would expect something better 15 years later.
It does run without the browser, it's just that on Steam, people expect an executable.
And sorry for the lack of quality - this project was built by just one guy who did custom everything (art, engine, editor, writing, music etc.) - it's super impressive imo. I'm sure if you replaced the models and textures with better looking ones (at no increase of technical complexity), it would look better.
Having looked at just the trailer on steam, it does look impressive, especially for a one man effort.
It is available in a browser, the demo at least:
https://www.gamepix.com/play/rum-and-gun
Honestly I've come to view the "it's the API's fault" argument as cope.
There are times when it is legitimately true, but it's far easier to say that your would-have-been-amazing efforts were blocked because you needed some obscure feature of a prototype GPU than it is to accept you were never going to get close in the first place for completely different reasons.
No, it's a valid complaint. Even before hardware raytracing, a huge amount of code was moving to compute shaders; most global illumination techniques from the last 10-15 years are easier to implement if you can write into random indices (often you can refactor to collecting reads, but it's quite cumbersome and will almost certainly cost performance).
Even WebGL 2 is only the equivalent of GLES 3.1 (and that's maybe a low desktop GL 4.1 equivalent). I think my "software" raytracing for desktop GL was only feasible with GL 4.3 or GL 4.4 if I remember correctly (and even those are kinda ancient).
WebGL2 is GLES 3.0, not 3.1 (that would be big because 3.1 has storage buffers and compute shaders).
Thanks for correcting; I only remembered that 3.2 was the latest so I went one down, since I remembered compute wasn't part of it, but it seems it was two steppings away. :)
> No, it's a valid complaint.
For what?
This is exactly what I am talking about: successful 3D products don’t need raytracing or GI, the bulk of the audience doesn’t care, as shown by the popularity of the Switch. Sure those things might be nice but people act like they are roadblocks.
Yes and no, Nintendo properties have a certain pull that isn't available to everyone.
But also, I'm pretty sure that even the Switch 1 has far more capable graphics than WebGL 2. Doom Eternal is ported to it and reading a frame teardown someone did they mentioned that parts of it are using compute.
Yes, you can do fairly cool stuff for a majority of people, but old APIs also mean that you will spend far more time getting something halfway decent (older, worse APIs just take more time to do things with than the modern ones).
That is PlayStation 3 and Xbox 360 level graphics, yet we hardly see it in any browser, other than some demos.
PS3/XB360 level graphics still requires a fair bit of content and the web-game-ecosystem kinda died off with Flash and people moved onto mobile or focused on marketplaces like Steam/XBIndie,etc.
I do think we're due for a new wave of web-based gaming though, web audio just wasn't near maturity when Flash went and the mobile/steam/xbindie marketplaces still worked for indies. But now with all the crowding and algorithm changes people are hurting and I think it might just be a little spark needed for a major shift to occur.
Agree that there will be a revitalization of web gaming, here's a demo of Unreal Engine 5 running in WebGPU: (only runs on Windows atm)
https://play.spacelancers.com/
Not to mention:
- the incredible overhead of each and every API call
- the nerfed timers that jitter on purpose
- the limitation of a single rendering context and that you *must* use the JS main thread to all those rendering calls (so no background async for you..)
> overhead of each API call
Yeah, that's an issue, esp with WebGL.. but you can get pretty far by reducing calls with a cache, things like "don't set the uniform / attribute if you don't need to".. but I hear WebGPU has a better API for this, and eventually this should get native performance.. though, I also wonder, is this really a bottleneck for real-world projects? I love geeking out about this.. but.. I suspect the real-world blocker is more like "user doesn't want to wait 5 mins to download AAA textures"
> Nerfed timers
Yeah, also an issue. Fwiw Mainloop.js gives a nice API for having a fixed timestep and getting an interpolation value in your draw handler to smooth things out. Not perfect, but easy and state-of-the-art afaict. Here's a simple demo (notice how `lerp` is called in the draw handler): https://github.com/dakom/mainloop-test
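The underlying pattern, independent of Mainloop.js (this sketch is not its API), is a fixed simulation timestep with an interpolation factor handed to the draw call:

```ts
const STEP = 1000 / 60; // fixed simulation step in ms
let last = performance.now();
let acc = 0;

function update(dtMs: number): void { /* advance the simulation by exactly dtMs */ }
function draw(alpha: number): void { /* render, lerping positions by alpha in [0, 1) */ }

function frame(now: number): void {
  acc += now - last;
  last = now;
  while (acc >= STEP) { // run as many fixed steps as the elapsed time requires
    update(STEP);
    acc -= STEP;
  }
  draw(acc / STEP); // leftover fraction is the interpolation factor
  requestAnimationFrame(frame);
}
requestAnimationFrame(frame);
```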
Re: multithreading, I don't think that's a showstopper... more like, techniques you'd use for native aren't going to work out of the box on web, needs more custom planning. I see this as more of a problem for speeding up systems _within_ systems, i.e. faster physics by parallelizing grids or whatever, but for having a physics WASM running in a worker thread that shares data with the main thread, it's totally doable, just needs elbow grease to make it work (will be nice when multithreading _just works_ easily with a SharedArrayBuffer)
Multithreading, yes, that works the way you mention, but I meant multiple rendering contexts.
In standard OpenGL the de-facto way to do parallel GPU resource uploads while rendering is to have multiple rendering contexts in a "share group" which allows them to share some resources such as textures. So then you can run rendering in one thread that uses one context and do resource uploads in another thread that uses a different context.
There was a sibling comment that mentioned something called OffscreenCanvas, which hints that it might be something that would let the web app achieve the same.
> - the incredible overhead of each and every API call
The calling overhead between WASM and JS is pretty much negligible since at least 2018:
https://hacks.mozilla.org/2018/10/calls-between-javascript-a...
> - the nerfed timers that jitter on purpose
At least Chrome and Firefox have "high-enough" resolution timers in cross-origin-isolated contexts:
https://developer.chrome.com/blog/cross-origin-isolated-hr-t...
...also, if you just need a non-jittery frame time, computing the average over multiple frames actually gives you a frame duration that's stable and exact (e.g. 16.667 or 8.333 milliseconds despite the low-resolution inputs).
Also, surprise: there are no non-jittery time sources on native platforms either (for measuring frame duration at least) - you also need to run a noise-removal filter over the measured frame duration in native games. Even the 'exact' presentation timestamps from DXGI or MTLDrawable have very significant (up to millisecond) jitter.
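One way to implement such a filter, as a rough sketch (assuming a 60 Hz display): average recent frame times and snap to the nearest whole number of refresh intervals when the average is close to it, which is what yields the stable 16.667/33.333 ms values despite noisy inputs.

```ts
const HISTORY = 30;
const samples: number[] = [];
const REFRESH = 1000 / 60; // assumed 60 Hz display refresh interval

function smoothedFrameTime(measuredMs: number): number {
  samples.push(measuredMs);
  if (samples.length > HISTORY) samples.shift();
  const avg = samples.reduce((a, b) => a + b, 0) / samples.length;
  // Snap to a multiple of the refresh interval when the average is close enough.
  const snapped = Math.round(avg / REFRESH) * REFRESH;
  return Math.abs(avg - snapped) < 0.5 ? snapped : avg;
}
```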
> - the limitation of a single rendering context and that you must use the JS main thread to all those rendering calls (so no background async for you..)
OffscreenCanvas allows to perform rendering in a worker thread: https://web.dev/articles/offscreen-canvas
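A minimal sketch of that setup (the worker file name is made up): the main thread transfers the canvas once, and the worker owns the context and the render loop.

```ts
// main.ts
const canvas = document.querySelector("canvas")!;
const offscreen = canvas.transferControlToOffscreen();
const worker = new Worker("render-worker.js");
worker.postMessage({ canvas: offscreen }, [offscreen]); // transfer, don't copy

// render-worker.ts
self.onmessage = (e: MessageEvent) => {
  const gl = (e.data.canvas as OffscreenCanvas).getContext("webgl2")!;
  const loop = () => {
    gl.clearColor(0, 0, 0, 1);
    gl.clear(gl.COLOR_BUFFER_BIT);
    requestAnimationFrame(loop); // rAF is available in workers that own an OffscreenCanvas
  };
  requestAnimationFrame(loop);
};
```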
I didn't mean just WASM -> JS but the WebGL API call overhead which includes marshalling the call from WASM runtime across multiple layers and processes inside the browser.
Win32 performance counter has native resolution < 1us
OffscreenCanvas is something I haven't actually come across before. Looks interesting, but I already expect that the API is either brain damaged or intentionally nerfed for security reasons (or both). Anyway I'll look into it, so thanks for that!
> Win32 performance counter has native resolution < 1us
Yes but that's hardly useful for things like measuring frame duration when the OS scheduler runs your per-frame code a millisecond late or early, or generally preempts your thread in the middle of your timing code (eg measuring durations precisely is also a non-trivial problem on native platforms even with high precision time sources).
Almost every game for the last 25 years has used those Win32 performance counters, or the platform's nearest equivalent (it's just a wrapper over some CPU instructions), to measure frame times. It's the highest resolution clock in the system, it's a performance counter. You're supposed to use it for that.
If you want to correlate the timestamps with wall time then good luck, but if you just need to know how many nanoseconds elapsed between two points in the program on a single thread then that’s your tool.
Almost all games also had subtle microstutter until some Croteam peeps actually looked into it and figured out what's wrong with naive frame timing on modern operating systems: https://medium.com/@alen.ladavac/the-elusive-frame-timing-16...)
TL;DR: the precision of your time source won't matter much since thread scheduling gets in the way, one general solution is to apply some sort of noise filter to remove jitter
Figma uses WebGL for rendering and they seem to be doing ok.
Although I will say that the difference between my old Intel MacBook and the M2 Pro is night and day.
Yeah, at a level like GDI+, CoreGraphics, XWindows hardware surfaces,....
This isn't really what real-time graphics is all about in modern times.
This is,
https://youtu.be/AV279wThmVU?si=Ou04h5z0Mju7kiJ0
The demo is from 2018, 7 years ago!
The claim was that with WebGL was "the best you can get are shadertoy demos, and product visualisation on ecommerce sites". Figma is neither, regardless of how it's making use of WebGL under the hood. Not sure what relevance an Unreal engine demo is, as you seem to think I was making a claim about real-time graphics that I wasn't.
I had this argument with pjmlp not too long ago, and it goes in circles.
Basically they define anything less than pushing the extreme limits of rendering technology to be worthless, while simultaneously not actually understanding what that is beyond the marketing hype. The fact most users would not be able to run that seven year old demo on their systems today, even natively, would be beside the point of course.
WebGL particularly absolutely has problems, but the revealing thing is how few people state what they really are, such as the API being synchronous or the inability to use inverted z-buffers. Instead it's a lot of noise about ray tracing etc.
WASM per call overhead is a whole other problem too, GC or not.
Thanks for bringing a reasonable perspective to this discussion.
Figma falls under the graphics requirements of ecommerce sites, I have my doubts that they even make use of WebGL 2.0 features.
It's only a way to hopefully get a hardware-accelerated canvas, if the browser doesn't consider the GPU blacklisted for 3D acceleration.
That isn't real time graphics in the sense of games programming.
Yes, I know there's more overhead on the web than native, but that is missing the point of my post. I'm talking about issues with Wasm GC relative to other ways of rendering with Wasm. I've played many browser games with good performance, btw.
I would be interested in any that beats the iPhone's OpenGL ES 3.0 demo from 2011, Infinity Blade.
In terms of real-time graphics rendering quality, that is.
Sorry to be ignorant, but I have a couple questions about Wasm:
- Will the Wasm GC pave the way to having native object references to JS objects/classes (seeing how it's the same GC as used by JS)?
- Is it possible to do lightweight interop between a high-level language using Wasm GC (such as Java) and a low-level one (such as C++) that doesn't use it? If so, how?
Imo the biggest shortcoming of Wasm is the incredibly awkward way in which it integrates into the browser.
> Will the Wasm GC pave the way to having native object references to JS objects/classes
AFAIK that was already possible via 'WASM reference types': https://caniuse.com/wasm-reference-types
Seeing that this is not yet used by the Emscripten JS shims, I wonder if there are downsides which prevent its use for something like WebGL objects, or whether nobody has just gotten around to rewriting the Emscripten shims to use externrefs yet.
PS: probably not worth the effort (see last comment: https://github.com/emscripten-core/emscripten/issues/20021)
Like the article says, you can have _opaque_ references. It's really a small step forward but does address a major pain point.
As for your first question: yes, you can hold a reference, so the major benefit is that dead objects caused by things like reference cycles aren't an issue any more (previously, connecting objects between the WASM and JS worlds needed manual reference management, which outside of something like C++ RAII would be brittle).
For the second part, it would depend on the compiler, but since Java/C++ would both live in a WASM world it should be possible to combine them fairly painlessly (given a sane way to declare linked functions that are named properly); I'd put the interoperability issues at about the same level as PInvoke from C#.
PInvoke is awesome. If you use MSVC for C++ it can do stuff like SEH/stack debugging that goes from .NET frames to C++ and back.
Even better, if Windows is the only deployment target, throw away P/Invoke, use C++/CLI and there is no need to guess how to get P/Invoke attributes exactly correct for each API call.
I was excited to read this post because I haven't yet tried WasmGC for anything beyond tiny toy examples, but was disappointed to find no actual numbers for performance. I don't know the author well enough to be able to assess their assertions that various things are "slow" without data.
This was a quickie post for me. Just a tale of what my experience has been. No in-depth, apples to apples comparison so I don't know the magnitude of the performance differential.
Not surprising tbh, no automatic memory management solution is ready for realtime graphics - it always requires a lot of manual care to reduce memory management overhead (for instance working around the GC instead of using it), which then kinda defeats the purpose of the 'automatic' in automatic memory management. There simply is no memory management silver bullet when performance matters.
Perhaps I should've said this in the post to ward off comments from the anti-GC crowd but I do realtime graphics in managed memory languages just fine outside of the Wasm context. This is a Wasm problem.
> no automatic memory management solution is ready for realtime graphics
Unreal Engine and Unity don't exist then?
Unity games have plenty of GC related stutter issues and working around the Unity GC to prevent those issues is basically its own field of engineering.
As for UE, I wonder if they would go down the same route today, or whether they just can't rip out their GC because it would mean a complete rewrite of the engine.
Games don't need lifetime management at object granularity.
The code snippets were very cool. I'll check out hoot! Overall I don't think Wasm GC is for realtime graphics.
Wouldn't the garbage collector cause a framerate drop when rendering? Usually garbage collectors are not used for graphics heavy apps.
My impression was that WasmGC was perfect for writing business logic in Go or C# or other garbage-collected languages that weren't running well in wasm before.
You can absolutely do graphics rendering with GC'd languages. The trick is the same as with manual memory management: Minimize allocation. You mention C# which is a language that has been used very successfully in the games industry.
Unity uses C#, but the engine itself is written in C++. You are probably right though, there are other engines out there written in C#.
> Unsatisfying workarounds [...] Use linear memory for bytevectors
It never makes sense to use GC for leaf memory if you're in a language that offers both, since mere refcounting (or a GC'ed object containing a unique pointer) is trivial to implement.
There are a lot of languages where it's expensive to make the mistake this post is making. (I don't know much about WASM in particular; it may still have other errors).
Sorry but it's just a different choice not a mistake. I do realtime graphics just fine in non-web managed memory languages.
> This would require a little malloc/free implementation and a way to reclaim memory for GC'd bytevectors.
Allocation is always expensive in the hot path. Arenas, pools, etc. are a thing even in C engines for a reason. If it's possible to mix GC and linear, that's ridiculously powerful.
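For illustration, a bare-bones per-frame bump arena of the kind that comment has in mind might look like this (hypothetical names, fixed capacity, simplistic alignment handling):

```c
#include <stdint.h>
#include <stdlib.h>

/* A per-frame bump arena: allocate freely during the frame, then
   reset the whole thing in O(1) instead of freeing individual pieces. */
typedef struct {
    uint8_t *base;
    size_t   capacity;
    size_t   offset;
} FrameArena;

static int arena_init(FrameArena *a, size_t capacity)
{
    a->base     = malloc(capacity);
    a->capacity = capacity;
    a->offset   = 0;
    return a->base != NULL;
}

static void *arena_alloc(FrameArena *a, size_t size)
{
    size = (size + 15u) & ~(size_t)15u;     /* round up to 16-byte alignment */
    if (a->offset + size > a->capacity)
        return NULL;                        /* out of per-frame memory */
    void *p = a->base + a->offset;
    a->offset += size;
    return p;
}

static void arena_reset(FrameArena *a)      /* call once at the start of each frame */
{
    a->offset = 0;
}
```

In a game loop you'd call arena_reset at the top of every frame and route all transient allocations through arena_alloc.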
In Scheme what you want to do is allocate a big bytevector and use it over and over. This is what I already do outside of Wasm. I don't want or need linear memory involved, I just want access to my (array i8) from JS.
I just wish WASM could use more than one ArrayBuffer at a time. Would eliminate unnecessary copying for interop with JS code.
Well I just thought of something obvious... Have a function that lets you pass in an ArrayBuffer, which then brings it into the virtual address space of the WASM program. The function would return the virtual address that was assigned to that ArrayBuffer. From there, you call into WASM again with that pointer, and the program can take action.
Then there would be another function to relinquish ownership of the ArrayBuffer.
There's no SharedArrayBuffer support? Or do I misunderstand the idea?
There is, but then you'd need to declare the entire WASM heap as a single SharedArrayBuffer. It only makes sense for shared-memory multithreading (but note that SharedArrayBuffer support only works in 'cross-origin isolated' contexts).
There is a related 'multiple memories' proposal btw: https://github.com/WebAssembly/multi-memory
so what about realtime graphics with wasm without GC? (compiled from languages not needing a GC like Rust, C/C++, Odin, ...)
Better, but WebGPU and WebGL aren't going to win any performance prizes either, and tooling is pretty much non-existent.
Nothing like Pix, Instruments, or Renderdoc; SpectorJS is the only thing you get, almost 15 years after WebGL 1.0.
And in terms of the hardware level they support, it's about PlayStation 3 kind of graphics, assuming the browser doesn't block the GPU or select the integrated one instead of the dedicated one.
You are left with shaders as the only way to actually push the hardware.
As mentioned, that works quite well already but it's not the topic of this post.
True, it's just that the topic of this post seemed strange to me, since you wouldn't use a garbage-collected language for a high-intensity graphics app natively either, hmmm.
GC'd languages are used all the time for this. C# is a huge language in game development, for example.
Shouldn't it be possible to implement your own GC in WASM? Why does WASM try to be everything?
A custom GC would be slower and single-threaded, would greatly increase binary size, and would live in a separate heap from JS, so interop via externrefs would be bad. Wasm GC is a great thing.
You can't GC together with the host environment if you do a custom GC (i.e. a wasm object and a JS object in a cycle wouldn't have any way to ever be GC'd).
yes, it's regularly done. But I think you are misunderstanding. WASM GC isn't a GC implementation.
Yes, this is how it's done eg with Python and Go.
An advantage of a common GC could be interop between languages.
Not really. As I understand it, WASM offers no facility to walk or unwind the stack, which tracing garbage collectors need in order to find roots. The only solution here is to manually maintain your own stack on the heap and force all GC-visible data to live there instead of in locals/registers. This is a huge performance penalty.
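Roughly what that looks like in practice, as a minimal sketch of the shadow-stack idea (made-up names, not any particular runtime's implementation): every GC-visible reference lives in an explicit root array in linear memory so the collector can find it, since it cannot scan Wasm locals.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Obj Obj;                     /* some GC-managed object type */

/* The "stack on the heap": an explicit array of live references that the
   collector scans for roots, since it cannot walk the real Wasm stack. */
#define MAX_ROOTS 4096
static Obj  *shadow_stack[MAX_ROOTS];
static size_t shadow_top = 0;

static Obj **push_root(Obj *ref)            /* register a live reference */
{
    assert(shadow_top < MAX_ROOTS);
    shadow_stack[shadow_top] = ref;
    return &shadow_stack[shadow_top++];     /* callers use this slot, not a local */
}

static void pop_roots(size_t n)             /* on scope exit */
{
    shadow_top -= n;
}

/* Hypothetical allocator that may trigger a collection. */
extern Obj *gc_alloc_pair(Obj *car, Obj *cdr);

/* A function that would have kept `a` and `b` in locals instead keeps
   them in shadow-stack slots, so a collection during gc_alloc_pair can
   still see (and, for a moving GC, update) them. */
static Obj *make_pair(Obj *a, Obj *b)
{
    Obj **ra = push_root(a);
    Obj **rb = push_root(b);
    Obj  *p  = gc_alloc_pair(*ra, *rb);     /* reload through the slots */
    pop_roots(2);
    return p;
}
```

Every allocation now pays for the extra loads and stores through those slots, which is the overhead being described.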