From the user-experience perspective, it matters that it is Rust, unless you write WASM code directly, which is rarely (if ever) the case.
Rust is a language that compiles to WASM. Sure, there are other options to compile to WASM, but they differ by interface (programming language, library, tooling) and (likely to a lesser degree) - performance.
Given the title, I had expected this to be a comparison between JS and Rust in terms of development, not performance. I learned nothing new about Rust by visiting this page. All I learned was that compiled WASM is faster than js.
> All I learned was that compiled WASM is faster than js.
You learned that Rust compiled into WASM is faster than JS. That won't be the case for every language. (We had an example yesterday on the front-page where it wasn't. Ironically, its title implied that it was a generic comparison with WASM.)
I wouldn't say the source language is not meaningful. If you wrote an ahead-of-time JavaScript-to-WebAssembly compiler and benchmarked the WASM generated by that, it would probably be a lot slower than the WASM generated by the Rust compiler, no?
True, but if you were in a front-end project doing JS with some WASM parts that happened to use Rust for the WASM, you'd probably not ask yourself "JS or WASM?", you'd ask yourself "JS or Rust?" when deciding where to put some logic.
And I'm not sure if we can assume that all languages that offer compilation to WASM would be roughly "rust fast": I wouldn't be surprised if we already had some examples that aren't really good matches for that tight inner loop case you might want to wasmify. Perhaps there are already some that just want to get the "runs in browser" checkbox ticked?
Not sure, it could also be viewed as a comparison of ahead-of-time compilation vs. just-in-time compilation (I guess the code paths are hot enough to be completely "jit-ed" here), and then the compiler isn't irrelevant at all.
But I haven't looked into the details of how much difference that alone makes in this case.
I think it's also a testament to how fast JS has gotten.
The fact that it was written in Rust (or any other non-VM language) is meaningful because programs written in a language that needs a VM (all the GC'd languages, for example) need the VM to be shipped as well. That makes the WASM program larger and slower.
This has obvious effects (you don't have an allocator so you can't have String) and more subtle effects (your slices don't have a sort() method! However they do have a sort_unstable() method) but while it's probably a reasonable environment for the firmware inside a custom $25 gizmo it's not a very comfortable one for general purpose programming.
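To make the `sort()` versus `sort_unstable()` point concrete, here is a minimal sketch that touches only items available from `core`. The function name `median_in_place` is made up for illustration, and the snippet is written so that it also builds as an ordinary `std` program; in a real firmware crate you would add `#![no_std]` at the crate root.

```rust
/// Returns the upper median of `samples`, sorting the slice in place.
/// Hypothetical helper: the name and signature are illustrative only.
pub fn median_in_place(samples: &mut [u32]) -> Option<u32> {
    if samples.is_empty() {
        return None;
    }
    // `sort_unstable` lives in `core` and sorts in place without allocating.
    // The stable `sort` needs a temporary buffer, so it is only provided
    // when the `alloc` crate (and therefore an allocator) is available.
    samples.sort_unstable();
    Some(samples[samples.len() / 2])
}
```

Calling it on `[5, 1, 4, 2, 3]` sorts the array and returns the middle element, with no `String`, `Vec`, or allocator involved.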
To deliver a bit more, for example an allocator, you're bringing in a bunch of platform specific code, which, just like the Garbage Collector for Go, you did not write.
Like Rust, C defines what happens if you don't have its rich standard runtime, you get "free standing" C.
C's free standing mode is even thinner than Rust's core, because it doesn't supply a library of code, you just get the primitive types, the operators and the language features that don't involve any libraries. To give a very concrete example, Rust's core depends on memcmp() existing, Rust assumes your toolchain knows how to memcmp() on your target architecture inherently, but in C you could write memcmp() if you had to. You won't these days, because your C compiler invariably provides this feature, but in principle you could.
In principle C++ also defines a "free standing" mode, but it's a mess and so people don't write for C++ free standing mode in the real world. C++ in an environment where you can't have the actual standard library is likely to be specified in terms of which features from the standard library you can have and which you cannot, for each such environment. For example maybe you can have threading but no filesystem APIs, or you can have a heap allocator but no threads.
Since we're being pedantic, all mainstream languages (even Rust) have a runtime, so your "managed vs unmanaged" language distinction is completely meaningless. :)
If there is a salient distinction, it's whether the runtime is easily and practically distributed as part of the compiled artifact, which is true for C, Rust, Go, etc but not so much for Python or JS (nor probably Java or C#, although there are efforts for static, native compilation for those platforms which probably come with significant caveats).
Larger runtimes not only add bloat to the final WebAssembly, but they also can make interoperating harder because they often require more (i.e. any) bookkeeping when sending stuff back and forth.
also worth mentioning is that this is why people build projects like TinyGo and MicroPython – they love the language, but can't work with the trade-offs that the designers chose
> also worth mentioning is that this is why people build projects like TinyGo and MicroPython – they love the language, but can't work with the trade-offs that the designers chose
It's not like small runtimes are fundamentally better in this regard. They work better for constrained environments, but they often make tradeoffs which are inappropriate for other use cases. It's hard to make a runtime that works for everything, and it's not necessarily even a worthwhile goal--you can use TinyGo when you're working in a constrained environment, use Go when you're not.
The difference in performance is about 3x, and it looks like a pretty ideal case for wasm. That seems to match observations from the Zaplib post-mortem:
> Rust is faster than JS in some cases, but those cases are rarer than we expected, and the performance gain is on the order of 2x some of the time, not 10x most of the time.
That 2x figure is something that shows up consistently over the years, since asm.js in fact. The 10x cases are rare, like they found. Still, 2x is enough to justify using a different language in some cases.
Separately, though, on very large applications wasm does have an advantage on startup times: there is no warmup period as the JIT learns the types, no sluggish first frames for the entire application or the first time you click on a feature that hits a new code path. For something like Photoshop or a game engine that can be a very big deal. But, again, this type of application is not the most common, so JS will remain the best option for most things.
I wish there was an "about" blurb. Perhaps 2-6 paragraphs explaining what's going on?
Although it's obvious that you're comparing JavaScript and WASM performance, the devil is in the details. What exactly are you comparing?
There's quite a bit of overhead in calling out from WASM to the DOM; how are you making WASM faster? How much "JavaScript" is involved in the WASM version? Are you manipulating a Canvas? Generating a bitmap?
> There's quite a bit of overhead in calling out from WASM to the DOM; how are you making WASM faster?
If you check out the code, you will notice that wasm isn't talking to the DOM. Its purpose is to generate the image data as a Uint8Array, which is then passed to the canvas [0], the same way the JavaScript implementation does it.
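As a rough sketch of that division of labour: the Rust side only fills a byte buffer, and the JS side would wrap it in an `ImageData` and draw it. The function name `render_points`, its parameters, and the white-on-transparent colouring are assumptions for illustration, not the project's actual code, and the export glue a real build would need (for example via wasm-bindgen) is left out.

```rust
/// Builds an RGBA byte buffer (4 bytes per pixel, row-major), which is the
/// layout a canvas `ImageData` uses.
/// `render_points` and its parameters are hypothetical names for this sketch.
pub fn render_points(width: usize, height: usize, points: &[(usize, usize)]) -> Vec<u8> {
    // One allocation for the whole frame, initially transparent black.
    let mut rgba = vec![0u8; width * height * 4];
    for &(x, y) in points {
        if x >= width || y >= height {
            continue; // ignore points that fall outside the image
        }
        let i = (y * width + x) * 4;
        rgba[i] = 255; // red
        rgba[i + 1] = 255; // green
        rgba[i + 2] = 255; // blue
        rgba[i + 3] = 255; // alpha (opaque)
    }
    rgba
}
```

The point of this shape is that no DOM call happens per pixel; the boundary between wasm and JS is crossed once per frame with the finished buffer.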
The innermost rendering loop in the JS seems to create and destruct an array instance for every single pixel iteration (see https://github.com/dmaynard/chaos-screen-saver/blob/master/s...). I guess that this could be potentially optimized away by the JIT, but it will make things slower or at least less predictable.
That's kind of an underrated aspect of these comparisons - while you absolutely can work around Javascript's weird performance cliffs and avoid putting pressure on the garbage collector, you have to fight the language at every turn because so many idiomatic JS patterns are inherently slow or flood the GC. You may find idiomatic JS easier to work with than something like Rust, but Rust is much easier to work with than the narrow and loosely defined subset of JS that you have to stick to for optimal performance. Taken to its limit you end up more or less writing asmjs by hand.
The rust code is 3x as long and a lot more complex too.
In theory, static typing would correct the biggest performance issues in JS (use monomorphic functions, don't mutate objects, and limit arrays to just one primitive/object type).
In practice, TypeScript allows and encourages you to create types that are horrendous for performance.
I'd love to see a comparison using AssemblyScript (basically stricter TS for WASM). I'd bet it's nearly the same speed as Rust while still being a third of the size.
The Rust version is longer mostly due to boilerplate for WASM<>JS interface, and awful vertically-exploded formatting (probably caused by rustfmt's dumb heuristics).
But the core loop in Rust is pretty straightforward. It could have been shortened and optimized further.
Also keep in mind that the larger the project, the harder it gets to keep JS in the performance sweet spot without tipping over any JIT heuristic, using GC, or accidentally causing a perf cliff, while Rust has pretty stable and deterministic optimizations and keeps control of its memory management at any scale.
That's a really good point and I think that in this sense the comparison is really made between Rust and JS rather than between WASM and JS (as others have complained).
It was my understanding that the V8 GC frankly was rarely used, and that they generally just let memory pile up quite a lot before it's run, in the hopes that it may never have to run during the application's lifetime.
It depends on the application: a short-lived script may complete all of its work before the GC interrupts it, but something that runs continuously can't afford to generate much if any garbage in its main loop because it will inevitably pile up and eventually cause a huge stall when the GC decides that enough is enough. It's especially critical for animated or interactive applications like games, because with those the stall will manifest as the application freezing completely until the GC is finished.
Last I checked, destructuring created 4-5x as many bytecode instructions and a potential GC pause. I'd think this could be detected and optimized easily enough, but I guess there are bigger problems for the JIT devs to solve.
A quick profiling seemed to indicate that just a bit less than 10% of the JS time is being spent on the DOM rather than the calculations at hand. I wonder how much of that could be reclaimed simply by running the calculation in a web worker.
I suspect the bitwise AND operator every loop is another big performance issue. Normally, the JIT would leave the loop iterator as a 31-bit int, but because it is stored in a shared object, I suspect it must convert (f64 -> i31 -> AND -> f64) every time. A local variable that updates the object variable every 64ms and resets to zero would probably be faster.
The decPixel function should use a switch statement with values of 0, 1, 255, and default so it only needs to branch one time. This is probably a decent performance win too as around 15-20% of all time is spent here.
EDIT: I should praise the author for using a ternary instead of Math.max() as very few people know that the ternary is literally 100x faster. I wonder why this optimization was never made as it seems common enough.
I'm trying to work out how my interpretation of the calculations (in JS)[1] compares with the author's code, but trying to measure performance in CodePen is ... difficult to work out. My approach was to: 1. Run the CodePen with the inspector open; 2. Start recording performance; 3. Right click on the display panel and select 'Reload Frame'; 4. Stop recording performance after the images reappear.
... But when I look at the results nothing is making sense. Clearly my approach was wrong.
Why do you think the JIT would be able to optimize this? And how would it go about it? I know only about a few rough things and heuristics. I wouldn't expect or assume that this would be optimized.
It would probably have to recognize that the _usage_ of this function can be translated into a local mutation without allocating additional arrays. But from just looking at the function locally it isn't clear whether that is a safe assumption.
What I meant to say is that this definitely isn't a safe assumption and performance of this loop will be less predictable. That said, I wouldn't be surprised that such complex JITs as V8 or JSC can detect this scenario.
The actual code running comes from an unpublished module @davidsmaynard/attractor_iterator. It also returns a new array instance for that part. Most importantly, it uses a strange mix of global variables, class methods and properties, which I guess the author tried to optimize by trial & error.
My conclusion from this is that Firefox seems to be much faster than Chrome-based browsers when it comes to WASM. I have noticed this trend before.
I would be more interested in a more realistic test with scenarios replacing web frameworks like React. This benchmark seems more like something you would do with shaders anyway.
When comparing js and wasm, one thing people rarely mention is that with js, you get built-in functions/runtime (without crossing the boundary and then getting language mismatch).
When "reimplementing" that functionality in wasm, it will add a good amount to binary size. Just a hashmap will add a decent bit.
Not to say there aren't uses, but js comes with some advantages.
If you have a few algorithms you need to run under the hood, preferably with few calls across the JS/WASM boundary, that can be expressed mostly using a small bunch of native types and not a lot of maps, strings, etc., then something low-level like WASM might help.
But that doesn't match most real world scenarios very well.
I get that the intended meaning is "JS sux", but I can't help being impressed with how performant engines are. The "pixels per ms" count ratio is less than 3:1 on my machine, in what I imagine is a CPU-heavy task.
I'd love for someone who knows this problem a little better to chime in, but this doesn't look like that heavy a task to me. I've read through the code very briefly. The core of the attractor logic[1] seems to be a few trig functions with a lot of bookkeeping around it (e.g. measuring performance to know how many iterations will fit within each frame budget, I think?). But each iteration depends on the state of previous iterations anyway (stored in module-globals[2]) so there's a data dependency that a CPU isn't going to have a great time with.
I'd be interested to know where most of the time is being spent in this program, but I don't think it's really playing to the strengths of Rust/WASM.
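To illustrate the data dependency described above, here is a minimal sketch of such a loop. The Clifford-style map in `step` is an assumption picked because it is "a few trig functions"; the benchmark's actual formula and its frame-budget bookkeeping may differ, and the names `step` and `accumulate` are hypothetical.

```rust
/// One step of a Clifford-style attractor. This particular map is an
/// assumption chosen for illustration; the benchmark may use another formula.
pub fn step(x: f64, y: f64, p: [f64; 4]) -> (f64, f64) {
    let [a, b, c, d] = p;
    (
        (a * y).sin() + c * (a * x).cos(),
        (b * x).sin() + d * (b * y).cos(),
    )
}

/// Iterates the map and counts how often each pixel is hit.
pub fn accumulate(
    width: usize,
    height: usize,
    iterations: usize,
    start: (f64, f64),
    p: [f64; 4],
) -> Vec<u32> {
    let mut hits = vec![0u32; width * height];
    if width == 0 || height == 0 {
        return hits;
    }
    // Each coordinate is sin(..) + k * cos(..), so |x| <= 1 + |c| and
    // |y| <= 1 + |d|; `bound` covers both.
    let bound = 1.0 + p[2].abs().max(p[3].abs());
    let (mut x, mut y) = start;
    for _ in 0..iterations {
        // Loop-carried dependency: this step cannot start until the
        // previous (x, y) is known, so iterations cannot overlap.
        let (nx, ny) = step(x, y, p);
        x = nx;
        y = ny;
        // Map [-bound, bound] onto pixel indices, clamped for safety.
        let px = (((x + bound) / (2.0 * bound)) * (width as f64 - 1.0)) as usize;
        let py = (((y + bound) / (2.0 * bound)) * (height as f64 - 1.0)) as usize;
        hits[py.min(height - 1) * width + px.min(width - 1)] += 1;
    }
    hits
}
```

Almost all of the work per iteration is four trig calls on values produced by the previous iteration, which is why neither wider SIMD nor out-of-order execution has much to work with here.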
Compares the rendering speed of Javascript/ES-6 to WASM compiled from Rust. Result: JS/ES-6: ~4000 pixels per millisecond. WASM/Rust: ~16000 pixels per millisecond. These results are from Firefox on an M1 Max.
Rust is a great solution to the memory safety problem, but is more difficult to use and slower to compile than some other languages. I'm wondering why I would use Rust for WebAssembly, which doesn't have memory safety issues?
My impression here is that (1) you don't want a language that you'd have to ship a runtime with (any interpreted language, Go, Swift), (2) Rust WASM tooling/documentation is (much?) better than tooling for other languages, and (3) most people would prefer memory safety when it comes to a purely memory-safety-to-compile-time trade-off.
What exactly do you mean by that? WASM is assembly, the memory management is entirely manual, and it's currently a very bad target for garbage collection.
Anyway, Rust has more advantages than memory safety.
On my phone Firefox I occasionally get a surprisingly stable "Infinity" readout. Am I seeing a timing sidechannel mitigation in the flesh or is that just my news-fed brain imagining explanations?
It's nice to see how much faster wasm is. This kind of problem would also be nice for parallelization. What's the state of multi-threading in wasm? Any advantage over JavaScript web workers?
Atomics have universal support in modern browsers.
This is primarily a single-threaded problem. Each x and y update depends directly on the previous x and y values with the x and y being used to update pixels.
At most (barring changes to the algorithm), it looks like you could calculate these on one thread and update the image on another thread.
It’s true that the problem of generating the exact same image seems to be a single-threaded problem. But I suppose the general problem of rendering an approximation of a strange attractor could be parallelized by generating images from several different starting positions and combining them.
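A minimal sketch of that idea, under several assumptions: the Clifford-style map and its fixed parameters are stand-ins rather than the benchmark's formula, all function names are hypothetical, and native `std::thread` scoped threads stand in for whatever a browser build would use (there it would be web workers, possibly with shared memory). The merged result approximates the attractor; it is not identical to a single long run from one seed.

```rust
use std::thread;

/// Illustrative Clifford-style map with fixed, assumed parameters.
fn clifford(x: f64, y: f64) -> (f64, f64) {
    const A: f64 = -1.4;
    const B: f64 = 1.6;
    const C: f64 = 1.0;
    const D: f64 = 0.7;
    (
        (A * y).sin() + C * (A * x).cos(),
        (B * x).sin() + D * (B * y).cos(),
    )
}

/// Sequential hit counts for one starting point on a size x size grid.
fn hits_from_seed(size: usize, iterations: usize, seed: (f64, f64)) -> Vec<u32> {
    if size == 0 {
        return Vec::new();
    }
    let mut hits = vec![0u32; size * size];
    let (mut x, mut y) = seed;
    for _ in 0..iterations {
        (x, y) = clifford(x, y);
        // |x| <= 1 + C = 2.0 and |y| <= 1 + D = 1.7, so [-2, 2] covers both.
        let px = (((x + 2.0) / 4.0) * (size as f64 - 1.0)) as usize;
        let py = (((y + 2.0) / 4.0) * (size as f64 - 1.0)) as usize;
        hits[py.min(size - 1) * size + px.min(size - 1)] += 1;
    }
    hits
}

/// Runs one thread per seed, then sums the per-seed hit counts.
pub fn render_parallel(size: usize, iterations_per_seed: usize, seeds: &[(f64, f64)]) -> Vec<u32> {
    let partials: Vec<Vec<u32>> = thread::scope(|s| {
        let handles: Vec<_> = seeds
            .iter()
            .map(|&seed| s.spawn(move || hits_from_seed(size, iterations_per_seed, seed)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    });
    let mut total = vec![0u32; size * size];
    for partial in &partials {
        for (t, p) in total.iter_mut().zip(partial) {
            *t += *p;
        }
    }
    total
}
```

Each worker still runs a strictly sequential loop; the parallelism comes only from the seeds being independent of one another, and the merge step is a plain element-wise sum.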
The JS Framework Benchmark[0] has got your back. The link below pre-filters to my subjective "picks" that I think give a birds-eye view of the wasm-to-js comparison for DOM manipulation:
My picks: SolidJS > Vue > Sycamore-rs > Svelte > React > Yew
SolidJS is a truly reactive JS framework. Sycamore-rs can be thought of as the Rust-clone of SolidJS, and Yew can be thought of as the Rust-clone of React.
The JS Framework Benchmark compares benchmarks (both WASM and JS) in the more usual setting of DOM manipulations.
The link below pre-filters to my subjective "picks" that I think give a birds-eye view of the wasm-to-js comparison for DOM manipulation:
My picks, ranked descending by performance: SolidJS > Vue > Sycamore-rs > Svelte > React > Yew
SolidJS is a truly reactive JS framework. Sycamore-rs can be thought of as the Rust clone of SolidJS, and Yew can be thought of as the Rust clone of React.
These comparisons are always highly biased by the kind of work being done. V8 is so good at optimizing code, it's usually possible to reach similar performance in JS.
Not that relevant, but it looks pretty terrible on a hi-dpi screen, it's probably not accounting for the pixel density when creating the canvas. Looks much better on a normal screen.
> V8 is so good at optimizing code, it's usually possible to reach similar performance in JS.
It often is, but unless your code is purely numerical (in which case V8 is fast by default) you have to write very awkward code to make JavaScript fast. On the other hand you can often transliterate JS code 1:1 into Rust and it'll be 10x faster.
That is well demonstrated by the sourcemap events from 2018: Mozilla replaced part of the source-map JS library by a straightforward Rust rewrite compiled to WASM, yielding a 5.9x speed improvement (with much less variance to boot) (Oxidising Source Maps with Rust and WebAssembly).
mraleph then went through a pretty epic bout of algorithmic improvement and engine-specific optimisations, reaching not quite parity with the rust/wasm version but close (Maybe you don’t need Rust and WASM to speed up JS).
Nick Fitzgerald was then able to relatively easily add the algorithmic improvements to the WASM version for another 3x gain (~10x total from the original) (Speed Without Wizardry).
There can be value in "things will be fast by default" vs "things can be fast if you use a careful subset of the language". As well it is nice not to have to depend on v8 being the runtime to get good performance.
One thing I appreciate about, say, Rust vs C# is this: in both, using iterators to do operations on collections is the standard idiomatic thing to do. However, in C# there is always some overhead, and at big scale you have to question whether that overhead is OK. Usually it's fine, but sometimes you have to drop back to for loops.
With Rust the iterators pretty reliably get compiled down to very efficient code. So you just do the usual easy thing and get the good performance, no need to worry about it.
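As a small illustration of the two styles being compared, here is the same computation written as an iterator chain and as a hand-written loop. The function names are made up for this sketch. Whether the two compile to identical machine code depends on the compiler version and optimization level, and nothing was measured here, so treat the performance claim as the commenter's experience rather than something this snippet proves.

```rust
/// Sums the squares of the even values using iterator adapters.
pub fn sum_even_squares_iter(values: &[u32]) -> u64 {
    values
        .iter()
        .filter(|&&v| v % 2 == 0)
        .map(|&v| u64::from(v) * u64::from(v))
        .sum()
}

/// The same computation as an explicit loop, for comparison.
pub fn sum_even_squares_loop(values: &[u32]) -> u64 {
    let mut total = 0u64;
    for &v in values {
        if v % 2 == 0 {
            total += u64::from(v) * u64::from(v);
        }
    }
    total
}
```

The adapters are lazy and generic over concrete types, so the chain builds no intermediate collection; that is the property that lets the optimizer treat it much like the loop.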
> There can be value in "things will be fast by default" vs "things can be fast if you use a careful subset of the language".
One could equally say that there's value in "things will be general by default" vs "things can be general if you use obscure features of a language and thirty libraries to generalize the things that aren't general yet".
Yet most programs seem to be following the 80/20 rule, so the majority of code that has been ever written seems to prefer generality over raw speed. Of course one possible solution is to write the general parts and the fast parts of your program in different languages, but interoperation may be tedious.