Shouldn't the allocation approach of WebAssembly change for this to be a real th...

0x457 · on March 24, 2023

There really isn't an allocation approach in WASM. It just gives you a buffer that can grow, and the rest is up to you, which means you have to bring your own allocator. How much fragmentation you get depends entirely on how good your allocator is.
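To make "a buffer that can grow, plus your own allocator" concrete, here is a minimal sketch in Python. It is a toy model, not real Wasm: `LinearMemory` and `BumpAllocator` are hypothetical names invented for illustration, and the only detail taken from WebAssembly itself is the 64 KiB page size.

```python
PAGE_SIZE = 64 * 1024  # WebAssembly linear memory grows in 64 KiB pages


class LinearMemory:
    """Toy stand-in for a Wasm linear memory: one buffer that can only grow."""

    def __init__(self, initial_pages=1):
        self.buf = bytearray(initial_pages * PAGE_SIZE)

    def grow(self, pages):
        # Append zeroed pages at the end; there is no matching shrink.
        self.buf.extend(bytes(pages * PAGE_SIZE))

    @property
    def pages(self):
        return len(self.buf) // PAGE_SIZE


class BumpAllocator:
    """Simplest possible bring-your-own allocator: hand out the next free offset."""

    def __init__(self, memory):
        self.memory = memory
        self.top = 0  # first unused byte offset

    def alloc(self, size, align=8):
        # Round the start up to the requested power-of-two alignment.
        start = (self.top + align - 1) & ~(align - 1)
        end = start + size
        if end > len(self.memory.buf):
            missing = end - len(self.memory.buf)
            # Grow by just enough whole pages to fit the request.
            self.memory.grow((missing + PAGE_SIZE - 1) // PAGE_SIZE)
        self.top = end
        return start
```

Everything beyond this (free lists, size classes, coalescing) is what a real allocator adds on top, and that is the code size you pay for in every module.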

Since we optimize for size on the web, it's hard to justify bringing the entire jemalloc into every wasm module.

deathanatos · on March 24, 2023

… the allocator can only do so much. If there's a small allocation after a large one, and the large one is freed and the small one held forever, you may never be able to de-fragment that.

I think it's telling that Unix started with sbrk and ended up with mmap, here.

flohofwoe · on March 25, 2023

The OS just gives you new chunks of memory, but it's the job of the application to manage the memory it has been given by the OS.

Modern general-purpose allocators are a bit more clever than just grabbing the next available address from a single free list; they usually group allocations by size into buckets.
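As a rough illustration of grouping by size, here is a minimal Python sketch of segregated free lists. The class name, the power-of-two size classes, and the 16-byte minimum are all assumptions made for the example; real allocators choose their classes differently.

```python
class SizeClassAllocator:
    """Toy allocator with one free list per size class (hypothetical design)."""

    MIN_CLASS = 16  # assumed smallest bucket, in bytes

    def __init__(self):
        self.free_lists = {}  # size class -> list of free offsets
        self.live = {}        # offset -> size class of the live block
        self.top = 0          # first never-used offset

    @staticmethod
    def size_class(size):
        # Round the request up to the next power-of-two bucket.
        cls = SizeClassAllocator.MIN_CLASS
        while cls < size:
            cls *= 2
        return cls

    def alloc(self, size):
        cls = self.size_class(size)
        bucket = self.free_lists.setdefault(cls, [])
        if bucket:
            offset = bucket.pop()  # reuse a freed block of the same class
        else:
            offset = self.top      # otherwise carve a fresh block
            self.top += cls
        self.live[offset] = cls
        return offset

    def free(self, offset):
        cls = self.live.pop(offset)
        self.free_lists.setdefault(cls, []).append(offset)
```

Because a freed block only ever serves requests of its own class, a small allocation cannot split a large hole, which is the fragmentation benefit being described. The cost is internal waste from rounding up.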

Then there are specialized allocators (e.g. arena, bump, stack ...) which not only prevent fragmentation but also increase memory management performance.

In general, memory fragmentation in a limited address space is only a problem for programs that don't have a proper memory management strategy.

deathanatos · on March 27, 2023

I'm aware of all of the things that you mention, and you can assume that the allocator I am discussing here is reasonably intelligent and does all of those things.

Even with everything you mentioned, the point is that the allocator (doing all those things) cannot return free memory to the OS, because the APIs simply don't exist for it: that's what mmap() provides over sbrk().

In the worst case, even for an allocator doing all those things, this can result in an allocator being unable to return any memory: if exactly the last allocation remains alive, sbrk() cannot be called (as you'd free a still-live allocation); but in the case of mmap(), we can free most, or in the best case, all of the memory not in actual use.
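The worst case can be put in numbers with a small Python model. This is only a sketch under stated assumptions: a 4 KiB page, a heap described as a list of live `(start, size)` ranges with `size > 0`, and two hypothetical helper functions. It does not call the real `sbrk()` or `mmap()`; it only counts what each interface would allow you to give back.

```python
PAGE = 4096  # assumed page size in bytes


def returnable_sbrk(heap_end, live):
    """Bytes that could be returned by lowering the break.

    The break can only drop to the end of the highest live allocation,
    rounded up to a page boundary.
    """
    highest_live_end = max((start + size for start, size in live), default=0)
    new_break = -(-highest_live_end // PAGE) * PAGE  # ceiling to a page
    return heap_end - new_break


def returnable_mmap(heap_end, live):
    """Bytes that could be returned if any page can be released on its own.

    Only pages touched by a live allocation have to stay.
    """
    used_pages = set()
    for start, size in live:
        first = start // PAGE
        last = (start + size - 1) // PAGE
        used_pages.update(range(first, last + 1))
    return heap_end - len(used_pages) * PAGE


# Worst case from the comment: a 1 MiB heap where only the very last
# 16-byte allocation is still alive.
heap_end = 256 * PAGE
live = [(heap_end - 16, 16)]
sbrk_bytes = returnable_sbrk(heap_end, live)
mmap_bytes = returnable_mmap(heap_end, live)
```

In this model `sbrk_bytes` is 0 and `mmap_bytes` is 255 pages, which is the gap being argued about. A grow-only linear memory behaves like the first function.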

vouwfietsman · on March 24, 2023

Giving you a buffer that grows is the allocation approach I am talking about. This is not how your OS works. Your OS itself works with an allocator that does a pretty good job making sure that your memory ends up not fragmented. Because WASM is in between, the OS is not in control of the memory, and instead the browser is. The browser implementation of "bring your own allocator" is cute but realistically just a waste of time for everybody who wants to deploy a wasm app because whatever allocator you bring is crippled by the overarching allocator of the browser messing everything up.

It seems like the vendors are recognizing this though, with Firefox apparently now having a discard function!

https://github.com/WebAssembly/design/issues/1397

sph · on March 25, 2023

> This is not how your OS works. Your OS itself works with an allocator

This is exactly how your OS works. There is no malloc syscall. You call sbrk [1] (the modern mechanism to do that on Linux eludes me, maybe it's through mmap), get a chunk of memory and do your thing. malloc is implemented in glibc, and you can swap it out for any other allocator, but your kernel just gives you a chunk of memory and it is your job (as a userland process) to suballocate it, deal with fragmentation, etc.

1: https://en.wikipedia.org/wiki/Sbrk

vouwfietsman · on March 25, 2023

I am not sure you read the link I posted. Picking a single point out of it, like "No way to shrink the allocated Wasm Memory.", makes it clear that this allocation strategy is really very different from how your OS works when you call malloc/free outside of wasm.

Also take a look at "Some applications need address space, not memory" in that link, same stuff.

Long story short, running inside WASM is like running without virtual memory. Virtual memory was invented to solve a bunch of issues, so now WASM has all those issues unsolved.

flohofwoe · on March 25, 2023

Allocation strategies are the job of the program that's run, not of the VM the program runs inside. "Just" use an allocation strategy that doesn't fragment your heap. Usually the best solution is to use a handful of specialized allocators instead of a single 'general purpose' allocator.