The file sizes given by the site are wrong. Super Mario World is 512KB, or 508KB if you discard the padding at the end. You only get a file size of around 360KB by compressing it to ZIP format.
I think applying RLE to the file might give a better estimate, because files have blank space throughout, not just at the end. I just tried it on Super Mario World and the estimated size was 479,154 bytes (468KB).
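If you want to try the RLE idea without a dedicated tool, here's a rough Python sketch. It is not the scheme that produced the number above: it just assumes that any run of at least `min_run` identical bytes collapses to a fixed `run_cost` bytes and that everything else is stored as-is. The function name and both parameters are made up for illustration.

```python
def rle_estimate(data: bytes, min_run: int = 4, run_cost: int = 3) -> int:
    """Rough RLE size estimate (assumed scheme, not any specific tool's).

    Runs of >= min_run identical bytes are counted as run_cost bytes;
    shorter runs are counted as plain literal bytes.
    """
    total = 0
    i = 0
    n = len(data)
    while i < n:
        # Find the end of the run of bytes equal to data[i].
        j = i + 1
        while j < n and data[j] == data[i]:
            j += 1
        run = j - i
        total += run_cost if run >= min_run else run
        i = j
    return total


# Hypothetical usage; "game.sfc" is a placeholder file name:
# with open("game.sfc", "rb") as f:
#     print(rle_estimate(f.read()))
```

Tweaking `min_run` and `run_cost` changes the estimate a bit, but long stretches of padding collapse to almost nothing either way.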
If you don't have an RLE tool handy, you can force Pucrunch to act as an RLE-only compressor by using the -r 0 switch, which disables the LZ compression feature.
hexdump automatically "squeezes" repeated lines. Follow this with a line count and multiply by the bytes per line, and you'll get a rough number of non-repetitive bytes.
https://man7.org/linux/man-pages/man1/hexdump.1.html
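For anyone without hexdump handy, here's roughly the same squeeze-and-count idea as a Python sketch. It assumes 16 bytes per line (hexdump's default) and treats any line identical to the one before it as squeezed away. `squeezed_size` is a made-up name.

```python
def squeezed_size(data: bytes, width: int = 16) -> int:
    """Count the bytes in lines that differ from the line before them.

    Mimics the effect of hexdump's squeezing: within a block of identical
    lines, only the first one is counted.
    """
    kept = 0
    prev = None
    for off in range(0, len(data), width):
        line = data[off:off + width]
        if line != prev:
            kept += len(line)
        prev = line
    return kept
```

Like the hexdump trick, this only catches repetition that lines up on 16-byte boundaries, so treat the result as a rough number.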
The goal here isn't to tell the real file sizes; it's to estimate the "effective" file size without any padding. Padding can be at places other than the end of the file.
It's still quite impressive to fit such a lengthy and fun game into 508KB without any procedural generation, just tight assembly and clever use of bitmaps. This is programming art.
This is largely a lost art because of several bad paradigms that have become very popular in recent years, making software exponentially worse in the process. In the old days it was common, and NES games were even smaller than SNES games because the graphics were much less complex. Super Mario Bros 3, which many people still insist was better than Super Mario World, fit into 384KB, and I believe the original Super Mario Bros. fit into just 40KB (32KB of program code plus 8KB of graphics).
As the linked article notes, some of the special on-cart chips were mainly used for data decompression, like the SPC7110. So on those games, definitely compressed assets!
But... I'd love to know how often assets were compressed on "normal" carts without special chips.
Decompressing assets on the fly during gameplay action seems like it would be quite a challenge for the SNES' CPU.
My understanding is that images on title screens and cinematics were often compressed. Anime/comic art styles lend themselves really well to RLE compression because you have lots of consecutive pixels of the same color. And, obviously, these can be fairly static images that don't need to be updated 60 times a second.
Definitely an outlier, but: the title screen of Secret of Mana was actually a JPG that took around a minute (!!) to decompress. The music and scrolling text are cleverly designed to mask this: https://manaredux.com/lore/how-was-the-incredible-title-scre...
From what I've seen, on average, with many exceptions, most games will lightly compress most data. Tile graphics and layouts only need to be decompressed whenever a scene/level/map is loaded. Games aren't doing streaming audio or anything, so song data can be decompressed and uploaded to the sound processor before a level starts too.
Likewise, text needs to be decompressed once immediately before it's displayed, so games will usually compress that - it's quick enough to decompress a few hundred bytes while the text box loads.
The only thing that I've seen that's usually stored uncompressed is sprite animation data - particularly for player characters with lots of different animations. There's not enough VRAM to load all of it at once, so it needs to be streamed in, and in that case the CPU often just doesn't have the muscle.
I was actually just thinking last night about how something like Fortnite handled emotes. Skins and cosmetics I can understand, but they have 100 players and hundreds of emotes that any player can use at any time. Is it all really just streamed in on demand? Most common preloaded?
Maybe a little counter-intuitive, but Fortnite emotes (essentially, animations) take up much less memory than skins.
Generally for a modern polygonal game, you are using skeletal animation for the characters. A character consists of a polygon mesh and anywhere from dozens to hundreds of bones.
Animations consist of keyframes. Each keyframe represents just a handful of bytes for each bone (the XYZ coordinates for each end of the bone, and the rotational angle, or something like that). The animators create as many keyframes as they want, maybe 10-20 per second max. So a 5-second emote animation might contain something like 8 32-bit floats (32 bytes) * 100 bones * 10 keyframes per second * 5 seconds = ~160 kilobytes of uncompressed data.
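Multiplying out the figures quoted above (8 floats per bone, 100 bones, 10 keyframes per second, 5 seconds) as a quick sanity check. All of the inputs are the ballpark guesses from the comment, not measurements from any real game, and the names are just for illustration.

```python
BYTES_PER_FLOAT = 4  # a 32-bit float


def animation_bytes(floats_per_bone: int = 8,
                    bones: int = 100,
                    keyframes_per_second: int = 10,
                    seconds: int = 5) -> int:
    """Uncompressed size of a keyframed skeletal animation, in bytes."""
    bytes_per_keyframe = floats_per_bone * BYTES_PER_FLOAT * bones
    return bytes_per_keyframe * keyframes_per_second * seconds


print(animation_bytes())  # 160000
```

Even doubling the keyframe rate to 20 per second only brings that to 320,000 bytes.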
That's all you need to specify an animation. The rest is calculated on the fly at runtime and rendered to the screen at 60fps or whatever the current frame rate is. The graphics engine interpolates bone positions between your keyframes, and the player model mesh is deformed by the bones. Also, those skeletal animations can be shared between all player models.
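To make the interpolation step concrete, here's a minimal sketch that samples one bone's position between keyframes using plain linear interpolation. It's a simplification: I'd expect real engines to do something smarter for rotations, and the `(time, (x, y, z))` keyframe layout and the `sample` function are assumptions made for the example.

```python
from bisect import bisect_right


def sample(keyframes, t):
    """Linearly interpolate a bone position at time t.

    keyframes: list of (time, (x, y, z)) tuples sorted by time.
    Times outside the keyframe range clamp to the first/last keyframe.
    """
    times = [k[0] for k in keyframes]
    if t <= times[0]:
        return keyframes[0][1]
    if t >= times[-1]:
        return keyframes[-1][1]
    i = bisect_right(times, t)      # index of the first keyframe after t
    t0, p0 = keyframes[i - 1]
    t1, p1 = keyframes[i]
    a = (t - t0) / (t1 - t0)        # 0.0 at the earlier keyframe, 1.0 at the later
    return tuple(x0 + (x1 - x0) * a for x0, x1 in zip(p0, p1))


keys = [(0.0, (0.0, 0.0, 0.0)), (1.0, (1.0, 2.0, 4.0)), (2.0, (1.0, 2.0, 0.0))]
print(sample(keys, 0.25))  # (0.25, 0.5, 1.0)
```

The renderer can call something like this once per bone per frame, which is why the stored data only needs a handful of keyframes per second no matter what the frame rate is.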
The alternative to skeletal animation is fully prebaked animations. This involves minimal interpolation and calculation at runtime. It is more memory intensive because you are calculating the position of every point on the mesh ahead of time, and then storing that data on disk. This is generally how a very complex and non-interactive animation (think: cutscenes, etc) would be animated and stored on disk. Note, this is still far less storage-intensive than storing rendered video, and you still maintain a great deal of flexibility at runtime - you can change the camera location, rendering passes, resolution, etc. That's why you don't see a lot of prerendered video cutscenes these days.
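For a sense of scale, here's the same kind of back-of-the-envelope math for a prebaked animation that stores every vertex position on every frame. The 20,000-vertex mesh and the 30 stored frames per second are numbers I picked for illustration, not figures from any particular game.

```python
BYTES_PER_FLOAT = 4  # a 32-bit float


def baked_bytes(vertices: int, frames_per_second: int, seconds: int) -> int:
    """Uncompressed size of a per-vertex baked animation (XYZ per vertex per frame)."""
    return vertices * 3 * BYTES_PER_FLOAT * frames_per_second * seconds


# Assumed numbers: 20,000 vertices, 30 stored frames per second, 5 seconds.
print(baked_bytes(20_000, 30, 5))  # 36000000
```

Under those assumptions a 5-second clip comes to about 36 MB for one mesh, and unlike a skeletal animation it can't be shared between different character models.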
I guess you are right. I never actually did the math on animation storage; I just somehow assumed it was bigger, especially with the number of emotes modern cosmetic games have. They obviously use a standard character model (or at least a few) that every animation works with.