Hacker News
Decompilation of Paper Mario for N64 (papermar.io)
151 points by skibz on Jan 13, 2024 | 52 comments


Fascinating work. I know that people did the same for Super Mario 64 [1]. It is still unbelievable to me that they can generate a bit-for-bit identical copy of the original ROM by simply running some old gcc on actual C source code files. IIRC the main insight was that Nintendo did not use any optimization flags, which made it possible to create a 100% matching binary. I've never really looked into it, but I guess they did some trial and error to figure out the exact gcc version that Nintendo used, to make sure they got 100% identical binary code?

[1] https://github.com/n64decomp/sm64


SM64 uses SGI's IRIS Development Option (IDO) compiler. And yes, it's unoptimised.

Paper Mario, however, /does/ use GCC, and it's optimised. Figuring out the compiler version was fairly easy as there's a limited number of options - we know when the game began development, so we looked for releases around that time. The harder parts were figuring out compiler flags (consider all the -f flags affecting code generation; papermario used -fforce-addr) and coming to the terrifying conclusion that the compiler was modified!

The majority of papermario was built with a modified build of GCC 2.8.1 [1] at -O2. The SDK code (libultra, nusystem) was built with GCC 2.7.2 at -O3. The iQue Player version, i.e. the Chinese release, was built with EGCS.

[1] https://github.com/pmret/gcc-papermario
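
The flag hunt described above can be sketched as a brute-force search: compile with each candidate flag combination and keep the ones whose output equals the target bytes. This is only a sketch of the idea, not the project's actual tooling. `compile_fn` is a hypothetical callable standing in for a real compiler invocation, and `fake_compile` exists purely so the example runs.

```python
from itertools import combinations
from typing import Callable, Iterable, List, Sequence, Tuple


def find_matching_flags(
    compile_fn: Callable[[Sequence[str]], bytes],
    target: bytes,
    opt_levels: Iterable[str],
    extra_flags: Sequence[str],
    max_extra: int = 2,
) -> List[Tuple[str, ...]]:
    """Return every flag combination whose compiled output equals `target`."""
    matches = []
    for opt in opt_levels:
        # Try the optimisation level alone, then with 1..max_extra extra flags.
        for n in range(max_extra + 1):
            for combo in combinations(extra_flags, n):
                flags = (opt, *combo)
                if compile_fn(flags) == target:
                    matches.append(flags)
    return matches


def fake_compile(flags: Sequence[str]) -> bytes:
    """Hypothetical stand-in for a compiler run: output depends only on the flag set."""
    return "|".join(sorted(flags)).encode()


if __name__ == "__main__":
    target = fake_compile(["-O2", "-fforce-addr"])
    found = find_matching_flags(
        fake_compile,
        target,
        opt_levels=["-O0", "-O1", "-O2", "-O3"],
        extra_flags=["-fforce-addr", "-fno-inline", "-funroll-loops"],
    )
    print(found)
```

With a real compiler, `compile_fn` would shell out to each candidate GCC build and return the object code for one known function, so the same loop can also be run across compiler versions.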


Thanks for the insight. The fact that Paper Mario uses optimization flags makes this project even more fascinating. Great work.


There's an amazing video about refactoring Mario 64 to reach 30 FPS and render 6x faster on N64:

https://youtu.be/t_rzYnXEQlE?si=ZWpp7-74cMdbsdba

The changes were much more extensive than just removing the debug flags.

They were also able to get local coop and a 4 player split screen working which were features originally intended for the game but cut for performance reasons.


> I guess they did some trial and error to figure out the exact gcc version that Nintendo used

The official SDKs have been leaked for decades, so they probably just looked there.


Did Nintendo's SDKs use GCC back then?

Hopefully they shipped the GPL source with it!


They used IDO, but GCC was an option, I think.


Hah love the domain hack here.

But seriously, it's nice to see the number of decomps and disassemblies for old video games we're getting recently. First Super Mario 64 and The Legend of Zelda: Ocarina of Time, now just about every other NES to N64 era Nintendo game you can think of. This will certainly be useful for mods in future.


IIRC mods are already utilizing them! I think the OoT mod Indigo is.


The part of decompilation that amazes me most is getting binaries to match 1-to-1 (up to hashing, it seems). The level of knowledge about executable formats, layout, and how to control the linker is beyond me.
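
For what it's worth, the final "does it match" check is the easy part: build the ROM and compare its hash against the original's. A minimal sketch, assuming the caller supplies both file paths (each project has its own build layout and checksum step, so the function names here are made up):

```python
import hashlib
from pathlib import Path


def sha1_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a large ROM need not fit in memory at once."""
    digest = hashlib.sha1()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def roms_match(built: Path, original: Path) -> bool:
    """True when the rebuilt ROM and the original have the same SHA-1 digest."""
    return sha1_of(built) == sha1_of(original)
```

The hard part is everything that has to be right before this returns True: the code, the data, and the order the linker lays them out in.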


Check out http://decomp.me - it’s a community-built tool used by a lot of video game decompilation projects. You put in the original assembly, it will attempt a decomp, and then you fiddle with the source (using the same toolchain & flags known or best guessed to have been used by the original devs) until it matches perfectly. It’s super cool.
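
The fiddle-until-it-matches loop boils down to diffing the target assembly against whatever your current source compiles to. A rough sketch using Python's `difflib`; the similarity score is an illustrative assumption rather than decomp.me's actual scoring, and the MIPS lines are made up:

```python
import difflib
from typing import List


def diff_asm(target: List[str], current: List[str]) -> List[str]:
    """Return unified-diff lines between target and current assembly."""
    return list(
        difflib.unified_diff(target, current, "target", "current", lineterm="")
    )


def similarity(target: List[str], current: List[str]) -> float:
    """1.0 means the two listings are identical; lower means more work to do."""
    return difflib.SequenceMatcher(None, target, current).ratio()


if __name__ == "__main__":
    target = ["addiu $sp, $sp, -0x18", "sw $ra, 0x10($sp)", "jal func", "nop"]
    current = ["addiu $sp, $sp, -0x18", "sw $ra, 0x14($sp)", "jal func", "nop"]
    print(similarity(target, current))
    print("\n".join(diff_asm(target, current)))
```

Each edit to the C source gets recompiled and rescored, so you can see whether a change moved you closer to the target or further away.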


Building decomp.me was really worth it. It even helped us match the very last function - someone from the Metal Gear Solid decompilation team finished it off.

https://decomp.me/scratch/GImYC https://github.com/pmret/papermario/pull/1019


Thank you for that link! It's good to be learning about decompilation/RE tools that aren't just Ghidra, IDA, Hopper. It looks like it works by function, which I'm sure will help future decompilation efforts.


I wonder if you could train an LLM to decompile binaries...


GPT-4 can already do that, but... not very well. And mainly for the major architectures.


If it produces wrong results, can it do it then, though?

I mean, my cat too can do that.

Getting tiresome with AI apologetics everywhere.


No, that's a silly comparison.

I can do a bit of decompilation with some mistakes, GPT4 can do more with some mistakes, hexrays can do much more with more accuracy but still makes mistakes. Nothing needs to be 100% perfect.


uh this is ai apologeticism, you keep making up a very low baseline and keep being impressed when it is reached.

i insist my cat too can make decompilation mistakes if only i equip her with the keyboard paws


This is another decompilation project that hosts copyright-infringing code in its GitHub repo. Considering how commonly these projects do this, it would be wise for someone to create a project that helps them avoid containing the decompiled code itself. What I would expect is that instead of code there is a decompiler, a symbol map, a way to add comments, and a way to fix up messy code. Then there is a build step to create the code from the ROM. The project already handles extracting assets from the ROM instead of just including them in the GitHub repo like the code.


In practice this makes little difference: Nintendo's gonna C&D and sue you anyway, and you're not going to have the money to fight it. So might as well make it more convenient.


It's actually the inverse. In practice Nintendo has allowed all of these projects to exist; some have speculated that it's because Nintendo cares more about the assets, or that Nintendo focuses more on taking away more direct routes of piracy. It's true that it makes little difference right now, but I don't think publishing decompiled code is good practice or a good habit for the game decompilation scene to get into.


Didn't Nintendo just DMCA the Portal64 project via Valve/Steam?

>publishing decompiled code is [not] good practice

I remember specifically that Portal64's developer required you to have the Portal.steam.app in order to build your own N64 version (using his tools/code).


>Didn't Nintendo just DMCA the Portal64 project via Valve/Steam?

No, Valve's legal team reached out to the creator and became worried when they learned that the project uses libultra from the leaked n64 sdk from Nintendo. To avoid any potential legal problems Valve asked the creator of Portal 64 to shut it down.

>using his tools/code

It required pirating Nintendo's N64 SDK in order to build. These decompilation projects don't require the SDK, but as my original comment suggests, they do contain code that is a derivative work of libraries from the SDK.


Thank you for the clarifications. I just purchased an N64 Console to re-visit a childhood I never really explored (didn't have Nintendo as a kid).

All this modern history was mind-blowing... a way-underpriced local CL listing for an N64 (with ALL the old games) has taken me way too far into this reverse-engineered rabbit-hole.


Dev here, AMA


is the demand for n64 decompilations partially fueled by the difficulty people have had in developing emulators for the machine?


To my knowledge: not really. Porting to other platforms is usually an explicit non-goal by the decompilation team. For some reason, port projects tend to attract the wrong kinds of attention, both from the public at large and from dodgy people in the romhacking community.


Do all decompilation projects for 90s video games out there aim for perfect matching?

I get why perfect decompilation is a big deal, I'm just wondering if there are other approaches out there besides perfect decompilation in the community at large.


As far as I know, yes. Besides simple differences like register allocation, it's difficult to prove that your code behaves the same as the target if it's nonmatching. It's also just really satisfying when you get a match.
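
To make the register allocation point concrete: two listings that differ only in which registers were picked can be recognised by checking for a consistent one-to-one renaming. A rough sketch under that assumption. The regex, the list of fixed registers, and the sample MIPS lines are illustrative, and passing this check is only a hint of equivalence, not a proof (it ignores calling conventions, for instance).

```python
import re
from typing import Dict, List

REG = re.compile(r"\$\w+")
# Registers with a fixed role must not be renamed (illustrative, not exhaustive).
FIXED = {"$zero", "$sp", "$ra"}


def differs_only_by_registers(a: List[str], b: List[str]) -> bool:
    """True if `b` equals `a` under a consistent one-to-one register renaming."""
    if len(a) != len(b):
        return False
    fwd: Dict[str, str] = {}
    rev: Dict[str, str] = {}
    for line_a, line_b in zip(a, b):
        # Everything except the register names must be identical.
        if REG.sub("$r", line_a) != REG.sub("$r", line_b):
            return False
        for ra, rb in zip(REG.findall(line_a), REG.findall(line_b)):
            if (ra in FIXED or rb in FIXED) and ra != rb:
                return False
            # Each register must map to exactly one partner, in both directions.
            if fwd.setdefault(ra, rb) != rb or rev.setdefault(rb, ra) != ra:
                return False
    return True
```

Anything beyond a pure renaming (reordered instructions, a different immediate, an extra load) fails the check, which is exactly the territory where proving equal behaviour gets hard.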

When doing standard reverse engineering, you might use something like Ghidra or Hex-Rays. This is what the developer of noclip.website [1] did to reimplement a lot of Mario Galaxy code, such as enemy AI.

[1] https://noclip.website/#smg/AstroGalaxy


Ik there is a 'why' page, but why are you specifically interested in this stuff?

great work btw


Thanks! I'll tell my story...

I first played Paper Mario on a PAL N64 when I was around 6 or 7 years old, and I recall not being able to get past certain sections because I could barely understand the dialogue. The cartridge had a previous owner who left a completed savefile, so after getting stuck I loaded it and eventually defeated the final boss. I remember the day vividly where I left the console on overnight displaying the "The End" screen because I thought there might be an easter egg - I'd just learned about Totaka's song in Luigi's Mansion (GCN). There is no postgame in Paper Mario, but I always dreamed of one.

In my time in the modding community I've found that I prefer to create documentation and tools for others to express their creativity than create my own, which is why tools like [1] and [2] exist. The decompilation lets users make mods even more flexibly than previous tools, so I hope to see some people build some cool stuff.

I also just love learning about how this game from my childhood works. It feels kinda like archaeology: discovering parts of the engine where hacks were thrown in at the last minute, finding code that was linked against an earlier version of the engine, etc.

[1] https://mamar.nanaian.town/ [2] https://github.com/nanaian/papermario-dx


Would love to see Megaman legends 64 decompiled


Paper Mario is one of those games that I've yet to finish. Love the story, the controller, the graphics... I've tried to play it many times using emulators with no success; there's always something glitching in some area of the game, making progress very hard. So I ended up subscribing to Nintendo Switch Online on my Switch just to play it.


Are you aware that the sequel, The Thousand-Year Door, will be coming out this year updated for the Switch?

It was a fantastic game. If you wanna play you’re in for a treat.


Wasn't there a sequel made for the 3DS a couple of years ago as well?


Technically, though that was Sticker Star, and modern Paper Mario from that game onwards is a very different beast.

Those games lose the interesting stories and character design and mechanics that made the first two so beloved (Super Paper Mario is its own thing), and are heavily criticised by the fanbase as a result. Sticker Star is usually seen as the worst in the series due to the awkward puzzles, bosses where a single arbitrary item is needed to win and the generic characters and settings, but Color Splash and Origami King get their fair share of criticism too.

My advice to anyone reading this is to skip Sticker Star, maybe watch a Let's Play of Color Splash and try out Origami King if you're willing to accept it'll be nothing like the first two.


This is great! I wonder how long until we see GPT-assisted decompilation.

Taking a peek at the source, it's so interesting to see a piece of history. For example, this was released in Japan in 2000, then internationally months later. As I recall, there was awareness building around the idea that vibrating controllers (here, the N64 Rumble Pak accessory) cause RSI or carpal tunnel. Since the developers shortened the rumble length outside of Japan, it looks like they were aware as well: https://github.com/nanaian/papermario-dx/blob/main/src/rumbl...

I wonder what led to this decision being made at the exclusion of the JP release.


If current AI can barely do maths, decompilation is not something I'd expect it to do well. It will of course try and come up with something plausible, but often subtly wrong.


gpt4 can spit out accurate unoptimized AST of javascript and python. (I just tried it.)

Now to test emitting.
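
For anyone wanting to check a model's AST against ground truth, Python ships the reference in its standard library. A small sketch; the sample source is made up, and both `indent=` and `ast.unparse` need Python 3.9 or newer:

```python
import ast

source = "def add(a, b):\n    return a + b\n"
tree = ast.parse(source)

# Pretty-print the tree produced by Python's own parser.
print(ast.dump(tree, indent=2))

# Round-trip the tree back to source code ("emitting").
print(ast.unparse(tree))
```

Comparing `ast.dump` output, rather than raw source text, sidesteps differences in whitespace and comments.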


Would decompilation be closer to arithmetic or translation?


Depends on your goal. If it's matching decompilation, probably the former.

There's been research into the latter but it's in early stages. https://github.com/nforest/awesome-decompilation?tab=readme-...

decomp.me gives us a large database of C(++) <-> target asm to train a model on ;)


If you want the decompiled code to produce a 1:1 match with the original binary (even if it takes some finessing by hand at the end) you need something rigorous approaching arithmetic. A fuzzy decompiler that just approximates the intent of the original code won't get you there (and this is mostly what you get out of GPT for many tasks), but it could still be useful for something.


Imagine adding accessibility features to these older games through this kind of stuff. :)


I've actually been working on a mod for Paper Mario that aims to make it blind-accessible! However, I'm not sure what the best practices or prior art are for representing certain features. Do you have any good resources?


I’m just blown away by how clean the code is for the mod version and how well the documentation is written.


Paper Mario: The Thousand-Year Door is being remade by Nintendo for the Nintendo Switch.


Do Majora's Mask next! Jet Force Gemini!

Where do I donate?


Just so you know, those in charge of this project aren't going to port it to PC, though someone else could do that. I'm saying this in case the reason you want to donate is because you expect a port to come out of it.

> PC Port?

There's a lot of people interested in playing a PC port of Paper Mario. Unfortunately, making a port isn't our focus. Porting the game isn't why we made the decomp project, and it's not a motivating factor in delving into this game. We have so many exciting goals for the project including decompiling other versions, making modding easier, and further understanding and documenting the codebase; making a port isn't really on our radar.


These are the first steps to opening a wide universe of possibilities.

The things that Kaze Emanuar [1] has pulled off with Mario 64 are amazing.

[1] https://www.youtube.com/@KazeN64/videos


I would like to see similar improvements made to the engine of Perfect Dark, as that game really pushed the N64 hardware.


I would love to see Perfect Dark framerate issues fixed and played on the original hardware.


>Majora's Mask

Already in progress if you want to contribute: https://zelda64.dev/games/mm



