The computer-assisted component of the Noperthedron proof is a reasonably small SageMath program that was (as far as I know) written by humans: https://github.com/Jakob256/Rupert
Perhaps you have confused this article with a recent unrelated announcement about a vibe-coded proof of an Erdős conjecture? https://borisalexeev.com/pdf/erdos707.pdf
Oops you're right! I read these both yesterday and they blended together in my memory by the time I made this comment this morning. I knew something felt "off".
Tangentially, I'll have to reconsider my position on long-but-lossy-context LLMs.
Last month, before this result came out, the question "Is Every Convex Polyhedron Rupert?" was added as a formal Lean statement to Google's Formal Conjectures repository:
Interesting. My guess is that it's not prohibitively hard, and that someone will probably do it. (There may be a technical difficulty I don't know about, though.)
David Renshaw recently gave a formal proof in Lean that the triakis tetrahedron does have Rupert's property: https://youtu.be/jDTPBdxmxKw
> David Renshaw recently gave a formal proof in Lean that the triakis tetrahedron does have Rupert's property
That's me!
This result appears to be significantly harder to formalize.
Steininger and Yurkevich's proof certificate is a 2.5GB tree that partitions the state space into 18 million cells and takes 30 hours to validate in SageMath.
Formalizing the various helper lemmas in the paper does seem achievable to me, but I suspect that applying them to all of the millions of cells as part of a single Lean theorem could present some significant engineering difficulties. I think it'd be a fun challenge!
If that turns out to be infeasible, an alternate approach might be: write a Lean proof that the 2.5GB tree faithfully encodes the original problem, while still delegating the validation of that tree to an external SageMath process. Such a formalization would at least increase our confidence that there are no math errors in the setup. Bernardo Subercaseaux et al. took a similar approach in their recent paper formally verifying a SAT-solver encoding for the "empty hexagon number": https://arxiv.org/abs/2403.17370
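A hedged sketch of what that split might look like in Lean — every name below is an invented placeholder, not from the actual paper or repository:

```lean
-- All names here are hypothetical placeholders.
axiom CertTree : Type                    -- the parsed 2.5GB tree
axiom IsValid : CertTree → Prop          -- "every cell passes its check"
axiom Noperthedron : Type
axiom IsRupert : Noperthedron → Prop
axiom P : Noperthedron
axiom t : CertTree

-- This is the part that would actually be proved inside Lean
-- (stated as an axiom only in this sketch): the tree faithfully
-- encodes the problem, so its validity implies the geometric claim.
axiom encoding_faithful : IsValid t → ¬ IsRupert P

-- This is the part delegated to the external SageMath run,
-- imported as trust rather than proved:
axiom sageValidated : IsValid t

theorem not_rupert : ¬ IsRupert P := encoding_faithful sageValidated
```

The trusted base then includes SageMath and the orchestration glue, but the mathematical setup itself is kernel-checked.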
Lean does feasibly let you shard out verification work. You can take a huge proof tree, chop it into an assortment of independently-proven branches (trivial if it's a true tree), and verify those branches in parallel on a simple cluster. Each cluster task reports back "this intermediate theorem — the open goal at the point the branch was cut out of the full tree — is true", without the main kernel ever needing to see the branch's actual proof. You then inject that feedback into the main kernel's verification cache (or just declare those intermediate theorems as true), and write "apply this_theorem_we_cut_out_here" in place of each cut-out branch when writing up the main tree.
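The orchestration pattern described above can be sketched in a few lines of Python. This is a hypothetical illustration (all names invented): each "branch" stands in for a cut-out subtree handed to a worker, and the root is accepted only once every branch checks out.

```python
# Hypothetical sketch: verify cut-out branches of a certificate tree
# independently, then accept the root once every branch has passed.
# Threads are used here for simplicity; real kernel runs would be
# separate processes or cluster jobs.
from concurrent.futures import ThreadPoolExecutor


def verify_branch(branch):
    """Stand-in for running the kernel on one cut-out subtree.

    Here a branch is just an (expected, computed) pair; a real branch
    would be a Lean file whose open goal matches the cut point.
    """
    expected, computed = branch
    return expected == computed


def verify_sharded(branches, workers=4):
    """Check every branch in isolation.

    If all branches pass, the main proof may treat each branch's goal
    as an already-verified theorem and simply `apply` it.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return all(pool.map(verify_branch, branches))
```

The key property making this sound is that the verifier has no false positives, so parallel scheduling can't manufacture a bogus proof — only fail to finish one.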
Good thing that, as long as you verify the entire result and make sure your verifier orchestration doesn't have bugs of a "cheating" nature, you can let AI run pretty wild on transforming/translating/chopping a Lean proof tree, because the verifier is already presumed to be an oracle with no false positives.
E.g., here it could potentially help translate SageMath representations to Lean 4 representations, with the only downside that a failed verification in Lean could be due merely to an erroneous AI-assisted translation.
Overall, given the nature of proving that a polyhedron doesn't have Rupert's property, I'd think there should be fairly straightforward (if not actually trivial) ways of sharding the proof.
The paper seems to talk of a 5-dimensional search space; more generally I'd think it's 8 dimensions to account for the translation through the proposed hole (still assuming you don't rotate the polyhedra as one passes through the other):
"attack direction (angle of the penetrated)" from any direction (3D; unit quaternion),
while the penetrator is facing any direction (3D; unit quaternion),
shifted sideways any amount normal to the direction of translation (2D; simple 2D point),
valid at any translation/penetration depth (1D; simple distance/real),
while cancelling one dimension's worth of freedom, because only the mutual twist along the direction of translation matters (not absolute twist).
There are some mirror/flip symmetries that each take a factor of 2 out, but that's insignificant as long as we keep the dimensions of our geometry fixed at 3.
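To make those degrees of freedom concrete, here is a minimal hypothetical sketch (plain Python, not SageMath, all names invented) of the pointwise check at the heart of the property: rotate both polyhedra, project their vertices onto the plane, and ask whether the inner shadow — shifted by a 2D offset — fits strictly inside the outer shadow. It uses floating point, so it illustrates the structure of the search space rather than serving as any kind of certificate.

```python
# Hypothetical sketch of one point of the Rupert search space.
import math


def cross(o, a, b):
    """2D cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull(points):
    """Andrew's monotone chain convex hull, counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def half(seq):
        h = []
        for p in seq:
            while len(h) >= 2 and cross(h[-2], h[-1], p) <= 0:
                h.pop()
            h.append(p)
        return h

    lower = half(pts)
    upper = half(list(reversed(pts)))
    return lower[:-1] + upper[:-1]


def strictly_inside(p, poly):
    """Is p strictly inside the CCW convex polygon poly?"""
    n = len(poly)
    return all(cross(poly[i], poly[(i + 1) % n], p) > 0 for i in range(n))


def shadow(vertices, yaw, pitch):
    """Rotate about z then x (a crude 2-angle stand-in for a full
    orientation), then project onto the xy-plane by dropping z."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    out = []
    for x, y, z in vertices:
        x1, y1 = cy * x - sy * y, sy * x + cy * y
        y2 = cp * y1 - sp * z
        out.append((x1, y2))
    return out


def fits(inner, outer, dx=0.0, dy=0.0):
    """Does the shifted inner shadow sit strictly inside the outer one?"""
    h = hull(outer)
    return all(strictly_inside((x + dx, y + dy), h) for (x, y) in inner)
```

A "Rupert" witness is then one choice of the orientation, offset, and twist parameters for which `fits` holds; proving a polyhedron is *not* Rupert means showing `fits` fails everywhere, which is exactly why the certificate must cover the whole parameter space.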
Now having thought about it a bit more to write this, I think it'd be mostly (automatable brute-force type) mechanical once you translate the coarse proof structure and theorem definitions, because you're "just" sweeping 5 (or 8) degrees of freedom while partitioning the search space whenever your property definition hits a branch.
A benefit of that would be a possibly trivially parallel, flat certificate that's basically composed of two parts:
(1) a list of intervals in the 5/8 dimensional search space that together cover the entire search space,
and (2) for each listed interval, a branch-free verifiable statement (certificate) that the property definition applies in a definitionally uniform manner across said interval.
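In one dimension, checking part (1) reduces to a sort-and-sweep coverage test. A hypothetical sketch (the interval representation is invented; the real certificate would cover boxes in 5 or 8 dimensions):

```python
# Hypothetical sketch: verify that a list of closed intervals jointly
# covers the whole search range [lo, hi]. Any gap means some region
# of the parameter space went unverified.
def covers(intervals, lo=0.0, hi=1.0):
    reach = lo
    for a, b in sorted(intervals):
        if a > reach:          # uncovered gap before this interval
            return False
        reach = max(reach, b)
    return reach >= hi
```

Part (2) — that the property holds uniformly on each interval — is the harder half, but it is per-interval and branch-free, so the whole certificate check stays embarrassingly parallel.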
The most annoying bit might be that they use different, though equivalent, definitions of the property, so you would also need to formalize the proof of the equivalence of definitions.
The IMO Grand Challenge is "formal to formal" -- a solver is given the problem specified in the Lean programming language, and must produce a solution in Lean. To see more concretely what this setup might look like, check out https://github.com/dwrensha/compfiles.
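For a sense of what "formal to formal" means concretely: the solver is handed a bare Lean statement like the toy one below (invented for illustration, not an actual competition problem), and a valid submission is any proof the kernel accepts.

```lean
-- Toy stand-in for a formally stated competition problem.
theorem toy_problem (n : Nat) : n * 2 = n + n := by
  omega
```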
The AI MO prize is "informal to informal" -- a solver is given a problem in natural language and must produce a solution in natural language.
My belief is that the best way to get to "informal to informal" is to first solve "formal to formal", but not everyone thinks so.
Do you have formal problems with formal solutions (so you know the formalization is sufficiently complete) that cannot be found by existing (powerful) computer search techniques?
As far as I know, based on published systems like LeanDojo [1] and Magnushammer [2], computers today can only solve a small handful of the very easiest of these problems (like maybe Imo1959P1).
If you're interested in computer-assisted backronym composition, please check out this interactive tool that I made, based on GPT-NeoX/Pythia models: https://github.com/dwrensha/acronymy-assistant
It's quite convenient to have a continuous range of model sizes. Usually I want "the largest model that fits on my GPU", but sometimes I want to trade between quality and performance, and Pythia makes that easy.
I tried this the other day, and I was also impressed! I had assumed that it would make an illegal move eventually, but it never did. I won the game, but it wasn't trivial. (I've been out of practice in chess for a while; my rating would probably be something like 1700.)
dwrensha vs GPT-4 (through the ChatGPT interface)
16 March 2023
Note that my move 14. O-O-O was a blunder, allowing 14. ... Bxc4, but my opponent instead replied with 14. ... Nc5, which handed me a decisive advantage.
I very nearly fell for a stalemate with 41. gxh3.
After 42. Nxh2, my opponent said: "At this point, it's clear that white has a winning advantage with a promoted queen soon to appear. With your material advantage and superior position, victory is nearly certain. Would you like to continue playing, or do you want to start a new game or discuss this game?" I said, "Let's continue!"
After 52. Kc7, my opponent said "I have no moves left and it's checkmate. Congratulations! You've won the game." I replied: "You do have a move: you can do 52. ... Ka7". My opponent then said, "Apologies for the confusion. You are correct. I'll play 52...Ka7. Your move." Then I typoed the final move as "53. Kb7#" (instead of "53. Qb7#"), and my opponent did not correct me: "You played 53. Kb7#. This time, it is indeed checkmate. Congratulations! You've won the game. If you'd like to play another game, analyze this one, or ask any other questions, feel free to let me know!"
ChatGPT always apologizes if you correct it and then tries to work with your correction. That makes it seem humble, when in fact it's just a brainless, probably halfway-hardcoded response.