When I was at EFF, I helped run the Cooperative Computing Awards project (https://www.eff.org/awards/coop/), which was funded decades ago by an individual donor and provides cash rewards for finding prime numbers of certain sizes.
Because of EFF's advocacy for cryptography, people always thought that there was a more direct connection between the Mersenne prime search and cryptographic techniques, but the connection is at best distant and tenuous. The record primes that GIMPS finds and that can receive EFF's awards are
* much (much much) bigger than the primes that we use for cryptography
* too big to do almost any kind of practical computation with
* not secret, so couldn't be used as part of a secret key
The distribution or explicit values of Mersenne primes (and, correspondingly, perfect numbers) also don't really help in our assessment of the difficulty of the RSA problem or the security of public-key cryptography, because they aren't closely related to any of the fastest factorization algorithms that apply to arbitrary numbers.
(Mersenne primes are much easier to test deterministically for primality than numbers in general, which is why the record primes that GIMPS finds are Mersenne primes. The runtime complexity of primality-testing algorithms for special-form numbers is sometimes much lower than that of algorithms for arbitrary numbers, and this is a great example.)
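To make that concrete, here is a minimal Python sketch of the Lucas-Lehmer recurrence, the classic special-form test for Mersenne numbers. This is purely illustrative toy code of my own, nothing like GIMPS's heavily optimized FFT-based implementation:

```python
def is_small_prime(n: int) -> bool:
    """Trial division, just to pick out prime exponents for the demo."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def lucas_lehmer(p: int) -> bool:
    """Deterministic test for M_p = 2^p - 1 with p an odd prime:
    s_0 = 4, s_{k+1} = s_k^2 - 2 (mod M_p); M_p is prime iff s_{p-2} == 0."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0

# Prints the odd exponents of the Mersenne primes below 2^128:
# 3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127
print([p for p in range(3, 130) if is_small_prime(p) and lucas_lehmer(p)])
```

For exponents in the tens of millions the repeated squaring is the expensive part, which is where GIMPS's FFT multiplication work comes in.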
I think the main contribution of projects like GIMPS to cryptography and computer security lies in getting people more interested in number theory and mathematics in general. Hopefully that will lead to faster improvements in humanity's knowledge of mathematics in ways that lead to more secure public-key algorithms or to better public assessments of the algorithms that we have now. (Although a lot of the important current research on public-key cryptography is not about factorization but rather about elliptic-curve systems and post-quantum systems.)
I always had a hard time explaining to people that the awards existed in order to publicize the idea of cooperative distributed computing projects, which were quite novel when EFF's awards were first created, and which were a demonstration of the power of the Internet to help people who've never even met each other work together. (The awards, contrary to the hopes and dreams of math enthusiasts and cranks around the world, didn't anticipate progress through new research or ideas so much as through more volunteer computer time.)
Probably today we have ample other demonstrations of that power, and that concept is no longer in much doubt, and the GIMPS project probably no longer even registers for most people as a particularly impressive example. But it is impressive in continuing to set multiple world records on one of the simplest, purest, most basic mathematical problems.
> too big to do almost any kind of practical computation with
There are definitely possible (though perhaps not immediate) applications for primes of this size. For example, in 2009, researchers were awarded $100k for the discovery of the Mersenne prime 2^43112609-1.
This number, while too large for RSA or ECC, could very well be useful for (R)LWE schemes (used to build post-quantum public-key cryptography and partially/fully homomorphic encryption schemes). In particular, it is free to modularly reduce by this prime (take the bottom 43112609 bits), so schemes that rely on doing large matrix multiplications on modular elements (or even elements in a polynomial ring) would benefit greatly.
Since 43112609 bits is only about 5.4 MB, this is not at all a ridiculous size - multiplying numbers of 2^26 bits is feasible in well under 1 second on modern hardware, and so making accumulation and modular reduction free is actually valuable.
I gave out the $100k prize you mention (well, I handed the discoverers a non-negotiable giant check for it), but I didn't know about these possibilities. That will be great if they materialize as useful applications!
> In particular, it is free to modularly reduce by this prime (take the bottom 43112609 bits)
Am I missing something? 2^3 - 1 is a Mersenne prime; but taking a number mod 7 is not taking the 3 low bits (there's a relatively low cost trick to take numbers modulo b-1, the same "divide by 9" tricks we can do in base 10... but it's not as easy as looking at the last digit).
The simple trick with modulo 7 is to take all the 3-bit groups: (x & 111), (x & 111000) >> 3, (x & 111000000) >> 6, etc., and add them all together. It means you just have to do bit shifts and adds instead of all the complexity of division. The same applies to larger moduli of the form 2^n - 1 as well.
Yeah, sorry, there are extra steps here. If n = 43112609 and you have some 2n-bit number to reduce, you can split it into two n-bit words and simply add the words together, since 1 = 2^n (mod 2^n - 1). You can do this very quickly (i.e. you can add an arbitrary number of 2n-bit values together with only a small number of adds and shifts as overhead).
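Here's a tiny sketch of that fold-and-add reduction; the function name and the scaled-down exponent (43112 standing in for 43112609) are mine, just so the check runs instantly:

```python
import random

# Folding reduction modulo a Mersenne number m = 2^n - 1:
# since 2^n = 1 (mod m), everything above the low n bits can simply be added onto them.
def mersenne_reduce(x: int, n: int) -> int:
    m = (1 << n) - 1
    while x > m:
        x = (x & m) + (x >> n)   # fold: low n bits plus everything above them
    return 0 if x == m else x    # m itself is congruent to 0

# Sanity check against ordinary % on the toy case from the thread (n = 3, m = 7)
# and on larger exponents.
for n in (3, 127, 43112):
    m = (1 << n) - 1
    for _ in range(100):
        x = random.getrandbits(2 * n)
        assert mersenne_reduce(x, n) == x % m
print("folding reduction matches % for all samples")
```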
While I wasn't at EFF at the time the donation was made, I find it totally credible that the donor was an individual who wanted to highlight the Internet's ability to let people work together in new ways, and didn't have any hidden motive for encouraging this particular work. I believe primality testing was suggested as a topic for the prize by a panel of scientific advisors, who were mostly mathematicians.
In that timeframe, the main technical advances that helped GIMPS get more efficient were being made by Richard E. Crandall (https://en.wikipedia.org/wiki/Richard_Crandall), who also wrote Prime Numbers: A Computational Perspective with Carl Pomerance (https://en.wikipedia.org/wiki/Carl_Pomerance). If these folks are trying to help governments keep a monopoly on expertise about number theory and the difficulty of the RSA problem, they seem to be doing a pretty bad job of it.
You have to realize the historical context of the award. The EFF describes the prize as
> The EFF [...] is sponsoring cooperative computing awards [...] to encourage ordinary Internet users to contribute to solving huge scientific problems.
What is "cooperative computing" and what does it mean by "encourage ordinary Internet users to contribute"? You must know a bit of history to see the context here. Back in the 1970s, the NSA drafted the Data Encryption Standard, or DES, as the national standard of data encryption for sensitive data, the NIST standardized it. However, Despite the decision by IBM to use a 64-bit key and the strong criticism from academic cryptographers (most prominently Diffie and Hellman [0]) that anything shorter than 128-bit is untrustworthy, a key length of 56-bit was selected by the NSA. D&H estimated that the NSA could theoretically construct a DES cracker for $20 million in 1976.
Fast forward to the 1990s: the security of DES had become a serious problem. Just as predicted, its 56-bit key could not keep up with progress in semiconductors, and it could be broken by even a small organization at any moment. And once Internet access opened to the public after 1995, a government mandate to keep using DES (instead of replacing it with something stronger) would create a serious threat to the security and privacy of both ordinary citizens and businesses.
However, NIST refused to acknowledge DES's insecurity, due to the NSA's and the FBI's efforts to prevent the public from using strong cryptography; the FBI even described DES-encrypted data as a threat to national security. Around the same time, the U.S. government was also trying to suppress Phil Zimmermann over PGP and was pushing a backdoored phone encryption system (sound familiar?) known as the Clipper Chip. It was the First Crypto War.
Because the insecurity of DES was a threat to both citizens and business interests, RSA Security, Inc. cooperated with the civil libertarians in the crypto war to push the NSA back. One thing it did was launch an anti-DES campaign. To demonstrate that the U.S. government's claims about DES's security were false, in 1997 RSA Security started "The DES Challenge": the company published an encrypted ciphertext, and anyone who could break DES and recover the plaintext would win $10,000.
A group of computer scientists led by Rocke Verser, assisted by Justin Dolske and Matt Curtin, responded to this challenge by starting the DESCHALL Project. They created a distributed computing network - anyone who didn't like the NSA could help by contributing CPU time from a PC. It was a truly innovative idea. Together with distributed.net, which was founded around the same time, it was arguably the first volunteer-run distributed computing effort, a predecessor of BOINC, SETI@home and Folding@home (GIMPS, established around the same time, is the only real competitor for that title).
The server was a 486-based PS/2 PC with 56 MiB of memory, and they announced the project via Usenet towards the end of March. Client software was rapidly written for a large variety of home machines and eventually some more powerful 64-bit systems. About 10,000 people joined, and the DES Challenge was solved in 96 days - DES was demonstrably breakable, and here was definitive proof. Later, RSA launched another challenge, DES Challenge II-1, which distributed.net cracked in 39 days in early 1998. The plaintext message was "The secret message is: Many hands make light work." It was the first significant demonstration of distributed computing's power. distributed.net celebrated the moment in an email:
> Distributed.net is equivalent in processing power to:
> 11,264 DEC Alpha 21064 533s
> 15,316 Sun Ultra I 167s
> 22,393 Intel Pentium II 333s
> 68,859 Macintosh PowerPC 604e/200s
> 41,712 Intel Pentium 166s
> 399,374 Intel 486DX2/66s
> 7,446,033 Intel 386SX/20s
> (based solely on DES client performance)
> Prospective:
> If Keys were dollars, we could pay off the U.S. National Debt in 6.25 minutes
> If Keys were pennies, we could buy 536249385 Mazda Miatas each day.
> If Keys were pennies, we could buy 256728249 Jeep Cherokees each day!
> If you printed a single page to represent each key block as it was checked and placed those pages in a stack, it would grow 12.83 inches taller every minute.
> If blocks were liters of Dr. Pepper, we could produce 6381493 six-packs each day
> If Key Blocks were cheeseburgers, fries, and a large Dr. Pepper, we could feed the entire city of Toronto, Ontario lunch each day.
After this incident, the government still refused to acknowledge the insecurity of DES. FBI director Louis Freeh told Congress,
> "If we hooked together thousands of computers and worked together for months we might, as was recently demonstrated, decrypt one message bit. That is not going to make a difference in a kidnapping case. It is not going to make a difference in a national security case. We don't have the technology or the brute force capability to get to this information."
RSA started yet another challenge, DES Challenge II-2. Meanwhile, the Electronic Frontier Foundation joined the game: thanks to its sponsors, the EFF spent $250,000 to build the EFF DES Cracker (nicknamed Deep Crack), a special-purpose computer with 29 circuit boards and 1,856 ASIC chips. DES Challenge II-2 was solved in just 56 hours in July 1998. And the third challenge, DES Challenge III, was a cooperation between the EFF and distributed.net; the key was found in just 22 hours and 15 minutes in January 1999, and the plaintext was "See you in Rome (second AES Conference, March 22-23, 1999)".
And the rest is history: DES was on its way out by the end of the 1990s, the selection of the Advanced Encryption Standard was performed in a highly transparent manner and completed in 2001, Rijndael won, and AES became the most widely used cipher on the Internet.
As you can see, the First Crypto War was won in no small part thanks to distributed computing. It was in this historical context that the EFF set up the Cooperative Computing Awards, funded by an anonymous sponsor, around 2000. Given that history, it's reasonable to assume that the anonymous donor's intention was to encourage applications of distributed computing, so that the next Crypto War could be won as well.
Finally, a fun fact: after the First Crypto War, distributed.net became largely irrelevant, but it still exists, and its RC5-72 cracking project from the 2000s is still running; it's estimated that the brute-force search of the keyspace will complete within 150 years.
Really well written comment about the history of these challenges and distributed.net.
“Cracking DES”, a book published by the EFF, contains the necessary information to build your own Deep Crack DES brute force appliance. It’s available in the Internet Archive: https://archive.org/details/crackingdes00elec
There have also been several generations of subsequent DES-cracking hardware developed by other teams (albeit not documented with the same level of open hardware detail as Deep Crack!).
(I'm a little confused at why crack.sh had to spend roughly as much money per key per second as Copacobana, when it was being built with a more advanced model of the same manufacturer's FPGAs around a decade later. I wonder if the Virtex-6 has some high-end features that crack.sh doesn't really benefit from, but still had to pay for.)
Clearly, no smaller primes are secret either, nor are they so numerous that iterating over them is computationally difficult. And mathematically, anyone can discover primes. Why does it matter whether primes are secret?
> ...nor are they so numerous that iterating over them is computationally difficult
That's not correct; below a given bound x there are approximately x/ln(x) primes.
For example, if you need to choose a prime of 100 digits, there are something like 10^96 of them; you can find a random one using probabilistic methods, do your cryptography, and know that your attacker would have to try 10^96 primes to break it. Since they can't really do this, this is not the weak point of your cryptosystem.
If you chose a Mersenne prime, there are only 51 known ones. Your attacker would have to try fewer than 10^2 of them.
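For concreteness, here's roughly what "find a random one using probabilistic methods" looks like; this is a toy Miller-Rabin sketch of my own, not something a hardened crypto library would ship as-is:

```python
import random

def is_probable_prime(n: int, rounds: int = 40) -> bool:
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2^r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False   # a is a witness that n is composite
    return True

def random_prime(digits: int = 100) -> int:
    """Sample random odd numbers of the given size until one passes Miller-Rabin."""
    while True:
        candidate = random.randrange(10 ** (digits - 1), 10 ** digits) | 1
        if is_probable_prime(candidate):
            return candidate

print(random_prime(100))
```

Since roughly 1 in ln(10^100) ≈ 230 numbers of that size is prime, this only has to test a couple of hundred candidates, which takes a fraction of a second.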
They are secret. "Secret key" is a technical term in cryptography, and there are thousands of research papers published using this term [0], including a paper by Eli Biham and Adi Shamir on "Differential fault analysis of secret key cryptosystems". Unless you believe that you are more knowledgeable in crypto than Adi Shamir (the "S" in "RSA") and insist on an alternative definition of "secret" not accepted by cryptographers and infosec researchers, there should be no problem using the word "secret".
Otherwise, reductio ad absurdum: following the same line of logic, all cryptography would be "security through obscurity" because there exists a secret key, which is clearly not the case. The practice of hiding the design of a system is obscurity; the practice of hiding a specific key to a ciphertext is security, not obscurity, as long as the method of generating and using it is publicly available. If there's nothing to reverse engineer, there can be no obscurity. The method of finding a symmetric key or an RSA key is well known, not obscured; it's simply that the size of the problem is not computationally tractable in practice on today's computers.
I guess I don't fully understand this, but RSA works, so there MUST be "secret" primes (as in, they can be randomly generated like a symmetric key can).
2020-11-27 All exponents below 99 million tested at least once.
2020-11-17 All exponents below 98 million tested at least once.
2020-11-17 All exponents below 97 million tested at least once.
2020-11-09 All exponents below 96 million tested at least once.
2020-11-03 All exponents below 95 million tested at least once.
2020-10-24 All exponents below 94 million tested at least once.
2020-10-09 All exponents below 93 million tested at least once.
2020-10-04 All exponents below 92 million tested at least once.
2020-06-20 All exponents below 91 million tested at least once.
2020-04-09 All exponents below 90 million tested at least once.
2020-03-01 All exponents below 89 million tested at least once.
What happened in June/July that caused progress to stall, and in October that made it suddenly pick up again? Did they just switch from going sequentially to a grouping that checked all but a small portion of each million, leaving the remainder undone until November?
September 17, 2020 All tests below 53 million verified.
Makes me think part of it is them spending extra time verifying old tests. Just curious 1. if that's all it is and 2. why / whether this was triggered by something. Was there a bug found that raised doubt in previous tests?
Ben Delo joined the project and has been running several hundred AWS cores.[1]
Slightly related: a new method to validate results was found, which means that they no longer need to run a second double-check run to confirm there weren't floating-point rounding errors [2]. Interestingly, this credits a HN thread as the source of inspiration [3].
I'm only tangentially related to the project, but I do have a spare machine running the software.
Yes, there was a big push to re-verify old abandoned tests. First, understand that the project requires multiple test results for each exponent - I believe it needs triplicate confirmation (i.e. 3 different computers have to run the same test and all have to match). At some point, they developed a new, different algorithm that can do the check in one go (but it's more expensive computationally). However, most of the pre-existing work was done using the type of calculation that needed the triplicate check.
There were a lot of exponents where some computer claimed one of the "verification tests" (i.e. the exponent was already rejected as a prime, but we needed 2 more computers to verify), but then stopped midway (that's pretty common, since these primality tests take about 3-5 days on a modern fast Threadripper - so they could take up to a month on slower machines from years ago).
It's very unlikely that these exponents signified primes (since they were already verified to NOT be prime by 1 computer, and it's just the rechecks that are pending). These abandoned exponents formed a huge pile of "leftovers" that people didn't want as work units, because they are most likely not prime, and if you were hunting for primes, this is not the work you want to do.
So at some point, the project said they wanted a big push to pick up abandoned re-verifications and give them a much higher priority on computers that could complete the test in a few days [either by finishing off what was left of the triplicate check, or by using the new algorithm that just needs one pass] - thus getting them out of the queue once and for all. If you have a fast computer (i.e. a modern-day i9, Threadripper, Xeon, anything with AVX-512, etc.), chances are you were assigned this type of rechecking task unless you rejected the work units.
It's not just the gap you noticed in June; this whole effort of looking for and verifying old exponents has been ongoing for the bulk of 2020.
Another possibility is that the project is always trying out new numerical techniques and algorithms and always looking for new instruction sets to use, etc. - a change or tweak in the algorithm could cause this kind of "pause" as well, since the client computers don't make the shift all at once (there's likely some verification, where a small subset of clients run both algorithms in parallel to see if the results are the same, etc.) and machines are moved onto the new algorithm as verifications complete.
EDIT: As I wrote this post, I started remembering more about the different algorithms the project used. All of the text above can be checked against the prime95 forums, where they discuss these algorithm changes, the push to re-verify, etc.
I'm an active and somewhat more involved member of the project.
To explain and correct things:
There are two types of primality-checking algorithms for Mersenne numbers: PRP (probable-prime test) and LL (Lucas-Lehmer test). PRP only says that the number is most probably prime, whereas LL, if correctly computed, gives 100% certainty (i.e. the number definitely is or definitely is not prime). PRP tests have better error-correction algorithms, and a new feature, "Certification", which is rather complicated, can basically confirm that a PRP test was done correctly; but if the PRP indicates that a number is prime, an LL test must still be done to be sure. LL tests can suffer from a lot more errors that go unnoticed by the existing error correction (which is different from the one used for PRP), so they have to be at least DOUBLE-checked, not triple-checked. Triple checks are only necessary if the first two runs don't match; if the third run doesn't match either, a fourth one is run, and so on.
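To make the PRP idea concrete, here's a toy Fermat-style probable-prime check for small Mersenne numbers. This is my own illustration; as I understand it, GIMPS's real PRP test is a base-3 test built on FFT multiplication with much stronger error checking, plus the certification mechanism mentioned above, none of which is shown here:

```python
# Toy PRP check for M_p = 2^p - 1 via Fermat's little theorem:
# if M_p is prime, then base^(M_p - 1) ≡ 1 (mod M_p).
# A pass only means "probably prime"; a fail means "definitely composite".
def mersenne_prp(p: int, base: int = 3) -> bool:
    m = (1 << p) - 1
    return pow(base, m - 1, m) == 1

# Odd exponents of the known Mersenne primes below 128, for comparison.
known = {3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127}

for p in range(3, 128, 2):
    if mersenne_prp(p):
        status = "known Mersenne prime" if p in known else "PRP only - needs a conclusive test"
        print(p, status)
```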
The order of assigning tasks is not uniform at all; there are "Categories" of assignments. Simply put, the more assignments a computer finishes and turns in, the lower the exponents it can get assigned next. If it returns a result that turns out to be bad, it is downgraded in the eyes of the server, meaning it effectively goes back to the higher exponents. This is done so that the "wavefront" moves forward as quickly as possible with the current amount of computing power.
I will probably write more later. Until then, you all can explore the page, especially the "More Information / Help" tab.