I once bought a far larger supercomputer: roughly 1/8 of ASCI Blue Mountain, 72 racks. It was commissioned in 1998 as #1 or #2 on the TOP500 and officially decommissioned in 2004; I purchased my 1/8 for $7k in ~2005.
Moving 72 racks was NOT easy. After paying substantial storage fees, I rented a 1,500 sq ft warehouse after selling off a few of them, and they filled it up. It took a while to get 220V/30A service in there to run just one of them for testing purposes. Installing IRIX was 10x worse than any other OS: imagine 8 CDs, each of which had to go in twice during the process. Luckily somebody listed a set on eBay. SGI was either already defunct or just very unfriendly to second-hand owners like myself.
The racks ran SGI Origin 2000s with CrayLink interconnects. Sold 'em off 1-8 at a time, mainly to render farms. Toy Story had been made on similar hardware. The original NFL broadcasts with that magic yellow first-down line were synthesized with similar hardware. One customer did the opening credits for a movie with one of my units.
I remember still having half of them around when Bitcoin first came out. It never occurred to me to try to mine with them, though I suspect if I'd been able to provide sufficient electrical service for the remainder, Satoshi and I would've been neck-and-neck for number of bitcoins in our respective wallets.
The whole exercise was probably worthwhile. I learned a lot, even if it does feel like seven lifetimes ago.
Wow, that's ridiculous. I bought two racks of Origin2000 with a friend in high school and that was enough logistic overhead for me! I can't imagine 72 racks!!
Installing IRIX doesn't require CDs; it's much, much easier done over the network. Back in the day it required some gymnastics to set up with a non-IRIX host, now Reanimator and LOVE exist to make IRIX net install easy. There are huge SGI-fan forums still active with a wealth of hardware and software knowledge - SGIUG and SGInet managed to take over from nekochan when it went defunct a few years ago.
I have two Origin 350s with 1GHz R16Ks (the last and fastest of the SGI big-MIPS CPUs) which I shoehorned V12 graphics into for a sort of rack-Tezro. I boot them up every so often to mess with video editing stuff - Smoke/Flame/Fire/Inferno and the old IRIX builds of Final Cut.
I think that by the time Bitcoin came out, Origin2000s would have been pretty majorly outgunned for Bitcoin mining or any kind of compute task. They were interesting machines but weren't even particularly fast compared to their contemporaries; the places they differentiated were big OpenGL hardware (InfiniteReality) with a lot of texture memory (for large-scale rendering and visualization) and single-system-image multiprocessor computing (NUMAlink), neither of which would help for coin mining.
Nodes: 4,032 dual socket units configured as quad-node blades
Processors: 8,064 units of E5-2697v4 (18-core, 2.3 GHz base frequency, Turbo up to 3.6GHz, 145W TDP)
Total Cores: 145,152
Memory: DDR4-2400 ECC single-rank, 64 GB per node, with 3 High Memory E-Cells having 128GB per node, totaling 313,344 GB
Topology: EDR Enhanced Hypercube
IB Switches: 224 units
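A quick consistency check of those listed totals (my own arithmetic; the 144-nodes-per-rack figure comes from a comment further down, and the 2-racks-per-E-Cell assumption is mine):

    # Sanity-checking the listed Cheyenne totals against each other
    nodes = 4032
    cpus  = nodes * 2                     # dual-socket nodes -> 8,064 E5-2697 v4
    cores = cpus * 18                     # 145,152 cores
    racks = nodes // 144                  # 28 compute racks at 144 nodes/rack
    # 64 GB/node baseline; solve for the number of 128 GB nodes implied
    # by the listed 313,344 GB total:
    high_mem_nodes = (313_344 - 64 * nodes) // 64
    print(cpus, cores, racks, high_mem_nodes)   # 8064 145152 28 864

864 high-memory nodes is 6 racks, i.e. 3 E-Cells if an E-Cell is a pair of racks, which matches the "3 High Memory E-Cells" line above.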
Moving this system necessitates the engagement of a professional moving company. Please note that the four (4) attached documents detail the facility requirements and specifications. Due to their considerable weight, the racks require experienced movers equipped with proper personal protective equipment (PPE) to ensure safe handling. The purchaser assumes responsibility for transferring the racks from the facility onto trucks using their own equipment.
Given that the individual nodes are just x86_64 Xeons and run Linux... it would be interesting to part it out and sell individual, but functional, nodes to people. There are a lot of people who would like to have a ~2016-era watercooled 1U server from a supercomputer that was once near the top of the Top500, just to show to people.
Get little commemorative plaques for each one and sell for $200 each or so.
edit:
it seems each motherboard is a dual-CPU board and so there are 4,032 nodes, but the nodes are in blades that likely need their rack for power. But I think individual cabinets would be cool to own.
There are 144 nodes per cabinet... so 28 cabinets.
I'd pay a fair amount just to own a cabinet to stick in my garage if I was near there.
The individual servers are not watercooled. The compute racks are air-cooled; the adjacent cooling racks then exchange that heat using the building's chilled water. It's the rack as a whole that is watercooled. If you extract a single node, you won't get any of that. As the other commenters also point out, these are blades; you can't run an individual node by itself.
Going off one listing for an E5-2697v4: $50 with free shipping, 386 already sold.
If you figure after the double-dipping of eBay/Paypal and then shipping fees, that's ~$30 profit per CPU.
8,064 x 30 = 241,920 USD. Not too shabby for what's got to be some weeks/months of work. You could probably assume that they can sell or scrap the rest of it for a bit more as well, minus the fees for storage and the moving company.
I have a couple servers with this exact CPU that I run for a mixture of practical and sentimental reasons. I bought them off eBay and only after purchase discovered they were a piece of history. They have a second life testing GPU libraries for Debian from a rack in my basement.
For privacy reasons, I won't say who originally owned the servers, but they had a cool custom paint job and were labelled YETI1 and YETI2. If the original owner is on HN, perhaps they will recognize the machines and provide more information.
If somebody has the money, and the resources required to house the system, it seems like decent value at the reserve (apparently $100k). It's in the range of "I'd buy it if I had the money and a valid use case," even for partial resale like suggested. There's an argument about being behind the "most" advanced, yet it's a petascale system from 2016 that was good enough for the NSF. It's not exactly an old dog.
The stuff I could quickly find was $1M+ in individual sales (not sure about auction sites and used).
DDR4 RAM $734,400 @ $75 / 32 GB for 313,344 GB
CPUs $484,000 @ $60 / (1) E5-2697v4 for 8,064 units
Figure the storage must also be enormous, though I'm not sure of the quantity, and the drives may not be sold with the system.
They're not kidding about the moving company; that's tens of tons of computer to haul somewhere: 26 racks @ 2,500 lbs each. A couple of semi-trucks, since the most they can usually do is 44,000 lbs on US highways. Not sure if the E-Cell weight (14 @ 1,500 lbs) is in addition to the 26 @ 2,500 lbs? A rough check below suggests it is.
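Back-of-the-envelope on that (my arithmetic; the per-unit weights are the listing figures quoted above, and the ~95,000 lbs total is quoted in another comment further down):

    racks_lbs   = 26 * 2500      # 65,000 lbs
    e_cells_lbs = 14 * 1500      # 21,000 lbs
    print(racks_lbs, racks_lbs + e_cells_lbs)   # 65000 86000

~86,000 lbs is much closer to the ~95,000 lbs total than 65,000 lbs alone, so the E-Cell weight is most likely additive.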
Can you imagine the RAMDisk? Yes, you can. Especially in 20 years when it will be the norm. And also the Windows version that will require half of it in order to run /s
For those curious, Cheyenne is a supercomputer from 2016/2017 that debuted at the 20th spot on the TOP500. It was decommissioned in 2023, after the pandemic led to a two-year operational extension.
It has a peak compute of 5.34 petaflops, 313TB of memory, and gobbles 1.7MW.
In comparison, 18 A100 GPUs would have 5.6 petaflops and 1.4 TB of VRAM, consuming 5.6 kW.
The speed of processing and interconnect is orders of magnitude faster for an A100 cluster. One 8-GPU pod server will cost around $200k, so around $600k more or less beats the supercomputer's performance (the prices I'm finding seem wildly variable, please correct me if I'm wrong).
Thanks, glad you guys caught that - to be generous we could allow the tensor-core TFLOPS, since you'd more than likely be using A100 pods for something CUDA-optimized. In that case, at 19.5 TFLOPS FP64 peak per GPU, roughly 267 GPUs would be needed, or 34 pods, at $6.8 million, with 21.76 TB of VRAM and 81 kW of power consumption.
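Re-deriving that (my arithmetic; the 19.5 TFLOPS FP64 tensor-core peak and ~$200k per 8-GPU pod are the figures assumed in this thread, and the ~300 W per GPU is my own assumption):

    import math
    cheyenne_pf  = 5.34                                   # Cheyenne peak petaflops
    a100_fp64_tf = 19.5                                   # FP64 tensor-core TFLOPS per A100
    gpus = math.ceil(cheyenne_pf * 1000 / a100_fp64_tf)   # GPUs to match peak
    pods = math.ceil(gpus / 8)                            # 8-GPU pods
    print(gpus, pods, pods * 200_000,                     # 274 35 7000000
          pods * 8 * 80 / 1000,                           # 22.4 TB of VRAM (80 GB/GPU)
          round(pods * 8 * 0.3, 1))                       # 84.0 kW at ~300 W/GPU

That lands slightly above the 267 GPUs / 34 pods / $6.8M in the comment (which presumably used a somewhat lower sustained figure), but it's the same ballpark.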
I'm an amateur, but I have code that I think could probably dispatch threads pretty efficiently on Cheyenne through its management system, simply because it's all distributed Xeons. If I can run it on my personal 80-core cluster, I could have gotten it to run on Cheyenne back then.
But hitting the roofline on those AMD GPGPUs? I'd probably get nowhere fucking close.
That is the thing that Cheyenne was built for: people doing CFD research with x86 code that was already nicely parallelized via OpenMPI or what have you.
I used to build small clusters and use supercomputers, and I can't imagine it's fun to build a supercomputer. It requires massive infrastructure and a significant employee base, and individual component failures can take down entire jobs. Finding enough jobs to keep the system loaded 24/7 while also keeping the interconnect (which was 15-20% of the total system cost) busy, and finding the folks who can write such jobs, is not easy. Even then, other systems will be constantly nipping at your heels with newer/cheaper/smaller/faster/cooler hardware.
Thanks for the feedback. You make a lot of good points. I've built a 150,000 GPU system previously, but it was lower end hardware. It was a lot of fun to make it run smoothly with its own challenges.
It doesn't take a lot of employees; we did the above with essentially two technical people. Those same two are working on this business.
Finding workloads/jobs is definitely going to be an interesting adventure. That said, the need for compute isn't going away. By offering hard-to-get hardware at reasonable rates and contract lengths, I believe we are in a good position on that front, but time will tell.
We are only buying the best of the best that we can get today. The plan is to continuously cycle out older hardware, as well as not pick sides of one vendor over another. This should help us keep pace with other systems.
150K GPUs with two people... presumably at 8 GPUs/host, you had close to 20K servers.
I can't really see how that's achievable with only two people, given the time to install hardware, maintain it, deal with outages and planned maintenance and testing, etc. Note: I worked at Google and interfaced with hwops, so I have some real-world experience to compare to.
Building a 150K GPU system without a well-understood customer base seems a bit crazy to me. You will either become a hyperscaler, serve a niche, or go out of business, I fear.
The ASRock BC-250s we deployed were 12 individual blades to a chassis, and those were all PXE booted. We deployed 20,000 of those blades across 2 data centers. This was a massive feat of engineering, especially during covid, when I couldn't even access the machines directly. Built a whole dashboard to monitor it all too.
I know, I can't believe we did it either, but we did. Software automation was king. I built a single binary that ran on each individual host and knew how to self configure / optimize everything. Idempotently. Even distributing upgrades to the binary was a neat challenge that I solved perfectly, in very creative ways.
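For anyone curious what "self configure, idempotently" can look like in practice, here's a purely hypothetical sketch of that pattern (check desired state, change only what drifted); it's not the actual binary's logic, and the sysctl key/value are just example settings:

    import subprocess

    def read_sysctl(key: str) -> str:
        # Read the current value of a kernel setting
        return subprocess.run(["sysctl", "-n", key],
                              capture_output=True, text=True).stdout.strip()

    def ensure_sysctl(key: str, want: str) -> bool:
        # Apply a setting only if it differs from the desired state,
        # so the agent is safe to run on every boot / every cycle.
        if read_sysctl(key) == want:
            return False                  # already converged, nothing to do
        subprocess.run(["sysctl", "-w", f"{key}={want}"], check=True)
        return True

    if __name__ == "__main__":
        changed = ensure_sysctl("vm.swappiness", "10")   # hypothetical example tweak
        print("fixed drift" if changed else "converged")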
Today, we are starting much smaller. Literally from zero/scratch. Given the cost of MI300x, I doubt we will ever get to 150k GPUs, that's an absurd amount of money, but who knows.
But who did the wiring? Even with blades, which consolidate much of the cabling, there's still a tremendous amount of work to build the interconnect. On typical large systems I've seen a small team of 3-5 guys working for weeks or more to wire a modest DC.
We'd hire the initial deployment out to temporary contractors. It just took a few weeks to get a large deployment out. The hard part was that the 12 GPUs needed to be inserted at the DC, which took a bunch of effort. Once it was done, we generally had 1-2 people on the ground in the data centers to deal with breakfixes, either contractors or supplied by the DC.
For this venture, again, we are starting small, so we are just flying to the DC and doing it ourselves. There are also staff there that are technical enough to swap stuff out when we need it. The plan will be to just hire one of their staff as our own.
I don't think we will make it for this next deployment due to time constraints, but ideally in our near future, we will go full L11. Assemble and ship out full racks at the manufacturer/VAR, bolt em down, wire them up and ready to go. That is my dream... we will see if we get there. L11 is hard cause a single missing cable can hold up an entire shipment.
I just realized we had this same conversation on HN before. IIRC I said last time and I'll repeat: if you say that you set up 15K GPUs with 2 people, and I ask who did the wiring, and you say an external company came in and spent a few weeks wiring the network for you, then you can't say that 2 people set up 15K GPUs. You're trying to externalize real costs (both time and money).
I understand your dream (having pursued similar ideas), but I think you have to be realistic about the effort required, especially when you add picky customers to the mix.
Also, supercomputers usually use general-purpose nodes supported by many standard tools, multiple methods of parallelization, and (for open standards) maybe multiple vendors. I imagine this one is much more flexible than A100s.
Weather forecasting is actually moving to reduced precision. None of the input data is known to more than a few digits, and it's a chaotic system, so the numerical error is usually dominated by the modeling and spatial discretization error.
It's really hard to find a home for large, old, high-maintenance technology. What do you do with a locomotive, or a Linotype? They need a support facility and staff to be more than scrap. So they're really cheap when available.
The Pacific Locomotive Association is an organization with that problem. About 20 locomotives, stored at Brightside near Sunol. They've been able to get about half of them working. It's all volunteer. Jobs that took days in major shops take years in a volunteer operation.
At the ill-fated Portland TechShop I took woodworking classes from a retired gentleman who had professionally been a pattern maker for molding cast metal parts. This made his approach to woodworking really interesting. He had a huge array of freestanding sander machines, including a disc sander more than a yard in diameter.
For anyone unfamiliar, pattern makers would make wooden model versions of parts that were to be cast in metal. The pattern would be used to make the mold. He could use these various sanding machines to get 1/64" precision for complex geometries. It was fascinating to watch how he approached things, especially in comparison to modern CNC.
His major project outside of teaching the classes? Making patterns for a local steam locomotive restoration project. He had all these wooden versions of various parts of a locomotive sitting around.
Does 1/64" precision really mean anything in wood, where small fluctuations in air moisture can cause > 1/64" distortion? I guess it's OK if you stay within a climate controlled area.
So he would build parts by first making an oversize rough blank of bonded layers of marine grade plywood in a big press. Then he'd rough cut it various ways on a big band saw. Then he'd work his way through using all the sanders to slowly approach the net shape. He used precision squares to measure bigger stuff and calipers for smaller stuff.
I can't tell you the exact stability of marine grade plywood, but I know it's about as good as you can get for a wooden material, and I doubt he'd go to the effort of such precise measurements if it didn't matter.
Plywood is good for dimensional stability, but I'm pretty sure all this work must have been done in and around a toolroom with stable moisture content, or the part was used immediately and then consumed/destroyed before it "moved" too much. He sounds pretty knowledgeable, though, so I'm going to guess this isn't just garage woodworking, where 1/32" doesn't really matter when the wood is going to shrink/expand by 5-10% over the course of a year.
So the patterns once done would be taken to a foundry where they'd be used to make molds. I'm not totally up to speed on that process but I know it involves surrounding the pattern with a combination of refractory sand and binder. Where it gets tricky is complex parts that have multiple cores and so on.
And yeah, this guy was retired at the time but he'd been doing it for like 50 years. I'm very sure he knew what did and did not matter.
Funny, just yesterday I saw here in Germany a train where the locomotive was labeled 'rent me'. Apparently there's an organization which bought old locomotives (this particular one looked like 60s, maybe early 70s vintage) from the state-run former monopoly to rent them out (large industrial customers I'd think).
I was surprised to see such an old locomotive in operation, but apparently it's still good (i.e. economic) enough to shuttle cars from factory to port. Guess the air pollution restrictions aren't all that tough for diesel trains.
> the system is currently experiencing maintenance limitations due to faulty quick disconnects causing water spray. Given the expense and downtime associated with rectifying this issue in the last six months of operation, it's deemed more detrimental than the anticipated failure rate of compute nodes.
Even the RAM has aged out...
Very hard to justify running any of this; newer kit pays for itself quickly in reduced power and maintenance by comparison.
Nah, I already have far more junk computers than I need.
I lusted after a Cray T3E once that I coulda had for $1k plus trucking it across TN and NC; but even then I couldn't have run it. I'm two miles away from 3-phase power, and even then couldn't have justified the power budget. At the time a slightly less scrap UltraSPARC 6k beat it up on running costs, even with higher initial costs, so I went with that instead. I did find a bunch of Alphas to do the byte swizzling tho. Ex-"Titanic" render farm nodes.
I've been away from needing low-budget big compute for a while, but having spent a few years watching the space I still can't help but go "ooo neat" and wonder what I could do with it.
Never understood why you can bid below the reserve price, or rather why the reserve price is hidden, given the whole point is that they (the seller) have a price in mind they are not willing to go below.
> a price in mind they are not willing to go below
I worked for an auction, and sellers accepted bids below the reserve price all the time. They just want to avoid a situation where an item sells at a “below market” price due to not having enough bidders in attendance - e.g. a single bidder is able to win the auction with a single lowball bid. If they see healthy bidding activity that’s often sufficient to convince them to part with the item below reserve.
Reserve prices are annoying for buyers, but below-reserve bids can provide really useful feedback for sellers.
We even had full-time staff whose job was to contact sellers after the auction ended and try to convince them to accept a below-reserve bid, or try to get the buyer and seller to meet somewhere in the middle. This worked frequently enough to make this the highest ROI group in our call center.
It is playing on the psychology of the bidders. You want the bidders to be invested, to want to win the auction. To compete to win the prize.
Also, consider this: if the reserve is too high, and no one bids on it, then everyone looking at it is going to wonder what it is really worth. If there are several other bidders, then that gives reassurance to the rest for the price they each are bidding.
It's entirely because of human nature - you want people to get invested in it, which having them bid any amount does.
It's the same reason an auction can go above the price/value of the thing, because you get invested in your $x bid, so $x+5 doesn't seem like paying $x+5, but instead "only $5 more to preserve your win" type of thing.
Man, if I had the space, the money, and the means of powering it I would bid on this immediately. It's so damn cool and will likely end up selling for a lot less than it's worth due to its size.
I've always been fascinated by the supercomputer space, in no small part because I've been sadly somewhat removed from it; the SGI and Cray machines are a bit before my time, but I've always looked back in wonder, thinking of how cool they might have been to play with back in the 80s and 90s.
The closest I get to that now is occasionally getting to spin up some kind of HPC cluster on a cloud provider, which is fun in its own right, but I don't know, there's just something insanely cool about the giant racks of servers whose sole purpose is to crunch numbers [1].
[1] To the pedants: I know all computers' job is to crunch numbers in some capacity, but a lot of computers and their respective operating systems like to pretend that they don't.
> The Cheyenne supercomputer at the NCAR-Wyoming Supercomputing Center (NWSC) in Cheyenne, Wyoming began operation as one of the world’s most powerful and energy-efficient computers. Ranked in November 2016 as the 20th most powerful computer in the world[1] by Top500, the 5.34-petaflops system[2] is capable of more than triple the amount of scientific computing[3] performed by NCAR’s previous supercomputer, Yellowstone. It also is three times more energy efficient[4] than Yellowstone, with a peak computation rate of more than 3 billion calculations per second for every watt of energy consumed.[5]
My favorite part of SGI computers, like the Altix and UV lines, was the NUMA memory with a flexible interconnect. NUMA let you program a pile of CPUs more like a single-node, multithreaded system. The flexibility then let you plug CPUs, graphics cards, or FPGAs right into the low-latency, high-speed memory bus.
There was a company that made a card that connected AMD servers like that. I don't know if such tech ever got down to commodity price points. If you had InfiniBand, there were also distributed shared memory (DSM) libraries that simulated such machines on clusters. Data locality was even more important then, though.
Cray XT/SeaStar? IIRC the interconnect ASIC pretends to be another peer CPU connected via HyperTransport. HPE Flex is similar, but works via QPI/UPI for Intel CPUs.
Despite being SGI-branded hardware, this was after the name was bought by Rackable, and IIRC SGI hardware became essentially rebadged Rackable hardware. Still cool, but not as cool as some custom MIPS hardware from old SGI.
It's just not economical to run these given how power inefficient they are in comparison to modern processors.
This uses 1.7MW, or $6k per day of electricity. It would take only about four months of powering this thing to pay for 2000 5950X processors. Those would have a similar compute power to the 8000 Xeons in Cheyenne but they'd cost 1/4 the power consumption.
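The math behind that, for anyone checking (the ~$0.15/kWh rate and ~$360 street price per 5950X are my assumptions, not figures from the listing):

    power_mw     = 1.7
    kwh_per_day  = power_mw * 1000 * 24        # 40,800 kWh per day
    cost_per_day = kwh_per_day * 0.15          # ~$6,120/day at an assumed $0.15/kWh
    ryzen_fleet  = 2000 * 360                  # ~$720,000 for 2,000 5950Xs at ~$360 each
    print(round(cost_per_day), round(ryzen_fleet / cost_per_day))   # 6120 118 (days)

118 days is right around the four months claimed.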
If you can get 1.7MW service, then you are paying utility rates, or around $100-150 per MWh, or as you quoted $4-6k per day. In Seattle, running this off-peak would cost about $104/hr before the other fees.
It would be neat if a subset of this could be made operational and booted once in a while at a computer history museum. I agree that it doesn't make sense to actually run it.
There's a whole sub-industry of people bidding on government auctions in order to part out the stuff. I'd be pretty surprised if the whole cluster got reassembled. But people on a budget will buy those compute nodes, someone trying to keep their legacy IB network will snap up those switches, the racks, etc.
r/homelab will have a field day getting those nodes up; some people will want just one for practicality, and some will want at least a couple and an IB switch just for the novelty of it.
I can imagine it, some people on there are ridiculous, but yeah in my experience these supercomputer nodes are a lot more integrated/proprietary than most standard server hardware. It's not straightforward to just boot one up without all the support infrastructure. I'd assume they'd mostly be torn down and parted out.
You might be surprised - because they're pretty custom, they are often "more open" than you might expect; as long as you have the connectors you can often get something running. Sometimes they have bog-standard features present on the boards, just not enabled, etc.
It's the commoditized blade servers, etc that are stripped down to what they need to run and nothing more.
Oh I'm speaking from experience with the SGI supercomputer blades. They're pretty wacky: 4x independent dual-CPU boards per blade and all sorts of weird connectors and cooling and management interfaces. Custom, centralized liquid cooling that requires a separate dedicated cooling rack unit and heat exchanger, funky power delivery with 3-phase, odd networking topologies, highly integrated cluster management software to run them, etc. I'm not sure if they have any sort of software locks on top of that, but I would bet they do, and presumably NCAR wipes all of them, so you likely won't have the software/licenses.
I dug up a link to some of the technical documentation https://irix7.com/techpubs/007-6399-001.pdf . Probably someone can get it working, but I imagine whoever is going to go through the hassle of buying this whole many-ton supercomputer is planning to just strip it down and sell the parts.
Yeah the licensing is often the stumbling block, unless you can just run some bog-standard linux on it. It sounds like this might be custom enough that it would be difficult (but I daresay we'll see a post in 5 years from someone getting part of it running after finding it on the side of the road).
Ultimately SGI was running Linux and AFAIK the actual hardware isn't using any secret sauce driver code, so yeah if you can get it powered on without it bursting in flames and get past the management locks you can probably get it working. It's definitely not impossible if you can somehow assemble the pieces.
Crypto is no longer mined commercially with GPU type compute. When ETH switched to PoS, it decimated the entire GPU mining industry. It is no longer profitable. The only people doing it now are hobbyists.
Sure, but you can get (much) better price-performance-power out of a CPU which isn't approaching a decade old when mining ASIC- and GPU-resistant cryptocurrencies like Monero. I don't know that it'd be worth the effort to buy E5-2697 v4 CPUs running in such a specialized configuration over AMD Ryzen or EPYC CPUs in commodity, inexpensive mainboards.
Any I've dealt with definitely wouldn't touch the 'you need to hire professional movers costing you tens of thousands of dollars to get it out of the facility' stipulation - they seem to prefer the 'where's the location of the storage shed' situation.
There's 8,064 E5-2697v4's in this. Those go on ebay for ~$50/ea. That's $400,000 of just CPUs to sell.
If the winning bid is $100k, you spend $40k to move it out of there, another $10k warehousing it while selling everything on ebay, and you're still up $250k on the processors alone.
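Spelled out (the eBay price and the moving/warehousing figures are the estimates from this comment, not quotes):

    cpus  = 8064
    gross = cpus * 50                     # ~$403,200 gross on eBay
    costs = 100_000 + 40_000 + 10_000     # winning bid + movers + warehousing
    print(gross, gross - costs)           # 403200 253200

Note that an earlier comment pegs the net at closer to ~$30/CPU after eBay/PayPal and shipping fees, which would bring that down to roughly $90k.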
I presume no one is building new motherboards for those processors either. While there is old stock lying around, you really need to run those systems close to as-is for them to be useful.
These are high-spec CPUs for the socket, though. Lots of room for people with compatible boards that want to upgrade.
There's a lot of low budget hosting with old Xeon systems (I'm paying $30/month for a dual westmere system; but I've seen plenty of offers on newer gear); you can still do a lot with an 18 core Broadwell, if the power is cheap.
That's for the processors alone - a scrapping company dedicated to this stuff would be able to realize more from the other components - and they often have warehouse space available that they already own.
Let's come back and see if the auction failed; I doubt it will.
I ended up reading all the documentation provided alongside the auction because I was debating making a bid. I decided not to in the end.
In total, Cheyenne weighs in at around ~95,000 lbs, and they require it be broken down in <= 3 business days on site, with only 8AM-4PM access. Therefore you'll need to support unbolting and palletizing all the cabinets for loading into a dry trailer, get a truck to show up, load, then FTL to the storage (or reinstall) location of your choosing for whatever's next.
Requirements for getting facility access seem to mean you bring both labor and any/all specialized equipment including protective equipment to not damage their floors while hauling away those 2400lb racks. Liability insurance of $1-2M across a wide range of categories (this isn't particularly unusual, or expensive, but it does mean you can't just show up with some mates and a truck you rented at Home Depot).
I'd guess you're looking at more like $25k just to move Cheyenne via truck, plus whatever it sells for at auction, unless you are located almost next door to them in Wyoming. Going from WY to WA the FTL costs alone would be $6-7k USD just for the freight cost (driver, dry trailer and fuel surcharges), add a bit if your destination doesn't have a loading dock and needs transloading for the last mile to >1 truck (probably) equipped with a tail lift. All the rest is finding a local contractor with semi-skilled labor and equipment to break it down and get it ready to load on a truck and covering the "job insurance".
Warehouse costs if you're breaking it for parts won't be trivial either, and you'd be flooding the used market (unfavorably to you) were you to list 8,000 CPUs or 300TB of DDR4; it could take months or even longer to clear the parts without selling at a substantially depressed price.
It will probably take several thousand hours of labor to break this down for parts, assuming you need to sell them "Tested and Working" to achieve the $50/CPU unit pricing another commenter noticed on eBay; "Untested; Sold as Seen" won't get anywhere near the same $/unit (for CPUs, DRAM, or anything else). So even assuming $25/hr for fully burdened, relatively unskilled labor, you could well be talking up to $100k in labor to break, test, and list Cheyenne's guts on eBay, or even to sell them onward in lots to other recyclers / used-parts resellers. I don't think I could even find $25/hr labor in WA capable of and willing to undertake this work; I fear it'd be more like $45-60/hr in this market in 2024 (and this alone makes the idea of bidding unviable).
A lot of the components like the Cooling Distribution Units are, in my opinion, worth little more than scrap metal unless you're really going to try and make Cheyenne live again (which makes no sense, it's a 1.7MW supercomputer costing thousands of dollars per day to power, and which has similar TFLOPs to something needing 100x less power if you'd bought 2024-era hardware instead).
Anything you ultimately can't sell as parts or shift as scrap metal (the leftovers) you are potentially going to have to junk or send to specialist electronics recycling, both at cost; your obligations here are going to vary by location, and this could also be expensive.
If anyone does take the plunge, please do write up your experiences!
The heat's going to be leaving the building in liquid-filled pipes, however you architect it. And with 1.7MW of peak power consumption, a nontrivial amount of liquid.
It's just a question of whether you want to add air and refrigerant into the mix.
It seems they're decommissioning it partly due to "faulty quick disconnects causing water spray" though, so an air cooling stage would have had its benefits...
> The heat's going to be leaving the building in liquid-filled pipes
In the right climate, and at the right power density, you can use outside air for cooling, at least part of the time. Unlikely at this scale of machine, but there was a lot of work on datacenter siting in the 2010s to find places where ambient cooling would significantly reduce the power needed to operate.
It has been really common in HPC for quite a while. I presume the denser interconnect/networking of HPC favours the higher density that liquid cooling allows. Hardware utilization is also higher compared to normal datacenters, so the additional efficiency vs. air cooling is more useful.
For large machines the air setup is really less efficient and takes up a lot of space. You end up building a big room with a pressurized floor that is completely ringed by large AC units. You have to move a lot of air through the floor, bringing it up through the cabinets and back to the ACs. It's also a big control-systems problem: you need to get the air through the cabinets evenly, so you need variable-speed fans or controlled ducts, and those need to be adaptive but not oscillate.
With a water-cooled setup you can move a lot more heat through your pipes just by increasing the flow rate. So you need pumps instead of fans, your machine room isn't a mini-hurricane, and you can deal with the waste heat more flexibly.
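Rough numbers on why that works, using textbook property values (nothing specific to Cheyenne's cooling design): the heat a coolant stream carries per unit of volume flow and per degree of temperature rise is its density times its specific heat.

    # Volumetric heat capacity, J / (m^3 * K) = density * specific heat
    air   = 1.2  * 1005     # ~1.2e3  J/(m^3*K)
    water = 1000 * 4186     # ~4.2e6  J/(m^3*K)
    print(round(water / air))   # ~3471x more heat per unit flow per kelvin

So a modest bump in pump flow rate moves far more heat than any realistic increase in airflow through a raised floor.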
It doesn't matter what medium you use to move the heat; the same amount of energy has to be dissipated. And watercooling just implies more intermediate steps between the heat spreader on the die and the air. So I don't believe the overall heat output can really be helped.
The listing says that 1% of the nodes have RAM with memory errors. I assume this means hard errors since soft errors would just be corrected. Is this typical? Does RAM deteriorate over time?
I dunno, I still think the 2011-v3 platform that these Xeons run in is a great setup for a homelab. A bit power hungry, but if you can build a dual-socket workstation with 36 cores and 256GB of RAM for <$1000, that is a solid server, and it would still make for a hell of an app/db server. That's basically an m4.16xlarge, same CPU generation, platform, etc. (yes, without all the surrounding infra), that will cost you something like $0.10 per hour worth of energy to run.
Take the Dell T7910 for example (I use one of these for my homelab), you can pick up a basic one with low end CPU/RAM for sometimes as little as $300. Dumping all these 18 core E5s and DDR4 ECC on the market should make it even cheaper to spec out. Currently they go for about $100-150 each on the CPUs, and ~$150-200 for the RAM. Not bad IMO.
My answer was a bit tongue-in-cheek, but the reality is that it depends on what you mean by "market value."
For the market of the auction, the selling price is the actual market value. Likewise, it's typically not too far off the value of the item in the wider market, assuming you are comparing it to a similar item in similar condition. The problem is that for most items purchased at auction, there's no similar item, readily available, to compare it to.
I've won multiple items at machine-shop auctions for a small fraction of their "new" price. The problem with the comparison is that e.g., the Starrett dial test indicator that I got for $10, and the new one that retails for around $200 are hard to compare because there's no liquid market for 30-year-old measuring equipment. While it's adequate for my hobby machinist use, it wouldn't be acceptable in a precision shop since it has no calibration history.
If you find an item where you can reasonably compare apples to apples, e.g., a car, you see that the final price of a car at auction is usually pretty close to the price of the same make/model being sold on the open used market. The slightly lower price of the auction car reflects the risk of the repairs that might be needed.
It's exactly in this "now vs later" that resellers and other brokers sit. If they know that X will sell for $Y "eventually" and how long that eventually is, they can work out how much they can pay for it now and still come out ahead.
Cars are very liquid and move quickly, so the now vs later price is close; weird things that nobody has heard of (but when they need it, they need it NOW) will have a much wider variance.
Depends on the terms of the auction. If we take the California legal definition of Fair Market Value for real estate:
> The fair market value of the property taken is the highest price on the date of valuation that would be agreed to by a seller, being willing to sell but under no particular or urgent necessity for so doing, nor obliged to sell, and a buyer, being ready, willing, and able to buy but under no particular necessity for so doing, each dealing with the other with full knowledge of all the uses and purposes for which the property is reasonably adaptable and available.
A 7 day auction on a complex product like this may be a little short to qualify with the necessity clauses, IMHO; there's a bit too much time pressure, and not enough time for a buyer to inspect and research.
I think auctions exist explicitly to potentially buy or sell an item with a delta on its market value. I.e., buyers want the chance to buy below and sellers to sell above. Neither really wants to engage in the transaction at all in the reverse situation, or even in the "market value" case; you would just make a direct sale and avoid the hassle of an auction.