• 0 Posts
  • 22 Comments
Joined 8 months ago
Cake day: March 27th, 2024

  • It’s more of an operating cost issue. This is almost decade-old hardware. It was efficient in its day, but compared to new hardware it costs so much to run that you would be better served investing in something with modern efficiency. It won’t be junked; it will be parted out. If you want a cheap homelab with InfiniBand and shitloads of memory, you could pick up a blade for a fraction of what it would otherwise cost. I fully expect it to turn into thousands of reasonably powerful servers for the prosumer and nerd markets instead of running as a monolithic cluster.


  • Hey, I have worked on this exact machine before. Neat to see they are finally decommissioning it. It would be a terrible purchase to actually use these days, though: for the cost of moving and deploying it, you could rock a few Hopper or Grace clusters that would outperform it for less than half of the operating overhead.

    I fully expect it to get parted out. The actual components would be far more useful on their own as cheap homelab systems, and would be a much better ROI than using it as is. This thing is water cooled, so just the plumbing would be a nightmare to deal with if you aren’t set up for it, and if you are, you would be better off going with a modern architecture anyway.


  • Kind of. You would use a deployment node to manage the individual blades, which run really specialized software that is basically useless without the management nodes. It wouldn’t be difficult to spin it up (Terascale would have it ready to batch out jobs within a few hours), but you are going to need to engineer your building around it to even get that far. Your foundation needs to support multiple tons of weight and be perfectly level. The building needs to deliver megawatts of power and remove megawatts of heat (it is water cooled, so you need the infrastructure and cooling towers to handle that), and you need to be able to get the machine into the building to begin with. I have worked on this system a few times, and just moving it would literally cost upwards of 7 figures. The computer is pretty easy to use; it’s all of the supporting infrastructure that will need a literal team of engineers. I could (and have, kind of) spin the machine up to start crunching data within a day on my own. Fuck moving it, and double fuck re-cabling it. Literal miles of fiber in those racks.

    You do literally pop in a pre-configured image and it deploys to everything at once. That’s probably the easiest part of the whole setup.



  • Hi, I contribute to a number of projects that require incredibly specific information to facilitate (GPGPU kernel optimizations and unit tests for BLAS), and I use Reddit to collaborate with other engineers to solve issues like doing calculus on Lie groups resulting in a divide-by-zero because some non-zero groups multiply to zero in the middle of the calculation. The best engineers and mathematicians I know moved here, so I moved with them to continue the dissemination of these principles. The majority of memes and shitposts offer a common forum to get real work and study done in a way that publicly offers those solutions to anyone asking the same questions. Reddit wants just the shitposts and astroturfing, so they can keep it. I have work to do.