How not to load a local AI in more steps than you would think possible


First Post!

How to Make as Many Mistakes as Possible When Deploying Your First Local LLM

(No Matter How Easy It May Seem)

  1. Start with Research: I researched which GPU to buy and found a great deal on a used model from Amazon. What I didn’t do was check whether it would fit in my Cisco server or whether it was supported for GPU passthrough by Nutanix. I spent two days trying to install it—only to discover it didn’t fit and wasn’t supported. Classic.
  2. Find a Supported GPU: I eventually found a compatible GPU that fit and was supported by Nutanix. Unfortunately, I forgot to order the correct power supply cable. Once installed, the GPU drew so much power that the server wouldn't even boot. The saga continues.
  3. Upgrade the Power Supply: I decided to go big and ordered 1600W power supplies. What I didn’t realize? They required a 220V power outlet—which I didn’t have. Back to square one.
  4. Try a Cable Fix: Thinking maybe the cable was the issue, I ordered a new one. Turns out, it was the wrong cable again. Strike two.
  5. Order Smaller Power Supplies: I ordered smaller, more compatible power supplies—but they wouldn’t arrive for another week. At this point, I started doubting whether they'd even be powerful enough once they showed up.
  6. Install a 220V Line: I looked into getting a 220V line installed—and found out I could get it done faster than waiting for the new power supplies to arrive. Plus, it’ll be useful in the future for a bigger UPS, so not a total loss.
  7. More Cable Troubles: The third power cable for the GPU arrived—and still didn’t fit. That’s when I had an epiphany: I actually looked up the manual for the GPU, which detailed the correct pin configuration. Ever hear the saying about what happens when you make ASSumptions? I guess it also applies to NVIDIA Tesla GPU accelerator cards.
  8. Final Stretch: With the new 220V outlet installed (and a shoutout to my buddy Jeff from Amazon), I finally had the right cables—except I hadn’t measured their length. Of course, they were too short. I had to move the rack and remove the back door, but at last, the server powered on!
  9. Success at Last: Later that day, the (very tired) delivery person arrived with the final missing piece: an EPS-to-EPS power cable for the Tesla P100 GPU. Five minutes later, I pressed the power button—and boom! GPU successfully assigned to a Large Language Model.
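
In hindsight, the whole 220V detour in steps 3–6 comes down to simple arithmetic: a big power supply pulls more current from the wall than a standard US circuit can deliver. Here's a rough sketch of that math—the efficiency figure is an illustrative assumption, not a spec from any particular Cisco server or PSU:

```python
# Sketch: why a 1600 W power supply often demands a 220 V outlet.
# The 90% efficiency figure below is an illustrative assumption.

def input_current(output_watts: float, line_voltage: float,
                  efficiency: float = 0.9) -> float:
    """Approximate current drawn from the wall for a given PSU output load."""
    return output_watts / (line_voltage * efficiency)

# A fully loaded 1600 W supply on a standard US 120 V circuit:
amps_120 = input_current(1600, 120)   # ~14.8 A, right at a 15 A breaker's limit
# The same load on a 220 V line:
amps_220 = input_current(1600, 220)   # ~8.1 A, comfortable headroom

print(f"120 V: {amps_120:.1f} A, 220 V: {amps_220:.1f} A")
```

So the 1600W supplies weren't being difficult for no reason—on a normal 120V outlet they'd sit at the edge of what the breaker allows, which is why the manufacturer requires the higher-voltage line.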




Aftermath:
Amazon returns... so many returns... sorry, Jeff.
