In a world increasingly defined by specialized hardware, we're introducing two powerful GPU server configs via market-based pricing. What gives?
At Packet, we’re big believers in a future that will increasingly be defined by specialized hardware.
Not only is Moore’s Law really and truly dead, but workloads are getting bigger, AI/ML is just starting to make its mark, we’re living in a resource-constrained world (power!), and portable software + DevOps means we can actually move software around to new hardware with ease.
These trends are helping to drive investment in everything from accelerators to offloads to alternative processor architectures. I believe this is one of the most exciting times to be in hardware - and I’ve been at this game professionally since before I had a driver's license.
Just look around at all the hardware companies breaking ground: from established names like Xilinx, Nvidia, AMD and Intel to upstarts like Graphcore, Cerebras, Mythic and Netronome. Heck, even the hyperscale cloud providers and webscale companies are making their own silicon: Apple with its Bionic processors, Google with its TPUs, and Amazon Web Services with its custom Nitro NICs.
All of this new hardware is great. As the guy responsible for charting Packet’s hardware strategy and pricing, the problem I’m faced with is: what’s it worth? I mean, we all know roughly what a standard Intel Xeon Scalable processor is worth in the market -- but a brand new AI chip or smart NIC for edge packet processing? Who knows!
These new technologies are missing the mature market dynamics of more established components and solutions.
Pricing and Cloud Provider Economics
In some ways, the cloud is perfect for testing and trying out new technologies. Spin something up quickly, develop in real time, and tear it down if you don’t want it. Great, right? But if you look at the cloud catalogs today, you’ll find little or no “special” hardware available to use. And when you do, it will carry a serious premium unless there is a long-term reservation contract or similar risk reduction strategy for the cloud provider. Why?
Mainly, risk. Most cloud providers don’t want to buy tons of really special hardware, hope the market is mature enough to use it, guess the price that people are willing to pay, hope the product has a long enough lifecycle in the market to get a nice payback and wait for the money to flow in.
But a never-been-tried Graphcore IPU that promises a huge performance premium over current GPUs? A Mellanox BlueField-enabled Arm box? Um, who knows. It depends on all kinds of things: software traction, market adoption, use cases, etc.
Of course, our marketing team could pin the tail on the donkey - $14.32/hr! On sale today! Buy now! Or we could wait a year as the technologies prove themselves out - it’s much less risky to sell last year’s GPU than the one announced last week.
But we don’t think that’s a scalable model, especially as we deal with fluctuations in capacity, distribution to an increasing number of locations, and try to help the leading companies of the world use all this specialized hardware to transform their businesses on their own terms.
We’ll need to find a better way to sell special hardware.
By the Numbers
Let’s play out the numbers with a very real example. I’m looking at the BOM (bill of materials) for our new g2.large configuration: dual Intel Xeon 6126 processors, 192 GB of memory, some fast SSDs and two Nvidia Tesla V100 GPUs with 32 GB of RAM each. Pretty sweet server.
We bought ~20 of these configs for our GPU launch from a major OEM that we buy a lot of gear from. While our price isn’t anywhere close to what a hyperscaler would pay, it’s also not retail market rate. Total cost including switching, optical cables, etc? About $25,000, or the cost of a Volkswagen GTI. :)
| Component | Cost |
| --- | --- |
| 2 x Intel Xeon Gold 6126 Processors | $3,500 |
| 2 x Nvidia V100 GPUs w/ 32GB RAM | $16,000 |
| Server, Board, Disks, NICs, etc. | $2,500 |
Ignoring the implications of power and just focusing on the basics, at a price of $5.00 per hour we would need to sell each of these servers 24 hours per day, every day for about seven months straight just to recoup the hardware cost. Now, that could be exactly the right price. It could also be totally wrong - depending on your workload needs, this machine could be worth dramatically more to you for a few hours, or far less. How would I know?
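The payback math above fits in a couple of lines (using the ~$25,000 total and $5.00/hr price from this example, and the same assumptions as the text: 100% utilization, no power or other operating costs):

```python
# Back-of-the-envelope payback for the GPU server example above.
# Assumptions: 100% utilization, hardware cost only (no power/space opex).

def payback_hours(hardware_cost: float, hourly_price: float) -> float:
    """Hours of fully utilized rental needed to recoup the hardware cost."""
    return hardware_cost / hourly_price

hours = payback_hours(25_000, 5.00)   # ~$25k total cost, $5.00/hr price
months = hours / 24 / 30.4            # average ~30.4 days per month

print(f"{hours:.0f} hours, or about {months:.1f} months of continuous use")
```

Flip the formula around and you get the other lever: pick a target payback window and it implies the hourly price you’d have to charge.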
(Note: the math for selling reserved / committed hardware is easy, since the risk is balanced by commitments from a client. We’re happy to do that as part of our Custom Reserved or Private Deployment products anytime.)
An Experiment in Market Based Pricing
With the launch next month of our first two public cloud GPU configs (specs down below), we’re going to let you set the price. We’ll start it at $0.00/hr and let our spot market (help doc) take it from there: you set a bid, along with the maximum price you’re willing to pay. If someone bids above your maximum, your instance will be terminated with 120 seconds’ notice.
This means 100% market-based pricing. We’ll expose the exact number of machines available in our capacity endpoint, and we’ll let you (the market) decide what $500,000 worth of hardware is worth.
If you want to use our new configs when they’re available, you’ll need to get familiar with our spot market. We’ve recently updated our Terraform provider to support the spot market, so be sure to give it a try in advance. You can also use our portal interface to deploy your market-priced instances.
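For a rough sketch of what a spot request looks like against the Packet API: the `spot_instance` and `spot_price_max` fields are what the device-creation call uses for spot bids, while the plan slug, facility, OS image, token, and project ID below are placeholder assumptions for illustration:

```python
import json
import urllib.request

PACKET_API = "https://api.packet.net"

def build_spot_device(plan: str, facility: str, os_image: str,
                      max_bid: float) -> dict:
    """Build a device-creation payload that bids on the spot market.

    The instance runs as long as the market price stays at or below
    `spot_price_max` (your maximum bid).
    """
    return {
        "plan": plan,
        "facility": facility,
        "operating_system": os_image,
        "hostname": "gpu-spot-01",   # placeholder hostname
        "spot_instance": True,
        "spot_price_max": max_bid,
    }

def create_spot_device(token: str, project_id: str, payload: dict) -> dict:
    """POST the device into a project (token/project are placeholders)."""
    req = urllib.request.Request(
        f"{PACKET_API}/projects/{project_id}/devices",
        data=json.dumps(payload).encode(),
        headers={"X-Auth-Token": token, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example: bid up to $2.50/hr for a g2.large training server.
payload = build_spot_device("g2.large.x86", "ewr1", "ubuntu_18_04", 2.50)
```

Remember the mechanics above: get outbid and your instance is terminated on 120 seconds’ notice, so design your workload to checkpoint.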
The Timeshare Model, and Other Ideas
While a spot market is great for many use cases, we think a timeshare model will be more effective in others. In fact, we’re positive there is a lot to learn and build on our platform to help users consume specialized and diverse hardware.
That’s why we’re also working on a futures contract process, which will allow you to reserve hardware in advance. Need 4 hours to train your model? Reserve it / schedule it at the forward-looking market price, and move along. This may sound like high finance, yet most of us have used Expedia to search for plane tickets or browsed late at night for a good deal on HotelTonight -- these are all resources with limited inventory -- and we think the “room with a view” isn’t that much different from a GPU-based server when it comes to availability and market dynamics.
No doubt we are missing a lot, both in terms of cool hardware options and platform features to support them. I would love to hear your thoughts.
GPU Server Specs
Inference Server - x2.xlarge
Execution of models, neural nets and matching engines.
- Dual Xeon 5120
- 384GB RAM
- 2 x 120GB SSD boot
- 1 x 3.2TB NVMe
- 1 x Nvidia P4 GPU
- 2 x 10Gbps NICs
Training Server - g2.large
Training/modeling - DNN and CNN training.
- Dual Xeon 6126 (2 x 12C/24T, 2.6GHz)
- 192GB RAM
- 2 x 120GB SSD boot
- 2 x 480GB SSD
- 2 x Nvidia V100 32GB GPUs w/ NVLink
- 2 x 10Gbps NICs