The Problem
In the age of generative AI and large language models (LLMs), GPU compute is the new oil. GPU pricing is volatile, opaque, and scattered across a fragmented market of cloud providers. For teams deploying inference pipelines at scale, these cost fluctuations aren't just a nuisance: they're a financial and operational risk.
For example, let's take a look at spot pricing for p5.48xlarge (8x Nvidia H100s) on AWS over the past 6 months:
Stockholm (eu-north-1) currently runs H100s at about $1.10 per GPU per hour, whereas London (eu-west-2) pays almost $3.70 per GPU per hour. If Brexit hasn't hit the UK hard enough, AWS is charging more than 3x as much for running H100s in the UK! The higher cost in London also indicates that GPU capacity is tighter there than in Stockholm, so provisioning 8 H100s will be more challenging for both spot and On-Demand instances.
This isn't isolated to H100s; it also affects cheaper GPU hardware like the g4dn.xlarge instance, which carries a single Nvidia T4 GPU:
which costs about $0.07 per GPU per hour in Bahrain (me-south-1), whereas Singapore (ap-southeast-1) pays $0.28 per GPU per hour for the same service. At GordianLabs, we often see a 2-4x cost difference across datacentres from a single provider.
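The spread is easy to quantify yourself. Here's a minimal sketch using the per-GPU spot prices quoted above as hardcoded sample inputs; in practice you would pull live figures from AWS via boto3's `describe_spot_price_history` call rather than a static table:

```python
# Sample per-GPU hourly spot prices, as quoted above. Real numbers would
# come from AWS's describe_spot_price_history API (boto3), divided by the
# instance's GPU count.
PER_GPU_HOURLY = {
    ("p5.48xlarge", "eu-north-1"): 1.10,      # Stockholm
    ("p5.48xlarge", "eu-west-2"): 3.70,       # London
    ("g4dn.xlarge", "me-south-1"): 0.07,      # Bahrain
    ("g4dn.xlarge", "ap-southeast-1"): 0.28,  # Singapore
}

def regional_spread(instance_type):
    """Return (cheapest_region, priciest_region, price_ratio) for one instance type."""
    prices = {r: p for (t, r), p in PER_GPU_HOURLY.items() if t == instance_type}
    cheapest = min(prices, key=prices.get)
    priciest = max(prices, key=prices.get)
    return cheapest, priciest, prices[priciest] / prices[cheapest]

for itype in ("p5.48xlarge", "g4dn.xlarge"):
    lo, hi, ratio = regional_spread(itype)
    print(f"{itype}: {hi} costs {ratio:.1f}x more per GPU-hour than {lo}")
```

Even with only two regions per instance type, the ratios land at roughly 3.4x for the H100s and 4x for the T4s, consistent with the 2-4x spreads we typically observe.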
Cloud GPU pricing is a moving target. Depending on the provider, region, and instance type, spot instance prices can swing 2x–5x within days. On-demand and reserved prices vary too, with limited transparency into supply, demand, or underlying trends.
This leads to:
- Cost overruns on training runs
- Delayed deployments due to spot instance interruptions
- Missed savings from poor regional selection
Most teams react to GPU costs after the fact, when the bill arrives. GordianLabs.ai flips this model by predicting costs before you spin up a single instance.
The Solution
GordianLabs uses world-leading AI experts and more than 55M data points of cloud pricing data to predict GPU prices from 1 day to 3 months ahead.
We serve these predictions through a simple API that can be wired into your existing infrastructure. Drop us an email at [email protected] if you'd like to save more than 50% on your GPU budget.
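As a rough sense of what wiring the predictions into your stack might look like, here is a sketch using only the Python standard library. Everything in it is an assumption for illustration: the endpoint URL, field names, and horizon parameter are hypothetical, not the documented GordianLabs API.

```python
import json
import urllib.request

# Hypothetical endpoint: the real URL and schema come from the GordianLabs docs.
API_URL = "https://api.gordianlabs.ai/v1/forecast"

def build_forecast_request(instance_type, regions, horizon_days):
    """Assemble a price-forecast query. All field names here are illustrative."""
    return {
        "instance_type": instance_type,
        "regions": regions,
        "horizon_days": horizon_days,  # anywhere from 1 day to ~90 days
    }

def fetch_forecast(payload, api_key):
    """POST the query and return the decoded JSON response (shape assumed)."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Ask for a 30-day H100 price forecast across the two regions discussed above.
payload = build_forecast_request("p5.48xlarge", ["eu-north-1", "eu-west-2"], 30)
```

A scheduler could call something like `fetch_forecast` before each training run and place the job in whichever region the model expects to be cheapest over the job's duration.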