Supplier Integration API
A lightweight REST API that GPU providers implement to list inventory, provision instances, and integrate with GPU.ai. Designed for small GPU farms and colocation facilities that don't have existing APIs.
How it works
You implement four REST operations across three endpoint paths on your infrastructure. GPU.ai polls your inventory every 30 seconds, sends provision requests when customers order GPUs, and polls instance status for lifecycle tracking. You never call GPU.ai; we call you.
- Contact integrations@gpu.ai to receive your `client_id` and `client_secret`
- Implement the endpoints described below
- GPU.ai configures your base URL and starts polling automatically
Authentication
GPU.ai authenticates to your endpoints using Bearer tokens obtained via the OAuth2 client credentials flow. GPU.ai calls its own token endpoint with your credentials, receives a short-lived token, and sends it in the Authorization header on every request.
```
Authorization: Bearer eyJhbGciOiJS...
```

Tokens expire after 1 hour. GPU.ai handles token refresh automatically — your endpoints just need to validate the Bearer token on each request.
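As a starting point, the sketch below shows Bearer-token extraction from the Authorization header. It is a hedged illustration only: actual verification of the token (signature, expiry) depends on the key material exchanged during onboarding and is left as a stub.

```python
def extract_bearer_token(authorization_header):
    """Return the token from an 'Authorization: Bearer <token>' header, or None.

    Verifying the token itself (signature, expiry) is a separate step that
    depends on your onboarding setup; this only parses the header.
    """
    if not authorization_header:
        return None
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return None
    return token
```

Reject any request where this returns None (or where verification fails) with 401.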
Base URL
All endpoints are relative to your base URL, configured during onboarding. For example, if your base URL is https://api.acme-gpu.com, the availability endpoint would be at https://api.acme-gpu.com/v1/gpu/available.
Endpoints
GET /v1/gpu/available

Returns your current GPU offerings with pricing and availability. GPU.ai polls this every 30 seconds. Only return GPUs that are currently provisionable.
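A minimal sketch of building this response from an internal inventory list. The inventory structure is an assumption; the point is the filtering rule, only provisionable stock goes in the payload.

```python
def build_availability_response(inventory):
    """Return the /v1/gpu/available payload, including only provisionable stock.

    `inventory` is assumed to be a list of offering dicts matching the
    response schema below; items with no available units are excluded.
    """
    offerings = [o for o in inventory if o.get("available_count", 0) > 0]
    return {"offerings": offerings}
```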
Response body
| Field | Type | Required |
|---|---|---|
| offerings | array | required |

```
// 200 OK
{
  "offerings": [
    {
      "gpu_type": "h100_sxm",
      "gpu_count": 1,
      "vram_per_gpu_gb": 80,
      "cpu_cores": 24,
      "ram_gb": 128,
      "storage_gb": 500,
      "price_per_hour": 3.49,
      "tier": "on_demand",
      "region": "US",
      "datacenter_location": "US-East-1",
      "stock_status": "High",
      "available_count": 12
    }
  ]
}
```

POST /v1/instances

Provisions a new GPU instance. The startup_script field contains a bootstrap script that must be executed on instance boot — it establishes the SSH tunnel back to GPU.ai. Return immediately with your instance ID; GPU.ai polls status separately.
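The "return immediately" requirement can be sketched as follows: validate the request, enqueue the real provisioning work (including running startup_script on boot), and respond with a creating status without waiting. The work queue and the upstream ID format are assumptions for illustration.

```python
import uuid

def handle_provision(request_body, work_queue):
    """Accept a provision request and return (http_status, response_body).

    Real provisioning (boot, startup_script execution) happens asynchronously
    via `work_queue`; this handler only validates and acknowledges.
    """
    required = ("instance_id", "gpu_type", "gpu_count", "tier")
    missing = [f for f in required if f not in request_body]
    if missing:
        return 400, {"error": f"missing fields: {', '.join(missing)}"}
    upstream_id = f"sup-{uuid.uuid4().hex[:8]}"  # hypothetical ID scheme
    work_queue.append((upstream_id, request_body))  # provision in background
    return 201, {
        "upstream_id": upstream_id,
        "status": "creating",
        "estimated_ready_seconds": 60,
    }
```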
Request body
| Field | Type | Required | Notes |
|---|---|---|---|
| instance_id | string | required | |
| gpu_type | string | required | Canonical type, e.g. h100_sxm, a100_80gb |
| gpu_count | integer | required | |
| tier | string | required | on_demand or spot |
| region | string | optional | |
| ssh_public_keys | string[] | optional | |
| docker_image | string | optional | |
| startup_script | string | optional | |

Response body
| Field | Type | Required | Notes |
|---|---|---|---|
| upstream_id | string | required | |
| status | string | required | creating |
| cost_per_hour | number | optional | |
| estimated_ready_seconds | integer | optional | |
| datacenter_location | string | optional | |
| region | string | optional | |

```
// 201 Created
{
  "upstream_id": "sup-12345",
  "status": "creating",
  "cost_per_hour": 3.49,
  "estimated_ready_seconds": 60,
  "datacenter_location": "US-East-1",
  "region": "US"
}
```

GET /v1/instances/{id}

Returns the current status of a provisioned instance. GPU.ai polls this to track lifecycle transitions.
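A sketch of the status lookup, assuming an in-memory instance table keyed by upstream_id (any store works). Status values follow the spec: creating, running, stopping, terminated, error.

```python
def handle_status(upstream_id, instances):
    """Return (http_status, response_body) for a status poll.

    `instances` is an assumed mapping of upstream_id -> instance record.
    """
    inst = instances.get(upstream_id)
    if inst is None:
        return 404, {"error": "unknown instance"}
    return 200, {
        "upstream_id": upstream_id,
        "status": inst["status"],
        "ip": inst.get("ip"),
        "uptime_seconds": inst.get("uptime_seconds"),
    }
```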
Response body
| Field | Type | Required | Notes |
|---|---|---|---|
| upstream_id | string | required | |
| status | string | required | One of creating, running, stopping, terminated, error |
| ip | string | optional | |
| cost_per_hour | number | optional | |
| uptime_seconds | integer | optional | |

```
// 200 OK
{
  "upstream_id": "sup-12345",
  "status": "running",
  "ip": "10.0.1.55",
  "cost_per_hour": 3.49,
  "uptime_seconds": 3600
}
```

DELETE /v1/instances/{id}

Terminates a running instance and releases all resources. This must be idempotent — terminating an already-terminated instance should return 204 without error.
Returns 204 No Content on success (no response body).
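The idempotency requirement can be sketched directly, again assuming an in-memory instance table: terminate at most once, but always return 204, even for an instance that is already terminated or unknown.

```python
def handle_terminate(upstream_id, instances):
    """Idempotently terminate an instance; always returns HTTP 204.

    `instances` is an assumed mapping of upstream_id -> instance record.
    """
    inst = instances.get(upstream_id)
    if inst is not None and inst["status"] != "terminated":
        inst["status"] = "terminated"  # release GPUs, storage, etc. here
    return 204  # same result whether or not anything was released
```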
GPU type identifiers
Use GPU.ai's canonical GPU type strings in your gpu_type fields. These must match exactly.
| Identifier | GPU | VRAM |
|---|---|---|
| h200_sxm | NVIDIA H200 SXM | 141 GB |
| h100_sxm | NVIDIA H100 SXM | 80 GB |
| h100_pcie | NVIDIA H100 PCIe | 80 GB |
| a100_80gb | NVIDIA A100 | 80 GB |
| l40s | NVIDIA L40S | 48 GB |
| rtx_4090 | NVIDIA RTX 4090 | 24 GB |
| rtx_3090 | NVIDIA RTX 3090 | 24 GB |
Error handling
GPU.ai retries on 5xx errors with exponential backoff (up to 3 attempts). Return 429 with a Retry-After header if you need to rate limit. 4xx errors (except 429) are not retried.
If the availability endpoint returns an error, GPU.ai isolates the failure — your supplier is temporarily skipped while other suppliers continue serving. GPU.ai will resume polling your endpoint on the next 30-second cycle.
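If you need to throttle GPU.ai's polling, one way to decide when to send 429 is a simple per-client interval check; the minimum interval and the helper shape below are illustrative assumptions, not part of the spec.

```python
def check_rate_limit(last_request_ts, now, min_interval_seconds=10):
    """Return (allowed, headers) for an incoming request.

    If the previous request was too recent, refuse with a Retry-After
    header telling GPU.ai how long to back off. Timestamps are in seconds.
    """
    if last_request_ts is not None and now - last_request_ts < min_interval_seconds:
        retry_after = int(min_interval_seconds - (now - last_request_ts)) + 1
        return False, {"Retry-After": str(retry_after)}
    return True, {}
```

When `allowed` is False, respond 429 with the returned headers.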
OpenAPI spec
The full OpenAPI 3.0.3 specification is available for download. Import it into Postman, Swagger UI, or your API client of choice to explore the endpoints interactively.
View openapi.yaml on GitHub →

Get started
Ready to integrate? Contact integrations@gpu.ai with your company name, datacenter locations, and GPU inventory. We'll provision your credentials and walk you through onboarding.