Workloads

Workloads include both native platform workloads (Workspaces, Training, and Inference) and workloads that originate from third-party ML frameworks, tools, or the broader Kubernetes ecosystem. For more details on the supported workloads, see Introduction to workloads. The Workloads endpoints allow you to list, retrieve, and count workloads, and to view telemetry or metrics data, for all workload types in your environment.

List workloads.

Retrieve a list of active workloads with details.

Security: bearerAuth
Request
query Parameters
deleted
boolean

Return only deleted resources when true.

offset
integer <int32>

The offset of the first item returned in the collection.

Example: offset=100
limit
integer <int32> [ 1 .. 500 ]
Default: 50

The maximum number of entries to return.

sortOrder
string
Default: "asc"

Sort results in descending or ascending order.

Enum: "asc" "desc"
sortBy
string

Sort results by a parameter.

Enum: "type" "name" "clusterId" "projectId" "projectName" "departmentId" "departmentName" "createdAt" "deletedAt" "submittedBy" "phase" "completedAt" "nodepool" "distributedFramework" "allocatedGPU" "idleGpus" "idleAllocatedGpus" "phaseUpdatedAt" "category" "priority" "totalPendingTimeSeconds" "totalRunningTimeSeconds" "priorityClassName" "guaranteedRuntimeEndsAt" "aiApplicationId" "aiApplicationName"
filterBy
Array of strings

Filter results by a parameter. Use the format field-name operator value. Operators are == (equals), != (not equals), <= (less than or equal), >= (greater than or equal), =@ (contains), !@ (does not contain), =^ (starts with), and =$ (ends with). Dates are in ISO 8601 timestamp format and can be used only with the operators ==, !=, <=, and >=.

Example: filterBy=name!=some-workload-name,allocatedGPU>=2,createdAt>=2021-01-01T00:00:00Z
search
string

Filter results by a free text search.

Example: search=test project
Responses
200

Executed successfully.

401

Unauthorized

403

Forbidden

500

Unexpected error

503

Unexpected error

GET /api/v1/workloads
Response samples
application/json
{
  "next": 1,
  "workloads": [
  ]
}
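
The offset and limit parameters described above can be combined to walk the full collection page by page. The following is a minimal sketch, not an official client: the base URL and the fetch_page callable are assumptions introduced for illustration (fetch_page is expected to send the GET request with a bearer token and return the decoded JSON body). The sketch does not rely on the "next" field, because its semantics are not described in this section; it stops when a page returns fewer items than the requested limit.

```python
from typing import Callable, Iterator, Optional
from urllib.parse import urlencode


def list_workloads_url(base_url: str, *, offset: int = 0, limit: int = 50,
                       sort_by: Optional[str] = None,
                       sort_order: str = "asc") -> str:
    """Build the GET /api/v1/workloads URL for a single page."""
    if not 1 <= limit <= 500:
        raise ValueError("limit must be between 1 and 500")
    if sort_order not in ("asc", "desc"):
        raise ValueError("sortOrder must be 'asc' or 'desc'")
    params = {"offset": offset, "limit": limit, "sortOrder": sort_order}
    if sort_by is not None:
        params["sortBy"] = sort_by
    return f"{base_url.rstrip('/')}/api/v1/workloads?{urlencode(params)}"


def iter_workloads(fetch_page: Callable[[str], dict], base_url: str, *,
                   limit: int = 50) -> Iterator[dict]:
    """Yield every workload by advancing offset until a short page is seen.

    fetch_page is a hypothetical helper supplied by the caller: it takes a
    URL and returns the parsed JSON response as a dict.
    """
    offset = 0
    while True:
        page = fetch_page(list_workloads_url(base_url, offset=offset, limit=limit))
        items = page.get("workloads", [])
        yield from items
        if len(items) < limit:
            return
        offset += limit
```

Injecting fetch_page keeps the paging logic independent of any particular HTTP library, so the same loop works with urllib, requests, or a test double.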

Get a workload.

Retrieve workload data using a workloadId.

Security: bearerAuth
Request
path Parameters
workloadId
required
string <uuid>

The Universally Unique Identifier (UUID) of the workload.

Responses
200

Executed successfully.

401

Unauthorized

403

Forbidden

404

The specified resource was not found

500

Unexpected error

503

Unexpected error

GET /api/v1/workloads/{workloadId}
Response samples
application/json
{
  "tenantId": 1001,
  "runningPods": 1,
  "phaseUpdatedAt": "2022-06-08T11:28:24.131Z",
  "k8sPhaseUpdatedAt": "2022-06-08T11:28:24.131Z",
  "updatedAt": "2022-06-08T11:28:24.131Z",
  "source": "CLI",
  "deletedAt": "2022-08-12T19:28:24.131Z",
  "type": "runai-job",
  "name": "very-important-job",
  "id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
  "priority": 50,
  "priorityClassName": "high-priority",
  "submittedBy": "researcher@run.ai",
  "clusterId": "71f69d83-ba66-4822-adf5-55ce55efd210",
  "projectName": "proj-1",
  "projectId": "1",
  "departmentName": "department-1",
  "departmentId": "1",
  "namespace": "runai-proj-1",
  "createdAt": "2022-01-01T03:49:52.531Z",
  "workloadRequestedResources": {
  },
  "podsRequestedResources": {
  },
  "allocatedResources": {
  },
  "actionsSupport": {
  },
  "phase": "Creating",
  "conditions": [
  ],
  "phaseMessage": "Not enough resources in the requested nodepool",
  "k8sPhase": "Pending",
  "requestedPods": {
  },
  "requestedNodePools": [
  ],
  "currentNodePools": [
  ],
  "completedAt": "2022-01-01T03:49:52.531Z",
  "images": [
  ],
  "urls": [
  ],
  "datasources": [
  ],
  "environments": [
  ],
  "externalConnections": [
  ],
  "distributedFramework": "Pytorch",
  "additionalFields": { },
  "preemptible": true,
  "environmentVariables": {
  },
  "command": "sleep",
  "arguments": "1000",
  "phaseReason": "NonPreemptibleOverQuota",
  "idleGpus": 3,
  "idleAllocatedGpus": 1,
  "totalPendingTimeSeconds": 60,
  "totalRunningTimeSeconds": 60,
  "category": "Train",
  "guaranteedRuntimeEndsAt": "2025-08-01T03:49:52.531Z",
  "aiApplicationId": "string",
  "aiApplicationName": "string",
  "sourceApi": "WorkloadsV2",
  "pendingSchedulingMessages": [
  ]
}
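
As an illustration of the request described above, the sketch below builds an authenticated GET request for a single workload using only the Python standard library. The base URL, the token value, and both function names are assumptions made for this example. It also assumes that bearerAuth means the standard "Authorization: Bearer <token>" header. A 404 response is mapped to None; every other HTTP error is re-raised.

```python
import json
import urllib.error
import urllib.request
import uuid
from typing import Optional


def build_get_workload_request(base_url: str, token: str,
                               workload_id: str) -> urllib.request.Request:
    """Prepare (but do not send) GET /api/v1/workloads/{workloadId}."""
    # uuid.UUID raises ValueError when workload_id is not a valid UUID.
    wid = str(uuid.UUID(workload_id))
    url = f"{base_url.rstrip('/')}/api/v1/workloads/{wid}"
    headers = {
        "Authorization": f"Bearer {token}",  # assumed bearerAuth header format
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def get_workload(base_url: str, token: str, workload_id: str) -> Optional[dict]:
    """Return the workload as a dict, or None if the API answers 404."""
    req = build_get_workload_request(base_url, token, workload_id)
    try:
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise
```

Validating the identifier locally avoids a round trip for malformed input; whether the server answers such input with 400 or 404 is not stated in this section.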

Count workloads.

Retrieve the number of workloads.

Security: bearerAuth
Request
query Parameters
deleted
boolean

Return only deleted resources when true.

filterBy
Array of strings

Filter results by a parameter. Use the format field-name operator value. Operators are == (equals), != (not equals), <= (less than or equal), >= (greater than or equal), =@ (contains), !@ (does not contain), =^ (starts with), and =$ (ends with). Dates are in ISO 8601 timestamp format and can be used only with the operators ==, !=, <=, and >=.

Example: filterBy=name!=some-workload-name,allocatedGPU>=2,createdAt>=2021-01-01T00:00:00Z
search
string

Filter results by a free text search.

Example: search=test project
Responses
200

Executed successfully.

401

Unauthorized

403

Forbidden

500

Unexpected error

503

Unexpected error

GET /api/v1/workloads/count
Response samples
application/json
{
  "count": 1
}
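
The filterBy format (field-name, operator, value) lends itself to a small helper. The sketch below composes filter expressions and builds the count URL. The function names and the base URL are hypothetical, and joining multiple filters with commas follows the filterBy example shown above rather than a documented serialization rule. Values that themselves contain commas are not handled.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode

# Operators listed in the filterBy description.
OPERATORS = ("==", "!=", "<=", ">=", "=@", "!@", "=^", "=$")


def filter_expr(field: str, operator: str, value: object) -> str:
    """Return one filter expression such as 'allocatedGPU>=2'."""
    if operator not in OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    return f"{field}{operator}{value}"


def count_workloads_url(base_url: str, filters: Iterable[str] = (),
                        search: Optional[str] = None,
                        deleted: Optional[bool] = None) -> str:
    """Build the GET /api/v1/workloads/count URL."""
    params = {}
    if deleted is not None:
        params["deleted"] = "true" if deleted else "false"
    filters = list(filters)
    if filters:
        # Assumption: multiple filters are comma-separated, as in the example.
        params["filterBy"] = ",".join(filters)
    if search:
        params["search"] = search
    url = f"{base_url.rstrip('/')}/api/v1/workloads/count"
    return f"{url}?{urlencode(params)}" if params else url
```

urlencode percent-encodes the operator characters, so expressions such as name!=some-workload-name survive transport unchanged.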

Get the workloads telemetry.

Retrieves workload data by telemetry type.

Security: bearerAuth
Request
query Parameters
clusterId
string <uuid>

Filter using the Universally Unique Identifier (UUID) of the cluster.

Example: clusterId=d73a738f-fab3-430a-8fa3-5241493d7128
nodepoolName
string

Filter using the nodepool.

Example: nodepoolName=default
departmentId
string

Filter using the department id.

Example: departmentId=1
groupBy
Array of strings <= 2 items

Group workloads by field.

Items Enum: "ClusterId" "DepartmentId" "ProjectId" "Type" "CurrentNodepools" "Phase" "Category"
telemetryType
required
string (WorkloadTelemetryType)

Specifies the telemetry type.

Enum: "WORKLOADS_COUNT" "GPU_ALLOCATION"
Responses
200

Executed successfully.

400

Bad request.

401

Unauthorized

403

Forbidden

404

The specified resource was not found

500

Unexpected error

503

Unexpected error

GET /api/v1/workloads/telemetry
Response samples
{
  "type": "ALLOCATION_RATIO",
  "timestamp": "2023-06-06 12:09:18.211",
  "values": [
  ]
}
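
The telemetry parameters carry a few constraints (a required telemetryType, at most two groupBy items) that can be checked before sending a request. The following sketch validates them and builds the URL. The function name and base URL are hypothetical. How the groupBy array is serialized is not stated in this section; the sketch assumes a comma-separated value by analogy with the filterBy example, so adjust it if the API expects repeated parameters instead.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode

TELEMETRY_TYPES = ("WORKLOADS_COUNT", "GPU_ALLOCATION")
GROUP_BY_FIELDS = ("ClusterId", "DepartmentId", "ProjectId", "Type",
                   "CurrentNodepools", "Phase", "Category")


def workloads_telemetry_url(base_url: str, telemetry_type: str,
                            group_by: Sequence[str] = (),
                            cluster_id: Optional[str] = None,
                            nodepool_name: Optional[str] = None,
                            department_id: Optional[str] = None) -> str:
    """Build the GET /api/v1/workloads/telemetry URL after local validation."""
    if telemetry_type not in TELEMETRY_TYPES:
        raise ValueError(f"unknown telemetryType: {telemetry_type!r}")
    if len(group_by) > 2:
        raise ValueError("groupBy accepts at most 2 items")
    unknown = [g for g in group_by if g not in GROUP_BY_FIELDS]
    if unknown:
        raise ValueError(f"unknown groupBy fields: {unknown}")

    params = {"telemetryType": telemetry_type}
    if group_by:
        # Assumption: array values are sent as one comma-separated parameter.
        params["groupBy"] = ",".join(group_by)
    if cluster_id:
        params["clusterId"] = cluster_id
    if nodepool_name:
        params["nodepoolName"] = nodepool_name
    if department_id:
        params["departmentId"] = department_id
    return f"{base_url.rstrip('/')}/api/v1/workloads/telemetry?{urlencode(params)}"
```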

Get workload metrics data.

Retrieves workload metrics data from the metrics database, for use in reporting and analysis tools.

Security: bearerAuth
Request
path Parameters
workloadId
required
string <uuid>

The Universally Unique Identifier (UUID) of the workload.

query Parameters
metricType
required
Array of strings (WorkloadMetricType)

Specify which data to request.

Items Enum: "GPU_UTILIZATION" "GPU_MEMORY_USAGE_BYTES" "GPU_MEMORY_REQUEST_BYTES" "CPU_USAGE_CORES" "CPU_REQUEST_CORES" "CPU_LIMIT_CORES" "CPU_MEMORY_USAGE_BYTES" "CPU_MEMORY_REQUEST_BYTES" "CPU_MEMORY_LIMIT_BYTES" "POD_COUNT" "RUNNING_POD_COUNT" "GPU_ALLOCATION" "NIM_NUM_REQUESTS_RUNNING" "NIM_NUM_REQUESTS_WAITING" "NIM_NUM_REQUEST_MAX" "NIM_REQUEST_SUCCESS_TOTAL" "NIM_REQUEST_FAILURE_TOTAL" "NIM_GPU_CACHE_USAGE_PERC" "NIM_TIME_TO_FIRST_TOKEN_SECONDS" "NIM_E2E_REQUEST_LATENCY_SECONDS" "NIM_TIME_TO_FIRST_TOKEN_SECONDS_PERCENTILES" "NIM_E2E_REQUEST_LATENCY_SECONDS_PERCENTILES" "DCGM_FI_DEV_NVLINK_BANDWIDTH_TOTAL"
start
required
string <date-time>

Start date of time range to fetch data in ISO 8601 timestamp format.

Example: start=2023-06-06T12:09:18.211Z
end
required
string <date-time>

End date of time range to fetch data in ISO 8601 timestamp format.

Example: end=2023-06-07T12:09:18.211Z
numberOfSamples
integer [ 0 .. 1000 ]
Default: 20

The number of samples to take in the specified time range.

Example: numberOfSamples=20
Responses
200

Executed successfully.

207

Partial success.

400

Bad request.

401

Unauthorized

403

Forbidden

404

The specified resource was not found

500

Unexpected error

503

Unexpected error

GET /api/v1/workloads/{workloadId}/metrics
Response samples
{
  "measurements": [
  ],
  "histogram": [
  ]
}
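
To show how the required start, end, and metricType parameters fit together, the sketch below formats timezone-aware datetimes as ISO 8601 UTC timestamps (matching the style of the examples above) and builds the metrics URL. The function names and base URL are hypothetical. Sending several metric types as one comma-separated value is an assumption, and metric names are not validated against the enum here.

```python
from datetime import datetime, timezone
from typing import Sequence
from urllib.parse import urlencode


def to_iso8601_utc(dt: datetime) -> str:
    """Format an aware datetime like '2023-06-06T12:09:18.211Z'."""
    if dt.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    text = dt.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    return text.replace("+00:00", "Z")


def workload_metrics_url(base_url: str, workload_id: str,
                         metric_types: Sequence[str],
                         start: datetime, end: datetime,
                         number_of_samples: int = 20) -> str:
    """Build the GET /api/v1/workloads/{workloadId}/metrics URL."""
    if not metric_types:
        raise ValueError("at least one metricType is required")
    if not 0 <= number_of_samples <= 1000:
        raise ValueError("numberOfSamples must be between 0 and 1000")
    start_text, end_text = to_iso8601_utc(start), to_iso8601_utc(end)
    if end <= start:
        raise ValueError("end must be later than start")
    params = {
        # Assumption: array values are sent as one comma-separated parameter.
        "metricType": ",".join(metric_types),
        "start": start_text,
        "end": end_text,
        "numberOfSamples": number_of_samples,
    }
    path = f"/api/v1/workloads/{workload_id}/metrics"
    return f"{base_url.rstrip('/')}{path}?{urlencode(params)}"
```

Because the endpoint can answer 207 (partial success), a caller would typically check the status code as well as the body before treating the returned measurements as complete.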