InferX — Serverless GPU Inference Platform for Production Workloads

| Tenant | Namespace | Pod Name | State | Node Name | Req. GPU Count | Req. GPU vRAM (MB) | Type | Standby GPU (MB) | Standby Pageable (MB) | Standby Pinned (MB) | Allocated GPU vRAM (MB) | Allocated GPU Slots (GPU: slot count) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| public | Trial | public/Trial/Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated/85/140 | Standby | g8398d4 | 4 | 45000 | Restore | File: 179808 | File: 7266 | File: 33 | 0 | N/A |
| public | Trial | public/Trial/L3.3-70B-Loki-V2.0/94/139 | Ready | g8398d4 | 2 | 71000 | Restore | File: 141700 | File: 3604 | File: 2 | 71168 | 5: 278, 6: 278 |
| public | Trial | public/Trial/L3.3-70B-Loki-V2.0/94/143 | Standby | g8398d4 | 2 | 71000 | Restore | File: 141700 | File: 3604 | File: 2 | 0 | N/A |
| public | Trial | public/Trial/translategemma-27b-it-FP8-Dynamic/112/138 | Ready | g8398d4 | 1 | 32000 | Restore | File: 30540 | File: 3188 | File: 0 | 32000 | 3: 125 |
| public | Trial | public/Trial/translategemma-27b-it-FP8-Dynamic/112/144 | Standby | g8398d4 | 1 | 32000 | Restore | File: 30540 | File: 3188 | File: 0 | 0 | N/A |
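The row layout above can be sketched as a small data model. This is a minimal illustration, not InferX's actual API: the class and field names (`PodRecord`, `StandbyFiles`, `gpu_slots`, and so on) are hypothetical, chosen only to mirror the table's column headers. It assumes the "Standby (MB)" sub-columns are file-backed snapshot sizes and that "Allocated GPU Slots" lists a (GPU index, slot count) pair per requested GPU, which matches the two-GPU Ready pod holding slots on GPUs 5 and 6.

```python
from dataclasses import dataclass, field

@dataclass
class StandbyFiles:
    """Standby snapshot sizes in MB; each value in the table is file-backed ("File: N")."""
    gpu_mb: int
    pageable_mb: int
    pinned_mb: int

@dataclass
class PodRecord:
    """One row of the pod-status table (hypothetical schema, not the InferX API)."""
    tenant: str
    namespace: str
    pod_name: str
    state: str                  # "Ready" or "Standby" in the rows shown
    node_name: str
    req_gpu_count: int
    req_gpu_vram_mb: int
    restore_type: str           # "Restore" for every row shown
    standby: StandbyFiles
    allocated_gpu_vram_mb: int  # 0 while the pod is in Standby
    # (GPU index, slot count) per allocated GPU; empty for Standby pods (shown as N/A)
    gpu_slots: list[tuple[int, int]] = field(default_factory=list)

# The Ready L3.3-70B-Loki-V2.0 pod from the table:
loki = PodRecord(
    tenant="public",
    namespace="Trial",
    pod_name="public/Trial/L3.3-70B-Loki-V2.0/94/139",
    state="Ready",
    node_name="g8398d4",
    req_gpu_count=2,
    req_gpu_vram_mb=71000,
    restore_type="Restore",
    standby=StandbyFiles(gpu_mb=141700, pageable_mb=3604, pinned_mb=2),
    allocated_gpu_vram_mb=71168,
    gpu_slots=[(5, 278), (6, 278)],
)
```

Under this reading, a Ready pod carries one slot entry per requested GPU, while its Standby twin (same model, same standby file sizes) holds no allocation at all.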