InferX — Serverless GPU Inference Platform for Production Workloads

Model: translategemma-27b-it-FP8-Dynamic

Namespace: Trial
Model Name: translategemma-27b-it-FP8-Dynamic
Type: text2text
Standby GPU: File
Standby Pageable: File
Standby Pinned Memory: File
GPU Count: 1
vRam (MB): 32000
CPU: 20.0
Memory (MB): 80000
State: Normal
Revision: 112

Sample Rest Call
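A minimal sketch of calling the deployed model over REST. The gateway host, route, and payload fields below are assumptions (the actual InferX API is not shown on this page); only the namespace, model name, and revision come from the tables on this page.

```python
import json
import urllib.request

# Hypothetical endpoint: the real InferX gateway host and route are
# assumptions, as is the payload shape.
ENDPOINT = "http://inferx.example.com/v1/Trial/translategemma-27b-it-FP8-Dynamic/generate"

def build_request(prompt: str) -> urllib.request.Request:
    """Build a POST request for the text2text model (payload fields assumed)."""
    payload = {
        "namespace": "Trial",
        "model": "translategemma-27b-it-FP8-Dynamic",
        "revision": 112,  # deployed revision shown in the model table above
        "prompt": prompt,
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request("Translate to French: Hello, world!")
# To actually send it: urllib.request.urlopen(req)
```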

Pods

Tenant: public
Namespace: Trial
Pod Name: public/Trial/translategemma-27b-it-FP8-Dynamic/112/138
State: Ready
Required Resource:
    CPU: 20000
    Mem: 80000
    CacheMem: 0
    GPU Type: Any
    GPU Count: 1
    GPU vRam: 32000
    GPU Contexts: 0
Allocated Resource:
    Node Name: g8398d4
    CPU: 20000
    Memory: 78976
    Cache Memory: 1024
GPU:
    GPU Type: NVIDIA H100 80GB HBM3
    vRam: 32000
    Slot Size: 268435456
    Total Slot Count: 285
    Max Context Per GPU: 1

Tenant: public
Namespace: Trial
Pod Name: public/Trial/translategemma-27b-it-FP8-Dynamic/112/144
State: Standby
Required Resource:
    CPU: 20000
    Mem: 80000
    CacheMem: 0
    GPU Type: Any
    GPU Count: 1
    GPU vRam: 32000
    GPU Contexts: 0
Allocated Resource:
    Node Name: g8398d4
    CPU: 0
    Memory: 0
    Cache Memory: 0
GPU:
    GPU Type: NVIDIA H100 80GB HBM3
    vRam: 0
    Slot Size: 268435456
    Total Slot Count: 285
    Max Context Per GPU: 1
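The GPU entries above report a Slot Size of 268435456 bytes (256 MiB) and 285 total slots on the H100. A quick sketch of the implied slot arithmetic, assuming the vRam figures are in MiB (the page does not state the unit, but the numbers are consistent with it):

```python
# Slot accounting implied by the pod listing above.
# Assumption: the vRam figure (32000) is MiB; the unit is not stated on the page.
SLOT_SIZE_BYTES = 268_435_456                       # from the listing
SLOT_SIZE_MIB = SLOT_SIZE_BYTES // (1024 * 1024)    # 256 MiB per slot
TOTAL_SLOTS = 285                                   # slots exposed on the H100 80GB

# Total slot-managed vRAM on the card (the remainder of the 81920 MiB
# is presumably held back by the platform).
managed_vram_mib = TOTAL_SLOTS * SLOT_SIZE_MIB

# Slots consumed by this model's 32000 MiB reservation (ceiling division).
model_vram_mib = 32_000
slots_needed = -(-model_vram_mib // SLOT_SIZE_MIB)

print(SLOT_SIZE_MIB, managed_vram_mib, slots_needed)  # 256 72960 125
```

So the Ready pod's 32000 MiB reservation maps onto exactly 125 of the 285 slots, leaving the rest available for other contexts on the same card.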

Logs

Tenant  Namespace  Model Name                         Revision  Id   Node Name  Create Time          Exit Info  State
public  Trial      translategemma-27b-it-FP8-Dynamic  112       115  g8398d4    2026-03-01 23:07:06  None       log

Snapshot History

Tenant  Namespace  Model Name                         Revision  Node Name  State      Detail     Update Time
public  Trial      translategemma-27b-it-FP8-Dynamic  112       g8398d4    Scheduled  Scheduled  2026-03-01 22:58:59
public  Trial      translategemma-27b-it-FP8-Dynamic  112       g8398d4    Done       Done       2026-03-01 23:07:06
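The two snapshot rows above bracket the time from scheduling to completion for revision 112; the elapsed time can be computed directly from the timestamps:

```python
from datetime import datetime

# Timestamps taken from the Snapshot History rows above.
scheduled = datetime.fromisoformat("2026-03-01 22:58:59")
done = datetime.fromisoformat("2026-03-01 23:07:06")

elapsed = done - scheduled
print(elapsed)  # 0:08:07 -> snapshotting took 8 minutes 7 seconds
```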

Model Spec


Policy