
Addressing enterprise challenges
The software provides two main services, according to SoftBank. The Kubernetes-as-a-Service component automates the stack from BIOS and RAID settings through the OS, GPU drivers, networking, Kubernetes controllers, and storage, the company said.
It reconfigures physical connectivity using Nvidia NVLink and memory allocation as users create, update, or delete clusters, according to the announcement. The system allocates nodes based on GPU proximity and NVLink domain configuration to reduce latency, SoftBank said.
Enterprises currently face complex GPU cluster provisioning, Kubernetes lifecycle management, inference scaling, and infrastructure tuning challenges that require deep expertise, according to Dai.
SoftBank’s automated approach addresses these pain points by handling BIOS-to-Kubernetes configuration, optimizing GPU interconnects, and abstracting inference into API-based services, he said. This allows teams to focus on model development rather than infrastructure maintenance, Dai said.
The Inference-as-a-Service component lets users deploy inference services by selecting large language models without configuring Kubernetes or underlying infrastructure, according to the company. It provides OpenAI-compatible APIs and scales across multiple nodes on platforms including the GB200 NVL72, SoftBank said.
The software includes tenant isolation through encrypted communications, automated system monitoring and failover, and APIs for connecting to portal, customer management, and billing systems, according to the announcement.





















