
“Switching is essentially a simpler operation. You just kind of send a packet or not,” Ayyar explained. “Routing is a more complex operation. You tell the packet where to go and what to do. You have a lot more richness and policy in what you do on the routing front.”
That policy-rich routing foundation is what Arrcus is now applying to AI inference.
The inference problem and how AINF addresses it
As AI workloads shift from centralized training to distributed inference, the network faces a different class of demands.
Inference nodes are geographically dispersed and must satisfy simultaneous constraints around latency, throughput, power capacity, data residency, and cost. Those constraints vary by location and change in real time, and traditional hardware-defined networking was not designed to handle them dynamically.
“These inference nodes are now going to become super critical in understanding exactly what the constraints are at those inference points,” Ayyar said. “Do you have a power constraint? Do you have a latency constraint? Do you have a throughput constraint? And if you do, how are you going to direct and steer your traffic?”
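To make that constraint picture concrete, here is a minimal sketch of how a per-site snapshot of those constraints might be represented and re-checked as conditions change. The field names, units, and thresholds are illustrative assumptions, not Arrcus's actual data model.

```python
from dataclasses import dataclass, replace

# Illustrative sketch only: field names and units are assumptions, not Arrcus's schema.
@dataclass(frozen=True)
class SiteConstraints:
    site: str
    power_headroom_kw: float          # spare power capacity at the location
    latency_to_user_ms: float         # measured latency from this site to the requesting region
    available_throughput_gbps: float
    data_residency_region: str        # jurisdiction where requests may be processed
    cost_per_1k_tokens_usd: float

def meets(c: SiteConstraints, max_latency_ms: float, min_gbps: float, region: str) -> bool:
    """Check a site snapshot against a request's hard constraints."""
    return (c.power_headroom_kw > 0
            and c.latency_to_user_ms <= max_latency_ms
            and c.available_throughput_gbps >= min_gbps
            and c.data_residency_region == region)

# Constraints change in real time: the same site can flip from eligible to ineligible.
fra = SiteConstraints("fra-1", 80.0, 24.0, 40.0, "EU", 0.012)
print(meets(fra, max_latency_ms=50, min_gbps=10, region="EU"))        # True
fra_later = replace(fra, power_headroom_kw=0.0)                       # power cap reached
print(meets(fra_later, max_latency_ms=50, min_gbps=10, region="EU"))  # False
```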
AINF addresses this by introducing a policy abstraction layer that sits between Kubernetes-based orchestration and the underlying silicon. Models expose their requirements through an API, declaring the parameters they need. Those requirements flow down to the routing layer, which steers traffic accordingly.
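As a rough illustration of that flow, the sketch below shows a model declaring its requirements and a routing-layer function choosing among candidate sites. The function names, fields, and values are hypothetical placeholders, not the schema AINF actually exposes.

```python
# Hypothetical sketch of the flow described above: a model declares its requirements
# once through a policy API, and the routing layer re-evaluates candidate sites
# whenever telemetry changes, steering traffic to a site that satisfies the policy.

def declare_model_requirements() -> dict:
    """What a model might expose through the policy API: hard constraints plus a soft preference."""
    return {
        "model": "llm-chat-70b",          # illustrative model name
        "max_latency_ms": 50,             # hard constraint
        "min_throughput_gbps": 10,        # hard constraint
        "data_residency": "EU",           # hard constraint
        "optimize_for": "cost",           # soft preference used to rank feasible sites
    }

def steer(requirements: dict, site_telemetry: list[dict]) -> str | None:
    """Pick the cheapest site that meets every hard constraint; None if no site qualifies."""
    feasible = [
        s for s in site_telemetry
        if s["latency_ms"] <= requirements["max_latency_ms"]
        and s["throughput_gbps"] >= requirements["min_throughput_gbps"]
        and s["region"] == requirements["data_residency"]
        and s["power_headroom_kw"] > 0
    ]
    if not feasible:
        return None
    # Rank the survivors by the declared soft preference (here: lowest cost).
    return min(feasible, key=lambda s: s["cost_per_1k_tokens_usd"])["site"]

# Telemetry snapshots would come from the network in real time; these are made up.
telemetry = [
    {"site": "fra-1", "region": "EU", "latency_ms": 22, "throughput_gbps": 40,
     "power_headroom_kw": 120, "cost_per_1k_tokens_usd": 0.012},
    {"site": "ams-2", "region": "EU", "latency_ms": 35, "throughput_gbps": 15,
     "power_headroom_kw": 0,   "cost_per_1k_tokens_usd": 0.009},  # no spare power
]
print(steer(declare_model_requirements(), telemetry))  # -> "fra-1"
```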