Together AI launches self-service AI clusters

Instant Clusters automates the provisioning of GPU clusters


Together AI has launched a service automating the provisioning of GPU clusters for customers.


Dubbed Together Instant Clusters, the offering is now generally available and enables customers to access clusters from a single node with eight GPUs to larger systems with hundreds of GPUs.


The offering supports Nvidia's Hopper and Blackwell GPU generations.


According to Together AI, the offering enables companies to manage sudden increases in demand, with Instant Clusters able to be provisioned in minutes and automated like "the rest of the cloud."


The GPU clusters are wired with Nvidia Quantum-2 InfiniBand between nodes, and with Nvidia NVLink and NVLink Switch within each node. They are optimized for use with Kubernetes, Slurm, and other orchestration tools.
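For clusters optimized for Slurm, a standard multi-node batch script is typically all that is needed to spread a training job across nodes. The sketch below is generic and illustrative; the job name, time limit, and training command are assumptions, not Together AI specifics:

```shell
#!/bin/bash
#SBATCH --job-name=llm-train        # illustrative job name
#SBATCH --nodes=4                   # four nodes, eight GPUs each
#SBATCH --gpus-per-node=8
#SBATCH --ntasks-per-node=8         # one task (rank) per GPU
#SBATCH --time=24:00:00             # matches the bursty 24-48h pattern

# srun launches one process per GPU across all nodes; inter-node
# traffic rides the InfiniBand fabric, intra-node traffic uses NVLink.
srun python train.py --config config.yaml
```

Since a Slurm script is submitted with `sbatch`, it only runs inside a scheduled allocation; the `#SBATCH` directives are what the scheduler reads to size that allocation.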


“The limiter isn’t just GPU peak FLOPs; it’s how fast we can get a GPU cluster to start. If we can spin up a clean Nvidia Hopper GPU or Nvidia Blackwell GPU cluster with good networking in minutes, our researchers can spend more cycles on data, model architecture, system design, and kernels. That’s how we optimize research velocity,” said Tri Dao, Together AI chief scientist.


Customers of the Instant Clusters offering include Fractal AI and Latent Health.


Kunal Singh, lead data scientist at Fractal AI, said: “As an AI Lab, we regularly train a range of models — from large language models to multimodal systems — and our workloads are highly bursty. Together Instant Clusters let us spin up large GPU clusters on demand for 24–48 hours, run intensive training jobs, and then scale back down just as quickly. The ability to get high-performance, interconnected Nvidia GPUs without the delays of procurement or setup has been a game-changer for our team’s productivity and research velocity.”


Speaking to SiliconAngle, Together AI's chief product officer, Charles Zedlewski, said that the company had also added support for infrastructure-as-code tools Skypilot and Terraform.


“We added Terraform support so that people could build their own automations around these GPU clusters. We also added the ability to recreate clusters and remount them with the original data and storage,” said Zedlewski.
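With Terraform support, provisioning a cluster can be expressed declaratively and kept in version control. The fragment below is a hypothetical sketch only: the provider address, resource type, and attribute names are illustrative assumptions, not Together AI's published provider schema.

```hcl
# Hypothetical sketch: names below are assumptions, not Together AI's
# documented Terraform schema.
terraform {
  required_providers {
    together = {
      source = "togethercomputer/together" # assumed provider address
    }
  }
}

resource "together_instant_cluster" "training" {
  name       = "burst-training"
  gpu_type   = "h100" # Hopper-class nodes
  node_count = 4      # 4 nodes x 8 GPUs = 32 GPUs
}
```

Declaring the cluster as a resource is what enables the automation Zedlewski describes: `terraform apply` brings a cluster up for a burst of training, and `terraform destroy` tears it down when the job completes.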


He also noted that Together AI performs hardware checks, stress tests, and inter-node communication validations before making clusters available.


Together AI was founded in 2022. Self-described as an AI acceleration cloud, the company raised $305 million in its Series B funding round in February of this year. In February 2024, it raised $100 million at a valuation of more than $1 billion.


The AI cloud announced a partnership with Hypertec in November 2024 to co-build a cluster of 36,000 Nvidia GB200 NVL72 GPUs, and in February 2025 launched Together GPU Clusters powered by Nvidia Blackwell GPUs. The company also provides access to H200, H100, and A100 GPUs, all of which are interconnected with InfiniBand and NVLink. Together AI says it can scale from 16 to more than 100,000 GPUs.


Source: DCD
