World of Software
Exclusive: Together AI launches self-service GPU infrastructure

News Room
Last updated: 2025/09/09 at 1:47 PM
News Room Published 9 September 2025

Together Computer Inc., a startup building a cloud service optimized for artificial intelligence model development and deployment, today announced the general availability of Instant Clusters, a service that automates the provisioning of clusters of graphics processing units.

The company, which operates as Together AI, stated that its service allows customers to access GPU clusters, ranging from a single node with eight GPUs to large, multi-node systems with hundreds of processors, using a single application programming interface. It supports the latest Nvidia Corp. hardware, including Hopper and Blackwell GPUs, and is optimized for use cases such as distributed training and elastic inference.

The service has been in beta testing since early summer, and the GA release includes several updates based on user feedback, said Charles Zedlewski, chief product officer at Together AI. Among them are improved autoscaling features, the ability to extend reserved infrastructure dynamically and support for the infrastructure-as-code tools SkyPilot and Terraform.

“We added Terraform support so that people could build their own automations around these GPU clusters,” Zedlewski said. “We also added the ability to recreate clusters and remount them with the original data and storage.”

This remounting capability supports episodic training workloads, in which users pause and resume training jobs over extended periods, a pattern common in large-scale model development.

GPU cloud

Instant Clusters are essentially designed to emulate the user experience of conventional cloud infrastructure while handling the specific demands of AI workloads. Clusters come preloaded with drivers, schedulers and networking components, including GPU Operator, Nvidia Network Operator and InfiniBand interconnects. Configuring those components manually can take days, the company said.

Zedlewski said that because GPU infrastructure differs fundamentally from traditional CPU environments, setup and configuration have remained primarily manual processes. “The whole stack of virtualization and automation around GPU infrastructure is meaningfully different than the equivalent stack that we’ve known for a long time with x86 CPU infrastructure,” he said. Cloud computing providers have spent 20 years fine-tuning CPU infrastructure but are still learning how to optimize for AI.

Together AI said it performs hardware checks, stress tests and inter-node communication validations before making clusters available. “If you provisioned an eight-node, 64-GPU cluster, we basically pretest every node before it shows up in your environment,” Zedlewski said.
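
The gating idea described above, in which a node is released only after it passes every pre-flight check, can be illustrated with a minimal Python sketch. This is not Together AI's validation pipeline: the check names mirror the three categories mentioned in the article, while the `NodeReport` class, the `pretest` and `release` functions, the node names and the stand-in check results are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical check signature: takes a node name, returns True on pass.
Check = Callable[[str], bool]

GPUS_PER_NODE = 8  # implied by the article's "eight-node, 64-GPU cluster"


@dataclass
class NodeReport:
    node: str
    results: Dict[str, bool]

    @property
    def passed(self) -> bool:
        # A node is healthy only if every check passed.
        return all(self.results.values())


def pretest(nodes: List[str], checks: Dict[str, Check]) -> List[NodeReport]:
    """Run every check against every node and collect the results."""
    return [
        NodeReport(node, {name: check(node) for name, check in checks.items()})
        for node in nodes
    ]


def release(reports: List[NodeReport]) -> List[str]:
    """Return only the nodes that passed all checks."""
    return [report.node for report in reports if report.passed]


# Stand-in checks (assumptions for illustration): one node fails its stress test.
checks: Dict[str, Check] = {
    "hardware": lambda node: True,
    "stress": lambda node: node != "node-3",
    "inter_node_comm": lambda node: True,
}

nodes = [f"node-{i}" for i in range(8)]
reports = pretest(nodes, checks)
ready = release(reports)
usable_gpus = len(ready) * GPUS_PER_NODE
```

In this sketch the failing node is simply withheld; a real provisioning system would presumably replace or repair it before handing the cluster over, a step the article does not describe.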

Instant Clusters are optimized for use with Kubernetes, Slurm and other orchestration tools. Customers can lock in specific driver and Nvidia CUDA versions and reuse custom container images to simplify reproducibility across training and inference phases.

Storage can be mounted to clusters on demand. Though users must use Together AI’s POSIX-compliant parallel file systems, storage and compute can be scaled independently.

The service supports variable pricing based on usage duration, with hourly, daily and multimonth commitments available. A low-end Nvidia HGX H100 inference cluster ranges from $1.76 to $2.39 per hour, depending on the customer’s commitment term. Nvidia’s high-end HGX B200 costs $4 per hour with a long-term commitment and $5.50 per hour for on-demand usage.
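
To put the quoted rates in perspective, the short Python sketch below works out the relative gap between the lowest and highest hourly price for each configuration. The dollar figures come from the article; the billing unit is taken as reported ("per hour"), and the `RATES` table, the `discount` and `cost` helpers and the 720-hour month are assumptions made for illustration only.

```python
# Hourly rates quoted in the article, in USD per hour (unit as reported).
RATES = {
    "HGX H100": {"low": 1.76, "high": 2.39},
    "HGX B200": {"low": 4.00, "high": 5.50},  # long-term commitment vs. on-demand
}

HOURS_PER_MONTH = 720  # assumption: a 30-day month of continuous use


def discount(low: float, high: float) -> float:
    """Fraction saved by paying the low rate instead of the high rate."""
    return (high - low) / high


def cost(rate: float, hours: float) -> float:
    """Total cost of running at a given hourly rate for a number of hours."""
    return rate * hours


for name, rate in RATES.items():
    saving = discount(rate["low"], rate["high"])
    monthly_gap = cost(rate["high"], HOURS_PER_MONTH) - cost(rate["low"], HOURS_PER_MONTH)
    print(f"{name}: {saving:.1%} lower at the low rate, "
          f"${monthly_gap:,.2f} difference over {HOURS_PER_MONTH} hours")
```

On these figures the low rate works out to roughly a quarter less than the high rate for both configurations, though actual bills depend on how hours are metered, which the article does not specify.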

Zedlewski said most organizations would struggle to match the infrastructure’s cost-efficiency by building in-house: “I’d be very surprised if anyone attempts to roll their own,” he said.


