Together Computer Inc. today launched a major update to its Fine-Tuning Platform aimed at making it cheaper and easier for developers to adapt open-source large language models over time.
The startup, which does business as Together AI, operates a public cloud optimized for AI model development. The new features support fine-tuning from within a browser, bypassing the need to install a Python software development kit or make calls to an application programming interface.
The company also added support for direct preference optimization fine-tuning and the ability to start tuning jobs from the results of previous runs with a single command. It also adjusted pricing to lower training costs.
Together AI said the updates reflect its belief that AI models shouldn’t be static but should grow alongside the applications they serve. The browser-based interface allows developers to launch fine-tuning jobs without writing any code. Previously, such tasks required extra setup and technical know-how. Developers can upload datasets, define training parameters and track experiments, lowering barriers to continuous fine-tuning.
“While there’s no inherent quality improvement, since the underlying method is identical to fine-tuning via the API, the browser-based flow eliminates the need for scripting and streamlines the entire process into an intuitive, no-code experience,” said Anirudh Jain, fine-tuning product lead at Together AI. “This makes fine-tuning approachable to nontechnical users and saves around 50% of the time compared to the manual API approach.” The Python SDK and API are still available but not necessary, he said.
Preference-based training
Direct preference optimization is a method of training language models on preference data, in which the model is shown both a preferred and a less-desired response to the same prompt. Instead of mimicking a fixed answer, the model learns to favor the preferred response over the rejected one, shifting its output probabilities toward choices that align with human feedback.
“Supervised fine-tuning helps the model learn what to say, while DPO teaches it what not to say,” Jain said. SFT is preferred when training data consists of labeled input/output pairs; DPO is the better fit when the data contains preferences gathered from human raters or A/B tests.
Unlike traditional reinforcement learning techniques, DPO doesn’t require building a separate reward model, making it simpler, faster and more stable to implement. Developers can fine-tune models to align more closely with the way users interact with applications to improve accuracy and trustworthiness.
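For the technically curious, the core of DPO fits in a few lines. The following is a minimal sketch of the DPO loss from the original paper (Rafailov et al., 2023), not Together AI's implementation; the variable names are illustrative.

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct preference optimization loss (Rafailov et al., 2023).

    Each argument is a tensor of summed log-probabilities that a model
    assigns to the chosen (preferred) or rejected response for a batch
    of prompts. `beta` controls how far the policy may drift from the
    frozen reference model.
    """
    # Log-ratio of the trained policy vs. the reference model,
    # computed separately for the preferred and rejected responses.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)

    # Maximize the margin between preferred and rejected responses;
    # no separate reward model is needed.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```

Because the loss works directly on the model's own log-probabilities, the reward signal is implicit, which is what makes DPO simpler and more stable than classic reinforcement learning from human feedback.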
Continued training enables developers to resume fine-tuning from a previously trained model checkpoint. That feature is useful for refining models over time or running multi-stage training workflows that combine methods like instruction tuning and preference optimization. It's invoked by referencing the job ID of an earlier training run, so a new job picks up where the previous one left off.
“This is significantly more efficient and cost-effective, allowing for faster iteration and model improvement,” Jain said.
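In practice, a two-stage workflow of that kind could look something like the sketch below. This is a hypothetical illustration that assumes a Together-style Python client; the method and parameter names, model identifier and file IDs are assumptions for readability, not the documented SDK surface.

```python
# Hypothetical sketch of continued training; names are illustrative
# assumptions, not the documented Together SDK surface.
from together import Together

client = Together()  # assumes TOGETHER_API_KEY is set in the environment

# First stage: instruction tuning on labeled input/output pairs.
first_job = client.fine_tuning.create(
    model="meta-llama/Meta-Llama-3-8B",
    training_file="file-abc123",        # illustrative file ID
)

# Second stage: resume from the finished job's weights and apply DPO,
# referencing the earlier job ID rather than retraining from scratch.
second_job = client.fine_tuning.create(
    model="meta-llama/Meta-Llama-3-8B",
    training_file="file-def456",        # preference-pair dataset
    from_checkpoint=first_job.id,       # assumed parameter name
)
```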
Another enhancement to its platform allows developers to assign different weights to messages in conversational data, essentially downplaying or ignoring certain responses without removing them from the training context entirely. A new cosine learning rate scheduler offers more flexibility and fine-grained control over training dynamics.
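To make those two features concrete, here is a small sketch showing a per-message weight mask applied to a token-level loss, alongside PyTorch's built-in cosine annealing schedule. The per-message "weight" field is an assumed schema for illustration, not Together AI's documented data format.

```python
import torch
import torch.nn.functional as F
from torch.optim.lr_scheduler import CosineAnnealingLR

# Illustrative conversational example: a weight of 0 keeps a message in
# the training context but excludes it from the loss. The "weight"
# field is an assumed schema, not Together AI's documented format.
conversation = [
    {"role": "user", "content": "What's our refund policy?", "weight": 0},
    {"role": "assistant", "content": "Refunds are available within 30 days.", "weight": 1},
]

def weighted_token_loss(logits, targets, token_weights):
    """Cross-entropy where each token's contribution is scaled by the
    weight of the message it belongs to; zero-weight tokens drop out."""
    per_token = F.cross_entropy(logits, targets, reduction="none")
    return (per_token * token_weights).sum() / token_weights.sum().clamp(min=1)

# Cosine learning rate schedule: the rate decays smoothly from its
# initial value toward eta_min over T_max steps.
model = torch.nn.Linear(16, 16)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=1000, eta_min=1e-6)
```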
Updates to the platform’s data preprocessing engine have improved performance by up to 32% for large-scale training jobs and 17% for smaller ones, the company said.
Together AI is also now offering pay-as-you-go pricing with no minimums in an effort to make it easier for small teams and independent developers to experiment with customized LLMs. Prices vary depending on the model size and training method.
The platform currently supports fine-tuning for popular open-weight models, including Llama 3, Gemma and DeepSeek-R1 variants. The company said it plans to support larger models such as Llama 4 and future DeepSeek versions.
Image: News/DALL-E