AI21 Labs Ltd. today introduced Maestro, a software system that promises to boost the output quality of large language models significantly.
Israel-based AI21 is an artificial intelligence startup backed by $336 million in funding from Nvidia Corp., Google LLC and other investors. It provides a series of enterprise-focused LLMs called Jamba. The models can process prompts with up to 256,000 tokens and support retrieval-augmented generation, or RAG, a machine learning technique that allows an AI to analyze information not included in its training dataset.
Before enterprises deploy an LLM in production, they take steps to reduce the risk of output quality issues. The process often involves creating a software workflow that automatically checks prompt responses for errors. Such workflows can significantly reduce the risk of hallucinations, but they’re difficult to create and maintain.
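A validation workflow of this kind might look like the following minimal sketch. Everything here is illustrative: the rule names and the crude source-citation check are assumptions, not any particular vendor's implementation.

```python
# Minimal sketch of an output-validation workflow: each rule inspects a
# model response and returns a list of issues; a response that trips any
# rule would be flagged for review. All names are illustrative.

def check_unsupported_claims(response: str, sources: list[str]) -> list[str]:
    """Flag sentences that mention no retrieved source (a crude hallucination check)."""
    issues = []
    for sentence in response.split(". "):
        if sentence and not any(src.lower() in sentence.lower() for src in sources):
            issues.append(f"unsupported: {sentence!r}")
    return issues

def check_length(response: str, max_words: int = 200) -> list[str]:
    """Flag responses that exceed a word budget."""
    return ["response too long"] if len(response.split()) > max_words else []

def validate(response: str, sources: list[str]) -> list[str]:
    """Run every rule and collect the issues found."""
    return check_unsupported_claims(response, sources) + check_length(response)

print(validate("Revenue grew 12% per the Q3 report.", sources=["Q3 report"]))  # []
```

Real pipelines chain many such rules, which is exactly why they become hard to build and maintain by hand.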
AI21’s newly debuted Maestro platform is designed to address that challenge. The platform, which is described as an AI planning and orchestration system, reduces the amount of work involved in mitigating LLM output errors. It also promises to ease several related tasks.
To use Maestro, workers have to provide a prompt along with a set of requirements that should be met while the prompt is being processed. For example, a user could specify that the cost of generating an LLM response shouldn’t exceed a certain threshold. AI21 says that Maestro automatically applies those customer-provided requirements and thereby reduces the need for manual coding.
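Declarative requirements of the kind the article describes could be expressed roughly as follows. This is a hypothetical illustration; AI21 has not published Maestro's actual interface, and every name below is an assumption.

```python
# Hypothetical sketch of customer-provided run requirements: each one is
# a named predicate over a summary of the run. Names are illustrative.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Requirement:
    name: str
    check: Callable[[dict], bool]  # returns True if the run satisfies it

requirements = [
    Requirement("cost_cap", lambda run: run["cost_usd"] <= 0.05),
    Requirement("grounded", lambda run: run["cited_sources"] >= 1),
]

def violations(run: dict) -> list[str]:
    """Return the names of requirements the run fails to meet."""
    return [r.name for r in requirements if not r.check(run)]

print(violations({"cost_usd": 0.03, "cited_sources": 2}))  # []
```

The appeal of this style is that the orchestrator, not the customer, writes the enforcement code: the user only states the constraints.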
When it receives a complex prompt, Maestro breaks down the task into substeps. Simplifying tasks in this manner has been shown to improve the quality of LLM responses. After decomposing the prompt, Maestro runs simulations to identify the most efficient way of entering the request into an LLM and producing an accurate answer.
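The decomposition step can be sketched as below. Both the planner and the executor are stubs standing in for model calls; in a real system an LLM would generate the plan and answer each substep.

```python
# Sketch of task decomposition: a complex request is split into substeps
# that are executed in order, each seeing earlier results. The planner
# and executor are stubs; names and steps are illustrative.

def plan(prompt: str) -> list[str]:
    # Stub planner: a real system would ask a model to decompose the task.
    return [
        f"Identify the key entities in: {prompt}",
        "Gather relevant facts about each entity",
        "Draft an answer grounded in those facts",
    ]

def run_step(step: str, context: list[str]) -> str:
    # Stub executor standing in for a model call on one substep.
    return f"result of ({step})"

def solve(prompt: str) -> list[str]:
    results: list[str] = []
    for step in plan(prompt):
        results.append(run_step(step, results))
    return results

print(len(solve("Summarize our Q3 contract obligations")))  # 3
```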
AI21 says that the platform considers multiple processing approaches and picks the one with the highest likelihood of delivering a correct LLM response. If necessary, Maestro can also scale up inference-time compute. That’s a method of improving reasoning-optimized LLMs’ accuracy by increasing the amount of time and infrastructure they spend on a task.
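One common way to pick the response most likely to be correct is self-consistency voting: sample several candidate answers and keep the one most of them agree on. The article does not say this is Maestro's exact method, so treat the sketch below as a generic illustration of the idea.

```python
# Self-consistency voting: majority vote over repeated samples of the
# same question. The sample list stands in for repeated model calls.
from collections import Counter

def best_answer(candidates: list[str]) -> str:
    """Return the most common answer among the sampled candidates."""
    return Counter(candidates).most_common(1)[0][0]

samples = ["42", "41", "42", "42", "43"]  # stand-ins for model outputs
print(best_answer(samples))  # "42"
```

Sampling more candidates is one concrete form of scaling inference-time compute: accuracy improves at the cost of more model calls per prompt.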
After a prompt response is generated, Maestro checks it for errors. The system also creates a log that displays every step of the process through which the prompt response was generated. Workers can check this log to validate the accuracy of LLM output.
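A step-by-step run log of the kind described could be as simple as an append-only list of timestamped entries. The field and step names below are assumptions for illustration, not Maestro's actual log format.

```python
# Sketch of an auditable run log: every stage of answer generation
# appends a timestamped entry a reviewer can inspect afterward.
import time

log: list[dict] = []

def record(step: str, detail: str) -> None:
    """Append one audit entry for a stage of the run."""
    log.append({"ts": time.time(), "step": step, "detail": detail})

record("decompose", "split prompt into 3 substeps")
record("retrieve", "fetched 2 source documents")
record("generate", "drafted answer from retrieved facts")
record("verify", "all output checks passed")

print([entry["step"] for entry in log])
```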
In a series of internal tests, AI21 applied Maestro to several popular LLMs. It determined that the system boosts the accuracy of AI models by up to 50% in some cases. According to AI21, that means reasoning-optimized LLMs such as o3-mini can answer more than 95% of prompts correctly when they’re connected to Maestro.
The company envisions customers applying the system to a range of use cases. It says that Maestro can make LLMs better at analyzing complex documents and answering user questions. Additionally, the system lends itself to automating repetitive business chores such as data entry.
“Mass adoption of AI by enterprises is the key to the next industrial revolution,” said AI21 co-Chief Executive Officer Ori Goshen. “AI21’s Maestro is the first step toward that future – moving beyond the unpredictability of available solutions to deliver AI that is reliable at scale.”
Maestro is currently in early access. AI21 plans to make the platform generally available later this year.
Image: AI21