In this post, we’ll demonstrate how to orchestrate Model Context Protocol (MCP) servers using LlamaIndex.TS in a real-world TypeScript application. We’ll use the Azure AI Travel Agents project as our base, focusing on best practices for secure, scalable, and maintainable orchestration. Feel free to star the repo to get notified of the latest changes.
If you are interested in an overview of the Azure AI Travel Agents project, please read our announcement blog!
Why LlamaIndex.TS and MCP?
- LlamaIndex.TS provides a modular, composable framework for building LLM-powered applications in TypeScript.
- MCP enables tool interoperability and streaming, making it ideal for orchestrating multiple AI services.
Project Structure
The LlamaIndex.TS orchestrator lives in src/api/src/orchestrator/llamaindex, with provider modules for different LLM backends and MCP clients. We currently support:
- Azure OpenAI
- Docker Models
- Azure AI Foundry Local
- GitHub Models
- Ollama
Feel free to explore the codebase and suggest more providers.
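To give a feel for how one of these provider modules could be picked at startup, here is a minimal sketch. The provider keys, the `LlmProviderFactory` type, and the `selectProvider` helper are all hypothetical names chosen for illustration; the project's actual provider wiring may look different.

```typescript
// Hypothetical sketch: these names are assumptions, not the project's API.
type ProviderKey =
  | "azure-openai"
  | "docker-models"
  | "foundry-local"
  | "github-models"
  | "ollama";

// A factory returns whatever LLM object the orchestrator expects.
type LlmProviderFactory<T> = () => Promise<T>;

function selectProvider<T>(
  key: string | undefined,
  registry: Record<ProviderKey, LlmProviderFactory<T>>,
): LlmProviderFactory<T> {
  const supported = Object.keys(registry);
  // Fail fast with a readable message instead of returning undefined.
  if (key === undefined || supported.indexOf(key) === -1) {
    throw new Error(
      `Unknown LLM provider "${key}". Supported: ${supported.join(", ")}`,
    );
  }
  return registry[key as ProviderKey];
}
```

Failing at startup on an unknown key keeps a misconfigured deployment from surfacing later as a confusing runtime error inside an agent.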
Setting Up the MCP Client with Streamable HTTP
To interact with MCP servers without using LlamaIndex.TS, we can write a custom implementation using the StreamableHTTPClientTransport for efficient, authenticated, and streaming communication.
// filepath: src/api/src/mcp/mcp-http-client.ts
import EventEmitter from 'node:events';
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StreamableHTTPClientTransport } from '@modelcontextprotocol/sdk/client/streamableHttp.js';

export class MCPClient extends EventEmitter {
  private client: Client;
  private transport: StreamableHTTPClientTransport;

  constructor(serverName: string, serverUrl: string, accessToken?: string) {
    super(); // required before touching `this` in a subclass
    // Send the bearer token (if any) with every HTTP request to the server.
    this.transport = new StreamableHTTPClientTransport(new URL(serverUrl), {
      requestInit: {
        headers: accessToken ? { Authorization: `Bearer ${accessToken}` } : {},
      },
    });
    // The version string is a placeholder for this example.
    this.client = new Client({ name: serverName, version: '1.0.0' });
  }

  async connect() {
    // connect() starts the transport and performs the MCP initialization handshake.
    await this.client.connect(this.transport);
  }

  async listTools() {
    return this.client.listTools();
  }

  async callTool(name: string, toolArgs: Record<string, unknown>) {
    return this.client.callTool({ name, arguments: toolArgs });
  }

  async close() {
    await this.client.close();
  }
}
Best Practice: Always pass the Authorization header for secure access, as shown above.
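If several clients need the same header logic, it can be pulled into a small helper. The sketch below is one possible shape; `buildAuthHeaders` is a hypothetical name, and trimming the token is our own assumption about what counts as "no token".

```typescript
// Hypothetical helper: builds the headers object for an MCP HTTP request.
function buildAuthHeaders(accessToken?: string): Record<string, string> {
  const token = accessToken?.trim();
  // Omit the header entirely rather than sending "Bearer undefined"
  // or a header with an empty token.
  return token ? { Authorization: `Bearer ${token}` } : {};
}
```

Returning an empty object when no token is set keeps the call site free of conditionals, since the result can be passed as the headers value either way.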
Suppose you want to get destination recommendations from the MCP server:
import { MCPClient } from '../../mcp/mcp-http-client';

const DESTINATION_SERVER_URL = process.env.MCP_DESTINATION_RECOMMENDATION_URL!;
const ACCESS_TOKEN = process.env.MCP_DESTINATION_RECOMMENDATION_ACCESS_TOKEN;

const mcpClient = new MCPClient('destination-recommendation', DESTINATION_SERVER_URL, ACCESS_TOKEN);
await mcpClient.connect();

const tools = await mcpClient.listTools();
console.log('Available tools:', tools);

const result = await mcpClient.callTool('getDestinationsByPreferences', {
  activity: 'CULTURAL',
  budget: 'MODERATE',
  season: 'SUMMER',
  familyFriendly: true,
});
console.log('Recommended destinations:', result);

await mcpClient.close();
Tip: Always close the MCP client gracefully to release resources.
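One way to make sure the client is closed even when a tool call throws is to wrap the work in try/finally. The sketch below assumes only that the client exposes `connect()` and `close()`, as the MCPClient class above does; `withClient` and `Connectable` are hypothetical names.

```typescript
// Minimal shape this helper relies on (an assumption for illustration).
interface Connectable {
  connect(): Promise<void>;
  close(): Promise<void>;
}

async function withClient<C extends Connectable, R>(
  client: C,
  work: (client: C) => Promise<R>,
): Promise<R> {
  // If connect() itself fails, there is nothing to release yet.
  await client.connect();
  try {
    return await work(client);
  } finally {
    // Runs on success and on failure, so the connection is always released.
    await client.close();
  }
}
```

With this helper, the example above could be written as `await withClient(mcpClient, (c) => c.listTools())`, and the error from a failed call still propagates to the caller after the client is closed.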
Orchestrating LLMs and MCP Tools With LlamaIndex.TS
The mcp client from @llamaindex/tools makes it easy to connect to MCP servers and retrieve tool definitions dynamically. Below is a sample from the project’s orchestrator setup, showing how to use mcp to fetch tools and create agents for each MCP server.
Here is an example of what an mcpServerConfig object might look like:
const mcpServerConfig = {
  url: "http://localhost:5007", // MCP server endpoint
  accessToken: process.env.MCP_ECHO_PING_ACCESS_TOKEN, // Secure token from env
  name: "echo-ping", // Logical name for the server
};
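Because the token comes from the environment, it can be worth checking a config like this before handing it to a client. The following is a minimal sketch; the `McpServerConfig` interface mirrors the example fields, while `validateMcpServerConfig` and its rules are our own assumptions rather than anything the project or library enforces.

```typescript
// Shape mirrors the example config; the validation rules are assumptions.
interface McpServerConfig {
  url: string;
  name: string;
  accessToken?: string;
}

// Returns a list of human-readable problems; an empty list means "looks fine".
function validateMcpServerConfig(config: McpServerConfig): string[] {
  const problems: string[] = [];
  if (!config.name.trim()) problems.push("name must not be empty");
  try {
    const { protocol } = new URL(config.url);
    if (protocol !== "http:" && protocol !== "https:") {
      problems.push(`unsupported protocol ${protocol}`);
    }
  } catch {
    problems.push(`url is not a valid URL: ${config.url}`);
  }
  // An unset env variable shows up here as undefined.
  if (!config.accessToken) problems.push("accessToken is missing");
  return problems;
}
```

Collecting every problem rather than throwing on the first one makes it easier to fix a broken configuration in a single pass.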
You can then use this config with the mcp client:
import { mcp } from "@llamaindex/tools";
import { agent, multiAgent, ToolCallLLM } from "llamaindex";

// ...existing code...

const mcpServerConfig = mcpToolsConfig["echo-ping"].config;
const tools = await mcp(mcpServerConfig).tools();
const echoAgent = agent({
  name: "EchoAgent",
  systemPrompt:
    "Echo back the received input. Do not respond with anything else. Always call the tools.",
  tools,
  llm,
  verbose,
});
agentsList.push(echoAgent);
handoffTargets.push(echoAgent);
toolsList.push(...tools);

// ...other code...

const travelAgent = agent({
  name: "TravelAgent",
  systemPrompt:
    "Acts as a triage agent to determine the best course of action for the user's query. If you cannot handle the query, please pass it to the next agent. If you can handle the query, please do so.",
  tools: [...toolsList],
  canHandoffTo: handoffTargets
    .map((target) => target.getAgents().map((agent) => agent.name))
    .flat(),
  llm,
  verbose,
});
agentsList.push(travelAgent);

// Create the multi-agent workflow
return multiAgent({
  agents: agentsList,
  rootAgent: travelAgent,
  verbose,
});
You can repeat this pattern to compose a multi-agent workflow where each agent is powered by tools discovered at runtime from the MCP server. See the project for a full example.
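The canHandoffTo expression in the sample flattens the agent names of every handoff target into one list. As a standalone sketch of that step, the helper below uses hypothetical minimal interfaces (`NamedAgent`, `HandoffTarget`) so it runs without the llamaindex package; skipping duplicate names is our own addition, not something the sample does.

```typescript
// Hypothetical minimal shapes so the sketch runs without llamaindex installed.
interface NamedAgent {
  name: string;
}
interface HandoffTarget {
  getAgents(): NamedAgent[];
}

function collectHandoffNames(targets: HandoffTarget[]): string[] {
  const names: string[] = [];
  for (const target of targets) {
    for (const agent of target.getAgents()) {
      // Skip duplicates so each agent name is listed once.
      if (names.indexOf(agent.name) === -1) names.push(agent.name);
    }
  }
  return names;
}
```

Keeping this in a named function also makes the triage agent's configuration easier to unit test as more MCP-backed agents are added.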
The llm instance passed to each agent then orchestrates calls to MCP tools, such as itinerary planning or destination recommendation.
Security Considerations
- Always use access tokens and secure headers.
- Never hardcode secrets; use environment variables and secret managers.
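As a small illustration of the second point, required settings can be read through a helper that fails loudly when a variable is missing, instead of relying on a non-null assertion. `requireEnv` is a hypothetical helper; the environment is passed in as an argument so the function stays easy to test.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the variable's value or throws if it is unset or blank.
function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    // Report only the variable name, never a value, to keep secrets out of logs.
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage in Node.js (shown as a comment so the sketch has no runtime dependencies):
// const url = requireEnv(process.env, "MCP_DESTINATION_RECOMMENDATION_URL");
```

The same function works with values loaded from a secret manager, as long as they are exposed as a plain key-value object.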
We encourage you to join our Azure AI Foundry Developer Community to share your experiences, ask questions, and get support.
Conclusion
By combining LlamaIndex.TS with MCP’s Streamable HTTP transport, you can orchestrate powerful, secure, and scalable AI workflows in TypeScript. The Azure AI Travel Agents project provides a robust template for building your own orchestrator.
References: