While not all models support MCP, the official ShuttleAI model, shuttle-3.5, supports it, including during streaming. When using MCP tools, the chat completions API handles interactions with the remote server automatically, resolving tool calls and incorporating results into the final response without requiring manual intervention from your code.

What is an MCP Tool?
MCP, or Model Context Protocol, is an open-source standard introduced by Anthropic for connecting AI models to external systems, tools, and data sources in a secure, standardized way. It acts like a universal interface (often compared to USB-C for AI) that allows models to request and receive context or perform actions via remote servers. An MCP tool in ShuttleAI’s API specifies a connection to an MCP server, which hosts the actual tools (e.g., APIs for data retrieval or computations). The server handles tool discovery, execution, and responses. ShuttleAI provides its own official MCP server at https://mcp.shuttleai.com/mcp, which includes the following self-explanatory tools:
chat_completion: For generating chat responses using sub-models or specialized completions.
image_generation: For creating images based on prompts.
list_models: For retrieving available models.
model_analytics: For fetching user-specific model usage statistics (e.g., token counts, API calls).
To authenticate, pass your ShuttleAI API key either in the Authorization header (Bearer token style) or as a query parameter in the server_url (e.g., https://mcp.shuttleai.com/mcp?api_key=your_key_here).
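The two authentication styles can be sketched as follows. This is a minimal illustration: the server_url query-parameter form comes straight from the text above, while the headers field name is an assumption borrowed from common OpenAI-style MCP tool schemas and should be confirmed against the ShuttleAI reference.

```python
# Option 1: API key as a query parameter in server_url (documented above).
via_query_param = {
    "type": "mcp",
    "server_url": "https://mcp.shuttleai.com/mcp?api_key=your_key_here",
}

# Option 2: API key via the Authorization header (Bearer token style).
# NOTE: the "headers" field name is an assumption, not confirmed by the text.
via_header = {
    "type": "mcp",
    "server_url": "https://mcp.shuttleai.com/mcp",
    "headers": {"Authorization": "Bearer your_key_here"},
}
```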
How should an MCP Tool look?
Here’s a basic example of an MCP tool connecting to the official ShuttleAI MCP server. Note that server_url must start with https://, and you can optionally restrict allowed_tools or set require_approval to “never” for automatic execution without model confirmation.
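A minimal sketch of such a tool definition is shown below. The server_url, allowed_tools, and require_approval fields are named in the text above; the overall shape (a "type": "mcp" entry inside a tools array) is an assumption based on common OpenAI-style MCP tool schemas.

```python
# Minimal MCP tool definition pointing at the official ShuttleAI server.
# Only server_url, allowed_tools, and require_approval are confirmed field
# names; the rest of the shape is assumed.
tools = [
    {
        "type": "mcp",
        # Must start with https:// (API key passed as a query parameter here).
        "server_url": "https://mcp.shuttleai.com/mcp?api_key=your_key_here",
        # Optional: limit which server-side tools the model may call.
        "allowed_tools": ["model_analytics", "list_models"],
        # Optional: "never" executes tool calls without model confirmation.
        "require_approval": "never",
    }
]
```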
You can also mix MCP tools with standard function-calling tools in the same tools array for hybrid setups.
Example Usage
Let’s use the model_analytics tool as an example. This tool retrieves your model usage stats from the ShuttleAI MCP server.
First, set up the tools as above. Then, make a chat completions request:
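A request might be sketched as below, using only the Python standard library. The endpoint path (https://api.shuttleai.com/v1/chat/completions) is assumed to follow the usual OpenAI-compatible layout; confirm the exact base URL in the ShuttleAI API reference before use.

```python
import json
import urllib.request

API_KEY = "your_key_here"
# Assumed OpenAI-compatible endpoint path; verify against the ShuttleAI docs.
URL = "https://api.shuttleai.com/v1/chat/completions"

payload = {
    "model": "shuttle-3.5",
    "messages": [
        {"role": "user", "content": "How many tokens have I used this month?"}
    ],
    "tools": [
        {
            "type": "mcp",
            "server_url": f"https://mcp.shuttleai.com/mcp?api_key={API_KEY}",
            "allowed_tools": ["model_analytics"],
            "require_approval": "never",
        }
    ],
}

def get_completion() -> str:
    """POST the request and return the assistant's final message text."""
    req = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```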
The API will automatically interact with the MCP server (calling model_analytics if the model deems it necessary), process the results, and return a final completion. A sample response might look like this (actual stats depend on your usage):
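(The sketch below follows the standard OpenAI-compatible response shape; every field value is an illustrative placeholder, not real usage data.)

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "shuttle-3.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "This month you have made N API calls and used M tokens across your models."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
```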