# Streaming

> Receive AI responses token-by-token in real time using server-sent events.

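Streaming lets you display an AI response as it is generated instead of waiting for the full response. In chat interfaces this cuts perceived latency, because users can start reading as soon as the first tokens arrive.

ShuttleAI streams over **Server-Sent Events (SSE)**, the same streaming format OpenAI uses, so the official OpenAI SDKs work when pointed at the ShuttleAI base URL.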

## Basic streaming

Set `stream: true` in the request body (`stream=True` in the Python SDK) to enable streaming:

<CodeGroup>
  ```python Python
  from openai import OpenAI

  client = OpenAI(
      api_key="shuttle-xxx",
      base_url="https://api.shuttleai.com/v1"
  )

  stream = client.chat.completions.create(
      model="shuttleai/auto",
      messages=[{"role": "user", "content": "Write a short poem about AI."}],
      stream=True
  )

  for chunk in stream:
      content = chunk.choices[0].delta.content
      if content:
          print(content, end="", flush=True)
  ```

  ```javascript Node.js
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: "shuttle-xxx",
    baseURL: "https://api.shuttleai.com/v1",
  });

  const stream = await client.chat.completions.create({
    model: "shuttleai/auto",
    messages: [{ role: "user", content: "Write a short poem about AI." }],
    stream: true,
  });

  for await (const chunk of stream) {
    const content = chunk.choices[0]?.delta?.content;
    if (content) process.stdout.write(content);
  }
  ```

  ```bash cURL
  curl -N https://api.shuttleai.com/v1/chat/completions \
    -H "Authorization: Bearer shuttle-xxx" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "shuttleai/auto",
      "messages": [{"role": "user", "content": "Write a short poem about AI."}],
      "stream": true
    }'
  ```
</CodeGroup>
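
Most applications also need the complete message once streaming finishes, for example to store it in the chat history. The following is a minimal sketch of one way to do that, assuming chunks have the shape used in the examples above. The helper name `accumulate_stream` and its `on_token` callback are illustrative and not part of any SDK.

```python
from typing import Callable, Iterable, Optional, Tuple


def accumulate_stream(
    chunks: Iterable,
    on_token: Optional[Callable[[str], None]] = None,
) -> Tuple[str, Optional[str]]:
    """Collect streamed deltas into the full text and the last finish_reason.

    `chunks` is any iterable of chat.completion.chunk objects, such as the
    `stream` returned by client.chat.completions.create(..., stream=True).
    """
    parts = []
    finish_reason = None
    for chunk in chunks:
        if not chunk.choices:  # skip chunks that carry no choices
            continue
        choice = chunk.choices[0]
        if choice.delta.content:
            parts.append(choice.delta.content)
            if on_token is not None:
                on_token(choice.delta.content)  # e.g. update the UI
        if choice.finish_reason is not None:
            finish_reason = choice.finish_reason
    return "".join(parts), finish_reason
```

With the synchronous client from the Python example, this could be called as `text, reason = accumulate_stream(stream, on_token=lambda t: print(t, end="", flush=True))`.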

## Async streaming (Python)

For async applications, use the async client:

```python
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI(
    api_key="shuttle-xxx",
    base_url="https://api.shuttleai.com/v1"
)

async def main():
    stream = await client.chat.completions.create(
        model="shuttleai/auto",
        messages=[{"role": "user", "content": "Explain quantum computing."}],
        stream=True
    )

    async for chunk in stream:
        content = chunk.choices[0].delta.content
        if content:
            print(content, end="", flush=True)

asyncio.run(main())
```
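
One reason to use the async client is that several streams can be consumed concurrently. The sketch below shows one possible pattern using `asyncio.gather`. The helper names `collect_text` and `ask_all` are hypothetical, and `client` is assumed to be an `AsyncOpenAI` instance configured as above.

```python
import asyncio
from typing import AsyncIterable, List


async def collect_text(stream: AsyncIterable) -> str:
    """Join the content deltas of one async stream into a single string."""
    parts = []
    async for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)


async def ask_all(client, prompts: List[str], model: str = "shuttleai/auto") -> List[str]:
    """Stream one completion per prompt concurrently; results keep prompt order."""

    async def ask(prompt: str) -> str:
        stream = await client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            stream=True,
        )
        return await collect_text(stream)

    # gather runs the coroutines concurrently and returns results in input order
    return await asyncio.gather(*(ask(p) for p in prompts))
```

For example, `asyncio.run(ask_all(client, ["Explain quantum computing.", "Explain SSE."]))` would return one completed string per prompt.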

## Stream with usage stats

To receive token usage statistics with your stream, set `include_usage` in `stream_options`:

```python
stream = client.chat.completions.create(
    model="shuttleai/auto",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
    stream_options={"include_usage": True}
)
```

The final chunk in the stream will include a `usage` object with `prompt_tokens`, `completion_tokens`, and `total_tokens`.
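
Reading the usage data takes a little care. In OpenAI's API the usage chunk arrives with an empty `choices` list, and this sketch assumes ShuttleAI may behave the same way since it uses the same format. Indexing `choices[0]` unguarded would then raise an `IndexError`. The helper name `read_stream_with_usage` is illustrative.

```python
from typing import Iterable, Optional, Tuple


def read_stream_with_usage(chunks: Iterable) -> Tuple[str, Optional[object]]:
    """Return the streamed text and the usage object, if one was sent."""
    parts = []
    usage = None
    for chunk in chunks:
        # Guard against chunks with no choices (assumed for the usage chunk).
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
        # getattr keeps this working if a chunk has no usage attribute at all.
        chunk_usage = getattr(chunk, "usage", None)
        if chunk_usage is not None:
            usage = chunk_usage
    return "".join(parts), usage
```

After the loop, `usage.prompt_tokens`, `usage.completion_tokens`, and `usage.total_tokens` are available whenever `usage` is not `None`.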

## SSE format

Each streamed chunk is a JSON object sent on a `data:` line, with a blank line separating consecutive events:

```
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```

The stream ends with the sentinel `data: [DONE]`. This payload is not JSON, so check for it before parsing each event.
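
If you consume the endpoint without an SDK, you have to parse these lines yourself. The following is a minimal sketch of such a parser, assuming each event fits on a single `data:` line as in the sample above. It does not handle multi-line `data:` fields, and the function name `iter_sse_content` is illustrative.

```python
import json
from typing import Iterable, Iterator


def iter_sse_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield content deltas from raw SSE lines, stopping at [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # end-of-stream sentinel, not JSON
            return
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                yield content
```

Any source of decoded text lines can be passed in, for example `response.iter_lines(decode_unicode=True)` from the `requests` library on a response opened with `stream=True`.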
