The ShuttleAI API can stream responses back to the client, delivering partial results as they are generated. Streaming follows the Server-Sent Events (SSE) standard.
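Under the SSE convention commonly used for chat-completion streams, each event arrives as a `data: <json>` line, with a final `data: [DONE]` sentinel. A minimal parser sketch (illustrative only; the exact chunk shape is assumed to match the OpenAI-style format shown later in this page):

```python
import json

def parse_sse_events(raw: str) -> list:
    """Parse an SSE payload into JSON chunks.

    Assumes the `data: <json>` convention terminated by `data: [DONE]`;
    this is an illustration of the SSE format, not the official client.
    """
    chunks = []
    for line in raw.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunks.append(json.loads(payload))
    return chunks

raw = (
    'data: {"choices": [{"delta": {"content": "Hel"}}]}\n\n'
    'data: {"choices": [{"delta": {"content": "lo"}}]}\n\n'
    "data: [DONE]\n\n"
)
events = parse_sse_events(raw)
print("".join(e["choices"][0]["delta"]["content"] for e in events))  # → Hello
```

In practice you will not need to parse this yourself: the official library below handles it.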

Our official Python library handles Server-Sent Events for you. In Python, a streaming request looks like this:

import asyncio
from shuttleai import AsyncShuttleAI

async def main():
    async with AsyncShuttleAI() as shuttleai:
        response = await shuttleai.chat.completions.create(
            model="shuttle-3",
            messages=[{"role": "user", "content": "write me a short story about bees"}],
            stream=True,
        )

        async for chunk in response:
            # delta.content holds only the newly generated text for this chunk
            # and may be None (e.g. on the final chunk), so guard before printing
            if chunk.choices[0].delta.content is not None:
                print(chunk.choices[0].delta.content, end="", flush=True)

if __name__ == "__main__":
    asyncio.run(main())
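Each chunk carries only the newly generated text in `delta.content`, so reconstructing the full reply means appending the deltas in order. A small helper sketch (illustrative; the chunk fields are assumed from the example above):

```python
from typing import Iterable, Optional

def accumulate_deltas(contents: Iterable[Optional[str]]) -> str:
    """Join streamed delta contents into the complete message.

    Deltas may be None (e.g. the final chunk), so filter before joining.
    """
    return "".join(c for c in contents if c)

# Simulated delta.content values from a stream:
print(accumulate_deltas(["Once", " upon", None, " a time"]))  # → Once upon a time
```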

Continue to Get Models to get started!

All additional optional keyword arguments are listed in the Endpoints category, so be sure to check it out!