Overview

The OpenAI provider offers comprehensive support for OpenAI services, including chat models, embeddings, function calling, and structured output. It's the most feature-complete provider in the SDK.

Quick Start

from ai_sdk import openai, generate_text

model = openai("gpt-4.1-mini")
res = generate_text(model=model, prompt="Hello, world!")
print(res.text)

Available Models

Chat Models

| Model | Description | Max Tokens | Input / Cached Input / Output (per 1M tokens) |
|---|---|---|---|
| gpt-4.1 | GPT-4.1 (latest) | 128k | $2.00 / $0.50 / $8.00 |
| gpt-4.1-mini | GPT-4.1 Mini | 128k | $0.40 / $0.10 / $1.60 |
| gpt-4.1-nano | GPT-4.1 Nano | 128k | $0.10 / $0.025 / $0.40 |
| gpt-4.5-preview | GPT-4.5 Preview | 128k | $75.00 / $37.50 / $150.00 |
| gpt-4o | GPT-4o (Omni) | 128k | $2.50 / $1.25 / $10.00 |
| gpt-4o-mini | GPT-4o Mini | 128k | $0.15 / $0.075 / $0.60 |
| gpt-4o-mini-audio-preview | GPT-4o Mini Audio Preview | 128k | $0.15 / - / $0.60 |
| o3 | OpenAI O3 | 128k | $2.00 / $0.50 / $8.00 |
| o3-deep-research | O3 Deep Research | 128k | $10.00 / $2.50 / $40.00 |
| o3-pro | O3 Pro | 128k | $20.00 / - / $80.00 |
| o3-mini | O3 Mini | 128k | $1.10 / $0.55 / $4.40 |
| o4-mini | O4 Mini | 128k | $1.10 / $0.275 / $4.40 |
| o4-mini-deep-research | O4 Mini Deep Research | 128k | $2.00 / $0.50 / $8.00 |
| o1 | OpenAI O1 | 128k | $15.00 / $7.50 / $60.00 |
| o1-pro | O1 Pro | 128k | $150.00 / - / $600.00 |
| o1-mini | O1 Mini | 128k | $1.10 / $0.55 / $4.40 |
| codex-mini-latest | Codex Mini (latest) | 128k | $1.50 / $0.375 / $6.00 |
| gpt-4o-mini-search-preview | GPT-4o Mini Search Preview | 128k | $0.15 / - / $0.60 |
| gpt-4o-search-preview | GPT-4o Search Preview | 128k | $2.50 / - / $10.00 |
| computer-use-preview | Computer Use Preview | 128k | $3.00 / - / $12.00 |

Embedding Models

| Model | Dimensions | Cost |
|---|---|---|
| text-embedding-3-large | 3072 | $0.13/1M tokens |
| text-embedding-3-small | 1536 | $0.02/1M tokens |
| text-embedding-ada-002 | 1536 | $0.10/1M tokens |

Basic Usage

Text Generation

from ai_sdk import openai, generate_text

model = openai("gpt-4.1-mini")
res = generate_text(
    model=model,
    prompt="Write a haiku about programming"
)
print(res.text)

Streaming

import asyncio
from ai_sdk import openai, stream_text

async def main():
    model = openai("gpt-4.1-mini")
    stream_res = stream_text(
        model=model,
        prompt="Tell me a story about a robot"
    )

    async for chunk in stream_res.text_stream:
        print(chunk, end="", flush=True)

asyncio.run(main())

With System Instructions

from ai_sdk import openai, generate_text

model = openai("gpt-4.1-mini")
res = generate_text(
    model=model,
    system="You are a helpful coding assistant. Always provide clear, concise explanations.",
    prompt="Explain what recursion is in simple terms"
)
print(res.text)

Advanced Features

Structured Output

OpenAI supports native structured output with response_format="json_object". The generate_object helper pairs this with a Pydantic schema and returns a validated object:
from ai_sdk import openai, generate_object
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int
    email: str

model = openai("gpt-4.1-mini")
res = generate_object(
    model=model,
    schema=User,
    prompt="Create a user profile for John Doe, age 30"
)
print(res.object)  # User(name='John Doe', age=30, email='john@example.com')

Function Calling

from ai_sdk import openai, generate_text, tool

def get_weather(city: str) -> str:
    """Get weather for a city."""
    weather_data = {
        "New York": "72°F, Sunny",
        "London": "55°F, Rainy"
    }
    return weather_data.get(city, "Weather data not available")

weather_tool = tool(
    name="get_weather",
    description="Get current weather for a city",
    parameters={
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"}
        },
        "required": ["city"]
    },
    execute=get_weather
)

model = openai("gpt-4.1-mini")
res = generate_text(
    model=model,
    prompt="What's the weather like in New York?",
    tools=[weather_tool]
)
print(res.text)

Embeddings

from ai_sdk import openai, embed_many, cosine_similarity

# Create embedding model
embed_model = openai.embedding("text-embedding-3-small")

# Embed multiple texts
texts = [
    "The cat sat on the mat.",
    "A dog was lying on the rug.",
    "Python is a programming language."
]

result = embed_many(model=embed_model, values=texts)

# Calculate similarity
similarity = cosine_similarity(result.embeddings[0], result.embeddings[1])
print(f"Similarity: {similarity:.3f}")

Vision Models

from ai_sdk import openai, generate_text
from ai_sdk.types import CoreUserMessage, ImagePart

model = openai("gpt-4.1")  # Vision-capable model

# Create message with image
message = CoreUserMessage(content=[
    ImagePart(image_url="https://example.com/image.jpg"),
    "Describe this image in detail."
])

res = generate_text(model=model, messages=[message])
print(res.text)

Configuration

API Key

Set your OpenAI API key:
export OPENAI_API_KEY="sk-..."
Or pass it directly:
model = openai("gpt-4.1-mini", api_key="sk-...")

Default Parameters

Configure default parameters for all requests:
model = openai(
    "gpt-4.1-mini",
    temperature=0.7,
    max_tokens=1000,
    top_p=0.9,
    frequency_penalty=0.1,
    presence_penalty=0.1,
    user="my-app/123"  # For analytics
)

Organization

For team accounts:
model = openai(
    "gpt-4.1-mini",
    organization="org-..."  # Your organization ID
)

Parameters

Common Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| model | str | - | Model identifier (required) |
| api_key | str | None | API key (falls back to the OPENAI_API_KEY env var) |
| organization | str | None | Organization ID |
| base_url | str | https://api.openai.com/v1 | API base URL |
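
These connection settings can be combined, for example to route requests through an OpenAI-compatible gateway (a sketch; the URL and IDs are placeholders):
from ai_sdk import openai

model = openai(
    "gpt-4.1-mini",
    api_key="sk-...",                    # overrides OPENAI_API_KEY
    organization="org-...",              # optional, for team accounts
    base_url="https://gateway.example.com/v1"  # any OpenAI-compatible endpoint
)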

Generation Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| temperature | float | 1.0 | Controls randomness (0.0 = deterministic) |
| max_tokens | int | None | Maximum tokens to generate |
| top_p | float | 1.0 | Nucleus sampling parameter |
| frequency_penalty | float | 0.0 | Reduces repetition |
| presence_penalty | float | 0.0 | Encourages new topics |
| response_format | str | None | "json_object" for structured output |
| seed | int | None | For reproducible results |
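
For example, pinning seed together with temperature=0.0 makes repeated runs as reproducible as the API allows (OpenAI treats seed as best-effort); a minimal sketch:
from ai_sdk import openai, generate_text

model = openai("gpt-4.1-mini", temperature=0.0, seed=42)

# The same prompt, seed, and model version should yield the same
# (or nearly the same) output across calls.
res_a = generate_text(model=model, prompt="Name three prime numbers.")
res_b = generate_text(model=model, prompt="Name three prime numbers.")
print(res_a.text == res_b.text)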

Embedding Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| encoding_format | str | "float" | "float" or "base64" |
| dimensions | int | None | Output dimensions (model-specific) |
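
The text-embedding-3 models can return shortened vectors via dimensions, trading some quality for storage; a sketch, assuming openai.embedding accepts default parameters the same way the chat factory does:
from ai_sdk import openai, embed_many

# Request 256-dimensional vectors instead of the model's default 1536
embed_model = openai.embedding("text-embedding-3-small", dimensions=256)

result = embed_many(model=embed_model, values=["hello", "world"])
print(len(result.embeddings[0]))  # 256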

Error Handling

Rate Limiting

import time
from ai_sdk import openai, generate_text

def generate_with_retry(prompt, max_retries=3):
    model = openai("gpt-4.1-mini")

    for attempt in range(max_retries):
        try:
            return generate_text(model=model, prompt=prompt)
        except Exception as e:
            if "rate_limit" in str(e).lower() and attempt < max_retries - 1:
                wait_time = 2 ** attempt
                print(f"Rate limited, waiting {wait_time}s...")
                time.sleep(wait_time)
                continue
            raise

res = generate_with_retry("Hello!")

Token Limits

from ai_sdk import openai, generate_text

def truncate_prompt(prompt, max_tokens=4000):
    """Truncate prompt to fit within token limits."""
    # Rough estimation: 1 token ≈ 4 characters
    max_chars = max_tokens * 4
    if len(prompt) > max_chars:
        return prompt[:max_chars] + "..."
    return prompt

model = openai("gpt-4.1-mini")
long_prompt = "A very long prompt..." * 1000
truncated = truncate_prompt(long_prompt)
res = generate_text(model=model, prompt=truncated)
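
The 4-characters-per-token heuristic is only approximate. For an exact count, the separate tiktoken package (not part of ai_sdk) can tokenize the text directly; a sketch using o200k_base, the encoding used by the gpt-4o and newer model families:
import tiktoken

enc = tiktoken.get_encoding("o200k_base")

def truncate_to_tokens(text: str, max_tokens: int = 4000) -> str:
    """Truncate text to at most max_tokens actual tokens."""
    tokens = enc.encode(text)
    if len(tokens) <= max_tokens:
        return text
    return enc.decode(tokens[:max_tokens])

print(len(enc.encode("A very long prompt...")))  # exact token count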

Best Practices

1. Model Selection

Choose the right model for your use case:
# For simple tasks - fast and cheap
model = openai("gpt-3.5-turbo")

# For complex reasoning - powerful but expensive
model = openai("gpt-4.1")

# For structured output - native JSON support
model = openai("gpt-4.1-mini")

2. Cost Optimization

Monitor and optimize costs:
from ai_sdk import openai, generate_text

model = openai("gpt-4.1-mini")
res = generate_text(model=model, prompt="Hello!")

if res.usage:
    input_cost = res.usage.prompt_tokens * 0.00000015  # $0.15/1M tokens
    output_cost = res.usage.completion_tokens * 0.0000006  # $0.60/1M tokens
    total_cost = input_cost + output_cost
    print(f"Cost: ${total_cost:.6f}")

3. Prompt Engineering

Use clear, specific prompts:
# Good
prompt = """
You are a helpful coding assistant. The user will ask you questions about Python programming.

Please provide:
1. A clear explanation
2. A code example
3. Best practices to follow

User question: {user_question}
"""

# Avoid
prompt = "Help me with Python"

4. Streaming for Long Responses

Use streaming for better user experience:
import asyncio
from ai_sdk import openai, stream_text

async def generate_long_response():
    model = openai("gpt-4.1-mini")
    stream_res = stream_text(
        model=model,
        prompt="Write a detailed tutorial about Python decorators"
    )

    print("Generating response...")
    async for chunk in stream_res.text_stream:
        print(chunk, end="", flush=True)
    print("\nDone!")

asyncio.run(generate_long_response())

5. Structured Output for Reliability

Use structured output for consistent results:
from ai_sdk import openai, generate_object
from pydantic import BaseModel
from typing import List

class CodeReview(BaseModel):
    issues: List[str]
    suggestions: List[str]
    score: int

model = openai("gpt-4.1-mini")
res = generate_object(
    model=model,
    schema=CodeReview,
    prompt="Review this Python code: def hello(): print('world')"
)

review = res.object
print(f"Score: {review.score}/10")
print(f"Issues: {review.issues}")

Troubleshooting

Common Issues

  1. Invalid API Key
    Error: Invalid API key
    
    • Check your API key is correct
    • Ensure you have sufficient credits
  2. Model Not Found
    Error: The model `gpt-4.1-mini` does not exist
    
    • Verify model name spelling
    • Check if model is available in your region
  3. Rate Limiting
    Error: Rate limit exceeded
    
    • Implement exponential backoff
    • Consider upgrading your plan
  4. Token Limit Exceeded
    Error: Request too large
    
    • Reduce input length
    • Use a model with higher token limits

Debug Mode

Enable detailed logging:
import logging
logging.basicConfig(level=logging.DEBUG)

from ai_sdk import openai, generate_text
model = openai("gpt-4.1-mini")
res = generate_text(model=model, prompt="Hello!")

OpenAI models are constantly being updated. Check the OpenAI API documentation for the latest model availability and pricing.