Overview
The OpenAI provider supports OpenAI's chat models (the GPT and o-series families), embeddings, function calling, and structured output. It is the most feature-complete provider in the SDK.
Quick Start
from ai_sdk import openai, generate_text
model = openai("gpt-4.1-mini")
res = generate_text(model=model, prompt="Hello, world!")
print(res.text)
Available Models
Chat Models
| Model | Description | Max Tokens | Input / Cached Input / Output (USD per 1M tokens) |
|---|---|---|---|
| gpt-4.1 | GPT-4.1 (latest) | 128k | $2.00 / $0.50 / $8.00 |
| gpt-4.1-mini | GPT-4.1 Mini | 128k | $0.40 / $0.10 / $1.60 |
| gpt-4.1-nano | GPT-4.1 Nano | 128k | $0.10 / $0.025 / $0.40 |
| gpt-4.5-preview | GPT-4.5 Preview | 128k | $75.00 / $37.50 / $150.00 |
| gpt-4o | GPT-4o (Omni) | 128k | $2.50 / $1.25 / $10.00 |
| gpt-4o-mini | GPT-4o Mini | 128k | $0.15 / $0.075 / $0.60 |
| gpt-4o-mini-audio-preview | GPT-4o Mini Audio Preview | 128k | $0.15 / - / $0.60 |
| o3 | OpenAI O3 | 128k | $2.00 / $0.50 / $8.00 |
| o3-deep-research | O3 Deep Research | 128k | $10.00 / $2.50 / $40.00 |
| o3-pro | O3 Pro | 128k | $20.00 / - / $80.00 |
| o3-mini | O3 Mini | 128k | $1.10 / $0.55 / $4.40 |
| o4-mini | O4 Mini | 128k | $1.10 / $0.275 / $4.40 |
| o4-mini-deep-research | O4 Mini Deep Research | 128k | $2.00 / $0.50 / $8.00 |
| o1 | OpenAI O1 | 128k | $15.00 / $7.50 / $60.00 |
| o1-pro | O1 Pro | 128k | $150.00 / - / $600.00 |
| o1-mini | O1 Mini | 128k | $1.10 / $0.55 / $4.40 |
| codex-mini-latest | Codex Mini (latest) | 128k | $1.50 / $0.375 / $6.00 |
| gpt-4o-mini-search-preview | GPT-4o Mini Search Preview | 128k | $0.15 / - / $0.60 |
| gpt-4o-search-preview | GPT-4o Search Preview | 128k | $2.50 / - / $10.00 |
| computer-use-preview | Computer Use Preview | 128k | $3.00 / - / $12.00 |
Embedding Models
| Model | Dimensions | Cost (USD per 1M tokens) |
|---|---|---|
| text-embedding-3-large | 3072 | $0.13 |
| text-embedding-3-small | 1536 | $0.02 |
| text-embedding-ada-002 | 1536 | $0.10 |
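To make the table concrete, here is a minimal sketch that estimates the API cost and raw vector storage of an embedding job. The prices and dimensions are copied from the table above; `estimate_embedding_job` is a hypothetical helper, not part of the SDK, and the storage figure assumes vectors are kept as 4-byte floats.

```python
# Dimensions and USD-per-1M-token prices copied from the table above.
EMBEDDING_MODELS = {
    "text-embedding-3-large": {"dimensions": 3072, "usd_per_1m_tokens": 0.13},
    "text-embedding-3-small": {"dimensions": 1536, "usd_per_1m_tokens": 0.02},
    "text-embedding-ada-002": {"dimensions": 1536, "usd_per_1m_tokens": 0.10},
}


def estimate_embedding_job(model: str, num_texts: int, avg_tokens_per_text: int) -> dict:
    """Rough cost and storage estimate (hypothetical helper, not an SDK function)."""
    info = EMBEDDING_MODELS[model]
    total_tokens = num_texts * avg_tokens_per_text
    cost_usd = total_tokens / 1_000_000 * info["usd_per_1m_tokens"]
    # Assumption: each vector component is stored as a 4-byte float32.
    storage_bytes = num_texts * info["dimensions"] * 4
    return {
        "total_tokens": total_tokens,
        "cost_usd": cost_usd,
        "storage_bytes": storage_bytes,
    }


estimate = estimate_embedding_job("text-embedding-3-small", num_texts=10_000, avg_tokens_per_text=200)
print(estimate)
```

For large corpora the storage term often matters more than the API cost, which is one reason to consider the smaller model.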
Basic Usage
Text Generation
from ai_sdk import openai, generate_text
model = openai("gpt-4.1-mini")
res = generate_text(
    model=model,
    prompt="Write a haiku about programming"
)
print(res.text)
Streaming
import asyncio
from ai_sdk import openai, stream_text
async def main():
    model = openai("gpt-4.1-mini")
    stream_res = stream_text(
        model=model,
        prompt="Tell me a story about a robot"
    )
    async for chunk in stream_res.text_stream:
        print(chunk, end="", flush=True)

asyncio.run(main())
With System Instructions
from ai_sdk import openai, generate_text
model = openai("gpt-4.1-mini")
res = generate_text(
    model=model,
    system="You are a helpful coding assistant. Always provide clear, concise explanations.",
    prompt="Explain what recursion is in simple terms"
)
print(res.text)
Advanced Features
Structured Output
OpenAI supports native structured output with response_format="json_object":
from ai_sdk import openai, generate_object
from pydantic import BaseModel
class User(BaseModel):
    name: str
    age: int
    email: str

model = openai("gpt-4.1-mini")
res = generate_object(
    model=model,
    schema=User,
    prompt="Create a user profile for John Doe, age 30"
)
print(res.object) # User(name='John Doe', age=30, email='john@example.com')
Function Calling
from ai_sdk import openai, generate_text, tool
def get_weather(city: str) -> str:
    """Get weather for a city."""
    weather_data = {
        "New York": "72°F, Sunny",
        "London": "55°F, Rainy"
    }
    return weather_data.get(city, "Weather data not available")

weather_tool = tool(
    name="get_weather",
    description="Get current weather for a city",
    parameters={
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"}
        },
        "required": ["city"]
    },
    execute=get_weather
)

model = openai("gpt-4.1-mini")
res = generate_text(
    model=model,
    prompt="What's the weather like in New York?",
    tools=[weather_tool]
)
print(res.text)
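The tool definition pairs a JSON Schema with a Python callable. As a rough sketch of what executing a tool call involves, assume the model replies with a tool name plus JSON-encoded arguments, as OpenAI's chat API does. `TOOL_REGISTRY` and `dispatch_tool_call` are hypothetical names for illustration; the SDK's internal dispatch is not shown in this page and may differ.

```python
import json


def get_weather(city: str) -> str:
    """Get weather for a city (same stub data as the example above)."""
    weather_data = {"New York": "72°F, Sunny", "London": "55°F, Rainy"}
    return weather_data.get(city, "Weather data not available")


# Hypothetical registry mapping tool names to Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Decode the JSON arguments and run the matching tool."""
    if name not in TOOL_REGISTRY:
        return f"Unknown tool: {name}"
    try:
        arguments = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return f"Invalid tool arguments: {exc}"
    return TOOL_REGISTRY[name](**arguments)


print(dispatch_tool_call("get_weather", '{"city": "London"}'))
```

Returning an error string instead of raising lets the result be passed back to the model, which can then correct its call.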
Embeddings
from ai_sdk import openai, embed_many, cosine_similarity
# Create embedding model
embed_model = openai.embedding("text-embedding-3-small")
# Embed multiple texts
texts = [
    "The cat sat on the mat.",
    "A dog was lying on the rug.",
    "Python is a programming language."
]
result = embed_many(model=embed_model, values=texts)
# Calculate similarity
similarity = cosine_similarity(result.embeddings[0], result.embeddings[1])
print(f"Similarity: {similarity:.3f}")
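For readers unfamiliar with the metric, cosine similarity is the dot product of two vectors divided by the product of their lengths. The following is a plain-Python sketch of that formula, not the SDK's implementation; `cosine_similarity_sketch` is a hypothetical name, and returning 0.0 for a zero vector is a choice made here.

```python
import math


def cosine_similarity_sketch(a, b):
    """Cosine of the angle between vectors a and b, in the range [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; 0.0 is our convention
    return dot / (norm_a * norm_b)


print(cosine_similarity_sketch([1.0, 0.0], [1.0, 0.0]))   # same direction
print(cosine_similarity_sketch([1.0, 0.0], [0.0, 1.0]))   # orthogonal
print(cosine_similarity_sketch([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
```

Values near 1 indicate texts with similar meaning, so the cat and dog sentences above should score higher against each other than either does against the Python sentence.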
Vision Models
from ai_sdk import openai, generate_text
from ai_sdk.types import CoreUserMessage, ImagePart
model = openai("gpt-4.1") # Vision-capable model
# Create message with image
message = CoreUserMessage(content=[
    ImagePart(image_url="https://example.com/image.jpg"),
    "Describe this image in detail."
])
res = generate_text(model=model, messages=[message])
print(res.text)
Configuration
API Key
Set your OpenAI API key:
export OPENAI_API_KEY="sk-..."
Or pass it directly:
model = openai("gpt-4.1-mini", api_key="sk-...")
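Reading the key from the environment keeps it out of source control. A minimal sketch of a fail-fast lookup follows; `resolve_api_key` is a hypothetical helper, and the assumption that an explicitly passed key takes precedence over the environment variable is ours rather than documented SDK behavior.

```python
import os


def resolve_api_key(explicit_key=None, env=os.environ):
    """Return the explicit key if given, else OPENAI_API_KEY, else fail early."""
    key = explicit_key or env.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "No OpenAI API key found: set OPENAI_API_KEY or pass api_key explicitly"
        )
    return key


# Demo with a stand-in environment so no real key is needed.
fake_env = {"OPENAI_API_KEY": "sk-test"}
print(resolve_api_key(env=fake_env))
```

Calling a check like this at startup surfaces a missing key immediately instead of on the first request.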
Default Parameters
Configure default parameters for all requests:
model = openai(
    "gpt-4.1-mini",
    temperature=0.7,
    max_tokens=1000,
    top_p=0.9,
    frequency_penalty=0.1,
    presence_penalty=0.1,
    user="my-app/123"  # For analytics
)
Organization
For team accounts:
model = openai(
    "gpt-4.1-mini",
    organization="org-..."  # Your organization ID
)
Parameters
Common Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | str | - | Model identifier (required) |
| api_key | str | None | API key (uses OPENAI_API_KEY env var) |
| organization | str | None | Organization ID |
| base_url | str | https://api.openai.com/v1 | API base URL |
Generation Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| temperature | float | 1.0 | Controls randomness (0.0 = deterministic) |
| max_tokens | int | None | Maximum tokens to generate |
| top_p | float | 1.0 | Nucleus sampling parameter |
| frequency_penalty | float | 0.0 | Reduces repetition |
| presence_penalty | float | 0.0 | Encourages new topics |
| response_format | str | None | "json_object" for structured output |
| seed | int | None | For reproducible results |
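Out-of-range values are usually rejected by the API only after a network round trip. The sketch below checks the numeric parameters locally first. The ranges are the ones given in the OpenAI API reference (temperature 0 to 2, top_p 0 to 1, penalties -2 to 2); `validate_generation_params` is a hypothetical helper, and whether the SDK performs similar checks is not stated in this page.

```python
# Accepted ranges per the OpenAI API reference (not taken from the SDK itself).
PARAM_RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
}


def validate_generation_params(**params):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for name, value in params.items():
        if value is None:
            continue  # None means "use the default"
        if name in PARAM_RANGES:
            low, high = PARAM_RANGES[name]
            if not low <= value <= high:
                problems.append(f"{name}={value} is outside [{low}, {high}]")
        elif name == "max_tokens":
            if not isinstance(value, int) or value <= 0:
                problems.append(f"max_tokens={value!r} must be a positive integer")
    return problems


print(validate_generation_params(temperature=0.7, top_p=0.9, max_tokens=1000))
print(validate_generation_params(temperature=3.0, max_tokens=0))
```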
Embedding Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| encoding_format | str | "float" | "float" or "base64" |
| dimensions | int | None | Output dimensions (model-specific) |
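With `encoding_format="base64"`, the OpenAI API returns each embedding as a base64 string rather than a JSON array of numbers. As a sketch, assuming the payload is a packed array of little-endian 32-bit floats (which is how OpenAI's official Python client decodes it), it can be unpacked with the standard library alone. `decode_base64_embedding` is a hypothetical helper; the SDK may already decode this for you.

```python
import base64
import struct


def decode_base64_embedding(payload: str) -> list:
    """Decode a base64 string of packed little-endian float32 values."""
    raw = base64.b64decode(payload)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round trip using values that float32 represents exactly.
original = [0.5, -1.0, 2.0]
encoded = base64.b64encode(struct.pack("<3f", *original)).decode("ascii")
decoded = decode_base64_embedding(encoded)
print(decoded)
```

The base64 form is more compact on the wire than a JSON list of decimal strings, which can matter when embedding many texts per request.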
Error Handling
Rate Limiting
import time
from ai_sdk import openai, generate_text
def generate_with_retry(prompt, max_retries=3):
    model = openai("gpt-4.1-mini")
    for attempt in range(max_retries):
        try:
            return generate_text(model=model, prompt=prompt)
        except Exception as e:
            if "rate_limit" in str(e).lower() and attempt < max_retries - 1:
                wait_time = 2 ** attempt
                print(f"Rate limited, waiting {wait_time}s...")
                time.sleep(wait_time)
                continue
            raise

res = generate_with_retry("Hello!")
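When many clients retry on the same schedule, they tend to hit the API again at the same moment. A common refinement is to cap the delay and add random jitter. The sketch below is one way to do that; `retry_with_backoff` and its parameters are hypothetical, and the default retry check reuses the same string match on the error message, which is an assumption about how rate-limit errors are reported rather than an SDK contract.

```python
import random
import time


def retry_with_backoff(func, max_retries=3, base_delay=1.0, max_delay=30.0,
                       is_retryable=None, sleep=time.sleep):
    """Call func() and retry retryable failures with capped, jittered backoff."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    if is_retryable is None:
        # Assumption: rate-limit errors mention "rate_limit" in their message.
        is_retryable = lambda exc: "rate_limit" in str(exc).lower()

    for attempt in range(max_retries):
        try:
            return func()
        except Exception as exc:
            out_of_attempts = attempt == max_retries - 1
            if out_of_attempts or not is_retryable(exc):
                raise
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))  # full jitter: wait anywhere in [0, ceiling]


# Demo with a fake call that fails twice, then succeeds (no real sleeping).
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("rate_limit_exceeded")
    return "ok"


waits = []
result = retry_with_backoff(flaky_call, sleep=waits.append)
print(result, len(waits))  # prints: ok 2
```

In real use the wrapped call would be something like `lambda: generate_text(model=model, prompt="Hello!")`, with the default `time.sleep` left in place.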
Token Limits
from ai_sdk import openai, generate_text
def truncate_prompt(prompt, max_tokens=4000):
    """Truncate prompt to fit within token limits."""
    # Rough estimation: 1 token ≈ 4 characters
    max_chars = max_tokens * 4
    if len(prompt) > max_chars:
        return prompt[:max_chars] + "..."
    return prompt

model = openai("gpt-4.1-mini")
long_prompt = "A very long prompt..." * 1000
truncated = truncate_prompt(long_prompt)
res = generate_text(model=model, prompt=truncated)
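Cutting at a fixed character offset can split a word in half. A small variation, sketched below under the same rough 4-characters-per-token estimate, backs up to the last space and keeps the "..." suffix inside the character budget. `truncate_on_word_boundary` is a hypothetical helper; for exact counts you would need a real tokenizer.

```python
def truncate_on_word_boundary(prompt, max_tokens=4000, chars_per_token=4, suffix="..."):
    """Truncate near the character budget without cutting a word in half."""
    max_chars = max_tokens * chars_per_token
    if len(prompt) <= max_chars:
        return prompt
    budget = max_chars - len(suffix)
    if budget <= 0:
        return prompt[:max_chars]  # budget too small to fit the suffix
    cut = prompt[:budget]
    last_space = cut.rfind(" ")
    if last_space > 0:
        cut = cut[:last_space]  # drop the partial trailing word
    return cut.rstrip() + suffix


print(truncate_on_word_boundary("alpha beta gamma delta", max_tokens=3))  # prints: alpha...
```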
Best Practices
1. Model Selection
Choose the right model for your use case:
# For simple tasks - fast and cheap
model = openai("gpt-4.1-nano")
# For complex reasoning - powerful but expensive
model = openai("gpt-4.1")
# For structured output - native JSON support
model = openai("gpt-4.1-mini")
2. Cost Optimization
Monitor and optimize costs:
from ai_sdk import openai, generate_text
model = openai("gpt-4.1-mini")
res = generate_text(model=model, prompt="Hello!")
if res.usage:
    # gpt-4.1-mini rates from the Chat Models pricing table
    input_cost = res.usage.prompt_tokens * 0.40 / 1_000_000  # $0.40/1M input tokens
    output_cost = res.usage.completion_tokens * 1.60 / 1_000_000  # $1.60/1M output tokens
    total_cost = input_cost + output_cost
    print(f"Cost: ${total_cost:.6f}")
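Hard-coded rates are easy to get out of sync with the model in use. One option is a small lookup keyed by model name, sketched below with prices copied from the Chat Models table. `PRICES_PER_1M` and `estimate_cost` are hypothetical names, and the `cached_prompt_tokens` argument assumes you can obtain a cached-token count from somewhere; this page does not say whether `res.usage` exposes one.

```python
# (input, cached input, output) in USD per 1M tokens, from the Chat Models table.
PRICES_PER_1M = {
    "gpt-4.1": (2.00, 0.50, 8.00),
    "gpt-4.1-mini": (0.40, 0.10, 1.60),
    "gpt-4.1-nano": (0.10, 0.025, 0.40),
}


def estimate_cost(model, prompt_tokens, completion_tokens, cached_prompt_tokens=0):
    """Estimate the USD cost of one request (hypothetical helper)."""
    input_price, cached_price, output_price = PRICES_PER_1M[model]
    uncached_tokens = prompt_tokens - cached_prompt_tokens
    if uncached_tokens < 0:
        raise ValueError("cached_prompt_tokens cannot exceed prompt_tokens")
    total = (
        uncached_tokens * input_price
        + cached_prompt_tokens * cached_price
        + completion_tokens * output_price
    )
    return total / 1_000_000


print(f"${estimate_cost('gpt-4.1-mini', 1_000_000, 500_000):.2f}")  # prints: $1.20
```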
3. Prompt Engineering
Use clear, specific prompts:
# Good
prompt = """
You are a helpful coding assistant. The user will ask you questions about Python programming.
Please provide:
1. A clear explanation
2. A code example
3. Best practices to follow
User question: {user_question}
"""
# Avoid
prompt = "Help me with Python"
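The "Good" prompt contains a `{user_question}` placeholder, so it needs to be filled in before it is sent. A minimal sketch using `str.format` follows; `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names.

```python
PROMPT_TEMPLATE = """
You are a helpful coding assistant. The user will ask you questions about Python programming.
Please provide:
1. A clear explanation
2. A code example
3. Best practices to follow
User question: {user_question}
"""


def build_prompt(user_question: str) -> str:
    """Fill the template, rejecting empty questions."""
    question = user_question.strip()
    if not question:
        raise ValueError("user_question must not be empty")
    return PROMPT_TEMPLATE.format(user_question=question)


prompt = build_prompt("How do list comprehensions work?")
print(prompt)
```

Keeping the template separate from the user input makes the fixed instructions easy to review and version.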
4. Streaming for Long Responses
Use streaming for better user experience:
import asyncio
from ai_sdk import openai, stream_text
async def generate_long_response():
    model = openai("gpt-4.1-mini")
    stream_res = stream_text(
        model=model,
        prompt="Write a detailed tutorial about Python decorators"
    )
    print("Generating response...")
    async for chunk in stream_res.text_stream:
        print(chunk, end="", flush=True)
    print("\nDone!")

asyncio.run(generate_long_response())
5. Structured Output for Reliability
Use structured output for consistent results:
from ai_sdk import openai, generate_object
from pydantic import BaseModel
from typing import List
class CodeReview(BaseModel):
    issues: List[str]
    suggestions: List[str]
    score: int

model = openai("gpt-4.1-mini")
res = generate_object(
    model=model,
    schema=CodeReview,
    prompt="Review this Python code: def hello(): print('world')"
)
review = res.object
print(f"Score: {review.score}/10")
print(f"Issues: {review.issues}")
Troubleshooting
Common Issues
- Invalid API Key
  - Check your API key is correct
  - Ensure you have sufficient credits
- Model Not Found
  Error: The model `gpt-4.1-mini` does not exist
  - Verify model name spelling
  - Check if model is available in your region
- Rate Limiting
  Error: Rate limit exceeded
  - Implement exponential backoff
  - Consider upgrading your plan
- Token Limit Exceeded
  - Reduce input length
  - Use a model with higher token limits
Debug Mode
Enable detailed logging:
import logging
logging.basicConfig(level=logging.DEBUG)
from ai_sdk import openai, generate_text
model = openai("gpt-4.1-mini")
res = generate_text(model=model, prompt="Hello!")
OpenAI models are constantly being updated. Check the OpenAI API documentation for the latest model availability and pricing.