OpenAI API - Complete Guide
Master AI Integration with Practical Examples
• Introduction to OpenAI API
○ What is OpenAI API?
The OpenAI API is a powerful cloud-based service that provides access to advanced artificial intelligence models. It allows developers to integrate AI capabilities like text generation, code completion, language translation, and more into their applications.
○ Key Features
- Natural Language Processing: Understand and generate human-like text
- Code Generation: Write and debug code in multiple programming languages
- Language Translation: Translate between different languages
- Content Creation: Generate articles, stories, and creative content
- Data Analysis: Extract insights from text data
- Conversation AI: Build chatbots and virtual assistants
○ Mind Map: OpenAI API Ecosystem
• Getting Started
○ Prerequisites
- OpenAI Account: Sign up at platform.openai.com
- API Key: Generate your unique API key
- Programming Knowledge: Basic understanding of Python, JavaScript, or similar
- HTTP/REST Concepts: Familiarity with API requests
○ Installation Process
1. Visit platform.openai.com and sign up for an account
2. Check your inbox and confirm your email address
3. Navigate to the API keys section and generate a key
4. Install the client library: pip install openai
5. Run sample code to verify the setup
6. If the test fails, check your credentials and try again
○ Setup Code Examples
Python Installation:
# Install OpenAI Python library
pip install openai
# Import and configure
import openai
openai.api_key = "your-api-key-here"
JavaScript/Node.js Installation:
// Install OpenAI npm package
npm install openai
// Import and configure
const { Configuration, OpenAIApi } = require("openai");
const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
});
const openai = new OpenAIApi(configuration);
• Authentication & API Keys
○ Understanding API Keys
API keys are secure tokens that authenticate your requests to OpenAI's servers. They act as your digital signature and are linked to your billing account.
○ Security Best Practices Table
| Practice | Description | Importance |
|---|---|---|
| Never Commit to Git | Don't include API keys in version control | ⭐⭐⭐⭐⭐ |
| Use Environment Variables | Store keys in .env files | ⭐⭐⭐⭐⭐ |
| Rotate Keys Regularly | Change keys every 3-6 months | ⭐⭐⭐⭐ |
| Use Multiple Keys | Different keys for dev/staging/prod | ⭐⭐⭐⭐ |
| Monitor Usage | Track API calls for anomalies | ⭐⭐⭐⭐ |
○ Implementation Example
Secure Key Management (Python):
# Using environment variables
import os
from dotenv import load_dotenv

load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")

# Using a config file (not recommended for production)
import json

with open('config.json') as f:
    config = json.load(f)
api_key = config['api_key']
• Understanding Models
○ Available Models Comparison
| Model Name | Description | Max Tokens | Best For |
|---|---|---|---|
| GPT-4 | Most capable model, best reasoning | 8,192 | Complex tasks, analysis |
| GPT-4-32k | Extended context version | 32,768 | Long documents, detailed content |
| GPT-3.5-Turbo | Fast, cost-effective | 4,096 | Chatbots, quick responses |
| GPT-3.5-Turbo-16k | Extended context, affordable | 16,384 | Medium-length content |
| Text-Davinci-003 | Legacy model, versatile | 4,097 | General purpose |
○ Model Selection Flowchart
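The flowchart's decision logic can be sketched as a small helper. The thresholds below are assumptions drawn from the token limits in the comparison table, not an official recommendation:

```python
def choose_model(needs_strong_reasoning: bool, context_tokens: int) -> str:
    """Pick a model from the comparison table above.

    Heuristic only: prefer the cheaper GPT-3.5 family, step up to
    GPT-4 for hard reasoning, and to the extended-context variants
    when the prompt exceeds the base model's token limit.
    """
    if needs_strong_reasoning:
        return "gpt-4-32k" if context_tokens > 8192 else "gpt-4"
    return "gpt-3.5-turbo-16k" if context_tokens > 4096 else "gpt-3.5-turbo"

print(choose_model(False, 1200))   # gpt-3.5-turbo
print(choose_model(True, 20000))   # gpt-4-32k
```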
• Pricing Structure
○ Current Pricing (in Indian Rupees)
Note: Prices are approximate and based on 1 USD ≈ ₹83. Always check official OpenAI website for current rates.
| Model | Input (per 1K tokens) | Output (per 1K tokens) | Use Case |
|---|---|---|---|
| GPT-4 | ₹2.49 | ₹4.98 | Premium applications |
| GPT-4-32k | ₹4.98 | ₹9.96 | Extended context needs |
| GPT-3.5-Turbo | ₹0.12 | ₹0.17 | Cost-effective solutions |
| GPT-3.5-Turbo-16k | ₹0.25 | ₹0.33 | Balanced performance |
○ Cost Estimation Calculator
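A rough monthly estimate can be computed directly from the pricing table above. The rates below are the approximate ₹ figures quoted there; actual billing follows OpenAI's current USD prices:

```python
# Approximate rates from the pricing table: (input, output) in ₹ per 1K tokens
RATES = {
    "gpt-4": (2.49, 4.98),
    "gpt-4-32k": (4.98, 9.96),
    "gpt-3.5-turbo": (0.12, 0.17),
    "gpt-3.5-turbo-16k": (0.25, 0.33),
}

def estimate_monthly_cost(model, requests_per_day,
                          input_tokens=500, output_tokens=500, days=30):
    """Rough monthly cost in ₹ for a steady request volume."""
    rate_in, rate_out = RATES[model]
    per_request = (input_tokens / 1000) * rate_in + (output_tokens / 1000) * rate_out
    return per_request * requests_per_day * days

# 1,000 requests/day on GPT-3.5-Turbo, 500 tokens each way
print(f"₹{estimate_monthly_cost('gpt-3.5-turbo', 1000):,.2f}")  # ₹4,350.00
```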
• Chat Completion API
○ Understanding the Chat Format
The Chat Completion API uses a conversational format with messages. Each message has a role (system, user, or assistant) and content.
○ Message Roles Table
| Role | Purpose | Example |
|---|---|---|
| system | Set behavior and context | "You are a helpful coding assistant" |
| user | User's input/question | "How do I create a loop in Python?" |
| assistant | AI's previous responses | "Here's how to create a loop..." |
○ Python Implementation Example
import openai
# Set your API key
openai.api_key = "your-api-key-here"
# Basic chat completion
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms"}
    ],
    temperature=0.7,
    max_tokens=500
)

# Extract response
answer = response.choices[0].message.content
print(answer)
○ JavaScript Implementation Example
const { Configuration, OpenAIApi } = require("openai");

// Configure OpenAI
const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
});
const openai = new OpenAIApi(configuration);

// Chat completion function
async function chatCompletion() {
  try {
    const response = await openai.createChatCompletion({
      model: "gpt-3.5-turbo",
      messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Explain machine learning" }
      ],
      temperature: 0.7,
      max_tokens: 500
    });
    console.log(response.data.choices[0].message.content);
  } catch (error) {
    console.error("Error:", error);
  }
}

chatCompletion();
• Advanced Features
○ Temperature & Top-P Settings
| Parameter | Range | Effect | Best Use |
|---|---|---|---|
| Temperature | 0.0 - 2.0 | Controls randomness | 0.7 for balanced output |
| Top-P | 0.0 - 1.0 | Nucleus sampling | 0.9 for diverse responses |
| Max Tokens | 1 - model limit | Response length | Based on use case |
| Frequency Penalty | -2.0 - 2.0 | Reduce repetition | 0.5 to avoid redundancy |
| Presence Penalty | -2.0 - 2.0 | Encourage new topics | 0.6 for variety |
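These tuning parameters combine into a single request. The sketch below builds the payload with the table's suggested values; any of them can be omitted to fall back to the API defaults:

```python
# Request payload combining the tuning parameters from the table above
request = dict(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Suggest three blog post topics"}],
    temperature=0.7,        # balanced randomness
    top_p=0.9,              # nucleus sampling: keep the top 90% probability mass
    max_tokens=200,         # cap the response length
    frequency_penalty=0.5,  # discourage repeating the same tokens
    presence_penalty=0.6,   # nudge the model toward new topics
)
# response = openai.ChatCompletion.create(**request)
```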
○ Streaming Responses
Streaming allows you to receive responses in real-time as they're generated, improving user experience for long responses.
Python Streaming Example:
import openai
openai.api_key = "your-api-key-here"
# Stream responses
for chunk in openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
):
    content = chunk.choices[0].delta.get("content", "")
    print(content, end="", flush=True)
○ Function Calling
Function calling enables the model to intelligently call external functions and APIs based on user requests.
functions = [
    {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"]
                }
            },
            "required": ["location"]
        }
    }
]

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "What's the weather in Mumbai?"}],
    functions=functions,
    function_call="auto"
)
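The call above only gets the model to request a function; your code must run it and send the result back. A sketch of that second half, using a simulated assistant message in place of a live response and a hypothetical local get_weather implementation:

```python
import json

# Hypothetical local implementation backing the get_weather schema above
def get_weather(location, unit="celsius"):
    return f"28 degrees {unit} and sunny in {location}"  # placeholder data

# Simulated shape of response.choices[0].message when the model decides
# to call the function; note the arguments arrive as a JSON string.
message = {
    "role": "assistant",
    "content": None,
    "function_call": {
        "name": "get_weather",
        "arguments": '{"location": "Mumbai"}',
    },
}

if message.get("function_call"):
    args = json.loads(message["function_call"]["arguments"])
    result = get_weather(**args)
    # In a live app, append `message` plus
    # {"role": "function", "name": "get_weather", "content": result}
    # and call openai.ChatCompletion.create again for the final answer.
    print(result)  # 28 degrees celsius and sunny in Mumbai
```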
• Best Practices
○ Prompt Engineering Tips
- Be Specific: Provide clear, detailed instructions
- Use Examples: Show the model what you want with examples
- Set Context: Use system messages to define behavior
- Break Complex Tasks: Divide into smaller steps
- Iterate: Refine prompts based on results
- Use Delimiters: Separate different sections clearly
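Several of these tips can be combined in a small prompt builder. The function below is an illustrative sketch; the summarization task and the word limits are arbitrary choices:

```python
def build_summary_prompt(document: str) -> str:
    """Apply the tips above: a specific instruction, an explicit
    output format, and clear delimiters around the input text."""
    return (
        "Summarize the text between <<< and >>> "
        "as exactly three bullet points, each under 15 words.\n\n"
        f"<<<\n{document}\n>>>"
    )

print(build_summary_prompt("The OpenAI API exposes language models over HTTPS."))
```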
○ Error Handling Best Practices
| Error Type | Cause | Solution |
|---|---|---|
| Rate Limit | Too many requests | Implement exponential backoff |
| Token Limit | Input/output too long | Truncate or split content |
| Invalid API Key | Wrong/expired key | Verify and regenerate key |
| Timeout | Request took too long | Increase timeout, reduce tokens |
| Server Error | OpenAI service issue | Retry with exponential backoff |
○ Error Handling Code Example
import openai
import time
def call_openai_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = openai.ChatCompletion.create(
                model="gpt-3.5-turbo",
                messages=messages,
                request_timeout=30
            )
            return response
        except openai.error.RateLimitError:
            wait_time = 2 ** attempt
            print(f"Rate limit hit. Waiting {wait_time} seconds...")
            time.sleep(wait_time)
        except openai.error.APIError as e:
            print(f"API Error: {e}")
            if attempt == max_retries - 1:
                raise
        except Exception as e:
            print(f"Unexpected error: {e}")
            raise
    raise Exception("Max retries exceeded")
• Real-World Use Cases
○ Common Applications
1. Customer Support Chatbot
def customer_support_bot(user_message, conversation_history):
    system_prompt = """You are a helpful customer support agent for an
    e-commerce company. Be polite, professional, and solve issues efficiently.
    If you cannot help, escalate to human support."""
    messages = [{"role": "system", "content": system_prompt}]
    messages.extend(conversation_history)
    messages.append({"role": "user", "content": user_message})
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages,
        temperature=0.5
    )
    return response.choices[0].message.content
2. Content Generation Tool
def generate_blog_post(topic, keywords, word_count):
    prompt = f"""Write a {word_count}-word blog post about {topic}.
    Include these keywords: {', '.join(keywords)}
    Make it engaging, informative, and SEO-friendly.
    Include an introduction, main body with subheadings, and conclusion."""
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are a professional content writer."},
            {"role": "user", "content": prompt}
        ],
        temperature=0.7,
        max_tokens=word_count * 2
    )
    return response.choices[0].message.content
3. Code Review Assistant
def review_code(code, language):
    prompt = f"""Review this {language} code and provide:
    1. Potential bugs or issues
    2. Performance improvements
    3. Best practice suggestions
    4. Security concerns

    Code:
    ```{language}
    {code}
    ```"""
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are an expert code reviewer."},
            {"role": "user", "content": prompt}
        ],
        temperature=0.3
    )
    return response.choices[0].message.content
○ Use Case Comparison Table
| Use Case | Best Model | Avg Cost/Request (₹) | Implementation Difficulty |
|---|---|---|---|
| Simple Chatbot | GPT-3.5-Turbo | ₹0.05 | Easy |
| Content Writing | GPT-4 | ₹2.50 | Medium |
| Code Generation | GPT-4 | ₹3.00 | Medium |
| Data Analysis | GPT-4-32k | ₹8.00 | Hard |
| Translation | GPT-3.5-Turbo | ₹0.08 | Easy |
• Troubleshooting
○ Common Issues & Solutions
Issue 1: "Invalid API Key" Error
Symptoms: Authentication fails, 401 error
Solutions:
- Verify API Key: Check for typos or spaces
- Check Expiration: Ensure key is active
- Regenerate Key: Create new key if needed
- Environment Variable: Confirm proper loading
Issue 2: Rate Limit Exceeded
Symptoms: 429 error, requests failing
Solutions:
- Implement Backoff: Wait before retrying
- Reduce Frequency: Batch requests
- Upgrade Tier: Increase rate limits
- Cache Results: Store common responses
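The caching fix above takes only a few lines. In this sketch, `fake_create` is a stand-in for `openai.ChatCompletion.create`; note that only deterministic (low-temperature) requests are worth caching:

```python
import hashlib

_cache = {}

def cached_completion(messages, create_fn):
    """Serve repeated identical requests from memory to cut
    rate-limit pressure and cost."""
    key = hashlib.sha256(repr(messages).encode()).hexdigest()
    if key not in _cache:
        _cache[key] = create_fn(messages)
    return _cache[key]

calls = []
def fake_create(messages):  # stand-in for the real API call
    calls.append(messages)
    return "Our return window is 30 days."

msgs = [{"role": "user", "content": "What is your return policy?"}]
cached_completion(msgs, fake_create)
cached_completion(msgs, fake_create)
print(len(calls))  # 1 -- the second call was served from the cache
```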
Issue 3: Unexpected Response Quality
Symptoms: Irrelevant or poor responses
Solutions:
- Refine Prompt: Be more specific
- Adjust Temperature: Lower for consistency
- Use Examples: Show desired format
- Try Different Model: Upgrade to GPT-4
○ Debugging Flowchart
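The flowchart's branching can be sketched as a helper that maps a failed request to the first-line fix from the issues above. Matching on exception class names is a simplification for illustration:

```python
def diagnose(error: Exception) -> str:
    """Map a failed request to the first-line fix suggested above."""
    name = type(error).__name__
    if "Authentication" in name:
        return "Verify OPENAI_API_KEY for typos; regenerate the key if expired."
    if "RateLimit" in name:
        return "Wait with exponential backoff; batch or cache requests."
    if "Timeout" in name:
        return "Increase the request timeout or reduce max_tokens."
    return "Retry with backoff; if the error persists, check OpenAI's status page."

class RateLimitError(Exception):  # stands in for openai.error.RateLimitError
    pass

print(diagnose(RateLimitError()))
```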
• Practice Questions & Answers
Question 1: What are the main differences between GPT-4 and GPT-3.5-Turbo, and when should you use each?
Answer:
GPT-4:
- More capable - Better reasoning and understanding
- More accurate - Fewer mistakes and hallucinations
- Better context handling - Follows complex instructions
- Higher cost - ₹2.49 per 1K input tokens
GPT-3.5-Turbo:
- Faster responses - Lower latency
- Cost-effective - ₹0.12 per 1K input tokens
- Good for simple tasks - Chatbots, basic queries
- Sufficient for most use cases - Great performance/cost ratio
Question 2: How can you reduce your OpenAI API costs?
Answer:
Cost Reduction Strategies:
- Use GPT-3.5-Turbo - 20x cheaper than GPT-4
- Optimize prompts - Be concise, avoid unnecessary tokens
- Implement caching - Store and reuse common responses
- Set max_tokens - Limit response length
- Batch requests - Combine multiple queries
- Use streaming - Stop generation when needed
- Monitor usage - Track and optimize high-cost queries
Question 3: What does the temperature parameter control, and what values should you use?
Answer:
Temperature controls the randomness of the AI's responses:
| Temperature | Effect | Best Use |
|---|---|---|
| 0.0 - 0.3 | Very deterministic, focused | Factual answers, code generation |
| 0.4 - 0.7 | Balanced creativity/consistency | General purpose, chatbots |
| 0.8 - 1.0 | More creative, varied | Creative writing, brainstorming |
| 1.1 - 2.0 | Highly random, experimental | Art, experimental content |
Question 4: How should your application handle rate limit errors?
Answer:
Implement Exponential Backoff Strategy:
import time
import random
import openai

def api_call_with_backoff(messages, max_retries=5):
    for attempt in range(max_retries):
        try:
            return openai.ChatCompletion.create(
                model="gpt-3.5-turbo",
                messages=messages
            )
        except openai.error.RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff with jitter
            wait_time = (2 ** attempt) + random.uniform(0, 1)
            print(f"Rate limit hit. Waiting {wait_time:.2f}s")
            time.sleep(wait_time)
Additional Tips:
- Monitor rate limits - Check usage dashboard
- Implement queuing - Process requests sequentially
- Upgrade tier - Increase limits if needed
- Use batch API - For non-urgent requests
Question 5: What are tokens, and how are they counted?
Answer:
Tokens are pieces of text that the model processes. They can be words, parts of words, or punctuation.
Token Counting Rules:
- 1 token ≈ 4 characters in English
- 1 token ≈ ¾ of a word on average
- 100 tokens ≈ 75 words
- Punctuation counts as separate tokens
- Spaces count in tokenization
Examples:
- "Hello World" = 2 tokens
- "ChatGPT" = 2 tokens (Chat + GPT)
- "I love AI!" = 4 tokens (I, love, AI, !)
Calculate tokens in Python:
import tiktoken

def count_tokens(text, model="gpt-3.5-turbo"):
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

text = "How many tokens is this?"
print(f"Token count: {count_tokens(text)}")
Question 6: How do you give a chatbot memory of earlier messages in a conversation?
Answer:
The API is stateless, so you must manually maintain conversation history:
class ChatBot:
    def __init__(self):
        self.conversation_history = []
        self.system_message = {
            "role": "system",
            "content": "You are a helpful assistant."
        }

    def chat(self, user_message):
        # Add user message to history
        self.conversation_history.append({
            "role": "user",
            "content": user_message
        })
        # Prepare messages (system + history)
        messages = [self.system_message] + self.conversation_history
        # Get response
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=messages
        )
        # Add assistant response to history
        assistant_message = response.choices[0].message.content
        self.conversation_history.append({
            "role": "assistant",
            "content": assistant_message
        })
        return assistant_message

    def clear_history(self):
        self.conversation_history = []

# Usage
bot = ChatBot()
print(bot.chat("My name is Amit"))
print(bot.chat("What's my name?"))  # Will remember!
Important Considerations:
- Token limits - History consumes tokens
- Truncate old messages - Keep recent context
- Summarize long conversations - Compress history
- Store in database - For persistent memory
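The truncation idea above can be sketched as a helper that keeps only the most recent turns. A production bot would budget by token count (e.g. with tiktoken) rather than by message count:

```python
def trim_history(history, max_messages=10):
    """Drop the oldest messages so the prompt stays within the
    model's context window (simple count-based cap)."""
    return history[-max_messages:]

history = [{"role": "user", "content": f"message {i}"} for i in range(25)]
trimmed = trim_history(history)
print(len(trimmed), trimmed[0]["content"])  # 10 message 15
```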
