OpenAI API - Complete Guide
Master AI Integration with Practical Examples
• Introduction to OpenAI API
○ What is OpenAI API?
The OpenAI API is a powerful cloud-based service that provides access to advanced artificial intelligence models. It allows developers to integrate AI capabilities like text generation, code completion, language translation, and more into their applications.
○ Key Features
- Natural Language Processing: Understand and generate human-like text
- Code Generation: Write and debug code in multiple programming languages
- Language Translation: Translate between different languages
- Content Creation: Generate articles, stories, and creative content
- Data Analysis: Extract insights from text data
- Conversation AI: Build chatbots and virtual assistants
○ Mind Map: OpenAI API Ecosystem
• Getting Started
○ Prerequisites
- OpenAI Account: Sign up at platform.openai.com
- API Key: Generate your unique API key
- Programming Knowledge: Basic understanding of Python, JavaScript, or similar
- HTTP/REST Concepts: Familiarity with API requests
○ Installation Process
1. Visit platform.openai.com and sign up for an account
2. Check your inbox and confirm your email address
3. Navigate to the API keys section and generate a key
4. Install the client library: pip install openai
5. Run sample code to verify the setup
6. If the test fails, check your credentials and try again
○ Setup Code Examples
Python Installation:
# Install OpenAI Python library
pip install openai
# Import and configure
import openai
openai.api_key = "your-api-key-here"
JavaScript/Node.js Installation:
// Install OpenAI npm package
npm install openai
// Import and configure
const { Configuration, OpenAIApi } = require("openai");
const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
});
const openai = new OpenAIApi(configuration);
• Authentication & API Keys
○ Understanding API Keys
API keys are secure tokens that authenticate your requests to OpenAI's servers. They act as your digital signature and are linked to your billing account.
○ Security Best Practices Table
| Practice | Description | Importance |
|---|---|---|
| Never Commit to Git | Don't include API keys in version control | ⭐⭐⭐⭐⭐ |
| Use Environment Variables | Store keys in .env files | ⭐⭐⭐⭐⭐ |
| Rotate Keys Regularly | Change keys every 3-6 months | ⭐⭐⭐⭐ |
| Use Multiple Keys | Different keys for dev/staging/prod | ⭐⭐⭐⭐ |
| Monitor Usage | Track API calls for anomalies | ⭐⭐⭐⭐ |
○ Implementation Example
Secure Key Management (Python):
# Using environment variables
import os
from dotenv import load_dotenv

load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")

# Using a config file (not recommended for production)
import json

with open('config.json') as f:
    config = json.load(f)
api_key = config['api_key']
• Understanding Models
○ Available Models Comparison
| Model Name | Description | Max Tokens | Best For |
|---|---|---|---|
| GPT-4 | Most capable model, best reasoning | 8,192 | Complex tasks, analysis |
| GPT-4-32k | Extended context version | 32,768 | Long documents, detailed content |
| GPT-3.5-Turbo | Fast, cost-effective | 4,096 | Chatbots, quick responses |
| GPT-3.5-Turbo-16k | Extended context, affordable | 16,384 | Medium-length content |
| Text-Davinci-003 | Legacy model, versatile | 4,097 | General purpose |
○ Model Selection Flowchart
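The flowchart's decision logic can be sketched as a small helper. The thresholds below are assumptions drawn from the token limits in the comparison table, not an official recommendation:

```python
def choose_model(needs_strong_reasoning: bool, context_tokens: int) -> str:
    """Pick a model from the comparison table above.

    Heuristic only: prefer the cheaper GPT-3.5 family, step up to
    GPT-4 for hard reasoning, and to the extended-context variants
    when the prompt exceeds the base model's token limit.
    """
    if needs_strong_reasoning:
        return "gpt-4-32k" if context_tokens > 8192 else "gpt-4"
    return "gpt-3.5-turbo-16k" if context_tokens > 4096 else "gpt-3.5-turbo"

print(choose_model(False, 1200))   # gpt-3.5-turbo
print(choose_model(True, 20000))   # gpt-4-32k
```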
• Pricing Structure
○ Current Pricing (in Indian Rupees)
Note: Prices are approximate and based on 1 USD ≈ ₹83. Always check official OpenAI website for current rates.
| Model | Input (per 1K tokens) | Output (per 1K tokens) | Use Case |
|---|---|---|---|
| GPT-4 | ₹2.49 | ₹4.98 | Premium applications |
| GPT-4-32k | ₹4.98 | ₹9.96 | Extended context needs |
| GPT-3.5-Turbo | ₹0.12 | ₹0.17 | Cost-effective solutions |
| GPT-3.5-Turbo-16k | ₹0.25 | ₹0.33 | Balanced performance |
○ Cost Estimation Calculator
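A rough monthly estimate can be computed directly from the pricing table above. The rates below are the approximate ₹ figures quoted there; actual billing follows OpenAI's current USD prices:

```python
# Approximate rates from the pricing table: (input, output) in ₹ per 1K tokens
RATES = {
    "gpt-4": (2.49, 4.98),
    "gpt-4-32k": (4.98, 9.96),
    "gpt-3.5-turbo": (0.12, 0.17),
    "gpt-3.5-turbo-16k": (0.25, 0.33),
}

def estimate_monthly_cost(model, requests_per_day,
                          input_tokens=500, output_tokens=500, days=30):
    """Rough monthly cost in ₹ for a steady request volume."""
    rate_in, rate_out = RATES[model]
    per_request = (input_tokens / 1000) * rate_in + (output_tokens / 1000) * rate_out
    return per_request * requests_per_day * days

# 1,000 requests/day on GPT-3.5-Turbo, 500 tokens each way
print(f"₹{estimate_monthly_cost('gpt-3.5-turbo', 1000):,.2f}")  # ₹4,350.00
```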
• Chat Completion API
○ Understanding the Chat Format
The Chat Completion API uses a conversational format with messages. Each message has a role (system, user, or assistant) and content.
○ Message Roles Table
| Role | Purpose | Example |
|---|---|---|
| system | Set behavior and context | "You are a helpful coding assistant" |
| user | User's input/question | "How do I create a loop in Python?" |
| assistant | AI's previous responses | "Here's how to create a loop..." |
○ Python Implementation Example
import openai
# Set your API key
openai.api_key = "your-api-key-here"
# Basic chat completion
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms"}
    ],
    temperature=0.7,
    max_tokens=500
)

# Extract response
answer = response.choices[0].message.content
print(answer)
○ JavaScript Implementation Example
const { Configuration, OpenAIApi } = require("openai");

// Configure OpenAI
const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
});
const openai = new OpenAIApi(configuration);

// Chat completion function
async function chatCompletion() {
  try {
    const response = await openai.createChatCompletion({
      model: "gpt-3.5-turbo",
      messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Explain machine learning" }
      ],
      temperature: 0.7,
      max_tokens: 500
    });
    console.log(response.data.choices[0].message.content);
  } catch (error) {
    console.error("Error:", error);
  }
}

chatCompletion();
• Advanced Features
○ Temperature & Top-P Settings
| Parameter | Range | Effect | Best Use |
|---|---|---|---|
| Temperature | 0.0 - 2.0 | Controls randomness | 0.7 for balanced output |
| Top-P | 0.0 - 1.0 | Nucleus sampling | 0.9 for diverse responses |
| Max Tokens | 1 - model limit | Response length | Based on use case |
| Frequency Penalty | -2.0 - 2.0 | Reduce repetition | 0.5 to avoid redundancy |
| Presence Penalty | -2.0 - 2.0 | Encourage new topics | 0.6 for variety |
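These tuning parameters combine into a single request. The sketch below builds the payload with the table's suggested values; any of them can be omitted to fall back to the API defaults:

```python
# Request payload combining the tuning parameters from the table above
request = dict(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Suggest three blog post topics"}],
    temperature=0.7,        # balanced randomness
    top_p=0.9,              # nucleus sampling: keep the top 90% probability mass
    max_tokens=200,         # cap the response length
    frequency_penalty=0.5,  # discourage repeating the same tokens
    presence_penalty=0.6,   # nudge the model toward new topics
)
# response = openai.ChatCompletion.create(**request)
```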
○ Streaming Responses
Streaming allows you to receive responses in real-time as they're generated, improving user experience for long responses.
Python Streaming Example:
import openai
openai.api_key = "your-api-key-here"
# Stream responses
for chunk in openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
):
    content = chunk.choices[0].delta.get("content", "")
    print(content, end="", flush=True)
○ Function Calling
Function calling enables the model to intelligently call external functions and APIs based on user requests.
functions = [
    {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"]
                }
            },
            "required": ["location"]
        }
    }
]

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "What's the weather in Mumbai?"}],
    functions=functions,
    function_call="auto"
)
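The call above only gets the model to request a function; your code must run it and send the result back. A sketch of that second half, using a simulated assistant message in place of a live response and a hypothetical local get_weather implementation:

```python
import json

# Hypothetical local implementation backing the get_weather schema above
def get_weather(location, unit="celsius"):
    return f"28 degrees {unit} and sunny in {location}"  # placeholder data

# Simulated shape of response.choices[0].message when the model decides
# to call the function; note the arguments arrive as a JSON string.
message = {
    "role": "assistant",
    "content": None,
    "function_call": {
        "name": "get_weather",
        "arguments": '{"location": "Mumbai"}',
    },
}

if message.get("function_call"):
    args = json.loads(message["function_call"]["arguments"])
    result = get_weather(**args)
    # In a live app, append `message` plus
    # {"role": "function", "name": "get_weather", "content": result}
    # and call openai.ChatCompletion.create again for the final answer.
    print(result)  # 28 degrees celsius and sunny in Mumbai
```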
• Best Practices
○ Prompt Engineering Tips
- Be Specific: Provide clear, detailed instructions
- Use Examples: Show the model what you want with examples
- Set Context: Use system messages to define behavior
- Break Complex Tasks: Divide into smaller steps
- Iterate: Refine prompts based on results
- Use Delimiters: Separate different sections clearly
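Several of these tips can be combined in a small prompt builder. The function below is an illustrative sketch; the summarization task and the word limits are arbitrary choices:

```python
def build_summary_prompt(document: str) -> str:
    """Apply the tips above: a specific instruction, an explicit
    output format, and clear delimiters around the input text."""
    return (
        "Summarize the text between <<< and >>> "
        "as exactly three bullet points, each under 15 words.\n\n"
        f"<<<\n{document}\n>>>"
    )

print(build_summary_prompt("The OpenAI API exposes language models over HTTPS."))
```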
○ Error Handling Best Practices
| Error Type | Cause | Solution |
|---|---|---|
| Rate Limit | Too many requests | Implement exponential backoff |
| Token Limit | Input/output too long | Truncate or split content |
| Invalid API Key | Wrong/expired key | Verify and regenerate key |
| Timeout | Request took too long | Increase timeout, reduce tokens |
| Server Error | OpenAI service issue | Retry with exponential backoff |
○ Error Handling Code Example
import openai
import time
def call_openai_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = openai.ChatCompletion.create(
                model="gpt-3.5-turbo",
                messages=messages,
                request_timeout=30
            )
            return response
        except openai.error.RateLimitError:
            wait_time = 2 ** attempt
            print(f"Rate limit hit. Waiting {wait_time} seconds...")
            time.sleep(wait_time)
        except openai.error.APIError as e:
            print(f"API Error: {e}")
            if attempt == max_retries - 1:
                raise
        except Exception as e:
            print(f"Unexpected error: {e}")
            raise
    raise Exception("Max retries exceeded")
• Real-World Use Cases
○ Common Applications
1. Customer Support Chatbot
def customer_support_bot(user_message, conversation_history):
    system_prompt = """You are a helpful customer support agent for an
    e-commerce company. Be polite, professional, and solve issues efficiently.
    If you cannot help, escalate to human support."""
    messages = [{"role": "system", "content": system_prompt}]
    messages.extend(conversation_history)
    messages.append({"role": "user", "content": user_message})
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages,
        temperature=0.5
    )
    return response.choices[0].message.content
2. Content Generation Tool
def generate_blog_post(topic, keywords, word_count):
    prompt = f"""Write a {word_count}-word blog post about {topic}.
    Include these keywords: {', '.join(keywords)}
    Make it engaging, informative, and SEO-friendly.
    Include an introduction, main body with subheadings, and conclusion."""
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are a professional content writer."},
            {"role": "user", "content": prompt}
        ],
        temperature=0.7,
        max_tokens=word_count * 2
    )
    return response.choices[0].message.content
3. Code Review Assistant
def review_code(code, language):
    prompt = f"""Review this {language} code and provide:
    1. Potential bugs or issues
    2. Performance improvements
    3. Best practice suggestions
    4. Security concerns

    Code:
    ```{language}
    {code}
    ```"""
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are an expert code reviewer."},
            {"role": "user", "content": prompt}
        ],
        temperature=0.3
    )
    return response.choices[0].message.content
○ Use Case Comparison Table
| Use Case | Best Model | Avg Cost/Request (₹) | Implementation Difficulty |
|---|---|---|---|
| Simple Chatbot | GPT-3.5-Turbo | ₹0.05 | Easy |
| Content Writing | GPT-4 | ₹2.50 | Medium |
| Code Generation | GPT-4 | ₹3.00 | Medium |
| Data Analysis | GPT-4-32k | ₹8.00 | Hard |
| Translation | GPT-3.5-Turbo | ₹0.08 | Easy |
• Troubleshooting
○ Common Issues & Solutions
Issue 1: "Invalid API Key" Error
Symptoms: Authentication fails, 401 error
Solutions:
- Verify API Key: Check for typos or spaces
- Check Expiration: Ensure key is active
- Regenerate Key: Create new key if needed
- Environment Variable: Confirm proper loading
Issue 2: Rate Limit Exceeded
Symptoms: 429 error, requests failing
Solutions:
- Implement Backoff: Wait before retrying
- Reduce Frequency: Batch requests
- Upgrade Tier: Increase rate limits
- Cache Results: Store common responses
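The caching fix above takes only a few lines. In this sketch, `fake_create` is a stand-in for `openai.ChatCompletion.create`; note that only deterministic (low-temperature) requests are worth caching:

```python
import hashlib

_cache = {}

def cached_completion(messages, create_fn):
    """Serve repeated identical requests from memory to cut
    rate-limit pressure and cost."""
    key = hashlib.sha256(repr(messages).encode()).hexdigest()
    if key not in _cache:
        _cache[key] = create_fn(messages)
    return _cache[key]

calls = []
def fake_create(messages):  # stand-in for the real API call
    calls.append(messages)
    return "Our return window is 30 days."

msgs = [{"role": "user", "content": "What is your return policy?"}]
cached_completion(msgs, fake_create)
cached_completion(msgs, fake_create)
print(len(calls))  # 1 -- the second call was served from the cache
```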
Issue 3: Unexpected Response Quality
Symptoms: Irrelevant or poor responses
Solutions:
- Refine Prompt: Be more specific
- Adjust Temperature: Lower for consistency
- Use Examples: Show desired format
- Try Different Model: Upgrade to GPT-4
○ Debugging Flowchart
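The flowchart's branching can be sketched as a helper that maps a failed request to the first-line fix from the issues above. Matching on exception class names is a simplification for illustration:

```python
def diagnose(error: Exception) -> str:
    """Map a failed request to the first-line fix suggested above."""
    name = type(error).__name__
    if "Authentication" in name:
        return "Verify OPENAI_API_KEY for typos; regenerate the key if expired."
    if "RateLimit" in name:
        return "Wait with exponential backoff; batch or cache requests."
    if "Timeout" in name:
        return "Increase the request timeout or reduce max_tokens."
    return "Retry with backoff; if the error persists, check OpenAI's status page."

class RateLimitError(Exception):  # stands in for openai.error.RateLimitError
    pass

print(diagnose(RateLimitError()))
```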
• Practice Questions & Answers
Question 1: What are the main differences between GPT-4 and GPT-3.5-Turbo, and when should you use each?
Answer:
GPT-4:
- More capable - Better reasoning and understanding
- More accurate - Fewer mistakes and hallucinations
- Better context handling - Follows complex instructions
- Higher cost - ₹2.49 per 1K input tokens
GPT-3.5-Turbo:
- Faster responses - Lower latency
- Cost-effective - ₹0.12 per 1K input tokens
- Good for simple tasks - Chatbots, basic queries
- Sufficient for most use cases - Great performance/cost ratio
Question 2: How can you reduce your OpenAI API costs?
Answer:
Cost Reduction Strategies:
- Use GPT-3.5-Turbo - 20x cheaper than GPT-4
- Optimize prompts - Be concise, avoid unnecessary tokens
- Implement caching - Store and reuse common responses
- Set max_tokens - Limit response length
- Batch requests - Combine multiple queries
- Use streaming - Stop generation when needed
- Monitor usage - Track and optimize high-cost queries
Question 3: What does the temperature parameter control, and what values should you use?
Answer:
Temperature controls the randomness of the AI's responses:
| Temperature | Effect | Best Use |
|---|---|---|
| 0.0 - 0.3 | Very deterministic, focused | Factual answers, code generation |
| 0.4 - 0.7 | Balanced creativity/consistency | General purpose, chatbots |
| 0.8 - 1.0 | More creative, varied | Creative writing, brainstorming |
| 1.1 - 2.0 | Highly random, experimental | Art, experimental content |
Question 4: How should your application handle rate limit errors?
Answer:
Implement Exponential Backoff Strategy:
import time
import random
import openai

def api_call_with_backoff(messages, max_retries=5):
    for attempt in range(max_retries):
        try:
            return openai.ChatCompletion.create(
                model="gpt-3.5-turbo",
                messages=messages
            )
        except openai.error.RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff with jitter
            wait_time = (2 ** attempt) + random.uniform(0, 1)
            print(f"Rate limit hit. Waiting {wait_time:.2f}s")
            time.sleep(wait_time)
Additional Tips:
- Monitor rate limits - Check usage dashboard
- Implement queuing - Process requests sequentially
- Upgrade tier - Increase limits if needed
- Use batch API - For non-urgent requests
Question 5: What are tokens, and how are they counted?
Answer:
Tokens are pieces of text that the model processes. They can be words, parts of words, or punctuation.
Token Counting Rules:
- 1 token ≈ 4 characters in English
- 1 token ≈ ¾ of a word on average
- 100 tokens ≈ 75 words
- Punctuation counts as separate tokens
- Spaces count in tokenization
Examples:
- "Hello World" = 2 tokens
- "ChatGPT" = 2 tokens (Chat + GPT)
- "I love AI!" = 4 tokens (I, love, AI, !)
Calculate tokens in Python:
import tiktoken

def count_tokens(text, model="gpt-3.5-turbo"):
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

text = "How many tokens is this?"
print(f"Token count: {count_tokens(text)}")
Question 6: How do you give a chatbot memory of earlier messages in a conversation?
Answer:
The API is stateless, so you must manually maintain conversation history:
class ChatBot:
    def __init__(self):
        self.conversation_history = []
        self.system_message = {
            "role": "system",
            "content": "You are a helpful assistant."
        }

    def chat(self, user_message):
        # Add user message to history
        self.conversation_history.append({
            "role": "user",
            "content": user_message
        })
        # Prepare messages (system + history)
        messages = [self.system_message] + self.conversation_history
        # Get response
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=messages
        )
        # Add assistant response to history
        assistant_message = response.choices[0].message.content
        self.conversation_history.append({
            "role": "assistant",
            "content": assistant_message
        })
        return assistant_message

    def clear_history(self):
        self.conversation_history = []

# Usage
bot = ChatBot()
print(bot.chat("My name is Amit"))
print(bot.chat("What's my name?"))  # Will remember!
Important Considerations:
- Token limits - History consumes tokens
- Truncate old messages - Keep recent context
- Summarize long conversations - Compress history
- Store in database - For persistent memory
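The truncation idea above can be sketched as a helper that keeps only the most recent turns. A production bot would budget by token count (e.g. with tiktoken) rather than by message count:

```python
def trim_history(history, max_messages=10):
    """Drop the oldest messages so the prompt stays within the
    model's context window (simple count-based cap)."""
    return history[-max_messages:]

history = [{"role": "user", "content": f"message {i}"} for i in range(25)]
trimmed = trim_history(history)
print(len(trimmed), trimmed[0]["content"])  # 10 message 15
```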
