TL;DR
We will create an AI tool that creates slides from a PDF. I'll show you how to build a backend service that generates PowerPoint slides asynchronously using Python, Celery, and python-pptx. The backend simply accepts a PDF and returns slides as a pptx file. Exciting stuff, isn't it?
The architecture of this tool is heavily inspired by what we work on at SlideSpeak. SlideSpeak is an AI tool to create slides from PDF and more. The code for this tutorial is available here:
Here's what the results of the PDF to slides AI generator look like:
But since we all absolutely love PowerPoint slides, let's get into it.
What You'll Build
This tutorial will walk you through creating a backend service that:
- Provides a RESTful API to request slide generation
- Processes slide requests asynchronously with Celery
- Creates professional PowerPoint slides with python-pptx
- Supports multiple slide layouts (title, content, bullet points, etc.)
- Extracts text from PDF files
- Uses OpenAI to generate presentation content automatically
- Scales efficiently to handle multiple requests
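The request flow implied by this list (submit a job, get a task ID back, then check on it until the file is ready) can be sketched from the client's side. This is a minimal sketch rather than part of the tutorial's code: `poll_until_done` and the injected `fetch_status` callable are hypothetical names, and the status values `completed` and `failed` are the ones returned by the status endpoint shown later in this tutorial.

```python
import time
from typing import Callable, Dict


def poll_until_done(
    fetch_status: Callable[[str], Dict[str, str]],
    task_id: str,
    interval_seconds: float = 1.0,
    max_attempts: int = 30,
) -> Dict[str, str]:
    """Call fetch_status(task_id) until the task completes or fails.

    fetch_status is a hypothetical callable; in practice it would issue
    GET /api/presentations/{task_id} and return the decoded JSON body.
    """
    for _ in range(max_attempts):
        status = fetch_status(task_id)
        if status.get("status") in ("completed", "failed"):
            return status
        time.sleep(interval_seconds)
    raise TimeoutError(f"Task {task_id} did not finish after {max_attempts} checks")


# Stand-in for the HTTP call: two unfinished responses, then a completed one.
_responses = iter([
    {"task_id": "abc", "status": "pending"},
    {"task_id": "abc", "status": "started"},
    {"task_id": "abc", "status": "completed", "file_url": "/download/example.pptx"},
])
final = poll_until_done(lambda task_id: next(_responses), "abc", interval_seconds=0)
print(final["file_url"])
```

Passing the fetch function in as an argument keeps the loop independent of any particular HTTP library.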
Tech Stack
- FastAPI: For creating the RESTful API endpoints
- Celery: For handling asynchronous tasks
- Redis: As message broker and result backend for Celery
- python-pptx: For programmatically creating PowerPoint files
- PyPDF2: For extracting text from PDF files
- OpenAI API: For intelligent content generation
- Docker & Docker Compose: For containerizing the application
Architecture
Getting Started
Before diving into the code, let's understand the project structure:
presentation_generator/
├── app/
│   ├── __init__.py
│   ├── main.py             # FastAPI application
│   ├── models.py           # Pydantic models
│   ├── config.py           # Configuration
│   ├── ppt_generator.py    # slide generation logic
│   └── pdf_processor.py    # PDF processing and OpenAI integration
├── celery_app/
│   ├── __init__.py
│   ├── tasks.py            # Celery tasks
│   └── celery_config.py    # Celery configuration
├── requirements.txt
└── docker-compose.yml
Step 1: Setting Up the Environment
Let's start by creating our project directory and installing the required dependencies:
mkdir presentation_generator
cd presentation_generator
python -m venv venv
source venv/bin/activate # On Windows, use: venv\Scripts\activate
Now, create a requirements.txt file with the following dependencies:
fastapi==0.103.1
uvicorn==0.23.2
celery==5.3.4
redis==5.0.0
python-pptx==0.6.21
python-multipart==0.0.6
pydantic==2.3.0
pydantic-settings==2.0.3
pypdf2==3.0.1
openai==1.6.0
python-dotenv==1.0.0
Install these dependencies:
pip install -r requirements.txt
Step 2: Setting Up Configuration
Let's create a configuration file to manage our application settings. Create app/config.py:
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    APP_NAME: str = "Presentation Generator"
    REDIS_URL: str = "redis://localhost:6379/0"
    RESULT_BACKEND: str = "redis://localhost:6379/0"
    STORAGE_PATH: str = "./storage"
    OPENAI_API_KEY: str = ""

    class Config:
        env_file = ".env"


settings = Settings()
This configuration can be overridden with environment variables or values in a .env file. Note that we've added an OPENAI_API_KEY setting that we'll use later.
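To make the override order concrete, here is a dependency-free sketch of the same idea: a value from the environment wins over the default written in code. It deliberately uses only `os.environ` rather than pydantic-settings, and `load_setting` and `DEFAULTS` are hypothetical names, not part of the tutorial's code.

```python
import os

# Defaults copied from the Settings class above.
DEFAULTS = {
    "REDIS_URL": "redis://localhost:6379/0",
    "STORAGE_PATH": "./storage",
}


def load_setting(name: str) -> str:
    """Return the environment value for name, falling back to the coded default."""
    return os.environ.get(name, DEFAULTS[name])


# Simulate running inside Docker Compose, where REDIS_URL points at the redis service.
os.environ["REDIS_URL"] = "redis://redis:6379/0"
os.environ.pop("STORAGE_PATH", None)

redis_url = load_setting("REDIS_URL")        # taken from the environment
storage_path = load_setting("STORAGE_PATH")  # falls back to the default
print(redis_url, storage_path)
```

pydantic-settings applies the same precedence and additionally reads the .env file, with real environment variables taking priority over it.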
Step 3: Creating Data Models
Next, let's define our data models with Pydantic. Create app/models.py:
from enum import Enum
from typing import List, Optional

from pydantic import BaseModel


class SlideType(str, Enum):
    TITLE = "title"
    CONTENT = "content"
    IMAGE = "image"
    BULLET_POINTS = "bullet_points"
    TWO_COLUMN = "two_column"


class SlideContent(BaseModel):
    type: SlideType
    title: str
    content: Optional[str] = None
    image_url: Optional[str] = None
    bullet_points: Optional[List[str]] = None
    column1: Optional[str] = None
    column2: Optional[str] = None


class PresentationRequest(BaseModel):
    title: str
    author: str
    slides: List[SlideContent]
    theme: Optional[str] = "default"


# New model for PDF-based presentation requests
class PDFPresentationRequest(BaseModel):
    title: Optional[str] = None
    author: Optional[str] = "Generated Presentation"
    theme: Optional[str] = "default"
    num_slides: Optional[int] = 5


class PresentationResponse(BaseModel):
    task_id: str
    status: str = "pending"


class PresentationStatus(BaseModel):
    task_id: str
    status: str
    file_url: Optional[str] = None
    message: Optional[str] = None
We've added a new PDFPresentationRequest model for handling PDF uploads. This model allows customizing the title, author, theme, and number of slides to generate.
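To see how these models translate into a request body, here is a sketch of a JSON payload shaped like PresentationRequest. The titles, author, and slide text are made-up sample data, and `ALLOWED_SLIDE_TYPES` is a hypothetical constant that simply mirrors the SlideType values above.

```python
import json

# Mirrors the values of the SlideType enum.
ALLOWED_SLIDE_TYPES = {"title", "content", "image", "bullet_points", "two_column"}

# Illustrative payload; only fields defined on the models are used.
payload = {
    "title": "Quarterly Review",
    "author": "Jane Doe",
    "theme": "default",
    "slides": [
        {"type": "content", "title": "Summary", "content": "Revenue grew this quarter."},
        {"type": "bullet_points", "title": "Highlights", "bullet_points": ["New customers", "Lower churn"]},
        {"type": "two_column", "title": "Before and After", "column1": "Manual slides", "column2": "Generated slides"},
    ],
}

body = json.dumps(payload)  # what a client would send as the request body
slide_types = [slide["type"] for slide in json.loads(body)["slides"]]
unknown = [t for t in slide_types if t not in ALLOWED_SLIDE_TYPES]
print(slide_types, unknown)
```

Each slide only fills the optional fields that its type uses; the rest default to None on the Pydantic side.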
Step 4: Implementing the AI Slide Generator
Now, let's create the core AI slide generation logic. Create app/ppt_generator.py:
import os
import uuid

from pptx import Presentation

from app.config import settings
from app.models import SlideType, SlideContent, PresentationRequest


class PPTGenerator:
    def __init__(self):
        # Ensure storage directory exists
        os.makedirs(settings.STORAGE_PATH, exist_ok=True)

    def generate_presentation(self, request: PresentationRequest) -> str:
        """Generate a PowerPoint file from the request and return its path"""
        prs = Presentation()

        # Add title slide
        title_slide_layout = prs.slide_layouts[0]
        slide = prs.slides.add_slide(title_slide_layout)
        title = slide.shapes.title
        subtitle = slide.placeholders[1]
        title.text = request.title
        subtitle.text = f"By {request.author}"

        # Add content slides
        for slide_content in request.slides:
            self._add_slide(prs, slide_content)

        # Save the presentation under a unique file name
        file_id = str(uuid.uuid4())
        file_path = os.path.join(settings.STORAGE_PATH, f"{file_id}.pptx")
        prs.save(file_path)
        return file_path

    def _add_slide(self, prs: Presentation, content: SlideContent):
        """Add a slide based on its type and content"""
        if content.type == SlideType.TITLE:
            slide_layout = prs.slide_layouts[0]
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            subtitle = slide.placeholders[1]
            title.text = content.title
            if content.content:
                subtitle.text = content.content

        elif content.type == SlideType.CONTENT:
            slide_layout = prs.slide_layouts[1]
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            body = slide.placeholders[1]
            title.text = content.title
            if content.content:
                body.text = content.content

        elif content.type == SlideType.BULLET_POINTS:
            slide_layout = prs.slide_layouts[1]
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            body = slide.placeholders[1]
            title.text = content.title
            if content.bullet_points:
                tf = body.text_frame
                tf.clear()  # Leaves a single empty paragraph
                for i, point in enumerate(content.bullet_points):
                    # Reuse the first paragraph so no empty bullet is left on top
                    p = tf.paragraphs[0] if i == 0 else tf.add_paragraph()
                    p.text = point
                    p.level = 0

        elif content.type == SlideType.TWO_COLUMN:
            slide_layout = prs.slide_layouts[3]  # Assuming layout 3 is two-content
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            title.text = content.title
            # Handle columns - this may vary based on your pptx template
            left = slide.placeholders[1]
            right = slide.placeholders[2]
            if content.column1:
                left.text = content.column1
            if content.column2:
                right.text = content.column2

        elif content.type == SlideType.IMAGE:
            # Basic image slide
            slide_layout = prs.slide_layouts[5]  # Title-only layout
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            title.text = content.title
            # Note: In a real application, you would handle image downloads
            # and insertion here. For simplicity, we're omitting this.
This class handles the creation of PowerPoint slides using the python-pptx library. It supports different slide types and saves the generated files with unique IDs.
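The layout choice made for each slide type in the class above can be summarized as a lookup table. This is a minimal sketch with hypothetical names (`LAYOUT_INDEX`, `layout_index_for`); the indices are the ones PPTGenerator uses with the default template, and a custom template may order its layouts differently.

```python
from typing import Dict

# Slide type -> layout index, as used by PPTGenerator with the default template.
LAYOUT_INDEX: Dict[str, int] = {
    "title": 0,          # title slide
    "content": 1,        # title and content
    "bullet_points": 1,  # same layout, body filled paragraph by paragraph
    "two_column": 3,     # two content placeholders
    "image": 5,          # title only
}


def layout_index_for(slide_type: str) -> int:
    """Return the layout index for a slide type, or raise ValueError if unsupported."""
    try:
        return LAYOUT_INDEX[slide_type]
    except KeyError:
        raise ValueError(f"Unsupported slide type: {slide_type!r}") from None


print(layout_index_for("two_column"))
```

A table like this makes it easier to add slide types without growing the if/elif chain, though the per-type placeholder handling would still need its own code.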
Step 5: Setting Up Celery
Now, let's configure Celery for asynchronous task processing. First, create celery_app/celery_config.py:
from app.config import settings
broker_url = settings.REDIS_URL
result_backend = settings.RESULT_BACKEND
task_serializer = 'json'
result_serializer = 'json'
accept_content = ['json']
timezone = 'UTC'
task_track_started = True
worker_hijack_root_logger = False
Next, initialize the Celery application in celery_app/__init__.py:
from celery import Celery
from app.config import settings
app = Celery('presentation_generator')
app.config_from_object('celery_app.celery_config')
# Import tasks to ensure they're registered
from celery_app import tasks
Step 6: Creating Celery Tasks
Let's define our asynchronous task for generating slides. Create celery_app/tasks.py:
import os
import logging

from celery import shared_task

from app.models import PresentationRequest
from app.ppt_generator import PPTGenerator

logger = logging.getLogger(__name__)


@shared_task(bind=True)
def generate_presentation_task(self, request_dict):
    """Generate a PowerPoint presentation asynchronously"""
    try:
        # Convert dict back to PresentationRequest
        request = PresentationRequest(**request_dict)
        logger.info(f"Starting presentation generation for: {request.title}")

        # Generate the presentation
        generator = PPTGenerator()
        file_path = generator.generate_presentation(request)

        # In a real application, you might upload to S3 or similar
        file_url = f"/download/{os.path.basename(file_path)}"
        return {
            "status": "completed",
            "file_url": file_url,
            "message": "Presentation generated successfully"
        }
    except Exception:
        # Log with traceback and re-raise: Celery marks the task as FAILURE
        # and stores the exception, so no manual update_state call is needed.
        logger.exception("Error generating presentation")
        raise
This task will be processed asynchronously by Celery workers.
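Because the Celery configuration above sets task_serializer to JSON, whatever is passed to the task has to survive a JSON round trip, which is why the task receives a plain dict rather than a model instance. Here is a small sketch of that constraint using only the standard library; `is_json_safe` is a hypothetical helper and the request values are sample data.

```python
import json
from typing import Any


def is_json_safe(value: Any) -> bool:
    """Return True if value survives a JSON encode/decode round trip unchanged."""
    try:
        return json.loads(json.dumps(value)) == value
    except (TypeError, ValueError):
        return False


# A dict shaped like PresentationRequest is fine as a task argument.
request_dict = {"title": "Demo", "author": "Jane Doe", "slides": [], "theme": "default"}
dict_ok = is_json_safe(request_dict)

# An arbitrary Python object is not, and would fail when the task is queued.
object_ok = is_json_safe({"request": object()})
print(dict_ok, object_ok)
```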
Step 7: Creating the PDF Processor
Now, let's add the PDF processing functionality. Create app/pdf_processor.py:
import json
import os
import tempfile
from typing import Any, Dict, Optional

from openai import OpenAI
from PyPDF2 import PdfReader

from app.config import settings

# Cap on the characters sent to the model, to avoid token limits
MAX_PROMPT_CHARS = 10000


class PDFProcessor:
    def __init__(self):
        self.client = OpenAI(api_key=settings.OPENAI_API_KEY)

    def extract_text_from_pdf(self, pdf_content: bytes) -> str:
        """Extract text content from PDF bytes"""
        with tempfile.NamedTemporaryFile(delete=False) as temp:
            temp.write(pdf_content)
            temp_path = temp.name
        try:
            pdf = PdfReader(temp_path)
            text = ""
            for page in pdf.pages:
                # Guard against pages that yield no text (e.g. scanned images)
                text += (page.extract_text() or "") + "\n"
            return text
        finally:
            # Clean up the temp file
            if os.path.exists(temp_path):
                os.unlink(temp_path)

    def generate_presentation_content(
        self, text: str, title: Optional[str] = None, num_slides: int = 5
    ) -> Dict[str, Any]:
        """Generate presentation content using OpenAI"""
        # Prepare the system message
        system_message = f"""
        You are an expert presentation creator. Your task is to create a well-structured presentation
        from the provided text content. Extract the key points and organize them into a cohesive presentation.
        Create a presentation with the following:
        1. A title slide with an engaging title (if not provided) and subtitle
        2. {num_slides-1} content slides
        Structure the presentation logically and extract the most important information.
        """

        # Limit the text here, in code, so the note about it does not leak into the prompt
        excerpt = text[:MAX_PROMPT_CHARS]

        # Prepare the user message
        user_message = f"""
        Create a presentation based on the following content:
        {excerpt}
        Please structure your response in JSON format with the following structure:
        {{
            "title": "Main Title of Presentation",
            "slides": [
                {{
                    "type": "title",
                    "title": "Presentation Title",
                    "content": "Subtitle - e.g. Author's Name"
                }},
                {{
                    "type": "bullet_points",
                    "title": "Key Point 1",
                    "bullet_points": ["Point 1", "Point 2", "Point 3"]
                }},
                ...
            ]
        }}
        Ensure all slide content is concise and impactful. Use different slide types appropriately:
        - title: For title slides with a subtitle
        - content: For slides with paragraphs of text
        - bullet_points: For key points in a list format
        - two_column: For comparing information side by side
        """
        if title:
            user_message += f"\nUse '{title}' as the presentation title."

        # Call the OpenAI API
        response = self.client.chat.completions.create(
            model="gpt-4o",
            response_format={"type": "json_object"},
            messages=[
                {"role": "system", "content": system_message},
                {"role": "user", "content": user_message}
            ]
        )

        # Extract the response content and parse it as JSON
        content = response.choices[0].message.content
        presentation_data = json.loads(content)
        return presentation_data
This class handles the extraction of text from PDF files and uses OpenAI to generate presentation content based on that text. It uses PyPDF2 to read the PDF and extract text, then sends that text to OpenAI's API with specific instructions to create a well-structured presentation.
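Since the model's reply is free-form JSON, it can help to check it before building slides from it. The following is a sketch under stated assumptions: `parse_presentation_json` is a hypothetical helper, and silently dropping slides with an unknown type or a missing title is one possible policy, not what the tutorial's code does.

```python
import json
from typing import Any, Dict, List

# Mirrors the values of the SlideType enum.
ALLOWED_TYPES = {"title", "content", "image", "bullet_points", "two_column"}


def parse_presentation_json(raw: str) -> Dict[str, Any]:
    """Parse a model reply and keep only slides with a known type and a title."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")
    slides: List[Dict[str, Any]] = []
    for slide in data.get("slides", []):
        if isinstance(slide, dict) and slide.get("type") in ALLOWED_TYPES and slide.get("title"):
            slides.append(slide)
    return {"title": data.get("title"), "slides": slides}


# Sample reply containing one supported slide and one unsupported "chart" slide.
raw_response = (
    '{"title": "Demo", "slides": ['
    '{"type": "bullet_points", "title": "Key Points", "bullet_points": ["A", "B"]}, '
    '{"type": "chart", "title": "Unsupported"}]}'
)
parsed = parse_presentation_json(raw_response)
print(len(parsed["slides"]))
```

Without a check like this, an unexpected slide type surfaces later as a Pydantic validation error when the PresentationRequest is built.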
Step 8: Updating Celery Tasks
Next, let's update our Celery tasks to handle PDF processing. Modify celery_app/tasks.py:
import os
import logging

from celery import shared_task

from app.models import PresentationRequest, PDFPresentationRequest
from app.ppt_generator import PPTGenerator
from app.pdf_processor import PDFProcessor

logger = logging.getLogger(__name__)


@shared_task(bind=True)
def generate_presentation_task(self, request_dict):
    """Generate a PowerPoint presentation asynchronously"""
    try:
        # Convert dict back to PresentationRequest
        request = PresentationRequest(**request_dict)
        logger.info(f"Starting presentation generation for: {request.title}")

        # Generate the presentation
        generator = PPTGenerator()
        file_path = generator.generate_presentation(request)

        # In a real application, you might upload to S3 or similar
        file_url = f"/download/{os.path.basename(file_path)}"
        return {
            "status": "completed",
            "file_url": file_url,
            "message": "Presentation generated successfully"
        }
    except Exception:
        # Log with traceback and re-raise: Celery marks the task as FAILURE
        # and stores the exception, so no manual update_state call is needed.
        logger.exception("Error generating presentation")
        raise


@shared_task(bind=True)
def generate_presentation_from_pdf_task(self, pdf_text, request_dict):
    """Generate a PowerPoint presentation from PDF text asynchronously"""
    try:
        # Convert dict back to PDFPresentationRequest
        request = PDFPresentationRequest(**request_dict)
        logger.info("Starting presentation generation from PDF")

        # Process the PDF text with OpenAI
        processor = PDFProcessor()
        presentation_data = processor.generate_presentation_content(
            pdf_text,
            title=request.title,
            num_slides=request.num_slides
        )

        # Create a PresentationRequest from the generated content
        presentation_request = PresentationRequest(
            title=presentation_data.get("title", request.title or "Generated Presentation"),
            author=request.author,
            theme=request.theme,
            slides=presentation_data.get("slides", [])
        )

        # Generate the presentation
        generator = PPTGenerator()
        file_path = generator.generate_presentation(presentation_request)

        # In a real application, you might upload to S3 or similar
        file_url = f"/download/{os.path.basename(file_path)}"
        return {
            "status": "completed",
            "file_url": file_url,
            "message": "Presentation generated successfully from PDF"
        }
    except Exception:
        # Same pattern as above: log, re-raise, and let Celery record the failure
        logger.exception("Error generating presentation from PDF")
        raise
We've added a new task generate_presentation_from_pdf_task that takes the extracted PDF text and request details, then uses the PDF processor to generate presentation content with OpenAI.
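The step where the generated content is turned into request fields can be isolated and tested without Celery or OpenAI. This is a sketch with a hypothetical helper, `build_request_fields`. Unlike the task above, it also falls back when the model returns an empty title and trims the slide list to `num_slides`; both behaviors are assumptions added for illustration.

```python
from typing import Any, Dict, Optional


def build_request_fields(
    presentation_data: Dict[str, Any],
    requested_title: Optional[str],
    author: str,
    num_slides: int,
) -> Dict[str, Any]:
    """Combine model output with the user's request into PresentationRequest-like fields."""
    # Prefer the generated title, then the requested one, then a fixed default.
    title = presentation_data.get("title") or requested_title or "Generated Presentation"
    # Assumption: never pass on more slides than the user asked for.
    slides = list(presentation_data.get("slides", []))[:num_slides]
    return {"title": title, "author": author, "slides": slides}


# Sample model output with no title and one slide more than requested.
generated = {
    "slides": [
        {"type": "content", "title": "A", "content": "First"},
        {"type": "content", "title": "B", "content": "Second"},
        {"type": "content", "title": "C", "content": "Third"},
    ]
}
fields = build_request_fields(generated, "My Deck", "Jane Doe", num_slides=2)
print(fields["title"], len(fields["slides"]))
```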
Step 9: Updating the FastAPI Application
Now, let's set up our FastAPI application, including the PDF upload endpoint. Create (or update) app/main.py:
import os
from typing import Optional

from celery.result import AsyncResult
from fastapi import FastAPI, File, Form, HTTPException, UploadFile
from fastapi.responses import FileResponse
from fastapi.staticfiles import StaticFiles

from app.config import settings
from app.models import PresentationRequest, PDFPresentationRequest, PresentationResponse, PresentationStatus
from app.pdf_processor import PDFProcessor
from celery_app.tasks import generate_presentation_task, generate_presentation_from_pdf_task

app = FastAPI(title=settings.APP_NAME)

# StaticFiles raises at startup if the directory is missing, so create it first
os.makedirs(settings.STORAGE_PATH, exist_ok=True)

# Mount storage directory for file downloads
app.mount("/download", StaticFiles(directory=settings.STORAGE_PATH), name="download")


@app.post("/api/presentations", response_model=PresentationResponse)
async def create_presentation(request: PresentationRequest):
    """Submit a new presentation generation task"""
    # Submit task to Celery
    task = generate_presentation_task.delay(request.model_dump())
    return PresentationResponse(task_id=task.id)


@app.post("/api/presentations/from-pdf", response_model=PresentationResponse)
async def create_presentation_from_pdf(
    pdf_file: UploadFile = File(...),
    title: Optional[str] = Form(None),
    author: str = Form("Generated Presentation"),
    theme: str = Form("default"),
    num_slides: int = Form(5)
):
    """Submit a presentation generation task from PDF file"""
    # The filename may be missing, and the extension may be upper-case
    if not (pdf_file.filename or "").lower().endswith(".pdf"):
        raise HTTPException(status_code=400, detail="File must be a PDF")

    # Read PDF file content
    pdf_content = await pdf_file.read()

    # Extract text from PDF
    processor = PDFProcessor()
    pdf_text = processor.extract_text_from_pdf(pdf_content)

    # Create request object
    request = PDFPresentationRequest(
        title=title or f"Presentation based on {pdf_file.filename}",
        author=author,
        theme=theme,
        num_slides=num_slides
    )

    # Submit task to Celery
    task = generate_presentation_from_pdf_task.delay(pdf_text, request.model_dump())
    return PresentationResponse(task_id=task.id)


@app.get("/api/presentations/{task_id}", response_model=PresentationStatus)
async def get_presentation_status(task_id: str):
    """Get the status of a presentation generation task"""
    task_result = AsyncResult(task_id)
    if task_result.state == 'PENDING':
        return PresentationStatus(
            task_id=task_id,
            status="pending",
            message="Task is pending"
        )
    elif task_result.state == 'FAILURE':
        # For a failed task, info holds the raised exception, not a dict
        return PresentationStatus(
            task_id=task_id,
            status="failed",
            message=str(task_result.info) or "Unknown error"
        )
    elif task_result.state == 'SUCCESS':
        result = task_result.get()
        return PresentationStatus(
            task_id=task_id,
            status="completed",
            file_url=result.get('file_url'),
            message=result.get('message')
        )
    else:
        return PresentationStatus(
            task_id=task_id,
            status=task_result.state.lower(),
            message="Task is in progress"
        )


@app.get("/api/download/{file_id}")
async def download_presentation(file_id: str):
    """Download a generated presentation"""
    # Use only the base name so the path cannot leave the storage directory
    file_path = os.path.join(settings.STORAGE_PATH, os.path.basename(file_id))
    if not os.path.isfile(file_path):
        raise HTTPException(status_code=404, detail="File not found")
    return FileResponse(path=file_path, filename=f"presentation_{file_id}")
We've added a new endpoint /api/presentations/from-pdf that accepts PDF file uploads along with optional parameters like title, author, theme, and the number of slides to generate.
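The upload endpoint only looks at the file name, which a client can set freely. A slightly stricter check, sketched below, also looks at the first bytes of the upload, since well-formed PDF files begin with the marker `%PDF-`. `looks_like_pdf` is a hypothetical helper, not part of the tutorial's code.

```python
from typing import Optional

# Well-formed PDF files start with this marker, followed by the version number.
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(filename: Optional[str], content: bytes) -> bool:
    """Check both the file extension (case-insensitively) and the leading bytes."""
    has_pdf_name = bool(filename) and filename.lower().endswith(".pdf")
    has_pdf_header = content.startswith(PDF_MAGIC)
    return has_pdf_name and has_pdf_header


print(looks_like_pdf("Report.PDF", b"%PDF-1.7\n"))
print(looks_like_pdf("notes.pdf", b"just some text"))
print(looks_like_pdf(None, b"%PDF-1.4\n"))
```

In the endpoint, a check like this would run right after the file content is read and before any text extraction.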
Step 10: Containerizing with Docker
Let's update our Docker configuration to include the OpenAI API key. First, create a .env file:
APP_NAME=Presentation Generator
REDIS_URL=redis://redis:6379/0
RESULT_BACKEND=redis://redis:6379/0
STORAGE_PATH=/app/storage
OPENAI_API_KEY=your_openai_api_key_here
Next, update the docker-compose.yml file to include the OpenAI API key:
version: '3'
services:
  api:
    build: .
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
    volumes:
      - .:/app
      - presentation_data:/app/storage
    ports:
      - "8000:8000"
    depends_on:
      - redis
    environment:
      - REDIS_URL=redis://redis:6379/0
      - RESULT_BACKEND=redis://redis:6379/0
      - OPENAI_API_KEY=${OPENAI_API_KEY}
  worker:
    build: .
    command: celery -A celery_app worker --loglevel=info
    volumes:
      - .:/app
      - presentation_data:/app/storage
    depends_on:
      - redis
    environment:
      - REDIS_URL=redis://redis:6379/0
      - RESULT_BACKEND=redis://redis:6379/0
      - OPENAI_API_KEY=${OPENAI_API_KEY}
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
volumes:
  presentation_data:
This setup will pass your OpenAI API key from the .env file to the containerized services. Note that build: . expects a Dockerfile in the project root, which is not shown in this tutorial.
This architecture offers several benefits:
- Separation of concerns - API, task processing, and presentation generation are separate
- Asynchronous processing - Long-running tasks don't block the API
- Containerization - Easy deployment and scaling
- Type safety - Pydantic models ensure data validation
You can extend this project in many ways, such as adding more slide types, integrating with data visualization libraries, or implementing template management.
Feel free to customize this service to fit your specific needs and save yourself from the drudgery of creating presentations manually!
GitHub Repository
The complete code for this tutorial is available on GitHub.
If you found this tutorial helpful, give it a ❤️ and share it with others who might benefit from creating slides with AI!
Happy coding!