Kevin Goedecke

Posted on Apr 21

Create a PDF to Slide AI Generator with Python, Celery, and python-pptx 🔥🚀

#ai #celery #python #fastapi

TL;DR

We will build an AI tool that creates slides from a PDF. I'll show you how to build a backend service that generates PowerPoint slides asynchronously using Python, Celery, and python-pptx. The backend accepts a PDF and returns the slides as a .pptx file. Exciting stuff, isn't it?

The architecture of this tool is heavily inspired by what we work on at SlideSpeak. SlideSpeak is an AI tool to create slides from PDF and more. The code for this tutorial is available here:

Here's what the results of the PDF to slides AI generator look like:

But since we all absolutely love PowerPoint slides, let's get into it.

What You'll Build

This tutorial will walk you through creating a backend service that:

Provides a RESTful API to request slide generation
Processes slide requests asynchronously with Celery
Creates professional PowerPoint slides with python-pptx
Supports multiple slide layouts (title, content, bullet points, etc.)
Extracts text from PDF files
Uses OpenAI to generate presentation content automatically
Scales efficiently to handle multiple requests

Tech Stack

FastAPI: For creating the RESTful API endpoints
Celery: For handling asynchronous tasks
Redis: As message broker and result backend for Celery
python-pptx: For programmatically creating PowerPoint files
PyPDF2: For extracting text from PDF files
OpenAI API: For intelligent content generation
Docker & Docker Compose: For containerizing the application

Architecture

Getting Started

Before diving into the code, let's understand the project structure:

presentation_generator/
├── app/
│   ├── __init__.py
│   ├── main.py              # FastAPI application
│   ├── models.py            # Pydantic models
│   ├── config.py            # Configuration
│   ├── ppt_generator.py     # slide generation logic
│   └── pdf_processor.py     # PDF processing and OpenAI integration
├── celery_app/
│   ├── __init__.py  
│   ├── tasks.py             # Celery tasks
│   └── celery_config.py     # Celery configuration
├── requirements.txt
└── docker-compose.yml

Step 1: Setting Up the Environment

Let's start by creating our project directory and installing the required dependencies:

mkdir presentation_generator
cd presentation_generator
python -m venv venv
source venv/bin/activate  # On Windows, use: venv\Scripts\activate

Now, create a requirements.txt file with the following dependencies:

fastapi==0.103.1
uvicorn==0.23.2
celery==5.3.4
redis==5.0.0
python-pptx==0.6.21
python-multipart==0.0.6
pydantic==2.3.0
pydantic-settings==2.0.3
pypdf2==3.0.1
openai==1.6.0
python-dotenv==1.0.0

Install these dependencies:

pip install -r requirements.txt

Step 2: Setting Up Configuration

Let's create a configuration file to manage our application settings. Create app/config.py:

from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    # Load overrides from a .env file if present (pydantic v2 style config)
    model_config = SettingsConfigDict(env_file=".env")

    APP_NAME: str = "Presentation Generator"
    REDIS_URL: str = "redis://localhost:6379/0"
    RESULT_BACKEND: str = "redis://localhost:6379/0"
    STORAGE_PATH: str = "./storage"
    OPENAI_API_KEY: str = ""

settings = Settings()

This configuration can be overridden with environment variables or values in a .env file. Note that we've added an OPENAI_API_KEY setting that we'll use later.

Step 3: Creating Data Models

Next, let's define our data models with Pydantic. Create app/models.py:

from pydantic import BaseModel, Field
from typing import List, Optional
from enum import Enum

class SlideType(str, Enum):
    TITLE = "title"
    CONTENT = "content"
    IMAGE = "image"
    BULLET_POINTS = "bullet_points"
    TWO_COLUMN = "two_column"

class SlideContent(BaseModel):
    type: SlideType
    title: str
    content: Optional[str] = None
    image_url: Optional[str] = None
    bullet_points: Optional[List[str]] = None
    column1: Optional[str] = None
    column2: Optional[str] = None

class PresentationRequest(BaseModel):
    title: str
    author: str
    slides: List[SlideContent]
    theme: Optional[str] = "default"

# New model for PDF-based presentation requests
class PDFPresentationRequest(BaseModel):
    title: Optional[str] = None
    author: Optional[str] = "Generated Presentation"
    theme: Optional[str] = "default"
    num_slides: Optional[int] = 5

class PresentationResponse(BaseModel):
    task_id: str
    status: str = "pending"

class PresentationStatus(BaseModel):
    task_id: str
    status: str
    file_url: Optional[str] = None
    message: Optional[str] = None

We've added a new PDFPresentationRequest model for handling PDF uploads. This model allows customizing the title, author, theme, and number of slides to generate.
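
To make the shape of these models concrete, here is a small sketch of a JSON body that should validate against PresentationRequest. The field names come from the models above; the values (title, author, slide texts) are made up for illustration.

```python
import json

# Hypothetical example payload for POST /api/presentations.
# Field names follow PresentationRequest and SlideContent;
# the values are invented for illustration only.
payload = {
    "title": "Quarterly Review",
    "author": "Jane Doe",
    "theme": "default",
    "slides": [
        {"type": "title", "title": "Quarterly Review", "content": "Q1 results"},
        {
            "type": "bullet_points",
            "title": "Highlights",
            "bullet_points": ["Revenue up", "Churn down", "Two new hires"],
        },
        {
            "type": "two_column",
            "title": "Before and After",
            "column1": "Manual slide creation",
            "column2": "Generated slides",
        },
    ],
}

# Serialize the payload the way an HTTP client would send it
body = json.dumps(payload, indent=2)
print(body)
```

Each slide only fills in the optional fields that its type uses; everything else defaults to None.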

Step 4: Implementing the AI Slide Generator

Now, let's create the core AI slide generation logic. Create app/ppt_generator.py:

import os
from pathlib import Path
import uuid
from pptx import Presentation
from pptx.util import Inches, Pt
from app.models import SlideType, SlideContent, PresentationRequest
from app.config import settings

class PPTGenerator:
    def __init__(self):
        # Ensure storage directory exists
        os.makedirs(settings.STORAGE_PATH, exist_ok=True)

    def generate_presentation(self, request: PresentationRequest) -> str:
        """Generate a PowerPoint presentation and return the path of the saved file"""
        prs = Presentation()

        # Add title slide
        title_slide_layout = prs.slide_layouts[0]
        slide = prs.slides.add_slide(title_slide_layout)
        title = slide.shapes.title
        subtitle = slide.placeholders[1]
        title.text = request.title
        subtitle.text = f"By {request.author}"

        # Add content slides
        for slide_content in request.slides:
            self._add_slide(prs, slide_content)

        # Save the presentation
        file_id = str(uuid.uuid4())
        file_path = os.path.join(settings.STORAGE_PATH, f"{file_id}.pptx")
        prs.save(file_path)

        return file_path

    def _add_slide(self, prs: Presentation, content: SlideContent):
        """Add a slide based on its type and content"""
        if content.type == SlideType.TITLE:
            slide_layout = prs.slide_layouts[0]
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            subtitle = slide.placeholders[1]
            title.text = content.title
            if content.content:
                subtitle.text = content.content

        elif content.type == SlideType.CONTENT:
            slide_layout = prs.slide_layouts[1]
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            body = slide.placeholders[1]
            title.text = content.title
            if content.content:
                body.text = content.content

        elif content.type == SlideType.BULLET_POINTS:
            slide_layout = prs.slide_layouts[1]
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            body = slide.placeholders[1]
            title.text = content.title

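            if content.bullet_points:
                tf = body.text_frame

                for i, point in enumerate(content.bullet_points):
                    # Reuse the placeholder's first (empty) paragraph so the
                    # list does not start with a blank bullet
                    p = tf.paragraphs[0] if i == 0 else tf.add_paragraph()
                    p.text = point
                    p.level = 0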

        elif content.type == SlideType.TWO_COLUMN:
            slide_layout = prs.slide_layouts[3]  # Assuming layout 3 is two-content
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            title.text = content.title

            # Handle columns - this may vary based on your pptx template
            left = slide.placeholders[1]
            right = slide.placeholders[2]

            if content.column1:
                left.text = content.column1
            if content.column2:
                right.text = content.column2

        elif content.type == SlideType.IMAGE:
            # Basic image slide
            slide_layout = prs.slide_layouts[5]  # Blank slide with title
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            title.text = content.title

            # Note: In a real application, you would handle image downloads
            # and insertion here. For simplicity, we're omitting this.

This class handles the creation of PowerPoint slides using the python-pptx library. It supports different slide types and saves the generated files with unique IDs.
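
The if/elif chain in _add_slide boils down to a mapping from slide type to layout index. As a rough summary, the sketch below collects the indices used above into a lookup table. The layout names in the comments refer to python-pptx's built-in default template, and layout_index_for is a hypothetical helper, not part of the class; a custom template may order its layouts differently.

```python
# Layout indices used by _add_slide, assuming python-pptx's default template.
LAYOUT_INDEX = {
    "title": 0,          # Title Slide
    "content": 1,        # Title and Content
    "bullet_points": 1,  # Title and Content
    "two_column": 3,     # Two Content
    "image": 5,          # Title Only
}


def layout_index_for(slide_type: str) -> int:
    """Return the layout index for a slide type (hypothetical helper).

    Unknown types fall back to the "Title and Content" layout.
    """
    return LAYOUT_INDEX.get(slide_type, 1)


print(layout_index_for("two_column"))
print(layout_index_for("unknown"))
```

If you switch to your own .pptx template, this table is the one place you would need to re-check.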

Step 5: Setting Up Celery

Now, let's configure Celery for asynchronous task processing. First, create celery_app/celery_config.py:

from app.config import settings

broker_url = settings.REDIS_URL
result_backend = settings.RESULT_BACKEND
task_serializer = 'json'
result_serializer = 'json'
accept_content = ['json']
timezone = 'UTC'
task_track_started = True
worker_hijack_root_logger = False

Next, initialize the Celery application in celery_app/__init__.py:

from celery import Celery
from app.config import settings

app = Celery('presentation_generator')
app.config_from_object('celery_app.celery_config')

# Import tasks to ensure they're registered
from celery_app import tasks

Step 6: Creating Celery Tasks

Let's define our asynchronous task for generating slides. Create celery_app/tasks.py:

import os
import logging
from celery import shared_task
from app.models import PresentationRequest
from app.ppt_generator import PPTGenerator

logger = logging.getLogger(__name__)

@shared_task(bind=True)
def generate_presentation_task(self, request_dict):
    """Generate a PowerPoint presentation asynchronously"""
    try:
        # Convert dict back to PresentationRequest
        request = PresentationRequest(**request_dict)

        logger.info(f"Starting presentation generation for: {request.title}")

        # Generate the presentation
        generator = PPTGenerator()
        file_path = generator.generate_presentation(request)

        # In a real application, you might upload to S3 or similar
        file_url = f"/download/{os.path.basename(file_path)}"

        return {
            "status": "completed",
            "file_url": file_url,
            "message": "Presentation generated successfully"
        }

    except Exception:
        # Log the traceback and re-raise; Celery then marks the task as
        # FAILURE and stores the exception in the result backend
        logger.exception("Error generating presentation")
        raise

This task will be processed asynchronously by Celery workers.
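
Because the task receives a plain dict and Celery is configured with the JSON serializer, everything inside request_dict has to survive a JSON round trip. The sketch below uses the standard-library json module as a stand-in for the serializer (an assumption made for illustration) to show why declaring SlideType as a str-based Enum is convenient here: the member is encoded as its string value and can be turned back into the enum on the other side.

```python
import json
from enum import Enum


class SlideType(str, Enum):  # same declaration style as in app/models.py
    TITLE = "title"
    BULLET_POINTS = "bullet_points"


# A trimmed-down request dict that still contains an enum member
request_dict = {
    "title": "Demo",
    "slides": [{"type": SlideType.BULLET_POINTS, "title": "Key points"}],
}

# Encode and decode, as a message broker round trip would
encoded = json.dumps(request_dict)
decoded = json.loads(encoded)

# After decoding, the type is a plain string that maps back to the enum
restored = SlideType(decoded["slides"][0]["type"])
print(decoded["slides"][0]["type"], restored is SlideType.BULLET_POINTS)
```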

Step 7: Creating the PDF Processor

Now, let's add the PDF processing functionality. Create app/pdf_processor.py:

import os
import tempfile
from PyPDF2 import PdfReader
from openai import OpenAI
from typing import List, Dict, Any
from app.config import settings
from app.models import SlideContent, SlideType

class PDFProcessor:
    def __init__(self):
        self.client = OpenAI(api_key=settings.OPENAI_API_KEY)

    def extract_text_from_pdf(self, pdf_content: bytes) -> str:
        """Extract text content from PDF bytes"""
        with tempfile.NamedTemporaryFile(delete=False) as temp:
            temp.write(pdf_content)
            temp_path = temp.name

        try:
            pdf = PdfReader(temp_path)
            text = ""
            for page in pdf.pages:
                text += page.extract_text() + "\n"

            return text
        finally:
            # Clean up the temp file
            if os.path.exists(temp_path):
                os.unlink(temp_path)

    def generate_presentation_content(self, text: str, title: str = None, num_slides: int = 5) -> Dict[str, Any]:
        """Generate presentation content using OpenAI"""
        # Prepare the system message
        system_message = f"""
        You are an expert presentation creator. Your task is to create a well-structured presentation 
        from the provided text content. Extract the key points and organize them into a cohesive presentation.

        Create a presentation with the following:
        1. A title slide with an engaging title (if not provided) and subtitle
        2. {num_slides-1} content slides

        Structure the presentation logically and extract the most important information.
        """

        # Prepare the user message (the text is truncated to avoid token limits)
        user_message = f"""
        Create a presentation based on the following content:

        {text[:10000]}

        Please structure your response in JSON format with the following structure:
        {{
            "title": "Main Title of Presentation",
            "slides": [
                {{
                    "type": "title",
                    "title": "Presentation Title",
                    "content": "Subtitle - e.g. Author's Name"
                }},
                {{
                    "type": "bullet_points",
                    "title": "Key Point 1",
                    "bullet_points": ["Point 1", "Point 2", "Point 3"]
                }},
                ...
            ]
        }}

        Ensure all slide content is concise and impactful. Use different slide types appropriately:
        - title: For title slides with a subtitle
        - content: For slides with paragraphs of text
        - bullet_points: For key points in a list format
        - two_column: For comparing information side by side
        """

        if title:
            user_message += f"\nUse '{title}' as the presentation title."

        # Call the OpenAI API
        response = self.client.chat.completions.create(
            model="gpt-4o",
            response_format={"type": "json_object"},
            messages=[
                {"role": "system", "content": system_message},
                {"role": "user", "content": user_message}
            ]
        )

        # Extract the response content
        content = response.choices[0].message.content

        # Parse the JSON content
        import json
        presentation_data = json.loads(content)

        return presentation_data

This class handles the extraction of text from PDF files and uses OpenAI to generate presentation content based on that text. It uses PyPDF2 to read the PDF and extract text, then sends that text to OpenAI's API with specific instructions to create a well-structured presentation.
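
Model output does not always follow the requested structure exactly, so it can help to filter the parsed JSON before turning it into slides. The following is a minimal sketch of such a filter, not part of the original code: normalize_slides and ALLOWED_TYPES are hypothetical names, and the rules (known type, non-empty title, slide cap) are assumptions you may want to adjust.

```python
from typing import Any, Dict, List

# Slide types the prompt asks the model to use
ALLOWED_TYPES = {"title", "content", "bullet_points", "two_column"}


def normalize_slides(data: Dict[str, Any], max_slides: int = 5) -> List[Dict[str, Any]]:
    """Keep only well-formed slides from a parsed model response."""
    cleaned: List[Dict[str, Any]] = []
    slides = data.get("slides")
    if not isinstance(slides, list):
        return cleaned
    for slide in slides:
        if not isinstance(slide, dict):
            continue
        slide_type = slide.get("type")
        # Skip slides with a missing or unsupported type
        if not isinstance(slide_type, str) or slide_type not in ALLOWED_TYPES:
            continue
        title = slide.get("title")
        # Every slide model requires a non-empty title
        if not isinstance(title, str) or not title.strip():
            continue
        cleaned.append(slide)
    return cleaned[:max_slides]


sample = {
    "title": "Demo",
    "slides": [
        {"type": "title", "title": "Demo", "content": "Subtitle"},
        {"type": "chart", "title": "Unsupported type"},
        {"type": "bullet_points", "bullet_points": ["No title"]},
        {"type": "bullet_points", "title": "Key points", "bullet_points": ["A", "B"]},
    ],
}
result = normalize_slides(sample)
print(len(result))
```

The filtered list could then be passed on as the slides of a PresentationRequest instead of the raw response.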

Step 8: Updating Celery Tasks

Next, let's update our Celery tasks to handle PDF processing. Modify celery_app/tasks.py:

import os
import logging
from celery import shared_task
from app.models import PresentationRequest, PDFPresentationRequest
from app.ppt_generator import PPTGenerator
from app.pdf_processor import PDFProcessor

logger = logging.getLogger(__name__)

@shared_task(bind=True)
def generate_presentation_task(self, request_dict):
    """Generate a PowerPoint presentation asynchronously"""
    try:
        # Convert dict back to PresentationRequest
        request = PresentationRequest(**request_dict)

        logger.info(f"Starting presentation generation for: {request.title}")

        # Generate the presentation
        generator = PPTGenerator()
        file_path = generator.generate_presentation(request)

        # In a real application, you might upload to S3 or similar
        file_url = f"/download/{os.path.basename(file_path)}"

        return {
            "status": "completed",
            "file_url": file_url,
            "message": "Presentation generated successfully"
        }

    except Exception:
        # Log the traceback and re-raise; Celery then marks the task as
        # FAILURE and stores the exception in the result backend
        logger.exception("Error generating presentation")
        raise

@shared_task(bind=True)
def generate_presentation_from_pdf_task(self, pdf_text, request_dict):
    """Generate a PowerPoint presentation from PDF text asynchronously"""
    try:
        # Convert dict back to PDFPresentationRequest
        request = PDFPresentationRequest(**request_dict)

        logger.info("Starting presentation generation from PDF")

        # Process the PDF text with OpenAI
        processor = PDFProcessor()
        presentation_data = processor.generate_presentation_content(
            pdf_text, 
            title=request.title,
            num_slides=request.num_slides
        )

        # Create a PresentationRequest from the generated content
        presentation_request = PresentationRequest(
            title=presentation_data.get("title", request.title or "Generated Presentation"),
            author=request.author,
            theme=request.theme,
            slides=presentation_data.get("slides", [])
        )

        # Generate the presentation
        generator = PPTGenerator()
        file_path = generator.generate_presentation(presentation_request)

        # In a real application, you might upload to S3 or similar
        file_url = f"/download/{os.path.basename(file_path)}"

        return {
            "status": "completed",
            "file_url": file_url,
            "message": "Presentation generated successfully from PDF"
        }

    except Exception:
        # Log the traceback and re-raise; Celery then marks the task as
        # FAILURE and stores the exception in the result backend
        logger.exception("Error generating presentation from PDF")
        raise

We've added a new task generate_presentation_from_pdf_task that takes the extracted PDF text and request details, then uses the PDF processor to generate presentation content with OpenAI.

Step 9: Creating the FastAPI Application

Now, let's create the FastAPI application, including the PDF upload endpoint. Create app/main.py:

import os
from fastapi import FastAPI, HTTPException, UploadFile, File, Form
from fastapi.responses import FileResponse
from fastapi.staticfiles import StaticFiles
from celery.result import AsyncResult
from typing import Optional

from app.models import PresentationRequest, PDFPresentationRequest, PresentationResponse, PresentationStatus
from app.config import settings
from app.pdf_processor import PDFProcessor
from celery_app.tasks import generate_presentation_task, generate_presentation_from_pdf_task

app = FastAPI(title=settings.APP_NAME)

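# Mount storage directory for file downloads. StaticFiles requires the
# directory to exist at startup, so create it first.
os.makedirs(settings.STORAGE_PATH, exist_ok=True)
app.mount("/download", StaticFiles(directory=settings.STORAGE_PATH), name="download")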

@app.post("/api/presentations", response_model=PresentationResponse)
async def create_presentation(request: PresentationRequest):
    """Submit a new presentation generation task"""
    # Submit task to Celery
    task = generate_presentation_task.delay(request.model_dump())

    return PresentationResponse(task_id=task.id)

@app.post("/api/presentations/from-pdf", response_model=PresentationResponse)
async def create_presentation_from_pdf(
    pdf_file: UploadFile = File(...),
    title: Optional[str] = Form(None),
    author: str = Form("Generated Presentation"),
    theme: str = Form("default"),
    num_slides: int = Form(5)
):
    """Submit a presentation generation task from PDF file"""
    # The filename may be missing, and the extension may be upper case
    if not (pdf_file.filename or "").lower().endswith('.pdf'):
        raise HTTPException(status_code=400, detail="File must be a PDF")

    # Read PDF file content
    pdf_content = await pdf_file.read()

    # Extract text from PDF
    processor = PDFProcessor()
    pdf_text = processor.extract_text_from_pdf(pdf_content)

    # Create request object
    request = PDFPresentationRequest(
        title=title or f"Presentation based on {pdf_file.filename}",
        author=author,
        theme=theme,
        num_slides=num_slides
    )

    # Submit task to Celery
    task = generate_presentation_from_pdf_task.delay(pdf_text, request.model_dump())

    return PresentationResponse(task_id=task.id)

@app.get("/api/presentations/{task_id}", response_model=PresentationStatus)
async def get_presentation_status(task_id: str):
    """Get the status of a presentation generation task"""
    task_result = AsyncResult(task_id)

    if task_result.state == 'PENDING':
        return PresentationStatus(
            task_id=task_id,
            status="pending",
            message="Task is pending"
        )
    elif task_result.state == 'FAILURE':
        # On failure, task_result.info holds the raised exception
        return PresentationStatus(
            task_id=task_id,
            status="failed",
            message=str(task_result.info) or "Unknown error"
        )
    elif task_result.state == 'SUCCESS':
        result = task_result.get()
        return PresentationStatus(
            task_id=task_id,
            status="completed",
            file_url=result.get('file_url'),
            message=result.get('message')
        )
    else:
        return PresentationStatus(
            task_id=task_id,
            status=task_result.state.lower(),
            message="Task is in progress"
        )

@app.get("/api/download/{file_id}")
async def download_presentation(file_id: str):
    """Download a generated presentation"""
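    # Use only the final path component so the lookup stays inside STORAGE_PATH
    file_name = os.path.basename(file_id)
    file_path = os.path.join(settings.STORAGE_PATH, file_name)

    if not os.path.isfile(file_path):
        raise HTTPException(status_code=404, detail="File not found")

    return FileResponse(path=file_path, filename=f"presentation_{file_name}")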

We've added a new endpoint /api/presentations/from-pdf that accepts PDF file uploads along with optional parameters like title, author, theme, and the number of slides to generate.
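
On the client side, the flow is: upload the PDF, remember the returned task_id, then poll GET /api/presentations/{task_id} until the status is "completed" or "failed". Below is a minimal sketch of the polling part. wait_for_presentation is a hypothetical helper, and the HTTP call is replaced by an injectable fetch_status function so the sketch runs without a server; in practice that function would request the status endpoint and return the parsed JSON.

```python
import time
from typing import Any, Callable, Dict


def wait_for_presentation(
    fetch_status: Callable[[], Dict[str, Any]],
    timeout: float = 60.0,
    interval: float = 2.0,
) -> Dict[str, Any]:
    """Poll until the task reports "completed" or "failed", or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status.get("status") in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("presentation was not ready in time")
        time.sleep(interval)


# Fake status source standing in for GET /api/presentations/{task_id}
responses = iter([
    {"task_id": "abc", "status": "pending"},
    {"task_id": "abc", "status": "started"},
    {"task_id": "abc", "status": "completed", "file_url": "/download/abc.pptx"},
])

final = wait_for_presentation(lambda: next(responses), timeout=5, interval=0.01)
print(final["file_url"])
```

Once the status is "completed", the file_url from the response points at the generated .pptx file.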

Step 10: Containerizing with Docker

Let's update our Docker configuration to include the OpenAI API key. First, create a .env file:

APP_NAME=Presentation Generator
REDIS_URL=redis://redis:6379/0
RESULT_BACKEND=redis://redis:6379/0
STORAGE_PATH=/app/storage
OPENAI_API_KEY=your_openai_api_key_here

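Next, define the services in docker-compose.yml and pass the OpenAI API key through to them. Because the api and worker services use build: ., Docker Compose expects a Dockerfile in the project root: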

version: '3'

services:
  api:
    build: .
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
    volumes:
      - .:/app
      - presentation_data:/app/storage
    ports:
      - "8000:8000"
    depends_on:
      - redis
    environment:
      - REDIS_URL=redis://redis:6379/0
      - RESULT_BACKEND=redis://redis:6379/0
      - OPENAI_API_KEY=${OPENAI_API_KEY}

  worker:
    build: .
    command: celery -A celery_app worker --loglevel=info
    volumes:
      - .:/app
      - presentation_data:/app/storage
    depends_on:
      - redis
    environment:
      - REDIS_URL=redis://redis:6379/0
      - RESULT_BACKEND=redis://redis:6379/0
      - OPENAI_API_KEY=${OPENAI_API_KEY}

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

volumes:
  presentation_data:

This setup passes your OpenAI API key from the .env file to the containerized services.

The resulting architecture has several benefits:

Separation of concerns - API, task processing, and presentation generation are separate
Asynchronous processing - Long-running tasks don't block the API
Containerization - Easy deployment and scaling
Type safety - Pydantic models ensure data validation

You can extend this project in many ways, such as adding more slide types, integrating with data visualization libraries, or implementing template management.

Feel free to customize this service to fit your specific needs and save yourself from the drudgery of creating presentations manually!

GitHub Repository

The complete code for this tutorial is available on GitHub.

If you found this tutorial helpful, give it a ❤️ and share it with others who might benefit from creating slides with AI!

Happy coding! 🚀
