Introduction & Context
Images are everywhere—but they often lack proper descriptions. Whether for accessibility, SEO, or UX, adding meaningful captions is essential. But what if we could generate them automatically using AI?
In this article, we’ll walk you through building a Vue 3 component that uses the Hugging Face Inference API to generate human-like captions for uploaded images. You’ll learn how to build an image uploader, show a preview of the selected file, and request a caption from a hosted model with just a few lines of code.
Perfect for frontend devs looking to bring intelligent features into visual interfaces!
Goals and What You’ll Learn
By the end of this tutorial, you’ll be able to:
- Build a drag-and-drop or file upload component
- Send images to an AI image captioning model
- Display the uploaded image preview and its AI-generated caption
- Handle loading and error states gracefully
- Enhance accessibility with alt-text or voiceover features
Tech Stack
- Vue 3 (with <script setup>)
- Axios for API calls
- Hugging Face Inference API (BLIP or similar image-to-text model)
- Tailwind CSS (optional for styling)
Prerequisites
- Create a free account on Hugging Face.
- Generate an API token from your account settings.
- Store it in an .env file:
VITE_HUGGINGFACE_API_KEY=your_token_here
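A missing or misnamed variable otherwise only shows up later as a failed request, so it can help to check for the token up front. Here is a minimal sketch of such a check. `requireToken` is a hypothetical helper (it is not part of Vite or Hugging Face), and it takes the env object as a parameter so it can also run outside a Vite build. Keep in mind that Vite exposes every `VITE_`-prefixed variable to client code, so a token stored this way is visible to anyone who loads the page; that is fine for local experiments, less so for a public deployment.

```javascript
// Hypothetical helper: returns the trimmed token or throws a descriptive error.
// `env` is expected to look like Vite's import.meta.env object.
function requireToken(env, name = 'VITE_HUGGINGFACE_API_KEY') {
  const value = env?.[name]
  if (typeof value !== 'string' || value.trim() === '') {
    throw new Error(`Missing ${name}. Add it to your .env file.`)
  }
  return value.trim()
}
```

Inside the component this would be called as `requireToken(import.meta.env)` before building the Authorization header.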
Building the Component: ImageCaptioner.vue
<template>
  <div class="max-w-lg mx-auto p-4 border rounded bg-white">
    <h2 class="text-lg font-semibold mb-2">AI Image Caption Generator</h2>
    <input type="file" accept="image/*" @change="handleUpload" class="mb-4" />
    <div v-if="imageUrl" class="mb-4">
      <img :src="imageUrl" alt="Uploaded preview" class="rounded shadow" />
    </div>
    <button
      @click="generateCaption"
      :disabled="!imageBlob || loading"
      class="bg-blue-600 text-white px-4 py-2 rounded disabled:opacity-50"
    >
      {{ loading ? 'Generating...' : 'Generate Caption' }}
    </button>
    <div v-if="caption" class="mt-4 p-2 border rounded bg-gray-100">
      <strong>Caption:</strong>
      <p>{{ caption }}</p>
    </div>
  </div>
</template>
<script setup>
import { ref } from 'vue'
import axios from 'axios'

const imageUrl = ref(null)  // object URL used for the preview
const imageBlob = ref(null) // the selected File, sent as the request body
const caption = ref('')
const loading = ref(false)

const handleUpload = (e) => {
  const file = e.target.files[0]
  if (!file) return
  // Release the previous preview URL before creating a new one
  if (imageUrl.value) URL.revokeObjectURL(imageUrl.value)
  imageBlob.value = file
  imageUrl.value = URL.createObjectURL(file)
  caption.value = ''
}

const generateCaption = async () => {
  if (!imageBlob.value) return
  loading.value = true
  caption.value = ''
  try {
    // The raw image bytes are sent as the request body, so no FormData is needed
    const response = await axios.post(
      'https://api-inference.huggingface.co/models/Salesforce/blip-image-captioning-base',
      imageBlob.value,
      {
        headers: {
          Authorization: `Bearer ${import.meta.env.VITE_HUGGINGFACE_API_KEY}`,
          'Content-Type': 'application/octet-stream'
        }
      }
    )
    caption.value = response.data[0]?.generated_text || 'No caption returned.'
  } catch (err) {
    console.error('Error:', err)
    caption.value = 'An error occurred. Try again.'
  } finally {
    loading.value = false
  }
}
</script>
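The component above relies on a plain file input. If you also want drag-and-drop, the drop handler needs to pull the first image out of the event's `dataTransfer.files` and pass it to the same state that `handleUpload` fills. The following is one possible sketch, not the only way to do it; `firstImageFile`, `handleDrop`, and the `onFile` callback are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper: returns the first image in a FileList-like collection, or null.
// Works with both e.dataTransfer.files (drop) and e.target.files (file input).
function firstImageFile(files) {
  for (const file of Array.from(files ?? [])) {
    if (typeof file.type === 'string' && file.type.startsWith('image/')) return file
  }
  return null
}

// Sketch of a drop handler. `onFile` is whatever stores the file and builds the preview.
function handleDrop(event, onFile) {
  event.preventDefault() // stop the browser from opening the dropped file itself
  const file = firstImageFile(event.dataTransfer?.files)
  if (file) onFile(file)
  return file
}
```

In the template, a drop zone could be wired up with `@dragover.prevent` (the browser only fires `drop` when the default `dragover` behavior is cancelled) and `@drop="handleDrop($event, setFile)"`, where `setFile` is an assumed function holding the part of `handleUpload` that runs once a file is known.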
Accessibility and UX Improvements
- Image Preview: Helps users confirm they uploaded the correct file.
- Disabled Button State: Prevents repeated submissions.
- Alt Text: Use the generated caption as an alt attribute for better accessibility.
💡 Tip: You can add aria-live="polite" to the caption container to make it screen reader-friendly.
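One detail to watch when reusing the caption as alt text: the component writes status strings such as 'An error occurred. Try again.' into the same `caption` ref, and those should not end up in an alt attribute. A small sketch of a guard, where `altTextFor` and `STATUS_MESSAGES` are hypothetical names and the fallback string matches the one already used in the template:

```javascript
// Status strings the component writes into `caption` that are not real captions.
const STATUS_MESSAGES = ['No caption returned.', 'An error occurred. Try again.']

// Hypothetical helper: choose the alt text for the preview image.
function altTextFor(caption, fallback = 'Uploaded preview') {
  const text = (caption ?? '').trim()
  if (text === '' || STATUS_MESSAGES.includes(text)) return fallback
  return text
}
```

The image would then bind it as `:alt="altTextFor(caption)"`. For the `aria-live` tip, screen readers tend to announce changes more reliably when the live region is already in the DOM before its content changes, so rendering the container with `v-show` instead of `v-if` may be worth trying.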
Feature Ideas for Enhancement
- Add a copy caption button.
- Automatically insert the caption as alt for the image.
- Add text-to-speech (TTS) using the Web Speech API.
- Support multiple captions or translation with AI.
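As a starting point for the text-to-speech idea, here is a minimal sketch built on the browser's `speechSynthesis` object and `SpeechSynthesisUtterance` constructor. `speakCaption` is a hypothetical helper; its last two parameters default to the browser globals and exist only so the function can be called with stand-ins where the Web Speech API is unavailable.

```javascript
// Hypothetical helper: read a caption aloud with the Web Speech API.
// Returns true if speech was queued, false if there was nothing to say
// or the API is not available in this environment.
function speakCaption(
  text,
  synth = globalThis.speechSynthesis,
  Utterance = globalThis.SpeechSynthesisUtterance
) {
  if (!text || !synth || !Utterance) return false
  synth.cancel() // drop anything still queued from a previous caption
  synth.speak(new Utterance(text))
  return true
}
```

A "Read aloud" button could call `speakCaption(caption)` on click. The copy button from the first idea can follow the same pattern with `navigator.clipboard.writeText(caption)`, which returns a promise and is only available in secure contexts.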
Links and References
Hugging Face Inference API
BLIP Model
Vue 3 Docs
Axios
Summary and Conclusion
You’ve just built an intelligent, user-friendly image captioning tool powered by Hugging Face and Vue 3. This type of component is a great example of how frontend developers can leverage AI to improve usability and accessibility in real-world apps.
With just a few tools, you can turn static interfaces into intelligent, responsive experiences—and delight your users in the process.
Call to Action / Community Engagement
What other AI features would you want to see in Vue components? Have you tried AI for image recognition, tagging, or alt-text generation?
Drop your thoughts, feedback, or experiments in the comments below!