Open
Labels
Contributors needed, Foss OverFlow, enhancement, hard, help wanted
Description
Feature Request: Automatic Scraping & Parsing of Academic Calendar
Problem Statement
Current Issue:
The academic calendar is uploaded on the IIT Bhilai website as a PDF/HTML table, and currently admins must manually copy dates into our Calendar component. This is time-consuming, error-prone, and requires updates every semester/year.
Proposed Solution
✅ Automatic Academic Calendar Scraping
- Add a backend service that periodically scrapes the academic calendar from: https://www.iitbhilai.ac.in/index.php?pid=aca_calendar
- Parse events and assign each one a category such as "academic" or "holiday".
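A minimal sketch of the parsing step, assuming the calendar page exposes a plain HTML table whose first column is the date and second column is the event text (the real layout has not been checked and may differ). It uses only the standard library's `html.parser` so it runs without extra dependencies; BeautifulSoup4 would make the same logic shorter. `TableRowParser`, `extract_events`, and `SAMPLE_HTML` are hypothetical names, and the sample rows are made up for illustration.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace inside the cell to single spaces.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_events(html):
    """Return [{'date': ..., 'title': ...}] from a two-column calendar table.

    Assumption: column 0 is the date text and column 1 the event description.
    """
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    events = []
    for cells in parser.rows:
        # Skip rows that are too short and the header row.
        if len(cells) >= 2 and cells[0].lower() != "date":
            events.append({"date": cells[0], "title": cells[1]})
    return events


# Made-up sample standing in for the downloaded page.
SAMPLE_HTML = """
<table>
  <tr><th>Date</th><th>Event</th></tr>
  <tr><td>01 Jan 2025</td><td>Commencement of classes</td></tr>
  <tr><td>26 Jan 2025</td><td>Republic Day (Holiday)</td></tr>
</table>
"""

events = extract_events(SAMPLE_HTML)
```

In a real service, the HTML string would come from an HTTP fetch of the calendar URL, and the date text would still need to be parsed into proper date values.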
✅ Integration with Existing Calendar UI
- New events should automatically appear in the existing Calendar component with category-appropriate icons and color coding.
- Admins should have the ability to review/edit scraped data before publishing (optional improvement).
Technical Notes
| Layer | Requirement |
|---|---|
| Scraper | A cron-triggered script |
| Data Parsing | Match patterns like: "holiday", "exam", "commencement", "registration" |
| Database | Add relevant fields to calendar events table |
| Retry & Fail-Safe | If scraping fails, continue using last known data |
| Optional | Cache PDF locally with versioning for historical comparison |
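The "Data Parsing" and "Retry & Fail-Safe" rows could be sketched roughly as below. This is an illustration under stated assumptions, not a settled design: the keywords come from the table, but the category names other than "academic" and "holiday", the JSON cache file, and the names `categorize` and `refresh_events` are all hypothetical. The `scrape` argument stands for whatever function the cron job calls to fetch and parse the calendar.

```python
import json
import re
from pathlib import Path

# Keyword -> category, checked in order. The keywords are from the table above;
# the "exam" and "registration" category names are illustrative assumptions.
CATEGORY_PATTERNS = [
    (re.compile(r"holiday", re.IGNORECASE), "holiday"),
    (re.compile(r"exam", re.IGNORECASE), "exam"),
    (re.compile(r"registration", re.IGNORECASE), "registration"),
    (re.compile(r"commencement", re.IGNORECASE), "academic"),
]


def categorize(title):
    """Return the category of the first matching pattern, else 'academic'."""
    for pattern, category in CATEGORY_PATTERNS:
        if pattern.search(title):
            return category
    return "academic"


def refresh_events(scrape, cache_path):
    """Run scrape(); if it fails or returns nothing, reuse the cached events."""
    cache_path = Path(cache_path)
    try:
        events = scrape()
        if not events:
            raise ValueError("scraper returned no events")
    except Exception:
        # Fail-safe: keep serving the last known data, if there is any.
        if cache_path.exists():
            return json.loads(cache_path.read_text(encoding="utf-8"))
        return []
    for event in events:
        event["category"] = categorize(event["title"])
    # Only a successful scrape overwrites the last known data.
    cache_path.write_text(json.dumps(events), encoding="utf-8")
    return events
```

Treating an empty result as a failure is a deliberate choice here: if the page layout changes and the parser silently finds no rows, the calendar keeps showing the previous data instead of going blank.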
Possible Libraries
- BeautifulSoup4 / lxml (for HTML scraping)
- PyMuPDF or pdfplumber (if the format changes back to PDF)
Alternatives Considered
- Manual Upload of Events
  ❌ Still requires regular admin effort
- Direct API Feed from Institute Website
  ❌ No such API currently exists
Mockups & Visual Examples
- Check out the Figma design mockups in the README.