An AI-based Excel Analyzer
Demo: https://www.youtube.com/watch?v=TZCHgOOhzY0
The Excel Analyzer is a Streamlit application that allows users to upload Excel files, ask questions about the data, and receive answers generated by a language model. It uses a Retrieval-Augmented Generation (RAG) approach to provide relevant and informative responses.
- Streamlit: For creating the web application.
- Pandas: For loading and manipulating Excel data.
- LangChain: For orchestrating the RAG pipeline.
- Hugging Face Embeddings: For generating embeddings.
- FAISS: For the vector database.
- Google Gemini API: For the language model.
- Install the required libraries:
  pip install streamlit pandas sentence-transformers faiss-cpu langchain google-generativeai
- Obtain a Google Gemini API key and enter it in the sidebar of the application.
- Upload an Excel file using the file uploader.
- Ask questions about the data in the text input field.
- The application will display the answer generated by the language model.
- app.py: Contains the main application code, including the UI elements, data loading, chunking, embedding, retrieval, and response generation logic.
- run_app.bat: A batch file that launches the Streamlit application.
- documentation.txt: This file, providing technical documentation for the application.
- Data Loading: The application loads the Excel data into a Pandas DataFrame.
- Data Chunking: The data is split into smaller chunks using CharacterTextSplitter.
- Embedding Generation: Embeddings are generated for each chunk using HuggingFaceEmbeddings.
- Vector Database: The chunks and their embeddings are stored in a FAISS vector database.
- Retrieval: When a question is asked, the application generates an embedding for the question and searches the vector database for the most similar chunks.
- Response Generation: The question and the retrieved chunks are passed to the Google Gemini API to generate an answer.
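The retrieval steps above can be sketched with the Python standard library alone, as a rough illustration rather than the code in app.py. In this sketch a bag-of-words count vector stands in for HuggingFaceEmbeddings, a linear cosine-similarity scan stands in for FAISS, and each spreadsheet row is treated as one chunk. The function names (row_to_text, embed, retrieve, build_prompt), the prompt wording, and the sample rows are all hypothetical.

```python
import math
import re
from collections import Counter


def row_to_text(row: dict) -> str:
    """Flatten one spreadsheet row into a 'column: value' string."""
    return ", ".join(f"{col}: {val}" for col, val in row.items())


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (linear scan, stand-in for FAISS)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Combine the retrieved chunks and the question into one prompt for the language model."""
    joined = "\n".join(context)
    return f"Answer using only this data:\n{joined}\n\nQuestion: {question}"


# Hypothetical sample rows, shaped like the output of pandas' df.to_dict("records")
rows = [
    {"region": "North", "product": "Widget", "sales": 120},
    {"region": "South", "product": "Gadget", "sales": 80},
    {"region": "East", "product": "Widget", "sales": 95},
]
chunks = [row_to_text(r) for r in rows]
question = "What were Gadget sales in the South region?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
```

Here the South/Gadget row is retrieved because it shares the most words with the question. A real embedding model matches on meaning rather than exact words, which is why the application uses one instead of word counts.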
The run_app.bat file contains the following commands:
- @echo off: Disables command echoing.
- echo Starting Streamlit app...: Displays a message indicating that the Streamlit app is starting.
- streamlit run app.py: Runs the Streamlit application.
- pause: Keeps the command prompt window open after the application exits, so the user can see any error messages.
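Because a .bat file only runs on Windows, a small Python launcher is one possible cross-platform equivalent of the commands above. This is a sketch, not part of the project: the function names build_command and launch are hypothetical, and it assumes Streamlit is installed for the same interpreter that runs the script.

```python
import subprocess
import sys


def build_command(script: str = "app.py") -> list[str]:
    """Build the equivalent of `streamlit run app.py` using the current interpreter."""
    return [sys.executable, "-m", "streamlit", "run", script]


def launch(script: str = "app.py") -> int:
    """Start the app, block until it exits, and return Streamlit's exit code."""
    print("Starting Streamlit app...")
    return subprocess.call(build_command(script))
```

Calling launch() from the folder that contains app.py starts the application. Running Streamlit through sys.executable avoids depending on the streamlit command being on PATH.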
- The chunk size and overlap can be adjusted in the app.py file to optimize retrieval performance.
- The language model can be changed by modifying the model_name parameter in the ChatGoogleGenerativeAI class.
- The UI can be further customized using Streamlit's API.
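To show what the chunk size and overlap settings control, here is a minimal fixed-width chunker. It is a simplification: LangChain's CharacterTextSplitter splits on a separator first and then merges the pieces, whereas this sketch slides a window over raw characters. The function name chunk_text and the sample values are hypothetical.

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
small = chunk_text(sample, chunk_size=10, overlap=2)   # more, shorter chunks
large = chunk_text(sample, chunk_size=20, overlap=5)   # fewer, longer chunks
```

Smaller chunks give more precise matches but less surrounding context per chunk. Larger chunks carry more context but may dilute the match. Overlap reduces the chance that a relevant value is cut in half at a chunk boundary.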
- If the application fails to start, ensure that all the required libraries are installed and that the Google Gemini API key is valid.
- If the application returns incorrect answers, try adjusting the chunking parameters or modifying the prompt to the language model.
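A quick way to check the first point, whether the required libraries are installed, is to ask Python which modules it can locate. The sketch below is an assumption-laden helper, not part of the application: the import names are inferred from the pip install command (for example, faiss-cpu is imported as faiss and sentence-transformers as sentence_transformers), and the function names are hypothetical.

```python
import importlib.util

# Import names assumed to correspond to the packages in the pip install command
REQUIRED_MODULES = [
    "streamlit",
    "pandas",
    "sentence_transformers",
    "faiss",
    "langchain",
    "google.generativeai",
]


def is_installed(module_name: str) -> bool:
    """Return True if Python can locate the module without fully importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. "google") is missing or malformed
        return False


def missing_modules(names=REQUIRED_MODULES) -> list[str]:
    """List the required modules that are not available in this environment."""
    return [name for name in names if not is_installed(name)]
```

Running print(missing_modules()) in the same environment that launches the app shows which packages still need to be installed. An empty list means all of them were found.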
The Excel Analyzer is a simple tool that lets you ask questions about data in your Excel files and get answers powered by AI.
- Install Python:
  - Download Python 3.12 from the official website: https://www.python.org/downloads/windows/
  - Run the installer.
  - Important: Make sure to check the box that says "Add Python to PATH" during the installation process. This will allow you to run Python commands from the command prompt.
  - Complete the installation.
- Install Required Libraries:
  - Open a command prompt window.
  - Type the following command and press Enter:
    pip install streamlit pandas sentence-transformers faiss-cpu langchain google-generativeai
  - Wait for the installation to complete.
- Run the App: Double-click the run_app.bat file. This will open a command prompt window and launch the Excel Analyzer in your web browser.
- Enter API Key: In the left sidebar, you'll see a field to enter your API key. This is needed to connect to the AI model.
- Upload Excel File: Click the "Browse files" button to upload your Excel file (.xlsx or .xls format).
- Ask a Question: Once the file is uploaded, a text box will appear where you can type your question about the data.
- Get the Answer: Press Enter or click outside the text box. The AI will analyze the data and provide an answer in a chat-like format below the question box.
- Be specific with your questions.
- Use keywords that are present in your Excel data.
- If you don't get the answer you expect, try rephrasing your question.
- App Doesn't Start: Make sure you have all the necessary software installed (Python, Streamlit, etc.). If you're unsure, contact your system administrator.
- No Answer: Double-check that you've entered your API key correctly. Also, make sure your question is relevant to the data in the Excel file.
- Incorrect Answer: The AI might not always be perfect. Try rephrasing your question or providing more context.
Have fun exploring your Excel data with the Excel Analyzer!