[SIP-166] AI Assistant #33215

Open
dscarabelliTT opened this issue Apr 23, 2025 · 0 comments
Labels
change:backend Requires changing the backend sip Superset Improvement Proposal sqllab Namespace | Anything related to the SQL Lab

dscarabelliTT commented Apr 23, 2025

[SIP-166] Proposal for AI Assistant

Motivation

An accurate text-to-SQL translator (AI Assistant) can greatly enhance the SQLLab user experience by increasing productivity, supporting users with limited SQL knowledge, and making it easier to discover and access data in SQLLab.

Proposed Change

We propose implementing a text-to-SQL translator that is intentionally simple: it avoids RAG, vector databases, and agentic LLM frameworks. This approach is designed to maximize compatibility across database types and sizes, provide flexible configuration options, and use user-supplied context filtering when available. The system is built to degrade gracefully, so it keeps working when some functionality is unavailable.
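
As a rough illustration of what "simple" can mean here, the sketch below builds a single plain-text prompt from a metadata context, scoped to the schemas the user selected. It is not taken from the fork: the context layout, the `filter_context` and `build_prompt` helpers, and the prompt wording are all assumptions made for this example.

```python
import json

# Hypothetical metadata context: schema -> table -> column -> SQL comment.
# The actual context file format used by the fork is not described in this SIP.
DB_CONTEXT = {
    "sales": {
        "orders": {"id": "Primary key", "amount": "Order total in USD"},
    },
    "hr": {
        "employees": {"id": "Primary key", "salary": "Annual salary"},
    },
}


def filter_context(context, selected_schemas):
    """Keep only the selected schemas; keep everything if none were selected."""
    if not selected_schemas:
        return context
    return {
        name: tables
        for name, tables in context.items()
        if name in selected_schemas
    }


def build_prompt(question, context, selected_schemas=None, dialect="postgresql"):
    """Assemble one prompt string: no retrieval step and no vector store."""
    scoped = filter_context(context, selected_schemas or [])
    metadata = json.dumps(scoped, indent=2, sort_keys=True)
    return (
        f"You translate questions into {dialect} SQL.\n"
        f"Database metadata (JSON):\n{metadata}\n"
        f"Question: {question}\n"
        "Answer with a single SQL query."
    )


prompt = build_prompt("Total order amount per day?", DB_CONTEXT, ["sales"])
```

Passing an empty schema selection falls back to the full context, which is one way to keep the feature usable when no filtering information is supplied.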

We believe that keeping this solution simple and free of complex dependencies will make it easier for the community to reach consensus and approve its inclusion. This practical and accessible first implementation of the AI Assistant is designed to accelerate adoption and help it land sooner in an official Superset OSS release.

The AI Assistant was developed in alignment with the guiding principles outlined above, within a dedicated fork of the Superset repository, based on the 5.0.0rc2 tag. For a comprehensive overview of its features and configuration, refer to the AI Assistant documentation.

New or Changed Public Interfaces

  • React Components:

    • AI Assistant Editor: Introduced in SQLLab as a text input bar for interacting with the AI Assistant.
    • Table Selector: Enhanced to allow multi-selection of schemas.
    • SQL Editor: Updated to support schema multi-selection.
    • AI Assistant Options: Added as a tab in the Database modal for configuring AI Assistant settings per database.
    • Table View: Added a SQL comment icon next to each column name.
  • REST Endpoints:

    • sqllab/generate_db_context: Initiates a rebuild of the database metadata LLM context.
    • sqllab/generate_sql: Sends user prompts to the LLM provider to generate SQL queries.
    • sqllab/db_context_status: Retrieves the status of the database metadata context and the context builder worker.
    • database/{db_id}/schema_tables: Returns all schemas and tables for a specified database.
  • Dashboards or Visualizations:
    No changes.

  • Superset CLI:
    No changes.

  • Deployment:
    No changes.
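
For readers who want to picture a client call to `sqllab/generate_sql`, here is a minimal sketch that only builds the HTTP request and does not send it. The SIP names the endpoint but not its contract, so the `/api/v1/` prefix, the `POST` method, the payload fields (`database_id`, `prompt`, `schemas`), and the bearer-token header are all assumptions, and `build_generate_sql_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_generate_sql_request(base_url, database_id, prompt, schemas, token):
    """Build (but do not send) a request for the proposed endpoint.

    Path prefix, method, payload shape, and auth scheme are assumptions.
    """
    payload = {
        "database_id": database_id,  # assumed field name
        "prompt": prompt,            # assumed field name
        "schemas": schemas,          # assumed: user-selected schemas
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/sqllab/generate_sql",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


req = build_generate_sql_request(
    "http://localhost:8088",
    1,
    "Top 10 customers by revenue",
    ["sales"],
    "<access-token>",
)
```

Sending the request would be a matter of passing `req` to `urllib.request.urlopen`; the response format is likewise unspecified in this proposal.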

To simplify the setup of a custom Docker Compose deployment (e.g. deploying this fork), we have provided a shell script and configuration files. Detailed instructions and resources can be found here.

New dependencies

The new dependencies introduced are primarily related to integration with supported LLM API providers and data structure validation for building the database metadata context JSON file:

  • google-genai: Python SDK for Google Generative AI.
  • openai: Python SDK for OpenAI models.
  • anthropic: Python SDK for Anthropic models.
  • pydantic: Used for robust data validation and serialization.

These dependencies are required to enable AI Assistant functionality and ensure reliable handling of LLM-related data.

Migration Plan and Compatibility

Since these are additive changes, migration should be straightforward.

Changes to metadata database tables:

  • dbs (Model view: Database): Added columns:
    • llm_provider
    • llm_model
    • llm_api_key
    • llm_enabled
    • llm_context_options
  • context_builder_task: New table introduced.

No breaking changes are expected, and existing deployments can be upgraded without data loss. Standard database migration procedures apply.
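
To make the "additive" claim concrete, the sketch below applies the listed changes to a throwaway in-memory SQLite database and checks that an existing row survives. It is an illustration only: Superset manages its schema through Alembic revisions, and the column types, defaults, and the `context_builder_task` columns shown here are assumptions, since the SIP lists only the names.

```python
import sqlite3

# Column types and defaults are assumptions; the SIP only names the columns.
NEW_DBS_COLUMNS = {
    "llm_provider": "TEXT",
    "llm_model": "TEXT",
    "llm_api_key": "TEXT",
    "llm_enabled": "BOOLEAN NOT NULL DEFAULT 0",
    "llm_context_options": "TEXT",
}


def upgrade(conn):
    """Add the new columns and table without touching existing data."""
    existing = {row[1] for row in conn.execute("PRAGMA table_info(dbs)")}
    for name, ddl in NEW_DBS_COLUMNS.items():
        if name not in existing:  # safe to run more than once
            conn.execute(f"ALTER TABLE dbs ADD COLUMN {name} {ddl}")
    # Hypothetical layout for the new table.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS context_builder_task ("
        "id INTEGER PRIMARY KEY, "
        "database_id INTEGER NOT NULL REFERENCES dbs(id), "
        "status TEXT)"
    )
    conn.commit()


# Simulate a pre-upgrade deployment with one registered database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dbs (id INTEGER PRIMARY KEY, database_name TEXT)")
conn.execute("INSERT INTO dbs (database_name) VALUES ('examples')")

upgrade(conn)
row = conn.execute(
    "SELECT database_name, llm_enabled, llm_provider FROM dbs"
).fetchone()
```

Because every new column is nullable or has a default, existing rows need no backfill, which is what keeps the upgrade non-breaking.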

@dscarabelliTT dscarabelliTT added the sip Superset Improvement Proposal label Apr 23, 2025
@dosubot dosubot bot added change:backend Requires changing the backend sqllab Namespace | Anything related to the SQL Lab labels Apr 23, 2025
@rusackas rusackas changed the title [SIP] AI Assistant [SIP-166] AI Assistant Apr 23, 2025