Feature/hubsot handler updates by tino097 · Pull Request #12046 · mindsdb/minds

tino097 · 2025-12-30T17:11:12Z

Description

HubSpot integration updates

(Please delete options that are not relevant)

🐛 Bug fix (non-breaking change which fixes an issue)
⚡ New feature (non-breaking change which adds functionality)
📢 Breaking change (fix or feature that would cause existing functionality not to work as expected)
📄 This change requires a documentation update

Verification Process

To ensure the changes are working as expected:

Test Location: Specify the URL or path for testing.
Verification Steps: Outline the steps or queries needed to validate the change. Include any data, configurations, or actions required to reproduce or see the new functionality.

Additional Media:

I have attached a brief loom video or screenshots showcasing the new functionality or change.

Checklist:

My code follows the style guidelines(PEP 8) of MindsDB.
I have appropriately commented on my code, especially in complex areas.
Necessary documentation updates are either made or tracked in issues.
Relevant unit and integration tests are updated or added.

entelligence-ai-pr-reviews · 2025-12-30T17:16:34Z

        ("lastname", "VARCHAR", "Last name"),
        ("phone", "VARCHAR", "Phone number"),
        ("company", "VARCHAR", "Associated company"),
+        ("city", "VARCHAR", "City"),
        ("website", "VARCHAR", "Website URL"),
        ("createdate", "TIMESTAMP", "Creation date"),
        ("lastmodifieddate", "TIMESTAMP", "Last modification date"),


Correctness: Add "city" to the searchable_columns set in _search_contacts_by_conditions call (line ~680). Without it, _build_hubspot_search_filters returns None for city-based WHERE conditions, forcing fallback to the slower get_all API instead of the optimized search API.

🤖 AI Agent Prompt for Cursor/Windsurf

📋 Copy this prompt to your AI coding assistant (Cursor, Windsurf, etc.) to get help fixing this issue

File: mindsdb/integrations/handlers/hubspot_handler/hubspot_tables.py Lines: ~580-585 (in the `get_contacts` method) Problem: The `city` column was added to the contacts schema (line 37), but it's missing from the searchable columns set used by `_build_hubspot_search_filters`. This causes city-based WHERE conditions to bypass the optimized search API and fall back to the slower `get_all` method. Fix: Update the searchable columns set in the `get_contacts` method to include `"city"` (and verify if `"website"` should also be searchable). Locate the call to `_build_hubspot_search_filters` around line 580-585 and add `"city"` to the set: `{"email", "id", "firstname", "lastname", "phone", "company", "city", "website"}`.

✨ Committable Code Suggestion

💡 This is a one-click fix! Click "Commit suggestion" to apply this change directly to your branch.

Suggested change

("lastname", "VARCHAR", "Last name"),

("phone", "VARCHAR", "Phone number"),

("company", "VARCHAR", "Associated company"),

("city", "VARCHAR", "City"),

("website", "VARCHAR", "Website URL"),

("createdate", "TIMESTAMP", "Creation date"),

("lastmodifieddate", "TIMESTAMP", "Last modification date"),

("firstname", "VARCHAR", "First name"),

("lastname", "VARCHAR", "Last name"),

("phone", "VARCHAR", "Phone number"),

("company", "VARCHAR", "Associated company"),

("city", "VARCHAR", "City"),

("website", "VARCHAR", "Website URL"),

("createdate", "TIMESTAMP", "Creation date"),

("lastmodifieddate", "TIMESTAMP", "Last modification date"),

entelligence-ai-pr-reviews · 2025-12-30T17:16:34Z

        self.create_companies(company_data)

    def modify(self, conditions: List[FilterCondition], values: Dict) -> None:
-        companies_df = pd.json_normalize(self.get_companies(limit=1000))
+        normalized_conditions = _normalize_filter_conditions(conditions)
+        companies_df = pd.json_normalize(self.get_companies(limit=200, where_conditions=normalized_conditions))

        if companies_df.empty:
            raise ValueError(
                "No companies retrieved from HubSpot to evaluate update conditions. Verify your connection and permissions."
            )

-        normalized_conditions = _normalize_filter_conditions(conditions)
        update_query_executor = UPDATEQueryExecutor(companies_df, normalized_conditions)
        filtered_df = update_query_executor.execute_query()



Correctness: modify() passes limit=200 to get_companies(), truncating results before UPDATEQueryExecutor filters. Companies beyond 200 matching where_conditions are silently excluded from update. Remove the hardcoded limit or set limit=None to retrieve all matching records via pagination.

🤖 AI Agent Prompt for Cursor/Windsurf

📋 Copy this prompt to your AI coding assistant (Cursor, Windsurf, etc.) to get help fixing this issue

File: mindsdb/integrations/handlers/hubspot_handler/hubspot_tables.py Lines: 223 Problem: The `modify()` method calls `get_companies(limit=200, where_conditions=normalized_conditions)` which artificially caps results at 200 records. When more than 200 companies match the filter conditions, only the first 200 will be retrieved and updated — the rest are silently ignored, causing data inconsistency. Fix Instructions: 1. Remove the `limit=200` parameter from the `get_companies()` call on line 223 2. Let `get_companies()` retrieve all matching records (it already handles pagination internally via `_execute_hubspot_search` when conditions are provided) 3. The `UPDATEQueryExecutor` will then correctly filter the complete dataset 4. Verify the same issue doesn't exist in `remove()` method (line 246) and `ContactsTable.modify()` (line 434) and `DealsTable.modify()` (line 634) — apply the same fix if present Expected result: All companies matching the WHERE conditions will be updated, regardless of count.

✨ Committable Code Suggestion

💡 This is a one-click fix! Click "Commit suggestion" to apply this change directly to your branch.

Suggested change

self.create_companies(company_data)

def modify(self, conditions: List[FilterCondition], values: Dict) -> None:

companies_df = pd.json_normalize(self.get_companies(limit=1000))

normalized_conditions = _normalize_filter_conditions(conditions)

companies_df = pd.json_normalize(self.get_companies(limit=200, where_conditions=normalized_conditions))

if companies_df.empty:

raise ValueError(

"No companies retrieved from HubSpot to evaluate update conditions. Verify your connection and permissions."

)

normalized_conditions = _normalize_filter_conditions(conditions)

update_query_executor = UPDATEQueryExecutor(companies_df, normalized_conditions)

filtered_df = update_query_executor.execute_query()

def add(self, company_data: List[dict]):

self.create_companies(company_data)

def modify(self, conditions: List[FilterCondition], values: Dict) -> None:

normalized_conditions = _normalize_filter_conditions(conditions)

companies_df = pd.json_normalize(self.get_companies(where_conditions=normalized_conditions))

if companies_df.empty:

raise ValueError(

"No companies retrieved from HubSpot to evaluate update conditions. Verify your connection and permissions."

)

update_query_executor = UPDATEQueryExecutor(companies_df, normalized_conditions)

filtered_df = update_query_executor.execute_query()

entelligence-ai-pr-reviews · 2025-12-30T17:16:34Z


    def remove(self, conditions: List[FilterCondition]) -> None:
        where_conditions = _normalize_filter_conditions(conditions)
-        contacts_df = pd.json_normalize(self.get_contacts(limit=1000, where_conditions=where_conditions))
+        contacts_df = pd.json_normalize(self.get_contacts(limit=200, where_conditions=where_conditions))

        if contacts_df.empty:
            raise ValueError(


Correctness: [hubspot_tables.py:467] remove() fetches only 200 contacts with limit=200. If >200 contacts match where_conditions, only the first 200 are deleted, silently leaving the rest. Remove the limit or fetch all matching records before filtering.

🤖 AI Agent Prompt for Cursor/Windsurf

📋 Copy this prompt to your AI coding assistant (Cursor, Windsurf, etc.) to get help fixing this issue

File: mindsdb/integrations/handlers/hubspot_handler/hubspot_tables.py Lines: 474-476 Problem: The `remove()` method in `ContactsTable` class uses `limit=200` when fetching contacts to evaluate delete conditions. This causes incomplete deletions when more than 200 contacts match the WHERE clause - only the first 200 are deleted while the rest are silently ignored. Fix Required: 1. Change `limit=200` to `limit=None` in the `get_contacts()` call on line 475 2. Apply the same fix to the `modify()` method on line 461 (same issue) 3. Review and fix identical patterns in `CompaniesTable.remove()` (line ~304) and `CompaniesTable.modify()` (line ~283) 4. Review and fix identical patterns in `DealsTable.remove()` and `DealsTable.modify()` methods Rationale: When evaluating conditions for DELETE or UPDATE operations, we must retrieve ALL matching records, not just a subset. The limit should only apply to read operations (SELECT), not to condition evaluation for write operations. Using `limit=None` ensures complete data retrieval for accurate condition matching.

✨ Committable Code Suggestion

💡 This is a one-click fix! Click "Commit suggestion" to apply this change directly to your branch.

Suggested change

def remove(self, conditions: List[FilterCondition]) -> None:

where_conditions = _normalize_filter_conditions(conditions)

contacts_df = pd.json_normalize(self.get_contacts(limit=1000, where_conditions=where_conditions))

contacts_df = pd.json_normalize(self.get_contacts(limit=200, where_conditions=where_conditions))

if contacts_df.empty:

raise ValueError(

self.update_contacts(contact_ids, values)

def remove(self, conditions: List[FilterCondition]) -> None:

where_conditions = _normalize_filter_conditions(conditions)

contacts_df = pd.json_normalize(self.get_contacts(limit=None, where_conditions=where_conditions))

if contacts_df.empty:

raise ValueError(

entelligence-ai-pr-reviews · 2025-12-30T17:16:34Z

                "lastname": properties.get("lastname"),
                "phone": properties.get("phone"),
                "company": properties.get("company"),
+                "city": properties.get("city"),
                "website": properties.get("website"),
                "createdate": properties.get("createdate"),
                "lastmodifieddate": properties.get("lastmodifieddate"),


Correctness: [hubspot_tables.py:543] Add "city" to default_properties in get_contacts(). Without it, get_all() returns None for city while search API returns actual values, breaking data consistency.

🤖 AI Agent Prompt for Cursor/Windsurf

📋 Copy this prompt to your AI coding assistant (Cursor, Windsurf, etc.) to get help fixing this issue

File: mindsdb/integrations/handlers/hubspot_handler/hubspot_tables.py Lines: 543-551 Problem: The `city` field was added to `_contact_to_dict()` method (line 589) but is missing from the `default_properties` list in `get_contacts()` method. This causes inconsistent behavior where contacts fetched via search API include city data, but contacts fetched via `get_all()` will have `city` as `None`. Fix: Add `"city"` to the `default_properties` list in the `get_contacts()` method (around line 547) to ensure the city field is consistently fetched from HubSpot API regardless of the fetch method used.

✨ Committable Code Suggestion

💡 This is a one-click fix! Click "Commit suggestion" to apply this change directly to your branch.

Suggested change

"lastname": properties.get("lastname"),

"phone": properties.get("phone"),

"company": properties.get("company"),

"city": properties.get("city"),

"website": properties.get("website"),

"createdate": properties.get("createdate"),

"lastmodifieddate": properties.get("lastmodifieddate"),

"firstname": properties.get("firstname"),

"lastname": properties.get("lastname"),

"phone": properties.get("phone"),

"company": properties.get("company"),

"city": properties.get("city"),

"website": properties.get("website"),

"createdate": properties.get("createdate"),

"lastmodifieddate": properties.get("lastmodifieddate"),

MinuraPunchihewa

Hey @tino097,
Overall, this looks fine to me, but I am wondering if the three tables that we've added support for here are sufficient for answering the questions our customers are interested in. If not, I think we will need to add these tables in.

I also think it might be a good idea for you to rebase this branch against the release branch because there seem to be some rogue commits in there.

You will also need to check why the unit tests are failing.

MinuraPunchihewa · 2026-01-06T05:50:32Z

@hamishfagg Do you have any idea why the tests are failing? Is it because this PR originates from a fork?

@tino097 Do you have access to create a PR against MindsDB directly? If not, @hamishfagg can we grant him this access? I think we will need to deploy this PR to one of our environments and I am not sure this can be done via the fork?

Co-authored-by: tino097 <konstantin.sivakov@gmail.com>

tino097 · 2026-01-07T18:59:35Z

Closed in favor of #12058

entelligence-ai-pr-reviews Bot reviewed Dec 30, 2025

View reviewed changes

martyna-mindsdb requested a review from MinuraPunchihewa January 5, 2026 16:44

MinuraPunchihewa reviewed Jan 6, 2026

View reviewed changes

ZoranPandovski and others added 5 commits January 6, 2026 17:52

Add option for Snowflake key value auth (mindsdb#12024)

0a6a644

bumped version for release

0f65a1e

BigQuery fixes (mindsdb#12057)

13f952c

Co-authored-by: tino097 <konstantin.sivakov@gmail.com>

Improve searching in the HubSpotAPI

57ccb12

Ruff format

3ded00e

tino097 force-pushed the feature/hubsot-handler-updates branch from e38b7da to 3ded00e Compare January 7, 2026 13:20

tino097 changed the base branch from releases/26.0.0 to releases/25.13.1 January 7, 2026 13:23

tino097 changed the base branch from releases/25.13.1 to releases/25.14.0 January 7, 2026 16:04

tino097 closed this Jan 7, 2026

github-actions Bot locked and limited conversation to collaborators Jan 7, 2026

mintlify Bot deployed to staging - docs January 7, 2026 18:59 View deployment

tino097 temporarily deployed to alpha-dev January 9, 2026 17:51 — with GitHub Actions Inactive

tino097 had a problem deploying to alpha-dev January 9, 2026 18:00 — with GitHub Actions Failure

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature/hubsot handler updates#12046

Feature/hubsot handler updates#12046
tino097 wants to merge 5 commits into
mindsdb:releases/25.14.0from
tino097:feature/hubsot-handler-updates

tino097 commented Dec 30, 2025

Uh oh!

entelligence-ai-pr-reviews Bot Dec 30, 2025

Uh oh!

entelligence-ai-pr-reviews Bot Dec 30, 2025

Uh oh!

entelligence-ai-pr-reviews Bot Dec 30, 2025

Uh oh!

entelligence-ai-pr-reviews Bot Dec 30, 2025

Uh oh!

MinuraPunchihewa left a comment

Uh oh!

MinuraPunchihewa commented Jan 6, 2026

Uh oh!

tino097 commented Jan 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

tino097 commented Dec 30, 2025

Description

Verification Process

Additional Media:

Checklist:

Uh oh!

entelligence-ai-pr-reviews Bot Dec 30, 2025

Choose a reason for hiding this comment

Uh oh!

entelligence-ai-pr-reviews Bot Dec 30, 2025

Choose a reason for hiding this comment

Uh oh!

entelligence-ai-pr-reviews Bot Dec 30, 2025

Choose a reason for hiding this comment

Uh oh!

entelligence-ai-pr-reviews Bot Dec 30, 2025

Choose a reason for hiding this comment

Uh oh!

MinuraPunchihewa left a comment

Choose a reason for hiding this comment

Uh oh!

MinuraPunchihewa commented Jan 6, 2026

Uh oh!

tino097 commented Jan 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants