Comparing changes

base repository: pinecone-io/pinecone-python-client
base: main
head repository: pinecone-io/pinecone-python-client
compare: release-candidate/2025-10

  • 12 commits
  • 262 files changed
  • 2 contributors

Commits on Nov 3, 2025

  1. Codegen updates for 2025-10 (#526)

    ## Problem
    
    The generated code files need to be regenerated from the updated 2025-10 API spec.
    
    ### Generated code: API updates
    
    - Admin API: Enhanced API keys and projects endpoints; added update
    operations
    - DB Control API: Updated index management with new read capacity
    support
    - DB Data API: Enhanced vector operations, namespace operations, and
    bulk operations
    - Inference API: Updated embedding and reranking models
    - OAuth API: Updated token request/response models
    
    ### Test updates
    
    - Made a few fixes in wrapper code to account for changes to generated
    names and shapes
    - Fixed integration tests for admin, control, and data operations
    - Updated unit tests for model changes
    - Fixed namespace-related tests
    - Updated index configuration tests
    jhamon authored Nov 3, 2025
    ba768fe

Commits on Nov 4, 2025

  1. Add Admin API Update Endpoints and Organization Resource (#527)

    # Add Admin API Update Endpoints and Organization Resource
    
    ## Summary
    
    This PR implements new update endpoints for the Admin API and adds a new
    `OrganizationResource` class to expose organization management
    functionality. The changes include:
    
    1. **API Key Updates**: Added `update()` method to `ApiKeyResource` to
    support updating API key names and roles
    2. **Organization Resource**: Created a new `OrganizationResource` class
    attached to the Admin class with list, fetch, get, describe, and update
    operations
    3. **Integration**: Exposed `OrganizationResource` in the `Admin` class
    via `organization` and `organizations` properties
    4. **Testing**: Added comprehensive integration tests for all new
    functionality
    
    ## Changes
    
    ### 1. API Key Resource Updates
    
    **File**: `pinecone/admin/resources/api_key.py`
    
    - Added `update()` method to `ApiKeyResource` class
    - Supports updating API key `name` and `roles`
    - Includes RST-formatted docstrings with examples
    - Follows existing patterns from other resource classes
    
    **Example Usage**:
    ```python
    from pinecone import Admin
    
    # When initializing, the Admin class reads credentials from the
    # PINECONE_CLIENT_ID and PINECONE_CLIENT_SECRET environment variables
    admin = Admin()
    
    # Update API key name
    api_key = admin.api_key.update(
        api_key_id='my-api-key-id',
        name='updated-api-key-name'
    )
    
    # Update API key roles
    api_key = admin.api_key.update(
        api_key_id='my-api-key-id',
        roles=['ProjectViewer']
    )
    ```
    
    ### 2. Organization Resource
    
    **File**: `pinecone/admin/resources/organization.py` (new file)
    
    Created a new `OrganizationResource` class with the following methods:
    
    - `list()`: List all organizations associated with the account
    - `fetch(organization_id)`: Fetch an organization by ID
    - `get(organization_id)`: Alias for `fetch()`
    - `describe(organization_id)`: Alias for `fetch()`
    - `update(organization_id, name)`: Update an organization's name
    
    **Example Usage**:
    ```python
    from pinecone import Admin
    
    admin = Admin()
    
    # List all organizations
    organizations = admin.organization.list()
    for org in organizations.data:
        print(org.name)
    
    # Fetch an organization
    org = admin.organization.get(organization_id="my-org-id")
    
    # Update an organization
    org = admin.organization.update(
        organization_id="my-org-id",
        name="updated-name"
    )
    ```
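    
    Since `get()` and `describe()` are aliases for `fetch()`, all three calls should return the same organization. A minimal hedged sketch (field names assumed from the examples above):
    
    ```python
    from pinecone import Admin
    
    admin = Admin()
    
    # fetch(), get(), and describe() are aliases and should return the same result
    org_a = admin.organization.fetch(organization_id="my-org-id")
    org_b = admin.organization.get(organization_id="my-org-id")
    org_c = admin.organization.describe(organization_id="my-org-id")
    assert org_a.name == org_b.name == org_c.name
    ```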
    
    ### 3. Integration Tests
    
    **File**: `tests/integration/admin/test_api_key.py`
    
    - Added `test_update_api_key()` test covering:
      - Updating API key name only
      - Updating API key roles only
      - Updating both name and roles
      - Verifying changes persist after fetch
      - Proper cleanup of created resources
    
    **File**: `tests/integration/admin/test_organization.py` (new file)
    
    Added comprehensive integration tests:
    
    - `test_update_organization()`: Tests updating organization name with
    proper cleanup (reverts name changes)
    - `test_list_organizations()`: Tests listing organizations, verifies
    response structure, field types, and dictionary/get-style access
    - `test_fetch_organization()`: Tests fetching an organization by ID,
    verifies all fields match list results
    - `test_fetch_aliases()`: Tests that `fetch()`, `get()`, and
    `describe()` return identical results
    
    All tests include proper error handling and cleanup to avoid resource
    waste.
    
    ## Implementation Details
    
    - All methods follow existing patterns from `ProjectResource` and
    `ApiKeyResource`
    - Uses `@require_kwargs` decorator for parameter validation
    - Error handling follows existing patterns
    - Tests verify both attribute access and dictionary-style access for
    compatibility
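    
    As a hedged illustration of these patterns (keyword-only arguments implied by `@require_kwargs`, plus the two access styles the tests exercise; not a verbatim excerpt from the library):
    
    ```python
    from pinecone import Admin
    
    admin = Admin()  # credentials read from environment variables
    
    # Arguments must be passed by keyword (enforced by @require_kwargs)
    org = admin.organization.fetch(organization_id="my-org-id")
    
    # Returned models are expected to support both access styles
    print(org.name)       # attribute access
    print(org["name"])    # dictionary-style access
    ```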
    
    ## Backward Compatibility
    
    ✅ All changes are backward compatible. No existing functionality is
    modified or removed.
    
    ## Related
    
    - Implements update endpoints found in `pinecone/core/openapi/admin/`
    (generated OpenAPI code)
    - Follows workspace rules for RST docstrings and integration testing
    jhamon authored Nov 4, 2025
    57d8d34
  2. Add .cursor to .gitignore

    jhamon committed Nov 4, 2025
    56c8f8b
  3. Update protobuf to 5.29.5 to address security vulnerability (#525)

    ## Problem
    
    The Pinecone Python client currently pins protobuf to `^5.29`, a
    constraint that still permits versions affected by GHSA-8qvm-5x2c-j2w7.
    This vulnerability involves uncontrolled recursion in Protobuf's
    pure-Python backend, which could lead to Denial of Service (DoS)
    attacks.
    
    ## Solution
    
    Updated the protobuf dependency constraint from `^5.29` to `^5.29.5` to
    ensure we're using the patched version that addresses this security
    vulnerability. The changes include:
    
    - Updated `pyproject.toml`: Changed protobuf version constraint from
    `^5.29` to `^5.29.5`
    - Updated `testing-dependency-grpc.yaml`: Updated protobuf version from
    `5.29.1` to `5.29.5` in all three dependency testing matrix
    configurations
    - Verified that `poetry.lock` already contains protobuf 5.29.5, so no
    additional lock file updates were needed
    
    This is a patch version update, so no breaking changes are expected. The
    protobuf dependency is optional and only installed when the `grpc` extra
    is requested.
    
    **Note:** This is a security patch release to address the immediate
    vulnerability for existing users. A future release will include a
    comprehensive update to protobuf 6.x, which may include breaking changes
    and will require more extensive testing and migration planning.
    
    
    ## Type of Change
    
    - [X] Bug fix (non-breaking change which fixes an issue)
    - [ ] New feature (non-breaking change which adds functionality)
    - [ ] Breaking change (fix or feature that would cause existing
    functionality to not work as expected)
    - [ ] This change requires a documentation update
    - [ ] Infrastructure change (CI configs, etc)
    - [ ] Non-code change (docs, etc)
    - [ ] None of the above: (explain here)
    
    ## Test Plan
    
    - Verified protobuf 5.29.5 is already installed and working
    - Updated CI/CD pipeline to test with the new version
    - No breaking changes expected as this is a patch version update
    rohanshah18 authored and jhamon committed Nov 4, 2025
    9f0eddb
  4. Dedicated Read Capacity and Metadata Schema Configuration for Serverless Indexes (#528)
    
    # Add Support for Read Capacity and Metadata Schema Configuration for
    Serverless Indexes
    
    ## Summary
    
    This PR adds support for configuring `read_capacity` and `schema`
    (metadata schema) for serverless indexes in the Pinecone Python client.
    These features allow users to:
    
    - Configure dedicated read capacity nodes for better performance and
    cost predictability
    - Limit metadata indexing to specific fields for improved performance
    - Configure these settings both at index creation and after creation
    (for `read_capacity`)
    
    ## Features Added
    
    ### 1. Read Capacity Configuration
    
    Serverless indexes can now be configured with either **OnDemand**
    (default) or **Dedicated** read capacity modes. Dedicated mode allocates
    dedicated read nodes for your workload, providing more predictable
    performance and costs.
    
    ### 2. Metadata Schema Configuration
    
    Users can now specify which metadata fields are filterable, limiting
    metadata indexing to only the fields needed for query filtering. This
    improves index building and query performance when dealing with large
    amounts of metadata.
    
    ## Code Examples
    
    ### Creating a Serverless Index with Dedicated Read Capacity
    
    ```python
    from pinecone import Pinecone, ServerlessSpec, CloudProvider, GcpRegion, Metric
    
    pc = Pinecone(api_key='YOUR_API_KEY')
    
    # Create an index with dedicated read capacity
    pc.create_index(
        name='my-index',
        dimension=1536,
        metric=Metric.COSINE,
        spec=ServerlessSpec(
            cloud=CloudProvider.GCP,
            region=GcpRegion.US_CENTRAL1,
            read_capacity={
                "mode": "Dedicated",
                "dedicated": {
                    "node_type": "t1",
                    "scaling": "Manual",
                    "manual": {
                        "shards": 2,
                        "replicas": 2
                    }
                }
            }
        )
    )
    ```
    
    ### Creating a Serverless Index with Metadata Schema
    
    ```python
    from pinecone import Pinecone, ServerlessSpec, CloudProvider, AwsRegion, Metric
    
    pc = Pinecone(api_key='YOUR_API_KEY')
    
    # Create an index with metadata schema configuration
    pc.create_index(
        name='my-index',
        dimension=1536,
        metric=Metric.COSINE,
        spec=ServerlessSpec(
            cloud=CloudProvider.AWS,
            region=AwsRegion.US_WEST_2,
            schema={
                "genre": {"filterable": True},
                "year": {"filterable": True},
                "description": {"filterable": True}
            }
        )
    )
    ```
    
    ### Creating an Index for Model with Read Capacity and Schema
    
    ```python
    from pinecone import Pinecone, CloudProvider, AwsRegion, EmbedModel
    
    pc = Pinecone(api_key='YOUR_API_KEY')
    
    # Create an index for a model with dedicated read capacity and schema
    pc.create_index_for_model(
        name='my-index',
        cloud=CloudProvider.AWS,
        region=AwsRegion.US_EAST_1,
        embed={
            "model": EmbedModel.Multilingual_E5_Large,
            "field_map": {"text": "my-sample-text"}
        },
        read_capacity={
            "mode": "Dedicated",
            "dedicated": {
                "node_type": "t1",
                "scaling": "Manual",
                "manual": {"shards": 1, "replicas": 1}
            }
        },
        schema={
            "category": {"filterable": True},
            "tags": {"filterable": True}
        }
    )
    ```
    
    ### Configuring Read Capacity on an Existing Index
    
    ```python
    from pinecone import Pinecone
    
    pc = Pinecone(api_key='YOUR_API_KEY')
    
    # Switch to OnDemand read capacity
    pc.configure_index(
        name='my-index',
        read_capacity={"mode": "OnDemand"}
    )
    
    # Switch to Dedicated read capacity with manual scaling
    pc.configure_index(
        name='my-index',
        read_capacity={
            "mode": "Dedicated",
            "dedicated": {
                "node_type": "t1",
                "scaling": "Manual",
                "manual": {
                    "shards": 3,
                    "replicas": 2
                }
            }
        }
    )
    
    # Scale up by increasing shards and replicas
    pc.configure_index(
        name='my-index',
        read_capacity={
            "mode": "Dedicated",
            "dedicated": {
                "node_type": "t1",
                "scaling": "Manual",
                "manual": {
                    "shards": 4,
                    "replicas": 3
                }
            }
        }
    )
    
    # Verify the configuration was applied
    desc = pc.describe_index("my-index")
    assert desc.spec.serverless.read_capacity.mode == "Dedicated"
    ```
    
    ### Async Examples
    
    All functionality is also available in the async client:
    
    ```python
    import asyncio
    from pinecone import PineconeAsyncio, ServerlessSpec, CloudProvider, AwsRegion, Metric
    
    async def main():
        async with PineconeAsyncio(api_key='YOUR_API_KEY') as pc:
            # Create index with dedicated read capacity
            await pc.create_index(
                name='my-index',
                dimension=1536,
                metric=Metric.COSINE,
                spec=ServerlessSpec(
                    cloud=CloudProvider.AWS,
                    region=AwsRegion.US_EAST_1,
                    read_capacity={
                        "mode": "Dedicated",
                        "dedicated": {
                            "node_type": "t1",
                            "scaling": "Manual",
                            "manual": {"shards": 2, "replicas": 2}
                        }
                    }
                )
            )
            
            # Configure read capacity later
            await pc.configure_index(
                name='my-index',
                read_capacity={
                    "mode": "Dedicated",
                    "dedicated": {
                        "node_type": "t1",
                        "scaling": "Manual",
                        "manual": {"shards": 3, "replicas": 2}
                    }
                }
            )
    
    asyncio.run(main())
    ```
    
    ## Type Safety Improvements
    
    This PR also improves type hints throughout the codebase by replacing
    `Any` types with specific TypedDict and OpenAPI model types for better
    IDE support and type checking. The following types are now exported from
    the top-level package:
    
    - `ReadCapacityDict`
    - `ReadCapacityOnDemandDict`
    - `ReadCapacityDedicatedDict`
    - `ReadCapacityDedicatedConfigDict`
    - `ScalingConfigManualDict`
    - `MetadataSchemaFieldConfig`
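    
    A hedged sketch of how these exports might be used for static typing; the exact TypedDict field layout is assumed here from the `read_capacity` examples above rather than taken from the generated code:
    
    ```python
    from pinecone import (
        Pinecone, ServerlessSpec, CloudProvider, AwsRegion, Metric, ReadCapacityDict
    )
    
    # Annotating the dict lets mypy and IDEs check the structure before the API call
    read_capacity: ReadCapacityDict = {
        "mode": "Dedicated",
        "dedicated": {
            "node_type": "t1",
            "scaling": "Manual",
            "manual": {"shards": 2, "replicas": 2},
        },
    }
    
    pc = Pinecone(api_key='YOUR_API_KEY')
    pc.create_index(
        name='my-index',
        dimension=1536,
        metric=Metric.COSINE,
        spec=ServerlessSpec(
            cloud=CloudProvider.AWS,
            region=AwsRegion.US_EAST_1,
            read_capacity=read_capacity,
        ),
    )
    ```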
    
    ## Changes
    
    ### Core Functionality
    - Added `read_capacity` and `schema` parameters to `ServerlessSpec`
    class
    - Extended `create_index` to support `read_capacity` and `schema` via
    `ServerlessSpec`
    - Extended `create_index_for_model` to support `read_capacity` and
    `schema`
    - Extended `configure_index` to support `read_capacity` for serverless
    indexes
    - Added helper methods `__parse_read_capacity` and `__parse_schema` in
    request factory
    - Improved type hints throughout the codebase (replacing `Any` with
    specific types)
    
    ### Documentation
    - Updated `create_index` docstrings in both sync and async interfaces
    - Updated `create_index_for_model` docstrings in both sync and async
    interfaces
    - Updated `configure_index` docstrings in both sync and async interfaces
    - Added comprehensive examples in
    `docs/db_control/serverless-indexes.md`
    - Added code examples showing how to configure read capacity
    
    ### Testing
    - Added integration tests for `create_index` with `read_capacity` and
    `schema`
    - Added integration tests for `create_index_for_model` with
    `read_capacity` and `schema`
    - Added integration tests for `configure_index` with `read_capacity`
    - Tests cover both sync and async clients
    - Tests cover edge cases including transitions between read capacity
    modes
    
    ## Breaking Changes
    
    None. All changes are additive and backward compatible.
    jhamon authored Nov 4, 2025
    b63a907
  5. Implement fetch_by_metadata for Index and IndexAsyncio (#529)

    # Implement `fetch_by_metadata` for Index and IndexAsyncio
    
    This PR adds the `fetch_by_metadata` method to both synchronous and
    asynchronous Pinecone index clients, allowing users to retrieve vectors
    based on metadata filters rather than requiring explicit vector IDs.
    
    ## Overview
    
    The `fetch_by_metadata` operation enables querying vectors by their
    metadata attributes, similar to how `query` works but without requiring
    a query vector. This is particularly useful for:
    - Retrieving all vectors matching specific metadata criteria
    - Building data pipelines that filter by metadata
    - Implementing metadata-based data retrieval workflows
    
    ## Usage Examples
    
    ### Basic Usage (Synchronous)
    
    ```python
    from pinecone import Pinecone
    
    pc = Pinecone(api_key='your-api-key')
    index = pc.Index(host='your-index-host')
    
    # Fetch vectors with simple metadata filter
    result = index.fetch_by_metadata(
        filter={"genre": "action"},
        namespace="movies"
    )
    
    # Iterate over results
    for vec_id, vector in result.vectors.items():
        print(f"ID: {vector.id}, Metadata: {vector.metadata}")
    ```
    
    ### Complex Filtering
    
    ```python
    # Using multiple filter conditions
    result = index.fetch_by_metadata(
        filter={
            "genre": {"$in": ["comedy", "drama"]},
            "year": {"$gte": 2020},
            "rating": {"$gt": 7.5}
        },
        namespace="movies",
        limit=100
    )
    ```
    
    ### Pagination
    
    ```python
    # First page
    result = index.fetch_by_metadata(
        filter={"status": "active"},
        namespace="products",
        limit=50
    )
    
    # Continue to next page if available
    if result.pagination and result.pagination.next:
        next_page = index.fetch_by_metadata(
            filter={"status": "active"},
            namespace="products",
            limit=50,
            pagination_token=result.pagination.next
        )
    ```
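    
    A hedged sketch of draining every page with a loop (continuing the example above; response fields as described in the Response Structure section below):
    
    ```python
    # Collect all matching vectors across pages
    page = index.fetch_by_metadata(
        filter={"status": "active"},
        namespace="products",
        limit=50
    )
    all_vectors = dict(page.vectors)
    
    while page.pagination and page.pagination.next:
        page = index.fetch_by_metadata(
            filter={"status": "active"},
            namespace="products",
            limit=50,
            pagination_token=page.pagination.next
        )
        all_vectors.update(page.vectors)
    ```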
    
    ### Asynchronous Usage
    
    ```python
    import asyncio
    from pinecone import Pinecone
    
    async def main():
        pc = Pinecone(api_key='your-api-key')
        async with pc.IndexAsyncio(host='your-index-host') as index:
            result = await index.fetch_by_metadata(
                filter={"category": "electronics", "in_stock": True},
                namespace="inventory",
                limit=100
            )
            
            for vec_id, vector in result.vectors.items():
                print(f"Product {vector.id}: {vector.metadata}")
    
    asyncio.run(main())
    ```
    
    ### gRPC Usage
    
    ```python
    from pinecone.grpc import PineconeGRPC
    
    pc = PineconeGRPC(api_key='your-api-key')
    index = pc.Index(host='your-index-host')
    
    # Synchronous gRPC call
    result = index.fetch_by_metadata(
        filter={"tag": "featured"},
        namespace="articles"
    )
    
    # Asynchronous gRPC call (returns future)
    future = index.fetch_by_metadata(
        filter={"tag": "featured"},
        namespace="articles",
        async_req=True
    )
    
    # Wait for result
    result = future.result()
    ```
    
    ### Filter Operators
    
    The `fetch_by_metadata` method supports all standard Pinecone metadata
    filter operators:
    
    ```python
    # Equality
    filter={"status": "active"}
    
    # Comparison operators
    filter={"price": {"$gt": 100}}
    filter={"age": {"$gte": 18}}
    filter={"score": {"$lt": 0.5}}
    filter={"count": {"$lte": 10}}
    
    # Array operators
    filter={"tags": {"$in": ["red", "blue", "green"]}}
    filter={"categories": {"$nin": ["deprecated"]}}
    
    # Existence check
    filter={"description": {"$exists": True}}
    
    # Logical operators
    filter={
        "$and": [
            {"status": "active"},
            {"price": {"$lt": 50}}
        ]
    }
    
    filter={
        "$or": [
            {"category": "electronics"},
            {"category": "computers"}
        ]
    }
    ```
    
    ## Response Structure
    
    The method returns a `FetchByMetadataResponse` object containing:
    
    ```python
    class FetchByMetadataResponse:
        namespace: str                    # The namespace queried
        vectors: Dict[str, Vector]        # Dictionary of vector ID to Vector objects
        usage: Usage                      # API usage information
        pagination: Optional[Pagination]  # Pagination token for next page (if available)
    ```
    
    ## Technical Changes
    
    ### Core Implementation
    
    - Added `fetch_by_metadata` method to `Index` (sync) and `_IndexAsyncio`
    (async) classes
    - Added `fetch_by_metadata` method to `GRPCIndex` with support for
    `async_req`
    - Created `FetchByMetadataResponse` dataclass with pagination support
    - Added request factory method
    `IndexRequestFactory.fetch_by_metadata_request`
    - Added gRPC response parser `parse_fetch_by_metadata_response`
    
    ### Protobuf Migration
    
    - Migrated from `db_data_2025_04` protobuf stubs to `db_data_2025_10`
    stubs
    - Updated all gRPC-related imports and references
    - Removed deprecated 2025-04 stub files
    
    ### Testing
    
    - Added comprehensive integration tests for sync
    (`test_fetch_by_metadata.py`)
    - Added comprehensive integration tests for async
    (`test_fetch_by_metadata.py`)
    - Added gRPC futures tests (`test_fetch_by_metadata_future.py`)
    - Added unit tests for request factory (`test_request_factory.py`)
    - Added unit tests for Index class (`test_index.py`)
    - Updated all unit test files to use 2025-10 protobuf stubs
    
    ### Documentation
    
    - Added usage examples to `docs/db_data/index-usage-byov.md`
    - Updated interface docstrings with examples
    
    ## Breaking Changes
    
    None. This is a new feature addition.
    
    ## Migration Notes
    
    No migration required. This is a new feature that doesn't affect
    existing functionality.
    jhamon authored Nov 4, 2025
    b3267a5
  6. Commit 3c166cc
  7. Add support for match_terms parameter in search operations (#530)

    # Add support for `match_terms` parameter in search operations
    
    ## Summary
    
    This PR adds support for the `match_terms` parameter in the `search` and
    `search_records` methods for both `Pinecone` and `PineconeAsyncio`
    clients. The `match_terms` feature allows users to specify which terms
    must be present in the text of each search hit based on a specified
    strategy.
    
    ## Changes
    
    ### Core Implementation
    
    - **Type Definitions**
    (`pinecone/db_data/types/search_query_typed_dict.py`):
    - Added `match_terms` field to `SearchQueryTypedDict` with comprehensive
    docstring including limitations
    
    - **Dataclass** (`pinecone/db_data/dataclasses/search_query.py`):
    - Added `match_terms: Optional[Dict[str, Any]]` field to `SearchQuery`
    dataclass
      - Updated `as_dict()` method to include `match_terms` when present
    
    - **Request Factory** (`pinecone/db_data/request_factory.py`):
    - Updated `_parse_search_query()` to convert `match_terms` dictionary to
    `SearchMatchTerms` OpenAPI model
      - Added proper type conversion to ensure API compatibility
    
    - **Interfaces**:
    - Updated `IndexInterface.search()` and
    `IndexInterface.search_records()` docstrings in
    `pinecone/db_data/interfaces.py`
    - Updated `IndexAsyncioInterface.search()` and
    `IndexAsyncioInterface.search_records()` docstrings in
    `pinecone/db_data/index_asyncio_interface.py`
      - Added documentation explaining `match_terms` usage and limitations
    
    ### Testing
    
    Added integration tests for both synchronous and asynchronous clients:
    
    - `tests/integration/data/test_search_and_upsert_records.py`:
    - `test_search_with_match_terms_dict`: Tests `match_terms` using
    dictionary input
    - `test_search_with_match_terms_searchquery`: Tests `match_terms` using
    `SearchQuery` dataclass
    
    - `tests/integration/data_asyncio/test_search_and_upsert_records.py`:
    - `test_search_with_match_terms_dict`: Async version with dictionary
    input
    - `test_search_with_match_terms_searchquery`: Async version with
    `SearchQuery` dataclass
    
    All tests handle the expected API limitation where `match_terms` is only
    supported for specific model configurations.
    
    ## Usage
    
    Users can now pass `match_terms` in their search queries:
    
    ```python
    from pinecone import Pinecone
    
    pc = Pinecone()
    index = pc.Index("my-index")
    
    # Using dictionary
    query = {
        "inputs": {"text": "Apple corporation"},
        "top_k": 3,
        "match_terms": {"strategy": "all", "terms": ["Apple", "corporation"]}
    }
    results = index.search(namespace="my-namespace", query=query)
    
    # Using SearchQuery dataclass
    from pinecone.db_data.dataclasses.search_query import SearchQuery
    
    query = SearchQuery(
        inputs={"text": "Apple corporation"},
        top_k=3,
        match_terms={"strategy": "all", "terms": ["Apple", "corporation"]}
    )
    results = index.search(namespace="my-namespace", query=query)
    ```
    
    ## Limitations
    
    **Important:** `match_terms` is only supported for sparse indexes with
    integrated embedding configured to use the `pinecone-sparse-english-v0`
    model. This limitation is documented in all relevant docstrings and
    interface methods.
    
    The implementation gracefully handles API errors when `match_terms` is
    used with unsupported models, ensuring the parameter is correctly passed
    to the API even when the model configuration doesn't support it.
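    
    For callers, a hedged sketch of guarding against this limitation (the exception type shown is the client's generic `PineconeApiException`; the exact error returned for unsupported models is not specified here):
    
    ```python
    from pinecone import Pinecone
    from pinecone.exceptions import PineconeApiException
    
    pc = Pinecone()
    index = pc.Index("my-sparse-index")
    
    try:
        results = index.search(
            namespace="my-namespace",
            query={
                "inputs": {"text": "Apple corporation"},
                "top_k": 3,
                "match_terms": {"strategy": "all", "terms": ["Apple", "corporation"]}
            }
        )
    except PineconeApiException as exc:
        # Expected when the index is not a sparse index using pinecone-sparse-english-v0
        print(f"match_terms not supported for this index configuration: {exc}")
    ```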
    
    ## API Compatibility
    
    This implementation follows the OpenAPI specification in
    `pinecone/core/openapi/db_data/model/search_records_request_query.py`,
    which defines `match_terms` as part of `SearchRecordsRequestQuery` (used
    by `search` and `search_records` methods). Note that `match_terms` is
    not available for the `query` method, which uses `QueryRequest`.
    
    ## Testing
    
    - ✅ All integration tests pass for both sync and async clients
    - ✅ Tests verify correct parameter passing and error handling
    - ✅ Linter checks pass with no errors
    - ✅ Type hints verified with mypy
    jhamon authored Nov 4, 2025
    96fbe24
  8. Add FilterBuilder for Metadata Filter Construction (#531)

    # Add FilterBuilder for Metadata Filter Construction
    
    ## Summary
    
    Introduces a `FilterBuilder` class that provides a fluent, type-safe API
    for constructing Pinecone metadata filters. This helps prevent common
    filter construction errors such as misspelled operator names or invalid
    filter structures.
    
    ## Changes
    
    ### New Features
    
    - **FilterBuilder class** (`pinecone/db_data/filter_builder.py`):
    - Fluent builder API for all Pinecone filter operators (`eq`, `ne`,
    `gt`, `gte`, `lt`, `lte`, `in_`, `nin`, `exists`)
      - Operator overloading: `&` for AND, `|` for OR
      - Supports nested logical combinations
      - Full type hints (no `Any` types)
      - RST-formatted docstrings with examples
    
    - **Updated filter types** (`pinecone/db_data/types/query_filter.py`):
      - Added `$or` support (`OrFilter`)
      - Added `$exists` support (`ExistsFilter`)
      - Updated `FilterTypedDict` to include both
    
    - **Package exports**:
    - `FilterBuilder` exported from main `pinecone` package for easy access
    
    - **Unit tests** (`tests/unit/data/test_filter_builder.py`):
      - Coverage for all operators
      - Operator overloading tests
      - Complex nested filter tests
      - Edge cases and error conditions
    
    ## Usage Examples
    
    ### Simple Filters
    
    ```python
    from pinecone import FilterBuilder
    
    # Simple equality
    filter = FilterBuilder().eq("genre", "drama").build()
    # Returns: {"genre": "drama"}
    
    # Using operators
    filter = FilterBuilder().gt("year", 2020).build()
    # Returns: {"year": {"$gt": 2020}}
    
    filter = FilterBuilder().in_("genre", ["comedy", "drama"]).build()
    # Returns: {"genre": {"$in": ["comedy", "drama"]}}
    ```
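    
    The remaining operators follow the same pattern; a hedged sketch (output shapes assumed to mirror the standard `$`-operator filter syntax):
    
    ```python
    from pinecone import FilterBuilder
    
    filter = FilterBuilder().ne("status", "archived").build()
    # Expected: {"status": {"$ne": "archived"}}
    
    filter = FilterBuilder().lte("year", 2000).build()
    # Expected: {"year": {"$lte": 2000}}
    
    filter = FilterBuilder().nin("genre", ["horror"]).build()
    # Expected: {"genre": {"$nin": ["horror"]}}
    ```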
    
    ### Complex Filters with Operator Overloading
    
    ```python
    # Multiple conditions with AND using & operator
    filter = (FilterBuilder().eq("genre", "drama") & 
              FilterBuilder().gt("year", 2020)).build()
    # Returns: {"$and": [{"genre": "drama"}, {"year": {"$gt": 2020}}]}
    
    # Multiple conditions with OR using | operator
    filter = (FilterBuilder().eq("genre", "comedy") | 
              FilterBuilder().eq("genre", "drama")).build()
    # Returns: {"$or": [{"genre": "comedy"}, {"genre": "drama"}]}
    
    # Complex nested conditions
    filter = ((FilterBuilder().eq("genre", "drama") & 
               FilterBuilder().gt("year", 2020)) |
              (FilterBuilder().eq("genre", "comedy") & 
               FilterBuilder().lt("year", 2000))).build()
    ```
    
    ### Using with Query Methods
    
    ```python
    from pinecone import FilterBuilder
    
    # In query
    index.query(
        vector=embedding,
        top_k=10,
        filter=FilterBuilder().eq("genre", "drama").build()
    )
    
    # In fetch_by_metadata
    filter = (FilterBuilder().in_("genre", ["comedy", "drama"]) &
              FilterBuilder().eq("year", 2019)).build()
    index.fetch_by_metadata(filter=filter, namespace='my_namespace')
    ```
    
    ## Benefits
    
    1. **Type safety**: Prevents misspelled operator names (e.g., `$eq` vs
    `$equals`)
    2. **Structure validation**: Prevents invalid filter structures (e.g.,
    multiple operators as siblings without `$and`/`$or`)
    3. **Better ergonomics**: Operator overloading makes complex filters
    more readable
    4. **Consistency**: Method names match Pinecone API operators (`$in` →
    `in_()`, `$nin` → `nin()`, etc.)
    5. **Backward compatible**: Users can still use raw dicts; FilterBuilder
    is optional
    
    ## Testing
    
    - 40+ unit tests covering all operators, operator overloading, nested
    filters, and edge cases
    - All tests pass with comprehensive coverage of the FilterBuilder API
    
    ## Backward Compatibility
    
    This change is fully backward compatible. Existing code using raw filter
    dictionaries continues to work unchanged. FilterBuilder is an optional
    helper that users can adopt at their own pace.
    jhamon authored Nov 4, 2025
    05b71a7
  9. Add create_namespace method to Index and IndexAsyncio (#532)

    # Add `create_namespace` method to Index and IndexAsyncio
    
    ## Summary
    
    This PR adds the `create_namespace` method to both synchronous and
    asynchronous Index clients, as well as the GRPC implementation. The
    method allows users to create namespaces in serverless indexes with
    optional schema configuration.
    
    ## Changes
    
    ### REST API Implementation (Sync & Async)
    
    - **Request Factory**
    (`pinecone/db_data/resources/sync/namespace_request_factory.py`):
      - Added `CreateNamespaceArgs` TypedDict
    - Added `create_namespace_args` method with validation for namespace
    name and optional schema handling
    
    - **Resource Classes**:
      - `NamespaceResource.create()` - Synchronous implementation
      - `NamespaceResourceAsyncio.create()` - Asynchronous implementation
    - Both methods accept `name` and optional `schema` (as dictionary)
    parameters
    
    - **Interface Definitions**:
      - Added `create_namespace()` abstract method to `IndexInterface`
    - Added `create_namespace()` abstract method to `IndexAsyncioInterface`
      - Both include comprehensive RST docstrings with examples
    
    - **Class Implementations**:
      - `Index.create_namespace()` - Delegates to namespace resource
    - `IndexAsyncio.create_namespace()` - Delegates to namespace resource
    with async support
    
    ### GRPC Implementation
    
    - **GRPCIndex** (`pinecone/grpc/index_grpc.py`):
    - Added `create_namespace()` method with `async_req` support for GRPC
    futures
    - Handles schema conversion from dictionary to `MetadataSchema` proto
    object
      - Supports both synchronous and asynchronous (future-based) execution
    
    ### Testing
    
    - **Unit Tests** (`tests/unit_grpc/test_grpc_index_namespace.py`):
      - `test_create_namespace` - Basic functionality
      - `test_create_namespace_with_timeout` - Timeout handling
      - `test_create_namespace_with_schema` - Schema conversion validation
    
    - **Integration Tests** (`tests/integration/data/test_namespace.py`):
      - `test_create_namespace` - Successful namespace creation
    - `test_create_namespace_duplicate` - Error handling for duplicate
    namespaces
    
    - **Integration Tests**
    (`tests/integration/data_asyncio/test_namespace_asyncio.py`):
      - `test_create_namespace` - Async successful namespace creation
    - `test_create_namespace_duplicate` - Async error handling for duplicate
    namespaces
    
    - **GRPC Futures Integration Tests**
    (`tests/integration/data_grpc_futures/test_namespace_future.py`):
    - `test_create_namespace_future` - Creating namespace with
    `async_req=True`
    - `test_create_namespace_future_duplicate` - Error handling with futures
    - `test_create_namespace_future_multiple` - Concurrent namespace
    creation
    
    ## API Design
    
    The `create_namespace` method signature is consistent across all
    implementations:
    
    ```python
    def create_namespace(
        self, 
        name: str, 
        schema: Optional[Dict[str, Any]] = None, 
        **kwargs
    ) -> NamespaceDescription
    ```
    
    - **Public API**: Uses `Optional[Dict[str, Any]]` for schema to avoid
    exposing OpenAPI types
    - **Schema Format**: Accepts a dictionary with `fields` key containing
    field definitions
    - **Returns**: `NamespaceDescription` object containing namespace
    information
    
    ## Examples
    
    ### REST API (Synchronous)
    ```python
    from pinecone import Pinecone
    
    pc = Pinecone()
    index = pc.Index(host="example-index.svc.pinecone.io")
    
    # Create namespace without schema
    namespace = index.create_namespace(name="my-namespace")
    
    # Create namespace with schema
    schema = {
        "fields": {
            "field1": {"filterable": True},
            "field2": {"filterable": False}
        }
    }
    namespace = index.create_namespace(name="my-namespace", schema=schema)
    ```
    
    ### REST API (Asynchronous)
    ```python
    import asyncio
    from pinecone import Pinecone
    
    async def main():
        pc = Pinecone()
        async with pc.IndexAsyncio(host="example-index.svc.pinecone.io") as index:
            namespace = await index.create_namespace(name="my-namespace")
            print(f"Created namespace: {namespace.name}")
    
    asyncio.run(main())
    ```
    
    ### GRPC (Synchronous)
    ```python
    from pinecone.grpc import PineconeGRPC
    
    pc = PineconeGRPC()
    index = pc.Index(host="example-index.svc.pinecone.io")
    
    namespace = index.create_namespace(name="my-namespace")
    ```
    
    ### GRPC (Asynchronous/Futures)
    ```python
    from pinecone.grpc import PineconeGRPC
    from concurrent.futures import as_completed
    
    pc = PineconeGRPC()
    index = pc.Index(host="example-index.svc.pinecone.io")
    
    # Create namespace asynchronously
    future = index.create_namespace(name="my-namespace", async_req=True)
    namespace = future.result(timeout=30)
    
    # Create multiple namespaces concurrently
    futures = [
        index.create_namespace(name=f"ns-{i}", async_req=True) 
        for i in range(3)
    ]
    for future in as_completed(futures):
        namespace = future.result()
        print(f"Created: {namespace.name}")
    ```
    
    ## Type Hints
    
    - Public-facing methods use `Optional[Dict[str, Any]]` for schema
    parameter
    - Internal resource methods handle conversion from dict to OpenAPI
    models
    - GRPC implementation converts dict to `MetadataSchema` proto object
    
    ## Error Handling
    
    - Validates that namespace name is a non-empty string
    - Raises `PineconeApiException` for REST API errors
    - Raises `PineconeException` for GRPC errors
    - Properly handles duplicate namespace creation attempts
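    
    For example, a hedged sketch of handling the duplicate-namespace case with the REST client (the exact error code returned for duplicates is not specified here):
    
    ```python
    from pinecone import Pinecone
    from pinecone.exceptions import PineconeApiException
    
    pc = Pinecone()
    index = pc.Index(host="example-index.svc.pinecone.io")
    
    try:
        index.create_namespace(name="my-namespace")
    except PineconeApiException as exc:
        # Raised if a namespace with this name already exists (among other API errors)
        print(f"Could not create namespace: {exc}")
    ```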
    
    ## Documentation
    
    All methods include comprehensive RST docstrings with:
    - Parameter descriptions
    - Return value descriptions
    - Usage examples
    - Links to relevant documentation
    
    ## Testing Status
    
    ✅ All unit tests passing  
    ✅ All integration tests passing (REST sync/async)  
    ✅ All GRPC futures integration tests passing  
    
    ## Notes
    
    - This operation is only supported for serverless indexes
    - Namespaces must have unique names within an index
    - Schema configuration is optional and can be added when creating the
    namespace or later
    jhamon authored Nov 4, 2025
    23dda74
  10. Bump assistant plugin (#534)

    # Update Pinecone Assistant Plugin to v3.0.0
    
    ## Summary
    
    This PR updates the `pinecone-plugin-assistant` dependency from `^1.6.0`
    to `3.0.0`.
    
    ## Changes
    
    - Updated `pinecone-plugin-assistant` version constraint in
    `pyproject.toml` from `^1.6.0` to `3.0.0`
    - Updated `poetry.lock` to reflect the new dependency version and
    resolved sub-dependencies
    
    ## Breaking Changes
    
    None - This is a dependency version update only.
    jhamon authored Nov 4, 2025
    e872037
  11. Intelligent CI Test Selection for PRs (#536)

    # Intelligent CI Test Selection for PRs
    
    ## Summary
    
    This PR implements intelligent test selection for pull requests,
    automatically determining which integration test suites to run based on
    changed files. This reduces CI time and costs by running only relevant
    tests while maintaining safety through fallback mechanisms.
    
    ## Problem
    
    Previously, all integration test suites ran on every PR regardless of
    what code changed. This resulted in:
    - Unnecessary CI execution time and costs
    - Slower feedback cycles for developers
    - Resource waste when only a small portion of the codebase changed
    
    ## Solution
    
    The implementation analyzes changed files in PRs and maps them to
    specific test suites. It includes:
    - **Automatic test selection**: Runs only test suites relevant to
    changed code paths
    - **Safety fallbacks**: Runs all tests when changes touch critical
    infrastructure or when analysis fails
    - **Manual override**: Option to force running all tests via workflow
    dispatch
    
    ## Changes
    
    ### 1. Test Suite Mapping Script
    (`.github/scripts/determine-test-suites.py`)
    - Analyzes git diff to identify changed files
    - Maps code paths to test suites:
    - `pinecone/db_control/` → control tests (serverless, resources/index,
    resources/collections, asyncio variants)
      - `pinecone/db_data/` → data tests (sync, asyncio, gRPC)
      - `pinecone/inference/` → inference tests (sync, asyncio)
      - `pinecone/admin/` → admin tests
      - `pinecone/grpc/` → gRPC-specific tests
      - Plugin-related files → plugin tests
    - Identifies critical paths that require full test suite:
      - `pinecone/config/`, `pinecone/core/`, `pinecone/openapi_support/`
      - `pinecone/utils/`, `pinecone/exceptions/`
      - Core interface files (`pinecone.py`, `pinecone_asyncio.py`, etc.)
    - Falls back to running all tests if:
      - Script execution fails
      - No files match any mapping
      - Critical paths are touched
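    
    A hedged, simplified sketch of this mapping logic; the suite labels, fallback shape, and diff invocation below are illustrative rather than copied from the actual script:
    
    ```python
    import subprocess
    from typing import List
    
    # Illustrative path-to-suite mapping; see .github/scripts/determine-test-suites.py
    # for the real mapping used in CI
    PATH_TO_SUITES = {
        "pinecone/db_data/": ["data", "data_asyncio", "data_grpc_futures"],
        "pinecone/db_control/": ["control/serverless", "control/resources/index"],
        "pinecone/inference/": ["inference/sync", "inference/asyncio"],
        "pinecone/admin/": ["admin"],
        "pinecone/grpc/": ["data_grpc_futures"],
    }
    
    # Changes under these paths trigger the full test suite
    CRITICAL_PATHS = (
        "pinecone/config/", "pinecone/core/", "pinecone/openapi_support/",
        "pinecone/utils/", "pinecone/exceptions/",
    )
    
    def determine_suites(base_ref: str = "origin/main") -> List[str]:
        changed = subprocess.run(
            ["git", "diff", "--name-only", base_ref],
            capture_output=True, text=True, check=True,
        ).stdout.splitlines()
    
        # Safety fallback: critical infrastructure touched, so run everything
        if any(path.startswith(CRITICAL_PATHS) for path in changed):
            return ["all"]
    
        suites = sorted({
            suite
            for path in changed
            for prefix, group in PATH_TO_SUITES.items()
            if path.startswith(prefix)
            for suite in group
        })
    
        # Safety fallback: nothing matched, so run everything
        return suites or ["all"]
    ```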
    
    ### 2. Updated PR Workflow (`.github/workflows/on-pr.yaml`)
    - Added `determine-test-suites` job that runs before integration tests
    - Added `run_all_tests` input parameter for manual override via workflow
    dispatch
    - Passes selected test suites to integration test workflow
    - Includes error handling and validation
    
    ### 3. Updated Integration Test Workflow
    (`.github/workflows/testing-integration.yaml`)
    - Added optional inputs for each job type's test suites:
      - `rest_sync_suites_json`
      - `rest_asyncio_suites_json`
      - `grpc_sync_suites_json`
      - `admin_suites_json`
    - Filters test matrix based on provided suites
    - Skips jobs when their test suite array is empty
    - Maintains backward compatibility (runs all tests when inputs not
    provided)
    
    ## Usage
    
    ### Automatic (Default)
    On every PR, the workflow automatically:
    1. Analyzes changed files
    2. Determines relevant test suites
    3. Runs only those test suites
    
    ### Manual Override
    To force running all tests on a PR:
    1. Go to Actions → "Testing (PR)" workflow
    2. Click "Run workflow"
    3. Check "Run all integration tests regardless of changes"
    4. Run the workflow
    
    ## Safety Features
    
    1. **Critical path detection**: Changes to core infrastructure (config,
    utils, exceptions, etc.) trigger full test suite
    2. **Fallback on failure**: If the analysis script fails, falls back to
    running all tests
    3. **Empty result handling**: If no tests match, runs all tests as a
    safety measure
    4. **Main branch unchanged**: Main branch workflows continue to run all
    tests
    
    ## Example Scenarios
    
    ### Scenario 1: Change only `pinecone/db_data/index.py`
    - **Runs**: `data`, `data_asyncio`, `data_grpc_futures` test suites
    - **Skips**: `control/*`, `inference/*`, `admin`, `plugins` test suites
    - **Result**: ~70% reduction in test execution
    
    ### Scenario 2: Change `pinecone/config/pinecone_config.py`
    - **Runs**: All test suites (critical path)
    - **Reason**: Configuration changes affect all functionality
    
    ### Scenario 3: Change `pinecone/inference/inference.py`
    - **Runs**: `inference/sync`, `inference/asyncio` test suites
    - **Skips**: Other test suites
    - **Result**: ~85% reduction in test execution
    
    ## Testing
    
    The implementation has been tested with:
    - ✅ YAML syntax validation
    - ✅ Python script syntax validation
    - ✅ Test suite mapping logic verification
    - ✅ Edge case handling (empty arrays, failures, etc.)
    
    ## Benefits
    
    - **Cost savings**: Reduce CI costs by running only relevant tests
    - **Faster feedback**: Developers get test results faster when only
    subset runs
    - **Better resource utilization**: CI runners are used more efficiently
    - **Maintainability**: Easy to update mappings as codebase evolves
    
    ## Backward Compatibility
    
    - Main branch workflows unchanged (still run all tests)
    - PR workflows backward compatible (can manually trigger full suite)
    - Existing test suite structure unchanged
    - No changes to test code itself
    
    ## Future Improvements
    
    Potential enhancements for future PRs:
    - Track test execution time savings
    - Add metrics/logging for test selection decisions
    - Fine-tune mappings based on actual usage patterns
    - Consider test dependencies (e.g., if A changes, also run B)
    jhamon authored Nov 4, 2025
    c27d3c2