D4S.Indexer
1.0.20
dotnet add package D4S.Indexer --version 1.0.20
Document indexing library for Azure AI Search: extracts text, generates vector embeddings, and uploads searchable chunks.
Quick start
var indexer = IndexerBuilder.Create("my-index")
    .WithAzureSearch(searchEndpoint, searchKey)
    .WithAzureOpenAI(aoaiEndpoint, aoaiKey, embeddingDeployment, embeddingDimensions)
    .WithLocalFiles("./documents")
    .WithFileMetadataFields()
    .Build();

var result = await indexer.IndexAsync();
See src/Rag/samples/ for working examples (local files, SharePoint, OCR, agentic retrieval).
Architecture
- D4S.Indexer.Domain: entities, abstractions (interfaces)
- D4S.Indexer.Application: orchestration (DocumentIndexerService, DocumentExtractor)
- D4S.Indexer.Infrastructure: Azure implementations, builder, processors, sources
| Interface | Purpose |
|---|---|
| `IDocumentSource` | Enumerates documents from a data source |
| `IDocumentProcessor` | Extracts text/metadata from a document |
| `IEmbeddingService` | Generates vector embeddings |
| `ISearchIndexService` | Manages the index (CRUD on chunks) |
| `ITextChunker` | Splits text into chunks |
| `IOcrService` / `IKeywordExtractor` | OCR for scans / AI keyword extraction |
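To make the division of roles in the table concrete, here is a minimal sketch of how the stages could compose. It does not use the library's interfaces: each delegate parameter is a hypothetical stand-in for one role, the `IndexedChunk` record and `PipelineSketch` class are invented for illustration, and the flow is synchronous for brevity, whereas the real services are presumably asynchronous.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record for illustration; not a D4S.Indexer type.
public sealed record IndexedChunk(string DocumentId, int Ordinal, string Text, float[] Vector);

public static class PipelineSketch
{
    // Each delegate stands in for one role from the table above.
    public static List<IndexedChunk> Run(
        IEnumerable<string> documentIds,              // role of IDocumentSource
        Func<string, string> extractText,             // role of IDocumentProcessor
        Func<string, IEnumerable<string>> chunk,      // role of ITextChunker
        Func<string, float[]> embed,                  // role of IEmbeddingService
        Action<IReadOnlyList<IndexedChunk>> upload)   // role of ISearchIndexService
    {
        var all = new List<IndexedChunk>();
        foreach (var id in documentIds)
        {
            // Extract text, split it, and attach one vector per chunk.
            var text = extractText(id);
            var chunks = chunk(text)
                .Select((piece, i) => new IndexedChunk(id, i, piece, embed(piece)))
                .ToList();

            // Upload the chunks of one document together.
            upload(chunks);
            all.AddRange(chunks);
        }
        return all;
    }
}
```

Passing plain functions keeps the sketch independent of the actual interface signatures, which this page does not list.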
Built-in sources: local filesystem, multi-site SharePoint (PnP Core). Built-in processors: PDF, DOCX, XLSX, PPTX, TXT/Markdown.
Indexing modes
- Full (default): all documents are fetched from every source; indexed documents that no longer appear in the source listing are deleted from the index.
- Delta (`.WithDeltaMode()`): only changed, new, or deleted documents are provided; deletion is driven by `DocumentMetadata.DeletedDate` (set it and pass `null` for `GetContentAsync`). No implicit cleanup.
Both modes compare LastModifiedDate against the index to skip unchanged documents.
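The difference between the two modes can be sketched as a small reconciliation step. This is an illustration under stated assumptions, not the library's implementation: `SourceDoc`, `Plan`, and `ReconcileSketch` are hypothetical names, a document counts as unchanged when the stored date is not older than the source date, and `DeletedDate` is only honored in delta mode.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in types; these are not the D4S.Indexer API.
public sealed record SourceDoc(string Id, DateTimeOffset LastModified, DateTimeOffset? DeletedDate = null);

public sealed record Plan(List<string> Upsert, List<string> Delete, List<string> Skip);

public static class ReconcileSketch
{
    // indexed maps document id -> last-modified date currently stored in the index.
    public static Plan Build(
        IEnumerable<SourceDoc> listed,
        IReadOnlyDictionary<string, DateTimeOffset> indexed,
        bool deltaMode)
    {
        var plan = new Plan(new List<string>(), new List<string>(), new List<string>());
        var seen = new HashSet<string>();

        foreach (var doc in listed)
        {
            seen.Add(doc.Id);
            if (deltaMode && doc.DeletedDate is not null)
                plan.Delete.Add(doc.Id);   // delta: deletion is explicit
            else if (indexed.TryGetValue(doc.Id, out var stored) && stored >= doc.LastModified)
                plan.Skip.Add(doc.Id);     // both modes: unchanged documents are skipped
            else
                plan.Upsert.Add(doc.Id);   // new or modified
        }

        // Full mode only: anything indexed but not listed is removed.
        if (!deltaMode)
            plan.Delete.AddRange(indexed.Keys.Where(id => !seen.Contains(id)));

        return plan;
    }
}
```

In this sketch a delta run never touches documents it was not told about, which is why a source used in delta mode has to report deletions itself.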
Builder options
IndexerBuilder.Create("index-name")
    // Required
    .WithAzureSearch(endpoint, apiKey)
    .WithAzureOpenAI(endpoint, apiKey, deployment, dimensions)
    // Sources (at least one)
    .WithLocalFiles("./docs") // or: opts => { opts.Path = …; opts.FileExtensions = […]; }
    .WithSharePointMultiSite(spOptions, contextFactory)
    .WithCustomDocumentSource<T>(serviceProvider, serviceKey)
    // Optional
    .WithDeltaMode()
    .WithFileMetadataFields()
    .WithChunkSize(maxSize: 1000, overlap: 200)
    .WithBatchSize(50)
    .WithKeywordExtraction(gptDeployment, maxKeywords: 10)
    .WithAzureDocumentIntelligence(endpoint, apiKey) // OCR
    .WithCustomDocumentProcessor<T>(serviceProvider, serviceKey)
    .ContinueOnError(true)
    .Filter(meta => meta.Extension == ".pdf")
    .ConfigureMetadata(meta => meta with { CustomFields = … })
    .AddCustomField("Status", CustomFieldType.String, filterable: true)
    .AddIndexFieldsFromAttributes<MyModel>()
    .OnProgress(p => Console.WriteLine(p.Phase))
    .WithLogging()
    .Build();
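To give an intuition for what `.WithChunkSize(maxSize: 1000, overlap: 200)` might control, here is a sketch of a plain sliding-window splitter. It assumes sizes are measured in characters and that consecutive chunks share `overlap` characters; the library's actual chunker may count differently or respect sentence boundaries. `ChunkingSketch` is a hypothetical name, not part of the package.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkingSketch
{
    // Splits text into windows of at most maxSize characters,
    // where each window starts (maxSize - overlap) characters after the previous one.
    public static List<string> Split(string text, int maxSize, int overlap)
    {
        if (maxSize <= 0) throw new ArgumentOutOfRangeException(nameof(maxSize));
        if (overlap < 0 || overlap >= maxSize) throw new ArgumentOutOfRangeException(nameof(overlap));

        var chunks = new List<string>();
        int step = maxSize - overlap;

        for (int start = 0; start < text.Length; start += step)
        {
            int length = Math.Min(maxSize, text.Length - start);
            chunks.Add(text.Substring(start, length));

            // Stop once a window reaches the end, so no redundant tail chunk is produced.
            if (start + length >= text.Length) break;
        }
        return chunks;
    }
}
```

Under these assumptions, a 2,500-character document with `maxSize: 1000` and `overlap: 200` yields three chunks starting at offsets 0, 800, and 1600. The overlap means a sentence cut at one chunk boundary still appears whole in the neighboring chunk, at the cost of embedding some text twice.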
| Product | Versions (compatible and additional computed target frameworks) |
|---|---|
| .NET | net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
Dependencies (net10.0)
- Azure.AI.DocumentIntelligence (>= 1.0.0)
- Azure.AI.OpenAI (>= 2.8.0-beta.1)
- Azure.Identity (>= 1.18.0)
- Azure.Search.Documents (>= 11.8.0-beta.1)
- Azure.Storage.Blobs (>= 12.28.0-beta.1)
- D4S.Indexer.Application (>= 1.0.20)
- D4S.Indexer.Domain (>= 1.0.20)
- DocumentFormat.OpenXml (>= 3.4.1)
- Microsoft.Extensions.Logging.Abstractions (>= 10.0.3)
- Microsoft.Extensions.Logging.Console (>= 10.0.3)
- Microsoft.Extensions.Options (>= 10.0.3)
- Microsoft.SemanticKernel.Connectors.AzureOpenAI (>= 1.72.0)
- Microsoft.SemanticKernel.Core (>= 1.72.0)
- Microsoft.SemanticKernel.Plugins.Document (>= 1.70.0-alpha)
- PdfPig (>= 0.1.13)
- PnP.Core (>= 1.15.0)
- PnP.Core.Auth (>= 1.15.0)