Skip to content

Conversation

brendan-kellam
Copy link
Contributor

@brendan-kellam brendan-kellam commented Oct 15, 2025

image

...well this PR escalated quickly 👀

TL;DR: solved some repo indexing concurrency issues by moving from BullMQ to GroupMQ, refactored how we represent indexing jobs in the db, and generally deleted a ton of things that got in the way of web perf.

Where it started

We were getting bug reports (#523, #436) that repo indexing would fail at the clone or fetch stage with strange git errors. After some investigation on a test deployment of ~1k repos, we noticed that the root cause seemed to be that multiple indexing jobs could run on the same repository at the same time. When this happens, two workers will attempt to perform git operations on the same folder, resulting in the weird git behaviour we were seeing. For example:

  1. Worker 1 scheduled on repoid=1
  2. Worker 2 scheduled on repoid=1
  3. Worker 1 starts. Spawns git process to clone into /repos/1
  4. Worker 1 pre-empted by Node.JS runtime.
  5. Worker 2 starts. Spawns git process to clone into /repos/1
    <-- now two git processes are operating on the same repo. bad news -->
  6. Worker 2 receives error from git process that clone failed. Exists with job failed.
  7. Worker.1 pre-empted by Node.JS runtime.
  8. Worker 1 receives error from git process that clone failed. Exists with job failed.

Q: Why were two workers being scheduled for the same repository in the first place?
A: I'm still not certain, but here's my theory: The job scheduler uses the the database in order to schedule new indexing jobs. This happens in two operations: 1) fetch all repositories that need indexing, and 2) create BullMQ jobs for these repositories and flag them as indexing. We were not performing 1 and 2 in a transaction, so this was not an atomic operation. I noticed that failures would typically happen in our k8s cluster when a new replica was being spun up. Likely, we were getting interleavings between replica A and B something like A:1, B:1, A:2, B:2, resulting in duplicate jobs for the same set of repos.

Put another way, we were using the Postgres database as a distributed lock to enforce the "1 repo per worker" invariant. This felt error prone - what we really want is the invariant to be enforced at the queue level within Redis.

BullMQ has the concept of groups:

Groups allows you to use a single queue while distributing the jobs among groups so that the jobs are processed one by one relative to the group they belong to.

This sounds like the perfect solution: each repository can be in it's own group such that we are guaranteed that two jobs operating on the same repository must be processed one-by-one. Unfortunately, this feature is only available in the commercial version of BullMQ, and taking a dependency on a commercially licensed upstream dependency felt like a non-starter.

Enter groupmq - it's a brand new library (literally on v1.0.0) built by OpenPanel.dev. It supports grouping, has a similar api as BullMQ, and is MIT licensed. Was slightly hesitant to use such a new library, but their docs look good, so I decided to move to that.

repoIndexManager.ts contains the new & improved indexer. In addition to moving to GroupMQ, I've also moved to representing job status in a separate RepoJob table in s.t., we can maintain a 1:1 mapping between a job in Redis and how we represent it in the database. This PR deprecates the repoIndexingStatus field.

Where it went

After updating the indexer and moving to using the RepoJob table, I had to refactor how the progress indicators, repo carousel, and repo table work. This led into a bunch of chore work in web:

  • Removes long polling / client side fetching of repositories, in favour of fetching data in server components and having refresh buttons.
  • Updated progress indicator look. The progress indicator now will show progress the first time indexing of repositories.
image
  • Removes the warning and error nav indicators.
  • Optimizes the Ask mode search scope selector with virtualizing
  • Added separate search and ask navigation bar items
image
  • Removed the /connections view

Fixes #523
Fixes #436
Fixes #462
Fixes #530
Fixes #452

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced structured repository indexing job system with detailed status tracking (syncing, indexed, pending).
    • Added comprehensive search landing page with syntax guides and recommended repositories.
    • Enhanced chat interface with language model selection and error messaging for misconfiguration.
  • Improvements

    • Updated repository status indicators for clearer sync and indexing states.
    • Redesigned navigation menu with real-time repository indexing progress tracking.
    • Streamlined search experience with landing page and results view.
  • Bug Fixes

    • Fixed language model validation and error handling in chat submissions.
    • Improved error messaging when language models are not configured.

Copy link

coderabbitai bot commented Oct 15, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This PR performs a comprehensive refactor of the repository indexing system, UI architecture, and job orchestration. It replaces BullMQ-based RepoManager with a groupmq-based RepoIndexManager, restructures the database schema to use discrete job records, refactors Git operations to support cancellation, and significantly reorganizes web UI components including removal of connection management pages and restructuring of the homepage, search, and chat interfaces.

Changes

Cohort / File(s) Summary
Environment & Build Configuration
.env.development, package.json, packages/backend/package.json, packages/backend/vitest.config.ts, packages/web/src/env.mjs
Removed NEXT_PUBLIC_POLLING_INTERVAL_MS environment variable, replaced npm-run-all with concurrently for dev task orchestration, added concurrently, @bull-board/*, and groupmq dependencies, configured test environment variables.
Core Backend Job Orchestration
packages/backend/src/repoManager.ts (deleted), packages/backend/src/repoIndexManager.ts (new), packages/backend/src/index.ts
Replaced BullMQ-based RepoManager with groupmq-backed RepoIndexManager for asynchronous job orchestration; updated startup logic to conditionally enable permission syncing based on feature flags and entitlements; added 30-second disposal timeout with graceful shutdown.
Backend Infrastructure & Lifecycle
packages/backend/src/connectionManager.ts, packages/backend/src/promClient.ts, packages/backend/src/ee/repoPermissionSyncer.ts, packages/backend/src/ee/userPermissionSyncer.ts
Made dispose() methods asynchronous with proper cleanup sequencing for workers and queues; added HTTP server reference and closure logic to PromClient.
Git Operations & Utilities
packages/backend/src/git.ts, packages/backend/src/repoCompileUtils.ts, packages/backend/src/zoekt.ts
Unified Git client creation with centralized helper supporting AbortSignal for operation cancellation; updated function signatures to accept signal parameter and object-destructured arguments; replaced context-based path resolution with constants.
Backend Constants & Configuration
packages/backend/src/constants.ts, packages/backend/src/env.ts, packages/backend/src/types.ts, packages/backend/src/utils.ts
Added REPOS_CACHE_DIR and INDEX_CACHE_DIR constants; added DEBUG_ENABLE_GROUPMQ_LOGGING environment variable; removed AppContext type; introduced groupmqLifecycleExceptionWrapper for centralized error handling; updated getRepoPath signature to remove context dependency.
Database Schema & Migrations
packages/db/prisma/schema.prisma, packages/db/prisma/migrations/20251018212113_add_repo_indexing_job_table/migration.sql
Replaced RepoIndexingStatus enum with structured RepoIndexingJob model containing discrete job records with RepoIndexingJobType (INDEX, CLEANUP) and RepoIndexingJobStatus enums; added one-to-many relation on Repo; preserved indexedAt field.
Logger & MCP
packages/logger/src/index.ts, packages/mcp/src/schemas.ts
Exported Logger type from logger module; removed RepoIndexingStatus enum and repoIndexingStatus field from MCP schemas.
Web Actions & API
packages/web/src/actions.ts, packages/web/src/app/api/(server)/chat/route.ts, packages/web/src/initialize.ts
Removed connection management actions (createConnection, getConnections, updateConnectionDisplayName, etc.), simplified getRepos to object parameters, added getReposStats function, updated language model handling to use objects instead of IDs, removed failed repo status reset logic.
Web Constants & Utilities
packages/web/src/lib/constants.ts, packages/web/src/lib/schemas.ts, packages/web/src/lib/utils.ts, packages/web/src/prisma.ts
Removed SEARCH_MODE_COOKIE_NAME, removed repoIndexingStatus from repository schema, added getShortenedNumberDisplayString utility, fixed closing brace syntax.
Browse Navigation Refactoring
packages/web/src/app/[domain]/browse/hooks/useBrowseNavigation.ts, packages/web/src/app/[domain]/browse/hooks/useBrowsePath.ts, packages/web/src/app/[domain]/browse/hooks/utils.ts, packages/web/src/app/[domain]/components/pathHeader.tsx, packages/web/src/app/[domain]/components/pureTreePreviewPanel.tsx, packages/web/src/ee/features/codeNav/components/exploreMenu/referenceList.tsx, packages/web/src/features/fileTree/components/pureFileTreePanel.tsx
Extracted getBrowsePath utility from navigation hook to centralized utils module; updated all import paths accordingly.
Chat & Search UI Infrastructure
packages/web/src/features/chat/types.ts, packages/web/src/features/chat/utils.ts, packages/web/src/features/chat/useSelectedLanguageModel.ts, packages/web/src/app/api/(server)/chat/route.ts
Added languageModelInfoSchema with optional displayName; replaced languageModelId with languageModel object in request params; added getLanguageModelKey utility; updated useSelectedLanguageModel to require languageModels array with protective synchronization.
Chat Components
packages/web/src/features/chat/components/chatBox/chatBox.tsx, packages/web/src/features/chat/components/chatBox/chatBoxToolbar.tsx, packages/web/src/features/chat/components/chatBox/atMentionButton.tsx, packages/web/src/features/chat/components/chatBox/languageModelSelector.tsx, packages/web/src/features/chat/components/chatBox/searchScopeSelector.tsx, packages/web/src/features/chat/components/chatBox/languageModelInfoCard.tsx, packages/web/src/features/chat/components/chatThread/chatThread.tsx, packages/web/src/features/chat/components/notConfiguredErrorBanner.tsx
Added isDisabled prop to ChatBox, introduced AtMentionButton and LanguageModelInfoCard components, enhanced LanguageModelSelector and SearchScopeSelector with tooltips and keyboard navigation, added NotConfiguredErrorBanner for missing models, updated language model initialization logic.
Chat & Homepage Restructuring
packages/web/src/app/[domain]/chat/page.tsx, packages/web/src/app/[domain]/chat/layout.tsx, packages/web/src/app/[domain]/chat/components/landingPageChatBox.tsx, packages/web/src/app/[domain]/chat/components/demoCards.tsx, packages/web/src/app/[domain]/chat/components/tutorialDialog.tsx, packages/web/src/app/[domain]/chat/components/newChatPanel.tsx (deleted)
Restructured chat page to render dynamic LandingPageChatBox, RepositoryCarousel, and DemoCards; renamed AgenticSearch to LandingPageChatBox with reduced props; renamed DemoCards component; converted TutorialDialog API from onClose callback to isOpen boolean prop; removed NewChatPanel component; added server-side cookie-based tutorial dismissal.
Search Pages
packages/web/src/app/[domain]/search/page.tsx, packages/web/src/app/[domain]/search/components/searchLandingPage.tsx, packages/web/src/app/[domain]/search/components/searchResultsPage.tsx, packages/web/src/app/[domain]/search/components/searchResultsPanel/fileMatch.tsx
Converted SearchPage to async server component with conditional rendering of SearchLandingPage or SearchResultsPage; added new SearchLandingPage with repository carousel and search guidance; added SearchResultsPage with filter panel, results panel, and code preview with keyboard shortcuts.
Repositories Page
packages/web/src/app/[domain]/repos/page.tsx, packages/web/src/app/[domain]/repos/repositoryTable.tsx, packages/web/src/app/[domain]/repos/columns.tsx, packages/web/src/app/[domain]/repos/layout.tsx
Replaced RepoIndexingStatus with simplified RepoStatus (syncing/indexed/not-indexed); updated RepositoryTable to accept repos array and domain as props; added getReposWithJobs helper; changed column header from "Last Indexed" to "Last Synced".
Navigation Menu Restructuring
packages/web/src/app/[domain]/components/navigationMenu.tsx (deleted), packages/web/src/app/[domain]/components/navigationMenu/index.tsx (new), packages/web/src/app/[domain]/components/navigationMenu/navigationItems.tsx (new), packages/web/src/app/[domain]/components/navigationMenu/progressIndicator.tsx (new), packages/web/src/app/[domain]/components/navigationMenu/trialIndicator.tsx
Refactored monolithic NavigationMenu into modular structure; added navigationItems component with repo count badge; added new progressIndicator with live indexing progress; removed polling-based error/warning indicators; fetches repo stats server-side.
Indicators & Removed Components
packages/web/src/app/[domain]/components/progressNavIndicator.tsx (deleted), packages/web/src/app/[domain]/components/warningNavIndicator.tsx (deleted), packages/web/src/app/[domain]/components/errorNavIndicator.tsx (deleted), packages/web/src/app/[domain]/components/repositoryCarousel.tsx (new), packages/web/src/app/[domain]/components/topBar.tsx
Removed polling-based indicator components; added new server-side RepositoryCarousel component; added optional homePath prop to TopBar.
Homepage Removal
packages/web/src/app/[domain]/components/homepage/index.tsx (deleted), packages/web/src/app/[domain]/components/homepage/preciseSearch.tsx (deleted), packages/web/src/app/[domain]/components/homepage/repositorySnapshot.tsx (deleted), packages/web/src/app/[domain]/components/homepage/repositoryCarousel.tsx (deleted), packages/web/src/app/[domain]/page.tsx
Removed entire homepage component infrastructure; replaced with SearchPage rendering in main page.
Connections Management Removal
packages/web/src/app/[domain]/connections/
Removed entire connection management page and components (overview, repo list, settings forms), associated UI indicators, and quick actions configuration.
Layout & Styling
packages/web/src/app/[domain]/layout.tsx, packages/web/src/app/globals.css, packages/web/tailwind.config.ts, packages/web/src/components/ui/data-table.tsx, packages/web/src/components/ui/navigation-menu.tsx, packages/web/src/app/[domain]/components/searchModeSelector.tsx
Added UpgradeToast component to layout, added --error CSS variables, added error sidebar color and ping-slow animation to Tailwind config, adjusted data-table and navigation-menu spacing, removed search mode cookie persistence and related UI warnings.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Web as Web Frontend
    participant SearchPage as Search Page
    participant LandingPage as SearchLandingPage
    participant ResultsPage as SearchResultsPage
    participant Backend as Backend API
    
    User->>Web: Navigate to /search
    Web->>SearchPage: Render with domain & query params
    
    alt Query is empty
        SearchPage->>LandingPage: Render landing with repos & stats
        LandingPage->>Backend: Fetch recommended repos & stats
        Backend-->>LandingPage: Return repos with indexedAt
        LandingPage->>User: Show carousel & search guide
    else Query provided
        SearchPage->>ResultsPage: Render with query
        ResultsPage->>Backend: Execute search (React Query)
        Backend-->>ResultsPage: Return results with duration/stats
        ResultsPage->>User: Show results panel + filters + preview
        User->>ResultsPage: Toggle filter panel (mod+b)
        User->>ResultsPage: Select file/match for preview
        ResultsPage->>User: Update code preview panel
    end
Loading
sequenceDiagram
    participant Backend as Backend
    participant RepoIndexManager as RepoIndexManager
    participant GroupMQ as GroupMQ Queue
    participant Worker as Job Worker
    participant Git as Git Operations
    
    Backend->>RepoIndexManager: new RepoIndexManager(db, settings, redis)
    Backend->>RepoIndexManager: startScheduler()
    
    RepoIndexManager->>RepoIndexManager: scheduleIndexJobs() (periodic)
    RepoIndexManager->>GroupMQ: createJobs({type: INDEX, repoId, ...})
    
    RepoIndexManager->>Worker: Start worker with concurrency
    
    loop Job Processing
        GroupMQ-->>Worker: Dequeue job
        Worker->>Git: cloneRepository(signal) or fetchRepository(signal)
        Git-->>Worker: Repo ready
        Worker->>Git: upsertGitConfig({path, gitConfig, signal})
        Worker->>Git: indexGitRepository(signal)
        Worker->>RepoIndexManager: Job complete
        RepoIndexManager->>RepoIndexManager: onJobCompleted (update indexedAt)
    end
    
    Note over Worker,Git: AbortSignal enables cancellation throughout
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ 90+ minutes

This PR involves substantial, intricate changes spanning multiple architectural layers: replacement of the entire job orchestration system (BullMQ → groupmq), database schema migration with breaking changes, comprehensive UI restructuring (homepage removal, search/chat rewrite, connections page removal), and backend git operation refactoring for cancellation support. The heterogeneous nature of changes—touching nearly every layer of both backend and frontend—and the high logic density in new systems like RepoIndexManager demand separate reasoning for each major subsystem.

Possibly related PRs

Suggested labels

sourcebot-team

Poem

🐰 A queue of jobs hopped into groupmq's den,
While search and chat danced their UI dance again.
RepoIndexManager takes the helm so bright,
Bare repos and configs—now handled just right!
Signals abort old tasks with graceful goodbye, 🌸

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (4 passed)
Check name Status Explanation
Title Check ✅ Passed The PR title "chore(worker,web): Repo indexing stability improvements + perf improvements to web" accurately reflects the main changes in the changeset. The title captures the two primary objectives: (1) stability improvements to repo indexing via the GroupMQ migration, RepoIndexingJob table, and async cleanup methods, and (2) performance improvements to the web package through removed polling, server-side data fetching, component refactoring, and virtualization. The title is concise, clear, and specific enough for a teammate reviewing history to understand the primary focus without unnecessary detail or vague language.
Linked Issues Check ✅ Passed The code changes address the primary coding requirements from all linked issues: Issue #523 is addressed through git operation refactoring and better error handling in the repoIndexManager; Issue #436 is addressed via the new RepoIndexingJob table and cleanup/recovery logic; Issue #462 is directly solved by replacing BullMQ with GroupMQ to prevent concurrent jobs on the same repo and persisting job state in the database; Issue #452 is addressed through improved cookie handling with maxAge for the tutorial dismissal; and Issue #530 is addressed by domain-aware routing in SearchModeSelector that allows the search/ask preference to persist through URL navigation. All core coding objectives from the linked issues have been implemented in the changeset.
Out of Scope Changes Check ✅ Passed The changes in this PR are aligned with the stated objectives from the PR description and linked issues. The significant removals (entire /connections view, various nav indicators, polling infrastructure) are explicitly mentioned in the PR objectives as part of the web refactoring. The BullMQ-to-GroupMQ migration, RepoIndexingJob table, async cleanup, git operation refactoring, and web performance improvements (virtualization, server components, removed polling) all directly support the core objectives of improving indexing stability and web performance. Changes to environment variables, CSS, Tailwind config, and test infrastructure are supporting changes for these core objectives and are not out of scope.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@blacksmith-sh blacksmith-sh bot deleted a comment from brendan-kellam Oct 16, 2025
@brendan-kellam brendan-kellam changed the title [wip] chore(worker): Repo indexing stability improvements chore(worker,web): Repo indexing stability improvements + perf improvements to web Oct 17, 2025
@brendan-kellam brendan-kellam marked this pull request as ready for review October 18, 2025 01:22

This comment has been minimized.

@brendan-kellam
Copy link
Contributor Author

@coderabbitai review

Copy link

coderabbitai bot commented Oct 18, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 24

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (12)
packages/web/src/app/[domain]/components/searchModeSelector.tsx (1)

18-18: Fix typo in TODO comment.

The comment contains @tood instead of @todo.

Apply this diff:

-// @tood: point this to the actual docs page
+// @todo: point this to the actual docs page
packages/web/src/features/chat/components/chatThread/chatThread.tsx (1)

131-136: Pass both provider and model identifiers to ensure correct model matching.

The code passes only selectedLanguageModel.model to generateAndUpdateChatNameFromMessage, but the lookup in actions.ts:197 searches all configured models using only the model field. If different providers offer models with identical names, this could match the wrong provider's configuration.

Since LanguageModelInfo maintains provider and model as separate fields (not a combined identifier), the lookup should verify both to ensure uniqueness:

  • chatThread.tsx line 134: Pass { provider: selectedLanguageModel.provider, model: selectedLanguageModel.model } or use a combined identifier
  • actions.ts line 197: Match using both provider and model: .find((model) => model.provider === providerId && model.model === modelId)
packages/web/src/components/ui/navigation-menu.tsx (1)

1-6: Mark this shadcn/Radix UI wrapper as a Client Component.

Radix primitives require client execution. Without "use client", importing this from a Server Component will error at build/runtime.

Apply:

+ "use client";
+
 import * as React from "react"
 import * as NavigationMenuPrimitive from "@radix-ui/react-navigation-menu"
packages/web/src/app/[domain]/layout.tsx (1)

27-35: Fix Next.js App Router layout params typing and remove unnecessary await.

params is not a Promise. Current typing/await is misleading and can confuse tooling.

-interface LayoutProps {
-  children: React.ReactNode,
-  params: Promise<{ domain: string }>
-}
+interface LayoutProps {
+  children: React.ReactNode;
+  params: { domain: string };
+}
 ...
-export default async function Layout(props: LayoutProps) {
-  const params = await props.params;
+export default async function Layout(props: LayoutProps) {
+  const params = props.params;
packages/web/src/initialize.ts (1)

129-139: Prisma: findUnique with extra filter is invalid and will not compile/run.

findUnique only accepts a unique selector; adding role filter makes it non-unique. Use findFirst with combined conditions, or fetch by unique key then check role.

Apply one of these fixes:

-  const guestUser = await prisma.userToOrg.findUnique({
-      where: {
-          orgId_userId: {
-              orgId: SINGLE_TENANT_ORG_ID,
-              userId: SOURCEBOT_GUEST_USER_ID,
-          },
-          role: {
-              not: OrgRole.GUEST,
-          }
-      },
-  });
+  const guestUser = await prisma.userToOrg.findFirst({
+    where: {
+      orgId: SINGLE_TENANT_ORG_ID,
+      userId: SOURCEBOT_GUEST_USER_ID,
+      role: { not: OrgRole.GUEST },
+    },
+  });

Alternatively:

-  const guestUser = await prisma.userToOrg.findUnique({ where: { orgId_userId: { orgId: SINGLE_TENANT_ORG_ID, userId: SOURCEBOT_GUEST_USER_ID }, role: { not: OrgRole.GUEST } }});
+  const edge = await prisma.userToOrg.findUnique({ where: { orgId_userId: { orgId: SINGLE_TENANT_ORG_ID, userId: SOURCEBOT_GUEST_USER_ID } }});
+  const guestUser = edge && edge.role !== OrgRole.GUEST ? edge : null;
packages/web/src/app/[domain]/chat/components/tutorialDialog.tsx (1)

139-141: Typo: “Reposet”.

Consider “Repo set” or “Repository set” in the UI copy.

packages/web/src/actions.ts (2)

537-564: Auth bypass: user-supplied where can override orgId.

Spreading ...where after orgId lets callers query other orgs. Force orgId via AND.

-export const getRepos = async ({
-    where,
-    take,
-}: {
-    where?: Prisma.RepoWhereInput,
-    take?: number
-} = {}) => sew(() =>
+export const getRepos = async ({
+    where,
+    take,
+}: {
+    where?: Prisma.RepoWhereInput,
+    take?: number
+} = {}) => sew(() =>
   withOptionalAuthV2(async ({ org, prisma }) => {
-        const repos = await prisma.repo.findMany({
-            where: {
-                orgId: org.id,
-                ...where,
-            },
-            take,
-        });
+        const whereClause: Prisma.RepoWhereInput =
+          where ? { AND: [ where, { orgId: org.id } ] } : { orgId: org.id };
+        const MAX = 200;
+        const repos = await prisma.repo.findMany({
+            where: whereClause,
+            take: take ? Math.min(Math.max(take, 1), MAX) : undefined,
+            orderBy: { id: 'desc' },
+        });
         return repos.map((repo) => repositoryQuerySchema.parse({
           /* unchanged */
         }))
   }));

1731-1766: Prevent token/header leakage on image proxy (SSRF guard).

Authorization headers are added before validating the image URL’s origin. A malicious imageUrl could exfiltrate tokens. Only attach auth when the image host matches the expected provider host; otherwise send no credentials.

-        const authHeaders: Record<string, string> = {};
-        for (const { connection } of repo.connections) {
+        const authHeaders: Record<string, string> = {};
+        const imageUrlObj = new URL(repo.imageUrl);
+        for (const { connection } of repo.connections) {
             try {
                 if (connection.connectionType === 'github') {
-                    const config = connection.config as unknown as GithubConnectionConfig;
-                    if (config.token) {
-                        const token = await getTokenFromConfig(config.token, connection.orgId, prisma);
-                        authHeaders['Authorization'] = `token ${token}`;
-                        break;
-                    }
+                    // GitHub avatars are public; do NOT forward tokens to third-party origins.
+                    // Intentionally no auth header here.
                 } else if (connection.connectionType === 'gitlab') {
                     const config = connection.config as unknown as GitlabConnectionConfig;
                     if (config.token) {
                         const token = await getTokenFromConfig(config.token, connection.orgId, prisma);
-                        authHeaders['PRIVATE-TOKEN'] = token;
-                        break;
+                        const expectedHost = new URL(config.url ?? 'https://gitlab.com/').hostname;
+                        if (imageUrlObj.hostname === expectedHost) {
+                            authHeaders['PRIVATE-TOKEN'] = token;
+                        }
+                        break;
                     }
                 } else if (connection.connectionType === 'gitea') {
-                    const config = connection.config as unknown as GiteaConnectionConfig;
-                    if (config.token) {
-                        const token = await getTokenFromConfig(config.token, connection.orgId, prisma);
-                        authHeaders['Authorization'] = `token ${token}`;
-                        break;
-                    }
+                    // Gitea avatars are typically public; avoid forwarding tokens.
                 }
             } catch (error) {
                 logger.warn(`Failed to get token for connection ${connection.id}:`, error);
             }
         }

Optionally add a short timeout to fetch using AbortController to avoid hanging requests.

packages/web/src/app/[domain]/components/pathHeader.tsx (1)

258-274: Fix nested interactive elements in dropdown (Link inside DropdownMenuItem)

Current structure nests an interactive DropdownMenuItem inside an anchor, which harms a11y and can impede keyboard selection. Use Radix’s asChild to make Link the actionable element.

Apply this diff:

-                                    {hiddenSegments.map((segment) => (
-                                        <Link
-                                            href={getBrowsePath({
-                                                repoName: repo.name,
-                                                path: segment.fullPath,
-                                                pathType: segment.isLastSegment ? pathType : 'tree',
-                                                revisionName: branchDisplayName,
-                                                domain,
-                                            })}
-                                            className="font-mono text-sm hover:cursor cursor-pointer"
-                                            key={segment.fullPath}
-                                        >
-                                            <DropdownMenuItem className="hover:cursor cursor-pointer">
-                                                {renderSegmentWithHighlight(segment)}
-                                            </DropdownMenuItem>
-                                        </Link>
-                                    ))}
+                                    {hiddenSegments.map((segment) => (
+                                        <DropdownMenuItem key={segment.fullPath} asChild>
+                                            <Link
+                                                href={getBrowsePath({
+                                                    repoName: repo.name,
+                                                    path: segment.fullPath,
+                                                    pathType: segment.isLastSegment ? pathType : 'tree',
+                                                    revisionName: branchDisplayName,
+                                                    domain,
+                                                })}
+                                                className="font-mono text-sm"
+                                            >
+                                                {renderSegmentWithHighlight(segment)}
+                                            </Link>
+                                        </DropdownMenuItem>
+                                    ))}
packages/backend/src/git.ts (3)

61-78: Clone CWD/destination mismatch: this will fail or clone into a nested path.

You set cwd to path and then call git.clone(..., path). That attempts to clone inside the target directory or fails if it exists. For a bare clone with cwd=path, clone into "." and ensure the directory is empty.

Apply these diffs:

- import { mkdir } from 'node:fs/promises';
+ import { mkdir, readdir } from 'node:fs/promises';
-    await mkdir(path, { recursive: true });
-
-    const git = createGitClientForPath(path, onProgress, signal);
+    await mkdir(path, { recursive: true });
+    const contents = await readdir(path);
+    if (contents.length > 0) {
+        throw new Error(`Clone destination exists and is not empty: ${path}`);
+    }
+    const git = createGitClientForPath(path, onProgress, signal);
-    await git.clone(cloneUrl, path, cloneArgs);
+    // Clone into the current (empty) directory
+    await git.clone(cloneUrl, ".", cloneArgs);

108-139: Cleanup can override the primary error and misses env/signal; use helper and unset-all.

  • finally’s git.raw can throw and mask the fetch error.
  • Use the same helper (ceiling + signal), scope to local, and remove all extraHeader entries.

Apply this diff:

-  } finally {
-      if (authHeader) {
-          const git = simpleGit({
-              progress: onProgress,
-          }).cwd({
-              path: path,
-          })
-
-          await git.raw(["config", "--unset", "http.extraHeader", authHeader]);
-      }
-  }
+  } finally {
+      if (authHeader) {
+          try {
+              const git = createGitClientForPath(path, onProgress, signal);
+              await git.raw(["config", "--local", "--unset-all", "http.extraHeader"]);
+          } catch {
+              // best-effort cleanup; do not override the original error
+          }
+      }
+  }

192-203: Config scope bug; prefer --local and --unset-all, and don’t rely on listConfig(all).

listConfig aggregates all scopes; you may try to unset keys that only exist globally and trigger errors. Unset locally and ignore “not set” errors. Also handle multi-valued keys.

Apply this diff:

-    try {
-        const configList = await git.listConfig();
-        const setKeys = Object.keys(configList.all);
-
-        for (const key of keys) {
-            if (setKeys.includes(key)) {
-                await git.raw(['config', '--unset', key]);
-            }
-        }
-    } catch (error: unknown) {
+    try {
+        for (const key of keys) {
+            try {
+                await git.raw(['config', '--local', '--unset-all', key]);
+            } catch {
+                // ignore if not set locally
+            }
+        }
+    } catch (error: unknown) {
🧹 Nitpick comments (51)
packages/web/src/lib/utils.ts (2)

379-381: Enhance JSDoc with examples and edge case documentation.

The comment "Converts a number to a string" is minimal. Consider documenting the formatting behavior and providing examples.

Apply this diff to improve the documentation:

 /**
- * Converts a number to a string
+ * Converts a number to a shortened display string with SI suffixes.
+ * - Numbers < 1,000 are returned as-is (e.g., 42 → "42")
+ * - Numbers < 1,000,000 are displayed in thousands (e.g., 1,500 → "1.5k")
+ * - Numbers ≥ 1,000,000 are displayed in millions (e.g., 2,300,000 → "2.3m")
+ * 
+ * @param number The number to format
+ * @returns The formatted string with appropriate suffix
  */

382-390: Consider adding input validation for edge cases.

The function doesn't validate for NaN, Infinity, or negative numbers. While unlikely in the context of displaying counts/progress, defensive validation would prevent unexpected output if corrupted data is passed.

Apply this diff to add validation:

 export const getShortenedNumberDisplayString = (number: number) => {
+    if (!Number.isFinite(number)) {
+        return '0';
+    }
+    if (number < 0) {
+        return '0';
+    }
     if (number < 1000) {
         return number.toString();
     } else if (number < 1000000) {
         return `${(number / 1000).toFixed(1)}k`;
     } else {
         return `${(number / 1000000).toFixed(1)}m`;
     }
 }
packages/web/src/app/[domain]/components/searchModeSelector.tsx (1)

31-31: Consider syncing focusedSearchMode with the searchMode prop.

The focusedSearchMode state is initialized from the searchMode prop but won't update if the parent changes the prop value. This could cause the tooltip to display for the wrong mode.

Add a useEffect to keep them in sync:

 const [focusedSearchMode, setFocusedSearchMode] = useState<SearchMode>(searchMode);
+
+useEffect(() => {
+    setFocusedSearchMode(searchMode);
+}, [searchMode]);

Don't forget to import useEffect:

-import { useCallback, useState } from "react";
+import { useCallback, useState, useEffect } from "react";
packages/web/src/app/[domain]/search/components/searchLandingPage.tsx (1)

20-31: Parallelize data fetches to cut TTFB.
Currently sequential; use Promise.all.

-    const carouselRepos = await getRepos({
+    const [carouselRepos, repoStats] = await Promise.all([
+      getRepos({
         where: {
           indexedAt: { not: null },
         },
         take: 10,
-    });
-
-    const repoStats = await getReposStats();
+      }),
+      getReposStats(),
+    ]);
packages/web/src/app/[domain]/search/components/searchResultsPage.tsx (6)

323-329: Use a button for “load more” (accessibility + semantics).
Divs aren’t keyboard/AT friendly.

-        <div
-            className="cursor-pointer text-blue-500 text-sm hover:underline ml-4"
-            onClick={onLoadMoreResults}
-        >
-            (load more)
-        </div>
+        <button
+          type="button"
+          className="text-blue-500 text-sm hover:underline ml-4"
+          onClick={onLoadMoreResults}
+          aria-label="Load more results"
+        >
+          (load more)
+        </button>

303-308: Handle clipboard write asynchronously and robustly.
Avoid unhandled promise; return success/failure.

-  <CopyIconButton
-    onCopy={() => {
-      navigator.clipboard.writeText(JSON.stringify(searchStats, null, 2));
-      return true;
-    }}
-    className="ml-auto"
-  />
+  <CopyIconButton
+    onCopy={async () => {
+      try {
+        await navigator.clipboard.writeText(JSON.stringify(searchStats ?? null, null, 2));
+        return true;
+      } catch {
+        return false;
+      }
+    }}
+    className="ml-auto"
+  />

213-215: Scope filter-panel state per domain.
Current localStorage key is global; include domain to avoid cross-domain bleed.

Option A (simpler): call useDomain inside PanelGroup.

 const PanelGroup = ({
   ...
 }: PanelGroupProps) => {
+  const domain = useDomain();
   ...
-  const [isFilterPanelCollapsed, setIsFilterPanelCollapsed] = useLocalStorage('isFilterPanelCollapsed', false);
+  const [isFilterPanelCollapsed, setIsFilterPanelCollapsed] =
+    useLocalStorage(`isFilterPanelCollapsed:${domain}`, false);

Option B: pass domain from SearchResultsPage to PanelGroup as a prop.


266-275: Add accessible label to the filter toggle button.
Improves screen reader UX.

-  <Button
+  <Button
     variant="ghost"
     size="icon"
     className="h-8 w-8"
+    aria-label="Open filter panel"
     onClick={() => {
       filterPanelRef.current?.expand();
     }}
   >

160-165: Spinner: add role and live region.
Assistive tech will announce progress.

-  <div className="flex flex-col items-center justify-center h-full gap-2">
+  <div className="flex flex-col items-center justify-center h-full gap-2" role="status" aria-live="polite">

74-78: Optional: keep previous data to reduce UI jank on refetch.

-  refetchOnWindowFocus: false,
-  retry: false,
-  staleTime: 0,
+  refetchOnWindowFocus: false,
+  retry: false,
+  staleTime: 0,
+  keepPreviousData: true,
packages/web/src/app/[domain]/components/repositoryCarousel.tsx (5)

80-87: Use stable keys and remove redundant child key.

Index keys cause reconciliation issues; RepositoryBadge doesn’t need its own key.

-                    {displayRepos.map((repo, index) => (
-                        <CarouselItem key={index} className="basis-auto">
-                            <RepositoryBadge
-                                key={index}
-                                repo={repo}
-                            />
-                        </CarouselItem>
-                    ))}
+                    {displayRepos.map((repo) => (
+                        <CarouselItem key={repo.repoName} className="basis-auto">
+                            <RepositoryBadge repo={repo} />
+                        </CarouselItem>
+                    ))}

40-45: External docs link: prefer <a target="_blank" rel="noopener noreferrer"> for consistency and safety.

Elsewhere you already do this for docs. Align here too.

-                                <Link href={`https://docs.sourcebot.dev/docs/connections/overview`} className="text-blue-500 hover:underline inline-flex items-center gap-1">
-                                    connection
-                                </Link>{" "}
+                                <a
+                                  href="https://docs.sourcebot.dev/docs/connections/overview"
+                                  target="_blank"
+                                  rel="noopener noreferrer"
+                                  className="text-primary hover:underline inline-flex items-center gap-1"
+                                >
+                                  connection
+                                </a>{" "}

150-155: Prevent long repo names from overflowing.

Add truncation and hidden overflow so badges don’t blow up layout.

-            className={clsx("flex flex-row items-center gap-2 border rounded-md p-2 text-clip")}
+            className={clsx("flex flex-row items-center gap-2 border rounded-md p-2 overflow-hidden")}
         >
             {repoIcon}
-            <span className="text-sm font-mono">
+            <span className="text-sm font-mono truncate">
                 {displayName}
             </span>

64-78: Consider pausing auto-scroll on focus for keyboard users.

Add stopOnFocusIn: true so the carousel doesn’t move while tabbing/interacting; keep stopOnMouseEnter as-is. Based on learnings.

                     Autoscroll({
                         startDelay: 0,
                         speed: 1,
                         stopOnMouseEnter: true,
                         stopOnInteraction: false,
+                        stopOnFocusIn: true,
                     }),

141-156: Perf: consider disabling prefetch on many repo links.

If displayRepos is large, Next.js may prefetch many pages. You can opt out per-link.

-        <Link
+        <Link
+            prefetch={false}
             href={getBrowsePath({
packages/web/src/app/[domain]/chat/components/demoCards.tsx (1)

14-20: Consider renaming the interface to follow React conventions.

The interface DemoCards has the same name as the component, which diverges from the typical React/TypeScript pattern of naming props interfaces ComponentNameProps.

Apply this diff to align with React conventions:

-interface DemoCards {
+interface DemoCardsProps {
     demoExamples: DemoExamples;
 }
 
 export const DemoCards = ({
     demoExamples,
-}: DemoCards) => {
+}: DemoCardsProps) => {
packages/web/src/features/chat/components/chatBox/chatBox.tsx (2)

294-301: Disabled styles won’t apply to Editable; add aria-disabled + conditional classes.

Tailwind’s disabled: variants won’t trigger here because Editable doesn’t get a disabled attribute. Use aria-disabled and explicit class toggles.

-            <Editable
-                className="w-full focus-visible:outline-none focus-visible:ring-0 bg-background text-base disabled:cursor-not-allowed disabled:opacity-50 md:text-sm"
+            <Editable
+                aria-disabled={isDisabled}
+                className={cn(
+                  "w-full focus-visible:outline-none focus-visible:ring-0 bg-background text-base md:text-sm",
+                  isDisabled && "cursor-not-allowed opacity-50"
+                )}
                 placeholder="Ask a question about your code. @mention files or select search scopes to refine your query."
                 renderElement={renderElement}
                 renderLeaf={renderLeaf}
                 onKeyDown={onKeyDown}
                 readOnly={isDisabled}
             />

83-86: Guard focus and key handlers when disabled to avoid confusing UX.

When isDisabled, prevent focusing via "/" and short‑circuit onKeyDown.

-    useHotkeys("/", (e) => {
-        e.preventDefault();
-        ReactEditor.focus(editor);
-    });
+    useHotkeys(
+      "/",
+      (e) => {
+        if (isDisabled) return;
+        e.preventDefault();
+        ReactEditor.focus(editor);
+      },
+      [editor, isDisabled]
+    );
-    const onKeyDown = useCallback((event: KeyboardEvent<HTMLDivElement>) => {
+    const onKeyDown = useCallback((event: KeyboardEvent<HTMLDivElement>) => {
+        if (isDisabled) return;
         if (suggestionMode === "none") {
           ...
-    }, [suggestionMode, suggestions, onSubmit, index, onInsertSuggestion]);
+    }, [suggestionMode, suggestions, onSubmit, index, onInsertSuggestion, isDisabled]);

Also applies to: 209-251, 37-49, 44-49

packages/web/src/features/chat/types.ts (1)

174-179: Tighten schema to reduce drift with server model.

If you have canonical provider/model enums or a server Zod schema, import/derive from that instead of z.string(). Otherwise, at least enforce non‑empty strings.

-export const languageModelInfoSchema = z.object({
-    provider: z.string(),
-    model: z.string(),
-    displayName: z.string().optional(),
-});
+export const languageModelInfoSchema = z.object({
+  provider: z.string().min(1),
+  model: z.string().min(1),
+  displayName: z.string().optional(),
+});
packages/web/src/app/[domain]/chat/page.tsx (1)

24-38: Fetch independent data in parallel to reduce TTFB.

getConfiguredLanguageModelsInfo, getSearchContexts, getRepos, carousel getRepos, and getReposStats don’t depend on each other; run them concurrently.

-    const languageModels = await getConfiguredLanguageModelsInfo();
-    const searchContexts = await getSearchContexts(params.domain);
-    const allRepos = await getRepos();
-    const carouselRepos = await getRepos({
-        where: { indexedAt: { not: null } },
-        take: 10,
-    });
-    const repoStats = await getReposStats();
+    const [
+      languageModels,
+      searchContexts,
+      allRepos,
+      carouselRepos,
+      repoStats,
+    ] = await Promise.all([
+      getConfiguredLanguageModelsInfo(),
+      getSearchContexts(params.domain),
+      getRepos(),
+      getRepos({ where: { indexedAt: { not: null } }, take: 10 }),
+      getReposStats(),
+    ]);

Also applies to: 55-63

packages/web/src/features/chat/components/chatBox/languageModelSelector.tsx (2)

64-67: De-duplication is O(n²); switch to a Map for linear-time.

Improves perf for long model lists without changing behavior.

Apply:

-    const languageModels = useMemo(() => {
-        return _languageModels.filter((model, selfIndex, selfArray) =>
-            selfIndex === selfArray.findIndex((t) => getLanguageModelKey(t) === getLanguageModelKey(model))
-        );
-    }, [_languageModels]);
+    const languageModels = useMemo(() => {
+        const seen = new Map<string, LanguageModelInfo>();
+        for (const m of _languageModels) {
+            const k = getLanguageModelKey(m);
+            if (!seen.has(k)) seen.set(k, m);
+        }
+        return Array.from(seen.values());
+    }, [_languageModels]);

74-105: Popover/Tooltip trigger stacking and a11y.

  • Add aria-expanded/aria-controls on the Button for screen readers.
  • Hide Tooltip while Popover is open to avoid dual overlays.

Apply:

-                        <Button
-                            onClick={handleTogglePopover}
+                        <Button
+                            aria-haspopup="listbox"
+                            aria-expanded={isPopoverOpen}
+                            aria-controls="model-popover"
+                            onClick={handleTogglePopover}
                             className={cn(
                                 "flex p-1 rounded-md items-center justify-between bg-inherit h-6",
                                 className
                             )}
                         >
-                <TooltipContent side="bottom" className="p-0 border-0 bg-transparent shadow-none">
+                {/* Avoid tooltip when popover is open */}
+                {!isPopoverOpen && (
+                  <TooltipContent side="bottom" className="p-0 border-0 bg-transparent shadow-none">
                     <LanguageModelInfoCard />
-                </TooltipContent>
+                  </TooltipContent>
+                )}
-                <PopoverContent
+                <PopoverContent
+                    id="model-popover"
                     className="w-auto p-0"
                     align="start"
                     onEscapeKeyDown={() => setIsPopoverOpen(false)}
                 >

Also applies to: 109-157

packages/web/src/app/[domain]/chat/components/landingPageChatBox.tsx (1)

14-18: Rename props interface to avoid shadowing the component name.

Improves readability and IDE hints.

Apply:

-interface LandingPageChatBox {
+interface LandingPageChatBoxProps {
     languageModels: LanguageModelInfo[];
     repos: RepositoryQuery[];
     searchContexts: SearchContextQuery[];
 }
 
-export const LandingPageChatBox = ({
+export const LandingPageChatBox = ({
     languageModels,
     repos,
     searchContexts,
-}: LandingPageChatBox) => {
+}: LandingPageChatBoxProps) => {

Also applies to: 20-24

packages/web/src/features/chat/components/chatBox/searchScopeSelector.tsx (2)

189-203: Auto-focus the search input when the popover opens.

Speeds keyboard workflows and improves a11y.

Apply:

-        // Measure virtualizer when popover opens and container is mounted
+        // Measure virtualizer when popover opens and container is mounted; focus input
         useEffect(() => {
             if (isOpen) {
                 setIsMounted(true);
                 setHighlightedIndex(0);
                 // Give the DOM a tick to render before measuring
                 requestAnimationFrame(() => {
                     if (scrollContainerRef.current) {
                         virtualizer.measure();
                     }
+                    inputRef.current?.focus();
                 });
             } else {
                 setIsMounted(false);
             }
         }, [isOpen, virtualizer]);
-                            <div className="flex items-center border-b px-3">
-                                <Input
+                            <div className="flex items-center border-b px-3">
+                                <Input
+                                    ref={inputRef}
                                     placeholder="Search scopes..."
                                     value={searchQuery}
                                     onChange={(e) => setSearchQuery(e.target.value)}
                                     onKeyDown={handleInputKeyDown}
                                     className="border-0 focus-visible:ring-0 focus-visible:ring-offset-0 h-11"
                                 />

Add at top with other refs:

-        const [searchQuery, setSearchQuery] = useState("");
+        const [searchQuery, setSearchQuery] = useState("");
+        const inputRef = useRef<HTMLInputElement>(null);

Also applies to: 264-270


229-236: Add aria roles/attributes for better screen reader support.

  • Button should expose combobox semantics.
  • List container should be role="listbox" with option items and aria-activedescendant.

Apply:

-                            <Button
+                            <Button
+                                aria-haspopup="listbox"
+                                aria-expanded={isOpen}
+                                aria-controls="search-scope-popover"
                                 ref={ref}
                                 {...props}
                                 onClick={handleTogglePopover}
-                    <PopoverContent
+                    <PopoverContent
+                        id="search-scope-popover"
                         className="w-[400px] p-0"
                         align="start"
                         onEscapeKeyDown={() => onOpenChanged(false)}
                     >
-                            <div
+                            <div
                                 ref={scrollContainerRef}
-                                className="max-h-[300px] overflow-auto"
+                                className="max-h-[300px] overflow-auto"
+                                role="listbox"
+                                aria-activedescendant={
+                                  sortedSearchScopeItems.length > 0
+                                    ? `scope-${sortedSearchScopeItems[highlightedIndex]?.item.type}-${sortedSearchScopeItems[highlightedIndex]?.item.value}`
+                                    : undefined
+                                }
                             >
-                                                    <div
-                                                        key={`${item.type}-${item.value}`}
+                                                    <div
+                                                        id={`scope-${item.type}-${item.value}`}
+                                                        role="option"
+                                                        aria-selected={isSelected}
+                                                        key={`${item.type}-${item.value}`}
                                                         onClick={() => toggleItem(item)}

Also applies to: 257-263, 276-347, 297-344

packages/web/src/components/ui/navigation-menu.tsx (1)

43-45: Remove duplicate "group" class on Trigger.

navigationMenuTriggerStyle() already includes group; passing "group" again is redundant.

-    className={cn(navigationMenuTriggerStyle(), "group", className)}
+    className={cn(navigationMenuTriggerStyle(), className)}

Also applies to: 53-55

packages/web/src/app/[domain]/components/navigationMenu/navigationItems.tsx (2)

34-41: Use Next.js Link with Radix Link via asChild for prefetch and a11y.

Radix NavigationMenuLink renders an <a>; using Next <Link> as child improves routing (prefetch) and avoids nested interactive elements elsewhere.

-import { NavigationMenuItem, NavigationMenuLink, NavigationMenuList, navigationMenuTriggerStyle } from "@/components/ui/navigation-menu";
+import { NavigationMenuItem, NavigationMenuLink, NavigationMenuList, navigationMenuTriggerStyle } from "@/components/ui/navigation-menu";
+import Link from "next/link";
...
-<NavigationMenuLink
-  href={`/${domain}/search`}
-  className={cn(navigationMenuTriggerStyle(), "gap-2")}
->
+<NavigationMenuLink asChild className={cn(navigationMenuTriggerStyle(), "gap-2")}>
+  <Link href={`/${domain}/search`}>
     <SearchIcon className="w-4 h-4 mr-1" />
     Search
-</NavigationMenuLink>
+  </Link>
+</NavigationMenuLink>
...
-<NavigationMenuLink href={`/${domain}/chat`} className={navigationMenuTriggerStyle()}>
+<NavigationMenuLink asChild className={navigationMenuTriggerStyle()}>
+  <Link href={`/${domain}/chat`}>
     <MessageCircleIcon className="w-4 h-4 mr-1" />
     Ask
-</NavigationMenuLink>
+  </Link>
+</NavigationMenuLink>
...
-<NavigationMenuLink href={`/${domain}/repos`} className={navigationMenuTriggerStyle()}>
+<NavigationMenuLink asChild className={navigationMenuTriggerStyle()}>
+  <Link href={`/${domain}/repos`}>
     <BookMarkedIcon className="w-4 h-4 mr-1" />
     <span className="mr-2">Repositories</span>
     <Badge ...>...</Badge>
-</NavigationMenuLink>
+  </Link>
+</NavigationMenuLink>
...
-<NavigationMenuLink href={`/${domain}/settings`} className={navigationMenuTriggerStyle()}>
+<NavigationMenuLink asChild className={navigationMenuTriggerStyle()}>
+  <Link href={`/${domain}/settings`}>
     <SettingsIcon className="w-4 h-4 mr-1" />
     Settings
-</NavigationMenuLink>
+  </Link>
+</NavigationMenuLink>

Also applies to: 44-51, 54-66, 71-79


24-29: Tighten active matching to path segments.

startsWith can over-match (e.g., /org/settings-advanced). Consider segment-aware checks.

-const isActive = (href: string) => {
-  if (href === `/${domain}`) return pathname === `/${domain}`;
-  return pathname.startsWith(href);
-};
+const isActive = (href: string) => {
+  if (href === `/${domain}`) return pathname === `/${domain}`;
+  const withSlash = href.endsWith("/") ? href : `${href}/`;
+  return pathname === href || pathname.startsWith(withSlash);
+};
packages/web/src/app/[domain]/layout.tsx (1)

143-146: headers() and cookies() are synchronous in App Router.

No need to await; dropping it avoids confusion.

-const headersList = await headers();
-const cookieStore = await cookies()
+const headersList = headers();
+const cookieStore = cookies();
packages/web/src/app/[domain]/components/navigationMenu/trialIndicator.tsx (1)

31-40: Clamp negative days and handle pluralization.

If nextBillingDate is past, UI can show negative days. Clamp to 0 and pluralize.

-<span>
-  {Math.ceil((subscription.nextBillingDate * 1000 - Date.now()) / (1000 * 60 * 60 * 24))} days left in trial
-</span>
+{(() => {
+  const days = Math.max(
+    0,
+    Math.ceil((subscription.nextBillingDate * 1000 - Date.now()) / (1000 * 60 * 60 * 24))
+  );
+  return <span>{days} {days === 1 ? "day" : "days"} left in trial</span>;
+})()}
packages/web/src/app/[domain]/components/navigationMenu/progressIndicator.tsx (1)

70-75: Pluralize “View more” label.

Minor UX polish for singular vs plural.

-  {`View ${numRepos - sampleRepos.length} more`}
+  {`View ${numRepos - sampleRepos.length} more ${numRepos - sampleRepos.length === 1 ? "repository" : "repositories"}`}
packages/web/src/app/[domain]/components/navigationMenu/index.tsx (1)

109-137: Prefer simple anchor links for external destinations.

Server Actions are overkill for static external links and add overhead. Use <Link href=...> or <a> with rel="noreferrer" where appropriate.

-<form action={async () => { "use server"; redirect(SOURCEBOT_DISCORD_URL); }}>
-  <Button variant="outline" size="icon" type="submit">
-    <DiscordLogoIcon className="w-4 h-4" />
-  </Button>
-</form>
+<a href={SOURCEBOT_DISCORD_URL} target="_blank" rel="noreferrer">
+  <Button asChild variant="outline" size="icon">
+    <span><DiscordLogoIcon className="w-4 h-4" /></span>
+  </Button>
+</a>
...
-<form action={async () => { "use server"; redirect(SOURCEBOT_GITHUB_URL); }}>
+<a href={SOURCEBOT_GITHUB_URL} target="_blank" rel="noreferrer">
   <Button variant="outline" size="icon" type="submit">
     <GitHubLogoIcon className="w-4 h-4" />
   </Button>
-</form>
+</a>
packages/web/src/initialize.ts (2)

37-39: Config equality via JSON.stringify can be brittle.

Key order/undefined differences can create false positives. Consider stable stringify or deep-equal with key sorting.

Example:

const stable = (o: unknown) => JSON.stringify(o, Object.keys(o as object).sort());
const syncNeededOnUpdate =
  (currentConnectionConfig && stable(currentConnectionConfig) !== stable(newConnectionConfig)) ||
  (currentConnection?.syncStatus === ConnectionSyncStatus.FAILED);

70-89: Deleting declarative connections: ensure referential integrity.

Hard-deleting connections may violate FK constraints (e.g., repos, jobs). Wrap deletes in a transaction and confirm onDelete cascade/cleanup.

Suggestion:

await prisma.$transaction(async (tx) => {
  for (const c of deletedConnections) {
    await tx.connection.delete({ where: { id: c.id }});
  }
});
packages/backend/src/ee/userPermissionSyncer.ts (1)

104-110: Make shutdown resilient; close concurrently and guard against partial failures.

If worker.close() throws, queue.close() is skipped. Close both with Promise.allSettled and log failures.

   public async dispose() {
     if (this.interval) {
       clearInterval(this.interval);
     }
-    await this.worker.close();
-    await this.queue.close();
+    const results = await Promise.allSettled([this.worker.close(), this.queue.close()]);
+    results.forEach((r, i) => {
+      if (r.status === 'rejected') {
+        Sentry.captureException(r.reason, { tags: { component: 'user-permission-syncer', phase: 'dispose', target: i === 0 ? 'worker' : 'queue' }});
+      }
+    });
   }
package.json (1)

9-9: Ensure DB is migrated before dev services start.

Previous flow ran migrations up front. If backend doesn’t run them on boot, newcomers may hit schema errors.

Options:

  • Document: run yarn dev:prisma:migrate:dev once before yarn dev.
  • Or add a guarded prestart: a small script that checks a migration fingerprint and prompts the user.
packages/web/src/app/[domain]/chat/layout.tsx (1)

11-11: Remove unnecessary await on cookies().

cookies() is synchronous in Next server components; drop await for clarity.

-    const isTutorialDismissed = (await cookies()).get(AGENTIC_SEARCH_TUTORIAL_DISMISSED_COOKIE_NAME)?.value === "true";
+    const isTutorialDismissed = cookies().get(AGENTIC_SEARCH_TUTORIAL_DISMISSED_COOKIE_NAME)?.value === "true";
packages/web/src/actions.ts (2)

1840-1847: Harden and persist the tutorial-dismissed cookie.

Set path, sameSite, secure; you already set maxAge (great).

 export const setAgenticSearchTutorialDismissedCookie = async (dismissed: boolean) => sew(async () => {
   const cookieStore = await cookies();
   cookieStore.set(AGENTIC_SEARCH_TUTORIAL_DISMISSED_COOKIE_NAME, dismissed ? "true" : "false", {
     httpOnly: false, // Allow client-side access
     maxAge: 365 * 24 * 60 * 60, // 1 year in seconds
+    path: '/',
+    sameSite: 'lax',
+    secure: process.env.NODE_ENV === 'production',
   });
   return true;
 });

1849-1853: Persist and scope the mobile splash-screen dismissal cookie, too.

Align with the tutorial cookie so it survives reloads and is site-wide.

 export const dismissMobileUnsupportedSplashScreen = async () => sew(async () => {
   const cookieStore = await cookies();
-  cookieStore.set(MOBILE_UNSUPPORTED_SPLASH_SCREEN_DISMISSED_COOKIE_NAME, 'true');
+  cookieStore.set(MOBILE_UNSUPPORTED_SPLASH_SCREEN_DISMISSED_COOKIE_NAME, 'true', {
+    httpOnly: false,
+    maxAge: 365 * 24 * 60 * 60,
+    path: '/',
+    sameSite: 'lax',
+    secure: process.env.NODE_ENV === 'production',
+  });
   return true;
 });
packages/backend/src/repoCompileUtils.ts (1)

49-55: Use POSIX joins for repoName construction (cross‑platform).

path.join is OS‑dependent; on Windows it will emit backslashes and break URL-like repo names. Prefer path.posix.join for URL segments.

Example:

import path from 'path';
const repoName = path.posix.join(repoNameRoot, repoDisplayName);

Also applies to: 122-131, 199-207, 266-273, 351-358, 408-412, 520-521, 645-647, 686-691

packages/backend/vitest.config.ts (1)

7-9: Use a temp test cache dir to prevent repo pollution

Data cache is currently set to 'test-data' (relative path), which writes test artifacts to the repository root during test execution. No cleanup hooks are present in the test suite, and 'test-data/' is not in .gitignore, so the directory persists after tests complete, polluting the working tree.

Using OS temp directory (os.tmpdir()) is best practice for test caches and eliminates this issue.

Apply this diff:

-import { defineConfig } from 'vitest/config';
+import { defineConfig } from 'vitest/config';
+import os from 'node:os';
+import path from 'node:path';
 
 export default defineConfig({
     test: {
         environment: 'node',
         watch: false,
         env: {
-            DATA_CACHE_DIR: 'test-data'
+            DATA_CACHE_DIR: path.join(os.tmpdir(), 'sourcebot-test-data')
         }
     }
 });
packages/web/src/app/[domain]/repos/page.tsx (1)

11-19: Guard against empty jobs array.

Line 12 accesses repo.jobs[0] without checking if the array is empty. If a repository has no indexing jobs, this will evaluate to undefined, which is safe but could be made more explicit.

Apply this diff for clarity:

 function getRepoStatus(repo: { indexedAt: Date | null, jobs: RepoIndexingJob[] }): RepoStatus {
-    const latestJob = repo.jobs[0];
+    const latestJob = repo.jobs.length > 0 ? repo.jobs[0] : null;
     
     if (latestJob?.status === 'PENDING' || latestJob?.status === 'IN_PROGRESS') {
         return 'syncing';
     }
     
     return repo.indexedAt ? 'indexed' : 'not-indexed';
 }
packages/db/prisma/schema.prisma (1)

51-53: Add an index on Repo.indexedAt to speed periodic scans.

Schedulers filter by indexedAt; index it to reduce full scans on large tables.

 model Repo {
   // ...
-  indexedAt          DateTime? /// When the repo was last indexed successfully.
+  indexedAt          DateTime? /// When the repo was last indexed successfully.
+
+  @@index([indexedAt])
 }
packages/backend/src/index.ts (2)

67-68: Avoid multi-replica double-scheduling; run one scheduler cluster-wide.

Starting schedulers on every replica risks duplicate job creation (even with GroupMQ). Use leader election or a distributed lock (e.g., Redis-based redlock) so only one instance runs the scheduling loop at a time. Workers can still run on all replicas.

I can wire a simple Redis lock around startScheduler if you want.


37-42: Remove TOCTOU on mkdir; just mkdir with recursive.

existsSync + mkdir is racy across replicas. Calling mkdir with { recursive: true } is idempotent.

-if (!existsSync(reposPath)) {
-    await mkdir(reposPath, { recursive: true });
-}
-if (!existsSync(indexPath)) {
-    await mkdir(indexPath, { recursive: true });
-}
+await mkdir(reposPath, { recursive: true });
+await mkdir(indexPath, { recursive: true });
packages/backend/src/repoIndexManager.ts (4)

246-264: Avoid per-job process signal listeners; risk MaxListeners warnings at high concurrency.

Register one handler in the constructor that aborts all active jobs. Track controllers in a Set; add/remove per job.

 export class RepoIndexManager {
-    private interval?: NodeJS.Timeout;
+    private interval?: NodeJS.Timeout;
     private queue: Queue<JobPayload>;
     private worker: Worker<JobPayload>;
+    private activeJobControllers = new Set<AbortController>();
+    private onSignal = () => {
+        logger.info('Received shutdown signal, aborting active jobs...');
+        for (const c of this.activeJobControllers) c.abort();
+    };

     constructor(
       // ...
     ) {
       // ...
+      process.on('SIGTERM', this.onSignal);
+      process.on('SIGINT', this.onSignal);
     }

And in runJob:

-        const abortController = new AbortController();
-        const signalHandler = () => {
-            logger.info(`Received shutdown signal, aborting...`);
-            abortController.abort();
-        };
-        process.on('SIGTERM', signalHandler);
-        process.on('SIGINT', signalHandler);
+        const abortController = new AbortController();
+        this.activeJobControllers.add(abortController);

         try {
           // ...
         } finally {
-            process.off('SIGTERM', signalHandler);
-            process.off('SIGINT', signalHandler);
+            this.activeJobControllers.delete(abortController);
         }

Also remove listeners in dispose if you add them in constructor:

   public async dispose() {
     if (this.interval) clearInterval(this.interval);
     await this.worker.close();
     await this.queue.close();
+    process.off('SIGTERM', this.onSignal);
+    process.off('SIGINT', this.onSignal);
   }

279-289: Redundant git-repo validation; call once with signal.

isPathAValidGitRepoRoot is called twice; use a single call with signal.

-        if (existsSync(repoPath) && !(await isPathAValidGitRepoRoot( { path: repoPath } ))) {
-            const isValidGitRepo = await isPathAValidGitRepoRoot({
-                path: repoPath,
-                signal,
-            });
+        if (existsSync(repoPath)) {
+            const isValidGitRepo = await isPathAValidGitRepoRoot({ path: repoPath, signal });
             if (!isValidGitRepo && !isReadOnly) {
                 logger.warn(`${repoPath} is not a valid git repository root. Deleting directory and performing fresh clone.`);
                 await rm(repoPath, { recursive: true, force: true });
             }
-        }
+        }

391-397: Make cleanup completion idempotent.

The second duplicate CLEANUP job will throw on repo.delete. Use deleteMany and log outcome.

-                const repo = await this.db.repo.delete({
-                    where: { id: jobData.repoId },
-                });
-
-                logger.info(`Completed cleanup job ${job.data.jobId} for repo ${repo.name} (id: ${repo.id})`);
+                const res = await this.db.repo.deleteMany({ where: { id: jobData.repoId } });
+                if (res.count > 0) {
+                    logger.info(`Completed cleanup job ${job.data.jobId} for repoId ${jobData.repoId}`);
+                } else {
+                    logger.info(`Cleanup job ${job.data.jobId}: repoId ${jobData.repoId} already deleted`);
+                }

363-367: Parallelize shard deletions and avoid full-dir scans if directory is large.

  • Deleting sequentially is slow; use Promise.allSettled for better throughput.
  • Consider storing shard paths or segregating per-org/repo dirs to avoid scanning the full INDEX_CACHE_DIR.
-        for (const file of files) {
-            const filePath = `${INDEX_CACHE_DIR}/${file}`;
-            logger.info(`Deleting shard file ${filePath}`);
-            await rm(filePath, { force: true });
-        }
+        await Promise.allSettled(
+            files.map((file) => {
+                const filePath = `${INDEX_CACHE_DIR}/${file}`;
+                logger.info(`Deleting shard file ${filePath}`);
+                return rm(filePath, { force: true });
+            })
+        );
packages/web/src/app/[domain]/repos/columns.tsx (1)

112-162: Status filter uses labels; works, but consider value-based filters to avoid i18n coupling.

Filtering by display labels ties logic to UI text. Prefer filtering by the enum value and map labels only for rendering.

-            const uniqueLabels = Object.values(statusLabels);
-            const currentFilter = column.getFilterValue() as string | undefined;
+            const uniqueValues = Object.keys(statusLabels) as RepoStatus[];
+            const currentFilter = column.getFilterValue() as RepoStatus | undefined;
...
-                            {uniqueLabels.map((label) => (
-                                <DropdownMenuItem key={label} onClick={() => column.setFilterValue(label)}>
+                            {uniqueValues.map((value) => (
+                                <DropdownMenuItem key={value} onClick={() => column.setFilterValue(value)}>
-                                    <Check className={cn("mr-2 h-4 w-4", column.getFilterValue() === label ? "opacity-100" : "opacity-0")} />
-                                    {label}
+                                    <Check className={cn("mr-2 h-4 w-4", column.getFilterValue() === value ? "opacity-100" : "opacity-0")} />
+                                    {statusLabels[value]}
                                 </DropdownMenuItem>
                             ))}
...
-            const status = row.getValue(id) as RepoStatus;
-            return statusLabels[status] === value;
+            const status = row.getValue(id) as RepoStatus;
+            return value === undefined ? true : status === value;
packages/backend/src/git.ts (1)

149-161: LGTM; consider explicit local scope for clarity.

Writes look correct. Optionally pass an explicit local scope to avoid ambiguity across environments.

@msukkari msukkari self-requested a review October 18, 2025 22:57
msukkari
msukkari previously approved these changes Oct 18, 2025
@brendan-kellam brendan-kellam merged commit 4ebe4e0 into main Oct 18, 2025
9 checks passed
@brendan-kellam brendan-kellam deleted the bkellam/repo_indexing_v2 branch October 18, 2025 23:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment