Extend default indexed file types and support extension-less files #292
Open
alfredsgenkins wants to merge 3 commits into zilliztech:master from
Uncomment and extend the DEFAULT_SUPPORTED_EXTENSIONS list to include file types commonly found in real projects: YAML, JSON, SQL, CSS, HTML, shell scripts, Prisma schemas, GraphQL, Protobuf, Terraform, and Dockerfiles (by extension). Without these, Kubernetes manifests, CI/CD workflows, database migrations, and styling are invisible to semantic search.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

The default MilvusClient gRPC timeout (15s) is too short for Zilliz Serverless clusters, causing DEADLINE_EXCEEDED errors during indexing. Two changes:
1. Set MilvusClient timeout to 60s (configurable via MILVUS_TIMEOUT_MS)
2. Replace expensive dummy collection create/drop in checkCollectionLimit with a lightweight listCollections call
Fixes zilliztech#289
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Adds a DEFAULT_EXTENSIONLESS_FILENAMES set of well-known files that have no extension but should be indexed. When the file traversal encounters a file with no extension, it checks this set before skipping. Configurable via CUSTOM_EXTENSIONLESS_FILENAMES env var (comma-separated).
Fixes zilliztech#191
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
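For reviewers, the configurable timeout from the second commit could be resolved roughly as follows. This is a minimal sketch, not the PR's code: `resolveTimeoutMs` and `DEFAULT_MILVUS_TIMEOUT_MS` are hypothetical names, and the fallback-on-invalid-input behavior is an assumption. Only the 60s default and the `MILVUS_TIMEOUT_MS` variable name come from the commit message.

```typescript
// Default from the commit message: 60 seconds, expressed in milliseconds.
const DEFAULT_MILVUS_TIMEOUT_MS = 60000;

// Hypothetical helper: turn the raw env value into a usable timeout.
// Assumption: missing, empty, non-numeric, or non-positive values fall back to the default.
function resolveTimeoutMs(
  raw: string | undefined,
  fallback: number = DEFAULT_MILVUS_TIMEOUT_MS
): number {
  if (raw === undefined || raw.trim() === "") {
    return fallback;
  }
  const parsed = Number(raw);
  // Reject NaN, Infinity, zero, and negative values.
  if (!Number.isFinite(parsed) || parsed <= 0) {
    return fallback;
  }
  return Math.floor(parsed);
}
```

A caller would presumably pass `process.env.MILVUS_TIMEOUT_MS` and hand the result to the client's timeout setting. How the PR wires this into MilvusClient is not shown in this conversation.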
aec92d6 to a8b3f04
Summary
Two related improvements to file indexing coverage:
1. Extend DEFAULT_SUPPORTED_EXTENSIONS

The existing list covered programming languages and markdown but left out common project files. Extensions like .yaml, .yml, .json, .sql, .css, .prisma, .graphql, .tf etc. were commented out in the source. This PR uncomments and expands them so K8s manifests, CI/CD workflows, database migrations, schemas, and stylesheets are indexed by default.

2. Support extension-less files (fixes #191)

Files like Dockerfile, Makefile, Jenkinsfile, and Vagrantfile have no extension, so path.extname() returns '' and they were silently skipped. This PR adds:

- DEFAULT_EXTENSIONLESS_FILENAMES: a Set of well-known extension-less files checked during traversal
- CUSTOM_EXTENSIONLESS_FILENAMES env var: comma-separated list so users can add their own (e.g. Dockerfile.dev, Containerfile)

🤖 Generated with Claude Code
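The traversal check described above can be sketched like this. It is an illustration under stated assumptions, not the PR's implementation: `shouldIndexFile` and `parseFilenameList` are hypothetical names, and both sets hold only a small subset of the entries named in this description.

```typescript
import * as path from "node:path";

// Illustrative subsets only; the real lists in the PR are longer.
const SUPPORTED_EXTENSIONS = new Set([".ts", ".py", ".md", ".yaml", ".yml", ".json", ".sql"]);
const DEFAULT_EXTENSIONLESS_FILENAMES = new Set([
  "Dockerfile",
  "Makefile",
  "Jenkinsfile",
  "Vagrantfile",
]);

// Hypothetical helper: parse a comma-separated env value such as
// CUSTOM_EXTENSIONLESS_FILENAMES, trimming whitespace and dropping empty entries.
function parseFilenameList(raw: string | undefined): string[] {
  if (!raw) {
    return [];
  }
  return raw
    .split(",")
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
}

// Hypothetical check: decide whether a file should be indexed.
function shouldIndexFile(filePath: string, customNames: string[] = []): boolean {
  const ext = path.extname(filePath);
  if (ext !== "") {
    // Files with an extension go through the normal extension list.
    return SUPPORTED_EXTENSIONS.has(ext);
  }
  // path.extname() returned '': fall back to matching the bare filename.
  const base = path.basename(filePath);
  return DEFAULT_EXTENSIONLESS_FILENAMES.has(base) || customNames.includes(base);
}
```

With `customNames` built from `parseFilenameList("Containerfile, Justfile")`, a file named `Containerfile` passes the check while `LICENSE` is still skipped. `Justfile` is just an example value here, not something taken from the PR.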