
fix: resolve windows encoding and harden FAISS indexing logic #283

Open
srinivasan-ai-dev wants to merge 3 commits into jenkinsci:main from srinivasan-ai-dev:fix/windows-compatibility-and-robustness

Conversation

@srinivasan-ai-dev

This PR fixes environment and pipeline failures found during Windows setup and when testing with small datasets.

Key Fixes:
Windows Compatibility: Enforced UTF-8 encoding in JSON reading to prevent UnicodeDecodeError.

Schema Resiliency: Updated extract_chunk_docs.py to handle nested dictionary structures, preventing KeyError.

FAISS Fallback: Implemented dynamic switching to IndexFlatL2 when the dataset is smaller than the cluster count (nlist), preventing training crashes.

Dependency Hardening: Updated requirements-cpu.txt to ensure a smoother local setup.
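
To illustrate the first two fixes, here is a minimal sketch of reading JSON with an explicit encoding and walking nested dictionaries without raising KeyError. The helper names `load_json_utf8` and `get_nested`, and the `metadata`/`title` keys in the usage note, are hypothetical and are not taken from extract_chunk_docs.py; the actual change in this PR may look different.

```python
import json


def load_json_utf8(path):
    """Read a JSON file with an explicit encoding.

    Without encoding="utf-8", open() falls back to the locale encoding,
    which on Windows is often cp1252 and can raise UnicodeDecodeError
    on non-ASCII content.
    """
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def get_nested(record, *keys, default=None):
    """Follow a chain of keys through nested dicts.

    Returns `default` instead of raising KeyError (or TypeError) when a
    key is missing or an intermediate value is not a dict.
    """
    current = record
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current
```

For example, `get_nested(doc, "metadata", "title", default="")` yields an empty string when either level is absent, so callers do not need their own try/except around each lookup.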

This ensures the project is more accessible for new contributors on diverse operating systems.
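
The FAISS fallback described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `choose_index_kind` and `build_index` are hypothetical names, the default `nlist=100` is arbitrary, and the use of `IndexIVFFlat` as the clustered index is an assumption, since the PR text only names the `IndexFlatL2` fallback. FAISS cannot train k-means with fewer training points than clusters, which is why the check compares the dataset size against `nlist`.

```python
def choose_index_kind(n_vectors, nlist):
    """Return "flat" when there are too few vectors to train nlist clusters."""
    return "flat" if n_vectors < nlist else "ivf"


def build_index(vectors, nlist=100):
    """Build a FAISS index, falling back to exact search for small datasets.

    `vectors` is expected to be a 2-D array-like of shape (n, dim).
    """
    # Imported lazily so the selection logic above works without FAISS installed.
    import faiss
    import numpy as np

    # FAISS expects contiguous float32 input.
    vectors = np.ascontiguousarray(vectors, dtype="float32")
    n_vectors, dim = vectors.shape

    if choose_index_kind(n_vectors, nlist) == "flat":
        # Exact L2 search; needs no training step.
        index = faiss.IndexFlatL2(dim)
    else:
        # Inverted-file index: requires training on the data first.
        quantizer = faiss.IndexFlatL2(dim)
        index = faiss.IndexIVFFlat(quantizer, dim, nlist)
        index.train(vectors)

    index.add(vectors)
    return index
```

Keeping the size check in a separate function makes the fallback rule easy to unit-test without a FAISS installation.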

@srinivasan-ai-dev requested a review from a team as a code owner March 15, 2026 08:17
@srinivasan-ai-dev
Author

All feedback has been addressed and verified. CI checks are passing, and I've resolved the individual conversation threads. Ready for further review!

@srinivasan-ai-dev force-pushed the fix/windows-compatibility-and-robustness branch from 30506c2 to 159e663 March 15, 2026 17:19
@srinivasan-ai-dev
Author

Hi @berviantoleo, I have addressed the Pylint style violations and fixed the AttributeError in the backend unit tests by implementing type-checking for the chunking input. The PR should now be ready for a final review. Thank you!

@srinivasan-ai-dev force-pushed the fix/windows-compatibility-and-robustness branch from 159e663 to 8a7973f March 15, 2026 18:54
@srinivasan-ai-dev
Author

Hi @berviantoleo, just checking in to see if there’s anything I can clarify regarding the FAISS indexing changes or if you’d like me to resolve any specific blockers! Thanks.

@berviantoleo added the bug label (For changelog: Minor bug. Will be listed after features) Mar 18, 2026

Labels

bug For changelog: Minor bug. Will be listed after features


2 participants