6 Commits

Author SHA1 Message Date
Kshitij e6fc2590c9 chore(data): add 5 new standards, enrich chunks with full_title/keywords, update eval scores. 2026-05-04 15:45:33 +05:30
Kshitij 3fbf91c706 feat(retrieval): add grade matching and same-family part disambiguation.
Boost scores when query grade matches standard title grade, penalize mismatches. Add part disambiguation to correctly route queries to specific standard parts (e.g., IS 12269 (Part 1) vs (Part 2)). Regenerate retrieval results with improved ranking.
2026-05-04 00:20:19 +05:30
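
The grade matching and part disambiguation described in 3fbf91c706 could be sketched roughly as below. This is an illustration under stated assumptions, not the repository's code: the regular expressions, the `BOOST` and `PENALTY` multipliers, the `adjust_score` name, and the sample titles are all hypothetical.

```python
import re

# Hypothetical patterns: "53 grade" / "grade 53" and "Part 2".
GRADE_RE = re.compile(r"\b(\d{2,3})\s*grade\b|\bgrade\s*(\d{2,3})\b", re.IGNORECASE)
PART_RE = re.compile(r"\bpart\s*(\d+)\b", re.IGNORECASE)

# Assumed multipliers; the real values are not given in the commit message.
BOOST = 1.25
PENALTY = 0.75


def _grade(text):
    """Return the grade number mentioned in text as a string, or None."""
    m = GRADE_RE.search(text)
    return (m.group(1) or m.group(2)) if m else None


def _part(text):
    """Return the part number mentioned in text as a string, or None."""
    m = PART_RE.search(text)
    return m.group(1) if m else None


def adjust_score(query, title, score):
    """Boost on a grade/part match, penalize on a mismatch.

    A signal is ignored unless both the query and the title mention it.
    """
    for extract in (_grade, _part):
        wanted, found = extract(query), extract(title)
        if wanted is None or found is None:
            continue
        score *= BOOST if wanted == found else PENALTY
    return score
```

With these assumed multipliers, a query mentioning "53 grade" raises a candidate whose title contains "53 Grade" and lowers one whose title says "43 Grade"; a query naming "(Part 2)" is handled the same way against "(Part 1)" and "(Part 2)" titles.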
Kshitij 0440b76111 Revert "chore: remove unnecessary files from /data directory and move /data/processed/retrieval_results.json to /data/results.json"
This reverts commit 1efc0e3482.
2026-05-03 01:32:10 +05:30
Kshitij 1efc0e3482 chore: remove unnecessary files from /data directory and move /data/processed/retrieval_results.json to /data/results.json 2026-05-03 01:18:14 +05:30
Kshitij 29b32dfcac fix: complete requirements.txt and inference.py output correctness.
- Add faiss-cpu, rank-bm25, sentence-transformers, numpy to requirements.txt.
  (previously only pymupdf was listed; other deps were manual-install only)
- Cast score to float() before round() to avoid numpy type serialization errors.
- Pass expected_standards through _format_result for eval script compatibility.
- Update retrieval_results.json with expected_standards per query for eval.
2026-05-03 00:03:03 +05:30
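
The float() cast mentioned in 29b32dfcac matters because round() applied to a NumPy scalar such as numpy.float32 can return another NumPy scalar, which json.dumps does not serialize. A minimal sketch of the pattern follows, assuming a hypothetical helper name `format_score` and a hypothetical default of 4 decimal places (neither is taken from inference.py):

```python
import json


def format_score(score, ndigits=4):
    """Convert any float-like value to a built-in float, then round it.

    Casting first guarantees the result is a plain Python float,
    regardless of the numeric type the retrieval backend returned.
    """
    return round(float(score), ndigits)


# The rounded value is a built-in float, so it serializes cleanly.
payload = json.dumps({"score": format_score(0.123456)})
```

Casting before rounding, rather than after, keeps the helper independent of how a given numeric type implements rounding.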
Kshitij f65185b91e feat: add data processing outputs. 2026-04-28 23:55:59 +05:30