47 Commits

Author SHA1 Message Date
Kshitij 1fdd0aa840 fix: git clone link in README 2026-05-04 23:08:27 +05:30
Kshitij 82fc701ff1 docs: add presentation. 2026-05-04 23:05:40 +05:30
Kshitij 35b3e455cf docs: update README. 2026-05-04 15:46:42 +05:30
Kshitij e6fc2590c9 chore(data): add 5 new standards, enrich chunks with full_title/keywords, update eval scores. 2026-05-04 15:45:33 +05:30
Kshitij 458bd93434 feat(retrieval): add part-number discriminator and improve part disambiguation. 2026-05-04 15:45:02 +05:30
Kshitij fdae5d2318 chore: update evaluation results with revised scores. 2026-05-04 15:11:16 +05:30
Kshitij b055edbbc0 feat: add 10 new standards and chunks for steel, wood, electrical categories. 2026-05-04 15:10:53 +05:30
Kshitij 42fed21586 fix: correct erroneous scope descriptions for 7 standards. 2026-05-04 15:10:27 +05:30
Kshitij 697bdcbd80 docs(readme): update scores to MRR=1.000 and reflect parser/retrieval improvements.
- MRR @5: 0.783 → 1.000 (all 10 queries now return expected standard at rank 1)
- Chunking: document 4-pass boundary detection (Pass 3 scope recovery, Pass 4 bleed truncation)
- Chunk count: 1,261 → 1,236 across all references
- Re-ranking: add grade discriminator (+0.35/-0.40) and Part disambiguation bullets
2026-05-04 00:24:22 +05:30
Kshitij 3fbf91c706 feat(retrieval): add grade matching and same-family part disambiguation.
Boost scores when query grade matches standard title grade, penalize mismatches. Add part disambiguation to correctly route queries to specific standard parts (e.g., IS 12269 (Part 1) vs (Part 2)). Regenerate retrieval results with improved ranking.
2026-05-04 00:20:19 +05:30
Kshitij 28bb4ca1de fix(parser): recover stolen scope text and truncate next-standard bleed
Add Pass 3 to recover scope text incorrectly placed in previous block, and Pass 4 to truncate bleed from the following standard. Regenerate standards.json and standards_chunks.json with the improved parser.
2026-05-04 00:18:17 +05:30
Kshitij 80aa252c3e fix(server): initialize queue and pending arrays in retriever service. 2026-05-03 22:43:25 +05:30
Kshitij 4c548ebc61 fix(server): validate JSON data, sanitize inputs, and harden query parsing. 2026-05-03 22:43:11 +05:30
Kshitij e8b5beca5e feat(server): add production mode with static file serving and SPA fallback. 2026-05-03 22:42:52 +05:30
Kshitij fd73b8bde5 feat(client): add error boundary around root App component. 2026-05-03 22:42:19 +05:30
Kshitij c54c893eac docs: update README. 2026-05-03 22:23:23 +05:30
Kshitij 6af7b05c53 docs: clarify PYTHON_BIN validation rules in env example. 2026-05-03 17:26:42 +05:30
Kshitij 4c69ee4fc1 fix: add timeout to LLM API calls to prevent hung requests. 2026-05-03 17:26:20 +05:30
Kshitij d2a75be7b6 fix: harden server input validation and prevent information leakage. 2026-05-03 17:25:25 +05:30
Kshitij 844973fb39 chore: add updated results in json format; output of inference.py 2026-05-03 01:45:18 +05:30
Kshitij 14a2328c81 Revert "refactor: remove emoji icons and normalize dashes in client UI."
This reverts commit 33fe20021a.
2026-05-03 01:41:46 +05:30
Kshitij 0440b76111 Revert "chore: remove unnecessary files from /data directory and move /data/processed/retrieval_results.json to /data/results.json"
This reverts commit 1efc0e3482.
2026-05-03 01:32:10 +05:30
Kshitij f2953ef56a chore: remove outdated /ui directory. 2026-05-03 01:31:13 +05:30
Kshitij 1efc0e3482 chore: remove unnecessary files from /data directory and move /data/processed/retrieval_results.json to /data/results.json 2026-05-03 01:18:14 +05:30
Kshitij 68d85898a1 chore: update backend example env to include PYTHON_BIN env var. 2026-05-03 01:17:09 +05:30
Kshitij 0f91db798c chore: remove old favicon and icon, update code with new logo. 2026-05-03 00:51:55 +05:30
Kshitij de1d14f125 refactor: move inference.py to root. 2026-05-03 00:43:32 +05:30
Kshitij f88a45968a docs: add JSDoc and normalize comments across server. 2026-05-03 00:16:42 +05:30
Kshitij 33fe20021a refactor: remove emoji icons and normalize dashes in client UI. 2026-05-03 00:16:05 +05:30
Kshitij 0aa0f808a1 chore: move eval_script and inference script to root. 2026-05-03 00:04:29 +05:30
Kshitij 29b32dfcac fix: complete requirements.txt and inference.py output correctness.
- Add faiss-cpu, rank-bm25, sentence-transformers, numpy to requirements.txt.
  (previously only pymupdf was listed; other deps were manual-install only)
- Cast score to float() before round() to avoid numpy type serialization errors.
- Pass expected_standards through _format_result for eval script compatibility.
- Update retrieval_results.json with expected_standards per query for eval.
2026-05-03 00:03:03 +05:30
Kshitij 8e1348fb63 feat: add react-i18next with English and Hindi locale support.
- Add i18next + react-i18next + i18next-browser-languagedetector.
- EN/HI translation files covering all UI strings across every page and component.
- Language switcher button in Navbar; choice persisted to localStorage.
- document.documentElement.lang synced to active language in App.
- Skip-nav link and #main-content anchor for keyboard accessibility.
- aria-describedby on modal dialog; page title and meta description in index.html.
- Secure page title set to 'BIS SP-21 Standards.'
2026-05-03 00:01:14 +05:30
Kshitij 0d8b2cdb3f security: add helmet, rate limiting, strict CORS, input sanitization.
- Add helmet for secure HTTP response headers.
- Add express-rate-limit: 60 req/min general, 20 req/min on LLM endpoints.
- Restrict CORS to localhost origins in dev, CORS_ORIGIN env var in prod.
- Cap request body at 16kb.
- Add sanitizeText() to strip control chars on all string inputs.
- Add isValidStandardId() regex guard on :id param and standard_id fields.
- All route handlers use sanitized values; no raw req.body/req.query access.
2026-05-02 23:59:33 +05:30
notkshitij 92cc8274df Merge pull request #1 from kshitij-ka/Atharva
docs: add JSDoc to useDebounce hook
2026-05-02 13:05:08 +05:30
atharvaombase 5f78ab02a9 docs: add JSDoc to useDebounce hook 2026-05-01 17:33:20 +05:30
atharvaombase 316b71827f fix: disable setState-in-effect ESLint rule 2026-05-01 17:29:02 +05:30
atharvaombase 2b85a7573b docs: add JSDoc comments to API functions 2026-05-01 17:28:36 +05:30
Kshitij 6c81ec597c chore: add MIT license. 2026-04-29 00:04:18 +05:30
Kshitij db88fe1ca7 docs: update README. 2026-04-29 00:02:34 +05:30
Kshitij a5cf7bbfda feat: add web client frontend with monorepo config. 2026-04-28 23:56:23 +05:30
Kshitij 3a0c32ea8f feat: add web server backend. 2026-04-28 23:56:07 +05:30
Kshitij 3065a0adce chore: add legacy UI templates. 2026-04-28 23:56:05 +05:30
Kshitij f65185b91e feat: add data processing outputs. 2026-04-28 23:55:59 +05:30
Kshitij 434c8b288a chore: add evaluation scripts. 2026-04-28 23:54:45 +05:30
Kshitij 13fd04e7e1 chore: add Python dependencies and core pipeline. 2026-04-28 23:54:26 +05:30
Kshitij e6d7669cba chore: add .gitignore. 2026-04-28 23:53:46 +05:30
Kshitij 6218b5b1d9 Initial commit. 2026-04-28 23:52:20 +05:30
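
Note on the metric cited in commit 697bdcbd80 (MRR @5: 0.783 → 1.000): the block below is a minimal Python sketch of how mean reciprocal rank at a cutoff k is commonly computed. The function name `mrr_at_k`, the input shapes, and the placeholder IDs are illustrative assumptions; the repository's own eval script is not shown in this log and may differ.

```python
from typing import Sequence


def mrr_at_k(
    retrieved: Sequence[Sequence[str]],
    expected: Sequence[Sequence[str]],
    k: int = 5,
) -> float:
    """Mean reciprocal rank over queries, looking only at the top-k results.

    Hypothetical helper for illustration; not the repository's eval code.
    """
    if len(retrieved) != len(expected):
        raise ValueError("retrieved and expected need one entry per query")
    if not retrieved:
        return 0.0

    total = 0.0
    for results, relevant in zip(retrieved, expected):
        relevant_set = set(relevant)
        # Add the reciprocal rank of the first relevant hit in the top k;
        # a query with no relevant hit in the top k contributes 0.
        for rank, standard_id in enumerate(results[:k], start=1):
            if standard_id in relevant_set:
                total += 1.0 / rank
                break
    return total / len(retrieved)


# Placeholder data: the first query hits at rank 1, the second at rank 2.
retrieved = [["STD-A", "STD-B"], ["STD-B", "STD-C"]]
expected = [["STD-A"], ["STD-C"]]
score = mrr_at_k(retrieved, expected)  # (1/1 + 1/2) / 2
```

Under this definition a score of 1.000 means every query's top result is an expected standard, which matches the commit's note that all 10 queries return the expected standard at rank 1.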