5f9546b16c
Request: - Rework archivist handling for stubborn T0/T1 HKEX document gaps and unresolved T2 grey-market gaps. Changes: - Query HKEXnews titleSearchServlet with IPO-date windows instead of only the latest title-search page. - Recognize SHARE OFFER listing documents and archive official HTML allotment-result notices when no PDF is published. - Mark source-only allotment completion clearly when structured demand parsing is not yet covered. - Add a reusable grey-market gap marker and archivist source policy for T2 data. - Archive newly discovered HKEX raw sources, update SQLite, and refresh CSV snapshots. - Treat raw evidence files as binary in Git attributes. Verification: - Ran py_compile for archive_hkex_documents.py, update_sync_state.py, and mark_grey_market_gaps.py. - Ran HKEX document archive backfill and grey-market gap marker. - Checked SQLite integrity, foreign keys, source paths, source hashes, and DB-vs-snapshot row counts. - Ran git diff --cached --check after marking raw archives binary. Next useful context: - T0 is now complete for 293 tickers. - T1 has 291 complete and 2 pending_not_due tickers. - T2 has 291 blocked gaps pending an approved grey-market source strategy.
832 KiB
832 KiB