hk-ipo

Author	SHA1	Message	Date
geometrybase	e346690bb7	Archive current HKEX IPO candidates Request: - Use the analyst workflow to analyze the latest Hong Kong IPOs, connect their source data, and produce a current report. Changes: - Added a current HKEX New Listing Information page seeder that archives the official page, seeds visible tickers, and records source_refs. - Archived current HKEX prospectus and allotment-result sources for the 16 visible Main Board candidates and extracted their text. - Extended prospectus parsing for offer price, derived gross proceeds, HDR offerings, and listing-date text extracted with split characters. - Rebuilt the analysis dataset and added a Chinese 2026-06-21 latest IPO report separating live T0 watchlist names from past-cutoff T1/D1 candidates. Verification: - Ran py_compile for update_recent_ipo_list.py, archive_hkex_current_new_listings.py, archive_hkex_documents.py, and build_analysis_dataset.py. - Re-ran HKEX current page seeding, document archiving, and analysis dataset build as of 2026-06-21T08:44:59Z. - Ran git diff --check and git diff --cached --check. - Ran SQLite integrity_check and foreign_key_check. - Verified source_refs paths, file existence, SHA-256 hashes, and report source paths. Next useful context: - Capture T0.95 market heat before the 2026-06-23 and 2026-06-24 order cutoffs before converting the new watchlist into execution calls. - Treat 02667 as a stale/special HKEX page item until a fresh June timetable or official result appears.	2026-06-21 09:05:13 +00:00
geometrybase	fb7bf3af7d	Analyze latest HK IPO candidates Request: - Use the project analyst workflow to analyze the latest upcoming Hong Kong IPO candidates. Changes: - Refreshed recent HK IPO target coverage through 2026-06-17 and archived current HKEX source updates. - Archived 06675 allotment results and D1 Yahoo price performance for boundary-case review. - Archived a 2026-06-17 T0.5 VBKR/Jieli market-heat snapshot for still-actionable 02335 and 06106. - Rebuilt the v0 analysis dataset and snapshots at 2026-06-17T08:20:00Z. - Added a Chinese horizontal analyst report ranking 06106, 02335, 06132, 06067, 01392, with 06675 separated as a T1/D1 review sample. Verification: - Ran SQLite PRAGMA integrity_check and foreign_key_check. - Ran git diff --check and git diff --cached --check. - Confirmed report source paths exist. Next useful context: - 06106 is the top still-actionable T0.5 candidate at this as-of time. - 02335 needs another pre-deadline heat sample before a stronger call. - 01392, 06067, and 06132 are now mainly waiting for T1 official allotment results.	2026-06-17 08:27:35 +00:00
geometrybase	a2ec016769	Add selected T0 horizontal IPO report Request: - Combine the currently selected T0 IPO reports into one cross-sectional analysis report. Changes: - Add a Chinese horizontal T0 report comparing 01392, 02335, 06067, 06106, and 06132. - Rank the selected IPOs by the current T0 model and short-exit discipline focused on T2/D1 selling. - Backfill 02335's Chinese company name from its Chinese HKEX prospectus and archive the source PDF plus extracted text. - Refresh the v0 analysis dataset and sync-state snapshots at 2026-06-15T18:20:00Z. Verification: - .venv/bin/python -m py_compile scripts/build_analysis_dataset.py scripts/generate_ipo_report.py scripts/extract_pdf_text.py scripts/update_sync_state.py - Python sqlite3 PRAGMA integrity_check returned ok and foreign_key_check returned zero rows. - Confirmed 02335 Chinese source_ref, extracted text manifest row, and selected horizontal report content. - git diff --cached --check Next useful context: - Untracked PDF exports of individual reports and the horizontal report were left out of this focused commit.	2026-06-15 15:17:06 +00:00
geometrybase	fcb795b583	Add 02335 T0 analyst report Request: - Generate an analyst report for HK IPO ticker 02335. Changes: - Archived the official HKEXnews 02335 prospectus PDF and extracted text under project-relative data paths. - Seeded 02335 T0 prospectus facts, source references, sync state, and analysis snapshots. - Generated reports/2026-06-15_02335_T0_prospectus_analysis.md in Simplified Chinese with concrete T0/T1/T2/D1 dates and short-exit T2/D1 discipline. - Made PDF text extraction tolerant of invalid Unicode surrogate characters emitted by pypdf. Verification: - Compiled archive_hkex_documents.py, generate_ipo_report.py, build_analysis_dataset.py, extract_pdf_text.py, and update_sync_state.py. - Ran SQLite integrity_check and foreign_key_check. - Verified the archived 02335 PDF hash, extracted-text manifest row, and analysis dataset row. - Ran git diff --check. Next useful context: - 02335 is currently T0_prospectus; T1_allotment is pending for 2026-06-23.	2026-06-15 15:07:44 +00:00
geometrybase	42c18131e8	Add 06067 T0 analyst report Request: - Generate an analyst report for HK IPO ticker 06067. Changes: - Archived the official HKEXnews 06067 prospectus PDF and extracted text under project-relative data paths. - Seeded 06067 T0 prospectus facts, source references, sync state, and analysis snapshots. - Generated reports/2026-06-15_06067_T0_prospectus_analysis.md in Simplified Chinese with concrete T0/T1/T2/D1 dates and short-exit T2/D1 discipline. - Updated the HKEX document archiver so over-allotment shares are only recorded when the prospectus supports them, with explicit no-option cases stored as zero. Verification: - Compiled archive_hkex_documents.py, generate_ipo_report.py, build_analysis_dataset.py, extract_pdf_text.py, and update_sync_state.py. - Ran SQLite integrity_check and foreign_key_check. - Verified the archived 06067 PDF hash, extracted-text manifest row, and analysis dataset row. - Ran git diff --check. Next useful context: - 06067 is currently T0_prospectus; T1_allotment is pending for 2026-06-22.	2026-06-15 15:03:07 +00:00
geometrybase	77b405e4f3	Add T0 analyst reports for active IPOs Request: - Analyze HK IPO ticker 01392 with the analyst skill. - Preserve the in-flight 06132 archive/report work already created for the prior request. Changes: - Archived official HKEX prospectus PDFs and extracted text for 01392 and 06132. - Seeded structured T0 facts into the SQLite archive and refreshed CSV snapshots and sync state. - Rebuilt the v0 analysis dataset and model calibration report. - Generated Simplified Chinese T0 prospectus-stage analyst reports for 01392 and 06132. - Adjusted report stage calendars so T2 uses the previous business day before D1 when listing is separated from allocation by a weekend. Verification: - Compiled modified Python scripts with in-memory syntax checks. - Ran SQLite quick_check and foreign_key_check. - Confirmed DB row counts match CSV snapshots for key tables. - Verified 01392/06132 source paths are repo-relative, raw files exist, hashes match, and PDF text manifest rows are ok. - Ran git diff --cached --check. Next useful context: - 01392 T1 is due on 2026-06-18; rerun analyst after allotment results are archived. - 06132 T1 is due on 2026-06-22; rerun analyst after allotment results are archived.	2026-06-15 14:51:44 +00:00
geometrybase	6d05056609	Backfill structured T1 demand from archived text Request: - Use archivist to close the 137 T1 ipo_demand source-only gaps using extracted PDF text. Changes: - Add an incremental T1 demand text backfill script. - Parse existing allotment-result extracted text into ipo_demand. - Archive linked Summary PDFs from old HKEX HTML allotment-result pages. - Correct allotment-result selection to prefer primary result announcements over clarification or supplemental notices. - Add robust line-aware allotment parsing and document the workflow in archivist and README. - Record the backfill result in a report. Execution: - Selected 137 source-only T1 demand gaps. - Wrote 137 ipo_demand rows, increasing ipo_demand from 154 to 291 rows. - Archived 38 new HKEX allotment-result PDFs and extracted their text. - Confirmed an incremental rerun selects 0 gaps and writes 0 rows. Verification: - Ran git diff --cached --check. - Ran py_compile for archive_hkex_documents.py and backfill_t1_demand_from_text.py. - Checked SQLite integrity and foreign keys. - Confirmed DB row counts match CSV snapshots. - Verified no T1 complete row is missing ipo_demand. - Verified source_refs paths/files/hashes and PDF extracted-text manifest hashes. Next useful context: - T1 demand structure is complete for listed rows; 06106 and 06675 remain pending_not_due. - T2 grey-market and due price-performance gaps remain separate archivist priorities. - Analyst output should be regenerated before using the new T1 demand facts for scoring.	2026-06-15 13:59:06 +00:00
geometrybase	8a0dfd88f0	Make PDF text extraction a standard archive step Request: - Add extracted PDF text generation to the archivist workflow as a standard step. Changes: - Run PDF text extraction automatically for newly archived HKEX PDF sources. - Make the PDF text extractor incremental and manifest-preserving. - Document extracted-text handling in the archivist skill and README. - Mark generated extracted text as no-diff data evidence. - Backfill extracted text for all archived PDF source references. Verification: - Ran git diff --cached --check. - Ran .venv/bin/python -m py_compile scripts/extract_pdf_text.py scripts/archive_hkex_documents.py. - Ran full PDF extraction, then confirmed an incremental rerun skips unchanged files. - Verified 557 PDF source_refs, 557 manifest rows, all status ok, and zero missing text/hash/path issues. Next useful context: - HKEX HTML notices and Yahoo JSON market data remain under data/raw and are not expected in data/extracted_text.	2026-06-15 13:27:41 +00:00
geometrybase	eae427d85b	Add PDF text extraction workflow Request: - Provide a way to install or develop a PDF extraction tool for archived HK IPO documents. Changes: - Add requirements.txt with pypdf as the lightweight PDF text extraction dependency. - Add scripts/extract_pdf_text.py to extract text from PDF source_refs into repo-relative data/extracted_text files. - Add extracted text outputs and an extracted_text_manifest snapshot for the six archived HKEXnews PDFs. - Document the extraction workflow in README.md. - Ignore .venv and keep generated SQLite/Python transient files out of git. - Use extracted text to verify the 06106 full prospectus, update source_refs, remove the related data gap, and fill 06106 offering terms. Verification: - Installed python3.14-venv system support, created a local .venv, and installed requirements.txt. - Re-ran scripts/bootstrap_historical_data.py and scripts/extract_pdf_text.py. - Verified extracted text paths and hashes against data/snapshots/extracted_text_manifest.csv. - Verified SQLite integrity and snapshot row counts. - Ran git diff --cached --check and searched durable files for machine-specific absolute paths.	2026-06-15 06:21:16 +00:00
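
Several commits above list SQLite `integrity_check` and `foreign_key_check` as verification steps. The following is a minimal sketch of how such a check could be scripted with the standard-library `sqlite3` module. The function name `check_sqlite` and the database path are assumptions made for illustration, not taken from the repository's scripts.

```python
import sqlite3


def check_sqlite(db_path):
    """Run the two PRAGMA checks named in the commit messages.

    Hypothetical helper: returns (integrity_ok, fk_violations), where
    fk_violations is the list of rows reported by foreign_key_check.
    """
    conn = sqlite3.connect(db_path)
    try:
        # integrity_check yields a single row containing "ok" for a healthy database.
        integrity = [row[0] for row in conn.execute("PRAGMA integrity_check")]
        # foreign_key_check yields one row per violating child row, zero rows if clean.
        fk_violations = conn.execute("PRAGMA foreign_key_check").fetchall()
    finally:
        conn.close()
    return integrity == ["ok"], fk_violations
```

A caller could treat the archive as verified only when the first value is `True` and the second is an empty list, which matches the "returned ok" and "returned zero rows" wording in commit a2ec016769.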

9 Commits
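
The commits repeatedly verify that `source_refs` paths are repo-relative, that the files exist, and that their SHA-256 hashes match. A sketch of that kind of check is shown below. It assumes each reference is available as a `(relative_path, expected_sha256)` pair; the helper names `sha256_of` and `verify_refs` are hypothetical, and the real schema may differ.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large PDFs are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_refs(repo_root, refs):
    """Return a list of (relative_path, problem) tuples; empty means all refs pass.

    refs is an iterable of (relative_path, expected_sha256) pairs (assumed shape).
    """
    problems = []
    root = Path(repo_root)
    for rel_path, expected in refs:
        if Path(rel_path).is_absolute():
            # Machine-specific absolute paths should not appear in durable files.
            problems.append((rel_path, "absolute path"))
            continue
        target = root / rel_path
        if not target.is_file():
            problems.append((rel_path, "missing file"))
        elif sha256_of(target) != expected.lower():
            problems.append((rel_path, "hash mismatch"))
    return problems
```

Returning the full list of problems, rather than stopping at the first failure, makes it easier to report every broken reference in a single verification run.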