3 Commits

Author SHA1 Message Date
geometrybase 5b9835c289 Refresh latest IPO candidates
Request:
- Run the scheduled HK IPO analyst refresh as of 2026-06-23T15:00:19Z: refresh online archive facts first, rebuild the analysis dataset, write the latest Chinese broad candidate report, mirror it to reports/README.md, and preserve stage discipline.

Changes:
- Refreshed HKEX current-listing pages, VBKR/Jieli T0.95 market heat, ipohk external history, A/H quote evidence, and current HKEX document searches.
- Archived official HKEX allotment-result PDFs and extracted text for 02335 and 06106; parsed official T1 demand into ipo_demand without copying market heat into official fields.
- Rebuilt analysis_model_v0_dataset.csv and refreshed sync/source snapshots.
- Updated reports/2026-06-23_latest_ipo_candidates_analysis.md and mirrored the same content to reports/README.md, including current ranking, fundamentals, unresolved-D1 risk/reward table, closed/waiting names, 30-day review, guardrails, and sources.

Verification:
- Ran git diff --check
- Rebuilt analysis dataset for 2026-06-23T15:00:19Z
- Python check that reports/README.md matches the dated report and required new facts are present
- Python check that 15:00Z heat has 8 ipo_market_heat rows and current actionable names have no official ipo_demand rows
- Python check that 02335 and 06106 official T1 fields match HKEX allotment results
- Python check that 77 source refs archived at 2026-06-23T15:00:19Z use repo-relative paths, files exist, and hashes match
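
A check like the last verification item (repo-relative paths, files exist, hashes match) could be sketched as below. This is a minimal illustration, not the repository's actual script: the function name `check_source_refs` and the assumption that each source ref is a `(relative_path, sha256_hex)` pair are hypothetical, since the log does not show the source_refs schema or the hash algorithm.

```python
import hashlib
from pathlib import Path, PurePosixPath


def check_source_refs(repo_root, refs):
    """Return (path, problem) pairs for refs that fail a check.

    Hypothetical helper: assumes refs are (relative_path, sha256_hex) pairs.
    """
    problems = []
    root = Path(repo_root)
    for rel_path, expected_hash in refs:
        pure = PurePosixPath(rel_path)
        # Reject absolute paths and anything that climbs out of the repo.
        if pure.is_absolute() or ".." in pure.parts:
            problems.append((rel_path, "not repo-relative"))
            continue
        target = root.joinpath(*pure.parts)
        if not target.is_file():
            problems.append((rel_path, "missing file"))
            continue
        actual_hash = hashlib.sha256(target.read_bytes()).hexdigest()
        if actual_hash != expected_hash.lower():
            problems.append((rel_path, "hash mismatch"))
    return problems
```

An empty result means every ref passed all three checks; any other result names the failing path and the first check it failed.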

Next useful context:
- 02335 and 06106 now have official T1 demand, but D1/T2 remain data_gap until listing-day evidence is archived.
- 00901 Yahoo D1 fetch still returns 404; ipohk remains only a third-party cross-check.
2026-06-23 15:13:18 +00:00
geometrybase fb7bf3af7d Analyze latest HK IPO candidates
Request:
- Use the project analyst workflow to analyze the latest upcoming Hong Kong IPO candidates.

Changes:
- Refreshed recent HK IPO target coverage through 2026-06-17 and archived current HKEX source updates.
- Archived 06675 allotment results and D1 Yahoo price performance for boundary-case review.
- Archived a 2026-06-17 T0.5 VBKR/Jieli market-heat snapshot for still-actionable 02335 and 06106.
- Rebuilt the v0 analysis dataset and snapshots at 2026-06-17T08:20:00Z.
- Added a Chinese horizontal analyst report ranking 06106, 02335, 06132, 06067, 01392, with 06675 separated as a T1/D1 review sample.

Verification:
- Ran SQLite PRAGMA integrity_check and foreign_key_check.
- Ran git diff --check and git diff --cached --check.
- Confirmed report source paths exist.
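
The two SQLite checks named above can be run from Python's standard `sqlite3` module. The sketch below is only an illustration; the function name `check_sqlite` is hypothetical, and the log does not say how the checks were actually invoked.

```python
import sqlite3


def check_sqlite(db_path):
    """Run SQLite's built-in consistency checks and return the findings."""
    conn = sqlite3.connect(db_path)
    try:
        # integrity_check yields a single "ok" row when nothing is wrong.
        integrity = [row[0] for row in conn.execute("PRAGMA integrity_check")]
        # foreign_key_check yields one row per violating row, none when clean.
        fk_violations = conn.execute("PRAGMA foreign_key_check").fetchall()
    finally:
        conn.close()
    return integrity, fk_violations
```

A healthy database returns `["ok"]` and an empty violation list. `PRAGMA foreign_key_check` reports violations even when foreign-key enforcement is switched off for the connection, which makes it useful as an after-the-fact audit.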

Next useful context:
- 06106 is the top still-actionable T0.5 candidate at this as-of time.
- 02335 needs another pre-deadline heat sample before a stronger call.
- 01392, 06067, and 06132 are now mainly waiting for T1 official allotment results.
2026-06-17 08:27:35 +00:00
geometrybase eae427d85b Add PDF text extraction workflow
Request:
- Provide a way to install or develop a PDF extraction tool for archived HK IPO documents.

Changes:
- Add requirements.txt with pypdf as the lightweight PDF text extraction dependency.
- Add scripts/extract_pdf_text.py to extract text from PDF source_refs into repo-relative data/extracted_text files.
- Add extracted text outputs and an extracted_text_manifest snapshot for the six archived HKEXnews PDFs.
- Document the extraction workflow in README.md.
- Ignore .venv and keep generated SQLite/Python transient files out of git.
- Use extracted text to verify the 06106 full prospectus, update source_refs, remove the related data gap, and fill 06106 offering terms.
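
The extraction step described in these changes might look roughly like the sketch below. It is not the contents of scripts/extract_pdf_text.py, which the log does not show: the function names, the flat output naming, and the example input path are assumptions. Only `pypdf.PdfReader`, `reader.pages`, and `page.extract_text()` are real pypdf calls.

```python
from pathlib import Path


def extracted_text_path(pdf_rel_path, out_dir="data/extracted_text"):
    """Map a repo-relative PDF path to a repo-relative .txt output path."""
    txt_name = Path(pdf_rel_path).with_suffix(".txt").name
    return (Path(out_dir) / txt_name).as_posix()


def extract_pdf_text(repo_root, pdf_rel_path):
    """Extract text from one PDF and write it under data/extracted_text."""
    from pypdf import PdfReader  # lazy import; needs `pip install pypdf`

    root = Path(repo_root)
    reader = PdfReader(root / pdf_rel_path)
    # extract_text() can return an empty string for image-only pages.
    text = "\n".join(page.extract_text() or "" for page in reader.pages)
    out_rel = extracted_text_path(pdf_rel_path)
    out_file = root / out_rel
    out_file.parent.mkdir(parents=True, exist_ok=True)
    out_file.write_text(text, encoding="utf-8")
    # Return the repo-relative path so it can be stored without machine paths.
    return out_rel
```

For example, `extracted_text_path("data/sources/example.pdf")` gives `data/extracted_text/example.txt`, where `data/sources/example.pdf` is a made-up input path. Returning the relative path rather than the absolute one keeps manifests portable across machines.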

Verification:
- Installed python3.14-venv system support, created a local .venv, and installed requirements.txt.
- Re-ran scripts/bootstrap_historical_data.py and scripts/extract_pdf_text.py.
- Verified extracted text paths and hashes against data/snapshots/extracted_text_manifest.csv.
- Verified SQLite integrity and snapshot row counts.
- Ran git diff --cached --check and searched durable files for machine-specific absolute paths.
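
The search for machine-specific absolute paths could be approximated with a small scan like the one below. The pattern list and the name `find_absolute_paths` are assumptions chosen for illustration; the log does not state which patterns were searched for.

```python
import re

# Assumed markers of machine-specific paths; adjust for the machines in use.
ABSOLUTE_PATH_RE = re.compile(r"(/home/|/Users/|/root/|[A-Za-z]:\\)")


def find_absolute_paths(text):
    """Return (line_number, line) pairs that contain a machine-specific path."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if ABSOLUTE_PATH_RE.search(line)
    ]
```

A simple substring pattern like this can flag false positives, such as a URL containing `/home/`, so hits are best treated as candidates for manual review.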
2026-06-15 06:21:16 +00:00