hk-ipo/data/snapshots/ipo_master.csv
geometrybase eae427d85b Add PDF text extraction workflow
Request:
- Provide a way to install or develop a PDF extraction tool for archived HK IPO documents.

Changes:
- Add requirements.txt with pypdf as the lightweight PDF text extraction dependency.
- Add scripts/extract_pdf_text.py to extract text from PDF source_refs into repo-relative data/extracted_text files.
- Add extracted text outputs and an extracted_text_manifest snapshot for the six archived HKEXnews PDFs.
- Document the extraction workflow in README.md.
- Ignore .venv and keep generated SQLite/Python transient files out of git.
- Use extracted text to verify the 06106 full prospectus, update source_refs, remove the related data gap, and fill 06106 offering terms.
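As a rough illustration of the extraction step listed above, the following is a minimal sketch and not the contents of scripts/extract_pdf_text.py. It assumes pypdf's `PdfReader` and per-page `extract_text()`; the helper names `output_path_for` and `extract_pdf_text`, and the one-.txt-per-PDF naming, are hypothetical.

```python
from pathlib import Path


def output_path_for(pdf_path: Path, out_dir: Path) -> Path:
    """Map a source PDF to a .txt file with the same stem under out_dir.

    The naming scheme is an assumption made for this sketch.
    """
    return out_dir / (pdf_path.stem + ".txt")


def extract_pdf_text(pdf_path: Path, out_dir: Path) -> Path:
    """Extract text page by page and write it to a UTF-8 text file."""
    # Imported lazily so the path helper stays usable without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(str(pdf_path))
    # extract_text() may yield nothing useful for image-only pages,
    # so fall back to an empty string for those.
    pages = [(page.extract_text() or "") for page in reader.pages]

    out_path = output_path_for(pdf_path, out_dir)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    # Separate pages with a blank line so page boundaries stay visible.
    out_path.write_text("\n\n".join(pages), encoding="utf-8")
    return out_path
```

Passing repo-relative `Path` objects (for example an output directory of `data/extracted_text`) keeps the returned path repo-relative, which matches the goal of keeping machine-specific absolute paths out of durable files.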

Verification:
- Installed the python3.14-venv system package, created a local .venv, and installed the dependencies from requirements.txt.
- Re-ran scripts/bootstrap_historical_data.py and scripts/extract_pdf_text.py.
- Verified extracted text paths and hashes against data/snapshots/extracted_text_manifest.csv.
- Verified SQLite integrity and snapshot row counts.
- Ran git diff --cached --check and searched durable files for machine-specific absolute paths.
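The path and hash check in the verification list could look roughly like the sketch below. The manifest column names `text_path` and `sha256` are assumptions for illustration; the actual columns of data/snapshots/extracted_text_manifest.csv are not shown here and may differ.

```python
import csv
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path: Path, repo_root: Path) -> list[str]:
    """Return a list of problems; an empty list means every row checked out."""
    problems = []
    with manifest_path.open(newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            rel = row["text_path"]  # assumed column name
            target = repo_root / rel
            if Path(rel).is_absolute():
                # Durable files should only carry repo-relative paths.
                problems.append(f"{rel}: absolute path")
            elif not target.is_file():
                problems.append(f"{rel}: missing")
            elif sha256_of(target) != row["sha256"]:  # assumed column name
                problems.append(f"{rel}: hash mismatch")
    return problems
```

Calling `verify_manifest(Path("data/snapshots/extracted_text_manifest.csv"), Path("."))` from the repository root would then return an empty list when all extracted text files match their recorded hashes.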
2026-06-15 06:21:16 +00:00

5 lines, 1.1 KiB, CSV

ticker,company_name_en,company_name_zh,stock_short_name,exchange,board,status,listing_date,application_start_date,application_end_date,allotment_results_expected_date,industry_label,data_as_of,notes
06106,"Shanghai Seer Intelligent Technology Co., Ltd.",上海仙工智能科技股份有限公司,,HKEX,Main Board,open_for_subscription,2026-06-24,2026-06-15,2026-06-18,2026-06-23,Industrial intelligent robots / robot controllers,2026-06-15T06:15:00Z,Seeded from HKEXnews global offering announcement; full prospectus source classification needs follow-up.
06658,"Liuliumei Co., Ltd.",溜溜梅股份有限公司,LIULIUMEI,HKEX,Main Board,listed,2026-06-15,2026-06-05,2026-06-10,2026-06-12,Snack food / preserved fruit,2026-06-15T06:15:00Z,Seeded from HKEXnews prospectus and allotment results.
06675,"SENASIC Electronics Technology Co., Ltd.",琻捷電子科技(江蘇)股份有限公司,,HKEX,Main Board,pending_listing,2026-06-17,2026-06-09,2026-06-12,2026-06-16,Automotive wireless sensing SoC / semiconductors,2026-06-15T06:15:00Z,Seeded from HKEXnews prospectus and global offering announcement; allotment results not yet archived.