geometrybase d3b67fa473 Refresh latest IPO candidates

Request:
Run the scheduled HK IPO analyst refresh as of 2026-06-24T07:00:26Z, refresh online archive facts first, rebuild the analysis dataset, write the latest Chinese broad candidate report, mirror it to reports/README.md, and preserve stage discipline.

Changes:
- Refreshed HKEX English and Chinese current-listing pages, rolling recent-listing coverage, current HKEX document searches, VBKR/Jieli live T0.95 market heat, ipohk external history, Yahoo same-day D1 evidence, and A/H quote/FX evidence.
- Archived 5 still-actionable T0_95_final_heat rows for 00668, 02697, 03952, 06715, and 06915 while leaving post-deadline names on their last pre-deadline heat snapshots.
- Rebuilt analysis_model_v0_dataset.csv and related snapshots for 2026-06-24T07:00:26Z.
- Updated reports/2026-06-24_latest_ipo_candidates_analysis.md in Simplified Chinese and mirrored identical content to reports/README.md.
- Kept unofficial heat in ipo_market_heat only, preserved official T1 demand from HKEX allotment-result sources, and kept 02335/06106 D1 rows labelled as in-session rather than final D1 confirmation.

Verification:
- git diff --check
- git diff --cached --check
- Rebuilt analysis dataset for 2026-06-24T07:00:26Z with 312 rows
- Python check that reports/README.md matches the dated report and all unresolved current/recent tickers are covered in the risk table
- Python check that 5 current-run heat rows are T0_95_final_heat with provider VBKR/Jieli and active heat tickers have no official ipo_demand rows
- Python check that 02335 and 06106 official T1 fields match HKEX allotment results
- Python check that 89 source_refs archived at 2026-06-24T07:00:26Z use repo-relative paths, files exist, and hashes match
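The source_refs check in the last verification item could look roughly like the sketch below. It is not the repository's actual check: the function name `check_source_ref`, the use of SHA-256, and the idea that each source_ref carries a relative path plus an expected hash are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path, PurePosixPath


def check_source_ref(repo_root: Path, rel_path: str, expected_sha256: str) -> list[str]:
    """Return a list of problems for one source_ref; an empty list means it passed.

    Hypothetical helper: the real archive may store hashes differently.
    """
    pure = PurePosixPath(rel_path)
    # Repo-relative means: not absolute and never climbing out of the repo.
    if pure.is_absolute() or ".." in pure.parts:
        return [f"{rel_path}: not a repo-relative path"]
    target = repo_root / rel_path
    if not target.is_file():
        return [f"{rel_path}: file missing"]
    digest = hashlib.sha256(target.read_bytes()).hexdigest()
    if digest != expected_sha256:
        return [f"{rel_path}: hash mismatch"]
    return []
```

Running this over every archived source_ref and asserting that the combined problem list is empty covers all three conditions in one pass: repo-relative path, file exists, hash matches.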

Next useful context:
- At 15:00 HKT, only 00668, 02697, 03952, 06715, and 06915 remained still-actionable by public subscription timetable; 06715 crossed 10x heat and upgraded to a small secondary candidate.
- 02335 and 06106 have refreshed 2026-06-24 Yahoo D1 rows before the Hong Kong close; treat them as in-session execution evidence, not final D1 review labels.
- T2 grey-market remains a source-strategy data_gap for recent June IPOs without approved reproducible evidence.

2026-06-24 07:15:05 +00:00

HK IPO Analysis Model v0

Model version: ipo_score_v0
Analysis as of: 2026-06-24T07:00:26Z
Rule file: rules/ipo_score_v0.yaml
Dataset: data/snapshots/analysis_model_v0_dataset.csv

What This Model Does

This is the first analyst model built from the downloaded archive. It creates a repeatable feature table, scores each IPO using stage-safe rules, and calibrates the score buckets against archived D1 sell outcomes. It is intentionally transparent: the output includes every score component and the archived source paths used for each ticker.

The model is built for a short IPO allocation trade: sell in the T2 grey market when reliable executable data exists, or sell on D1 otherwise. v0 does not use grey-market data because T2 currently has no approved reproducible source. It also does not use post-listing returns as inputs; D1 is the primary sell label, while D5/D20/D60 are review labels only.

Data Inventory

IPO rows scored: 312
Rows with D1 labels: 280
Rows with structured T1 demand fields: 297
Rows with prospectus source path: 312
Rows with allotment source path: 297
Rows with offer size: 312
Rows with public oversubscription: 287
Rows with international oversubscription: 282
Rows with market heat snapshots: 19
Rows with T0.5 margin heat snapshots: 5
Rows with T0.95 late-order heat snapshots: 14
Rows with T0.5 margin heat and D1 labels: 5
Rows with T0.95 late-order heat and D1 labels: 0
Rows matched to external ipohk history: 102
Rows with external final oversubscription: 95
Rows with external final oversubscription and D1 labels: 86
Rows pending T1 structure: 15 (00668, 01191, 01688, 01956, 02272, 02667, 02672, 02697, 03661, 03952, 06228, 06715, 06915, 09630, 09637)
T1 field-level blanks: public oversubscription 10, international oversubscription 15, valid applications 6, successful applications 18
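Coverage counts like the ones above can be reproduced by counting non-blank cells per column. The sketch below is only an illustration: the column names `public_oversubscription` and `d1_return`, the helper `count_populated`, and the three sample rows are made up and are not taken from analysis_model_v0_dataset.csv.

```python
import csv
import io


def count_populated(rows: list[dict], column: str) -> int:
    """Count rows whose value in `column` is present and not blank."""
    return sum(1 for row in rows if (row.get(column) or "").strip())


# Made-up sample standing in for the real dataset; tickers A/B/C are placeholders.
sample = io.StringIO(
    "ticker,public_oversubscription,d1_return\n"
    "A,12.5,3.1\n"
    "B,,\n"
    "C,250.0,\n"
)
rows = list(csv.DictReader(sample))
coverage = {
    col: count_populated(rows, col)
    for col in ("public_oversubscription", "d1_return")
}
print(len(rows), coverage)
```

Counting blanks per field rather than per row is what lets the inventory separate rows still pending T1 structure from rows that have T1 structure but are missing an individual field.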

T0 Calibration

T0 uses only prospectus-stage structure: offer size, initial public offer percentage, minimum subscription amount, offer price band, and over-allotment availability.

Bucket	N	D1 positive	D1 >= 10%	Avg D1 return (%)	Median D1 return (%)
t0_lt_1	36	58.3%	33.3%	12.8	2.3
t0_1_to_4	60	63.3%	40.0%	9.6	3.1
t0_5_to_7	107	73.8%	52.3%	42.6	14.1
t0_gte_8	77	76.6%	48.1%	29.4	9.8

T1 Calibration

T1 adds allotment-stage demand: public subscription, international placing demand, valid application count, application success rate, and HK public offer reallocation.

Bucket	N	D1 positive	D1 >= 10%	Avg D1 return (%)	Median D1 return (%)
total_lt_0	68	61.8%	23.5%	0.4	1.0
total_0_to_9	68	58.8%	30.9%	3.3	0.2
total_10_to_17	29	55.2%	34.5%	13.9	1.5
total_18_to_25	49	75.5%	51.0%	31.3	13.4
total_gte_26	66	93.9%	86.4%	85.8	78.1
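A calibration table of this shape can be produced by bucketing each score and summarising D1 returns per bucket. The sketch below is a guess at the mechanics, not the code in scripts/build_analysis_dataset.py: the bucket edges are inferred from the bucket names, "D1 positive" is assumed to mean a return strictly above zero, returns are assumed to be in percent, and `t1_bucket` and `calibrate` are hypothetical names.

```python
from statistics import mean, median


def t1_bucket(total_score: float) -> str:
    """Map a total score to a bucket label (edges inferred from the labels)."""
    if total_score < 0:
        return "total_lt_0"
    if total_score < 10:
        return "total_0_to_9"
    if total_score < 18:
        return "total_10_to_17"
    if total_score < 26:
        return "total_18_to_25"
    return "total_gte_26"


def calibrate(observations: list[tuple[float, float]]) -> dict[str, dict]:
    """Summarise D1 returns (percent) per bucket from (score, d1_return) pairs."""
    grouped: dict[str, list[float]] = {}
    for score, d1_return in observations:
        grouped.setdefault(t1_bucket(score), []).append(d1_return)
    return {
        bucket: {
            "n": len(returns),
            "d1_positive": sum(r > 0 for r in returns) / len(returns),
            "d1_gte_10": sum(r >= 10 for r in returns) / len(returns),
            "avg_d1": mean(returns),
            "median_d1": median(returns),
        }
        for bucket, returns in grouped.items()
    }
```

Only rows that already have a D1 label should be passed in, so the bucket counts sum to the number of labelled rows rather than the number of scored rows.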

T0.5 Market Heat

T0.5 uses archived subscription-period margin heat snapshots. T0.95 is the near-deadline subset that is still actionable before the user's order cutoff. These are non-official live signals and are kept separate from T1 allotment demand. The current archive is not yet a historical training set: it has only 19 rows, of which the 5 T0.5 rows carry D1 labels and none of the 14 T0.95 rows do, which is too few for calibration.

Total market heat rows: 19
T0.5 margin rows: 5
T0.5 rows with D1 labels: 5
T0.95 late-order heat rows: 14
T0.95 rows with D1 labels: 0

External Final Heat Proxy

The ipohk history archive adds final public oversubscription, one-lot win rate, grey-market return, and first-day return where available. These fields are useful for coverage checks and post-hoc calibration, but they are not T0.5 inputs because they are final or near-final history.

External history rows matched into this dataset: 102
Matched rows with final oversubscription: 95
Matched rows with final oversubscription and D1 labels: 86

Bucket	N	D1 positive	D1 >= 10%	Avg D1 return (%)	Median D1 return (%)
external_os_lt_10x	6	50.0%	16.7%	4.7	-4.1
external_os_10x_to_100x	7	28.6%	14.3%	-23.0	-21.9
external_os_100x_to_1000x	21	61.9%	38.1%	8.8	4.2
external_os_1000x_to_5000x	34	94.1%	79.4%	60.7	56.7
external_os_gte_5000x	18	83.3%	72.2%	101.7	89.7
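The oversubscription buckets above can be assigned with a sorted list of edges. This is a minimal sketch under assumptions: the edges 10, 100, 1000, and 5000 are read off the bucket names, lower bounds are assumed to be inclusive, and `external_os_bucket` is a hypothetical name.

```python
from bisect import bisect_right

# Edges inferred from the bucket labels; lower bounds assumed inclusive.
_EDGES = [10, 100, 1000, 5000]
_LABELS = [
    "external_os_lt_10x",
    "external_os_10x_to_100x",
    "external_os_100x_to_1000x",
    "external_os_1000x_to_5000x",
    "external_os_gte_5000x",
]


def external_os_bucket(multiple: float) -> str:
    """Return the bucket label for a final public oversubscription multiple."""
    # bisect_right counts how many edges are <= multiple, which is the label index.
    return _LABELS[bisect_right(_EDGES, multiple)]
```

Keeping the edges in one list means a change to the bucket boundaries touches a single place, which matters if these buckets are later reviewed against new outcomes.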

Current Read

After the T1 demand text backfill, the strongest v0 T1 bucket is total_gte_26 with 66 historical D1 observations and a 93.9% D1 positive rate. The model is most useful after allotment results are available; T0 is a watchlist filter rather than a final subscription call.

The high-conviction bucket remains clearly differentiated, but the middle and low score buckets are still not monotonic: in the T1 table, total_10_to_17 has a lower D1 positive rate (55.2%) than both total_0_to_9 (58.8%) and total_lt_0 (61.8%). This refresh keeps the v0 score formula unchanged and updates empirical calibration only; future rule changes should come from reviewed prediction cards rather than from overfitting to this historical sample.

Usage

Run scripts/build_analysis_dataset.py after the archivist updates the database.
Use t0_score for prospectus-stage watchlisting.
Use total_score, decision_band, and calibrated_d1_positive_rate for T1-stage subscription cards.
Frame live decisions around a T2 or D1 sell, not long-term holding.
Treat D5/D20/D60 columns as review labels only, never as prediction inputs or holding targets.
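One way to enforce the last rule is to strip outcome columns before a row reaches any scoring or prediction code. The sketch below assumes a naming convention that is not confirmed by the source: that outcome columns start with `d1_`, `d5_`, `d20_`, or `d60_`. Column names such as `d1_return` and the helper `prediction_inputs` are hypothetical.

```python
# Assumed prefixes for post-listing outcome columns (hypothetical convention).
OUTCOME_PREFIXES = ("d1_", "d5_", "d20_", "d60_")


def prediction_inputs(row: dict) -> dict:
    """Return a copy of `row` without post-listing outcome columns."""
    return {
        name: value
        for name, value in row.items()
        if not name.lower().startswith(OUTCOME_PREFIXES)
    }


example = {
    "t0_score": 6,
    "total_score": 27,
    "calibrated_d1_positive_rate": 0.939,
    "d1_return": 12.0,
    "d5_return": 3.0,
    "d60_return": -1.0,
}
print(prediction_inputs(example))
```

D1 is dropped here as well because the model treats it as the sell label rather than an input; `calibrated_d1_positive_rate` survives because it is derived from historical buckets, not from the row's own outcome.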

Known Gaps

T1 is structurally complete for listed rows; residual field-level NULLs remain when the archived source does not explicitly state a demand field.
Industry and issuer fundamentals are not sufficiently structured for model input.
T2 grey-market signal is blocked pending an approved source.
Extreme D1 returns should be audited before they drive rule changes.
