Request:
- Re-analyze the IPO model using the updated historical archive after T1 demand backfill.

Changes:
- Regenerate the v0 analysis dataset from the current SQLite archive.
- Refresh the v0 calibration report with expanded T1 coverage and new empirical bucket rates.
- Update the report template to show pending T1 rows and field-level blanks.
- Clarify v0 limitations and record why the score formula stays unchanged for this refresh.

Verification:
- Ran scripts/build_analysis_dataset.py against data/hk_ipo.sqlite.
- Ran py_compile for scripts/build_analysis_dataset.py.
- Checked dataset row count, T1 demand coverage, source-only T1 gaps, and repo-relative paths.
- Ran git diff --check.

Next useful context:
- T1 structured coverage is now 291 rows, with 06106 and 06675 still pending_not_due.
- The high-conviction T1 bucket remains differentiated, but middle and low buckets are still not monotonic enough for a v1 rule change.
HK IPO Analysis Model v0
- Model version: `ipo_score_v0`
- Analysis as of: `2026-06-15T14:04:34Z`
- Rule file: `rules/ipo_score_v0.yaml`
- Dataset: `data/snapshots/analysis_model_v0_dataset.csv`
What This Model Does
This is the first analyst model built from the downloaded archive. It creates a repeatable feature table, scores each IPO using stage-safe rules, and calibrates the score buckets against archived D1 outcomes. It is intentionally transparent: the output includes every score component and the archived source paths used for each ticker.
The model does not use grey-market data in v0 because T2 currently has no approved reproducible source. It also does not use post-listing returns as inputs; returns are labels only.
Data Inventory
- IPO rows scored: 293
- Rows with D1 labels: 273
- Rows with structured T1 demand fields: 291
- Rows with prospectus source path: 293
- Rows with allotment source path: 291
- Rows with offer size: 293
- Rows with public oversubscription: 281
- Rows with international oversubscription: 277
- Rows pending T1 structure: 2 (06106, 06675)
- T1 field-level blanks: public oversubscription 10, international oversubscription 14, valid applications 6, successful applications 18
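The coverage counts above can be re-derived from the dataset CSV with a simple filled-versus-blank tally. The sketch below is illustrative only: the column names (`public_oversubscription`, `international_oversubscription`) and the inline sample rows are assumptions, not the actual schema or data of `analysis_model_v0_dataset.csv`.

```python
import csv
import io


def count_non_blank(rows, column):
    """Count rows where `column` is present and not an empty string."""
    return sum(1 for row in rows if (row.get(column) or "").strip() != "")


def coverage_report(rows, columns):
    """Return {column: (filled, blank)} for each requested column."""
    total = len(rows)
    report = {}
    for col in columns:
        filled = count_non_blank(rows, col)
        report[col] = (filled, total - filled)
    return report


# Made-up sample standing in for the real dataset; column names are hypothetical.
SAMPLE_CSV = """ticker,public_oversubscription,international_oversubscription
SAMPLE1,12.5,1.8
SAMPLE2,,2.1
SAMPLE3,3.0,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = coverage_report(
    rows, ["public_oversubscription", "international_oversubscription"]
)
for col, (filled, blank) in report.items():
    print(f"{col}: {filled} filled, {blank} blank")
```

In the inventory itself, the field-level blanks are counted against the 291 rows with structured T1 fields, for example 291 - 281 = 10 for public oversubscription and 291 - 277 = 14 for international oversubscription.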
T0 Calibration
T0 uses only prospectus-stage structure: offer size, initial public offer percentage, minimum subscription amount, offer price band, and over-allotment availability.
| Bucket | N | D1 positive | D1 >= 10% | Avg D1 return (%) | Median D1 return (%) |
|---|---|---|---|---|---|
| t0_lt_1 | 36 | 58.3% | 33.3% | 12.8 | 2.3 |
| t0_1_to_4 | 60 | 63.3% | 40.0% | 9.6 | 3.1 |
| t0_5_to_7 | 105 | 73.3% | 51.4% | 40.1 | 13.2 |
| t0_gte_8 | 72 | 76.4% | 47.2% | 28.6 | 9.6 |
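The per-bucket columns in the table can be computed with a plain group-and-summarise pass. This is a minimal sketch rather than the logic in `scripts/build_analysis_dataset.py`: it assumes D1 returns are expressed in percent, that "D1 positive" means a return strictly above zero, and it runs on made-up sample values.

```python
from statistics import mean, median


def calibrate(records):
    """Summarise (bucket, d1_return_pct) pairs per bucket."""
    by_bucket = {}
    for bucket, ret in records:
        by_bucket.setdefault(bucket, []).append(ret)

    summary = {}
    for bucket, rets in by_bucket.items():
        n = len(rets)
        summary[bucket] = {
            "n": n,
            # Assumption: "positive" is strictly greater than zero.
            "d1_positive": sum(r > 0 for r in rets) / n,
            "d1_gte_10": sum(r >= 10 for r in rets) / n,
            "avg": mean(rets),
            "median": median(rets),
        }
    return summary


# Made-up returns for illustration only; not taken from the archive.
sample = [
    ("t0_lt_1", -2.0),
    ("t0_lt_1", 4.0),
    ("t0_gte_8", 15.0),
    ("t0_gte_8", 25.0),
    ("t0_gte_8", -5.0),
]
summary = calibrate(sample)
for bucket, stats in summary.items():
    print(bucket, stats)
```

Only rows with a D1 label can contribute to these statistics, which is consistent with the bucket sizes in the table summing to the 273 labelled rows.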
T1 Calibration
T1 adds allotment-stage demand: public subscription, international placing demand, valid application count, application success rate, and HK public offer reallocation.
| Bucket | N | D1 positive | D1 >= 10% | Avg D1 return (%) | Median D1 return (%) |
|---|---|---|---|---|---|
| total_lt_0 | 68 | 61.8% | 23.5% | 0.4 | 1.0 |
| total_0_to_9 | 68 | 58.8% | 30.9% | 3.3 | 0.2 |
| total_10_to_17 | 29 | 55.2% | 34.5% | 13.9 | 1.5 |
| total_18_to_25 | 49 | 75.5% | 51.0% | 31.3 | 13.4 |
| total_gte_26 | 59 | 94.9% | 88.1% | 86.7 | 80.0 |
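The bucket labels in both calibration tables imply fixed score thresholds. The sketch below maps a score to its bucket label using those thresholds; the boundaries are inferred from the label names only, and the handling of fractional scores (for example, 9.5 landing in `total_0_to_9`) is an assumption, since the rule file is not reproduced here.

```python
from bisect import bisect_right

# Lower edges of every bucket except the first, inferred from the label names.
T0_EDGES = [1, 5, 8]
T0_LABELS = ["t0_lt_1", "t0_1_to_4", "t0_5_to_7", "t0_gte_8"]

TOTAL_EDGES = [0, 10, 18, 26]
TOTAL_LABELS = [
    "total_lt_0",
    "total_0_to_9",
    "total_10_to_17",
    "total_18_to_25",
    "total_gte_26",
]


def bucket_for(score, edges, labels):
    """Return the label whose half-open range [edge, next_edge) contains score."""
    return labels[bisect_right(edges, score)]


print(bucket_for(4, T0_EDGES, T0_LABELS))          # prospectus-stage score
print(bucket_for(26, TOTAL_EDGES, TOTAL_LABELS))   # allotment-stage total score
```

Keeping the edges and labels in parallel lists makes it easy to compare alternative bucket boundaries without touching the lookup logic.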
Current Read
After the T1 demand text backfill, the strongest v0 T1 bucket is total_gte_26 with 59 historical D1 observations and a 94.9% D1 positive rate. The model is most useful after allotment results are available; T0 is a watchlist filter rather than a final subscription call.
The high-conviction bucket remains clearly differentiated, but the middle and low score buckets are still not monotonic: the D1 positive rate falls from 61.8% in total_lt_0 to 58.8% in total_0_to_9 and 55.2% in total_10_to_17 before rising to 75.5% in total_18_to_25. This refresh keeps the v0 score formula unchanged and updates empirical calibration only; future rule changes should come from reviewed prediction cards rather than overfitting this historical sample.
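The monotonicity concern can be checked mechanically by ordering the T1 buckets by score and flagging any adjacent pair where the D1 positive rate drops. The rates below are copied from the T1 calibration table; the helper name `monotonic_violations` is hypothetical and not part of the repository.

```python
# D1 positive rates (%) from the T1 calibration table, ordered by ascending score.
T1_D1_POSITIVE = [
    ("total_lt_0", 61.8),
    ("total_0_to_9", 58.8),
    ("total_10_to_17", 55.2),
    ("total_18_to_25", 75.5),
    ("total_gte_26", 94.9),
]


def monotonic_violations(ordered):
    """Return adjacent bucket pairs where the rate falls as the score rises."""
    return [
        (lo_name, hi_name)
        for (lo_name, lo_rate), (hi_name, hi_rate) in zip(ordered, ordered[1:])
        if hi_rate < lo_rate
    ]


violations = monotonic_violations(T1_D1_POSITIVE)
for lo_name, hi_name in violations:
    print(f"rate drops from {lo_name} to {hi_name}")
```

On these figures the check flags the two lowest transitions, while the step from `total_18_to_25` to `total_gte_26` is cleanly increasing.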
Usage
- Run `scripts/build_analysis_dataset.py` after archivist updates the database.
- Use `t0_score` for prospectus-stage watchlisting.
- Use `total_score`, `decision_band`, and `calibrated_d1_positive_rate` for T1-stage subscription cards.
- Treat D1/D5/D20/D60 columns as review labels only, never as prediction inputs.
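One way to enforce the labels-only rule is a small guard that rejects any feature list containing a post-listing return column. This is a sketch under stated assumptions: it presumes label columns are named with a `d1`, `d5`, `d20`, or `d60` prefix (for example `d1_return`), which may not match the actual dataset header.

```python
import re

# Assumption: label columns start with d1/d5/d20/d60, optionally followed by "_...".
LABEL_PATTERN = re.compile(r"^d(1|5|20|60)(_|$)", re.IGNORECASE)


def leaked_labels(feature_columns):
    """Return the columns that look like post-listing labels."""
    return [col for col in feature_columns if LABEL_PATTERN.match(col)]


def assert_no_label_leakage(feature_columns):
    """Raise ValueError if any label-like column is used as a model input."""
    leaked = leaked_labels(feature_columns)
    if leaked:
        raise ValueError(f"label columns used as inputs: {leaked}")


# Hypothetical feature list for illustration.
candidate_features = ["t0_score", "offer_size", "d1_return", "D60_return"]
print(leaked_labels(candidate_features))
```

Calling `assert_no_label_leakage` before scoring would turn an accidental use of return columns into an immediate failure instead of a silently inflated calibration.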
Known Gaps
- T1 is structurally complete for listed rows; residual field-level NULLs remain when the archived source does not explicitly state a demand field.
- Industry and issuer fundamentals are not sufficiently structured for model input.
- T2 grey-market signal is blocked pending an approved source.
- Extreme D1 returns should be audited before they drive rule changes.