e0c194e115
Request:
- Refresh today's HK IPO analyst view for the current candidate set.

Changes:
- Refreshed the 2026 HKEX new-listing report archive and synchronized source hashes across 2026 report-backed rows.
- Re-ran HKEX document archiving for 01392, 06067, 06132, 02335, and 06106; no official T1 allotment facts were available at 2026-06-18T08:16:33Z.
- Rebuilt the v0 analysis dataset and model report as of 2026-06-18T08:16:33Z.
- Added a Chinese 2026-06-18 cross-candidate analysis update that treats 06106/02335 as past standard subscription cutoff, flags 01392 for a post-23:00 HKT T1 refresh, and lists newly visible HKEX page tickers as pending archivist work.

Verification:
- Ran scripts/update_recent_ipo_list.py for 2026-01-01 through 2026-06-18.
- Ran scripts/archive_hkex_documents.py for 01392,06067,06132,02335,06106.
- Ran scripts/build_analysis_dataset.py as of 2026-06-18T08:16:33Z.
- Ran git diff --check and git diff --cached --check.
- Ran py_compile for the touched workflow scripts.
- Ran SQLite integrity_check and foreign_key_check.
- Verified durable report paths exist and source_refs have no missing paths or hash mismatches.

Next useful context:
- Re-run archivist after 2026-06-18T15:00:00Z to capture 01392 allotment results if published.
- Add a seed/archive path for current HKEX New Listing Information page candidates before scoring 02672, 01191, 09637, 09630, 06228, 03661, 01956, 02272, 01688, and 02667.

# HK IPO Analysis Model v0

- Model version: `ipo_score_v0`
- Analysis as of: `2026-06-18T08:16:33Z`
- Rule file: `rules/ipo_score_v0.yaml`
- Dataset: `data/snapshots/analysis_model_v0_dataset.csv`

## What This Model Does

This is the first analyst model built from the downloaded archive. It creates a repeatable feature table, scores each IPO using stage-safe rules, and calibrates the score buckets against archived D1 sell outcomes. It is intentionally transparent: the output includes every score component and the archived source paths used for each ticker.

The model is built for a short IPO allocation trade: sell in the T2 grey market when reliable executable data exists, or sell on D1 otherwise. It does not use grey-market data in v0 because T2 currently has no approved reproducible source. It also does not use post-listing returns as inputs; D1 is the primary sell label, while D5/D20/D60 are review labels only.

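
The input and sell-stage rules above can be sketched as two small guards. This is a minimal illustration, not the repository's code: the column names (`d1_return`, `d5_return`, `t2_grey_return`, and so on) and both function names are hypothetical.

```python
from typing import Iterable, Optional

# Hypothetical column names; the real dataset schema may differ.
PRIMARY_LABEL = "d1_return"
REVIEW_LABELS = {"d5_return", "d20_return", "d60_return"}
GREY_MARKET_FIELDS = {"t2_grey_return"}


def forbidden_inputs(feature_columns: Iterable[str]) -> list[str]:
    """Return any requested feature that is a label or a v0-blocked T2 field."""
    blocked = {PRIMARY_LABEL} | REVIEW_LABELS | GREY_MARKET_FIELDS
    return sorted(set(feature_columns) & blocked)


def sell_stage(t2_quote: Optional[float], t2_source_approved: bool = False) -> str:
    """Sell in T2 only with an approved, executable quote; otherwise sell on D1."""
    if t2_source_approved and t2_quote is not None:
        return "T2"
    return "D1"
```

With `t2_source_approved` left at its default of `False`, every card resolves to a D1 sell, which matches the v0 position that no T2 source is approved yet.
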
## Data Inventory

- IPO rows scored: 297
- Rows with D1 labels: 274
- Rows with structured T1 demand fields: 292
- Rows with prospectus source path: 297
- Rows with allotment source path: 292
- Rows with offer size: 297
- Rows with public oversubscription: 282
- Rows with international oversubscription: 278
- Rows with market heat snapshots: 5
- Rows with T0.5 margin heat snapshots: 5
- Rows with T0.95 late-order heat snapshots: 0
- Rows with T0.5 margin heat and D1 labels: 0
- Rows with T0.95 late-order heat and D1 labels: 0
- Rows matched to external ipohk history: 102
- Rows with external final oversubscription: 95
- Rows with external final oversubscription and D1 labels: 85
- Rows pending T1 structure: 5 (01392, 02335, 06067, 06106, 06132)
- T1 field-level blanks: public oversubscription 10, international oversubscription 14, valid applications 6, successful applications 18

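
Coverage counts like the ones in this inventory can be reproduced by counting non-empty cells per column. The sketch below runs on three toy rows rather than the real dataset, and the column names are assumptions for illustration.

```python
import csv
import io

# Toy rows with hypothetical column names; not taken from the real dataset.
SAMPLE = """ticker,d1_return,public_oversubscription,allotment_source_path
00001,12.5,35.2,archive/00001/allotment.pdf
00002,,4.1,archive/00002/allotment.pdf
00003,,,
"""


def coverage(rows, columns):
    """Count rows whose value in each column is non-empty after trimming."""
    return {c: sum(1 for r in rows if (r.get(c) or "").strip()) for c in columns}


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
counts = coverage(
    rows, ["d1_return", "public_oversubscription", "allotment_source_path"]
)
print(len(rows), counts)
```

To run the same count against the real file, replace the `io.StringIO(SAMPLE)` source with an open handle on `data/snapshots/analysis_model_v0_dataset.csv` and pass the actual column names.
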
## T0 Calibration

T0 uses only prospectus-stage structure: offer size, initial public offer percentage, minimum subscription amount, offer price band, and over-allotment availability.

| Bucket | N | D1 positive | D1 >= 10% | Avg D1 return (%) | Median D1 return (%) |
| --- | ---: | ---: | ---: | ---: | ---: |
| t0_lt_1 | 36 | 58.3% | 33.3% | 12.8 | 2.3 |
| t0_1_to_4 | 60 | 63.3% | 40.0% | 9.6 | 3.1 |
| t0_5_to_7 | 105 | 73.3% | 51.4% | 40.1 | 13.2 |
| t0_gte_8 | 73 | 76.7% | 47.9% | 29.9 | 9.8 |

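
A calibration table with these columns can be built by grouping labelled rows by score bucket. The sketch below is illustrative only. It assumes that each bucket's lower bound is inclusive, that fractional scores fall into the bucket below the next integer edge, and that "D1 positive" means a return strictly above zero. The real rules live in `rules/ipo_score_v0.yaml` and may differ.

```python
from statistics import mean, median


def t0_bucket(score: float) -> str:
    """Map a T0 score to a bucket name (boundaries are assumptions)."""
    if score < 1:
        return "t0_lt_1"
    if score < 5:
        return "t0_1_to_4"
    if score < 8:
        return "t0_5_to_7"
    return "t0_gte_8"


def calibrate(rows):
    """rows: iterable of (t0_score, d1_return_pct); unlabelled rows are skipped."""
    grouped = {}
    for score, d1 in rows:
        if d1 is None:
            continue  # no D1 label yet, so the row cannot calibrate anything
        grouped.setdefault(t0_bucket(score), []).append(d1)
    return {
        bucket: {
            "n": len(values),
            "d1_positive": sum(x > 0 for x in values) / len(values),
            "d1_gte_10": sum(x >= 10 for x in values) / len(values),
            "avg": mean(values),
            "median": median(values),
        }
        for bucket, values in grouped.items()
    }


# Toy input, not archive data.
toy = [(0, -2.0), (3, 5.0), (4, 15.0), (6, None), (9, 20.0)]
table = calibrate(toy)
```
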
## T1 Calibration

T1 adds allotment-stage demand: public subscription, international placing demand, valid application count, application success rate, and HK public offer reallocation.

| Bucket | N | D1 positive | D1 >= 10% | Avg D1 return (%) | Median D1 return (%) |
| --- | ---: | ---: | ---: | ---: | ---: |
| total_lt_0 | 68 | 61.8% | 23.5% | 0.4 | 1.0 |
| total_0_to_9 | 68 | 58.8% | 30.9% | 3.3 | 0.2 |
| total_10_to_17 | 29 | 55.2% | 34.5% | 13.9 | 1.5 |
| total_18_to_25 | 49 | 75.5% | 51.0% | 31.3 | 13.4 |
| total_gte_26 | 60 | 95.0% | 88.3% | 87.3 | 83.3 |

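
One way to turn a total score into a calibrated rate is a sorted-edge lookup against the table above. This is a sketch under assumptions: the edge values are read off the bucket names with lower bounds taken as inclusive, and the function names are hypothetical. The D1 positive rates are copied from the T1 calibration table.

```python
from bisect import bisect_right

# Lower edges inferred from the bucket names (an assumption about the rule file).
EDGES = [0, 10, 18, 26]
NAMES = [
    "total_lt_0",
    "total_0_to_9",
    "total_10_to_17",
    "total_18_to_25",
    "total_gte_26",
]

# D1 positive rates from the T1 calibration table in this report.
D1_POSITIVE = {
    "total_lt_0": 0.618,
    "total_0_to_9": 0.588,
    "total_10_to_17": 0.552,
    "total_18_to_25": 0.755,
    "total_gte_26": 0.950,
}


def total_bucket(score: float) -> str:
    """Return the bucket whose range contains the score."""
    return NAMES[bisect_right(EDGES, score)]


def d1_positive_rate(score: float) -> float:
    """Look up the empirical D1 positive rate for the score's bucket."""
    return D1_POSITIVE[total_bucket(score)]
```
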
## T0.5 Market Heat

T0.5 uses archived subscription-period margin heat snapshots. T0.95 is the near-deadline subset that is still actionable before the user's order cutoff. These are non-official live signals and are kept separate from T1 allotment demand. The current archive is not yet a historical training set: it has too few rows and no D1 labels for calibration.

- Total market heat rows: 5
- T0.5 margin rows: 5
- T0.5 rows with D1 labels: 0
- T0.95 late-order heat rows: 0
- T0.95 rows with D1 labels: 0

## External Final Heat Proxy

The ipohk history archive adds final public oversubscription, one-lot win rate, grey-market return, and first-day return where available. These fields are useful for coverage checks and post-hoc calibration, but they are not T0.5 inputs because they are final or near-final history.

- External history rows matched into this dataset: 102
- Matched rows with final oversubscription: 95
- Matched rows with final oversubscription and D1 labels: 85

| Bucket | N | D1 positive | D1 >= 10% | Avg D1 return (%) | Median D1 return (%) |
| --- | ---: | ---: | ---: | ---: | ---: |
| external_os_lt_10x | 6 | 50.0% | 16.7% | 4.7 | -4.1 |
| external_os_10x_to_100x | 7 | 28.6% | 14.3% | -23.0 | -21.9 |
| external_os_100x_to_1000x | 21 | 61.9% | 38.1% | 8.8 | 4.2 |
| external_os_1000x_to_5000x | 33 | 93.9% | 78.8% | 60.4 | 44.2 |
| external_os_gte_5000x | 18 | 83.3% | 72.2% | 101.7 | 89.7 |

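
Because only some matched rows carry a final oversubscription figure, a bucketing helper for this proxy needs an explicit path for missing values. The sketch below assumes lower-inclusive edges taken from the bucket names; the function name is hypothetical.

```python
from bisect import bisect_right
from typing import Optional

# Edges in multiples of the public tranche, inferred from the bucket names.
OS_EDGES = [10, 100, 1000, 5000]
OS_NAMES = [
    "external_os_lt_10x",
    "external_os_10x_to_100x",
    "external_os_100x_to_1000x",
    "external_os_1000x_to_5000x",
    "external_os_gte_5000x",
]


def external_os_bucket(final_oversubscription: Optional[float]) -> Optional[str]:
    """Return the bucket name, or None when the external row has no final figure."""
    if final_oversubscription is None:
        return None  # keep missing values out of every bucket
    return OS_NAMES[bisect_right(OS_EDGES, final_oversubscription)]
```

Returning `None` instead of a default bucket keeps unmatched rows out of the calibration counts.
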
## Current Read

After the T1 demand text backfill, the strongest v0 T1 bucket is `total_gte_26` with 60 historical D1 observations and a 95.0% D1 positive rate. The model is most useful after allotment results are available; T0 is a watchlist filter rather than a final subscription call.

The high-conviction bucket remains clearly differentiated, but the lower buckets are still not monotonic: D1 positive rates run 61.8% for `total_lt_0`, 58.8% for `total_0_to_9`, and 55.2% for `total_10_to_17` before rising to 75.5% for `total_18_to_25`. This refresh keeps the v0 score formula unchanged and updates empirical calibration only; future rule changes should come from reviewed prediction cards rather than overfitting this historical sample.

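
The monotonicity observation can be checked directly against the T1 calibration table. The sketch below lists the D1 positive rates in ascending score order and reports every adjacent pair where the higher-score bucket has the lower rate. The helper name is illustrative.

```python
# D1 positive rates (%) from the T1 calibration table, in ascending score order.
T1_D1_POSITIVE = [
    ("total_lt_0", 61.8),
    ("total_0_to_9", 58.8),
    ("total_10_to_17", 55.2),
    ("total_18_to_25", 75.5),
    ("total_gte_26", 95.0),
]


def monotonic_breaks(ordered):
    """Return (lower_bucket, higher_bucket) pairs where the rate falls as score rises."""
    return [
        (low_name, high_name)
        for (low_name, low_rate), (high_name, high_rate) in zip(ordered, ordered[1:])
        if high_rate < low_rate
    ]


breaks = monotonic_breaks(T1_D1_POSITIVE)
print(breaks)
# [('total_lt_0', 'total_0_to_9'), ('total_0_to_9', 'total_10_to_17')]
```

Both breaks sit in the three lowest buckets, while the two highest buckets rise in order.
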
## Usage

1. Run `scripts/build_analysis_dataset.py` after archivist updates the database.
2. Use `t0_score` for prospectus-stage watchlisting.
3. Use `total_score`, `decision_band`, and `calibrated_d1_positive_rate` for T1-stage subscription cards.
4. Frame live decisions around a T2 or D1 sell, not long-term holding.
5. Treat D5/D20/D60 columns as review labels only, never as prediction inputs or holding targets.

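
Steps 2 to 5 can be sketched as a small card builder that copies only the decision fields and leaves review labels behind. The four score fields are the column names given in the steps above. The `ticker` and `d5_return` columns, the sample values, and the `build_card` helper are placeholders invented for this example.

```python
import csv
import io

# Columns named in the usage steps; "ticker" is an assumed identifier column.
CARD_FIELDS = [
    "t0_score",
    "total_score",
    "decision_band",
    "calibrated_d1_positive_rate",
]

# Placeholder row for illustration; values are not from the real dataset.
SAMPLE = """ticker,t0_score,total_score,decision_band,calibrated_d1_positive_rate,d5_return
00000,6,27,placeholder_band,0.95,1.0
"""


def build_card(row: dict) -> dict:
    """Copy only the decision fields, so review labels never reach the card."""
    card = {"ticker": row["ticker"]}
    for field in CARD_FIELDS:
        card[field] = row.get(field) or None
    return card


cards = [build_card(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
```

In practice the reader would be opened on `data/snapshots/analysis_model_v0_dataset.csv` after `scripts/build_analysis_dataset.py` has run.
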
## Known Gaps

- T1 is structurally complete for listed rows; residual field-level NULLs remain when the archived source does not explicitly state a demand field.
- Industry and issuer fundamentals are not sufficiently structured for model input.
- T2 grey-market signal is blocked pending an approved source.
- Extreme D1 returns should be audited before they drive rule changes.