Refresh IPO analysis model calibration

Request:
- Re-analyze the IPO model using the updated historical archive after T1 demand backfill.

Changes:
- Regenerate the v0 analysis dataset from the current SQLite archive.
- Refresh the v0 calibration report with expanded T1 coverage and new empirical bucket rates.
- Update the report template to show pending T1 rows and field-level blanks.
- Clarify v0 limitations and record why the score formula stays unchanged for this refresh.

Verification:
- Ran scripts/build_analysis_dataset.py against data/hk_ipo.sqlite.
- Ran py_compile for scripts/build_analysis_dataset.py.
- Checked dataset row count, T1 demand coverage, source-only T1 gaps, and repo-relative paths.
- Ran git diff --check.

Next useful context:
- T1 structured coverage is now 291 rows, with 06106 and 06675 still pending_not_due.
- The high-conviction T1 bucket remains differentiated, but middle and low buckets are still not monotonic enough for a v1 rule change.
2026-06-15 14:05:34 +00:00
parent 6d05056609
commit 58ad869f84
5 changed files with 347 additions and 312 deletions
@@ -594,6 +594,11 @@ def write_report(
    total = len(records)
    d1_records = [record for record in records if record["d1_return_pct"] is not None]
    structured_t1 = [record for record in records if record["has_structured_t1"]]
+    pending_t1_tickers = ", ".join(sorted(record["ticker"] for record in records if not record["has_structured_t1"]))
+    t1_public_os_missing = sum(record["public_oversubscription_times"] is None for record in structured_t1)
+    t1_international_os_missing = sum(record["international_oversubscription_times"] is None for record in structured_t1)
+    t1_valid_missing = sum(record["valid_applications"] is None for record in structured_t1)
+    t1_successful_missing = sum(record["successful_applications"] is None for record in structured_t1)
    best_bucket = max(total_metrics, key=lambda metric: metric.d1_positive_rate or -1)

    lines = [
@@ -620,6 +625,9 @@ def write_report(
        f"- Rows with offer size: {count_present(records, 'offer_size_hkd_m')}",
        f"- Rows with public oversubscription: {count_present(records, 'public_oversubscription_times')}",
        f"- Rows with international oversubscription: {count_present(records, 'international_oversubscription_times')}",
+        f"- Rows pending T1 structure: {total - len(structured_t1)}"
+        + (f" ({pending_t1_tickers})" if pending_t1_tickers else ""),
+        f"- T1 field-level blanks: public oversubscription {t1_public_os_missing}, international oversubscription {t1_international_os_missing}, valid applications {t1_valid_missing}, successful applications {t1_successful_missing}",
        "",
        "## T0 Calibration",
        "",
@@ -633,11 +641,11 @@ def write_report(
        "",
        metrics_table(total_metrics),
        "",
-        "## Initial Read",
+        "## Current Read",
        "",
-        f"The strongest v0 T1 bucket is `{best_bucket.bucket}` with {best_bucket.sample_size} historical D1 observations and a {fmt_pct(best_bucket.d1_positive_rate)} D1 positive rate. The model is most useful after allotment results are available; T0 is a watchlist filter rather than a final subscription call.",
+        f"After the T1 demand text backfill, the strongest v0 T1 bucket is `{best_bucket.bucket}` with {best_bucket.sample_size} historical D1 observations and a {fmt_pct(best_bucket.d1_positive_rate)} D1 positive rate. The model is most useful after allotment results are available; T0 is a watchlist filter rather than a final subscription call.",
        "",
-        "The middle score buckets are not monotonic yet. That is a feature, not a bug report: v0 is exposing where the current rules are too coarse and where missing T1 demand facts weaken calibration. Future rule changes should come from reviewed prediction cards, not from overfitting this initial sample.",
+        "The high-conviction bucket remains clearly differentiated, but the middle and low score buckets are still not monotonic. This refresh keeps the v0 score formula unchanged and updates empirical calibration only; future rule changes should come from reviewed prediction cards rather than overfitting this historical sample.",
        "",
        "## Usage",
        "",
@@ -648,7 +656,7 @@ def write_report(
        "",
        "## Known Gaps",
        "",
-        "- T1 demand parsing is incomplete for older HTML-only allotment announcements.",
+        "- T1 is structurally complete for listed rows; residual field-level NULLs remain when the archived source does not explicitly state a demand field.",
        "- Industry and issuer fundamentals are not sufficiently structured for model input.",
        "- T2 grey-market signal is blocked pending an approved source.",
        "- Extreme D1 returns should be audited before they drive rule changes.",
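
Reviewer note: the report lines added in the first two hunks do two things. They list rows that have no structured T1 record yet, and they count `None` values per demand field among the rows that do. The standalone sketch below reproduces that logic for a quick sanity check. It assumes each record is a dict with the keys used in the diff; the helper name `summarize_t1_gaps`, the `T1_FIELDS` tuple, and the sample rows are illustrative only and are not part of the repository.

```python
from typing import Any

# Hypothetical field list; the names mirror the record keys used in the diff.
T1_FIELDS = (
    "public_oversubscription_times",
    "international_oversubscription_times",
    "valid_applications",
    "successful_applications",
)


def summarize_t1_gaps(records: list[dict[str, Any]]) -> dict[str, Any]:
    """Separate pending rows from structured rows, then count blanks per field."""
    structured = [r for r in records if r["has_structured_t1"]]
    # Sorted tickers of rows that do not have a structured T1 record yet.
    pending = sorted(r["ticker"] for r in records if not r["has_structured_t1"])
    # Blanks are counted only among structured rows, as in write_report.
    blanks = {
        field: sum(r[field] is None for r in structured) for field in T1_FIELDS
    }
    return {
        "pending_count": len(pending),
        "pending_tickers": ", ".join(pending),
        "blanks": blanks,
    }


def make_row(ticker: str, structured: bool, **fields: Any) -> dict[str, Any]:
    """Build a sample record; any demand field not passed defaults to None."""
    row: dict[str, Any] = {"ticker": ticker, "has_structured_t1": structured}
    row.update({field: fields.get(field) for field in T1_FIELDS})
    return row


# Made-up sample rows (placeholder tickers and values, not archive data).
sample = [
    make_row("BBBB", False),
    make_row("AAAA", False),
    make_row("CCCC", True, public_oversubscription_times=12.5, valid_applications=1000),
    make_row("DDDD", True, valid_applications=2000, successful_applications=500),
]

summary = summarize_t1_gaps(sample)
print(summary["pending_count"], summary["pending_tickers"])  # 2 AAAA, BBBB
print(summary["blanks"])
```

Because blanks are counted only over structured rows, a pending row never inflates the field-level counts. This matches the separation between "Rows pending T1 structure" and "T1 field-level blanks" in the report.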