hk-ipo/rules/rule_change_log.md

# Rule Change Log

## 2026-06-15 - Refresh `ipo_score_v0` after T1 demand backfill

Request:

- Re-analyze the model using the known historical archive after T1 demand text backfill.

Change:

- Regenerated `data/snapshots/analysis_model_v0_dataset.csv` from the current SQLite archive.
- Refreshed `reports/2026-06-15_analysis_model_v0.md` with the expanded T1 demand coverage and new empirical calibration.
- Kept the `ipo_score_v0` score formula unchanged because the expanded sample still shows non-monotonic middle and low score buckets.
- Updated model limitations to reflect that T1 is structurally complete for listed rows, while field-level NULLs remain when source documents do not explicitly state a field.

Rationale:

- T1 structured coverage increased from 154 to 291 rows after the archivist backfilled demand facts from extracted PDF text.
- The high-conviction bucket remains clearly differentiated, but the rest of the calibration is not strong enough to justify a v1 rule change.
- Avoiding a threshold rewrite here preserves the feedback loop: future rule changes should be tied to reviewed predictions and named error cases.
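
The non-monotonicity argument above can be illustrated with a minimal sketch. It is not the code behind the report: the column names `score_bucket` and `d1_return`, the bucket labels, and the sample rows are all hypothetical placeholders, and the real calibration may use different statistics.

```python
from collections import defaultdict
from statistics import mean


def bucket_means(rows, bucket_key, label_key):
    """Average the label per score bucket, skipping rows whose label is NULL."""
    grouped = defaultdict(list)
    for row in rows:
        if row[label_key] is not None:
            grouped[row[bucket_key]].append(row[label_key])
    return {bucket: mean(values) for bucket, values in grouped.items()}


def non_monotonic_pairs(means, ordered_buckets):
    """Return adjacent bucket pairs where the higher bucket has a lower mean."""
    present = [b for b in ordered_buckets if b in means]
    return [
        (lower, higher)
        for lower, higher in zip(present, present[1:])
        if means[higher] < means[lower]
    ]


# Made-up rows for illustration only; not taken from the archive.
rows = [
    {"score_bucket": "low", "d1_return": 0.02},
    {"score_bucket": "low", "d1_return": 0.04},
    {"score_bucket": "middle", "d1_return": 0.01},
    {"score_bucket": "high", "d1_return": 0.20},
    {"score_bucket": "high", "d1_return": None},
]

means = bucket_means(rows, "score_bucket", "d1_return")
violations = non_monotonic_pairs(means, ["low", "middle", "high"])
```

In this toy data the high bucket is clearly separated while the middle bucket averages below the low bucket, so `violations` flags the low/middle pair. That is the shape of result that argues for keeping the formula unchanged rather than rewriting thresholds.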

Verification:

- Rebuilt the analysis dataset and model report from `data/hk_ipo.sqlite`.
- Confirmed post-listing returns remain labels only and are not score inputs.
- Confirmed durable source paths remain repo-relative.
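
The last two checks could be automated along the following lines. This is a sketch under stated assumptions: the label column names in `LABEL_COLUMNS` and the helper names are hypothetical, and the path check only understands POSIX-style paths.

```python
from pathlib import PurePosixPath

# Hypothetical label column names; the real dataset may name them differently.
LABEL_COLUMNS = {"d1_return", "d5_return", "d20_return", "d60_return"}


def leaked_labels(score_inputs):
    """Return any label columns that appear among the score inputs."""
    return sorted(set(score_inputs) & LABEL_COLUMNS)


def is_repo_relative(path):
    """True if a POSIX-style path is relative and never climbs out via '..'."""
    parts = PurePosixPath(path)
    return not parts.is_absolute() and ".." not in parts.parts


# Illustrative inputs only.
leaks = leaked_labels(["demand_multiple", "d5_return"])
ok_path = is_repo_relative("data/hk_ipo.sqlite")
bad_path = is_repo_relative("/home/user/hk-ipo/data/hk_ipo.sqlite")
```

Here `leaks` is `["d5_return"]`, `ok_path` is `True`, and `bad_path` is `False`. Windows drive-letter paths and `~` expansion are not handled and would need an extra check.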

## 2026-06-15 - Introduce `ipo_score_v0`

Request:

- Start digesting the downloaded IPO archive and build the first analyst model.

Change:

- Added `rules/ipo_score_v0.yaml` as the initial transparent scoring baseline.
- Added `scripts/build_analysis_dataset.py` to generate a feature dataset and calibration report from `data/hk_ipo.sqlite`.
- Added `data/snapshots/analysis_model_v0_dataset.csv` as the first model-ready snapshot.
- Added `reports/2026-06-15_analysis_model_v0.md` to document coverage, calibration, and known gaps.
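
For orientation, the SQLite-to-CSV step can be sketched with the standard library alone. This is not the contents of `scripts/build_analysis_dataset.py`: the table `ipo_features`, its columns, and the sample row are hypothetical, and an in-memory database stands in for `data/hk_ipo.sqlite`.

```python
import csv
import io
import sqlite3


def export_query_to_csv(conn, query, out):
    """Write a query result to a file-like object, header row first."""
    cursor = conn.execute(query)
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)


# In-memory stand-in for the archive, with a made-up schema and row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ipo_features (stock_code TEXT, offer_price REAL, d1_return REAL)"
)
conn.execute("INSERT INTO ipo_features VALUES ('0000', 10.0, NULL)")

buffer = io.StringIO()
export_query_to_csv(conn, "SELECT * FROM ipo_features ORDER BY stock_code", buffer)
snapshot_csv = buffer.getvalue()
conn.close()
```

An explicit `ORDER BY` keeps regenerated snapshots stable across runs, and SQL NULLs are written as empty CSV fields, so missing labels stay visibly missing rather than being coerced to zero.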

Rationale:

- The archive now has enough T0 facts and D1/D5/D20/D60 labels to support a repeatable baseline.
- T1 demand data is partially structured and highly informative where available.
- T2 grey-market data remains blocked until a reliable source exists, so it is excluded from v0.

Verification:

- Generated the dataset from the current SQLite archive.
- Confirmed the model keeps post-listing returns as labels only.
- Recorded non-monotonic middle buckets as a limitation rather than overfitting them away.