Add IPO analysis model baseline

Request:
- Use the analyst skill to digest downloaded IPO archive data and start building an analysis model.

Changes:
- Add ipo_score_v0 as the first transparent stage-safe scoring rule set.
- Add build_analysis_dataset.py to derive model features, scores, decision bands, and empirical D1 calibration from SQLite.
- Generate analysis_model_v0_dataset.csv with 293 scored IPO rows and archived source paths.
- Add a model calibration report documenting coverage, T0/T1 bucket performance, usage, and known gaps.
- Record the initial model entry in the rule change log and document the command in README.

Verification:
- Ran py_compile for scripts/build_analysis_dataset.py.
- Regenerated the analysis dataset and report with as-of 2026-06-15T13:00:00Z.
- Checked CSV row count, source path coverage, and repo-relative path hygiene.
- Ran git diff --cached --check.

Next useful context:
- Treat v0 as a transparent baseline: calibration is strongest for the T1 high-score buckets, while the middle buckets are still non-monotonic.
- T2 is excluded until a reliable grey-market source is approved.
2026-06-15 12:49:48 +00:00
parent 5f9546b16c
commit 48b89552fe
6 changed files with 1233 additions and 0 deletions
@@ -129,6 +129,18 @@ Use the price-performance archiver to fill due D1/D5/D20/D60 review checkpoints:
The archiver stores raw Yahoo Finance chart responses under `data/raw/{ticker}/`, records source references and hashes, writes structured rows into `price_performance`, exports snapshots, and refreshes `sync_tasks`.
## Analysis Model
Use the analyst model builder to digest archived data into a stage-safe scoring dataset and calibration report:
```bash
.venv/bin/python scripts/build_analysis_dataset.py --as-of 2026-06-15T13:00:00Z
```
The v0 model is documented in `rules/ipo_score_v0.yaml`. The command writes `data/snapshots/analysis_model_v0_dataset.csv` and `reports/2026-06-15_analysis_model_v0.md`.
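After regenerating the dataset, a quick sanity check on the CSV can catch empty or non-portable source paths. The sketch below is illustrative only: the `source_path` column name and the `check_dataset` helper are assumptions, not part of the repository, so adjust them to the actual CSV header.

```python
import csv
from pathlib import PurePosixPath


def is_repo_relative(path: str) -> bool:
    """True when the path is relative and does not escape the repository."""
    p = PurePosixPath(path)
    return not p.is_absolute() and ".." not in p.parts and not path.startswith("~")


def check_dataset(csv_path: str, path_column: str = "source_path") -> dict:
    """Count rows, rows with no source path, and paths that are not repo-relative.

    `path_column` is a hypothetical column name; pass the real one if it differs.
    """
    with open(csv_path, newline="", encoding="utf-8") as fh:
        rows = list(csv.DictReader(fh))

    values = [(row.get(path_column) or "").strip() for row in rows]
    missing = sum(1 for value in values if not value)
    non_relative = [value for value in values if value and not is_repo_relative(value)]

    return {
        "rows": len(rows),
        "missing_paths": missing,
        "non_relative_paths": non_relative,
    }


if __name__ == "__main__":
    print(check_dataset("data/snapshots/analysis_model_v0_dataset.csv"))
```

For the run recorded in this commit, the row count should come back as 293. The check uses POSIX path rules only, so Windows-style absolute paths would need an extra test.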
The model separates T0 prospectus inputs from T1 allotment inputs. D1/D5/D20/D60 returns are labels for calibration and review, not prediction inputs.
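The stage separation described above can be sketched as a filter that hands each stage only the fields it may read. This is a minimal illustration, not the builder's actual code: the feature names and the `stage_inputs` helper are hypothetical (the real inputs are defined in `rules/ipo_score_v0.yaml`), and it assumes a T1 score may also read T0 inputs.

```python
# Hypothetical field names, for illustration only.
T0_FEATURES = {"offer_size", "sponsor_count"}
T1_FEATURES = {"oversubscription_multiple", "one_lot_success_rate"}

# Post-listing returns are calibration labels and never appear in any stage's inputs.
LABELS = {"d1_return", "d5_return", "d20_return", "d60_return"}


def stage_inputs(row: dict, stage: str) -> dict:
    """Return only the fields that a score computed at `stage` may read."""
    if stage == "T0":
        allowed = T0_FEATURES
    elif stage == "T1":
        # Assumption: T1 scoring may reuse the earlier T0 prospectus inputs.
        allowed = T0_FEATURES | T1_FEATURES
    else:
        # Other stages (for example T2) are not modelled in this sketch.
        raise ValueError(f"unsupported stage: {stage}")
    return {key: value for key, value in row.items() if key in allowed}


if __name__ == "__main__":
    row = {"offer_size": 100.0, "oversubscription_multiple": 12.5, "d1_return": 0.08}
    print(stage_inputs(row, "T0"))
    print(stage_inputs(row, "T1"))
```

Filtering by an explicit allow-list, rather than removing known label columns, means a newly added return column cannot leak into scoring by default.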
## Incremental Archive Sync
The archivist keeps a per-ticker sync ledger so repeated updates can focus on missing stages: