hk-ipo

geometrybase/hk-ipo

Commit Graph

Author

SHA1

Message

Date

geometrybase

33d0bc056e

Tighten historical data audit coverage

Request:
- Use the audit skill to check historical data completeness and self-correct the audit criteria after the missed PDF extracted-text gap.

Changes:
- Add a mandatory derived-evidence checklist to the audit skill.
- Require broad historical audits to reconcile PDF source_refs, extracted text files, manifest rows, and hashes.
- Add a historical data completeness audit report for the current archive.
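The reconciliation described in the changes above could look roughly like the following. This is a minimal sketch, not the repository's audit code: the `reconcile` function, the manifest layout (`text_path`, `text_sha256`), and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def reconcile(source_refs, manifest, root):
    """Return a list of (pdf, problem) tuples; an empty list means reconciled.

    source_refs: iterable of PDF identifiers (hypothetical input shape).
    manifest:    dict mapping PDF identifier to a row with the assumed keys
                 "text_path" (relative to root) and "text_sha256".
    """
    refs = set(source_refs)
    rows = set(manifest)
    problems = []

    # Every PDF source_ref needs a manifest row, and vice versa.
    for pdf in sorted(refs - rows):
        problems.append((pdf, "missing manifest row"))
    for pdf in sorted(rows - refs):
        problems.append((pdf, "manifest row without source_ref"))

    # Every manifest row needs an extracted text file with a matching hash.
    for pdf in sorted(refs & rows):
        row = manifest[pdf]
        text_file = Path(root) / row["text_path"]
        if not text_file.is_file():
            problems.append((pdf, "missing extracted text"))
        elif sha256_of(text_file) != row["text_sha256"]:
            problems.append((pdf, "extracted-text hash mismatch"))
    return problems
```

Checking all four artifacts in one pass means a gap in any derived file shows up as a named problem rather than a silent omission.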

Findings:
- Source integrity and PDF extracted-text completeness now pass.
- Full historical completeness still fails due to incomplete structured T1 demand, unresolved T2 grey-market data, open due price-performance tasks, and missing context fields.

Verification:
- Ran SQLite integrity, foreign-key, source hash, snapshot, PDF manifest, extracted-text hash, stage coverage, and analysis-dataset checks.
- Ran scripts/extract_pdf_text.py and confirmed 557 PDF sources were skipped unchanged with 557 manifest rows.
- Ran git diff --check.
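For reference, the SQLite integrity and foreign-key checks listed above can be run from Python with two standard pragmas. The helper name `sqlite_health` is hypothetical, and the commit does not say how the checks were actually invoked, so treat this as a sketch.

```python
import sqlite3


def sqlite_health(conn: sqlite3.Connection):
    """Run SQLite's built-in consistency checks.

    Returns (integrity_ok, fk_violations):
      integrity_ok  is True when PRAGMA integrity_check reports only "ok".
      fk_violations is the list of rows from PRAGMA foreign_key_check,
                    one per child row whose parent is missing.
    """
    integrity = [row[0] for row in conn.execute("PRAGMA integrity_check")]
    fk_violations = conn.execute("PRAGMA foreign_key_check").fetchall()
    return integrity == ["ok"], fk_violations
```

`PRAGMA foreign_key_check` reports violations even when foreign-key enforcement is switched off for the connection, which makes it suitable for auditing an archive that was loaded without enforcement.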

2026-06-15 13:43:22 +00:00

geometrybase

48b89552fe

Add IPO analysis model baseline

Request:
- Use the analyst skill to digest downloaded IPO archive data and start building an analysis model.

Changes:
- Add ipo_score_v0 as the first transparent stage-safe scoring rule set.
- Add build_analysis_dataset.py to derive model features, scores, decision bands, and empirical D1 calibration from SQLite.
- Generate analysis_model_v0_dataset.csv with 293 scored IPO rows and archived source paths.
- Add a model calibration report documenting coverage, T0/T1 bucket performance, usage, and known gaps.
- Record the initial model entry in the rule change log and document the command in README.

Verification:
- Ran py_compile for scripts/build_analysis_dataset.py.
- Regenerated the analysis dataset and report with as-of 2026-06-15T13:00:00Z.
- Checked CSV row count, source path coverage, and repo-relative path hygiene.
- Ran git diff --cached --check.
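A row-count and path-hygiene check of the kind mentioned above might be sketched as follows. The column name `source_path` and the function `check_dataset_csv` are assumptions; the commit does not name the actual columns of analysis_model_v0_dataset.csv.

```python
import csv
import io
from pathlib import PurePosixPath, PureWindowsPath


def check_dataset_csv(text, expected_rows, path_column="source_path"):
    """Check the row count and that every path is repo-relative.

    Returns (count_ok, bad_paths), where bad_paths holds
    (data_row_number, value) pairs, numbered from 1 after the header.
    """
    rows = list(csv.DictReader(io.StringIO(text)))
    bad_paths = []
    for number, row in enumerate(rows, start=1):
        value = row[path_column]
        posix = PurePosixPath(value)
        is_bad = (
            not value                                # empty path
            or posix.is_absolute()                   # /home/user/...
            or PureWindowsPath(value).is_absolute()  # C:\Users\...
            or ".." in posix.parts                   # escapes the repo root
        )
        if is_bad:
            bad_paths.append((number, value))
    return len(rows) == expected_rows, bad_paths
```

Rejecting absolute paths keeps machine-specific directories out of the committed dataset.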

Next useful context:
- v0 should be treated as a transparent baseline, with T1 high-score calibration strongest and middle buckets still non-monotonic.
- T2 is excluded until a reliable grey-market source is approved.
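To make "non-monotonic" concrete: a calibration is monotonic when each higher score bucket has an average outcome at least as good as the bucket below it. The sketch below flags adjacent buckets that break that order. The function name and the (lower bound, mean return) input shape are assumptions, and the figures in the usage lines are invented for illustration, not taken from the calibration report.

```python
def non_monotonic_pairs(buckets):
    """Find adjacent score buckets whose mean outcome decreases.

    buckets: iterable of (bucket_lower_bound, mean_outcome) pairs.
    Returns a list of (lower_bucket, higher_bucket) bounds where the
    higher-score bucket has a strictly lower mean outcome.
    """
    ordered = sorted(buckets, key=lambda b: b[0])
    return [
        (low[0], high[0])
        for low, high in zip(ordered, ordered[1:])
        if high[1] < low[1]
    ]


# Illustrative, made-up numbers: the 40+ bucket underperforms the 20+ bucket.
example = [(0, 0.01), (20, 0.03), (40, 0.02), (60, 0.08)]
violations = non_monotonic_pairs(example)
```

With the invented figures above, `violations` is `[(20, 40)]`: the top bucket looks well calibrated while the middle of the score range does not.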

2026-06-15 12:49:48 +00:00

2 Commits