Add A/H share-class mapping workflow

Request:
- Add a repeatable mechanism so HK IPO reports detect issuers that already have Mainland A shares.
- Include a third internet/official-exchange cross-check layer beyond structured history and prospectus scans.

Changes:
- Add listed_share_classes schema support for same-issuer A-share mappings and evidence links.
- Add scripts/archive_a_share_mappings.py to scan prospectus extracted text, reject sponsor/portfolio/cornerstone false positives, archive optional official web evidence and A-share/FX quote evidence, and export snapshots on write.
- Surface a_share_* fields in the analysis dataset and single-ticker report output.
- Update hk-ipo analyst/archivist skill rules and scheduled refresh prompt to require the three-layer A/H mapping check.
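
For readers without the script at hand, the issuer-context filter described above could look roughly like the sketch below. This is an assumption-laden illustration, not the code in scripts/archive_a_share_mappings.py: the cue lists, the per-line scan, and the function name `issuer_a_share_codes` are all hypothetical, and the sample sentences in the usage note are invented.

```python
import re

# Hypothetical cue lists; the real script's rules are not shown in this commit.
NON_ISSUER_CUES = ("cornerstone investor", "sponsor", "portfolio company")
ISSUER_CUES = ("our a shares", "a shares of our company")

# Matches codes such as 002600.SZ or 688630.SH.
A_SHARE_CODE = re.compile(r"\b\d{6}\.(?:SZ|SH)\b")


def issuer_a_share_codes(text: str) -> list[str]:
    """Return A-share codes that appear in lines describing the issuer itself."""
    accepted: list[str] = []
    for line in text.splitlines():
        codes = A_SHARE_CODE.findall(line)
        if not codes:
            continue
        lowered = line.lower()
        # Reject mentions that belong to a sponsor, cornerstone, or portfolio entity.
        if any(cue in lowered for cue in NON_ISSUER_CUES):
            continue
        # Accept only when the line explicitly ties the code to the issuer.
        if any(cue in lowered for cue in ISSUER_CUES):
            accepted.extend(code for code in codes if code not in accepted)
    return accepted
```

Fed one line reading "Our A Shares are listed on the Shenzhen Stock Exchange (002600.SZ)." and another reading "A cornerstone investor is owned by a company listed as 300476.SZ.", this sketch keeps only 002600.SZ. Requiring a positive issuer cue in addition to the absence of a negative cue trades recall for precision, which seems the safer default when a wrong mapping would change the valuation framing of a report.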

Verification:
- python3 -m py_compile scripts/archive_a_share_mappings.py scripts/build_analysis_dataset.py scripts/generate_ipo_report.py
- .venv/bin/python scripts/archive_a_share_mappings.py --as-of 2026-06-24T00:00:00Z --tickers 00668,01688,03661,09630 --dry-run
- .venv/bin/python scripts/build_analysis_dataset.py --db /tmp/hk_ipo_ah_dataset_test.sqlite --dataset /tmp/hk_ipo_ah_dataset_test.csv --report /tmp/hk_ipo_ah_model_test.md --as-of 2026-06-24T00:00:00Z
- .venv/bin/python scripts/generate_ipo_report.py 09630 --dataset /tmp/hk_ipo_ah_dataset_test.csv --stdout --as-of 2026-06-24T00:00:00Z
- git diff --check

Next useful context:
- Dry-run detected 00668->300866.SZ, 01688->002600.SZ, 03661->300661.SZ, and 09630->688630.SH.
- A false positive 01688->300476.SZ from a cornerstone investor parent was rejected by the issuer-context filter.
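
The mapped tickers above carry enough information to derive an exchange and board. A minimal sketch follows, assuming the commonly used code-prefix conventions (688 for the STAR Market, 300/301 for ChiNext); the function name `classify_a_share` and the returned labels are hypothetical and may not match the values this commit stores in a_share_exchange and a_share_board.

```python
def classify_a_share(ticker: str) -> tuple[str, str]:
    """Map a ticker such as '688630.SH' to (exchange, board) by code prefix."""
    code, _, suffix = ticker.partition(".")
    if suffix == "SH":
        if code.startswith("688"):
            return "SSE", "STAR Market"
        if code.startswith("60"):
            return "SSE", "Main Board"
    elif suffix == "SZ":
        if code.startswith(("300", "301")):
            return "SZSE", "ChiNext"
        if code.startswith(("000", "001", "002", "003")):
            return "SZSE", "Main Board"
    # Anything else is left unclassified rather than guessed.
    return "unknown", "unknown"


for ticker in ("300866.SZ", "002600.SZ", "300661.SZ", "688630.SH"):
    print(ticker, classify_a_share(ticker))
```

Returning "unknown" for unrecognized prefixes keeps the gap visible instead of silently assigning a board.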
2026-06-24 07:21:21 +00:00
parent d3b67fa473
commit 7cbdd533b0
7 changed files with 710 additions and 0 deletions
+39
@@ -320,6 +320,37 @@ def facts_table(record: dict[str, str], stage: str) -> str:
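    return "\n".join(lines)


def ah_overlay(record: dict[str, str]) -> str:
    # Render the A/H overlay section of the report as Markdown.
    if not record.get("a_share_ticker"):
        # "No A shares or other Mainland-listed share capital identified for the same issuer."
        return "- 未识别到同一发行人的 A 股或其他内地上市股本。"
    prospectus_path = record.get("a_share_prospectus_source_path") or "data_gap"
    web_path = record.get("a_share_web_source_path") or "data_gap"
    # Row labels, in order: A-share ticker, exchange, board, relationship,
    # A-share company name, A-share listing date, detection method,
    # mapping confidence, prospectus evidence, internet cross-check.
    rows = [
        ("A 股代码", fmt_value(record.get("a_share_ticker"))),
        ("交易所", fmt_value(record.get("a_share_exchange"))),
        ("板块", fmt_value(record.get("a_share_board"))),
        ("关系", fmt_value(record.get("a_share_relationship"))),
        ("A 股公司名", fmt_value(record.get("a_share_company_name"))),
        ("A 股上市日", fmt_value(record.get("a_share_listed_date"))),
        ("识别方法", fmt_value(record.get("a_share_detection_method"))),
        ("映射置信度", fmt_value(record.get("a_share_mapping_confidence"))),
        ("招股书证据", f"`{prospectus_path}`" if prospectus_path != "data_gap" else "`data_gap`"),
        ("互联网交叉验证", f"`{web_path}`" if web_path != "data_gap" else "`data_gap`"),
    ]
    # Table header: "| Field | Value |".
    lines = ["| 字段 | 数值 |", "| --- | --- |"]
    for label, value in rows:
        lines.append(f"| {label} | {value} |")
    # Closing notes: (1) this is an A/H or Mainland-listed share capital pricing
    # scenario and should not be treated as a pure first-time IPO; (2) the A-share
    # price can anchor valuation, but A and H shares are usually not fungible or
    # directly arbitrageable, so short-term returns still depend on Hong Kong-side
    # subscription demand, liquidity, supply, and the T2/D1 exit.
    lines.extend(
        [
            "",
            "- 这是 A/H 或内地上市股本定价场景,不应按纯首次上市 IPO 处理。",
            "- A 股价格可作为估值锚,但 A 股和 H 股通常不能互换或直接套利;短线收益仍取决于香港侧认购热度、流动性、供给和 T2/D1 出口。",
        ]
    )
    return "\n".join(lines)


def stage_calendar_table(record: dict[str, str]) -> str:
    application_start = fmt_value(record["application_start_date"])
    application_end = fmt_value(record["application_end_date"])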
@@ -372,6 +403,10 @@ def source_paths(record: dict[str, str], stage: str) -> list[str]:
        paths.append(record["prospectus_source_path"])
    if stage == T1_STAGE and record["allotment_source_path"]:
        paths.append(record["allotment_source_path"])
    if record.get("a_share_prospectus_source_path"):
        paths.append(record["a_share_prospectus_source_path"])
    if record.get("a_share_web_source_path"):
        paths.append(record["a_share_web_source_path"])
    return paths
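
The new a_share_* lookups in this hunk use `record.get(...)` while the older fields use `record[...]`. One plausible reading is that datasets built before this commit lack the new columns. The standalone sketch below illustrates that tolerance; the helper name `append_a_share_evidence` and the sample paths are hypothetical and not part of the commit.

```python
# Column names taken from the hunk above; everything else is illustrative.
OPTIONAL_EVIDENCE_KEYS = (
    "a_share_prospectus_source_path",
    "a_share_web_source_path",
)


def append_a_share_evidence(paths: list[str], record: dict[str, str]) -> list[str]:
    """Append A-share evidence paths when present; skip missing or empty columns."""
    for key in OPTIONAL_EVIDENCE_KEYS:
        value = record.get(key)  # .get() avoids KeyError on older datasets
        if value:
            paths.append(value)
    return paths


# A record from an older dataset (no a_share_* columns) passes through unchanged.
print(append_a_share_evidence(["raw/prospectus.txt"], {}))
```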
@@ -455,6 +490,10 @@ def build_report(record: dict[str, str], rows: list[dict[str, str]], stage: str,
"",
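        facts_table(record, stage),
        "",
        # Section heading: "A/H or Mainland-listed share capital check".
        "## A/H 或内地上市股本检查",
        "",
        ah_overlay(record),
        "",
        # Section heading: "Short-term exit model inference".
        "## 短线退出模型推断",
        "",
        # "D1 positive-return probability".
        f"- D1 正收益概率:{fmt_pct_rate(metric.d1_positive_rate)}",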