Commit Graph

57 Commits

Author SHA1 Message Date
geometrybase eae427d85b Add PDF text extraction workflow
Request:
- Provide a way to install or develop a PDF extraction tool for archived HK IPO documents.

Changes:
- Add requirements.txt with pypdf as the lightweight PDF text extraction dependency.
- Add scripts/extract_pdf_text.py to extract text from PDF source_refs into repo-relative data/extracted_text files.
- Add extracted text outputs and an extracted_text_manifest snapshot for the six archived HKEXnews PDFs.
- Document the extraction workflow in README.md.
- Ignore .venv and keep generated SQLite/Python transient files out of git.
- Use extracted text to verify the 06106 full prospectus, update source_refs, remove the related data gap, and fill 06106 offering terms.

Verification:
- Installed python3.14-venv system support, created a local .venv, and installed requirements.txt.
- Re-ran scripts/bootstrap_historical_data.py and scripts/extract_pdf_text.py.
- Verified extracted text paths and hashes against data/snapshots/extracted_text_manifest.csv.
- Verified SQLite integrity and snapshot row counts.
- Ran git diff --cached --check and searched durable files for machine-specific absolute paths.
2026-06-15 06:21:16 +00:00
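The manifest check listed under Verification in the commit above could look roughly like the following. This is a minimal sketch, not the repository's actual script: the column names `path` and `sha256` and the helper names `sha256_of` and `verify_manifest` are assumptions made for illustration, since the layout of extracted_text_manifest.csv is not shown here.

```python
import csv
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large extracted texts are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(repo_root: Path, manifest: Path) -> list[str]:
    """Return a list of problems; an empty list means every entry checked out.

    Assumes (hypothetically) that the manifest has `path` and `sha256` columns.
    """
    problems = []
    with manifest.open(newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            rel = Path(row["path"])
            if rel.is_absolute():
                # Durable files are expected to hold repo-relative paths only.
                problems.append(f"absolute path: {rel}")
                continue
            target = repo_root / rel
            if not target.is_file():
                problems.append(f"missing file: {rel}")
            elif sha256_of(target) != row["sha256"]:
                problems.append(f"hash mismatch: {rel}")
    return problems
```

Calling `verify_manifest(Path("."), Path("data/snapshots/extracted_text_manifest.csv"))` from the repository root would then return an empty list when every listed file exists and its hash matches.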
geometrybase 7a8c648d87 Bootstrap HK IPO historical archive
Request:
- Use the project archivist workflow to update historical IPO data.

Changes:
- Add an embedded SQLite archive at data/hk_ipo.sqlite.
- Add schema/hk_ipo.schema.sql and scripts/bootstrap_historical_data.py for reproducible archive generation.
- Archive HKEXnews source PDFs for 06658, 06675, and 06106 under repo-relative data/raw paths.
- Export Git-friendly snapshots for ipo_master, offering_terms, ipo_demand, source_refs, and data_gaps.
- Add .gitignore rules for Python cache and SQLite transient files.

Verification:
- Re-ran the bootstrap script successfully.
- Ran PRAGMA integrity_check on the SQLite database.
- Verified source_refs paths are repo-relative, files exist, and SHA-256 hashes match.
- Verified snapshot row counts match SQLite table counts.
- Ran git diff --check and searched generated durable files for machine-specific absolute paths.
2026-06-15 06:13:27 +00:00
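The integrity and row-count checks described in the commit above could be sketched as follows. The table names come from the commit message; the assumption that each snapshot is a CSV file named `<table>.csv` with a header row, and the helper name `check_archive`, are hypothetical and may not match the repository.

```python
import csv
import sqlite3
from pathlib import Path

# Table names as listed in the commit message.
TABLES = ["ipo_master", "offering_terms", "ipo_demand", "source_refs", "data_gaps"]


def check_archive(db_path: Path, snapshot_dir: Path, tables=TABLES) -> list[str]:
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    conn = sqlite3.connect(db_path)
    try:
        # PRAGMA integrity_check yields a single row containing "ok" when healthy.
        result = conn.execute("PRAGMA integrity_check").fetchone()[0]
        if result != "ok":
            problems.append(f"integrity_check: {result}")

        for table in tables:
            db_rows = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
            # Assumption: snapshots are stored as <table>.csv with a header row.
            snapshot = snapshot_dir / f"{table}.csv"
            with snapshot.open(newline="", encoding="utf-8") as fh:
                csv_rows = sum(1 for _ in csv.DictReader(fh))
            if db_rows != csv_rows:
                problems.append(f"{table}: sqlite={db_rows} snapshot={csv_rows}")
    finally:
        conn.close()
    return problems
```

Returning a list of problems rather than raising on the first failure lets a single run report every mismatched table at once.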
geometrybase 6b6df26271 Remove explicit push restriction
Request:
- Delete the AGENTS.md rule that allowed pushing only when explicitly requested.

Changes:
- Remove the single Git Workflow bullet that restricted push behavior.

Verification:
- Reviewed the focused diff for AGENTS.md.
- Confirmed no remaining push-related text with rg.
2026-06-15 06:05:04 +00:00
geometrybase 408ba59bc6 Document HK IPO project workflow
Request:
- Write a README introducing the project.

Changes:
- Describe the HK IPO research feedback loop.
- Document the stage-based workflow, project-local skills, storage model, path rules, and Git discipline.

Verification:
- Reviewed README contents with sed.
- Ran rg for machine-specific absolute path patterns; none found.
- Ran git diff --check.
2026-06-15 06:02:10 +00:00
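Several commits on this page mention searching durable files for machine-specific absolute paths with rg. A rough Python equivalent is sketched below; the regular expression, the list of file suffixes, and the function name `find_absolute_paths` are illustrative assumptions, not the patterns actually used in the project.

```python
import re
from pathlib import Path

# Hypothetical patterns: common home-directory prefixes and Windows drive paths.
ABSOLUTE_PATH_PATTERN = re.compile(r"(/home/|/Users/|[A-Za-z]:\\)")

# Directories that hold transient or non-durable content.
SKIPPED_DIRS = {".git", ".venv", "__pycache__"}


def find_absolute_paths(root: Path, suffixes=(".md", ".csv", ".sql", ".py", ".txt")):
    """Return (relative_path, line_number) pairs for lines that look machine-specific."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        rel = path.relative_to(root)
        if SKIPPED_DIRS.intersection(rel.parts):
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if ABSOLUTE_PATH_PATTERN.search(line):
                hits.append((rel.as_posix(), lineno))
    return hits
```

An empty result corresponds to the "none found" outcome recorded in the verification notes.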
geometrybase a138ef3193 Add project agent workflow instructions
Request:
- Review the project agent instructions and make the Git workflow explicitly automatic.
- Commit the project-local modification into the repository.

Changes:
- Add AGENTS.md to the repo.
- Rename the Git workflow section to emphasize automatic commits.
- Clarify that completed repository changes should be committed before the final response.
- Clarify that related project-local files such as .codex skills, schema, scripts, snapshots, memos, and documentation belong in the focused commit.

Verification:
- Reviewed the updated Git Workflow section with sed.
- Confirmed the expected automatic-commit language with rg.
- Checked the staged diff includes only AGENTS.md.
2026-06-15 06:00:26 +00:00
geometrybase 67b78cc172 Add project-local HK IPO skills
Request:
- Keep the HK IPO workflow skills inside the repo so they travel with the project.
- Use concise names while preserving clear HK IPO scope and repo-relative path rules.

Changes:
- Add .codex/skills/archivist for source archiving, SQLite fact updates, hashes, and CSV snapshots.
- Add .codex/skills/analyst for T0/T1/T2 IPO decisions, prediction cards, reviews, and rule-change recommendations.
- Add agents/openai.yaml metadata for both skills.

Verification:
- Checked staged changes include only .codex skill files.
- Searched .codex for machine-specific absolute path patterns; none found.

Next useful context:
- AGENTS.md remains untracked and was not included in this commit.
2026-06-15 05:53:53 +00:00
geometrybase 6907418731 first commit
2026-06-15 05:43:41 +00:00