# hedgefund.wiki Architecture

This runbook documents the current API/data architecture, the local server test harness behavior, and known operational gaps.
It is intended for maintainers, Cloudflare Pages operators, and AI agents that need to update the corpus without guessing how generated artifacts are served.

## System Map

```mermaid
flowchart LR
  subgraph Source["Hand-authored source data"]
    Glossary["data/glossary-source.json<br/>1,074 legacy terms"]
    Data["data/*.json<br/>categories, strategies, regulations,<br/>calculators, conventions, sources, people"]
    DataSchemas["data/schema/*.json<br/>authoring contracts"]
  end

  subgraph Build["Build and enrichment"]
    BuildPy["scripts/build.py"]
    Enrich["term enrichment<br/>UUIDv5 ids, backlinks, checksums,<br/>search tokens, LLM chunks"]
  end

  subgraph Artifacts["Generated public artifacts"]
    ApiBuild["api-build/v1/**<br/>JSON, NDJSON, JSON-LD inputs,<br/>manifest, graph, OpenAPI, stats"]
    Discovery["llms.txt, llms-full.txt,<br/>sitemap.xml, .well-known/ai-plugin.json"]
    Schemas["schema/*.schema.json<br/>served JSON Schema 2020-12"]
  end

  subgraph Runtime["Cloudflare Pages runtime"]
    Static["Pages static assets<br/>index.html, docs/, schema/, api-build/"]
    Functions["functions/api/v1/**<br/>dynamic query, graph, search,<br/>bulk, compute, negotiation"]
    Helpers["functions/_lib/respond.js<br/>CORS, ETag, RFC 7807,<br/>content negotiation, asset reads"]
    Finmath["functions/_lib/finmath.js<br/>pure JS calculators"]
  end

  subgraph Consumers["Consumers"]
    Browsers["Humans and docs readers"]
    Agents["LLMs, RAG pipelines,<br/>API clients, crawlers"]
  end

  Glossary --> BuildPy
  Data --> BuildPy
  DataSchemas --> BuildPy
  BuildPy --> Enrich --> ApiBuild
  BuildPy --> Discovery
  BuildPy --> Schemas
  ApiBuild --> Static
  Discovery --> Static
  Schemas --> Static
  Static --> Functions
  Helpers --> Functions
  Finmath --> Functions
  Functions --> Agents
  Static --> Browsers
  Static --> Agents
```

## Request Path

```mermaid
sequenceDiagram
  participant Client
  participant Pages as Cloudflare Pages
  participant Fn as functions/api/v1/*
  participant Assets as env.ASSETS.fetch
  participant Build as api-build/v1/*

  Client->>Pages: GET /api/v1/terms?format=json
  Pages->>Fn: Invoke route module
  Fn->>Assets: Load generated JSON/NDJSON
  Assets->>Build: Read static artifact
  Build-->>Assets: Artifact body
  Assets-->>Fn: Response
  Fn->>Fn: Filter, paginate, project fields
  Fn->>Fn: Negotiate format, add CORS/ETag/cache
  Fn-->>Client: 200 JSON, NDJSON, JSON-LD, Markdown, or 304
```
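
The "Filter, paginate, project fields" step in the diagram can be pictured with a small sketch. It is written in Python for readability only; the real routes are JavaScript modules under `functions/api/v1/`. The parameter names `category`, `limit`, `offset`, and `fields`, the record keys, and the response envelope are all hypothetical and are not taken from the actual API.

```python
from __future__ import annotations

from typing import Any, Iterable


def query_terms(
    records: Iterable[dict[str, Any]],
    category: str | None = None,      # hypothetical filter parameter
    limit: int = 50,                  # hypothetical page size
    offset: int = 0,                  # hypothetical page start
    fields: list[str] | None = None,  # hypothetical projection list
) -> dict[str, Any]:
    """Filter, paginate, and project already-loaded records, in that order."""
    # 1. Filter: keep only records that match the requested category.
    matched = [r for r in records if category is None or r.get("category") == category]

    # 2. Paginate: slice the filtered list, not the full corpus.
    page = matched[offset : offset + limit]

    # 3. Project: drop every key the caller did not ask for.
    if fields is not None:
        page = [{k: r[k] for k in fields if k in r} for r in page]

    return {"total": len(matched), "offset": offset, "limit": limit, "items": page}
```

Filtering before pagination keeps `total` meaningful for clients that page through results, and projecting last means only one page of records is copied.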

`functions/_lib/respond.js` is the shared contract layer. Every successful API response carries CORS headers, `X-API-Version`, `X-License`, `ETag`, and cache headers, and its format is negotiated through `?format=json|ndjson|jsonld|md` or the `Accept` header. Errors use RFC 7807 `application/problem+json`.
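
A minimal sketch of how such format negotiation might work, in Python for illustration only (the real logic lives in `functions/_lib/respond.js` and may differ). The precedence of `?format=` over `Accept`, the JSON default, and the media types chosen for NDJSON and Markdown are assumptions, and `negotiate_format` is a hypothetical name.

```python
from __future__ import annotations

# Format tokens come from the documented `?format=` parameter.
# The media types for NDJSON and Markdown are assumptions.
FORMAT_TO_MEDIA_TYPE = {
    "json": "application/json",
    "ndjson": "application/x-ndjson",
    "jsonld": "application/ld+json",
    "md": "text/markdown",
}


def negotiate_format(query_format: str | None, accept: str | None) -> str:
    """Return one of json|ndjson|jsonld|md for a request."""
    # Assumption: an explicit ?format= value takes precedence over Accept.
    if query_format in FORMAT_TO_MEDIA_TYPE:
        return query_format

    # Otherwise take the first listed Accept media type that is recognised.
    # q-values are ignored in this simplified sketch.
    for part in (accept or "").split(","):
        media_type = part.split(";")[0].strip().lower()
        for fmt, known in FORMAT_TO_MEDIA_TYPE.items():
            if media_type == known:
                return fmt

    # Assumption: JSON is the default when nothing matches.
    return "json"
```

For example, `negotiate_format("md", "application/json")` selects Markdown under these assumptions, because the query parameter is checked first.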

## Local Test Harness

`node scripts/test_server.mjs` rebuilds the generated API artifacts automatically when they are absent. Before importing the Cloudflare Pages Function modules, the harness checks for `api-build/v1/manifest.json` and, if the manifest is missing, runs `python3 scripts/build.py`.

```mermaid
flowchart TD
  Start["node scripts/test_server.mjs"] --> Check["Does api-build/v1/manifest.json exist?"]
  Check -- "no" --> Build["python3 scripts/build.py"]
  Check -- "yes" --> Harness["Create local env.ASSETS.fetch"]
  Build --> Harness
  Harness --> Import["Import functions/api/v1 route modules"]
  Import --> Invoke["Call onRequestGet / onRequestPost / onRequestOptions"]
  Invoke --> Assert["36 endpoint assertions"]
  Assert --> Result["Exit 0 when all pass"]
```
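
The presence check at the top of the flowchart can be sketched as follows. This is an illustration in Python, not the harness code itself (the harness is a Node script); `ensure_artifacts` and its injectable `run` parameter are hypothetical names introduced so the decision can be exercised without a real build.

```python
from __future__ import annotations

import subprocess
from pathlib import Path
from typing import Callable, Sequence

BUILD_COMMAND = ["python3", "scripts/build.py"]


def ensure_artifacts(
    repo_root: Path,
    run: Callable[[Sequence[str]], object] | None = None,
) -> bool:
    """Run the build only when the manifest is missing; return True if it ran."""
    manifest = repo_root / "api-build" / "v1" / "manifest.json"
    if manifest.exists():
        # Presence-based: an existing manifest is trusted even if it is stale.
        return False

    # `run` is injectable so the branch can be tested without a real build.
    runner = run or (lambda cmd: subprocess.run(cmd, cwd=repo_root, check=True))
    runner(BUILD_COMMAND)
    return True
```

Note that the function never inspects the manifest contents, which is exactly why stale artifacts pass the check.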

### Constraints

- The harness simulates Cloudflare Pages by binding `env.ASSETS.fetch` to a local filesystem reader under the repo root.
- Auto-build is presence-based: it runs only when `api-build/v1/manifest.json` is missing. If source data changed and local artifacts already exist, run `python3 scripts/build.py` first or remove `api-build/` before running server tests.
- `scripts/build.py` writes the Git-ignored `api-build/v1/**` artifacts and also refreshes top-level discovery files such as `llms.txt`, `llms-full.txt`, `sitemap.xml`, and `.well-known/ai-plugin.json`.
- Use `BUILD_TIME=<iso timestamp> python3 scripts/build.py` when a reproducible generated timestamp is required.
- `python3 -m http.server 8080` can preview static files, but it does not execute Pages Functions. Use `node scripts/test_server.mjs` for Function behavior.
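
As an illustration of the `BUILD_TIME` constraint, a build script could resolve its timestamp roughly like this. How `scripts/build.py` actually reads the variable is not shown in this runbook, so the function name `resolve_build_time`, the early validation, and the fallback format are assumptions.

```python
from __future__ import annotations

import os
from datetime import datetime, timezone
from typing import Mapping


def resolve_build_time(env: Mapping[str, str] | None = None) -> str:
    """Return the pinned BUILD_TIME if set, otherwise the current UTC time."""
    env = os.environ if env is None else env
    pinned = env.get("BUILD_TIME")
    if pinned:
        # Validate early so a typo fails the build instead of leaking into artifacts.
        # The "Z" suffix is normalised because older Python versions reject it.
        datetime.fromisoformat(pinned.replace("Z", "+00:00"))
        return pinned
    # No pin: every run gets a different timestamp, hence the drift noted above.
    return datetime.now(timezone.utc).isoformat(timespec="seconds")
```

Two runs with the same `BUILD_TIME` value return the same string, which is what makes the generated timestamp reproducible.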

## Public Interface Inventory

| Surface | Codepath | Purpose | Validation |
|---|---|---|---|
| Static landing page | `index.html` | Maintenance/marketing shell; do not modify without explicit direction | Manual static preview |
| Developer docs | `docs/index.html`, `docs/*.md` | Public API guidance and operator runbooks | Static preview and Markdown review |
| Generated API data | `api-build/v1/**` | Canonical served corpus, graph, chunks, OpenAPI, manifest | `python3 scripts/build.py` and server tests |
| JSON Schemas | `schema/*.schema.json` | Public validation contracts | Build output plus schema review |
| Dynamic API routes | `functions/api/v1/**` | Query, search, graph, compute, bulk, format negotiation | `node scripts/test_server.mjs` |
| Financial math | `functions/_lib/finmath.js` | Deterministic calculator implementations | `node scripts/test_finmath.mjs` |
| Response contract | `functions/_lib/respond.js` | CORS, cache, ETag, RFC 7807, negotiation | Server tests |

## Developer Workflow

1. Update hand-authored source files under `data/` or route code under `functions/`.
2. If data changed, run `python3 scripts/build.py` and review generated discovery files.
3. Run `node scripts/test_finmath.mjs` for calculator math changes.
4. Run `node scripts/test_server.mjs` for API behavior. It will build missing `api-build/` artifacts, but it will not detect staleness when a manifest already exists.
5. Inspect `git status --short` and keep documentation-only changes separate from generated artifact churn unless the task explicitly requires regenerated public files.

## Troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| `Not Found` from a server harness test | Missing or stale `api-build/v1/**` artifact | Run `python3 scripts/build.py`; if needed, remove `api-build/` and rerun tests |
| Test passes locally but deployed route differs | Local generated artifacts do not match deployed Pages assets | Rebuild with the intended data snapshot and verify the deployment artifact list |
| Unexpected generated timestamp drift | Build ran without `BUILD_TIME` | Re-run with `BUILD_TIME=<iso timestamp>` for reproducible output |
| Static preview works but `/api/v1/*` behavior is absent | `python3 -m http.server` does not execute Pages Functions | Use the Node test harness for Function routes |
| `304` assertion fails | Response body or ETag contract changed | Check `functions/_lib/respond.js` and update tests only if the public cache contract intentionally changed |
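
The `304` row is easier to reason about with a concrete model of conditional requests. The sketch below assumes the ETag is derived from a hash of the response body; the actual derivation in `functions/_lib/respond.js` is not documented here, and `make_etag` and `conditional_status` are hypothetical names.

```python
from __future__ import annotations

import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the response body (hash choice is an assumption)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_status(body: bytes, if_none_match: str | None) -> int:
    """Return 304 when the client's validator matches the current body, else 200."""
    etag = make_etag(body)
    if if_none_match is None:
        return 200
    # If-None-Match may carry several comma-separated validators, or "*".
    candidates = [c.strip() for c in if_none_match.split(",")]
    # Weak validators ("W/...") are compared without the prefix.
    candidates = [c[2:] if c.startswith("W/") else c for c in candidates]
    return 304 if "*" in candidates or etag in candidates else 200
```

Under this model, any change to the serialized body produces a new ETag, so a previously cached validator stops matching and a test that expects `304` receives `200` instead.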

## Known Gaps and Next Iterations

| Gap | Risk | Suggested remediation |
|---|---|---|
| Auto-build only checks for manifest presence | Stale local artifacts can mask source-data drift | Add a lightweight freshness check comparing source mtimes or a build fingerprint in `manifest.json` |
| Generated artifacts are Git-ignored but required by route tests | New contributors may not understand why a clean clone needs a build | Keep this runbook linked from README and add a CI log line when auto-build runs |
| API README historically drifted from `/api/v1` | Clients may copy old `/api/*` paths | Treat `/api/README.md` as a public contract and review it whenever routes change |
| No dedicated CI workflow documented in repo | Operators must infer required commands from `AGENTS.md` | Add a checked-in CI workflow or a `make test`/`npm test` wrapper when the repo standardizes automation |
| Build output mixes API artifacts with top-level discovery files | Data-only edits can produce broad diffs | Document expected generated files in PR templates or split reproducible generation into an explicit release step |
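
One way the suggested build fingerprint could work is sketched below. It is a proposal, not existing behavior: the `source_fingerprint` manifest key and both function names are hypothetical, and the sketch assumes every relevant source file is a `*.json` file under `data/`.

```python
from __future__ import annotations

import hashlib
import json
from pathlib import Path


def source_fingerprint(data_dir: Path) -> str:
    """Hash the relative path and bytes of every JSON file under data/."""
    digest = hashlib.sha256()
    for path in sorted(data_dir.rglob("*.json")):
        digest.update(path.relative_to(data_dir).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def artifacts_are_fresh(repo_root: Path) -> bool:
    """True only if the manifest exists and records the current fingerprint."""
    manifest_path = repo_root / "api-build" / "v1" / "manifest.json"
    if not manifest_path.exists():
        return False
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    # `source_fingerprint` is a hypothetical manifest key, not an existing field.
    return manifest.get("source_fingerprint") == source_fingerprint(repo_root / "data")
```

A content hash is used here instead of mtimes because file modification times change on checkout and can differ between machines, while the hash depends only on the source data.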

## Documentation UI Suggestions

- Add a docs card on `docs/index.html` for architecture/runbook pages so human readers can discover operator guidance without browsing the repo tree.
- Publish Mermaid diagrams through a rendered documentation site or GitHub Wiki mirror for richer architecture visuals.
- Add a short "agent quick path" section to the README that points autonomous consumers at `llms.txt`, `/api/v1/manifest`, `/api/v1/openapi.json`, and this architecture page.
- Track future agent suggestions in this page's "Known Gaps and Next Iterations" table until the repo has a dedicated ADR or issue tracker workflow.
