Methodology — Balance of Power

Pipeline summary

For every (country, sector) cell the pipeline asks a local LLM (huihui_ai/gemma-4-abliterated:e4b) to generate four targeted search queries, runs them as fresh searches through DuckDuckGo, retrieves the real result snippets, and asks the same model to score the cell on an 8-point US–China scale that runs from Solid US through Tilt US and Tilt China to Solid China. Every retrieved URL is preserved verbatim; the model never invents a source. A post-generation citation gate strips any [n] reference that does not correspond to a retrieved source. Source domains are tagged primary / data / think-tank / news / other based on a curated keyword list.
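The citation gate described above can be sketched in a few lines. This is a minimal illustration, not the pipeline's actual code: the function name `citation_gate`, the 1-based numbering of markers, and the whitespace tidy-up are all assumptions.

```python
import re

# Matches numeric citation markers such as [3].
CITATION = re.compile(r"\[(\d+)\]")

def citation_gate(text: str, sources: list[str]) -> tuple[str, dict[int, str]]:
    """Drop [n] markers with no matching retrieved source; map the rest to URLs.

    Assumption: markers are 1-based indexes into the list of retrieved URLs.
    """
    kept: dict[int, str] = {}

    def _check(match: re.Match) -> str:
        n = int(match.group(1))
        if 1 <= n <= len(sources):
            kept[n] = sources[n - 1]
            return match.group(0)   # valid marker: keep it unchanged
        return ""                   # marker points at nothing: strip it

    cleaned = CITATION.sub(_check, text)
    # Tidy the gaps left behind by stripped markers.
    cleaned = re.sub(r"\s+([.,;:])", r"\1", cleaned)
    cleaned = re.sub(r" {2,}", " ", cleaned).strip()
    return cleaned, kept

draft = "US leads in fabs [1]. China dominates packaging [7]."
retrieved = ["https://example.org/a", "https://example.org/b"]
clean_text, citation_map = citation_gate(draft, retrieved)
```

A check of this kind only confirms that each surviving marker resolves to a retrieved URL; it says nothing about whether the sentence around the marker reflects that page.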

Pipeline flowchart

Every step of a single run, end to end. Colour coding: blue = I/O, violet = LLM call, teal = network search, amber = guardrail / validation, green = external feed, olive = decision, rose = side-channel notification.

flowchart TD
    Start(["Pipeline start"]):::st --> LoadProg["Load progress.json<br/>last-completed dates"]:::io
    LoadProg --> InfraGate{"Any feed past staleness threshold?"}:::dec
    InfraGate -->|No| IdentityPrime
    InfraGate -->|Yes| Infra

    subgraph Infra ["Infrastructure refresh"]
        direction LR
        I1["Submarine cables<br/>TeleGeography V3<br/>tag US / CN / CONTESTED"]:::feed
        I2["Military satellites<br/>Celestrak TLE<br/>name-keyword owner tag"]:::feed
        I3["US carriers<br/>LLM estimate from news<br/>JSON-mode enforced"]:::feed
        I4["GDP<br/>World Bank PPP<br/>NY.GDP.MKTP.PP.CD"]:::feed
        I5["Trade flows<br/>UN Comtrade v3<br/>US + CN bilateral totals"]:::feed
    end

    Infra --> IdentityPrime
    IdentityPrime["Identity priming<br/>Each model self-assesses bias<br/>once per session, hard-fail if empty"]:::llm
    IdentityPrime --> Queue
    Queue["Build work queue<br/>stale cells first, then oldest<br/>continuous cycling when all fresh"]:::stg
    Queue --> Cell

    subgraph Cell ["Per cell loop"]
        direction TB
        C1["1. Proxy-gen: 4 targeted queries<br/>JSON-mode enforced"]:::llm
        C2["2. Search: DDG + Bing CN<br/>ZH query translation per term<br/>circuit-breaker after 10 failures"]:::net
        C3["3. Tag every URL by tier<br/>primary / data / think-tank /<br/>news / other"]:::stg
        C4["4. Gemma independent score<br/>8-point scale + citations<br/>JSON-mode enforced"]:::llm
        C4b["5. Qwen independent score<br/>8-point scale + citations<br/>JSON-mode enforced"]:::llm
        C5["6. Citation gate x2<br/>strip hallucinated markers,<br/>map valid ones to real URLs"]:::grd
        C6["7. Debate engine<br/>detect disagreement, run rounds<br/>search-augment if requested<br/>reconcile to final verdict"]:::llm
        C7["8. Save data.json + report.md<br/>winner, evidence, citations,<br/>full debate transcript"]:::io
        C8["9. Update progress.json"]:::io
        C1 --> C2 --> C3 --> C4 --> C4b --> C5 --> C6 --> C7 --> C8
    end

    Cell --> CompleteCheck{"Country at 100 percent complete?"}:::dec
    CompleteCheck -->|Yes| Bluesky["Optional Bluesky post<br/>completion or flip alert"]:::sid
    CompleteCheck -->|No| DeployCheck
    Bluesky --> DeployCheck
    DeployCheck{"Every 10 entries?"}:::dec
    DeployCheck -->|No| TimeCheck
    DeployCheck -->|Yes| Build

    subgraph Build ["Website build"]
        direction TB
        B1["Walk all data.json files<br/>build master_data"]:::stg
        B2["Aggregate per-country wins<br/>US %, CN %, contentiousness avg"]:::stg
        B3["Calculate SPI<br/>tier x log1p PPP GDP<br/>+ 2,000,000 point pool"]:::stg
        B4["Detect flips vs previous run"]:::grd
        B5["Aggregate citations<br/>tier counts, dedup by URL"]:::stg
        B6["Write strategic_overlays.json<br/>carrier positions, cable landings"]:::stg
        B7["Render pages: globe, list,<br/>methodology, per-country,<br/>sitemap, robots"]:::stg
        B1 --> B2 --> B3 --> B4 --> B5 --> B6 --> B7
    end

    Build --> Deploy["Deploy to Cloudflare Pages<br/>via wrangler"]:::io
    Deploy --> TimeCheck
    TimeCheck{"Runtime limit reached?"}:::dec
    TimeCheck -->|No| Queue
    TimeCheck -->|Yes| EndRun(["Session complete"]):::dn

    classDef st fill:#1a3a1a,stroke:#3a8c3a,color:#cfe6cf
    classDef dn fill:#1a3a1a,stroke:#3a8c3a,color:#cfe6cf
    classDef io fill:#0f1620,stroke:#3a5a8c,color:#9ec0ec
    classDef stg fill:#141414,stroke:#2a2a2a,color:#cccccc
    classDef llm fill:#1a1428,stroke:#5a3a8c,color:#bfa4e0
    classDef net fill:#10181a,stroke:#3a6a6a,color:#9ec8c8
    classDef grd fill:#1f1810,stroke:#8c6a2a,color:#d8c08a
    classDef feed fill:#0f1410,stroke:#3a6a3a,color:#bfd8bf
    classDef sid fill:#1c1014,stroke:#7a3a4a,color:#d8a0b0
    classDef dec fill:#181410,stroke:#7a6a3a,color:#d8c89a
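The search step in the flowchart mentions a circuit-breaker after 10 failures. The sketch below shows one way such a guard could work. The class name, the choice to count consecutive failures (rather than total), and the decision to return an empty result list when the breaker is open are assumptions, not details taken from the pipeline.

```python
from typing import Callable

class SearchCircuitBreaker:
    """Stop calling a search backend after too many consecutive failures."""

    def __init__(self, max_failures: int = 10) -> None:
        self.max_failures = max_failures
        self.failures = 0  # consecutive failures seen so far (assumed semantics)

    @property
    def is_open(self) -> bool:
        return self.failures >= self.max_failures

    def call(self, search_fn: Callable[[str], list], query: str) -> list:
        if self.is_open:
            return []              # breaker open: skip the network call entirely
        try:
            results = search_fn(query)
        except Exception:
            self.failures += 1     # count the failure and degrade to "no results"
            return []
        self.failures = 0          # any success resets the counter
        return results

# Usage with a hypothetical backend that always fails.
attempts = []

def flaky_search(query: str) -> list:
    attempts.append(query)
    raise ConnectionError("backend unavailable")

breaker = SearchCircuitBreaker(max_failures=10)
for i in range(12):
    breaker.call(flaky_search, f"query {i}")
```

In this run the backend is contacted ten times and the last two calls are skipped, so a dead search leg cannot stall the rest of the per-cell loop.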

Structural Presence Index

Per-country SPI is the tier-weighted share of sectors won by each power. Sector tiers:

Tier 1 (×3) — structural / military: Military Engineering Cooperation, Military Planning Cooperation, Cybersecurity Cooperation, Semiconductor Supply Chain, 5G Telecommunications
Tier 2 (×2) — relational lock-in: Artificial Intelligence Export, Satellite Internet Infrastructure, Port Management and Logistics, Renewable Energy Investment, Spaceport and Launch Capabilities, Rare Earth Mineral Mining, Biotech and Genomic Research, Economic Imports, Economic Exports, Financial Cooperation
Tier 3 (×1) — ambient / soft power: Electric Vehicle Manufacturing, Tourism (Both ways), Public Reception, Immigration & Emigration, Cultural Influence
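As a rough illustration of the tier-weighted share, the sketch below uses the sector lists and weights given above. The function name `country_spi`, the `"US"` / `"CN"` winner labels, and the assumption that every scored cell already carries a single winner are illustrative choices, not the pipeline's confirmed data model.

```python
TIER_WEIGHT = {1: 3, 2: 2, 3: 1}

SECTORS_BY_TIER = {
    1: ["Military Engineering Cooperation", "Military Planning Cooperation",
        "Cybersecurity Cooperation", "Semiconductor Supply Chain",
        "5G Telecommunications"],
    2: ["Artificial Intelligence Export", "Satellite Internet Infrastructure",
        "Port Management and Logistics", "Renewable Energy Investment",
        "Spaceport and Launch Capabilities", "Rare Earth Mineral Mining",
        "Biotech and Genomic Research", "Economic Imports", "Economic Exports",
        "Financial Cooperation"],
    3: ["Electric Vehicle Manufacturing", "Tourism (Both ways)",
        "Public Reception", "Immigration & Emigration", "Cultural Influence"],
}

# Flatten to sector -> weight for quick lookup.
SECTOR_WEIGHT = {
    sector: TIER_WEIGHT[tier]
    for tier, sectors in SECTORS_BY_TIER.items()
    for sector in sectors
}

def country_spi(winners: dict[str, str]) -> dict[str, float]:
    """Tier-weighted share of scored sectors won by each power.

    `winners` maps sector name -> "US" or "CN" (hypothetical labels).
    """
    won = {"US": 0, "CN": 0}
    for sector, power in winners.items():
        won[power] += SECTOR_WEIGHT[sector]
    total = sum(won.values())
    if total == 0:
        return {"US": 0.0, "CN": 0.0}
    return {power: points / total for power, points in won.items()}

example = country_spi({
    "Semiconductor Supply Chain": "US",   # tier 1, weight 3
    "5G Telecommunications": "CN",        # tier 1, weight 3
    "Economic Imports": "CN",             # tier 2, weight 2
    "Cultural Influence": "US",           # tier 3, weight 1
})
```

With all 20 sectors scored, the total weight is 5×3 + 10×2 + 5×1 = 40, so a single Tier 1 sector moves a country's share by 7.5 percentage points while a Tier 3 sector moves it by 2.5.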

Regional and global SPI are GDP-weighted means of the per-country scores, where the weight is log1p(PPP GDP) from the World Bank indicator NY.GDP.MKTP.PP.CD. A 2,000,000-point pool is then distributed proportionally to those weights and split across sectors by tier.
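The aggregation can be sketched as follows. The GDP figures are placeholders, the function names are hypothetical, and the per-sector split (a country's allocation divided across its 20 sectors in proportion to tier weight) is one plausible reading of "split across sectors by tier", not a confirmed detail.

```python
import math

TIER_WEIGHT = {1: 3, 2: 2, 3: 1}
SECTORS_PER_TIER = {1: 5, 2: 10, 3: 5}
TOTAL_TIER_WEIGHT = sum(TIER_WEIGHT[t] * SECTORS_PER_TIER[t] for t in TIER_WEIGHT)  # 40

def gdp_weights(ppp_gdp: dict[str, float]) -> dict[str, float]:
    """Weight each country by log1p of its PPP GDP (NY.GDP.MKTP.PP.CD)."""
    return {country: math.log1p(value) for country, value in ppp_gdp.items()}

def weighted_mean(scores: dict[str, float], weights: dict[str, float]) -> float:
    """GDP-weighted mean of per-country SPI scores."""
    total = sum(weights[c] for c in scores)
    return sum(scores[c] * weights[c] for c in scores) / total

def distribute_pool(weights: dict[str, float], pool: float = 2_000_000) -> dict[str, float]:
    """Share the point pool across countries in proportion to their weights."""
    total = sum(weights.values())
    return {country: pool * w / total for country, w in weights.items()}

def sector_points(country_points: float, tier: int) -> float:
    """Assumed split: each sector receives its tier weight's share of the country total."""
    return country_points * TIER_WEIGHT[tier] / TOTAL_TIER_WEIGHT

# Placeholder GDP values in international dollars, not real data.
weights = gdp_weights({"A": 1e12, "B": 1e10})
allocation = distribute_pool(weights)
regional_us_share = weighted_mean({"A": 0.70, "B": 0.40}, weights)
```

The logarithm compresses the spread considerably: in this example country A has 100 times the GDP of country B but only about 1.2 times the weight, so large economies tilt the aggregate without drowning out small ones.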

Source-quality tiers

Every retrieved URL is tagged by domain into one of: primary (governments, IGOs, regulatory filings), data (established quantitative trackers), think_tank (recognised research institutes), news (established wire services and major outlets), or other (everything else). The tier is shown as a small badge next to each citation. The classifier is a coarse domain-keyword heuristic; it under-classifies primary sources from jurisdictions whose government domains are not in the keyword list, and it cannot tell a strong news article from a weak one within the same outlet.
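A domain-keyword heuristic of this kind might look like the sketch below. The keyword lists are short illustrative stand-ins, not the curated list the pipeline uses, and the function name `classify_tier` is hypothetical.

```python
from urllib.parse import urlparse

# Checked in order; the first tier with a matching keyword wins.
# These keywords are examples only, not the project's curated list.
TIER_KEYWORDS = [
    ("primary", (".gov", ".mil", "europa.eu", "un.org")),
    ("data", ("worldbank.org", "sipri.org")),
    ("think_tank", ("csis.org", "brookings.edu")),
    ("news", ("reuters.com", "apnews.com")),
]

def classify_tier(url: str) -> str:
    """Return the source-quality tier for a URL based on its hostname."""
    host = (urlparse(url).hostname or "").lower()
    for tier, keywords in TIER_KEYWORDS:
        if any(keyword in host for keyword in keywords):
            return tier
    return "other"

tiers = {
    url: classify_tier(url)
    for url in (
        "https://www.state.gov/briefing",
        "https://data.worldbank.org/indicator/NY.GDP.MKTP.PP.CD",
        "https://www.reuters.com/world/",
        "https://www.regjeringen.no/en/",   # Norwegian government portal
    )
}
```

The last URL shows the under-classification problem directly: it is a government site, but because its domain matches none of the example keywords it falls through to other.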

Chinese-internet sourcing (Phase 1)

English-language search alone systematically under-samples the Sinophone information space, which is precisely the side of the US–China contest where Beijing is the active player. To partially close that gap, every cell also runs a parallel Chinese-language search leg: each English query is translated into Simplified Chinese by the model, the translated query is run through cn.bing.com (Microsoft's mainland-China Bing index, which operates under local content rules), and the resulting snippets are translated back into English for the judge. The Chinese original is preserved alongside the translation in the citation manifest so reviewers can audit translation drift.
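The round trip can be sketched as below. The `translate` and `search` callables stand in for the model call and the cn.bing.com request; their signatures, the `ZhCitation` record, and its field names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass
class ZhCitation:
    """One manifest entry keeping the Chinese original next to its translation."""
    url: str
    query_zh: str
    snippet_zh: str
    snippet_en: str

def zh_search_leg(
    query_en: str,
    translate: Callable[[str, str], str],                 # (text, target_lang) -> text
    search: Callable[[str], Iterable[tuple[str, str]]],   # query -> (url, snippet) pairs
) -> list[ZhCitation]:
    query_zh = translate(query_en, "zh-Hans")
    manifest = []
    for url, snippet_zh in search(query_zh):
        manifest.append(
            ZhCitation(
                url=url,
                query_zh=query_zh,
                snippet_zh=snippet_zh,                    # original kept for audit
                snippet_en=translate(snippet_zh, "en"),   # what the judge reads
            )
        )
    return manifest

# Stub backends so the sketch runs without a model or network access.
def fake_translate(text: str, target: str) -> str:
    return f"[{target}] {text}"

def fake_search(query: str) -> list[tuple[str, str]]:
    return [("https://example.cn/article", "示例摘要")]

entries = zh_search_leg("port investment", fake_translate, fake_search)
```

Keeping `snippet_zh` and `snippet_en` on the same record is what makes translation drift auditable: a reviewer can compare the two fields without re-running the search.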

Source-stance taxonomy. Chinese-language domains are tagged with an editorial-stance pill that displays alongside the standard quality tier (e.g. a Caixin article reads NEWS · ZH-INDEPENDENT; a Xinhua release reads PRIMARY · ZH-STATE). The five buckets:

zh_state — mainland state media (Xinhua, People's Daily, CCTV, China Daily, Global Times). Reads as a primary source for what Beijing claims, not for ground truth.
zh_party — explicit CCP organs (Qiushi).
zh_independent — mainland commercial / less party-aligned outlets that still operate inside the censorship envelope but produce real reporting (Caixin, 36Kr, The Paper, Yicai).
zh_diaspora — Sinophone media outside mainland censorship (RFA Mandarin, VOA Mandarin, Initium, HKFP, Taiwanese press, BBC/DW Chinese).
zh_translated — English-language outlets that translate / curate Chinese content (China Digital Times, Sinocism, ChinaFile, MERICS, SCMP).
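A minimal sketch of the stance lookup and the combined badge follows. The domain-to-stance table is a small illustrative sample, and the function names are hypothetical; only the five bucket names and the badge format come from the text above.

```python
from typing import Optional
from urllib.parse import urlparse

# Illustrative sample only; the real mapping is assumed to be much larger.
STANCE_BY_DOMAIN = {
    "xinhuanet.com": "zh_state",
    "qstheory.cn": "zh_party",
    "caixin.com": "zh_independent",
    "rfa.org": "zh_diaspora",
    "chinadigitaltimes.net": "zh_translated",
}

def stance_for(url: str) -> Optional[str]:
    """Return the editorial-stance bucket for a URL, or None if untagged."""
    host = (urlparse(url).hostname or "").lower()
    for domain, stance in STANCE_BY_DOMAIN.items():
        if host == domain or host.endswith("." + domain):
            return stance
    return None

def badge(tier: str, stance: Optional[str]) -> str:
    """Render the badge text, e.g. 'NEWS · ZH-INDEPENDENT'."""
    parts = [tier] if stance is None else [tier, stance]
    return " · ".join(p.replace("_", "-").upper() for p in parts)

caixin_badge = badge("news", stance_for("https://www.caixin.com/2024/story.html"))
xinhua_badge = badge("primary", stance_for("https://www.xinhuanet.com/politics/"))
plain_badge = badge("news", stance_for("https://www.reuters.com/world/"))
```

Matching on the registered domain and its subdomains, rather than on a bare substring, avoids tagging an unrelated host that merely contains one of the names.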

Honest caveat. Reading state media does not make analysis less biased; it makes it biased in a different direction. Mainland state media exists to manufacture a specific narrative. The tier+stance badging surfaces that distinction at the citation level so a verdict resting heavily on zh_state sources is visibly less robust than one drawing across zh_state, zh_independent, and zh_diaspora.

Phase 2/3 (planned, not yet active). Sogou WeChat public-account search; direct site-search wrappers for Caixin, Initium, China Digital Times; a triangulation rule that requires multi-stance coverage before high-confidence verdicts; Zhihu and Weibo integration.

Limitations

This is a real product but it is not a peer-reviewed analytical index. The honest list:

Single LLM as judge. Every cell is graded by one model in one shot, with no ensemble, no calibration, and no inter-rater agreement. There is currently no held-out gold set against which accuracy can be measured.

Multilingual source bias. As of Phase 1, every cell runs a parallel Chinese-language search leg through cn.bing.com with model-driven translation in both directions (see "Chinese-internet sourcing"), so the English-only bias is reduced but not removed. Lusophone, Russophone, Arabophone, and Francophone perspectives are still under-sampled; closing those gaps is on the roadmap.

Tier weights are asserted, not validated. The Tier 1/2/3 weighting is a defensible editorial choice, but no sensitivity analysis has been run to show how stable the SPI rankings are under alternative weightings.

Forced binary. The 8-point scale forbids "tied" or "neither", which discards real information about countries that genuinely hedge (India, UAE, Indonesia, Brazil).

Citation gate is shape-only. The gate guarantees every [n] points at a real retrieved URL. It does not verify that the surrounding sentence faithfully summarises that URL. Misrepresentation of cited sources is the residual hallucination risk.

Carrier positions are illustrative. Aircraft-carrier coordinates on the globe are LLM estimates derived from open-source news, not AIS or fleet-tracker data. They should be treated as a sketch of fleet posture, not a real-time track.

Reproducibility. DuckDuckGo results vary by time, IP, and rate-limiting. Two runs of the same cell will not return identical sources, and the model's verdict can differ. Verdicts should be read as snapshots, not deterministic conclusions.

Coverage at a glance

20 countries × 20 sectors (400 cells). Citation index currently holds 8,759 unique source URLs across 3,726 domains (12,329 total references).

Sinophone share: 264 Chinese-tagged sources — state 217, independent 2, diaspora 6, translated 39, party 0.
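The proportions behind these counts are easy to derive. The short calculation below uses only the figures quoted above and assumes the Chinese-tagged sources are counted within the unique-URL index; the variable names are illustrative.

```python
unique_urls = 8759
stance_counts = {
    "state": 217,
    "independent": 2,
    "diaspora": 6,
    "translated": 39,
    "party": 0,
}

sinophone_total = sum(stance_counts.values())
# Share of the whole index that carries a Chinese stance tag.
sinophone_share = sinophone_total / unique_urls
# Breakdown of the Chinese-tagged sources by stance bucket.
stance_share = {name: count / sinophone_total for name, count in stance_counts.items()}

for name, share in sorted(stance_share.items(), key=lambda kv: -kv[1]):
    print(f"{name:12s} {share:6.1%}")
print(f"Sinophone share of index: {sinophone_share:.1%}")
```

On these numbers, Chinese-tagged sources make up about 3% of the index, and roughly 82% of them are state media, which is the imbalance the stance badges are meant to make visible.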