Coherent Markets Engine V0.1
Constraint-projection structural inference for prediction markets. Drafted 2026-04-28. Byline: Convexly Research.
Abstract
The Coherent Markets Engine (CME) is a methodology for detecting structural mispricing on prediction markets. It reads observed prices from independently-priced contracts on Polymarket, builds an event-relationship graph encoding the probability axioms that related contracts must satisfy (additivity, mutually-exclusive sums, conditional Bayes consistency), and projects the observed price vector onto the constraint-feasible region. The projection identifies markets that are individually priced but globally inconsistent.
CME V0.1 emits a "Convexly Signal" only when a detected inconsistency passes three independent statistical and economic gates: a Benjamini-Hochberg-corrected significance test on the observed tension, a cost-adjusted expected-value check (Polymarket fees, slippage, and capital lockup), and a resolution risk score capturing UMA-oracle manipulation potential. Each signal carries a SHA-256 hash-chain audit-log row linking it to the frozen input artifacts and the methodology code commit at the time of detection.
Theoretical foundation: the Pennock-Lahaie-Kroer LCMM (Linearly Constrained Market Maker) lineage. Closest published peer: Saguillo, Ghafouri, Kiffer, and Suarez-Tangil (2025), "Unravelling the Probabilistic Forest" (arXiv:2508.03474, presented at the AFT 2025 conference). CME extends their retrospective per-pair coherence analysis to a real-time n-market polytope projection on a live public order book.
The structural opportunity
Prediction markets present a transparency advantage no other financial market provides: outcome contracts are settled against verifiable real-world events, and the market-implied probabilities of related contracts must obey known probability laws.
Two related markets that ask logically connected questions cannot carry arbitrary independent prices. If markets A and B cover mutually exclusive and exhaustive outcomes, their prices must sum to one. If A and B are non-exclusive, the price of the union must equal P(A) + P(B) − P(A ∩ B). Conditional markets must satisfy Bayes consistency. These constraints arise from probability theory, not from any regulatory rule.
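These axioms translate directly into executable checks. A minimal sketch in Python (the function names and the 2-cent tolerance are illustrative, not part of the CME specification):

```python
def mee_violation(prices, tol=0.02):
    """Tension in a mutually-exclusive-and-exhaustive group:
    outcome prices must sum to one (within a noise tolerance)."""
    return abs(sum(prices) - 1.0) > tol

def union_violation(p_a, p_b, p_union, p_both, tol=0.02):
    """Inclusion-exclusion: P(A or B) = P(A) + P(B) - P(A and B)."""
    return abs(p_union - (p_a + p_b - p_both)) > tol

# A three-way MEE event priced at 0.41 + 0.37 + 0.27 = 1.05 is incoherent.
print(mee_violation([0.41, 0.37, 0.27]))  # True
```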
Yet on every major U.S.-accessible prediction market, individual contracts are priced independently via order book or automated market maker. Cross-market constraints are not enforced. The result is a measurable, real-time stream of structural inconsistencies.
Saguillo et al. (2025) measured this gap empirically. Analyzing 86 million Polymarket bid records from April 2024 through April 2025, they identified approximately 39.6 million dollars in realized arbitrage from probability-axiom violations during that twelve-month window. CME V0.1 generalizes their retrospective per-pair analysis to a real-time n-market polytope projection.
Methodology
CME V0.1 implements a discrete pipeline of analytical modules, all reproducible from frozen input SHA-256 hashes:
- Pull active Polymarket market metadata via the gamma API. Each market in the snapshot includes outcome prices, liquidity, end date, and an event_id grouping related contracts.
- Build an event-relationship graph G. For each event_id with two or more constituent markets, classify the relationship type using a frozen heuristic: mutually-exclusive-and-exhaustive (MEE, sum equals one), mutually-exclusive-but-not-exhaustive (MENE, sum is at most one), or unclassified (no constraint applied).
- Project observed prices onto the constraint-feasible region: minimize a weighted squared-distance objective subject to G. Weights are liquidity-tier-derived, with a floor to avoid over-weighting thin markets. The V0.2 specification adds KL-divergence (I-projection) as the canonical objective, following the Csiszár 1975 axiom system.
- Compute a cost-adjusted expected value for each detected inconsistency. The cost stack includes the 2% Polymarket fee on the winning side at resolution, a liquidity-tier slippage estimate (V0.2 replaces this heuristic with real CLOB bid-ask depth), and a capital opportunity cost from a 4% annual treasury rate on locked position size from now until the listed end date.
- Test the statistical significance of each observed tension. The standard error of the per-event-group sum is derived from the bid-ask noise model. Tests are two-sided for MEE and one-sided upper for MENE. P-values are corrected for multiple testing using the Benjamini-Hochberg false discovery rate procedure.
- Score each constituent market's resolution risk. Four factors contribute: subjective wording detected in the market question or slug, liquidity tier (UMA manipulation cost scales with market stake), time-to-resolution proximity (insufficient dispute window increases risk), and high-risk category keywords. Markets with composite resolution risk above the threshold are excluded from the signal stream.
- Append a hash-chain provenance row to the audit log. Each row carries SHA-256 hashes of all six input and output artifacts, plus a previous-row hash linking to the prior log entry. The chain is verifiable by any reader; a single tampered row breaks the chain.
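For the single-constraint MEE case, the weighted squared-distance projection has a closed form via the Lagrange condition. A sketch (the weight values and function name are illustrative; CME's actual weights are liquidity-tier-derived):

```python
import numpy as np

def project_mee(prices, weights):
    """Weighted least-squares projection of observed prices onto the
    MEE constraint sum(q) = 1. Minimizing sum(w_i * (q_i - p_i)^2)
    gives q_i = p_i + lam / w_i, with lam fixed by the constraint.
    A thin market (small weight) absorbs more of the correction."""
    p = np.asarray(prices, float)
    inv_w = 1.0 / np.asarray(weights, float)
    lam = (1.0 - p.sum()) / inv_w.sum()
    return p + lam * inv_w

p = [0.41, 0.37, 0.27]               # sums to 1.05
q = project_mee(p, [3.0, 3.0, 1.0])  # the thin third market moves most
print(q, q.sum())                    # [0.40, 0.36, 0.24], sum = 1.0
```

The per-market displacement q − p is the raw tension signal that the downstream gates evaluate.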
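The cost stack described above composes additively against the gross edge. A minimal sketch of the expected-value gate, using the fee, slippage, and treasury-rate figures stated in this section (the function name and flat slippage figure are illustrative):

```python
def cost_adjusted_ev(gross_edge, stake, days_to_resolution,
                     fee=0.02, slippage=0.01, treasury_rate=0.04):
    """Net expected value of a detected inconsistency.
    fee: Polymarket's 2% charge on the winning side at resolution;
    slippage: liquidity-tier heuristic (V0.2 replaces with CLOB depth);
    treasury_rate: 4% annual opportunity cost on locked capital."""
    lockup_cost = treasury_rate * days_to_resolution / 365.0
    return stake * (gross_edge - fee - slippage - lockup_cost)

# A 5-cent gross edge on $100 of stake, 30 days to resolution:
# 0.05 - 0.02 - 0.01 - 0.04 * 30/365 ≈ 0.0167 → about $1.67 net
print(round(cost_adjusted_ev(0.05, 100.0, 30), 2))
```

Note how a small gross edge far from resolution can flip negative on lockup cost alone, which is why the gate is applied per signal rather than as a flat threshold.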
A "Convexly Signal" is any detected inconsistency that passes all three independent gates: statistically significant, cost-adjusted expected-value positive, and resolution risk below threshold. The three-gate design distinguishes methodologically real inefficiencies from market noise, fake alpha, and oracle-resolution-risk arbs.
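The significance gate's multiple-testing correction follows the standard Benjamini-Hochberg step-up procedure. A minimal sketch (parameter names are illustrative; CME's per-event p-values come from the bid-ask noise model described above):

```python
def benjamini_hochberg(pvals, alpha=0.05):
    """Boolean 'pass' mask under the BH step-up procedure: find the
    largest rank k (over p-values sorted ascending) with
    p_(k) <= (k/m) * alpha, then accept all hypotheses of rank <= k."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            cutoff_rank = rank
    passed = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            passed[i] = True
    return passed

print(benjamini_hochberg([0.001, 0.04, 0.03, 0.8]))
# [True, False, False, False]
```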
What CME is, and what it deliberately is not
CME uses observed market prices as the inference input. It does not aggregate trader forecasts. This distinction is empirically important.
Convexly's V2.8.2 paper (full methodology, AsPredicted #287,436, frozen 2026-04-26) tested twenty-four pre-registered aggregator variants that weight individual trader forecasts by an Edge Score composite. All twenty-four underperformed, and in the wrong direction: the skill-weighted aggregator scored 0.179 Brier points worse than the market-implied baseline (95% CI 0.164 to 0.193). The wash-filter equivalence test (AsPredicted #287,983, completed 2026-04-29) confirmed that this negative finding is robust to wash-trader contamination. PnL skill on Polymarket is not a usable per-market forecast-aggregation weight.
CME is therefore designed around the observation that market prices themselves are the most accurate per-event forecasts; the inefficiency lives at the multi-market structural level, where independently-priced markets fail to enforce coherence axioms. Edge Score V3b continues to operate in its working domains: per-wallet ranking on the Skill Leaderboard, the public wallet analyzer, the insider trading behavioral detection MVP, and the wash-trading detector cohort definition. It is intentionally kept outside the per-market forecasting layer.
Theoretical foundation and prior work
The mathematical machinery underlying CME has been studied for over twenty years. Hanson (1999, 2003) introduced the Logarithmic Market Scoring Rule (LMSR) as a combinatorial information market design; LMSR enforces coherence by construction when a single market maker quotes all related contracts. Pennock, Chen, Fortnow, Lambert, and Wortman (2004 onward) extended this to combinatorial markets with bounded loss. Kroer, Dudík, Lahaie, and Pennock (2016) provided integer-programming algorithms for arbitrage-free combinatorial market making (LCMM).
CME ports the LCMM coherence projection to an observation-side setting. Where LCMM enforces coherence at the time of price quotation by a single market maker, CME measures incoherence after the fact, on prices that emerged from independent CLOB trading on Polymarket. The same Bregman-projection machinery applies, with a different operational target: detection rather than construction.
The closest published peer to CME is Saguillo, Ghafouri, Kiffer, and Suarez-Tangil (2025), "Unravelling the Probabilistic Forest" (arXiv 2508.03474, IMDEA Networks, presented at AFT 2025). They analyzed 86 million Polymarket bid records and used large language models to extract logical relations between markets, then performed retrospective per-pair coherence checks. Their estimate of total realized arbitrage during the April 2024 to April 2025 window is approximately 39.6 million dollars. CME extends their work in three respects: real-time operation rather than retrospective analysis, n-market polytope projection rather than per-pair checks, and a productized signal layer with statistical significance gating, cost-adjusted expected value, and resolution risk scoring.
Adjacent work includes Sirolly, Ma, Kanoria, and Sethi (2025), "Spectral Detection of Wash Trading in Prediction Markets," whose detector Convexly has independently adapted to Polymarket position-tape data; Capponi, Gliozzo, and Zhu (2025), "Semantic Trading"; Tsang and Yang (2026), "The Anatomy of Polymarket"; and Ho, Budescu, and Himmelstein (2025) on coherence as a quality signal for human forecasters. The Csiszár 1975 axiom system on I-divergence projection geometry provides the theoretical justification for KL-divergence as the canonical objective; the Ben-Tal and Nemirovski 2009 robust optimization framework supports the bid-ask box robust projection upgrade planned for V0.2.
Limitations and the V0.2 plan
CME V0.1 is honest about what it does not yet do. Five gaps remain between V0.1 and a publication-grade or commercially viable product:
- V0.1 uses a quadratic-distance projection. The Csiszár-axiom-canonical objective is KL-divergence (I-projection), which V0.2 will adopt.
- V0.1 operates on the gamma API midpoint. Real bid-ask data from the Polymarket CLOB websocket is required to detect intra-market YES/NO asymmetry, which Saguillo et al. (2025) found to account for approximately 99.76% of realized Polymarket arbitrage. V0.2 integrates the CLOB websocket and ships a four-price detector alongside the inter-market projection.
- V0.1 estimates capacity from liquidity tiers. V0.2 will replace this with order-book depth at the signal price level, which is the actual capacity-constrained quantity.
- V0.1 does not include a backtest against historical Polymarket outcomes. The V0.1.5 framework provides the scaffold; V0.2 will run a pre-registered backtest (per the AsPredicted pattern of V2.8.2 and #287,983) once daily snapshot accumulation provides sufficient time series.
- V0.1 employs a heuristic for resolution risk. V0.2 will replace the keyword model with an LLM-graded resolution-criteria similarity check and ground-truth calibration against historical UMA dispute outcomes.
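The first gap is worth a concrete illustration. For the single MEE constraint, the I-projection (minimizing KL(q || p) subject to sum(q) = 1) has a closed form: exact renormalization, which preserves price ratios rather than shifting prices additively as the quadratic projection does. A sketch of that single-constraint case (the function name is illustrative; the general multi-constraint I-projection requires iterative methods):

```python
import numpy as np

def i_project_mee(prices):
    """I-projection (min KL(q || p)) of prices onto sum(q) = 1.
    The Lagrange condition gives q_i proportional to p_i, so the
    solution is renormalization: ratios between outcome prices
    are preserved, unlike the additive quadratic correction."""
    p = np.asarray(prices, float)
    return p / p.sum()

print(i_project_mee([0.41, 0.37, 0.27]))  # sums to 1, ratios preserved
```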
The full V0.2 specification is documented in docs/research/marketalpha/cme_v0_2_synthesis_spec.md in Convexly's research repository. The estimated build effort is approximately twelve weeks, with the CLOB integration as the largest single component.
Reproducibility
The CME V0.1 pipeline is fully reproducible from public input artifacts. Each pipeline run writes a hash-chain audit-log row carrying SHA-256 hashes of six input and output files, plus a previous-row hash linking it to the prior log entry. Any reader can re-derive the methodology output from the input hashes.
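The hash-chain mechanism can be sketched in a few lines. The field names below are illustrative, not CME's actual audit-log schema; the structural point is that each row's hash covers both its artifact hashes and the previous row's hash, so altering any row invalidates every subsequent link:

```python
import hashlib
import json

def append_row(log, artifact_hashes):
    """Append an audit-log row whose row_hash covers the artifact
    hashes and the previous row's hash, forming a tamper-evident chain."""
    prev = log[-1]["row_hash"] if log else "0" * 64
    body = {"artifacts": artifact_hashes, "prev": prev}
    row_hash = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest()
    log.append({**body, "row_hash": row_hash})

def verify_chain(log):
    """Recompute every row hash; any tampered row breaks the chain."""
    prev = "0" * 64
    for row in log:
        body = {"artifacts": row["artifacts"], "prev": row["prev"]}
        expected = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if row["prev"] != prev or row["row_hash"] != expected:
            return False
        prev = row["row_hash"]
    return True

log = []
append_row(log, ["a" * 64] * 6)   # six input/output artifact hashes
append_row(log, ["b" * 64] * 6)
print(verify_chain(log))          # True
log[0]["artifacts"][0] = "c" * 64
print(verify_chain(log))          # False: tampering breaks the chain
```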
Code is open at the Convexly research repository. The methodology is frozen at the commit referenced in each audit-log entry. The Polymarket gamma API snapshots used for the smoke tests are deterministic for a given fetch timestamp and market filter.
Convexly invites academic and institutional collaborators to engage on the methodology. The Pennock-Lahaie-Kroer axis at Columbia, Microsoft Research, and Google Research is the natural academic-validation chain; the IMDEA Networks group is the closest contemporary peer; the Sirolly et al. group at Columbia and Cornell is the natural wash-trading methodology partner.
AI tooling disclosure
This work used AI tools (Claude, GPT-4) as research aids during methodology design and pre-publication review. All claims, statistical results, and figures are reproducible from the public methodology code at the linked Convexly research repository and the frozen-commit hash recorded in each audit-log entry. No claim on this page is taken as true on the basis of an AI tool's output; every quantitative result is recomputable from the documented inputs with the pre-registered seed.
Citation: Convexly Research. (2026). Coherent Markets Engine V0.1: Constraint-Projection Structural Inference for Prediction Markets. Methodology version 0.1. convexly.app/research/coherent-markets-engine-v1. Drafted 2026-04-28.
Related research
- Edge Score Methodology V1 — per-wallet skill ranking foundation
- Edge Score V1-M (cross-venue extension) — Polymarket plus Manifold plus Kalshi
- MarketAlpha V2.8.2 — why skill-weighted forecast aggregation fails on Polymarket; the empirical justification for using market prices in CME instead
- Edge Score V1.5 — V3b reframed as a cross-sectional ranker, not a per-wallet temporal predictor
- Methodology disclosure — what each Convexly methodology paper has been tested on, what it has not, what is deferred