Methodology change log

A chronological record of methodology version changes across the Edge Score, Coherent Markets Engine, and Market Trust families: freezes, releases, pre-registration filings, negative results, and deviation notes. Each entry cites the repository file or document that backs it; pinned hashes are reproduced verbatim from the freeze modules and verdict artifacts.

Reading note: entries are descriptive history, not performance claims. Market Trust remains a canary_preview surface: its verdict cutpoints are operator-set, and a prospective calibration test (AsPredicted #295174) is filed and maturing on resolved outcomes. The methodology disclosure covers what each method is and is not tested on.

2026-06-11 | Market Trust | Release | instrument fitness v0.1

Instrument fitness diagnostics block shipped (experimental, opt-in)

An opt-in component table on the v0.2 card and packet APIs (include=fitness), plus a collapsed packet-page section. The components are depth durability, spread and price impact, time to resolution, resolution reliability, concentration exposure, and a structural basis-risk note. Each component names the observable on the stored card it derives from and reports an honest insufficient_data state when the card lacks that input; time to resolution is insufficient_data on every current card because the v0.2 schema stores no end date. There is deliberately no combined number; the not-advice sentence, experimental label, and methodology version are fixed payload fields. Default API responses are unchanged.
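The per-component contract described above (name the observable, fall back to insufficient_data when the card lacks it) can be sketched as follows. This is an illustrative sketch, not the code in market-trust-instrument-fitness.ts: the `FitnessComponent` type, the `build_component` helper, the "measured" state name, and the card field names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class FitnessComponent:
    name: str                # component label, e.g. "time_to_resolution"
    observable: str          # stored-card field the component derives from (hypothetical names)
    state: str               # "measured" (hypothetical) or "insufficient_data"
    value: Optional[float]   # None whenever the state is insufficient_data


def build_component(card: Mapping[str, object], name: str, observable: str) -> FitnessComponent:
    """Report the observable if the card stores it, otherwise insufficient_data."""
    raw = card.get(observable)
    if raw is None:
        return FitnessComponent(name, observable, "insufficient_data", None)
    return FitnessComponent(name, observable, "measured", float(raw))


# A card with a spread field but no end date (field names are assumptions).
card = {"spread_bps": 12}
spread = build_component(card, "spread_and_price_impact", "spread_bps")
ttr = build_component(card, "time_to_resolution", "end_date_days")
```

Note that the sketch returns one record per component and never folds them into a single score, mirroring the "no combined number" decision.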

Market Trust

Repository sources (3)
  • apps/web/src/lib/market-trust-instrument-fitness.ts
  • apps/web/src/app/api/market-trust/v0.2/[conditionId]/route.ts (include=fitness)
  • docs/strategy/2026-06-11-prediction-markets-insurance-angle.md (BUILD item, founder-facing memo)

2026-06-07 | Market Trust | Deviation note | v1 calibration freeze

Deviation note 1 for #295174: UMA terminal-dispute classification

The resolution-quality classifier was corrected so a UMA dispute counts as a bad outcome only when it is the terminal state; a disputed-then-re-resolved-clean market is clean. Recorded as a dated deviation note rather than a re-filing: zero resolution rows existed when the correction shipped, so the change preceded any forward evidence. The freeze hash is unchanged because it pins the frozen formula and test design, not the classifier implementation.
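The corrected rule (a dispute is bad only when it is the terminal state) can be illustrated with a small sketch. It assumes a market's resolution history is available as a chronological list of state labels; the function name and the labels "proposed", "disputed", and "resolved" are hypothetical and are not taken from classifyPolymarketResolution.

```python
from typing import Sequence


def classify_resolution(states: Sequence[str]) -> str:
    """Classify a chronological list of resolution states (labels are assumptions).

    Only the final state matters: a dispute that was later re-resolved is clean.
    """
    if not states:
        return "unresolved"
    return "bad" if states[-1] == "disputed" else "clean"


disputed_then_clean = classify_resolution(["proposed", "disputed", "resolved"])
terminal_dispute = classify_resolution(["proposed", "disputed"])
```

Under the pre-correction behavior described above, the first history would have counted as bad because a dispute appears anywhere in it; under the corrected rule only the second does.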

Repository sources (2)
  • docs/research/preregistrations/2026-06-07-mt-v1-295174-uma-amendment.md
  • scripts/ops/lib/market-trust-calibration-shared.mjs (classifyPolymarketResolution)

2026-06-07 | Edge Score | Deviation note | FDR forward-persistence

Deviation note 1 for #294147: SELL-trade win/loss correctness fix

A correctness bug in the shared position aggregation resolved SELL fills against the raw outcome index while keying positions on the normalized index, inverting win/loss for SELL-derived positions. Fixed in the data-generation path with a forward-archive reset; recorded as a dated deviation note because the forward collector imports that aggregation. The fix landed before forward evidence accumulated under the buggy rule.

Repository sources (1)
  • docs/research/preregistrations/2026-06-07-wallet-skill-294147-sell-pnl-deviation.md

2026-06-06 | Market Trust | Freeze | v1 calibration freeze

Market Trust v1 calibration freeze filed (AsPredicted #295174)

Pillar weights, verdict cutpoints (80/65/45), evidence-confidence caps, the unknown-flow ladder, and the use gates were frozen as-is (operator-set, deliberately not tuned before filing). The single confirmatory primary outcome is resolution quality; the forward window starts 2026-06-07 and the verdict matures on resolved outcomes alone. A pre-registered null keeps Market Trust in canary_preview and is a publishable outcome. The freeze hash is pinned by an agreement test that fails CI if the frozen object drifts.

freeze hash: e9c91babf8c2f103c4f512436e3b0b3c7a29d9c827fd5f6898e7173bcc840d34
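A minimal sketch of how an agreement test can pin a frozen object by hash. The canonicalization used here (sorted-key compact JSON, then SHA-256) is an assumption, so it is not expected to reproduce the pinned value above; the function names are hypothetical, and the frozen object is an illustrative subset containing only the cutpoints stated in this entry.

```python
import hashlib
import json


def freeze_hash(frozen_object: dict) -> str:
    """SHA-256 over a canonical JSON encoding (canonical form is an assumption)."""
    canonical = json.dumps(frozen_object, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def assert_freeze_unchanged(frozen_object: dict, pinned_hash: str) -> None:
    """Fail loudly if the frozen object no longer matches the pinned hash."""
    actual = freeze_hash(frozen_object)
    if actual != pinned_hash:
        raise AssertionError(f"frozen object drifted: {actual} != {pinned_hash}")


# Illustrative subset; a real test would hard-code the pinned literal at filing time.
frozen_formula = {"verdict_cutpoints": [80, 65, 45]}
pinned = freeze_hash(frozen_formula)

assert_freeze_unchanged(frozen_formula, pinned)   # passes: nothing drifted
drifted = {"verdict_cutpoints": [80, 65, 40]}     # any edit changes the hash
```

Because the hash covers only the frozen object, a change outside it (such as the classifier fix in the 2026-06-07 deviation note) leaves the pinned value untouched.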

Pre-registration: AsPredicted.org #295174 (private filing; receipt held internally)

Repository sources (3)
  • apps/web/src/lib/market-trust-v1-calibration-freeze.ts (MARKET_TRUST_V1_FROZEN_FORMULA, MARKET_TRUST_V1_FILING)
  • apps/web/src/lib/market-trust-v1-calibration-freeze.test.ts (pinned freeze hash)
  • docs/research/preregistrations/2026-06-06-market-trust-v1-calibration-resolution-quality-aspredicted.md

2026-06-06 | Market Trust | Reweighting | V0.2 pillar reweighting

Market Trust v0.2 pillar merge: liquidity and depth become executability

The liquidity and depth pillars merged into a single executability pillar (weight 20); the freed weight moved to orderbook-independent pillars (coherence 20 to 25, resolution_reliability to 20) so the weights sum to exactly 100. The weights remain operator-set candidates, not calibrated, pending the outcome-ledger calibration track.

Repository sources (2)
  • apps/web/src/lib/market-trust-v02.ts (MARKET_TRUST_V02_PILLAR_WEIGHTS note)
  • apps/web/src/app/api/market-trust/v0.2/[conditionId]/verify/route.test.ts (Defect B comment, coherence 20 -> 25)

2026-06-01 | Edge Score | Pre-registration filed | FDR forward-persistence

Wallet-skill FDR forward-persistence test filed (AsPredicted #294147)

Strictly prospective 90-day test (window 2026-06-02 through 2026-08-30) of whether FDR-cleared wallet-skill candidates persist out of sample. The frozen candidate registry and methodology hash are pinned in code; the daily verdict reports an honest insufficient_sample until at least 40 candidate wallets clear the 10-position floor and at least 1,000 qualifying forward positions resolve.
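The sample gate described above can be sketched as a small function. It assumes "qualifying forward positions" means resolved forward positions held by wallets that cleared the 10-position floor; that reading, the function name, and the "ready" label are assumptions, not taken from forward_config.py.

```python
from typing import Mapping

MIN_POSITIONS_PER_WALLET = 10     # per-wallet floor, from the text above
MIN_CANDIDATE_WALLETS = 40        # wallets that must clear the floor
MIN_QUALIFYING_POSITIONS = 1000   # resolved forward positions required


def forward_verdict_status(resolved_by_wallet: Mapping[str, int]) -> str:
    """Return "insufficient_sample" until both sample thresholds are met."""
    cleared = {w: n for w, n in resolved_by_wallet.items() if n >= MIN_POSITIONS_PER_WALLET}
    if len(cleared) < MIN_CANDIDATE_WALLETS:
        return "insufficient_sample"
    if sum(cleared.values()) < MIN_QUALIFYING_POSITIONS:
        return "insufficient_sample"
    return "ready"  # hypothetical label for "the test may now be evaluated"


# 40 wallets with 25 resolved positions each: both thresholds are met exactly.
enough = forward_verdict_status({f"w{i}": 25 for i in range(40)})
# 40 wallets at the floor give only 400 positions, so the gate stays closed.
too_few_positions = forward_verdict_status({f"w{i}": 10 for i in range(40)})
```

Both conditions must hold at once, so a handful of very active wallets cannot open the gate on their own.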

frozen registry sha256: 5478f7c47b8296b295be9c2b85215ce4f517d68c2b1778f8bea3a65137ec20a9

methodology hash: 39f07abfcabb84e914231200f7cf1f9c33a843cbef585c2faf1369d20bdb4f69

Pre-registration: AsPredicted.org #294147 (private filing; receipt held internally)

Repository sources (3)
  • services/api/scripts/skilled_wallet_fdr/forward_config.py (FILING_TIMESTAMP_UTC, FROZEN_REGISTRY_SHA256)
  • docs/research/preregistrations/2026-05-31-wallet-skill-fdr-forward-only-draft.md (status: FILED 2026-06-01)
  • docs/ops/evidence/wallet-skill-fdr-forward-verdict/latest.json (methodology_hash)

2026-05-31 | Coherent Markets Engine | Pre-registration filed | V0.1.6 forward

CME realized-vs-control forward-only validation filed (AsPredicted #294035)

Strictly prospective matched-control test of realized CME signal performance against controls, maturing on data and time alone with no human in the loop. The methodology hash is pinned; the daily verdict artifact reports an honest insufficient_sample until at least 30 K=20 pairs resolve. A null is a publishable pre-registered outcome.

methodology hash (git-sha-stripped): 1c6e3ddc95988c9eef368201c0d19404e1e5347a06df23ae402ca0f97935cc92

Pre-registration: AsPredicted.org #294035 (private filing; receipt held internally)

Repository sources (3)
  • docs/research/preregistrations/2026-05-10-cme-realized-vs-control-forward-only-draft.md (FILING RECORD 2026-05-31)
  • .github/workflows/cme-matched-control-verdict.yml
  • docs/ops/evidence/cme-matched-control-verdict/latest.json (methodology_hash)

2026-05-08 | Market Trust | Release | V0.2

Market Trust v0.2: evidence-status-gated cards under canary_preview

Versioned card substrate (migration 083, dated 2026-05-07) with SHA-256 row-hash receipts and previous_row_hash chaining; the rating predicate gates a use verdict on measured-pillar count and a pending-pillar ceiling. Every card carries contract_status canary_preview, fixed not_claiming fields, and the not-calibrated caveats for operator-set cutpoints.
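Row-hash receipts with previous_row_hash chaining can be illustrated with a generic hash chain. This is a sketch of the general technique, assuming each row hash covers the card payload plus the previous row's hash over a sorted-key JSON encoding; the actual column set and encoding in migration 083 are not stated here, and the payload fields are made up.

```python
import hashlib
import json
from typing import List, Optional


def row_hash(payload: dict, previous_row_hash: Optional[str]) -> str:
    """SHA-256 over the payload and the previous row's hash (encoding is an assumption)."""
    body = json.dumps(
        {"payload": payload, "previous_row_hash": previous_row_hash},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_row(chain: List[dict], payload: dict) -> dict:
    """Append a new card version that points at the previous row's hash."""
    prev = chain[-1]["row_hash"] if chain else None
    row = {"payload": payload, "previous_row_hash": prev, "row_hash": row_hash(payload, prev)}
    chain.append(row)
    return row


def verify_chain(chain: List[dict]) -> bool:
    """Recompute every hash and check each row links to its predecessor."""
    prev = None
    for row in chain:
        if row["previous_row_hash"] != prev:
            return False
        if row["row_hash"] != row_hash(row["payload"], prev):
            return False
        prev = row["row_hash"]
    return True


chain: List[dict] = []
append_row(chain, {"card_version": 1, "contract_status": "canary_preview"})
append_row(chain, {"card_version": 2, "contract_status": "canary_preview"})
intact = verify_chain(chain)
```

Editing any earlier payload after the fact changes its recomputed hash, so `verify_chain` fails at that row unless every later row is rewritten as well.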

Market Trust | Market Trust card research page

Repository sources (3)
  • supabase/migrations/083_market_trust_v02_cards.sql
  • apps/web/src/lib/market-trust-v02.ts (MARKET_TRUST_V02_CONTRACT_STATUS, MARKET_TRUST_V02_NOT_CALIBRATED_CAVEATS)
  • supabase/migrations/085_methodology_versions.sql (market_trust V0.2 seed row)

2026-04-30 | Market Trust | Release | V0.1

Market Trust v0.1 scorecards shipped (experimental)

Daily artifact-first scorecards with four ratings (use, use_with_caveats, discount, do_not_cite) and operator-set cutpoints at 80/65/45. Explicitly experimental; superseded by v0.2 once the evidence-status-gated rating predicate landed.
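One plausible reading of how three cutpoints yield four ratings is sketched below. It assumes a 0 to 100 score, inclusive lower bounds, and the rating order listed above (80 and up is use, 65 and up is use_with_caveats, 45 and up is discount, anything lower is do_not_cite); none of those details are confirmed by this entry, and the function name is hypothetical.

```python
# Operator-set cutpoints from the text; the pairing with ratings is an assumption.
CUTPOINTS = ((80, "use"), (65, "use_with_caveats"), (45, "discount"))


def rating_for(score: float) -> str:
    """Map a score to a rating using inclusive lower bounds (assumed)."""
    for floor, rating in CUTPOINTS:
        if score >= floor:
            return rating
    return "do_not_cite"


examples = {s: rating_for(s) for s in (92, 80, 79.9, 65, 45, 44)}
```

The v0.2 rating predicate additionally gates the verdict on measured-pillar count and a pending-pillar ceiling, which this score-only sketch does not model.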

Market Trust card research page

Repository sources (2)
  • supabase/migrations/085_methodology_versions.sql (market_trust V0.1 seed row)
  • apps/web/scripts/build-market-trust-cards.mjs

2026-04-29 | Coherent Markets Engine | Freeze | V0.1

Coherent Markets Engine V0.1 frozen

Constraint-projection detector for structural inefficiencies across linked Polymarket events, operating on market prices only (no skill weighting). Frozen configuration committed in the repo. The same day, the V0.2 walk-forward backtest was pre-registered on AsPredicted before running on historical data. Track record accrues on resolutions.

Pre-registration: AsPredicted.org #288046 (V0.2 backtest)

CME V1 research page

Repository sources (3)
  • supabase/migrations/085_methodology_versions.sql (cme_signals V0.1 seed row)
  • services/api/scripts/cme_v0_1/frozen_config.py
  • docs/research/preregistrations/2026-04-29-cme-v0-2-backtest-aspredicted.md

2026-04-25 | Edge Score | Negative result | V2.8.2

V2.8.2 skill-weighted aggregation sweep: no aggregator beat the baselines

A 24-aggregator sweep on the V1-M cohort tested whether skill-weighted aggregation improves per-market price priors. The primary test did not reject the null, and none of the 24 aggregators beat the market-implied baseline; the primary aggregator underperformed both the equal-weighted and market-implied baselines. Published 2026-04-27 as a negative result that closed skill-weighted aggregation as a price prior.

Model-selection writeup

Repository sources (2)
  • supabase/migrations/085_methodology_versions.sql (edge_score V2.8.2 seed row)
  • apps/web/src/app/research/v3b-over-v1-model-selection-under-bias/page.tsx

2026-04-25 | Edge Score | Negative result | V1.5

V1.5 pre-registered follow-up: both primary tests failed; published openly

Two ex-ante primary tests (E2 per-wallet temporal holdout, E7 per-quarter IC stability) were filed on AsPredicted before analysis and both failed at their ex-ante thresholds. Published 2026-04-27 as a negative result: the composite is a cross-sectional ranker, not a per-wallet temporal predictor.

Pre-registration: AsPredicted.org #287368

V1.5 paper | Negative results

Repository sources (2)
  • docs/research/preregistrations/2026-04-25-v1-5-experiments-aspredicted.md
  • supabase/migrations/085_methodology_versions.sql (edge_score V1.5 seed row)

2026-04-22 | Edge Score | Release | V1-M

Edge Score V1-M cross-venue extension shipped

V1-M extends the V1 frozen-coefficient ranker across Polymarket, Kalshi, and Manifold using the same composite formula. Active as the production wallet ranking, published with a public data bundle.

V1-M methodology paper

Repository sources (2)
  • supabase/migrations/085_methodology_versions.sql (edge_score V1-M seed row)
  • apps/web/src/app/research/edge-score-methodology-v1m/page.tsx

2026-04-16 | Edge Score | Freeze | V1

Edge Score V1 frozen

Frozen-coefficient three-pillar composite anchored on 8,656 Polymarket wallets with at least 5 resolved positions. Coefficients were committed to git before the validation script ran; V1 is internally ex-ante, not externally pre-registered. Paper published 2026-04-18.

V1 methodology paper

Repository sources (2)
  • supabase/migrations/085_methodology_versions.sql (edge_score V1 seed row)
  • apps/web/src/app/research/edge-score-methodology-v1/page.tsx

Boundary of this page

This log records when methods were frozen, filed, shipped, or corrected. It is market diligence infrastructure, not investment advice, not a trading instruction, and not proof that any score or market probability is correct.