Audit infrastructure

AsPredicted pre-registrations

For methodology changes that affect an external claim and were filed after 2026-04-25, Convexly pre-registers hypotheses on AsPredicted before any analysis runs. The verdict, run date, and effect-size confidence interval are reported within 24 hours of the test running. The original V1 and V1-M papers were not retroactively pre-registered; they remain frozen-coefficient methodologies whose ex-ante commitment is version-controlled through the SHA-256 audit chain. Failed pre-registrations land in the negative-result registry, and the audit chain is verifiable in your browser at /research/verify.

Machine-readable manifest at /research/preregistrations.json.
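As a rough illustration of how the manifest could be consumed, the sketch below tallies verdicts from an inline stand-in for /research/preregistrations.json. The IDs and verdicts are the eight entries listed on this page. The field names `aspredicted_id` and `verdict`, and the top-level list layout, are assumptions, since the real schema is not reproduced here.

```python
import json
from collections import Counter

# Inline stand-in for the manifest; the real file's schema may differ (assumed layout).
MANIFEST_JSON = """
[
  {"aspredicted_id": 287368, "verdict": "FAILED"},
  {"aspredicted_id": 287436, "verdict": "FAILED"},
  {"aspredicted_id": 287442, "verdict": "FAILED"},
  {"aspredicted_id": 287714, "verdict": "FAILED"},
  {"aspredicted_id": 287983, "verdict": "PASSED"},
  {"aspredicted_id": 288046, "verdict": "PENDING"},
  {"aspredicted_id": 288610, "verdict": "PENDING"},
  {"aspredicted_id": 288615, "verdict": "PENDING"}
]
"""

# The three verdict values named in the filing policy.
ALLOWED_VERDICTS = {"PASSED", "FAILED", "PENDING"}


def tally_verdicts(entries):
    """Count entries per verdict, rejecting any value outside the allowed set."""
    counts = Counter()
    for entry in entries:
        verdict = entry["verdict"]
        if verdict not in ALLOWED_VERDICTS:
            raise ValueError(f"unexpected verdict: {verdict!r}")
        counts[verdict] += 1
    return counts


entries = json.loads(MANIFEST_JSON)
counts = tally_verdicts(entries)
print(dict(counts))  # {'FAILED': 4, 'PASSED': 1, 'PENDING': 3}
```

A consumer of the live manifest would replace the inline string with the fetched file and adjust the field names to whatever the published schema uses.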

Last updated 2026-05-01T18:00:00Z · 8 entries

AsPredicted #287368

Filed 2026-04-25 · Ran 2026-04-27

V1.5 follow-up experiments E2 + E7

Failed (rejected)

E2 per-wallet temporal holdout: ρ = +0.111 [+0.046, +0.175], well below the +0.30 pre-reg threshold. E7 per-quarter IC stability: median ρ = +0.038, only 3 of 5 quarters positive vs ≥5/6 required. Both failed.

Audit-chain anchor: v1_5_analyses_results_20260427_191717

AsPredicted #287436

Filed 2026-04-26 · Ran 2026-04-26

MarketAlpha V2 in-sample skill-weighted aggregation tests

Failed (rejected)

Initial in-sample test of skill-weighted aggregation as a per-market price prior. 24 aggregator variants were tested; all were rejected. A cohort-substitution amendment was filed as #287714.

Audit-chain anchor: marketalpha_v2_in_sample_run

AsPredicted #287442

Filed 2026-04-26 · Ran 2026-04-26

MarketAlpha V2 forward-only skill-weighted aggregation tests

Failed (rejected)

Forward-only complement to #287436. The same 24 aggregator variants were tested on a held-out forward window. All variants were rejected on the forward window.

Audit-chain anchor: marketalpha_v2_forward_run

AsPredicted #287714

Filed 2026-04-27 · Ran 2026-04-27

MarketAlpha V2 cohort-substitution amendment

Failed (rejected)

Cohort substitution from V1 (8,656 wallets) to V1-M (8,778 wallets) to verify the negative result is not cohort-specific. All 24 aggregator variants rejected on V1-M as well; consistent with the original finding.

Audit-chain anchor: marketalpha_v2_cohort_substitution

AsPredicted #287983

Filed 2026-04-28 · Ran 2026-04-29

V2.8.2 wash-filter TOST equivalence test on V1-M Polymarket cohort (Sirolly-adapted)

Passed

Wash-filter robustness check on the V2.8.2 negative result. Brier delta CI [+0.16028, +0.19287] sits inside the pre-registered TOST equivalence range [+0.154, +0.204]. Movement after wash filtering: +0.00243 Brier (1.4% relative). The V2.8.2 finding (skill-weighted aggregation rejected) is robust to wash-trader filtering at composite-z ≥ 3.0.
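The pass condition described above reduces to an interval-containment check: the confidence interval must lie entirely inside the pre-registered equivalence range. A minimal sketch using the figures from this entry follows. The function name is hypothetical, and the CI level is not stated here, so the check takes the interval as given (containment of a 1 − 2α interval corresponds to a TOST at level α).

```python
def ci_within_bounds(ci, bounds):
    """Return True if the CI lies strictly inside the equivalence bounds."""
    ci_lo, ci_hi = ci
    lower, upper = bounds
    return lower < ci_lo and ci_hi < upper


brier_delta_ci = (0.16028, 0.19287)   # reported Brier delta CI
equivalence_range = (0.154, 0.204)    # pre-registered TOST range

print(ci_within_bounds(brier_delta_ci, equivalence_range))  # True

# Distance from each CI endpoint to the nearest equivalence bound.
print(f"{brier_delta_ci[0] - equivalence_range[0]:.5f}")  # 0.00628
print(f"{equivalence_range[1] - brier_delta_ci[1]:.5f}")  # 0.01113
```

The lower margin is the tighter of the two, so a shift of the interval downward by more than about 0.006 Brier would have failed the equivalence test.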

Audit-chain anchor: v28_2_wash_filter_tost_passed_2026_04_29

AsPredicted #288046

Filed 2026-04-29 · Runs 2026-07-29

CME V0.2 backtest: 90-day walk-forward on Polymarket constraint-projection signals

Pending

Pre-registers a 90-day walk-forward backtest of the CME V0.2 constraint-projection pipeline. Hyperparameters are frozen ex-ante (thresholds, sizing, cost model, performance metrics). The pre-registration allows no hyperparameter tuning based on backtest results.

Audit-chain anchor: cme_v0_2_0_methodology_frozen_commit_8616a63

AsPredicted #288610

Filed 2026-05-01

V2-Perps Edge Score: skill ranking with CRPS + funding-capture pillars

Pending

Pre-registers the form (4 pillars: CRPS-posture, conviction, discipline, funding-capture) and 7 validation gates for the V2-Perps Edge Score composite. The form is locked at freeze commit 8c86dd4; coefficients are TBD pending the Hyperliquid 90-day cohort fit. The composite reduces to V1 / V3b on binary outcomes (Brier-equivalence identity) and extends across crypto perps, equity perps, compute futures, AI benchmark markets, valuation futures, and prediction markets per spec Section 6.

Audit-chain anchor: v2_perps_methodology_frozen_commit_8c86dd4

AsPredicted #288615

Filed 2026-05-01 · Runs 2026-07-30

CME V0.2-Perps: 90d walk-forward on Hyperliquid coherence-violation signals

Pending

90-day walk-forward backtest of the V0.2-Perps coherence-violation engine (7 constraints: cash-and-carry, triangle, put-call parity, Carr-Madan butterfly, Litterman-Scheinkman PCA calendar, cross-venue 4-corner, vertical-spread monotonicity). Hypotheses: H1, net Sharpe > 1.0; H2, capacity ceiling < 50K USD/day; H3, each constraint contributes positive Sharpe with a 95% bootstrap CI excluding zero. Methodology code freezes at commit adb99d6; the emit cron is defined in .github/workflows/cme-v0-2-perps-emit.yml.
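To make the H3 gate concrete, here is a minimal sketch of a percentile bootstrap CI for a per-period Sharpe ratio, with a check that the interval sits strictly above zero. It is not the pre-registered procedure. The i.i.d. resampling of periods, the unannualized Sharpe, the replicate count, and all function names are assumptions, and the returns are synthetic.

```python
import random
import statistics


def sharpe(returns):
    """Per-period Sharpe ratio (mean / sample std); 0.0 if the sample has no spread."""
    sd = statistics.stdev(returns)
    return statistics.mean(returns) / sd if sd > 0 else 0.0


def bootstrap_sharpe_ci(returns, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI, resampling periods with replacement (assumed i.i.d.)."""
    rng = random.Random(seed)
    n = len(returns)
    draws = sorted(sharpe(rng.choices(returns, k=n)) for _ in range(n_boot))
    lo = draws[int(n_boot * alpha / 2)]
    hi = draws[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi


def passes_h3(ci):
    """H3-style gate: positive contribution, i.e. the whole CI lies above zero."""
    return ci[0] > 0


# Synthetic 90-day return series for one hypothetical constraint.
data_rng = random.Random(1)
daily_returns = [data_rng.gauss(0.002, 0.01) for _ in range(90)]

ci = bootstrap_sharpe_ci(daily_returns)
print(ci, passes_h3(ci))
```

If returns are autocorrelated, a block bootstrap would be the more defensible choice. The i.i.d. version is used here only to keep the sketch short.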

Audit-chain anchor: cme_v0_2_perps_methodology_frozen_commit_adb99d6

Filing policy

  • Filing rule: Every methodology change that affects an external claim is pre-registered on AsPredicted before any analysis runs. The pre-registration appears in this manifest within 24 hours of filing.
  • Verdict update rule: When a pre-registered test runs, the verdict, run_at_utc, and verdict_summary fields are updated within 24 hours of the run completing. Verdicts are PASSED, FAILED, or PENDING. Failed pre-registrations are added to the negative-result registry at /research/negative-results.
  • Supersession rule: When a pre-registration is superseded by an amendment, the original entry is kept (verdict noted as superseded) and the amendment is added as a separate entry. Original entries are never removed.
  • Audit-chain link: Every entry's audit_chain_anchor field references the SHA-256-hash-chained run identifier in apps/web/public/research/cme/audit_log.jsonl (or in a paper-specific provenance log). The /research/verify page walks the chain in client-side JavaScript and renders a green stamp if every prev_hash matches its parent's row_hash.
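The chain walk described in the audit-chain bullet can be sketched as follows. This is an illustration in Python, not the code behind /research/verify. The field names prev_hash and row_hash come from the policy above, but the `payload` field, the all-zero genesis value, the run identifiers, and the way a row is serialized before hashing are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed prev_hash for the first row in the log


def compute_row_hash(prev_hash, payload):
    """SHA-256 over prev_hash plus canonical JSON of the payload (assumed scheme)."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_row(chain, payload):
    """Append a row whose prev_hash points at the previous row's row_hash."""
    prev = chain[-1]["row_hash"] if chain else GENESIS
    chain.append({
        "prev_hash": prev,
        "payload": payload,
        "row_hash": compute_row_hash(prev, payload),
    })


def verify_chain(chain):
    """Walk the chain: each prev_hash must match its parent and each row must rehash."""
    expected_prev = GENESIS
    for row in chain:
        if row["prev_hash"] != expected_prev:
            return False
        if row["row_hash"] != compute_row_hash(row["prev_hash"], row["payload"]):
            return False
        expected_prev = row["row_hash"]
    return True


chain = []
append_row(chain, {"run_id": "example_run_a", "verdict": "FAILED"})
append_row(chain, {"run_id": "example_run_b", "verdict": "PASSED"})
print(verify_chain(chain))  # True

# Editing an earlier row after the fact breaks verification.
chain[0]["payload"]["verdict"] = "PASSED"
print(verify_chain(chain))  # False
```

Because each row's hash covers the previous row's hash, changing any historical entry invalidates that row and every row after it, which is what lets a reader confirm that a verdict was not rewritten after the run.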