What is Edge Score?
A composite skill measure for prediction-market traders. Three pillars (calibration, conviction, discipline), one frozen coefficient set, fit on 8,656 real Polymarket wallets. Open methodology, public data, scorable on any wallet for free.
The one-paragraph version
Edge Score combines three z-scored predictors of trader behavior on Polymarket: posture (the standardized negation of baseline-adjusted Brier score), conviction (PnL concentration in the wallet's single largest event), and discipline (the log1p of resolved position count, with a negative sign in the composite). The coefficients were fit once by ordinary least squares against signed log realized PnL on a reference cohort of 8,656 Polymarket wallets sampled April 15-16, 2026; they are frozen and never refit at inference time. The raw composite is mapped to a 0-100 percentile rank against the training cohort distribution. Out-of-fold Spearman rank correlation with forward signed log PnL is +0.514, against +0.148 for a Brier-only baseline.
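As a rough illustration of the linear composite, here is a minimal Python sketch. It uses the three coefficients quoted in the pillar sections of this article (+0.79, +2.72, -1.15). The function name, the absence of an intercept, and the example inputs are assumptions made for illustration, not the published implementation.

```python
# Coefficients as quoted in this article for the three z-scored pillars.
COEF_POSTURE = 0.79
COEF_CONVICTION = 2.72
COEF_DISCIPLINE = -1.15


def raw_composite(z_posture: float, z_conviction: float, z_discipline: float) -> float:
    """Linear combination of already-standardized pillar values.

    Assumption: no intercept term is applied; a constant would not change
    the rank order of wallets anyway.
    """
    return (
        COEF_POSTURE * z_posture
        + COEF_CONVICTION * z_conviction
        + COEF_DISCIPLINE * z_discipline
    )


# Hypothetical wallet: average posture, one sd above the cohort on
# concentration, one sd below the cohort on log position count.
example = raw_composite(0.0, 1.0, -1.0)
```

Because the final score is a percentile rank, only the ordering of the raw composite matters, which is why the sketch can leave out any constant term.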
Why three pillars and not just one
If calibration were the dominant skill on prediction markets, a trader's Brier score alone would predict their profit rank cleanly. It does not. Across the full 8,656-wallet Polymarket cohort, the Spearman rank correlation between raw Brier score and realized PnL is only +0.148. Among the top 100 wallets by realized profit, the relationship actually flips: worse-calibrated wallets in that group earn more (Spearman +0.42 in the whale audit). The empirical story is that prediction-market profit is fat-tailed (Hill tail index = 1.28, well below the alpha = 2 threshold that finite variance, and hence well-behaved OLS, requires), and a few large concentrated positions dominate realized PnL.
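For readers unfamiliar with the Hill tail index, the sketch below shows the standard Hill estimator in Python. It is a generic textbook version, not the code behind the 1.28 figure. The choice of `k` and the synthetic Pareto sample are assumptions made for illustration.

```python
import math


def hill_tail_index(values, k):
    """Hill estimate of the tail index alpha from the k largest observations.

    values: positive magnitudes (e.g. absolute PnL); k: number of upper
    order statistics to use, with 1 <= k < number of positive values.
    """
    xs = sorted((v for v in values if v > 0), reverse=True)
    if not 1 <= k < len(xs):
        raise ValueError("k must satisfy 1 <= k < number of positive values")
    threshold = xs[k]  # the (k+1)-th largest value
    mean_log_excess = sum(math.log(x / threshold) for x in xs[:k]) / k
    return 1.0 / mean_log_excess


# Synthetic check: deterministic quantiles of a Pareto law with alpha = 1.28.
n, alpha = 10_000, 1.28
sample = [(1.0 - (i + 0.5) / n) ** (-1.0 / alpha) for i in range(n)]
estimate = hill_tail_index(sample, k=500)
```

On this synthetic sample the estimate lands close to the alpha used to generate it. On real PnL data the result is sensitive to `k`, so it is usually inspected across a range of values.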
Adding conviction (concentration) and discipline (position count) to the composite lifts the out-of-fold Spearman rank correlation with PnL to +0.514. The intuition: a trader who is well-calibrated but spreads tiny bets across many markets does not capture much of the available edge; a trader who concentrates the right way at the right times tends to outperform. Edge Score does not tell you HOW to find the right concentration; it tells you what the historical pattern of the profitable cohort looks like, and how a given wallet ranks against that pattern.
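The Spearman figures quoted above are rank correlations: the Pearson correlation computed on ranks rather than raw values. A minimal stdlib-only sketch follows; the toy scores and PnL values are hypothetical and the helper names are not from the methodology paper.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data: composite scores and signed log PnL for five wallets.
scores = [0.2, 1.5, -0.7, 3.1, 0.9]
pnl = [0.1, 2.0, -1.2, 1.4, 0.8]
rho = spearman(scores, pnl)
```

Working on ranks makes the statistic insensitive to the extreme PnL values that dominate a fat-tailed distribution, which is presumably why it is the headline metric here.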
The three pillars in plain English
Posture (calibration)
Posture is the standardized negation of baseline-adjusted Brier. Baseline-adjusted Brier is observed Brier minus the wallet's own marginal-frequency Brier (the trivial always-predict-the-base-rate baseline). The coefficient on this pillar is +0.79. Higher posture means worse calibration relative to the baseline-trivial alternative, which on this cohort empirically aligns with higher realized profit. The pillar does not measure forecasting accuracy in the traditional sense; it measures the sign-aligned contribution of calibration to PnL on this specific cohort. The renaming from "calibration" to "posture" in the V1 paper preserves the measured effect without overclaiming what the pillar tracks.
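The baseline adjustment described above can be sketched in a few lines of Python. This assumes each resolved position yields one probability forecast (for example, the entry price) and one 0/1 outcome, which is an assumption about data preparation and not something this article specifies. The negation and cohort standardization that turn this quantity into the posture pillar are left out, since the cohort mean and standard deviation are not given here.

```python
def brier(forecasts, outcomes):
    """Mean squared error between probability forecasts and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(outcomes)


def baseline_adjusted_brier(forecasts, outcomes):
    """Observed Brier minus the Brier of always forecasting the base rate."""
    base_rate = sum(outcomes) / len(outcomes)
    baseline = brier([base_rate] * len(outcomes), outcomes)
    return brier(forecasts, outcomes) - baseline


# Hypothetical wallet: four resolved positions, forecast taken as entry price.
forecasts = [0.9, 0.2, 0.8, 0.1]
outcomes = [1, 0, 1, 0]
adjusted = baseline_adjusted_brier(forecasts, outcomes)
```

A negative value means the wallet's forecasts beat its own always-predict-the-base-rate baseline; a positive value means they did worse.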
Conviction (concentration)
Conviction is the standardized share of total realized PnL attributable to the wallet's single largest event. The coefficient on this pillar is +2.72, the largest of the three. Higher conviction means a more barbell-concentrated profit profile: most of the wallet's return comes from one event. On the training cohort, the wallets that compound the most are not the ones that distribute risk evenly across the catalog; they are the ones that concentrate when the conviction trade appears.
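A minimal sketch of the unstandardized concentration share follows. It assumes that "largest event" means the event with the highest realized PnL and that the share is left undefined when total PnL is not positive. Both are illustrative choices; the methodology paper may handle these cases differently. The event keys are hypothetical.

```python
def largest_event_share(pnl_by_event):
    """Share of total realized PnL contributed by the single best event.

    Assumptions: "largest" means the event with the highest realized PnL,
    and the share is undefined (None) when total PnL is not positive.
    """
    total = sum(pnl_by_event.values())
    if total <= 0:
        return None
    return max(pnl_by_event.values()) / total


# Hypothetical wallet with one dominant winner.
wallet = {"event-a": 8000.0, "event-b": 1500.0, "event-c": 500.0}
share = largest_event_share(wallet)
```

Under this definition the share can exceed 1 when the wallet's other events lose money in aggregate, so it is worth checking how such wallets are treated before z-scoring.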
Discipline (position count)
Discipline is the standardized log1p of resolved position count, with a negative sign in the composite. The coefficient on this pillar is -1.15. Higher discipline (in the composite-contribution sense) corresponds to fewer resolved positions: the most profitable wallets on the training cohort hold fewer, larger positions. A trader who places hundreds of small bets across the catalog tends to score lower on Edge Score even with comparable calibration.
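To make the sign convention concrete, the sketch below computes the discipline pillar's contribution to the composite for two wallets. The -1.15 coefficient is the one quoted above. The cohort mean and standard deviation are placeholder values, because the frozen cohort statistics are not given in this article.

```python
import math

COEF_DISCIPLINE = -1.15  # coefficient quoted in this article

# Hypothetical cohort statistics for log1p(position count); the real
# values come from the frozen training cohort and are not given here.
COHORT_MEAN = 4.0
COHORT_SD = 1.5


def discipline_contribution(resolved_positions: int) -> float:
    """Composite contribution of the discipline pillar for one wallet."""
    z = (math.log1p(resolved_positions) - COHORT_MEAN) / COHORT_SD
    return COEF_DISCIPLINE * z


few = discipline_contribution(20)     # a wallet with 20 resolved positions
many = discipline_contribution(2000)  # a wallet with 2,000 resolved positions
```

With these placeholder statistics, the 20-position wallet receives a positive contribution and the 2,000-position wallet a negative one, matching the direction described above.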
What the score number actually means
Edge Score is on a 0-100 percentile scale by construction. A score of 50 is exactly the cohort median. A score of 90 means the wallet is in the top decile of the 8,656-wallet reference cohort by composite ranking. The current top of the daily Polymarket leaderboard sits around 95-100. A score below 30 means the composite places the wallet in the bottom 30% of the cohort by skill ranking, regardless of where its realized PnL lands.
Two important caveats. First, the percentile is computed against the frozen training cohort, not against any new wallet population the analyzer encounters. A new wallet that is materially different from the training cohort (e.g., a wallet that only bets on one category, or one that has very few resolved positions) is being scored by extrapolation. Second, the composite ranks wallets cross-sectionally; it does not bound expected returns for any individual wallet. Realized PnL on Polymarket is fat-tailed and individual outcomes vary widely.
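The mapping from raw composite to a 0-100 score against a frozen cohort can be sketched as a percentile lookup. The mid-rank tie rule and the five-wallet cohort below are assumptions chosen for illustration; the production analyzer may interpolate or break ties differently.

```python
from bisect import bisect_left, bisect_right


def percentile_score(raw, cohort_sorted):
    """Mid-rank percentile (0-100) of `raw` within a sorted reference cohort.

    Assumption: ties count as half below, so the cohort median maps to 50.
    """
    below = bisect_left(cohort_sorted, raw)
    equal = bisect_right(cohort_sorted, raw) - below
    return 100.0 * (below + 0.5 * equal) / len(cohort_sorted)


# Hypothetical five-wallet cohort of raw composites, sorted once and frozen.
cohort = sorted([-2.0, -1.0, 0.0, 1.0, 2.0])
median_score = percentile_score(0.0, cohort)
extreme_score = percentile_score(10.0, cohort)
```

A raw composite outside the cohort's range saturates at 0 or 100. This is one concrete form of the extrapolation caveat: the score cannot distinguish between wallets that fall beyond anything seen in training.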
What Edge Score does NOT do
Edge Score does not predict whether a particular wallet will profit on the next market. It does not separate skill from luck on a single wallet's realized PnL history (the per-wallet temporal holdout that would address this is pre-registered as the V1.5 follow-up at AsPredicted #287368). It does not transfer cleanly across venues: per the V1-M paper, the fitted coefficients diverge materially between Polymarket and Manifold, with the discipline pillar flipping sign at permutation p = 0.0001. And it does not substitute for category-specific calibration analysis, time-period analysis, or position-sizing diagnostics that depend on bankroll context.
Where the methodology lives
The V1 methodology paper (frozen coefficients, full validation suite, Fama-French bootstrap null at 10,000 permutations) is at /research/edge-score-methodology-v1. The V1-M cross-venue extension (15,106-user Manifold cohort, sweepcash within-user natural experiment) is at /research/edge-score-methodology-v1m. The V1.5 deferred experiments (per-wallet temporal holdout + per-quarter Information Coefficient stability) are pre-registered with AsPredicted before any analysis runs. The reproducibility data bundle, including 15,106 aggregated user records and a stdlib-only Python script that re-runs the analysis, is downloadable as a 1.2 MB tar.gz at /research/v1m/v1m-data-bundle.tar.gz.
Score a wallet
Paste any Polymarket wallet address into the analyzer. No signup, no signature, and no private key are required. The analyzer reads public on-chain data only.
Convexly publishes new methodology research roughly every 6-8 weeks, plus the /learn series on a rolling cadence.
Related explainers
- /learn/calibration: what a Brier score actually measures, baseline-adjusted Brier (skill-Brier), and why calibration alone barely predicts profit on Polymarket
- /learn/conviction: what concentration means, how to read a barbell PnL profile (coming soon)
- /learn/discipline: why position count predicts profit inversely on Polymarket and oppositely on Manifold (coming soon)
- /learn/kelly: fractional Kelly under fat tails, why full-Kelly is unsafe when alpha < 2 (coming soon)