PnL vs Edge Score
Same 100-wallet cohort, two rankings. Left: top 20 by realized PnL (what Polymarket shows you). Right: top 20 by Edge Score V3b (what a frozen-coefficient skill measure shows). Where the columns disagree is where historical PnL should not be treated as a standalone skill proxy.
Top 20 by realized PnL
What the Polymarket leaderboard shows
rank . wallet . edge . pnl . edge rank
Top 20 by Edge Score V3b
Frozen-coefficient skill measure, fit on 8,656 wallets
rank . wallet . edge . pnl . pnl rank
What this artifact is saying
Across the full 8,656-wallet Polymarket cohort in our V1 methodology paper, the Spearman rank correlation between a wallet's Brier score (calibration) and its realized PnL is +0.148. That is a weak relationship. The top 1 percent of wallets by absolute PnL captured 36.2 percent of all signed profit on the platform, and most of them did not get there by being better calibrated than everyone else.
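For readers who want to reproduce a rank correlation like the one above on their own data, here is a minimal sketch in standard-library Python. The function names and the toy inputs are illustrative assumptions, not the code used in the methodology paper. Ties receive average ranks, which is the usual convention for Spearman's coefficient.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / (vx * vy) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-wallet inputs (made-up numbers, for illustration only).
brier_scores = [0.18, 0.22, 0.25, 0.21, 0.30]
realized_pnl = [1200.0, -300.0, 45000.0, 80.0, -950.0]
rho = spearman(brier_scores, realized_pnl)
```

A coefficient near zero means that knowing a wallet's rank on one measure says little about its rank on the other.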
Convexly's Edge Score V3b is a composite of three pillars (posture, conviction, and discipline) fitted against signed log PnL with out-of-fold cross-validation. Its out-of-fold Spearman correlation with signed log PnL is +0.514. A Fama-French 2010 bootstrap with 10,000 permutations places the observed correlation outside every permuted sample, at p < 0.0001. In this cohort it is a stronger cross-sectional behavioral-skill ranking than raw PnL.
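The logic of a permutation check like the one described above can be sketched as follows. This is a plain label-permutation test on the Spearman coefficient, offered as an illustration under stated assumptions; it is not necessarily the exact Fama-French-style procedure used in the paper, and the function names are hypothetical.

```python
import random
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; inputs must not be constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def permutation_p_value(
    x: Sequence[float],
    y: Sequence[float],
    n_perm: int = 10_000,
    seed: int = 0,
) -> tuple[float, float]:
    """Return (observed Spearman rho, two-sided permutation p-value)."""
    rng = random.Random(seed)
    rx, ry = average_ranks(x), average_ranks(y)
    observed = pearson(rx, ry)
    shuffled = list(ry)
    hits = 0
    for _ in range(n_perm):
        # Shuffling one side breaks any real pairing between x and y.
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= abs(observed) - 1e-12:
            hits += 1
    # The +1 terms count the observed sample itself, so p is never exactly 0.
    return observed, (hits + 1) / (n_perm + 1)
```

With 10,000 permutations and no permuted sample as extreme as the observed one, this estimator returns 1/10,001, which is just under 0.0001.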
The divergence you see above is the visible evidence for that. Wallets that appear on the left column but not the right are PnL winners whose historical record is not as strong on the Edge Score pillars. Wallets that appear on the right column but not the left are higher-Edge operators whose PnL is lower because their turnover or sizing is more disciplined than the PnL-leaderboard leaders.
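The left-only and right-only groups described above amount to a set difference between two top-k lists. A minimal sketch, assuming each wallet has one PnL value and one Edge Score; the wallet labels and numbers below are made up for illustration:

```python
def top_k(scores: dict[str, float], k: int) -> list[str]:
    """Wallets with the k highest scores, best first."""
    return sorted(scores, key=lambda w: scores[w], reverse=True)[:k]


def divergence(
    pnl: dict[str, float], edge: dict[str, float], k: int = 20
) -> dict[str, list[str]]:
    """Split the two top-k lists into PnL-only, Edge-only, and shared wallets."""
    by_pnl, by_edge = top_k(pnl, k), top_k(edge, k)
    pnl_set, edge_set = set(by_pnl), set(by_edge)
    return {
        "pnl_only": [w for w in by_pnl if w not in edge_set],
        "edge_only": [w for w in by_edge if w not in pnl_set],
        "both": [w for w in by_pnl if w in edge_set],
    }


# Hypothetical four-wallet cohort, compared at k=2 instead of k=20.
pnl = {"0xaaa": 100.0, "0xbbb": 50.0, "0xccc": 10.0, "0xddd": -5.0}
edge = {"0xaaa": 0.9, "0xbbb": 0.1, "0xccc": 0.2, "0xddd": 0.8}
result = divergence(pnl, edge, k=2)
```

In this toy cohort, 0xbbb is a PnL winner that misses the Edge top 2, while 0xddd ranks high on Edge despite a negative PnL.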
The V1-M extension paper published 2026-04-22 adds a 15,106-user Manifold cohort and a paired-window sweepcash sensitivity analysis. The original public bundle reported median concentration 8.9 percentage points lower in the sweepcash window on 1,647 paired users, with concentration delta defined on n=333. A 2026-05-04 recovered-cohort rerun excluding political and election markets reports -9.2pp, 95% CI [-12.8, -3.6], Wilcoxon p=0.0021 on n=515 of 1,208 paired users with defined concentration delta. The concentration shift survives the sensitivity check, but causal attribution to sweepcash alone remains limited by the catalog confound.
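To make the paired-window comparison concrete, the sketch below computes a median concentration delta over users for whom the delta is defined and attaches a percentile-bootstrap interval. The interval method, the function name, and the use of None for an undefined concentration are all assumptions made for illustration; the paper's own procedure, including its Wilcoxon test, may differ.

```python
import random
import statistics
from typing import Optional, Sequence


def paired_median_delta(
    baseline: Sequence[Optional[float]],
    treated: Sequence[Optional[float]],
    n_boot: int = 2000,
    seed: int = 0,
) -> tuple[float, float, float, int]:
    """Return (median delta, CI low, CI high, n users with a defined delta).

    Each position is one paired user. Users whose concentration is undefined
    in either window (None) are dropped before the delta is computed.
    """
    deltas = [
        t - b for b, t in zip(baseline, treated) if b is not None and t is not None
    ]
    if not deltas:
        raise ValueError("no paired users with a defined delta")
    rng = random.Random(seed)
    point = statistics.median(deltas)
    # Percentile bootstrap: resample users with replacement, recompute median.
    boots = sorted(
        statistics.median(rng.choices(deltas, k=len(deltas)))
        for _ in range(n_boot)
    )
    low = boots[int(0.025 * n_boot)]
    high = boots[int(0.975 * n_boot) - 1]
    return point, low, high, len(deltas)
```

A negative median delta with an interval that excludes zero is the pattern the rerun reports. It describes the size of the shift and says nothing about its cause.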
Score any Polymarket wallet against this cohort
Paste any 0x address into the analyzer and see its three-pillar Edge Score in under 30 seconds. Free, no signup, no wallet signature.
Methodology: Edge Score V3b (Convexly Research 2026), fit on 8,656 Polymarket wallets. Cohort refreshes daily at 06:00 UTC via a GitHub Actions cron. Use this as a comparison view, not a wallet recommendation list.