Pre-distributed via X · Wallet Skill Brief #1 · 5 min read

Most “whales” are not skilled

Convexly’s first weekly read on prediction-market wallets, signals, and methodology.

Headline finding · 2026-05-03 snapshot

Across the Top-50 Polymarket cohort, the rank-correlation between Edge Score and realized PnL is Spearman ρ = −0.32 (p = 0.026, n = 50).

A negative correlation. In this slice, a higher Edge Score is associated with lower realized PnL, not higher. The dollar leaderboard and the skill leaderboard disagree about which wallets look skilled, and the disagreement is systematic rather than a small amount of noise.
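For readers who want to see how a rank correlation like this is computed, here is a minimal sketch in plain Python. It is not Convexly's reproduce.py; the function names are ours, and the input is only the ten wallets shown in this issue's two top-5 tables. Because those ten are the extremes of both rankings, the value it prints is far more negative than the cohort-wide ρ = −0.32 and should be read as a demonstration of the mechanics only.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ascending ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)

# (edge_score, realized_pnl_usd) for the ten wallets listed in this issue.
# This is a hand-picked subset, not the full Top-50 cohort.
wallets = [
    (99.5, 5_183), (99.4, 5_788), (99.4, 6_024), (99.3, 9_530), (99.1, 11_070),
    (94.4, 1_532_710), (59.2, 132_265), (60.8, 124_549), (96.5, 82_905), (55.7, 78_351),
]

rho = spearman_rho([edge for edge, _ in wallets], [pnl for _, pnl in wallets])
print(f"Spearman rho on the ten listed wallets: {rho:.2f}")
```

Running the same function over all 50 wallets in the snapshot is what would be needed to check the headline number; the p-value requires an additional step (a permutation test or a t-approximation) that this sketch leaves out.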

This is the first issue of a recurring weekly post. Every Sunday we’ll share what changed in the Top-50 leaderboard, surface one wallet whose PnL overstates skill, summarize any CME signals that resolved that week, and explain one methodology choice in plain language.

The Brief is built off the same public artifacts the rest of Convexly operates on: the edge-score-top100.json snapshot, the CME signal ledger, and the public claim registry at /evidence. Every number has a receipt.

What Edge Score V3b says today

The Top-50 leaderboard at /research/wallet-rankings ranks 50 Polymarket wallets with at least 20 resolved positions, by the Edge Score V3b composite (calibration / conviction / discipline). The 2026-05-03 snapshot had a Polygon block anchor at height 86,337,463.

The current top 5:

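| Rank | Address | Edge Score | Realized PnL | n positions |
|---|---|---|---|---|
| 1 | 0x0e9be66f…ce4e | 99.5 | $5,183 | 22 |
| 2 | 0xf386fb70…f22c | 99.4 | $5,788 | 22 |
| 3 | 0x48200256…1c86 | 99.4 | $6,024 | 23 |
| 4 | 0x8988e1d0…5dd5 | 99.3 | $9,530 | 26 |
| 5 | 0xd68cd8fe…f3e9 | 99.1 | $11,070 | 20 |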

The numbers above are headline-friendly but they are not the interesting finding. The interesting finding is what happens if you re-rank by raw PnL.

The PnL-rank vs Edge-rank gap

If you sort the same Top-50 cohort by realized PnL alone, the order shuffles substantially. The top 5 by raw PnL:

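| PnL rank | Address | Realized PnL | Edge Score | Edge rank |
|---|---|---|---|---|
| 1 | 0xc2e7800b…be51 | $1,532,710 | 94.4 | 36 |
| 2 | 0x2cc5404d…91d7 | $132,265 | 59.2 | 44 |
| 3 | 0xffa43496…1f7f | $124,549 | 60.8 | 42 |
| 4 | 0xbf49d1ad…5608 | $82,905 | 96.5 | 20 |
| 5 | 0x682e3886…70bd | $78,351 | 55.7 | 45 |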

Three of the top-five PnL wallets in the cohort rank in the bottom quartile by Edge Score. Their dollar outcomes are real; their prospective skill signal is not particularly strong.
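The bottom-quartile count can be checked directly from the table. The sketch below assumes "bottom quartile" means an Edge rank worse than 75% of the cohort size (ranks 38 to 50 when n = 50); that cutoff is our reading, not a definition taken from the methodology paper.

```python
COHORT_SIZE = 50

# (address, realized_pnl_usd, edge_rank) for the top five wallets by PnL,
# copied from the table above. Addresses are truncated as published.
top_pnl = [
    ("0xc2e7800b…be51", 1_532_710, 36),
    ("0x2cc5404d…91d7", 132_265, 44),
    ("0xffa43496…1f7f", 124_549, 42),
    ("0xbf49d1ad…5608", 82_905, 20),
    ("0x682e3886…70bd", 78_351, 45),
]

def in_bottom_quartile(rank, cohort_size):
    """True if a 1-based rank falls in the worst 25% of the cohort (assumed cutoff)."""
    return rank > 0.75 * cohort_size

for pnl_rank, (address, pnl, edge_rank) in enumerate(top_pnl, start=1):
    gap = edge_rank - pnl_rank  # positive gap: PnL rank flatters the wallet
    flag = "bottom quartile" if in_bottom_quartile(edge_rank, COHORT_SIZE) else ""
    print(f"{address}  PnL rank {pnl_rank}  Edge rank {edge_rank}  gap {gap:+d}  {flag}")

bottom_quartile_count = sum(
    in_bottom_quartile(edge_rank, COHORT_SIZE) for _, _, edge_rank in top_pnl
)
print(f"{bottom_quartile_count} of {len(top_pnl)} are in the bottom quartile by Edge Score")
```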

False whale of the week

0x682e3886…70bd made $78,351 in realized PnL on Polymarket. Edge Score V3b puts it 45th out of 50 in the cohort, near the bottom of the calibration-weighted skill ranking we surface publicly.

The disconnect is in the framing. A $78K realized return over a typical Polymarket position cycle (months) is a great-looking absolute number. The Convexly composite reads the same wallet’s calibration, conviction, and discipline percentiles and finds the return more consistent with high-conviction bets that happened to work than with repeatable skill. The cohort median PnL is $13,357, so $78K is real outperformance, but Edge Score ranks the wallet near the bottom of the cohort because its underlying component percentiles sit near the bottom too.

If you want to copy a wallet’s behavior, copy from the Edge Score top — not the PnL top. Most “whales” are not skilled in the calibration-and-conviction sense; some are lucky in the small sample of resolved positions they have.

The negative-correlation finding from the lede is the cohort-wide version of this: the false whale isn’t an outlier, it’s a pattern. Edge Score is picking up a different signal than the dollar column. The ρ = −0.32 result is descriptive and comes from a single snapshot: it shows that the disagreement is real, but it is not a forward-validation claim. The forward-validation work lands in Brief #2 once enough daily snapshots accumulate.

The Edge Score V1-M paper at /research/edge-score-methodology-v1m documents the construction, and the public reproduce.py script lets anyone re-run the ranking on the underlying dataset.

CME signals this week

The Coherent Markets Engine (CME) emitted signals through the daily 06:30 UTC pipeline this week as usual. The realized track record at /research/cme/track-record shows the cumulative count of emitted signals; the per-signal table is gated to the Researcher tier so paying customers retain exclusive access to the methodology disclosure.

We do not yet have enough resolved signals to publish a hit-rate point estimate this week. The first batch of resolutions is expected in late July 2026 once the V0.2 backtest target window closes, per the methodology referenced on the track-record page.

When the resolved-signal count clears 10, we’ll publish a rolling realized scorecard with confidence intervals. Until then, the value of the signal feed is the methodology + audit chain, not the realized numbers.
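To see why a hit rate on a handful of resolved signals says little, here is a sketch of one common interval for a binomial proportion, the Wilson score interval. The choice of Wilson is our assumption for illustration; the post does not say which interval the scorecard will use, and the hit counts below are hypothetical.

```python
from math import sqrt

def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a hit rate; z=1.96 gives roughly 95% coverage."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width

# Hypothetical counts: the same 70% observed hit rate at two sample sizes.
for hits, n in [(7, 10), (70, 100)]:
    low, high = wilson_interval(hits, n)
    print(f"{hits}/{n} hits: 95% interval {low:.2f} to {high:.2f}")
```

At ten resolved signals the interval spans roughly half of the unit range, which is why a point estimate at that sample size would be more noise than evidence.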

One methodology note: why we don’t rank by PnL

Edge Score V3b is a frozen-coefficient composite of three pillars (calibration / conviction / discipline). Realized PnL is not a pillar. There are two reasons:

  1. Sample size. A wallet with 22 resolved positions and a $33K realized PnL has returns noisy enough that a single high-stakes correct call can dominate the dollar number. Edge Score weights calibration across all positions, so it is less dominated by any single bet.
  2. Forward predictability. A wallet’s PnL rank in cohort N is a weak predictor of its PnL rank in cohort N+1. A wallet’s calibration percentile is a stronger predictor: the V1-M paper documents out-of-fold Spearman +0.514 on the V1 cohort versus +0.147 for the Brier-only baseline (permutation p < 0.0001, n = 8,656).
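The sample-size point can be made concrete with a quick concentration check. The 22 per-position PnL figures below are invented for illustration (they sum to $33K to match the example in point 1), and `largest_position_share` is our own helper, not part of the Edge Score.

```python
def largest_position_share(position_pnls):
    """Fraction of total realized PnL contributed by the single best position."""
    total = sum(position_pnls)
    if total <= 0:
        raise ValueError("share is only meaningful for a positive total PnL")
    return max(position_pnls) / total

# Hypothetical wallet: 21 small outcomes plus one large winning call (USD).
positions = [
    400, -250, 300, -150, 500, -300, 200, 350, -200, 450, -100,
    250, 300, -350, 150, 400, -50, 200, 300, -100, 700,
    30_000,
]

share = largest_position_share(positions)
pnl_without_best = sum(positions) - max(positions)
print(f"positions: {len(positions)}, total PnL: ${sum(positions):,}")
print(f"share from best single position: {share:.0%}")
print(f"PnL without that position: ${pnl_without_best:,}")
```

In this made-up case one call accounts for about nine tenths of the dollar total, so the PnL rank mostly reflects a single outcome rather than 22 of them.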

Edge Score is what we’d want to copy. Realized PnL is what we’d want to follow once Edge Score has already made the pick. The leaderboard ranks by the first; the per-wallet detail surfaces both so you can compare.

What’s coming next

  • A daily wallet-rank persistence metric: how often a wallet in the top decile this week is in the top decile next week. The Convexly hypothesis is that calibration-weighted ranks move less week-over-week than PnL ranks. We will publish the side-by-side test once enough daily snapshots accumulate (starting in two weeks).
  • An alert when one of the wallets in your saved watchlist crosses a meaningful threshold (a rank move of 5 positions or more, or an Edge Score delta of 10 percentile points or more). The email channel arrives this week; Telegram and Discord land after.
  • A “high-Edge wallet entered this market” signal that fires when one of the top-decile wallets opens a new position above a notional threshold. Gated behind upcoming Ledger 2 work.
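The first two items are simple enough to sketch. The code below is an illustration under stated assumptions, not the shipped implementation: rankings are best-first lists of wallet identifiers, the top decile is the first 10% of the list, and the alert fires when either threshold from the list above is met. All function names and the sample data are hypothetical.

```python
def top_decile(ranking):
    """Set of wallets in the top 10% of a best-first ranking (at least one wallet)."""
    cutoff = max(1, len(ranking) // 10)
    return set(ranking[:cutoff])

def top_decile_persistence(this_week, next_week):
    """Share of this week's top-decile wallets still in the top decile next week."""
    current = top_decile(this_week)
    return len(current & top_decile(next_week)) / len(current)

def should_alert(old_rank, new_rank, old_pct, new_pct, rank_move=5, pct_move=10.0):
    """True if the rank moved by >= rank_move or the percentile by >= pct_move."""
    return abs(new_rank - old_rank) >= rank_move or abs(new_pct - old_pct) >= pct_move

# Hypothetical 20-wallet ranking; next week the 2nd and 5th wallets swap places.
this_week = [f"w{i:02d}" for i in range(1, 21)]
next_week = list(this_week)
next_week[1], next_week[4] = next_week[4], next_week[1]

print("top-decile persistence:", top_decile_persistence(this_week, next_week))
print("alert on rank 12 -> 17:", should_alert(12, 17, 60.0, 58.0))
print("alert on rank 12 -> 15:", should_alert(12, 15, 60.0, 55.0))
```

Computing the same persistence figure once for Edge-ranked lists and once for PnL-ranked lists gives the side-by-side comparison the hypothesis calls for.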

Next steps if you’re new here

The Brief is published Sundays. Reply to research@convexly.app with what would be useful in next week’s issue.