Research note · scan date 2026-06-09
We ran our skill read on every wallet in our own published top-50 cohort. 35 of 50 had enough resolved positions to test. The number that cleared the corrected statistical bar: zero.
Convexly publishes a top-50 Polymarket cohort ranked by a behavioral composite, Edge Score (the leaderboard page renders the cohort's positive-PnL subset). We turned our own skill test on that published cohort: for each wallet with at least 30 resolved positions, we computed the realized entry edge, meaning whether entries resolved true more often than the price paid implied, with a bootstrap 95% interval, plus a concentration screen.
Results: 19 of 35 read clean as not separable from chance, with an interval that includes zero and no concentration flag. Another 15 are too concentrated to read as a clean edge, with one event carrying most of the net result; most of those intervals include zero too. Exactly 1 record's interval clears zero on the positive side, which is roughly what chance alone predicts: testing 35 records at a one-sided 2.5% threshold is expected to produce about 0.9 false positives. That record does not survive a multiple-comparisons correction, and the same wallet is net-negative on PnL. One wallet whose very large public PnL is often cited as proof of skill shows an edge of +0.4 percentage points (pp), 95% interval [-16.0, +15.6], across 38 resolved positions.
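The false-positive arithmetic above can be checked in a few lines. This is a minimal sketch: the variable names are ours, and the second figure assumes the 35 tests are independent, which the note does not claim.

```python
# Expected false positives if every tested wallet truly has zero edge.
n_tests = 35             # wallets with at least 30 resolved positions
alpha_one_sided = 0.025  # positive tail of a 95% two-sided interval

expected_false_positives = n_tests * alpha_one_sided

# Assumption (ours, not the note's): the 35 tests are independent.
# Under that assumption, this is the chance that at least one interval
# clears zero on the positive side purely by luck.
p_at_least_one = 1 - (1 - alpha_one_sided) ** n_tests

print(f"expected false positives: {expected_false_positives:.3f}")
print(f"P(at least one by chance): {p_at_least_one:.3f}")
```

Under independence, at least one positive-side "hit" turns up by chance in roughly 59% of cohorts this size, so a single uncorrected hit is unremarkable.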
The honest takeaway is NOT "these wallets have no skill." It is that the resolved records, at these sample sizes, mostly cannot distinguish skill from chance either way, and a PnL screenshot tells you even less.
Per-wallet table
Labels are on-chain Polymarket usernames or addresses only. The one positive-interval row is annotated in place: it is a single uncorrected test selected out of 35 and is statistically consistent with chance at the cohort level; it is not in the FDR-corrected cleared set; and its net PnL is negative. Dollar PnL figures are withheld pending a cross-pipeline reconciliation; the sign column reflects the published cohort artifact. Diagnostics on public on-chain records, not investment advice; a past read is not a forecast.
| Read | Label | Edge Score | n resolved | Realized edge (pp) | 95% BCa CI | Concentration | Net PnL sign |
|---|---|---|---|---|---|---|---|
| interval clears zero on the positive side (uncorrected) | ShouShouKKos 0xc2fb28 (uncorrected single test of 35, about 0.9 false positives expected by chance; NOT in the FDR-corrected cleared set; net PnL negative; not a skill verdict) | 12.7 | 137 | +5.6 | [+1.2, +9.5] | n/a | negative |
| not separable from chance | Zarvantis 0x30c92f | 77.1 | 47 | -2.7 | [-8.9, +3.2] | n/a | positive |
| not separable from chance | beachboy4 0xc2e780 | 76.3 | 38 | +0.4 | [-16.0, +15.6] | n/a | positive |
| not separable from chance | Pesdruzi 0x15707f | 59.9 | 34 | +0.4 | [-14.7, +16.2] | 0.58 | positive |
| not separable from chance | dasasd 0x6a7d89 | 49.8 | 32 | -2.5 | [-16.4, +10.8] | n/a | negative |
| not separable from chance | sUSDe 0x657a26 | 49.0 | 30 | +11.2 | [-2.7, +25.2] | 0.45 | positive |
| not separable from chance | anon_1d8a377c 0x1d8a37 | 47.3 | 43 | +0.5 | [-11.8, +12.4] | n/a | negative |
| not separable from chance | Ieirigk733 0xaaaf7f | 47.1 | 32 | +12.2 | [-0.7, +23.7] | n/a | negative |
| not separable from chance | YoonSukYeol 0x00dcd7 | 46.5 | 41 | +2.6 | [-9.6, +14.5] | 0.52 | positive |
| not separable from chance | Vaedrix 0x5afa18 | 45.4 | 49 | +0.9 | [-7.2, +8.4] | n/a | negative |
| not separable from chance | Drakonis 0xffa434 | 44.6 | 47 | +0.1 | [-6.3, +6.2] | n/a | negative |
| not separable from chance | Myndor 0xaef998 | 44.6 | 37 | -0.5 | [-9.9, +7.5] | n/a | negative |
| not separable from chance | JohnAdams54321 0xb94b4d | 41.9 | 45 | -4.4 | [-16.5, +5.6] | n/a | negative |
| not separable from chance | Michelangelo02 0x1d70c3 | 38.8 | 33 | +2.2 | [-6.6, +12.1] | n/a | negative |
| not separable from chance | Caelthar 0xb0f651 | 37.3 | 35 | -1.4 | [-11.2, +9.0] | 0.36 | positive |
| not separable from chance | 0x4128Be7113DBca57 0x4128be | 26.5 | 79 | +0.5 | [-6.8, +8.3] | 0.40 | positive |
| not separable from chance | nbafan88 0xe3d4ed | 24.6 | 84 | -0.0 | [-8.9, +9.1] | 0.30 | positive |
| not separable from chance | PinkPunks- 0x39a5f8 | 14.9 | 124 | -3.3 | [-8.8, +2.8] | 0.33 | positive |
| not separable from chance | Mongabc123 0x1c853e | 6.6 | 184 | +2.7 | [-2.4, +7.8] | 0.25 | positive |
| not separable from chance | nutrichicha 0xa02e4a | 6.0 | 111 | -0.5 | [-4.3, +3.1] | 0.30 | positive |
| too concentrated to read | Vaelric 0x0d83ca | 95.8 | 35 | +1.9 | [-7.8, +11.9] | 2.70 | positive |
| too concentrated to read | Evandor-363 0xc13903 | 95.3 | 38 | +1.5 | [-5.7, +8.5] | 0.95 | positive |
| too concentrated to read | bioo 0xdd5453 | 86.5 | 69 | -1.9 | [-12.4, +8.6] | 1.66 | positive |
| too concentrated to read | zzh991217 0x43751b | 80.7 | 30 | -1.9 | [-13.5, +10.0] | 1.50 | positive |
| too concentrated to read | c66 0xcdfe1e | 71.7 | 31 | +2.6 | [-14.1, +19.1] | 0.81 | positive |
| too concentrated to read | Zorin-472 0x2cc540 | 69.2 | 36 | -2.1 | [-9.2, +5.5] | 10.97 | positive |
| too concentrated to read | coconuttree-130 0xbf49d1 | 57.7 | 61 | +0.0 | [-11.2, +11.2] | 0.85 | positive |
| too concentrated to read | PiderPen 0xeca133 | 57.4 | 37 | +1.7 | [-11.7, +14.9] | 0.73 | positive |
| too concentrated to read | Stefanlecca 0xff2b87 | 54.7 | 37 | +2.4 | [-8.1, +12.8] | 0.86 | positive |
| too concentrated to read | DenzelMorgan 0xd721ee | 54.5 | 70 | +10.2 | [-1.5, +21.2] | 0.62 | positive |
| too concentrated to read | lejcles 0xfc4c47 | 47.7 | 49 | +1.1 | [-6.2, +8.1] | 0.85 | positive |
| too concentrated to read | ligmaaaaa 0x682e38 | 32.8 | 87 | -7.4 | [-14.9, -0.2] | 0.88 | negative |
| too concentrated to read | JHHK 0xe0a38c | 31.5 | 134 | -0.2 | [-5.9, +5.4] | 0.80 | positive |
| too concentrated to read | yuo45 0xf23c45 | 29.5 | 215 | -2.6 | [-7.3, +2.2] | 0.89 | positive |
| too concentrated to read | Green 0x220abc | 18.1 | 72 | +1.2 | [-6.5, +8.9] | 2.47 | positive |
| too few resolved positions | gggggggggggggggggg 0xff7a60 | 99.5 | <30 (npos=23) | - | - | - | positive |
| too few resolved positions | Joycejoshua 0xd68cd8 | 98.4 | <30 (npos=20) | - | - | - | positive |
| too few resolved positions | mytaXBT 0x54d085 | 96.0 | <30 (npos=21) | - | - | - | positive |
| too few resolved positions | Sytherin 0xf386fb | 95.1 | <30 (npos=22) | - | - | - | positive |
| too few resolved positions | Strenik 0x8988e1 | 81.6 | <30 (npos=26) | - | - | - | positive |
| too few resolved positions | jjwin 0x988463 | 81.4 | <30 (npos=25) | - | - | - | positive |
| too few resolved positions | tibimarket 0x0886ee | 76.1 | <30 (npos=24) | - | - | - | positive |
| too few resolved positions | Xerria 0xd1800c | 71.3 | <30 (npos=26) | - | - | - | positive |
| too few resolved positions | Jexley 0x482002 | 62.0 | <30 (npos=23) | - | - | - | positive |
| too few resolved positions | rizzdaddy 0x39feac | 61.9 | <30 (npos=28) | - | - | - | positive |
| too few resolved positions | Trevio 0x0e9be6 | 61.1 | <30 (npos=24) | - | - | - | negative |
| too few resolved positions | Velenza 0xb71e86 | 60.5 | <30 (npos=26) | - | - | - | positive |
| too few resolved positions | Lomyn 0xeb7681 | 58.2 | <30 (npos=26) | - | - | - | positive |
| too few resolved positions | Uryxen 0xc18e5f | 56.7 | <30 (npos=28) | - | - | - | negative |
| too few resolved positions | SirSats 0x68a9f0 | 55.2 | <30 (npos=27) | - | - | - | positive |
Reads: 1 interval-clears-zero on the positive side (uncorrected), 19 not separable from chance, 15 too concentrated to read, 15 with fewer than 30 resolved positions. Zero of the 50 are in the FDR-corrected cleared set.
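The Read column follows the rules described under Method, and they can be sketched as a small decision function. This is a hedged illustration: the function name, argument shapes, and order of checks are our assumptions, inferred from the table rather than taken from the analyzer's code. The table has no read for an unconcentrated record whose interval sits entirely below zero, so the label used for that branch is hypothetical.

```python
from typing import Optional, Tuple

MIN_RESOLVED = 30           # readability floor
CONCENTRATION_CUTOFF = 0.6  # at or above this, too concentrated to read


def classify_read(n_resolved: int,
                  ci_pp: Optional[Tuple[float, float]],
                  concentration: Optional[float]) -> str:
    """Return the Read label for one wallet.

    concentration is None when the ratio is non-computable (net PnL at or
    below zero); such a record is classified by its interval alone.
    """
    if n_resolved < MIN_RESOLVED or ci_pp is None:
        return "too few resolved positions"
    # The concentration screen is checked before the interval, which matches
    # the table: concentrated rows keep that read whatever their interval.
    if concentration is not None and concentration >= CONCENTRATION_CUTOFF:
        return "too concentrated to read"
    lower, upper = ci_pp
    if lower > 0:
        return "interval clears zero on the positive side (uncorrected)"
    if upper < 0:
        # Hypothetical label: no row in the table lands in this branch.
        return "interval clears zero on the negative side (uncorrected)"
    return "not separable from chance"


# Rows taken from the table above.
print(classify_read(137, (1.2, 9.5), None))     # ShouShouKKos
print(classify_read(34, (-14.7, 16.2), 0.58))   # Pesdruzi
print(classify_read(87, (-14.9, -0.2), 0.88))   # ligmaaaaa
print(classify_read(23, None, None))            # below the floor
```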
Method
- Universe: the published top-50 Edge Score cohort artifact (wallets with at least 20 resolved positions, ranked by the descriptive Edge Score composite; refreshed daily). The rendered leaderboard page shows the cohort's positive-PnL subset.
- Realized entry edge per wallet: outcome minus the volume-weighted entry price for each resolved unique position, averaged with equal weight across positions; 95% interval via BCa bootstrap. Readability floor: 30 resolved positions (35 of 50 qualify; the other 15 are reported as insufficient, not tested).
- Concentration screen: the single biggest winning event's PnL divided by net PnL; at or above 0.6 the record is reported as too concentrated to read. The ratio is not bounded by 1 and exceeds it when one event is larger than the net result. A non-computable ratio (analyzer-side net PnL at or below zero) cannot trigger the concentrated read; the record is then classified by its interval alone. This applies to the one positive-interval row: its net-negative PnL makes the ratio non-computable, so that record was not concentration-screened. The sign column comes from the cohort artifact, which can disagree with the analyzer-side figure.
- Multiple comparisons: 35 one-sided tests at 2.5% imply about 0.875 expected false positives under a null of zero skill. The cohort-level corrected screen (Benjamini-Hochberg, q=0.10, published separately) cleared 178 of 3,871 wallets platform-wide; none of the 50 in this cohort are among them.
- Dollar PnL figures are withheld: the cohort artifact and the live analyzer disagree on realized PnL for some wallets, and we do not publish numbers our own pipelines dispute. The reconciliation is tracked internally; the sign column is the cohort artifact's.
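For readers unfamiliar with the Benjamini-Hochberg step-up procedure named in the multiple-comparisons bullet, here is a minimal sketch. The function name and the p-values are hypothetical; the published screen ran across 3,871 wallets platform-wide, and its inputs are not reproduced here.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], q: float = 0.10) -> List[bool]:
    """Return, per input p-value, whether it clears the BH screen at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= (k / m) * q.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_rank = rank
    # Every hypothesis ranked at or below that k is cleared.
    cleared = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            cleared[idx] = True
    return cleared


# Hypothetical illustration: one nominally "significant" test among 35.
# The rank-1 threshold is 0.10 / 35, about 0.0029, so p = 0.02 does not clear.
hypothetical_p = [0.02] + [0.5] * 34
print(sum(benjamini_hochberg(hypothetical_p)))
```

The example shows why one interval that clears zero at the uncorrected 2.5% level can still fall outside the corrected set: with many tests, the smallest p-value has to be much smaller than the nominal threshold.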
Diagnostics on public on-chain records, not investment advice. A past read is not a forecast, and no corrected evidence of skill is not evidence of no skill: several intervals are wide and include large positive values, so the record is underpowered, not settled. If you think the math is wrong, the interval is the thing to argue with; we publish corrections when receipts say so.
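For anyone who wants to argue with the interval, this is a minimal standard-library sketch of a realized entry edge with a BCa bootstrap interval of the kind described under Method. The function names, the (price, size) fill format, the 5,000 resamples, the seeds, and the synthetic positions are all our assumptions for illustration; the production analyzer may differ in detail.

```python
import random
from statistics import NormalDist
from typing import List, Sequence, Tuple

Fill = Tuple[float, float]  # (entry price in [0, 1], size), hypothetical format


def vwap(fills: Sequence[Fill]) -> float:
    """Volume-weighted entry price for one position."""
    return sum(p * s for p, s in fills) / sum(s for _, s in fills)


def realized_edges_pp(positions: Sequence[Tuple[int, Sequence[Fill]]]) -> List[float]:
    """Per-position edge in percentage points: outcome (0 or 1) minus VWAP entry."""
    return [100.0 * (outcome - vwap(fills)) for outcome, fills in positions]


def bca_interval(values: Sequence[float], n_boot: int = 5000,
                 level: float = 0.95, seed: int = 0):
    """Equal-weighted mean edge with a BCa bootstrap interval."""
    rng = random.Random(seed)
    n = len(values)
    theta_hat = sum(values) / n
    boots = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_boot))

    # Bias correction z0: share of bootstrap means below the point estimate,
    # clamped away from 0 and 1 so the inverse CDF stays finite.
    frac = sum(b < theta_hat for b in boots) / n_boot
    frac = min(max(frac, 1 / (n_boot + 1)), n_boot / (n_boot + 1))
    nd = NormalDist()
    z0 = nd.inv_cdf(frac)

    # Acceleration a from the jackknife (leave-one-out) means.
    total = sum(values)
    jack = [(total - v) / (n - 1) for v in values]
    jbar = sum(jack) / n
    denom = 6.0 * sum((jbar - j) ** 2 for j in jack) ** 1.5
    a = sum((jbar - j) ** 3 for j in jack) / denom if denom > 0 else 0.0

    def adjusted_quantile(z: float) -> float:
        return nd.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))

    def pick(q: float) -> float:
        return boots[min(max(int(q * n_boot), 0), n_boot - 1)]

    alpha = (1 - level) / 2
    lower = pick(adjusted_quantile(nd.inv_cdf(alpha)))
    upper = pick(adjusted_quantile(nd.inv_cdf(1 - alpha)))
    return theta_hat, (lower, upper)


# Synthetic positions (not real wallet data): each outcome is drawn with
# probability equal to the base entry price, so the true edge is near zero.
rng = random.Random(42)
positions = []
for _ in range(40):
    base = rng.uniform(0.10, 0.90)
    fills = [(base, 100.0), (min(base + 0.02, 0.99), 50.0)]
    positions.append((1 if rng.random() < base else 0, fills))

edge, (lo, hi) = bca_interval(realized_edges_pp(positions))
print(f"edge {edge:+.1f} pp, 95% BCa interval [{lo:+.1f}, {hi:+.1f}]")
```

With only a few dozen binary outcomes per wallet, intervals from a routine like this tend to span many percentage points, which is the underpowered regime the note describes.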