Research note · scan date 2026-06-09

We ran our skill read on every wallet in our own published top-50 cohort. Of the 50, 35 had enough resolved positions to test. The number that cleared the corrected statistical bar: zero.

Convexly publishes a top-50 Polymarket cohort ranked by a behavioral composite, Edge Score (the leaderboard page renders the cohort's positive-PnL subset). We turned our own skill test on that published cohort: for each wallet with at least 30 resolved positions, we computed the realized entry edge, meaning whether entries resolved true more often than the price paid implied, with a bootstrap 95% interval, plus a concentration screen.
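As a rough illustration of the computation just described, the sketch below takes each position's outcome minus its volume-weighted entry price, averages those edges with equal weight, and puts a BCa bootstrap interval around the mean. It is not the published pipeline: the data layout, the 0/1 outcome coding, the function names, the replicate count, and the seed are all assumptions made for the example.

```python
import random
from statistics import NormalDist


def position_edge(fills, outcome):
    """Edge of one resolved position (hypothetical layout).

    fills   -- list of (price, size) entry fills, price in [0, 1]
    outcome -- assumed coding: 1.0 if the position resolved true, else 0.0
    """
    total_size = sum(size for _, size in fills)
    vwap = sum(price * size for price, size in fills) / total_size
    return outcome - vwap


def realized_edge_pp(edges):
    """Equal-weighted mean edge across positions, in probability points."""
    return 100.0 * sum(edges) / len(edges)


def bca_interval_pp(edges, n_boot=5000, alpha=0.05, seed=0):
    """BCa bootstrap interval for the mean edge, in probability points."""
    rng = random.Random(seed)
    n = len(edges)
    theta_hat = sum(edges) / n
    boots = sorted(sum(rng.choices(edges, k=n)) / n for _ in range(n_boot))

    norm = NormalDist()
    # Bias correction: share of replicates below the point estimate,
    # clamped away from 0 and 1 so the inverse CDF stays finite.
    share_below = sum(b < theta_hat for b in boots) / n_boot
    share_below = min(max(share_below, 1.0 / n_boot), 1.0 - 1.0 / n_boot)
    z0 = norm.inv_cdf(share_below)

    # Acceleration from leave-one-out (jackknife) means.
    total = sum(edges)
    jack = [(total - e) / (n - 1) for e in edges]
    jack_mean = sum(jack) / n
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den > 0 else 0.0

    def adjusted_quantile(level):
        z = norm.inv_cdf(level)
        adj = norm.cdf(z0 + (z0 + z) / (1.0 - accel * (z0 + z)))
        idx = min(max(int(adj * n_boot), 0), n_boot - 1)
        return 100.0 * boots[idx]

    return adjusted_quantile(alpha / 2), adjusted_quantile(1 - alpha / 2)


# Made-up edges for one wallet with 36 resolved positions.
example_edges = [0.55, -0.45, 0.30, -0.70, 0.10, -0.20] * 6
example_point = realized_edge_pp(example_edges)
example_low, example_high = bca_interval_pp(example_edges, n_boot=2000, seed=1)
```

If `example_low` and `example_high` straddle zero, that record would read as not separable from chance under the rules in this note.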

Results: 19 of 35 read clean as not separable from chance, with an interval that includes zero and no concentration flag. Another 15 are too concentrated to read as a clean edge, with one event carrying most of the net result; 14 of those 15 intervals also include zero, and the remaining one lies entirely below zero. Exactly 1 record's interval clears zero on the positive side, which is roughly what chance alone predicts: 35 tests at a one-sided 2.5% threshold are expected to produce about 0.9 false positives. That record does not survive a multiple-comparisons correction, and the same wallet is net-negative on PnL. A wallet with a very large public P&L that people cite as proof of skill shows an edge of +0.4 probability points, 95% interval [-16.0, +15.6], across 38 resolved positions.
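The false-positive arithmetic can be checked in a few lines. The expected count comes straight from the figures above; the chance of at least one false positive additionally assumes the 35 tests are independent, which is an assumption of this sketch and not a claim made in the note.

```python
n_tests = 35              # wallets with at least 30 resolved positions
alpha_one_sided = 0.025   # upper tail of a 95% two-sided interval

# Expected number of intervals clearing zero on the positive side
# if every wallet had zero true edge.
expected_false_positives = n_tests * alpha_one_sided

# Probability of seeing at least one such interval, ASSUMING independence.
p_at_least_one = 1 - (1 - alpha_one_sided) ** n_tests

print(f"expected false positives: {expected_false_positives:.3f}")
print(f"P(at least one, if independent): {p_at_least_one:.3f}")
```

Under those assumptions, a single positive-side interval among 35 is more likely than not to appear by chance alone.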

The honest takeaway is NOT "these wallets have no skill." It is that the resolved records, at these sample sizes, mostly cannot distinguish skill from chance either way, and a P&L screenshot tells you even less.

Per-wallet table

Labels are on-chain Polymarket usernames or addresses only. The one positive-interval row is annotated in place: it is an uncorrected single test selected out of 35 and is statistically consistent with chance at the cohort level; it is not in the FDR-corrected (false discovery rate) cleared set; and its net PnL is negative. Dollar PnL figures are withheld pending a cross-pipeline reconciliation; the sign column reflects the published cohort artifact. Diagnostics on public on-chain records, not investment advice; a past read is not a forecast.

| Read | Label | Edge Score | n resolved | Realized edge (pp) | 95% BCa CI | Concentration | Net PnL sign |
|---|---|---|---|---|---|---|---|
| interval clears zero on the positive side (uncorrected) | ShouShouKKos 0xc2fb28. Uncorrected single test of 35 (about 0.9 false positives expected by chance); NOT in the FDR-corrected cleared set; net PnL negative. Not a skill verdict. | 12.7 | 137 | +5.6 | [+1.2, +9.5] | n/a | negative |
| not separable from chance | Zarvantis 0x30c92f | 77.1 | 47 | -2.7 | [-8.9, +3.2] | n/a | positive |
| not separable from chance | beachboy4 0xc2e780 | 76.3 | 38 | +0.4 | [-16.0, +15.6] | n/a | positive |
| not separable from chance | Pesdruzi 0x15707f | 59.9 | 34 | +0.4 | [-14.7, +16.2] | 0.58 | positive |
| not separable from chance | dasasd 0x6a7d89 | 49.8 | 32 | -2.5 | [-16.4, +10.8] | n/a | negative |
| not separable from chance | sUSDe 0x657a26 | 49.0 | 30 | +11.2 | [-2.7, +25.2] | 0.45 | positive |
| not separable from chance | anon_1d8a377c 0x1d8a37 | 47.3 | 43 | +0.5 | [-11.8, +12.4] | n/a | negative |
| not separable from chance | Ieirigk733 0xaaaf7f | 47.1 | 32 | +12.2 | [-0.7, +23.7] | n/a | negative |
| not separable from chance | YoonSukYeol 0x00dcd7 | 46.5 | 41 | +2.6 | [-9.6, +14.5] | 0.52 | positive |
| not separable from chance | Vaedrix 0x5afa18 | 45.4 | 49 | +0.9 | [-7.2, +8.4] | n/a | negative |
| not separable from chance | Drakonis 0xffa434 | 44.6 | 47 | +0.1 | [-6.3, +6.2] | n/a | negative |
| not separable from chance | Myndor 0xaef998 | 44.6 | 37 | -0.5 | [-9.9, +7.5] | n/a | negative |
| not separable from chance | JohnAdams54321 0xb94b4d | 41.9 | 45 | -4.4 | [-16.5, +5.6] | n/a | negative |
| not separable from chance | Michelangelo02 0x1d70c3 | 38.8 | 33 | +2.2 | [-6.6, +12.1] | n/a | negative |
| not separable from chance | Caelthar 0xb0f651 | 37.3 | 35 | -1.4 | [-11.2, +9.0] | 0.36 | positive |
| not separable from chance | 0x4128Be7113DBca57 0x4128be | 26.5 | 79 | +0.5 | [-6.8, +8.3] | 0.40 | positive |
| not separable from chance | nbafan88 0xe3d4ed | 24.6 | 84 | -0.0 | [-8.9, +9.1] | 0.30 | positive |
| not separable from chance | PinkPunks- 0x39a5f8 | 14.9 | 124 | -3.3 | [-8.8, +2.8] | 0.33 | positive |
| not separable from chance | Mongabc123 0x1c853e | 6.6 | 184 | +2.7 | [-2.4, +7.8] | 0.25 | positive |
| not separable from chance | nutrichicha 0xa02e4a | 6.0 | 111 | -0.5 | [-4.3, +3.1] | 0.30 | positive |
| too concentrated to read | Vaelric 0x0d83ca | 95.8 | 35 | +1.9 | [-7.8, +11.9] | 2.70 | positive |
| too concentrated to read | Evandor-363 0xc13903 | 95.3 | 38 | +1.5 | [-5.7, +8.5] | 0.95 | positive |
| too concentrated to read | bioo 0xdd5453 | 86.5 | 69 | -1.9 | [-12.4, +8.6] | 1.66 | positive |
| too concentrated to read | zzh991217 0x43751b | 80.7 | 30 | -1.9 | [-13.5, +10.0] | 1.50 | positive |
| too concentrated to read | c66 0xcdfe1e | 71.7 | 31 | +2.6 | [-14.1, +19.1] | 0.81 | positive |
| too concentrated to read | Zorin-472 0x2cc540 | 69.2 | 36 | -2.1 | [-9.2, +5.5] | 10.97 | positive |
| too concentrated to read | coconuttree-130 0xbf49d1 | 57.7 | 61 | +0.0 | [-11.2, +11.2] | 0.85 | positive |
| too concentrated to read | PiderPen 0xeca133 | 57.4 | 37 | +1.7 | [-11.7, +14.9] | 0.73 | positive |
| too concentrated to read | Stefanlecca 0xff2b87 | 54.7 | 37 | +2.4 | [-8.1, +12.8] | 0.86 | positive |
| too concentrated to read | DenzelMorgan 0xd721ee | 54.5 | 70 | +10.2 | [-1.5, +21.2] | 0.62 | positive |
| too concentrated to read | lejcles 0xfc4c47 | 47.7 | 49 | +1.1 | [-6.2, +8.1] | 0.85 | positive |
| too concentrated to read | ligmaaaaa 0x682e38 | 32.8 | 87 | -7.4 | [-14.9, -0.2] | 0.88 | negative |
| too concentrated to read | JHHK 0xe0a38c | 31.5 | 134 | -0.2 | [-5.9, +5.4] | 0.80 | positive |
| too concentrated to read | yuo45 0xf23c45 | 29.5 | 215 | -2.6 | [-7.3, +2.2] | 0.89 | positive |
| too concentrated to read | Green 0x220abc | 18.1 | 72 | +1.2 | [-6.5, +8.9] | 2.47 | positive |
| too few resolved positions | gggggggggggggggggg 0xff7a60 | 99.5 | <30 (npos=23) | - | - | - | positive |
| too few resolved positions | Joycejoshua 0xd68cd8 | 98.4 | <30 (npos=20) | - | - | - | positive |
| too few resolved positions | mytaXBT 0x54d085 | 96.0 | <30 (npos=21) | - | - | - | positive |
| too few resolved positions | Sytherin 0xf386fb | 95.1 | <30 (npos=22) | - | - | - | positive |
| too few resolved positions | Strenik 0x8988e1 | 81.6 | <30 (npos=26) | - | - | - | positive |
| too few resolved positions | jjwin 0x988463 | 81.4 | <30 (npos=25) | - | - | - | positive |
| too few resolved positions | tibimarket 0x0886ee | 76.1 | <30 (npos=24) | - | - | - | positive |
| too few resolved positions | Xerria 0xd1800c | 71.3 | <30 (npos=26) | - | - | - | positive |
| too few resolved positions | Jexley 0x482002 | 62.0 | <30 (npos=23) | - | - | - | positive |
| too few resolved positions | rizzdaddy 0x39feac | 61.9 | <30 (npos=28) | - | - | - | positive |
| too few resolved positions | Trevio 0x0e9be6 | 61.1 | <30 (npos=24) | - | - | - | negative |
| too few resolved positions | Velenza 0xb71e86 | 60.5 | <30 (npos=26) | - | - | - | positive |
| too few resolved positions | Lomyn 0xeb7681 | 58.2 | <30 (npos=26) | - | - | - | positive |
| too few resolved positions | Uryxen 0xc18e5f | 56.7 | <30 (npos=28) | - | - | - | negative |
| too few resolved positions | SirSats 0x68a9f0 | 55.2 | <30 (npos=27) | - | - | - | positive |

Reads: 1 interval-clears-zero on the positive side (uncorrected), 19 not separable from chance, 15 too concentrated to read, 15 with fewer than 30 resolved positions. Zero of the 50 are in the FDR-corrected cleared set.
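The four reads can be reproduced from the table columns with a small decision function. This is a sketch that mirrors the thresholds stated in the Method section (a 30-position floor and a 0.6 concentration cutoff), not the analyzer's actual code. The function names are hypothetical, and the ordering of the checks is inferred from the table, where a concentrated record keeps that read even when its interval excludes zero. The note does not define a read for an unconcentrated interval lying wholly below zero, so that branch carries a placeholder label.

```python
READ_FLOOR = 30             # minimum resolved positions to test
CONCENTRATION_CUTOFF = 0.6  # biggest winning event PnL / net PnL


def concentration_ratio(biggest_win_pnl, net_pnl):
    """Return the ratio, or None when net PnL is at or below zero."""
    if net_pnl <= 0:
        return None  # non-computable: cannot trigger the concentrated read
    return biggest_win_pnl / net_pnl


def classify(n_resolved, ci_low, ci_high, ratio):
    """Map one wallet's diagnostics to a read label."""
    if n_resolved < READ_FLOOR:
        return "too few resolved positions"
    if ratio is not None and ratio >= CONCENTRATION_CUTOFF:
        return "too concentrated to read"
    if ci_low > 0:
        return "interval clears zero on the positive side (uncorrected)"
    if ci_high < 0:
        # Placeholder: this case is not defined in the note.
        return "interval below zero (undefined in this note)"
    return "not separable from chance"


# Spot checks against rows of the table (interval bounds in pp).
assert classify(137, 1.2, 9.5, None) == (
    "interval clears zero on the positive side (uncorrected)")
assert classify(34, -14.7, 16.2, 0.58) == "not separable from chance"
assert classify(70, -1.5, 21.2, 0.62) == "too concentrated to read"
assert classify(87, -14.9, -0.2, 0.88) == "too concentrated to read"
assert classify(23, 0.0, 0.0, None) == "too few resolved positions"
```

The dollar inputs to `concentration_ratio` are withheld in this note, so the spot checks pass the published ratios directly.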

Method

  • Universe: the published top-50 Edge Score cohort artifact (wallets with at least 20 resolved positions, ranked by the descriptive Edge Score composite; refreshed daily). The rendered leaderboard page shows the cohort's positive-PnL subset.
  • Realized entry edge per wallet: outcome minus volume-weighted entry price for each resolved unique position, averaged with equal weight across positions; 95% interval via BCa bootstrap. Readability floor: 30 resolved positions (35 of 50 qualify; the other 15 are reported as insufficient, not tested).
  • Concentration screen: the single biggest winning event's PnL divided by net PnL; at or above 0.6 the record is reported as too concentrated to read. The ratio is not bounded by 1 and exceeds it when one event is larger than the net result. A non-computable ratio (analyzer-side net PnL at or below zero) cannot trigger the concentrated read; the record is then classified by its interval alone. This applies to the one positive-interval row: its net-negative PnL makes the ratio non-computable, so that record was not concentration-screened. The sign column comes from the cohort artifact, which can disagree with the analyzer-side figure.
  • Multiple comparisons: 35 one-sided tests at 2.5% imply about 0.875 expected false positives under a null of zero skill. The cohort-level corrected screen (Benjamini-Hochberg, q=0.10, published separately) cleared 178 of 3,871 wallets platform-wide; none of the 50 in this cohort are among them.
  • Dollar PnL figures are withheld: the cohort artifact and the live analyzer disagree on realized PnL for some wallets, and we do not publish numbers our own pipelines dispute. The reconciliation is tracked internally; the sign column is the cohort artifact's.
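For readers unfamiliar with the correction named above, here is a minimal sketch of the Benjamini-Hochberg step-up procedure at q=0.10. The p-values are invented for illustration, and the example uses 35 tests only to keep it small; the published screen ran platform-wide across 3,871 wallets, and how its p-values were derived is not described in this note.

```python
def benjamini_hochberg(p_values, q=0.10):
    """Return a list of booleans, True where the hypothesis is cleared.

    Step-up rule: sort the p-values, find the largest rank k such that
    p_(k) <= (k / m) * q, then clear every hypothesis ranked at or below k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank

    cleared = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            cleared[idx] = True
    return cleared


# Hypothetical: one nominally significant p-value among 35 unremarkable ones.
example_p_values = [0.012] + [0.50] * 34
example_cleared = benjamini_hochberg(example_p_values, q=0.10)
print(sum(example_cleared))  # number of tests that survive the correction
```

In this made-up case the smallest p-value would need to be at or below 0.10 / 35 (about 0.0029) to clear on its own, so a lone result at 0.012 does not survive.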

Diagnostics on public on-chain records, not investment advice. A past read is not a forecast, and the absence of corrected evidence of skill is not evidence of no skill: several intervals are wide and include large positive values, so the record is underpowered, not settled. If you think the math is wrong, the interval is the thing to argue with; we publish corrections when receipts say so.