Thursday, April 26, 2018

no nillion for you

We had two math and statistics professionals look into the likelihood of the New Mexico/Henderson events occurring.

The first report is from an experienced data scientist, who prefers to remain anonymous, but whose professional opinion I sought and whose report I will be forwarding to the ethics committee. The data scientist examined three scenarios for the Jan 15 tournament: one using pre-tournament ratings, one using post-tournament ratings, and a third using the lowest published rating in the past year of the Henderson students and the peak ratings of their opponents. The data scientist found that the chance of the Jan 15 tournament results occurring, assuming pre-tournament ratings were accurate, is 0.000000000000000000000000000000888, which is less than one in one nonillion (1 with 30 zeroes after it). A nonillion is approximately a billion times the number of stars in the observable universe.

Using post-tournament ratings gave a probability of 0.000000000045, which is less than 1 in 100 billion (note that 100 billion is the approximate number of stars in our galaxy).

And the third (most favorable to Henderson) scenario, assuming the Henderson students were at their past-year weakest and the opponents were at their lifetime strongest, still found a likelihood of only 0.000000037, which is less than 1 in 10 million.

A second analysis was done by a parent on my team who works in computer programming and statistics. I present his work and conclusions below; for obvious reasons they are very similar to the above. They are slightly different in scenario two because the first statistician used post-tournament ratings for both sides, while the second analysis used post-tournament ratings only for the New Mexico players. (This scenario was run because an argument is being made that the New Mexico players' ratings were provisional and inaccurate; see below.)

Base Analysis

The main argument is that the EP vs. EG tournament is highly implausible. The ratings difference between the winners and losers is much too wide for such a number of simultaneous upsets to occur.

This analysis looked at each individual game, calculated the odds of losing each game, and then calculated the odds of a 0-28 score based on those odds. The odds of losing a given game are given by the USCF Elo model (see resources below). Specifically, the odds of losing a given game are 1 minus the odds of winning that game given the two ratings. Under the standard Elo formula, the odds of a player rated R_a winning against a player rated R_b are 1 / (1 + 10^((R_b - R_a) / 400)).

This analysis excludes the possibility of draws. Including them would lower the odds of losing any given game, which would only strengthen this argument.

Given the above, the odds of such a lopsided tournament occurring are one in 1.13 x 10^30. In plain English, that's a one in a nonillion chance of occurring (we had to look that up; see resources below).
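
The per-game calculation described above can be sketched in a few lines of Python. This is only an illustration of the method, not the parent's actual spreadsheet or code: the function names are made up for this example, the ratings are hypothetical placeholders rather than the real tournament data, and it uses the standard Elo logistic curve with draws ignored and games treated as independent.

```python
def win_probability(own_rating: float, opp_rating: float) -> float:
    """Expected score for own_rating against opp_rating on the standard Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((opp_rating - own_rating) / 400.0))


def sweep_probability(pairings: list[tuple[float, float]]) -> float:
    """Chance the first-listed player loses every game.

    Assumes games are independent and ignores draws, as the analysis above does.
    """
    prob = 1.0
    for first_rating, second_rating in pairings:
        prob *= 1.0 - win_probability(first_rating, second_rating)
    return prob


# Hypothetical (higher-rated, lower-rated) pairings, for illustration only.
# These are NOT the actual ratings from the tournament.
example_pairings = [(1300, 900), (1250, 800), (1400, 1000), (1200, 850)]

p = sweep_probability(example_pairings)
print(f"chance the higher-rated side loses all {len(example_pairings)} games: {p:.3g}")
```

With a 400-point gap, the higher-rated player is expected to score about 0.91, so each such loss has a probability of roughly 0.09 on its own. Multiplying 28 numbers of that size is what drives the result down to such an extreme value.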

5 sigma is often used as an extreme hurdle to determine validity or significance. Scientists used it to validate the discovery of a new particle (see article). 5 sigma is an event that occurs once in 3.5 million times. Not billion. Not trillion.
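
For anyone who wants to check the "once in 3.5 million" figure, here is a short sketch. It assumes the one-sided convention (the chance of a normally distributed value landing more than 5 standard deviations above the mean); the function name is just for illustration.

```python
import math


def upper_tail(sigma: float) -> float:
    """One-sided probability that a standard normal value exceeds sigma."""
    return 0.5 * math.erfc(sigma / math.sqrt(2.0))


p = upper_tail(5.0)
print(f"5 sigma: p = {p:.3g}, about 1 in {1 / p:,.0f}")
```

Under that assumption the result comes out at roughly 1 in 3.5 million. A two-sided convention would give about half that, which does not change the comparison.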

Post-event Peak Analysis

One argument in defense of the upset team is that the opponent ratings were provisional and therefore meaningless. It's true that six out of the seven winners had provisional ratings. We ran the same test as the above, but this time using the peak ratings of the opponents after the above suspicious event.

Sure enough, most of the provisionally rated opponents had their ratings move up (even though much of it occurred by beating their much higher rated opponents in the above event!). As of April 2018, four players still had provisional ratings, but two of those had 24 and 25 games respectively, so their ratings are close to non-provisional (26 games are needed for a non-provisional rating).

Using these peak ratings of the opponents, running the same analysis shows the odds of a 0-28 sweep/upset is one in 1.44 x 10^16.

Or, in plain English, one in 14 quadrillion.

This seems like a fair analysis; if you look through the histories of the provisionally rated players, there isn't much to indicate that they are materially, grossly underrated. They do show patterns of consistently losing to low rated players etc.

Even-match Analysis

Finally, all this math aside, the simplest analysis is to just look at the odds of a 0-28 sweep of an evenly matched team, which is far from the case here. The odds of such an upset is simply 0.5^28.

Using this method, we get the odds of this occurring as one in 268 million. Remember, 5 sigma is a once in 3.5 million event, good enough to validate the discovery of a new particle.
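
The even-match figure is easy to verify. The sketch below treats each of the 28 games as an independent coin flip between equal players with no draws, which is the simplification described above.

```python
from fractions import Fraction

GAMES = 28  # number of games in the sweep discussed above

# Each game is a 50/50 coin flip between evenly matched players, draws ignored.
p_sweep = Fraction(1, 2) ** GAMES

print(f"1 in {p_sweep.denominator:,}")  # prints "1 in 268,435,456"
```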

Given the above analysis, and especially even the last 'even-match', sanity-check analysis, it is safe (or exceedingly, astronomically safe) to say that this was not a valid event.

We have seen various analyses on this (including one from a math PhD, professional quantitative analyst/statistician), and numbers may vary due to rounding and other issues, but the conclusion is basically the same: this event is astronomically unlikely to have occurred normally.

Friday, April 20, 2018

"but I can beat him, Mister"

Let me say first that my assistant principal, John Galvin, is the greatest detective in the world. He's the one who originally caught the Henderson cheating, basically figured out everything they did going back a couple years.

So yesterday I see him hunched over his desk, reading a small book. It turns out to be The Champions Game, by Saul Ramirez.
Let's read along:

In my mind, telling a kid who wants to play to draw is unethical. You can ask, if it means the team wins. But if the kid wants to play for a personal championship, you have to respect that. He earned the chance. Leaning on him is an abuse of power.

But now think for a minute about this story. Why would it hurt the team for the top two scorers to have a decisive result vs a draw? It can't. Either way, the team gets one point. In fact, they were up five points going into the round -- they had already clinched it.

Now look at the crosstable (MS Novice):

    1 | LEO GONZALEZ                    |6.0  |W   9|W  33|W  24|W   5|W  11|W   6|L   2|
   TX | 15707532 / R: Unrated-> 976P7   |     |     |     |     |     |     |     |     |
    2 | BRANDON CABALLERO               |6.0  |W  27|L   7|W  19|W  24|W  13|W  12|W   1|
   TX | 15707553 / R: Unrated-> 931P7   |     |     |     |     |     |     |     |     |

He didn't tell Leo to draw; he told him to lose. Why? It boggles the mind. I guess he wanted two co-champions rather than one. Reread the dialogue, this time knowing the kid is begging to play the game honestly and he's being told to throw it. The chutzpah of writing "He understood it, but his ego was fighting the concept of sacrificing in order to achieve something greater" just blows my mind.

Also, this isn't cheating, but did you guys watch the video about Henderson girls in Mike's article? Where Ramirez says "I'm not going to lie, I had to read a lot of books about how to coach a girls' team ...They mess up their positions in a whole different way"  wtf?

Tuesday, April 17, 2018

I'm sorry, what?

USCF President Carol Meyer issued a statement that began:

"1. No cheating happened, nor is alleged to have happened, at the 2018 National Junior High Championship; the alleged incident took place prior to our event."

I'm sorry, what????????
We're accusing them of intentionally losing games to lower their rating in order to enter the National Junior High in inappropriately low sections. They did this and won the tournament.

Intentionally entering the wrong section is cheating.
They began their cheating with the tournaments on 1/15* and 1/19, but this was done only in order to cheat at nationals. In some sense, nationals is the real cheating because the earlier events have no victims in and of themselves.

The statement Carol Meyer issued is now being used by the Henderson coach to claim he has been exonerated.
Honestly, it's hard for me to fathom what she could have been thinking, but the statement needs to be retracted immediately.

Friday, April 13, 2018


So the USCF put out this statement, which I consider a laughable shirking of their responsibilities.

"The US Chess Federation has not received a written complaint to initiate our procedures for factual inquiry and ruling on any allegation of cheating pertaining to this event."

And you don't care enough to do anything on your own?? After no fewer than 12 coaches begged you, in a timely fashion, to act? After your own national championship becomes an outrage and a joke?

It's the casual denial of responsibility that kills me.

You have all the facts you need, Carol Meyer, USCF et al. Cheating obviously occurred and ruined YOUR national championship. People complained to your organization in time to remove the kids from the section and fix the problem. If your rules are really set up to make you powerless to investigate on your own, then I feel sorry for you.

Pretty soon no one is going to pay money to attend your national championship if you don't fulfill your fundamental responsibility of enforcing the rules. 

Thursday, April 12, 2018

Cheating at the National Junior High

Last weekend, at the National Junior High School Chess Championship, Henderson Middle School from El Paso, Texas "won" the Under 750 and Under 1000 sections with teams of obviously sandbagged players. This was brought to the attention of Chief TD David Hater by many coaches, but he felt it was not his responsibility to act.

Let's examine the evidence.
The Under 1000 team members are:
Ra***ez, Saul (7.0, 899)
Ra***ez, Juan (6.5, 867)
Pal***no, Carlos (6.0, 760)
Ar**jo, Carlos (4.5, 884)

Why are their ratings under 900, you are thinking? Because that allowed them to play in and win the Texas Under 900 championship.

To get their ratings under 900 for these events, they claim to have played a two-round match in Las Cruces, NM, where they lost 26-0, with most of the losses being 400+ point upsets.
This was rated as a tournament rather than a match, perhaps accidentally or perhaps because there's an anti-sandbagging rule that says you can only lose 50 points in a match.

My assistant principal, John Galvin, reported this at 7 pm Saturday. At the 2:30 meeting the next day, there was some disagreement about whether these results were spectacularly unlikely or actually impossible.
A parent from my team who is also a mathematician was kind enough to run some numbers for me (the results have been reviewed by a few of his colleagues, and detailed discussion is in the comments).
His analysis showed the odds of losing 26-0 with these rating differentials are 1 in 3 x 10^21.
Without considering ratings, it's 1 in 263,000,000.

When asked, the Henderson coach attributed his team's poor performance to "being kids" and coming from underprivileged homes.

The Under 750 team is:

R**z, Alessandra (7.0, 734)
Arga***na, Aime (6.0, 585)
Ag***re, Devante (5.0, 632)
Ji***ez, Jose Luis (5.0, 654)
Valadez, Angelica (5.0, 683) 

On Jan 19, 2018, they held another tournament / match in New Mexico in which the Texas players again did very, very poorly. This time their Under 750 team goes under. Notice how the MSA report lists the players' states in the left-hand corner so you can easily see how badly Texas fared.

The TD supervising these tournaments, Will Barela, is also the President of the New Mexico Chess Association.
Looking through his directing history reveals some, let's say ... "purposeful" events. Between Dec 28, 2017 and Jan 5, 2018, he rated a series of 15 multi-section tournaments in which a master who was dropping dangerously close to 2200 beat kids rated 100-1000 in hundreds of games, thereby obtaining his life master title.
Congratulations to Life Master Benjamin Corarreti.

I have never seen more obvious evidence of sandbagging. There is no attempt to hide the thrown games, not a single draw.
USCF officials could have moved the players out of those sections and saved the integrity and reputation of their tournament; they were told at the beginning of round 5. Instead, they insist it needs to be handled by the Ethics Committee.

Handing it off to the Ethics Committee has enormous costs. The entire credibility of the tournament experience is ruined for everyone. A confidential committee decision six months later does nothing to fix this. The cheated teams will never get to walk across the stage; they'll never get the newspaper articles, or the homecoming celebration, or the exhilaration of that night.

I know there will be cases where the evidence is not clear and the TDs can't, in good conscience, act. But this is not that situation. This is the clearest, most unambiguous case of cheating POSSIBLE.

If you aren't going to act on this, you can't claim to be enforcing the rules.

It's unfortunate it wasn't handled well at the time, and more unfortunate (see next post) that the USCF is doubling down on their new stated policy of not interfering in cheating in progress.
The USCF ought now to announce the cheating publicly and congratulate Metcalf and Thomas Edison on their wins in the U750 and U1000 sections, and Scotty Gordon and Sameris Desvignes on their individual triumphs.

In future, under sections should use peak rating.  

Wednesday, April 11, 2018

How to Solve Coaches Cheating at Nationals

Use peak ratings as eligibility for under sections at nationals.
Children's ratings should not be going down. This solves the problem and is easy to understand and enforce.
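
To show how simple this would be to enforce, here is a rough sketch of the check. Everything in it is an assumption for illustration: the function name, the idea that a player's published rating history is available as a plain list, and the reading of "under 1000" as strictly below 1000. It is not an existing USCF rule or tool.

```python
def eligible_for_under_section(rating_history: list[int], section_cap: int) -> bool:
    """Return True only if the player's peak published rating is below the cap.

    An empty history (an unrated player) is treated as a peak of 0.
    """
    peak = max(rating_history, default=0)
    return peak < section_cap


# Hypothetical rating history, for illustration only; not a real player's record.
history = [1120, 1085, 899]

print(eligible_for_under_section(history, 1000))  # False: the peak of 1120 is over the cap
print(eligible_for_under_section(history, 1200))  # True: the peak of 1120 is under the cap
```

Because the check looks only at the highest rating a player has ever held, losing games on purpose before nationals no longer changes which section the player belongs in.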