Elizabeth Spiegel's blog: April 2018

Thursday, April 26, 2018

no nillion for you

We had two math and statistics professionals look into the likelihood of the New Mexico/ Henderson events occurring.

The first report is from an experienced data scientist, who prefers to remain anonymous, but whose professional opinion I sought and whose report I will be forwarding to the ethics committee. The data scientist examined three scenarios in the Jan 15 tournament: one using pre-tournament ratings, one using post-tournament ratings, and a third using the lowest published rating in the past year of the Henderson students and the peak ratings of their opponents. The data scientist found that the chance of the Jan 15 tournament occurring, assuming pre-tournament ratings were accurate, is 0.000000000000000000000000000000888, which is less than one in one nonillion (1 with 30 zeroes after it). A nonillion is approximately a billion times the number of stars in the observable universe.

Assuming post-tournament ratings led to a probability of 0.000000000045, which is less than 1 in 100 billion (note that 100 billion is the approximate number of stars in our galaxy).

And the third (most favorable to Henderson) scenario, assuming the Henderson students were at their past-year weakest and the opponents were at their lifetime strongest, still found a likelihood of only 0.000000037, which is less than 1 in 10 million.

A second analysis was done by a parent on my team who works in computer programming and statistics. I present his work and conclusions below; for obvious reasons they are very similar to the above. They are slightly different in scenario two because the first statistician assumed post-tournament ratings of both sides and the second analysis assumed only post-tournament ratings of the New Mexico players. (This scenario was run because an argument is being made that the New Mexican players' ratings were provisional and inaccurate; see below.)

Base Analysis

The main argument is that the EP vs. EG tournament is highly implausible. The ratings difference between the winners and losers is much too wide for such a number of simultaneous upsets to occur.

This analysis looked at each individual game, calculated the odds of losing each game, and then calculated the odds of a 0-28 score based on those odds. The odds of losing a given game is given by the USCF ELO model (see resources below). Specifically, the odds of losing a given game is 1 minus the odds of winning a game given two ratings:
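For reference, the standard Elo winning expectancy, assumed here to be the formula meant above, for a player rated R against an opponent rated R_opp (draws ignored) can be written as:

```latex
W = \frac{1}{1 + 10^{(R_{\text{opp}} - R)/400}}, \qquad P(\text{loss}) = 1 - W
```

A 400-point favorite therefore has W = 10/11, so the chance of that favorite losing a single game is about 9%.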

This analysis excludes the possibility of draws, but if we included those odds the odds of losing any given game would be lower, so would only strengthen this argument.
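The per-game method described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard Elo expectancy (assumed to match the USCF model referred to), treats games as independent, ignores draws, and the pairings are made-up ratings, not the actual tournament data.

```python
def win_expectancy(own: float, opp: float) -> float:
    """Standard Elo expected score for `own` against `opp` (draws ignored)."""
    return 1.0 / (1.0 + 10.0 ** ((opp - own) / 400.0))


def sweep_probability(pairings):
    """Probability that the first-listed player loses every listed game,
    assuming the games are independent."""
    p = 1.0
    for favorite, underdog in pairings:
        # Chance the favorite loses this game = 1 - chance the favorite wins.
        p *= 1.0 - win_expectancy(favorite, underdog)
    return p


# Hypothetical (favorite rating, underdog rating) pairs, NOT the real data.
example_pairings = [(1200, 800), (1150, 700), (1300, 900)]
p = sweep_probability(example_pairings)
print(f"probability {p:.3e}, about 1 in {1 / p:,.0f}")
```

Feeding in the real 28 rating pairs would be the way to reproduce a figure like the one quoted below; with only three hypothetical games the number is of course far less extreme.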

Given the above, the odds of such a lopsided tournament occurring is once in 1.13 x 10^30. In plain English, that's a once-in-a-nonillion chance of occurring (we had to look that up; see resources below).

5 sigma is often used as an extreme hurdle to determine validity or significance. Scientists used it to validate the discovery of a new particle (see article). 5 sigma is an event that occurs once in 3.5 million times. Not billion. Not trillion.
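The "once in 3.5 million" figure can be checked with a short Python sketch. It assumes the one-sided tail of a standard normal distribution, which is the convention that yields roughly this number; the function name is just illustrative.

```python
import math


def one_sided_tail(sigma: float) -> float:
    """P(Z > sigma) for a standard normal variable Z."""
    return 0.5 * math.erfc(sigma / math.sqrt(2.0))


p = one_sided_tail(5.0)
print(f"p = {p:.3e}, about 1 in {1 / p:,.0f}")
```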

Post-event Peak Analysis

One argument in defense of the upset team is that the opponent ratings were provisional and therefore meaningless. It's true that six out of the seven winners had provisional ratings. We ran the same test as the above, but this time using the peak ratings of the opponents after the above suspicious event.

Sure enough, most of the provisionally rated opponents had their ratings move up (even though much of it occurred by beating their much higher rated opponents in the above event!). As of April 2018, four players still had provisional ratings, but two of those had 24 and 25 games respectively, so their ratings are close to non-provisional (26 games are needed for a non-provisional rating).

Using these peak ratings of the opponents, running the same analysis shows the odds of a 0-28 sweep/upset is one in 1.44 x 10^16.

Or, in plain English, one in 14 quadrillion.

This seems like a fair analysis; if you look through the histories of the provisionally rated players, there isn't much to indicate that they are materially, grossly underrated. They do show patterns of consistently losing to low rated players etc.

Even-match Analysis

Finally, all this math aside, the simplest analysis is to just look at the odds of a 0-28 sweep of an evenly matched team, which is far from the case here. The odds of such an upset is simply 0.5^28.

Using this method, we get the odds of this occurring as one in 268 million. Remember, 5 sigma is a once in 3.5 million event, good enough to validate the discovery of a new particle.
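The even-match figure is simple enough to verify exactly. A minimal Python check, assuming 28 independent games with a 50% chance of losing each and no draws:

```python
from fractions import Fraction

games = 28
# Each game is a coin flip, so losing all of them has probability (1/2)^28.
p_sweep = Fraction(1, 2) ** games
print(f"1 in {p_sweep.denominator:,}")  # 1 in 268,435,456
```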

Conclusion
Given the above analysis, and especially the last 'even-match' sanity-check analysis, it is safe (or exceedingly, astronomically safe) to say that this was not a valid event.

We have seen various analyses on this (including one from a math PhD, professional quantitative analyst/statistician), and numbers may vary due to rounding and other issues, but the conclusion is basically the same: this event is astronomically unlikely to have occurred normally.

Friday, April 20, 2018

"but I can beat him, Mister"

Let me say first that my assistant principal, John Galvin, is the greatest detective in the world. He's the one who originally caught the Henderson cheating, basically figured out everything they did going back a couple years.

So yesterday I see him hunched over his desk, reading a small book. It turns out to be The Champions Game, by Saul Ramirez.
Let's read along:

In my mind, telling a kid who wants to play to draw is unethical. You can ask, if it means the team wins. But if the kid wants to play for a personal championship, you have to respect that. He earned the chance. Leaning on him is an abuse of power.

But now think for a minute about this story. Why would it hurt the team for the top two scorers to have a decisive result vs a draw? It can't. Either way, the team gets one point. In fact, they were up five points going into the round -- they had already clinched it.

Now look at the crosstable (MS Novice):

    1 | LEO GONZALEZ                    |6.0  |W   9|W  33|W  24|W   5|W  11|W   6|L   2|
   TX | 15707532 / R: Unrated-> 976P7   |     |     |     |     |     |     |     |     |
-----------------------------------------------------------------------------------------
    2 | BRANDON CABALLERO               |6.0  |W  27|L   7|W  19|W  24|W  13|W  12|W   1|
   TX | 15707553 / R: Unrated-> 931P7   |     |     |     |     |     |     |     |     |
-----------------------------------------------------------------------------------------

He didn't tell Leo to draw; he told him to lose. Why? It boggles the mind. I guess he wanted two co-champions rather than one. Reread the dialogue, this time knowing the kid is begging to play the game honestly and he's being told to throw it. The chutzpah of writing "He understood it, but his ego was fighting the concept of sacrificing in order to achieve something greater" just blows my mind.

Also, this isn't cheating, but did you guys watch the video about Henderson girls in Mike's article? Where Ramirez says "I'm not going to lie, I had to read a lot of books about how to coach a girls' team ...They mess up their positions in a whole different way" wtf?

Tuesday, April 17, 2018

I'm sorry, what?

USCF President Carol Meyer issued a statement that began:

"1. No cheating happened, nor is alleged to have happened, at the 2018 National Junior High Championship; the alleged incident took place prior to our event."

I'm sorry, what????????
We're accusing them of intentionally losing games to lower their rating in order to enter the National Junior High in inappropriately low sections. They did this and won the tournament.

Intentionally entering the wrong section is cheating.
They began their cheating with the tournaments on 1/15* and 1/19, but this was done only in order to cheat at nationals. In some sense, nationals is the real cheating because the earlier events have no victims in and of themselves.

The statement Carol Meyer issued is now being used by the Henderson coach to claim he has been exonerated.
Honestly, it's hard for me to fathom what she could have been thinking, but the statement needs to be retracted immediately.

Friday, April 13, 2018

spineless

So the USCF put out this statement, which I consider a laughable shirking of their responsibilities.

"The US Chess Federation has not received a written complaint to initiate our procedures for factual inquiry and ruling on any allegation of cheating pertaining to this event."

And you don't care enough to do anything on your own?? After no fewer than 12 coaches begged you, in a timely fashion, to act? After your own national championship becomes an outrage and a joke?

It's the casual denial of responsibility that kills me.

You have all the facts you need, Carol Meyer, USCF et al. Cheating obviously occurred and ruined YOUR national championship. People complained to your organization in time to remove the kids from the section and fix the problem. If your rules are really set up to make you powerless to investigate on your own, then I feel sorry for you.

Pretty soon no one is going to pay money to attend your national championship if you don't fulfill your fundamental responsibility of enforcing the rules.

Thursday, April 12, 2018

Cheating at the National Junior High

Last weekend, at the National Junior High School Chess Championship, Henderson Middle School from El Paso, Texas "won" the Under 750 and Under 1000 sections with teams of obviously sandbagged players. This was brought to the attention of Chief TD David Hater by many coaches, but he felt it was not his responsibility to act.

Let's examine the evidence.

The Under 1000 team members are:

Ra***ez, Saul (7.0, 899)
Ra***ez, Juan (6.5, 867)
Pal***no, Carlos (6.0, 760)
Ar**jo, Carlos (4.5, 884)

Why are their ratings under 900, you are thinking? Because that allowed them to play in and win the Texas Under 900 championship.

To get their ratings under 900 for these events, they claim to have played a two-round match in Las Cruces, NM, where they lost 26-0, most of which were 400+ point upsets.

This was rated as a tournament, rather than a match; perhaps accidentally or perhaps because there's an anti-sandbagging rule that says you can only lose 50 points in a match.

My assistant principal, John Galvin, reported this at 7 pm Saturday. At the 2:30 meeting the next day, there was some disagreement about whether these results were spectacularly unlikely or actually impossible.

A parent from my team who is also a mathematician was kind enough to run some numbers for me (results have been reviewed by a few of his colleagues and detailed discussion is in the comments).

His analysis showed the odds of losing 26-0 with the rating differentials is 1 in 3 x 10^21 (3,000,000,000,000,000,000,000).

Without considering ratings, it's 1 in 263,000,000.

When asked, the Henderson coach attributed his team's poor performance to "being kids" and coming from underprivileged homes.

The Under 750 team is:

R**z, Alessandra (7.0, 734)
Arga***na, Aime (6.0, 585)
Ag***re, Devante (5.0, 632)
Ji***ez, Jose Luis (5.0, 654)
Valadez, Angelica (5.0, 683)

On Jan 19, 2018, they held another tournament / match in New Mexico in which the Texas players again did very, very poorly. This time their Under 750 team goes under. Notice how the MSA report lists the players' states in the left-hand corner so you can easily see how badly Texas fared.

The TD supervising these tournaments, Will Barela, is also the President of the New Mexico Chess Association. Looking through his directing history reveals some, let's say ... "purposeful" events. Between Dec 28 and Jan 5 of 2017/2018, he rated a series of 15 multi-section tournaments, in which a master who was dropping dangerously close to 2200 beat kids rated 100-1000 in hundreds of games, thereby obtaining his life master title.

Congratulations to Life Master Benjamin Corarreti, cheater.

I have never seen more obvious evidence of sandbagging. There is no attempt to hide the thrown games, not a single draw.

USCF officials could have moved their sections and saved the integrity and reputation of their tournament; they were told at the beginning of round 5. Instead, they insist it needs to be handled by the Ethics Committee.

Handing it off to the Ethics Committee has enormous costs. The entire credibility of the tournament experience is ruined for everyone. A confidential committee decision six months later does nothing to fix this. The cheated teams will never get to walk across the stage; they'll never get the newspaper articles, or the homecoming celebration, or the exhilaration of that night.

I know there will be cases where the evidence is not clear and the TDs can't, in good conscience, act. But this is not that situation. This is the clearest, most unambiguous case of cheating POSSIBLE.

If you aren't going to act on this, you can't claim to be enforcing the rules.

It's unfortunate it wasn't handled well at the time, and more unfortunate (see next post) that the USCF is doubling down on their new stated policy of not interfering in cheating in progress.

The USCF ought now to announce the cheating publicly and congratulate Metcalf and Thomas Edison on their wins in the U750 and U1000 sections, and Scotty Gordon and Sameris Desvignes on their individual triumphs.

In future, under sections should use peak rating.

Wednesday, April 11, 2018

How to Solve Coaches Cheating at Nationals

Use peak ratings as eligibility for under sections at nationals.
Children's ratings should not be going down. This solves the problem and is easy to understand and enforce.
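To show how simple the rule would be to enforce, here is a rough Python sketch. Everything in it is an assumption for illustration: the function name, the idea that a player's published rating history is available as a list, the choice that a rating exactly equal to the cap is not eligible, and the sample numbers, which are made up.

```python
def eligible_for_section(rating_history, section_cap):
    """True if the player's peak published rating is below the section cap.

    Assumption: an unrated player (empty history) is treated as eligible.
    """
    peak = max(rating_history, default=0)
    return peak < section_cap


# Hypothetical rating histories, for illustration only.
dropped_player = [1250, 1310, 1180, 880]   # current rating 880, peak 1310
steady_player = [640, 700, 734]            # current and peak rating 734

print(eligible_for_section(dropped_player, 1000))  # False: peak 1310 exceeds the cap
print(eligible_for_section(steady_player, 750))    # True: peak 734 is under the cap
```

Under a current-rating rule the first player would get into an Under 1000 section; under a peak-rating rule a sudden rating drop buys nothing.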