Wednesday, April 29, 2009

help me design a research experiment!

I've been thinking recently about the role of confidence in chess. My theory has always been that there are some things that it helps to be unconfident in (like driving) and some things it clearly helps to be overconfident in (public speaking, fighting), and that chess clearly belonged to the latter group. Last weekend, while coaching at the Nationals Girls Championship, it occurred to me that I was hearing the phrase "I don't know if this is right, but..." a crazy amount (though interestingly, not from the very best players or the very worst). Which inspired me to try an idea for a research experiment that Jean Hoffman and I had come up with a couple months ago.

I'm trying to test whether girls are less self-confident in their chess abilities than boys, whether that affects their ability to find the right move in different situations, whether it impacts their time usage, and whether one gender is more accurate in their self-assessment (i.e. maybe girls and boys both answer a question incorrectly, but only girls are aware of it). I'm in the middle of producing the survey and test positions, so I'm eager for your feedback or ideas. I'm excited about doing this well, because I feel like I'm in a uniquely good situation to gather data: many coaches have already offered to help.

Students are given 12 positions and asked what move they would play if they had this position in a game. While the problems are not labeled as such, there are four types of answers: tactics, attacking combinations/moves, positional moves, and positions where you must respond to your opponent's threat. After deciding on a move, students are asked
a) how sure they are that their move is a good move
b) how sure they are that their move is the best move.
They can choose from very sure, sure, medium sure, not sure, and it's a guess.
There are two worksheets, one for tournament players 800-1100, another for those rated 1100-1500. Students may take as long as they wish to answer the questions, but they are asked to report the total time.
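
One way the self-assessment question could eventually be checked is sketched below in Python. Everything in it is an assumption for illustration rather than part of the survey: the equally spaced numeric coding of the five answer choices, the (group, confidence, correct) row layout, and the made-up example rows. It compares average stated confidence with the fraction of correct answers for each group.

```python
from collections import defaultdict
from statistics import mean

# Assumed numeric coding of the five answer choices (equal spacing is a guess).
CONFIDENCE = {
    "very sure": 1.0,
    "sure": 0.75,
    "medium sure": 0.5,
    "not sure": 0.25,
    "it's a guess": 0.0,
}


def summarize(responses):
    """Return per-group mean confidence, accuracy, and their difference.

    Each response is a (group, confidence_label, is_correct) tuple.
    """
    by_group = defaultdict(list)
    for group, label, is_correct in responses:
        by_group[group].append((CONFIDENCE[label], 1.0 if is_correct else 0.0))

    summary = {}
    for group, rows in by_group.items():
        confidence = mean(c for c, _ in rows)
        accuracy = mean(a for _, a in rows)
        summary[group] = {
            "n": len(rows),
            "mean_confidence": confidence,
            "accuracy": accuracy,
            # Positive means more confident than accurate; negative means less.
            "overconfidence": confidence - accuracy,
        }
    return summary


# Made-up rows purely to show the input shape; this is not real data.
example = [
    ("girl", "not sure", True),
    ("girl", "sure", True),
    ("boy", "very sure", False),
    ("boy", "sure", True),
]
result = summarize(example)
```

A gap near zero would suggest a group's confidence roughly matches its accuracy. With real data, a statistician would still need to say whether any difference between groups is larger than chance.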

After* completing the problems, students are asked a few questions: age, gender, ethnicity, rating, time spent on chess, time spent on this survey, and how good they see themselves as being, relative to others in their chess club. (What else should I ask?)

Any statisticians want to offer any advice on what I need for this to be remotely valid? Or where to go to learn how to analyse the data? (I know what a standard deviation is and I can use Excel, but that's about it.)

other thoughts?

ALSO: If you are a chess teacher and would be willing to give this to your students, please let me know. It should take about 30 minutes to complete, and I would be happy to send you the answers, so you could use it as a lesson.

*There's been research that shows that when African Americans (and maybe other minorities who are negatively stereotyped) are asked about their ethnicity before taking a standardized test, they perform measurably worse than when they are asked after completing the test.

14 comments:

Anonymous said...

I would be curious to know whether there is a difference between pre-adolescent and adolescent (and even post-adolescent) girls and boys on this issue. For example, if you looked at 8 to 10 year olds, would the findings be different from those for 11 to 14 or 15 year olds? Since you are asking their age, you might be able to analyse the data that way.

Anyway, sounds like an interesting study.

EB

Anonymous said...

The first question I'd ask is how many students per rank group you'd be able to test (to see if it's a statistically significant sample). Can the exact same experiment be repeated by other teachers in other classes to increase the sample? Do you think the answers might change if the students were actually in a game rather than answering from a worksheet? With the right prep work I'd say this might be publishable.

D

ATH2044 said...

Usually I like things like this:
"They can choose from very sure, sure, medium sure, not sure, and it's a guess."
to be symmetrical about a neutral point. That would mean carefully selecting (an odd number of) words like: Positive/certain, pretty sure, perhaps/maybe, not likely, no confidence/almost certainly not. This puts "most confident" at one end & "strongly believe it's not the right move" at the other end, with something akin to "don't know" or "no opinion" in the middle.
It's just anecdotal, but I've observed on numerous (glaring) occasions that strong chess players who might be expected to exude confidence (in chess) seem to be horribly indecisive in other areas. I'm not referring to Greg's random decision generator but to behavior that approaches paralysis in otherwise simple social situations. My cheap & dirty explanation is that there's a finite amount of that type of thought processing ability available & some individuals choose to allocate it in eccentric ways.
It seems likely that similar studies have been done with topics other than chess, but I don't know of any off hand.

Naisortep said...

I find strong players (2400+) are extremely decisive, ATH2044. It's one of the few common characteristics I've noticed.

Anonymous said...

A couple observations:
1. Your example in the discussion may mix two issues: self-confidence and how one presents oneself to the public.
For example, girls may be more reluctant to just flat out state "I am sure this move Nxf6 wins" and out of politeness or trying to appear self-effacing say instead "How about Nxf6? That looks interesting." Public presentation doesn't always equal actual internal feelings - I've taught some girls and women in an academic field with this same issue. They actually know cold the right answer but don't want to appear to be "a know-it-all" or "to hog all of the answers" - in short do something which might make them less popular - but they do know their stuff. Your approach with a written answer may help sort this out - you might want to design the study with two groupings: a control group where the girls answer questions about the position privately on paper, and one where they are asked to give their thoughts on a position in a mixed-sex group of kids roughly the same age.
2. Another way of looking at these issues is to see the effect of peer opinion. You can tell each subject that certain moves have been suggested by these players, for example: younger child with lower rating likes move A, younger child with much higher rating likes move B, etc. and see if girls and boys are swayed more by the opinion of others. Many younger stronger players can be a bit stubborn and aren't easily swayed... great idea and I hope you'll be able to carry it out.

Anonymous said...

My advice is to make sure you get a random sample (and the sample size should be at least 30 respondents).

If you simply ask the students "Who wants to work some chess puzzles?", you'll only get kids who are interested in and feel good about puzzles, and that's not random. This is what's wrong with a lot of internet surveys and TV/radio call-in surveys.

In other words, you'll need to select the participants in some random fashion, rather than letting them select themselves.
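
A minimal sketch of what that selection could look like in Python, assuming the coach has a full roster to draw from. The roster names, the group size of 30, and the seed are hypothetical placeholders.

```python
import random


def pick_participants(roster, k, seed=None):
    """Return k names drawn at random from roster, without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(list(roster), k)


# Hypothetical club roster of 40 students; replace with the real list.
roster = [f"student_{i:02d}" for i in range(1, 41)]
chosen = pick_participants(roster, 30, seed=2009)
```

Every student on the roster has the same chance of being picked, which is the point: nobody opts in because they already like puzzles.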

Anonymous said...

I think it was Botvinnik who said "Chess is the art of analysis." It seems to me that it would depend on the position. Stronger players may see a forcing winning continuation that other players miss. But even the world's best still go back later and suggest improvements they could have played. I would think the stronger the player, the more appropriate the amount of assurance according to a given position.

Perhaps your first position should be White's very first move. What would those answers tell us?

ATH2044 said...

Naisortep, the people I've observed were all in the range of 2000-2350, so I guess I shouldn't have used the word "strong" without first checking to see if they were all players who are fully aware of "the hopelessness of the situation".

Lizzy, since you're designing an experiment, you're probably already intimately familiar with a lot of the standard material on designing experiments. I especially liked the part where they address the TIF vs. MIF issue which is still quite controversial in some cultures.

Anonymous said...

Listening to GMs like Benjamin, Christiansen, Fedorowicz, Kaidanov--heck, even Svidler--analyze live games on ChessFM, I find those guys to be endearingly tentative while assessing positions and proffering lines. Listening to them, I get the feeling that the better the player, the deeper their understanding of the difficulty and, perhaps, the unknowability of chess.

Anonymous said...

Elizabeth, to conduct an experiment, no matter how benign it seems, do you have to pass muster with an Institutional Review Board at your school or in your larger school system? http://en.wikipedia.org/wiki/Institutional_review_board

Elizabeth Vicary said...

I considered this possibility, but

a) I'm just doing this for my own interest. Who exactly gets to/will object? Parents? seems pretty unlikely.

b) The wiki article mentions exemptions:

While IRBs can be more inclusive or restrictive, under the statute, exemptions to IRB approval include research activities in which the only involvement of human subjects will be in one or more of the following categories:
...

2. Research involving the use of educational tests (cognitive, diagnostic, aptitude, achievement), survey procedures, interview procedures or observation of public behavior, unless:

1. information obtained is recorded in such a manner that human subjects can be identified, directly or through identifiers linked to the subjects; and
2. any disclosure of the human subjects' responses outside the research could reasonably place the subjects at risk of criminal or civil liability or be damaging to the subjects' financial standing, employability, or reputation.

Since all surveys are anonymous, I think I'm good there. It's a widely used exemption--I didn't have to get my master's thesis approved by an IRB because it fell under this category.

Anonymous said...

My own confidence level (and, I imagine, lots of players') fluctuates enormously depending on how I've been playing lately. Any ideas on how to deal with that factor?

Rick Massimo

Daan said...

Hi Lizzy,

nice research idea. It reminds me of a sentence in the book "The Black Swan" by Taleb. Taleb states that better players think more in terms of rejecting their hypothesis than in terms of confirming it. This means that it helps to be insecure about your move, because then you will look for ways your opponent can punish you for that move. On the other side I can understand that sometimes a bit of confidence can help to make practical decisions. In a way chess seems to require both over- and underconfidence.
Actually, a chess friend of yours (and mine) once discussed "The Black Swan" on the chess website ChessVibes, and he linked me to your page since I am doing chess research myself and I am a statistician. The first problem I foresee with the experiment is that in general lower rated people, and especially children, have very unreliable ratings. So a kid with a rating of 1000 is not necessarily worse than a 1300 rated kid. This means that it is hard to compare groups. In my experience I found that only after 1800, ratings become more reliable. An alternative is to test the skill of the children in a different way, by giving them some chess diagrams (from easy to difficult), and see how well they solve them. Then you can distinguish the kids based on this test. I think this is the best solution, because I think that women who continue playing chess might not be representative of women in general (and the same holds for guys btw :).
Also I think it is important to have enough participants. In general you should aim for at least 60 people, and of course the more the better.
Further you might wanna have a look at a paper written by Dennis Holding and Douglas Pfau (1985). They also conducted an experiment where they had chess players judge positions and asked them how sure they were about their evaluation. They did not look at gender differences, but at differences in skill.

ok, good luck with the experiment, and if you have questions concerning statistics you can mail to ChessVibes, they know how to find me.

Cheers, Daan

Anonymous said...

Hi Elizabeth: Not exactly on point/more on point with the previous discussion about Rowson's comments, but did you read David Brooks in this morning (Friday)'s N.Y. Times on "Genius: The Modern View"? http://www.nytimes.com/2009/05/01/opinion/01brooks.html?_r=1

Have a good weekend,

Ellen