The “We Say You Do” Model Of Code Review Is Broken

With the upcoming review and rehash of the Code of Points for the 2017–2020 cycle, one thing is becoming increasingly clear: the FIG and the relevant technical committees should do as students are told to do and show their working. The Code has been a living and morphing document since its inception. This needn’t change; however, the way in which we go about modifications has to become a more data-driven process.

I have been attending athletes’ meetings at every World Championships since 2011 and I can say that they have been an important and useful way to communicate ideas and feedback between the athletes and the FIG about many issues. The consultations last year concerning the new code changes, however, were underwhelming, and it has taken me a long time to really unpack why that was. To be clear, these are criticisms of the process of the Code review rather than criticisms of the final draft.

Show Me The Data

The biggest change proposed for the upcoming MAG code was the move from 10 to 8 counting skills. There were of course other changes put forward but for the purposes of this discussion I will focus on this issue. This was extensively discussed and the general consensus was that it would make gymnastics more forgiving on the body, it would possibly improve the overall quality of competitions (especially if gymnasts can discard that one problem skill) and raise the base level of gymnasts. There was one thing that really bugged me though and so I asked a question:

Is there any evidence for these claims?

The reaction was mostly shrugged shoulders. The FIG and the Men’s Technical Committee (MTC) have a responsibility to provide the best arguments for and against any Code modifications and on this occasion it didn’t feel like this was the case.

A change from 10 to 8 counting skills is not without precedent: the WAG code has been operating with 8 counting skills this Olympic cycle. This should provide a great opportunity to do some real data analysis on some of these claims. Here are just some examples, off the top of my head, that could help justify the arguments for Code changes either way.

Injury Frequency

I don’t know what data is collected by the FIG, but I have heard they have recently made an effort to collate data on injuries during competition. A demonstration of injury frequency before and after the change for the WAG code could be an invaluable statistic. Many factors need to line up to obtain this data correctly. The national medical teams need to be willing to participate, reporting of injuries needs to be consistent across the board, and the FIG need to collate and analyse this data in a meaningful way. However, if injury rates are going to be an argument for modifications, these kinds of statistics are surely an integral part of that argument.

Average Execution Deductions

To test whether fewer counting skills translates to an increase in skill quality, we could take execution scores across various competitions and find the average execution deduction per skill before and after the WAG code changes. Working per skill removes the bias that a per-routine analysis would introduce: a routine with 8 counting skills has fewer chances to incur deductions, so it would tend to show a higher execution score even if the quality of each skill were unchanged.
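As a rough illustration of why the per-skill figure matters, here is a minimal Python sketch. The record layout, the era labels and every number in it are invented placeholders of mine, not real competition results, and it assumes the total execution deduction for a routine can simply be divided by the number of counting skills.

```python
from statistics import mean

# Hypothetical records: (code_era, total_execution_deduction, counting_skills).
# All values are invented placeholders, not real scores.
routines = [
    ("10-skill", 1.5, 10),
    ("10-skill", 2.0, 10),
    ("8-skill", 1.2, 8),
    ("8-skill", 1.6, 8),
]

def mean_deduction_per_routine(records):
    """Mean total execution deduction per routine, grouped by era."""
    by_era = {}
    for era, deduction, _skills in records:
        by_era.setdefault(era, []).append(deduction)
    return {era: mean(values) for era, values in by_era.items()}

def mean_deduction_per_skill(records):
    """Mean execution deduction per counting skill, grouped by era."""
    by_era = {}
    for era, deduction, skills in records:
        by_era.setdefault(era, []).append(deduction / skills)
    return {era: mean(values) for era, values in by_era.items()}

per_routine = mean_deduction_per_routine(routines)
per_skill = mean_deduction_per_skill(routines)
```

With these made-up numbers the 8-skill routines look cleaner per routine, yet the deduction per skill is the same in both eras, which is exactly the bias a per-routine comparison would hide.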

A more basic measure would be to calculate the rate of falls per routine or per skill to see whether, on average, 8 counting skills correlates with fewer falls. These kinds of statistics would, at the least, provide some evidence to support arguments concerning the quality of routines after the changes.
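The fall-rate version is even simpler. The sketch below assumes each routine can be recorded as a pair of (number of falls, number of counting skills); that layout and the sample figures are hypothetical and only there to show the arithmetic.

```python
def fall_rates(records):
    """Return (falls per routine, falls per counting skill).

    records: iterable of (falls, counting_skills) pairs, one per routine.
    """
    records = list(records)
    total_falls = sum(falls for falls, _ in records)
    total_skills = sum(skills for _, skills in records)
    return total_falls / len(records), total_falls / total_skills

# Invented placeholder data, not real results.
before = [(1, 10), (0, 10), (0, 10), (1, 10)]  # 10 counting skills
after = [(0, 8), (1, 8), (0, 8), (0, 8)]       # 8 counting skills

before_per_routine, before_per_skill = fall_rates(before)
after_per_routine, after_per_skill = fall_rates(after)
```

A real analysis would need far more routines than this, and some check that any difference is larger than normal year-to-year variation.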

Average Participation Age

Has the average or median participant age increased for major competitions around the World? If longevity is an argument then this may be borne out in the statistics themselves. T̶h̶e̶r̶e̶ ̶a̶r̶e̶,̶ ̶h̶o̶w̶e̶v̶e̶r̶,̶ ̶p̶r̶o̶b̶l̶e̶m̶s̶ ̶w̶i̶t̶h̶ ̶t̶h̶i̶s̶ ̶a̶s̶ ̶t̶h̶e̶ ̶m̶i̶n̶i̶m̶u̶m̶ ̶a̶g̶e̶ ̶r̶e̶s̶t̶r̶i̶c̶t̶i̶o̶n̶s̶ ̶f̶o̶r̶ ̶W̶A̶G̶ ̶w̶e̶r̶e̶ ̶b̶u̶m̶p̶e̶d̶ ̶u̶p̶ ̶t̶o̶ ̶1̶8̶ ̶y̶e̶a̶r̶s̶ ̶r̶e̶c̶e̶n̶t̶l̶y̶ Update: I have been reminded this was only on the MAG side. Not an insurmountable problem however for statisticians.

Single Data Point Arguments

I’ve heard through the grapevine that a justification for keeping 10 counting skills was put forward because of the unique situation at the Glasgow World Championships, where Uneven Bars had a four-way tie for first place. The argument follows that this would not happen in a 10 counting skill system. This argument could easily be tested with a little data analysis. Have we had a sudden increase in tie situations in WAG competitions since the introduction of 8 counting skills? It would be fairly easy to settle either way by simply checking back on major competitions and comparing the rate of tied scores before and after the change. One data point does not an argument make.
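Comparing tie rates could be sketched along these lines. The function names, the choice to round scores to three decimal places and the sample scores are all my own assumptions for illustration; none of the numbers are real results.

```python
from collections import Counter

def has_tie(scores, places=3):
    """True if at least two scores in one final are equal after rounding."""
    counts = Counter(round(score, places) for score in scores)
    return any(count > 1 for count in counts.values())

def tie_rate(finals):
    """Fraction of finals that contain at least one tied score."""
    finals = list(finals)
    return sum(has_tie(scores) for scores in finals) / len(finals)

# Invented placeholder finals, one list of scores per final.
finals_before = [
    [15.1, 15.0, 14.9],
    [15.3, 15.3, 14.8],
    [14.9, 14.7, 14.5],
    [15.2, 15.0, 14.6],
]
finals_after = [
    [15.0, 15.0, 15.0, 15.0, 14.9],
    [14.8, 14.7, 14.6],
]

rate_before = tie_rate(finals_before)
rate_after = tie_rate(finals_after)
```

Run over every major final in the two cycles, a comparison like this would show whether Glasgow was part of a trend or a one-off.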


Without any analysis no real conclusions can be drawn. I’m sure there are many ways to manipulate the data. There are also many smarter people out there who could get more out of the data than I ever could, and the FIG and MTC should be embracing these possibilities. I for one would be happy to stay with 10 counting skills or move to 8 if there was something, anything, concrete the FIG and MTC could hang their hat on, but so far I am left underwhelmed. Who knows, perhaps some analysis has been done and I am ignorant of it. If so, feel free to point me towards it. I really hope that I am wrong.
