Technology in sports has become increasingly crucial in recent years. FIFA’s creation of a Football Technology and Innovation department only a few years ago is a strong sign that the beautiful game is fully embracing the added value technology can bring to sports.
However, much criticism of various technology implementations has centred on the notion that these innovations may only serve the elite level (e.g. VAR and Goal Line Technology in football). Developing for the elite makes sense initially: higher financial stakes, more visibility. But the natural evolution of technology is to become progressively more democratic. mycujoo is a key driver in bringing these technological capacities to the wider public, and is pushing the boundaries further through its machine learning and computer vision research.
Andrea Pennisi, PhD, computer vision and machine learning scientist at mycujoo, explains how innovation can be put at the service of the larger sports spectrum:
Football is becoming an important topic of research in computer vision and machine learning, but it is far more complex than the other sports that have been studied so far.
I started at mycujoo in January 2018. My role is to analyse the football matches on mycujoo through computer vision and machine learning, across several aspects, and to define new features for the mycujoo product.
Auto-panning — technology to help scale up streaming
The main idea behind my work is to create a fully automatic system that analyses the match itself and acts on that analysis, removing the need for a human camera operator. The mycujoo platform allows any club at any level to stream its football matches for free, with very simple recording setups: no need for heavy cameras and OB vans, you can stream directly from a mobile phone.
But a human operator is still needed to produce the images: follow the action across the full width of the pitch, zoom in and out. At amateur levels it can be difficult to find people to do this, and those who volunteer are often untrained, so they make mistakes or lose track of the action. That is detrimental to all stakeholders: the viewers, of course, who miss what is happening, but also the clubs, the players and the sponsors, as valuable data is lost.
My research will help the main people involved at this level of the sport: the clubs who create the content, the players and coaches who will be able to use their own data, the fans who can go back to the highlights at any time, and the advertisers and sponsors.
We do this by teaching the machine to replicate the expected behaviour of a single cameraman during a match. By feeding it hundreds of thousands of football clips, it learns to zoom and pan the camera automatically, following and producing the action as a professional broadcasting crew would. This is actually a fairly classic computer vision approach: you isolate the movement of the players inside the pitch, fixate the point of view in the middle of the movement, and then, depending on certain patterns, zoom in or out of the action in the direction of the movement.
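The "centre on the movement, zoom with the spread" idea can be sketched in a few lines. This is a minimal illustration, not mycujoo's actual system: it assumes player positions have already been detected in the wide-angle frame, and the function name, margin, and minimum crop size are all hypothetical choices.

```python
def auto_pan(player_positions, frame_width, frame_height,
             min_crop=640, margin=1.5):
    """Return a (cx, cy, crop_width) virtual-camera window.

    player_positions: list of (x, y) centroids of detected players
    margin: how much wider than the players' spread the crop should be
    """
    xs = [p[0] for p in player_positions]
    ys = [p[1] for p in player_positions]

    # Fixate the point of view in the middle of the movement.
    cx = sum(xs) / len(xs)
    cy = sum(ys) / len(ys)

    # Zoom out when the action is spread out, zoom in when it is compact.
    spread = max(xs) - min(xs)
    crop_width = max(min_crop, min(frame_width, spread * margin))

    # Clamp so the virtual camera stays inside the wide-angle frame.
    half = crop_width / 2
    cx = min(max(cx, half), frame_width - half)
    return cx, cy, crop_width
```

Run per frame (with some temporal smoothing on `cx`, `cy` in practice, so the virtual camera does not jitter), this gives the pan-and-zoom behaviour a trained operator would produce.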
The initial challenge we are solving with our streaming team right now is that, to achieve this, we need a wide-angle image of the pitch at all times. So we are creating a scalable solution by stitching the views of two cameras with overlapping fields of view, so that the whole pitch is covered. We have to ensure that the two images are perfectly synchronized, and that the overlap is correctly blended to avoid any blurring.
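The two stitching sub-problems — finding where the views overlap, and blending that overlap so no seam or ghosting appears — can be illustrated on a one-dimensional brightness profile (think of a single pixel row from each camera). This is a toy sketch under that simplification; real stitching works on full images with feature matching and homographies, and these helper names are assumptions.

```python
def estimate_offset(left, right, min_overlap=2):
    """Find the shift at which `right` best continues `left`.

    left, right: 1-D brightness profiles from the two cameras.
    Returns the index in `left` where `right` starts (lowest mean
    squared difference over the overlapping region).
    """
    best_offset, best_cost = None, float("inf")
    for offset in range(len(left) - min_overlap + 1):
        overlap = min(len(left) - offset, len(right))
        cost = sum((left[offset + i] - right[i]) ** 2
                   for i in range(overlap)) / overlap
        if cost < best_cost:
            best_cost, best_offset = cost, offset
    return best_offset

def stitch(left, right, offset):
    """Join the two profiles, linearly cross-fading the overlap
    so the seam does not produce visible blurring artefacts."""
    overlap = len(left) - offset
    blended = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # ramp from left camera to right camera
        blended.append((1 - w) * left[offset + i] + w * right[i])
    return left[:offset] + blended + right[overlap:]
```

The same two steps — registration, then blending — are what a production stitcher does in 2-D, and why frame-accurate synchronization matters: if the cameras are out of sync, moving players appear twice in the overlap.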
Automating data collection in a scalable manner
With this wide-angle view, we can create the auto-panning and move to our second main innovation: the statistical analysis of a match.
We do this in a few steps. First, the machine needs to detect the pitch area and the objects within it: the players, the ball, the goals. The second step is a more sophisticated analysis, where we need to correctly track each individual player for the whole duration of the match. Then comes event detection, which is when it becomes even more complex. We first need to define what a goal is from the machine's point of view. That definition has several dimensions: a "physical" detection of the location of the ball and the goal, meaning those two objects have to be characterised; and, on top of that, we teach the machine to recognise similarity with thousands of previously ingested goal situations. It is the combination of these two dimensions that tells the machine whether a goal was scored or not. This can obviously be applied to other sports where the concept of the goal is the same as in football, with a ball and a goal.
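The combination of the two dimensions can be sketched as a simple decision rule. This is illustrative only: the function names and the threshold are assumptions, and in the real system the "similarity" score would come from a model trained on those thousands of ingested goal situations.

```python
def ball_in_goal(ball, goal_box):
    """'Physical' dimension: is the tracked ball centre inside the
    detected goal area? goal_box is (x0, y0, x1, y1)."""
    x, y = ball
    x0, y0, x1, y1 = goal_box
    return x0 <= x <= x1 and y0 <= y <= y1

def is_goal(ball, goal_box, similarity, threshold=0.8):
    """Learned dimension: `similarity` is a score in [0, 1] against
    previously ingested goal situations. A goal event fires only when
    both dimensions agree."""
    return ball_in_goal(ball, goal_box) and similarity >= threshold
```

Requiring both signals is what makes the decision robust: a ball rolling behind the net satisfies neither the learned pattern alone, nor does a goal-like camera angle with the ball still in play.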
The final step for us will be a trial phase where the machine suggests the event (in the same way GLT tells the referee "goal or no goal") to the operator physically producing the match. We will collect feedback from users to evaluate the accuracy of the machine's suggestions, analyse it, and figure out where the model needs to improve.
mycujoo works in the long tail, at the amateur level. That is a huge challenge: we work with images created by non-professionals, and we have to teach the machine across an enormous diversity of images.
The main difficulty, and what makes it even more interesting from a machine analysis point of view, is that at the long-tail level the images do not come from professionally trained cameramen. That makes the computer vision work that much harder, due to the incredible disparity of the image quality we work with. To give an example, we have also started to look into esports. Efootball is basically the exact opposite of the live matches on mycujoo: in efootball, or at the professional level, the vision part is almost easy. Camera movements are predictable, the grass is always green, and the cues for the machine are consistent across matches and competitions. On mycujoo, it's a different story. The camera is not always perfectly centred on the action, because the untrained human operator might lose track of what is happening. And the grass… is definitely not always green.
But we have to find ways to work around these complexities and make the system work regardless. Developing technology that copes with such heterogeneous conditions is a very interesting challenge for me.
Sponsor visibility analysis
Finally, my next topic of research is an automatic sponsor visibility detection system. As a sponsor in football, if I buy the right to be seen on a football jersey or around the stadium, I want to know for how long my brand has been exposed to viewers. On the streaming side, we can calculate the visibility earned via streaming automatically and provide this data to sponsors.
For jersey sponsor recognition, we have to upload as many images of the sponsor's logo as possible in advance to train the machine. For the stadium panels around the pitch it is simpler: we can do it "ad hoc" by identifying the logos before the match starts, uploading them to the server, and letting the machine learn as it analyses.
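Once a detector reports which logos are visible in each frame, turning that into the exposure report sponsors receive is straightforward accumulation. A minimal sketch, assuming per-frame detections are already available (the function name and data shape are illustrative, not mycujoo's API):

```python
def visibility_seconds(detections_per_frame, fps):
    """Aggregate per-frame logo detections into on-screen time.

    detections_per_frame: list of sets of sponsor names, one set per
    analysed frame (empty set when no logo is visible)
    fps: frames analysed per second of footage
    Returns a dict of total on-screen seconds per sponsor.
    """
    seconds = {}
    for logos in detections_per_frame:
        for logo in logos:
            # Each detection contributes one frame's worth of exposure.
            seconds[logo] = seconds.get(logo, 0.0) + 1.0 / fps
    return seconds
```

Summed over a full match, this is exactly the "how long was my brand exposed" figure described above, broken down per sponsor.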
This has two clear benefits: first to the sponsors themselves, who get true, measurable data on the visibility of their campaigns; second to the clubs and leagues, who will then be able to produce better reports and define the digital value of their advertising real estate.
Impacting the sports industry
I like to think that the work I do represents the future. Major football organisations are now moving towards wearable technology, with RFID chips for instance. I don't think that is ideal, for a few reasons: however small the devices become, they are still invasive for the players and may affect them, and the approach is not scalable across the whole football ecosystem. What I work on really comes down to putting two smartphones next to each other and hitting the record button. The machine, the technology we are building, will do the rest. That is totally scalable.