Decoding the Dynamics of MMA Through Neural Networks: An Image Classification Journey

Jun 10, 2024

In the ever-evolving field of data science, image classification serves as a cornerstone, illustrating the profound capabilities of neural networks. Standard datasets like MNIST for handwritten digits or Fashion-MNIST for apparel items are frequently cited examples. However, the true test of neural networks lies in their ability to decipher more complex and dynamic scenarios, such as those found in MMA (Mixed Martial Arts) and kickboxing fights. This article delves into the application of neural networks for classifying images from these high-intensity sports, shedding light on the journey from raw image data to meaningful classification.

Unraveling the Neural Network Mystique

Neural networks, often perceived as mysterious “black boxes,” operate on the principle of layered transformations. While the inner workings of these layers can be opaque, we control the inputs and the structure of the model. Image processing in this context starts by converting a picture into a numerical data matrix. For example, consider an image from a ground fight at Fight Exclusive Night (FEN) 24, featuring Polish fighter Andrzej Grzebyk. By resizing this image to 50x50 pixels, we create a manageable data matrix. Each pixel’s color, described through RGB values, provides the numerical input for our neural network.
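To make the "picture to matrix" step concrete, here is a minimal sketch in Python with NumPy. It is not the pipeline used for this project: the photo is replaced by a random stand-in array, the `resize_nearest` helper is a hypothetical name, and nearest-neighbour sampling and scaling to the 0-1 range are assumptions chosen for simplicity.

```python
import numpy as np

def resize_nearest(image, size=50):
    """Downsample an H x W x 3 array to size x size by nearest-neighbour sampling."""
    height, width = image.shape[:2]
    rows = np.arange(size) * height // size   # source row for each target row
    cols = np.arange(size) * width // size    # source column for each target column
    return image[rows[:, None], cols]

# Stand-in for a decoded photo (assumption): random RGB values, 480 x 640 pixels.
rng = np.random.default_rng(0)
photo = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)

small = resize_nearest(photo)                     # 50 x 50 pixels, 3 colour channels
model_input = small.astype(np.float32) / 255.0    # scale RGB values to [0, 1]
```

The result is a 50 x 50 x 3 block of numbers, 7,500 values in total, which is what the network actually sees. In practice an image library would handle decoding and resizing with better interpolation.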

Fight Exclusive Night 24 — own photography

The neural network processes this image by examining it in fragments, often using a moving frame, such as 2x2 pixels. These fragments are sequentially fed into the model, which reconstructs the overall image through these smaller pieces, akin to solving a jigsaw puzzle. As the data passes through successive layers of the network, the initial image becomes less visually coherent to humans but more interpretable for the model, facilitating effective classification.
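The moving frame described above can be sketched as a small sliding-window operation. This is an illustration only: the 4x4 input, the filter values, and the `slide_window` name are made up, and a real convolutional layer learns many such filters rather than using a fixed one.

```python
import numpy as np

def slide_window(channel, kernel):
    """Slide `kernel` over a 2-D array (stride 1, no padding) and
    return the weighted sum computed at every window position."""
    k_h, k_w = kernel.shape
    out_h = channel.shape[0] - k_h + 1
    out_w = channel.shape[1] - k_w + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = channel[i:i + k_h, j:j + k_w]   # the current 2x2 fragment
            output[i, j] = np.sum(patch * kernel)
    return output

# Toy single-channel "image" whose values grow from left to right.
channel = np.arange(16, dtype=float).reshape(4, 4)
# Hypothetical 2x2 filter that responds to horizontal brightness changes.
kernel = np.array([[1.0, -1.0],
                   [1.0, -1.0]])
feature_map = slide_window(channel, kernel)   # 3 x 3 map of filter responses
```

Each value in `feature_map` summarises one 2x2 fragment, so a 4x4 input yields a 3x3 output. Stacking several such layers is what makes the intermediate results look less like a picture and more like a set of detected patterns.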

Capturing the Essence of Fights

Analyzing images from MMA fights introduces unique challenges. MMA encompasses a variety of techniques including punches, kicks, clinches, grappling, and ground fighting. However, images often fail to distinctly capture clinches and grappling moves, leading us to focus primarily on punches, kicks, and ground fighting. To enhance our dataset, we include images from related disciplines such as kickboxing, Muay Thai, and Olympic boxing. Each technique, whether it’s a high kick, low kick, front kick, or a specific ground fighting maneuver, contributes to the complexity of the classification task.

Determining which fighter’s action to classify in a given image adds another layer of complexity. Typically, the focus is on the dominant, active fighter, but defensive actions like blocks and parries can obscure clear classification. Additionally, varying camera angles — from top-down views to audience and press perspectives — introduce inconsistencies that further challenge the classification process.

Initial Forays and Promising Outcomes

With an initial dataset of approximately 450 images, divided into training and validation sets, early results were encouraging despite the limited data. A Convolutional Neural Network (CNN) model achieved about 85% accuracy on the validation set. For instance, while the image of Andrzej Grzebyk engaged in ground combat was misclassified as a punch, similar images were correctly identified as ground fighting.
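With so few images, how the data is divided matters: each class should appear in both sets in similar proportions. The sketch below shows one way to do that with a stratified split. The 80/20 ratio, the even 150-per-class label list, and the `stratified_split` helper are assumptions for illustration, not the split actually used here.

```python
import random
from collections import defaultdict

def stratified_split(labels, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx) with about val_fraction of each class in validation."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                      # random order within each class
        n_val = round(len(indices) * val_fraction)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)

# Hypothetical label list: 150 images per class, 450 in total.
labels = ["punch"] * 150 + ["kick"] * 150 + ["ground"] * 150
train_idx, val_idx = stratified_split(labels)
```

Under these assumptions the split yields 360 training and 90 validation images, with 30 validation images per class.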

To draw a comparison, a separate model trained on 16,000 images to distinguish between Adidas and Under Armour logos achieved over 90% accuracy. This higher accuracy is likely due to the simpler and more repetitive nature of logo patterns. In contrast, fight images should be kept as realistic as possible, avoiding excessive augmentations like flips and rotations, which can distort the natural scenarios depicted.

Scaling Up and Refining the Model

To improve accuracy, the dataset was expanded to nearly 1800 training images, with an additional 400 images each for validation and testing. This larger dataset maintained clear distinctions among punches, kicks, and ground fighting, while omitting clinch images due to insufficient samples. Augmentation techniques such as horizontal flips proved beneficial to a certain extent, though over-reliance on these transformations led to diminished model performance.
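A horizontal flip is one of the few augmentations that keeps a fight image realistic, since a mirrored punch still looks like a punch. Here is a minimal sketch, assuming images are stored as a NumPy batch of shape (N, height, width, channels). The `random_horizontal_flip` helper and the 0.5 probability are illustrative choices, not the settings used for this model.

```python
import numpy as np

def random_horizontal_flip(batch, probability=0.5, rng=None):
    """Mirror each image in an (N, H, W, C) batch left-to-right with the given probability.
    Returns the augmented copy and a boolean mask of which images were flipped."""
    rng = rng if rng is not None else np.random.default_rng()
    flipped = batch.copy()
    mask = rng.random(len(batch)) < probability
    flipped[mask] = flipped[mask][:, :, ::-1, :]   # reverse the width axis
    return flipped, mask

# Tiny toy batch: 2 images, 2 x 3 pixels, 1 channel.
batch = np.arange(12).reshape(2, 2, 3, 1)
always_flipped, _ = random_horizontal_flip(batch, probability=1.0)
never_flipped, _ = random_horizontal_flip(batch, probability=0.0)
```

Applying the flip randomly during training, rather than storing flipped copies, keeps the share of augmented images under control through a single parameter.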

Confusion matrix for the best performing model

In refining the model, it became apparent that more sophisticated network topologies could be employed. Networks such as VGG16, VGG19, ResNet50, ResNet152, Xception, and AlexNet, which have shown success on large datasets like ImageNet, were considered. However, overfitting emerged as a significant challenge. Overfitting occurs when a model performs well on training data but poorly on test data, indicating that it has memorized training images rather than generalizing from them. Three techniques were used to counteract this: increasing the number of original images, adding Dropout layers that randomly deactivate neurons during training, and applying early stopping to halt training when validation accuracy plateaus.
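Two of these countermeasures are simple enough to sketch in a few lines. The code below is a framework-free illustration, not the training code behind this model: the dropout rate, the patience of three epochs, the function names, and the validation-accuracy history are all made-up values.

```python
import numpy as np

def dropout(activations, rate, rng):
    """Inverted dropout: zero each unit with probability `rate` (must be < 1)
    and rescale the survivors so the expected activation is unchanged."""
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)

def early_stopping_epoch(val_accuracy, patience=3):
    """Return the epoch index at which training would stop: the first epoch where
    validation accuracy has failed to improve for `patience` epochs in a row."""
    best = float("-inf")
    epochs_without_gain = 0
    for epoch, accuracy in enumerate(val_accuracy):
        if accuracy > best:
            best = accuracy
            epochs_without_gain = 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                return epoch
    return len(val_accuracy) - 1   # never triggered: ran to the last epoch

# Dropout on a layer of 1000 units that all output 1.0.
dropped = dropout(np.ones(1000), rate=0.5, rng=np.random.default_rng(0))

# Made-up validation accuracy per epoch; the best value is reached at epoch 2.
history = [0.52, 0.61, 0.66, 0.65, 0.66, 0.64, 0.63]
stop_epoch = early_stopping_epoch(history, patience=3)
```

With this history, training stops at epoch 5, three epochs after the last improvement. Deep learning frameworks ship both mechanisms ready-made, usually with an option to restore the weights from the best epoch.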

Expanding Categories for Greater Precision

To enhance model accuracy further, additional fight categories were introduced. These included neutral positions, various wrestling actions, and solo fighter images, resulting in six key classes. Recognizing specific techniques is a known difficulty even for skilled judges, whose consistency rates range from 76.5% to 85%. Despite this, the model’s accuracy reached 70%. This improvement was achieved by enlarging the dataset and employing strategies to mitigate overfitting.

Confusion matrix for the best performing solution (DenseNet 161 architecture)

One key insight was that ground fighting scenarios were generally easier to classify due to their distinct characteristics. In contrast, standing fights involving punches and kicks presented greater ambiguity. The journey towards more precise and reliable image classification in dynamic environments like MMA continues, underscoring the importance of robust dataset quality and advanced neural network architectures.
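Statements like "ground fighting is easier to classify" come from reading a confusion matrix row by row. The sketch below shows how such a matrix and the per-class recall can be computed. The class list is reduced to three classes and the labels and predictions are invented for illustration, so the numbers say nothing about the real model.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[true, predicted] += 1
    return matrix

classes = ["punch", "kick", "ground"]
# Invented example: punches and kicks get confused, ground fighting does not.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 1, 0, 1, 0, 1, 2, 2, 2, 2]

matrix = confusion_matrix(y_true, y_pred, len(classes))
# Recall per class: correctly classified images divided by all images of that class.
recall = matrix.diagonal() / matrix.sum(axis=1)
```

In this toy example the recall is 2/3 for punches and kicks and 1.0 for ground fighting, which is the kind of pattern described above: the off-diagonal counts sit between the two standing classes.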

Looking Ahead

The application of neural networks in classifying MMA fight images not only demonstrates the potential of this technology but also highlights the challenges inherent in dynamic, real-world scenarios. Continuous refinement in dataset curation, augmentation strategies, and network topologies will be crucial in advancing the accuracy and utility of these models.

As we push the boundaries of image classification in sports, the lessons learned from this endeavor extend to broader applications in data science, offering a glimpse into the future where AI can comprehensively analyze and interpret complex visual data. Through persistent experimentation and innovation, the goal of achieving human-like understanding and classification of images in fast-paced, multifaceted environments becomes increasingly attainable.

If you want to try the model, please go to: https://mma.hamlet07.link/

Additional sources:

R. Pujszo, M. Adam, The course of the MMA fights as a part of KSW federation – as the examples of the heavy weight „fight of the night”, Journal of Combat Sports and Martial Arts, 2016; 1(2); Vol. 7, 51–54
M. Adam, R. Pujszo, S. Kuźmicki, M. Szymański, S. Tabakov, MMA fighters’ technical-tactical preparation – fight analysis: a case study, Journal of Combat Sports and Martial Arts, 2015; 1(2); Vol. 6, 35–41