Player Detection using Deep Learning

Daniel Azevedo
Published in Analytics Vidhya
5 min read · Jun 20, 2021

Player Detection in Football

Over the last few years, there has been increasing interest in using data to improve football teams' gameplay. Multiple areas can benefit from the analysis and study of football data, such as player scouting, team gameplay (e.g., pitch control and xG models), and player training and performance.

In particular, Computer Vision can be an important tool for extracting relevant information from game/training videos.

As such, in this post I will focus on detecting football players from game footage and then try to identify the numbers on their shirts.

The code presented in this post can be seen here: https://github.com/danielazevedo/Football-Analytics/tree/master/player_detection

Example of Player Detection

Player Detection Pipeline

In order to detect and identify players (through their shirt numbers) from video footage, I established a pipeline with 3 steps:

  1. Player Detection: Detect persons (football players) in images
  2. Number Detection: After detecting a player, detect the region, within the player's bounding box, where the shirt number is located
  3. Number Identification: After detecting the shirt number region, identify the number drawn on the shirt

Visually, the whole pipeline is defined as follows:

Player Detection pipeline

1. Player Detection

The first step is the simplest to implement. Fortunately, there are already pre-trained Deep Learning models that we can use to detect persons in images, which, in this case, will detect the football players.

Here, I used a pre-trained YOLO model from the Darknet framework, trained on thousands of images from the COCO dataset. In Python, OpenCV provides a module named DNN that supports YOLO/Darknet models. Check this link for instructions on how to do object detection using YOLO.

Here is a snippet of Python code for loading the model and detecting the players:

import cv2

# initialize minimum probability to eliminate weak predictions
p_min = 0.5
# threshold for non-maximum suppression (assumed value)
thres = 0.3

# 'VideoCapture' object for reading the video from a file
video = cv2.VideoCapture('video_test.mp4')
writer = None
h, w = None, None

# Create labels into list
with open('coco.names') as f:
    labels = [line.strip() for line in f]

# load network
network = cv2.dnn.readNet('darknet/cfg/yolov3.weights', 'darknet/cfg/yolov3.cfg')

# Getting only output layer names that we need from YOLO
# (in recent OpenCV versions getUnconnectedOutLayers() returns a flat array,
# in which case the indexing below becomes ln[i - 1])
ln = network.getLayerNames()
ln = [ln[i[0] - 1] for i in network.getUnconnectedOutLayers()]

# Defining loop for catching frames
while True:
    ret, frame = video.read()
    if not ret:
        break

    # frame preprocessing for deep learning
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)

    # perform a forward pass of the YOLO object detector,
    # giving us the bounding boxes and associated probabilities
    network.setInput(blob)
    output_from_network = network.forward(ln)

    # process bounding boxes in output_from_network
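The processing of output_from_network is left as a comment above; here is a hedged sketch (running inside the frame loop) of how the raw YOLO output could be turned into player boxes, using the p_min and thres values defined earlier. The filtering and drawing details below are my assumptions, not necessarily the author's exact code.

import numpy as np

# inside the frame loop, after the forward pass
boxes, confidences = [], []
frame_h, frame_w = frame.shape[:2]

for result in output_from_network:
    for detection in result:
        scores = detection[5:]
        class_id = int(np.argmax(scores))
        confidence = float(scores[class_id])
        # keep only confident detections of the 'person' class
        if confidence > p_min and labels[class_id] == 'person':
            # YOLO returns the box centre and size relative to the frame
            box = detection[0:4] * np.array([frame_w, frame_h, frame_w, frame_h])
            x_center, y_center, box_w, box_h = box
            x = int(x_center - box_w / 2)
            y = int(y_center - box_h / 2)
            boxes.append([x, y, int(box_w), int(box_h)])
            confidences.append(confidence)

# non-maximum suppression to drop overlapping boxes
idxs = cv2.dnn.NMSBoxes(boxes, confidences, p_min, thres)
for i in np.array(idxs).flatten():
    x, y, bw, bh = boxes[i]
    cv2.rectangle(frame, (x, y), (x + bw, y + bh), (255, 0, 0), 2)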

Here is an example. The blue boxes represent the detected players with the respective probabilities.

Image frame from Brazil vs Belgium

2. Number Detection

The second step focuses on detecting the number on the player's shirt.

For this task, I trained a Deep Learning model on a dataset (you can find more information here) of labeled shirt numbers. The images were preprocessed by resizing and converting to grayscale.

Example of samples on the dataset
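For reference, the described preprocessing amounts to a resize plus a grayscale conversion; a minimal sketch (the file name and target size are assumptions) would be:

import cv2

img = cv2.imread('shirt_sample.jpg')          # hypothetical sample image
img = cv2.resize(img, (224, 224))             # scale resizing
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # grayscale conversion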

The model is a pre-trained VGG16 with a few new layers trained on top (i.e., transfer learning). Here is a snippet of the model architecture:

from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Input, Flatten, Dense
from tensorflow.keras.models import Model

# load the VGG16 network, ensuring the head FC layers are left off
vgg = VGG16(weights="imagenet", include_top=False, input_tensor=Input(shape=(224, 224, 3)))
# freeze all VGG layers so they will *not* be updated during the training process
vgg.trainable = False
# flatten the max-pooling output of VGG
flatten = vgg.output
flatten = Flatten()(flatten)
# construct a fully-connected layer head to output the predicted bounding box coordinates
bboxHead = Dense(128, activation="relu")(flatten)
bboxHead = Dense(64, activation="relu")(bboxHead)
bboxHead = Dense(32, activation="relu")(bboxHead)
bboxHead = Dense(4, activation="sigmoid")(bboxHead)
# construct the model we will fine-tune for bounding box regression
model = Model(inputs=vgg.input, outputs=bboxHead)
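The four sigmoid outputs suggest the shirt-number box is regressed as coordinates scaled to [0, 1]. The post does not show the training call; a minimal, assumed setup (the loss, optimizer and array names are mine) could be:

# assumed training setup for the bounding box regressor
model.compile(loss="mse", optimizer="adam")

# trainImages: (N, 224, 224, 3) player crops, pixel values scaled to [0, 1]
# trainTargets: (N, 4) shirt-number boxes, e.g. (x_min, y_min, x_max, y_max) / crop size
# model.fit(trainImages, trainTargets, batch_size=32, epochs=25, validation_split=0.1)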

Here is an example. The red boxes represent the detected players and detected numbers.

Image frame from Everton vs Tottenham

3. Number Identification

The last step concerns the identification of the detected number. For this task, a CNN model was trained with data augmentation. The model architecture was as follows:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

classifier = Sequential()
classifier.add(Conv2D(128, (3, 3), input_shape=(224, 224, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Dropout(0.2))
classifier.add(Conv2D(64, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Dropout(0.2))
classifier.add(Conv2D(32, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Dropout(0.2))
classifier.add(Flatten())
classifier.add(Dense(units=128, activation='relu'))
classifier.add(Dense(units=64, activation='relu'))
classifier.add(Dense(units=64, activation='relu'))
classifier.add(Dense(units=10, activation='softmax'))
# Compiling the CNN
classifier.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
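The data augmentation itself is not shown in the post; a minimal sketch with Keras' ImageDataGenerator (the directory layout and augmentation parameters are assumptions) might look like:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=10,
    zoom_range=0.1,
    width_shift_range=0.1,
    height_shift_range=0.1)

train_generator = train_datagen.flow_from_directory(
    'numbers_dataset/train',   # hypothetical layout: one sub-folder per digit class
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical')

# classifier.fit(train_generator, epochs=25)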

Here is an example. Besides the player and number detections, we can also see the predicted numbers.

Example of Number Identification

Putting it all together

By applying these 3 steps consecutively, we can get real-time detection of the players, including the identification of their shirt numbers.
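As a rough illustration of how the three models can be chained per frame, here is a hedged sketch: detect_players is a hypothetical wrapper around the YOLO code from step 1, number_detector and number_classifier stand for the Keras models from steps 2 and 3, and reading the regressor output as normalized corner coordinates is an assumption.

import cv2
import numpy as np

def identify_players(frame, detect_players, number_detector, number_classifier):
    results = []
    for (x, y, w, h) in detect_players(frame):              # step 1: player boxes
        player = frame[y:y + h, x:x + w]
        inp = cv2.resize(player, (224, 224)) / 255.0
        # step 2: shirt-number box, predicted as coordinates scaled to [0, 1]
        bx1, by1, bx2, by2 = number_detector.predict(inp[np.newaxis])[0]
        nx1, ny1, nx2, ny2 = int(bx1 * w), int(by1 * h), int(bx2 * w), int(by2 * h)
        number_crop = player[ny1:ny2, nx1:nx2]
        if number_crop.size == 0:
            continue
        inp2 = cv2.resize(number_crop, (224, 224)) / 255.0
        # step 3: classify the digit with the 10-way softmax
        digit = int(np.argmax(number_classifier.predict(inp2[np.newaxis])[0]))
        results.append(((x, y, w, h), digit))
    return results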

Here is an example of prediction in real time.

Real time prediction

Final Remarks

In this article, I presented a pipeline for detecting football players on the pitch and identifying their shirt numbers. I divided this task into 3 steps: Player Detection, Number Detection and Number Identification. For each step, a Deep Learning model was used, with a different architecture depending on the goal.

One future step would be to compute the player coordinates on the pitch. To achieve that, pitch lines would need to be identified first.
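As a sketch of that future step, once four pitch landmarks have been located in the image, a homography can map a player's feet position to pitch coordinates. All point values below are hypothetical examples:

import cv2
import numpy as np

# pixel positions of four known pitch landmarks (hypothetical values)
image_pts = np.float32([[120, 80], [620, 85], [700, 400], [60, 390]])
# their real-world positions on the pitch, in metres (e.g., penalty-box corners)
pitch_pts = np.float32([[0, 0], [16.5, 0], [16.5, 40.3], [0, 40.3]])

H = cv2.getPerspectiveTransform(image_pts, pitch_pts)

# project the bottom-centre of a player's bounding box (their feet) onto the pitch
x, y, w, h = 300, 150, 40, 90   # example detection box
foot = np.float32([[[x + w / 2, y + h]]])
pitch_xy = cv2.perspectiveTransform(foot, H)[0][0]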

I hope you find this article interesting! For more work related to Football Data Science, please visit my repo: https://github.com/danielazevedo/Football-Analytics.
