Real-time Pose Estimation from Video using MediaPipe and OpenCV in Python

Riddhi Kumari Singh
4 min read · Apr 26, 2023


Have you ever wondered how computer vision algorithms can identify the human body and its various poses from a video? In this blog, we’ll explore how we can use the MediaPipe Pose model and OpenCV library to detect human poses in real-time from a video file using Python.

We’ll take a step-by-step approach and explain each code block along the way. By the end of this blog, you’ll have a good understanding of how the MediaPipe Pose model works and how you can use it to detect human poses from your own videos.

Let’s Dive In

Step 1: Import Libraries

The first step is to import the required libraries. The code uses OpenCV for video processing, MediaPipe to run the Pose model, and Python's built-in csv module to write the landmark coordinates to a CSV file.

import cv2
import mediapipe as mp
import csv

Step 2: Define Functions

The second step is to define a function write_landmarks_to_csv that takes in the detected landmarks, the frame number, and a list for the CSV data. It prints the landmark coordinates to the console and appends them to the CSV data list.

def write_landmarks_to_csv(landmarks, frame_number, csv_data):
    print(f"Landmark coordinates for frame {frame_number}:")
    for idx, landmark in enumerate(landmarks):
        print(f"{mp_pose.PoseLandmark(idx).name}: (x: {landmark.x}, y: {landmark.y}, z: {landmark.z})")
        csv_data.append([frame_number, mp_pose.PoseLandmark(idx).name, landmark.x, landmark.y, landmark.z])
    print("\n")
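Note that MediaPipe returns normalized coordinates: x and y are scaled to the range [0, 1] relative to the frame width and height, and z is a relative depth value. If you later need pixel positions, a small helper like the one below can convert them (this function is illustrative and not used in the script above):

# Illustrative helper (not part of the main script): convert a normalized
# MediaPipe landmark to pixel coordinates for a frame of a given size.
def landmark_to_pixels(landmark, frame_width, frame_height):
    x_px = int(landmark.x * frame_width)
    y_px = int(landmark.y * frame_height)
    return x_px, y_px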

Step 3: Define Input and Output Paths

The third step is to define the path to the input video file and the path to the output CSV file.

video_path = 'path_to_your_mp4_file'
output_csv = 'path_to_where_you_want_to_store_your_csv_file'

Step 4: Initialize Libraries

The fourth step is to initialize the MediaPipe Pose and Drawing utilities and load the video file using OpenCV’s VideoCapture function.

# Initialize MediaPipe Pose and Drawing utilities
mp_pose = mp.solutions.pose
mp_drawing = mp.solutions.drawing_utils
pose = mp_pose.Pose()

# Open the video file
cap = cv2.VideoCapture(video_path)
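By default, Pose() uses reasonable settings, but the constructor also accepts parameters such as static_image_mode, model_complexity, min_detection_confidence, and min_tracking_confidence. If you want to trade accuracy for speed, you could replace the pose = mp_pose.Pose() line above with something like the following (the values shown are illustrative starting points, not tuned for any particular video):

# Optional: configure the Pose model explicitly (illustrative values)
pose = mp_pose.Pose(
    static_image_mode=False,       # treat frames as a video stream
    model_complexity=1,            # 0, 1, or 2; higher is more accurate but slower
    min_detection_confidence=0.5,  # minimum confidence to detect a person
    min_tracking_confidence=0.5    # minimum confidence to keep tracking landmarks
)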

Step 5: Process Each Frame of the Video

The fifth step is to enter a loop that reads each frame of the video, converts it from BGR to RGB (MediaPipe expects RGB input), and processes it with MediaPipe Pose. If pose landmarks are detected, they are drawn on the frame using the MediaPipe Drawing utilities, and the write_landmarks_to_csv function is called to print the landmark coordinates and add them to the CSV data list. The loop exits when the video ends or when the 'q' key is pressed, after which the collected data is written to the output CSV file.

frame_number = 0
csv_data = []

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Convert the frame from BGR to RGB (MediaPipe expects RGB input)
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Process the frame with MediaPipe Pose
    result = pose.process(frame_rgb)

    # Draw the pose landmarks on the frame
    if result.pose_landmarks:
        mp_drawing.draw_landmarks(frame, result.pose_landmarks, mp_pose.POSE_CONNECTIONS)

        # Add the landmark coordinates to the list and print them
        write_landmarks_to_csv(result.pose_landmarks.landmark, frame_number, csv_data)

    # Display the frame
    cv2.imshow('MediaPipe Pose', frame)

    # Exit if the 'q' key is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

    frame_number += 1

# Release resources
cap.release()
cv2.destroyAllWindows()

# Write the collected landmark coordinates to the CSV file
with open(output_csv, 'w', newline='') as f:
    csv_writer = csv.writer(f)
    csv_writer.writerow(['frame_number', 'landmark', 'x', 'y', 'z'])
    csv_writer.writerows(csv_data)

Step 6: Test the Work

To test the code from the above blog, you can follow these steps:

  1. Download a sample video that contains human poses, such as a yoga video or a workout video.
  2. Save the video file in your computer’s file system and update the video_path variable in the code to point to the location of the video file.
  3. Specify the path and filename of the output CSV file by updating the output_csv variable in the code.
  4. Ensure that you have all the required libraries installed in your Python environment. You can install them using pip by running the following command: pip install opencv-python mediapipe (the csv module is part of Python's standard library, so it does not need to be installed).
  5. Open a Python IDE or a Python terminal and run the code.
  6. The code will process each frame of the video and display the output in a window. Press the ‘q’ key to stop the video processing.
  7. Once the video processing is complete, the output CSV file will be created with the landmark coordinates for each frame of the video.

By following these steps, you can easily test the code from the above blog and see how it detects human poses from a video file.
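If you want to sanity-check the output, you can load the CSV back into Python. The snippet below is a minimal sketch that assumes the file was written with the header row used earlier (frame_number, landmark, x, y, z):

import csv

# Minimal sketch: read the generated CSV back and inspect a few rows
with open('path_to_where_you_want_to_store_your_csv_file', newline='') as f:
    reader = csv.reader(f)
    header = next(reader)   # frame_number, landmark, x, y, z
    rows = list(reader)

print(f"Total landmark rows: {len(rows)}")
if rows:
    print("First row:", rows[0])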

Conclusion

In conclusion, this blog demonstrated how to use the MediaPipe Pose model and OpenCV to detect human poses in real-time from a video file using Python. We explored how to process each frame of the video using the MediaPipe Pose model, how to draw the pose landmarks on the frames, and how to save the landmark coordinates to a CSV file.

With the help of this blog, you can now easily apply this technique to your own videos and extract useful insights from them. We hope you found this blog informative and useful, and we encourage you to continue exploring the world of computer vision and machine learning.

GitHub link: https://github.com/wittygirl8/mp4-to-mediapipe
