This side learning project explores how MediaPipe can be used to improve athletic performance through pose-estimation analysis. The code below has been adapted from several open-source examples and edited for clarity and understanding.
Colab Files
Introduction to MP
Computer vision can aid in limb and joint detection in both images and videos. The Open Source Computer Vision Library (OpenCV) helps load and process images and extract the features we need. MediaPipe is a pipeline that provides ready-made solutions for face detection, pose estimation, object detection, box tracking, and motion tracking. Frames must be converted to RGB format before being passed to MediaPipe. The Pose solution marks landmark coordinates on the image/video that ultimately separate the subject from the background. MediaPipe first finds the Region of Interest (ROI) containing the person and then marks landmarks within that ROI.
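As a quick illustration of that flow, here is a minimal sketch for a single image, assuming MediaPipe and OpenCV are installed and using a hypothetical image file 'sample.jpg' that contains a person:

# Minimal sketch: BGR -> RGB conversion, pose processing, and landmark access.
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

# static_image_mode=True because we process one standalone image, not a stream.
with mp_pose.Pose(static_image_mode=True) as pose:
    img = cv2.imread('sample.jpg')  # hypothetical path; OpenCV loads images as BGR
    results = pose.process(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # MediaPipe expects RGB
    if results.pose_landmarks:
        # Each landmark is normalized to [0, 1] relative to the image width/height.
        nose = results.pose_landmarks.landmark[mp_pose.PoseLandmark.NOSE]
        print(nose.x, nose.y, nose.visibility)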
Exploring the Code
First, we import the required Python libraries:
# Install MediaPipe before importing it
!pip install mediapipe==0.8.8

# Mount Google Drive to access data
from google.colab import drive
drive.mount('/content/drive')

import cv2
from google.colab.patches import cv2_imshow  # cv2.imshow replacement for Colab
import mediapipe as mp
import time
import math
from mediapipe.python._framework_bindings import packet
We must then initialize the MediaPipe Pose object. Each constructor argument is described below; a minimal instantiation sketch follows the list.
- static_image_mode: Decides whether to treat the input as a batch of unrelated images or as a video stream. If set to True, MediaPipe runs the ROI (person) detector on every image. Since our video does not require the subject to be re-localized in every frame, we set it to False; if the frames contained unrelated subjects, True would be the better choice.
- model_complexity: Sets the complexity of the landmark model (0, 1, or 2). Landmark accuracy increases with model complexity; 1 is the default.
- smooth_landmarks: Decides whether to filter landmarks across frames to reduce jitter.
- enable_segmentation: Decides whether to predict a segmentation mask in addition to the landmarks.
- smooth_segmentation: Decides whether to filter the segmentation mask across frames.
- min_detection_confidence: Sets the minimum confidence value (0.0 to 1.0) from the detection model for the detection to be considered successful.
- min_tracking_confidence: Sets the minimum confidence value (0.0 to 1.0) for the pose landmarks to be considered tracked successfully; if tracking fails, detection is invoked again. Higher values increase robustness at the cost of higher latency.
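For reference, a direct instantiation with these arguments (the values shown are the defaults) would look like the sketch below; the poseDetector class that follows wraps this same call.

# Instantiation sketch; uses mediapipe imported above as mp
pose = mp.solutions.pose.Pose(
    static_image_mode=False,
    model_complexity=1,
    smooth_landmarks=True,
    enable_segmentation=False,
    smooth_segmentation=True,
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5)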
class poseDetector():
    def __init__(self, static_image_mode=False, model_complexity=1, smooth_landmarks=True,
                 enable_segmentation=False, smooth_segmentation=True,
                 min_detection_confidence=0.5, min_tracking_confidence=0.5):
        self.static_image_mode = static_image_mode
        self.model_complexity = model_complexity
        self.smooth_landmarks = smooth_landmarks
        self.enable_segmentation = enable_segmentation
        self.smooth_segmentation = smooth_segmentation
        self.min_detection_confidence = min_detection_confidence
        self.min_tracking_confidence = min_tracking_confidence

        # DRAW LANDMARKS
        self.mpDraw = mp.solutions.drawing_utils
        # Using a detector, the pipeline first locates the person/pose region-of-interest (ROI)
        # within the frame. The tracker subsequently predicts the pose landmarks and segmentation
        # mask within the ROI, using the ROI-cropped frame as input.
        self.mpPose = mp.solutions.pose
        self.pose = self.mpPose.Pose(static_image_mode, model_complexity, smooth_landmarks,
                                     enable_segmentation, smooth_segmentation,
                                     min_detection_confidence, min_tracking_confidence)

    def findPose(self, img, draw=True):
        imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        self.results = self.pose.process(imgRGB)
        if self.results.pose_landmarks:
            if draw:
                self.mpDraw.draw_landmarks(img, self.results.pose_landmarks,
                                           self.mpPose.POSE_CONNECTIONS)
        return img

    def findPosition(self, img, draw=True):
        self.lmList = []
        if self.results.pose_landmarks:
            for id, lm in enumerate(self.results.pose_landmarks.landmark):
                h, w, c = img.shape
                # print(id, lm)
                cx, cy = int(lm.x * w), int(lm.y * h)
                self.lmList.append([id, cx, cy])
                if draw:
                    cv2.circle(img, (cx, cy), 5, (255, 0, 0), cv2.FILLED)
        return self.lmList

    # MEASURE ANGLE
    def findAngle(self, img, p1, p2, p3, draw=True):
        # Get the landmarks
        x1, y1 = self.lmList[p1][1:]
        x2, y2 = self.lmList[p2][1:]
        x3, y3 = self.lmList[p3][1:]

        # Calculate the Angle
        angle = math.degrees(math.atan2(y3 - y2, x3 - x2) -
                             math.atan2(y1 - y2, x1 - x2))
        if angle < 0:
            angle += 360

        if draw:
            cv2.line(img, (x1, y1), (x2, y2), (255, 255, 255), 3)
            cv2.line(img, (x3, y3), (x2, y2), (255, 255, 255), 3)
            cv2.circle(img, (x1, y1), 10, (0, 0, 255), cv2.FILLED)
            cv2.circle(img, (x1, y1), 15, (0, 0, 255), 2)
            cv2.circle(img, (x2, y2), 10, (0, 0, 255), cv2.FILLED)
            cv2.circle(img, (x2, y2), 15, (0, 0, 255), 2)
            cv2.circle(img, (x3, y3), 10, (0, 0, 255), cv2.FILLED)
            cv2.circle(img, (x3, y3), 15, (0, 0, 255), 2)
            cv2.putText(img, str(int(angle)), (x2 - 50, y2 + 50),
                        cv2.FONT_HERSHEY_PLAIN, 2, (0, 0, 255), 2)
        return angle
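As a usage sketch (not part of the original class), the detector can be driven frame by frame. The video path below is a hypothetical placeholder, and landmarks 12, 14, and 16 correspond to the right shoulder, elbow, and wrist in the MediaPipe pose model:

# Usage sketch: read a video, draw the pose, and measure the right-elbow angle per frame
cap = cv2.VideoCapture('/content/drive/MyDrive/sample_video.mp4')  # hypothetical path
detector = poseDetector()
while True:
    success, img = cap.read()
    if not success:
        break
    img = detector.findPose(img)
    lmList = detector.findPosition(img, draw=False)
    if len(lmList) != 0:
        angle = detector.findAngle(img, 12, 14, 16)  # right shoulder-elbow-wrist
    cv2_imshow(img)  # Colab replacement for cv2.imshow
cap.release()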