Advanced MediaPipe Pose Tutorial: Custom Keypoint Detection

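1. Introduction: From Standard Detection to Custom Keypoints

1.1 The Evolution of Human Skeletal Keypoint Detection

Human pose estimation is one of the core tasks in computer vision, with applications in action recognition, virtual try-on, sports analysis, and human-computer interaction. Earlier approaches relied on heavyweight deep learning models such as OpenPose and AlphaPose, which typically need a GPU and are slow at inference.

As lightweight models matured, Google's MediaPipe Pose became a go-to choice for edge devices and CPU-only environments. It locates 33 keypoints within milliseconds per frame and stays robust on demanding poses such as yoga, dance, and fitness training.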

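The standard MediaPipe output, however, always contains all 33 keypoints, whereas real projects often care about only a few body parts: for example, tracking just the elbows and wrists for arm-gesture control, or just the shoulders and knees for rehabilitation assessment. This raises the central question: how can custom keypoint detection be built on top of MediaPipe Pose?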

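1.2 Goals and Value of This Article

This article starts from scratch, explains how MediaPipe Pose works, and walks through code that achieves the following goals:

Understand the keypoint index structure of MediaPipe Pose
Extract and visualize a chosen subset of keypoints (such as the upper body or the lower limbs)
Define custom connection logic to draw a tailored skeleton
Integrate a WebUI for image upload and result display

By the end you will have a lightweight, trimmable pose detection setup that can be put to use directly and suits custom AI applications in resource-constrained environments.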

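2. Core Principles: The MediaPipe Pose Keypoint System

2.1 Model Architecture and Output Format

MediaPipe Pose is built on BlazePose and uses a two-step detector-tracker pipeline: a person detector first locates the region of interest, and a landmark model then predicts the keypoints inside that region. This design keeps accuracy high while greatly reducing inference cost. The output is a set of 33 landmarks, each carrying (x, y, z, visibility), that cover the face, the torso, and the main joints of the limbs. x and y are normalized to [0, 1] by image width and height, z is a relative depth with the midpoint of the hips as origin, and visibility estimates how likely the point is to be visible in the image.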
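To make the output format concrete, the sketch below converts one normalized landmark into pixel coordinates. `Landmark` is a hypothetical stand-in for MediaPipe's landmark objects (which expose the same `x`, `y`, `z`, and `visibility` fields), so the snippet runs without MediaPipe installed. The clamping step is an assumption added here because normalized values can fall slightly outside [0, 1] when a body part leaves the frame.

```python
from typing import NamedTuple, Tuple


class Landmark(NamedTuple):
    """Hypothetical stand-in for a MediaPipe pose landmark."""
    x: float           # normalized by image width
    y: float           # normalized by image height
    z: float           # relative depth
    visibility: float  # likelihood that the point is visible, in [0, 1]


def to_pixel(lm: Landmark, width: int, height: int) -> Tuple[int, int]:
    """Map normalized (x, y) to integer pixel coordinates, clamped to the image."""
    px = min(max(int(lm.x * width), 0), width - 1)
    py = min(max(int(lm.y * height), 0), height - 1)
    return px, py


nose = Landmark(x=0.5, y=0.25, z=-0.1, visibility=0.98)
print(to_pixel(nose, width=640, height=480))  # (320, 120)
```

With real results, the same function would be applied to each entry of `results.pose_landmarks.landmark`.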

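These keypoints come in a fixed order, and each index corresponds to a specific body part. The table below is an excerpt of the mapping:

Index	Name	Region
0	nose	Face
1	left_eye_inner	Face
2	left_eye	Face
...	...	...
11	left_shoulder	Upper limb
12	right_shoulder	Upper limb
13	left_elbow	Upper limb
14	right_elbow	Upper limb
15	left_wrist	Hand
16	right_wrist	Hand
...	...	...
23	left_hip	Lower limb
24	right_hip	Lower limb
25	left_knee	Lower limb
26	right_knee	Lower limb
27	left_ankle	Foot
28	right_ankle	Foot

📌 Tip: the complete index list is available in the official MediaPipe documentation.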
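Hard-coding numbers such as 11 or 23 is error-prone, so a name-based lookup can help. The sketch below uses a plain dict that copies the indices from the excerpt above; `LANDMARK_INDEX` and `indices_for` are illustrative names, not part of MediaPipe. In a real project you would more likely use the enum MediaPipe ships for this purpose, `mp.solutions.pose.PoseLandmark`.

```python
# Indices copied from the excerpt table above (not the full 33-entry list).
LANDMARK_INDEX = {
    "nose": 0,
    "left_shoulder": 11, "right_shoulder": 12,
    "left_elbow": 13, "right_elbow": 14,
    "left_wrist": 15, "right_wrist": 16,
    "left_hip": 23, "right_hip": 24,
    "left_knee": 25, "right_knee": 26,
    "left_ankle": 27, "right_ankle": 28,
}


def indices_for(*parts: str) -> list:
    """Return the sorted indices for both sides of each named joint, e.g. 'knee'."""
    selected = []
    for part in parts:
        for side in ("left", "right"):
            selected.append(LANDMARK_INDEX[f"{side}_{part}"])
    return sorted(selected)


print(indices_for("shoulder", "knee"))  # [11, 12, 25, 26]
```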

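2.2 Analyzing the Keypoint Connection Logic

By default, MediaPipe draws the skeleton from the predefined POSE_CONNECTIONS, a set of (start, end) index pairs with 35 edges in the 0.10.x releases; for example, (11, 13) is the line from the left shoulder to the left elbow.

This default set may include regions you do not care about, such as facial detail. Implementing "custom keypoint detection" therefore takes two steps:

Filter the set of keypoint indices
Rebuild the connection graph

These are the two problems the next section solves.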
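The two steps can be sketched without MediaPipe at all: choose a set of indices, then keep only the edges whose endpoints both survive. `filter_connections` and `SAMPLE_CONNECTIONS` are illustrative names, and the sample set is only a subset of the default edges.

```python
def filter_connections(connections, keep):
    """Keep only the edges whose two endpoints are both in `keep`."""
    keep = set(keep)
    return sorted((a, b) for a, b in connections if a in keep and b in keep)


# A subset of the default edges, for illustration only.
SAMPLE_CONNECTIONS = {
    (0, 1), (9, 10),                          # face
    (11, 12), (11, 13), (13, 15),             # shoulders and left arm
    (12, 14), (14, 16),                       # right arm
    (11, 23), (12, 24), (23, 24),             # torso
    (23, 25), (25, 27), (24, 26), (26, 28),   # legs
}

lower_body = range(23, 29)
print(filter_connections(SAMPLE_CONNECTIONS, lower_body))
# [(23, 24), (23, 25), (24, 26), (25, 27), (26, 28)]
```

With MediaPipe installed, passing `mp.solutions.pose.POSE_CONNECTIONS` as the first argument should derive the edge list for any subset automatically, instead of maintaining it by hand.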

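3. Practical Application: Building a Custom Keypoint Detection System

3.1 Technology Choices and Environment Setup

The project is built on the following stack:

Python 3.9+
MediaPipe 0.10.9+
Flask (for the WebUI)
OpenCV-Python (image processing)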

pip install mediapipe opencv-python flask numpy

The project is laid out as follows:

custom_pose/
├── app.py                  # Flask entry point
├── pose_detector.py        # Custom pose detector class
├── static/uploads/         # Upload directory for images
└── templates/index.html    # Front-end page

3.2 Extracting and Filtering Custom Keypoints

We first wrap the logic in a CustomPoseDetector class that lets you choose the keypoints of interest.

# pose_detector.py
import cv2
import mediapipe as mp


class CustomPoseDetector:
    def __init__(self, upper_body_only=False, lower_body_only=False):
        self.mp_pose = mp.solutions.pose

        # Initialize the MediaPipe Pose model for single images
        self.pose = self.mp_pose.Pose(
            static_image_mode=True,
            model_complexity=1,
            enable_segmentation=False,
            min_detection_confidence=0.5
        )

        # Select the region of interest
        if upper_body_only:
            # Shoulders, elbows, wrists, hips
            self.landmark_indices = list(range(11, 17)) + [23, 24]
            self.connection_pairs = [
                (11, 13), (13, 15),  # left arm
                (12, 14), (14, 16),  # right arm
                (11, 23), (12, 24),  # shoulders to hips
                (23, 24)             # hip line
            ]
        elif lower_body_only:
            # Left and right hips, knees, ankles
            self.landmark_indices = list(range(23, 29))
            self.connection_pairs = [
                (23, 25), (25, 27),  # left leg
                (24, 26), (26, 28)   # right leg
            ]
        else:
            # Default: full body
            self.landmark_indices = list(range(33))
            self.connection_pairs = self.mp_pose.POSE_CONNECTIONS

    def detect(self, image_path):
        image = cv2.imread(image_path)
        if image is None:  # unreadable file or not an image
            return None, "Could not read the image"

        rgb_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        results = self.pose.process(rgb_image)
        if not results.pose_landmarks:
            return None, "No person detected"

        h, w, _ = image.shape

        # Map landmark index -> (x_px, y_px, visibility) for the selected subset
        keypoints = {}
        for idx in self.landmark_indices:
            lm = results.pose_landmarks.landmark[idx]
            keypoints[idx] = (int(lm.x * w), int(lm.y * h), lm.visibility)

        annotated_image = image.copy()

        # Draw the white lines first so the red dots stay on top
        for start, end in self.connection_pairs:
            if start in keypoints and end in keypoints:
                cv2.line(annotated_image, keypoints[start][:2],
                         keypoints[end][:2], (255, 255, 255), 2)
        for x, y, _ in keypoints.values():
            cv2.circle(annotated_image, (x, y), 5, (0, 0, 255), -1)

        return annotated_image, f"Detected {len(keypoints)} keypoints"

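📌 Code notes:
- Three modes are supported: full body, upper body, and lower body
- landmark_indices controls which keypoints are kept
- connection_pairs defines the new connection logic
- Red dots and white lines are drawn manually with OpenCV instead of the default drawing function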

3.3 WebUI Integration and Interaction Design

A simple web interface built with Flask handles image upload and result display.

# app.py
import os

import cv2  # needed for cv2.imwrite below
from flask import Flask, request, render_template

from pose_detector import CustomPoseDetector

app = Flask(__name__)
UPLOAD_FOLDER = 'static/uploads'
os.makedirs(UPLOAD_FOLDER, exist_ok=True)


@app.route('/', methods=['GET', 'POST'])
def index():
    if request.method == 'POST':
        file = request.files.get('image')
        mode = request.form.get('mode', 'full')
        if file:
            input_path = os.path.join(UPLOAD_FOLDER, 'input.jpg')
            output_path = os.path.join(UPLOAD_FOLDER, 'output.jpg')
            file.save(input_path)

            # Build a detector for the requested mode
            detector = CustomPoseDetector(
                upper_body_only=(mode == 'upper'),
                lower_body_only=(mode == 'lower')
            )
            result_img, msg = detector.detect(input_path)

            if result_img is not None:
                cv2.imwrite(output_path, result_img)
                return render_template('index.html', message=msg,
                                       input_img='uploads/input.jpg',
                                       output_img='uploads/output.jpg')
            return render_template('index.html', message=msg)
    return render_template('index.html')


# Flask already serves the static/ folder at /static/<path>,
# so no extra route is needed for the uploaded images.

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
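app.py writes every upload to the same input.jpg and output.jpg, so two simultaneous requests would overwrite each other and browsers may show a cached result. One possible refinement, sketched under assumptions, is to generate a random file stem per request. `make_upload_paths` and `ALLOWED_EXTENSIONS` are hypothetical names introduced here, not part of the original project.

```python
import os
import uuid

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # assumption: formats you accept


def make_upload_paths(original_name: str, folder: str = "static/uploads"):
    """Return (input_path, output_path) with a random stem, or None if the
    extension is not allowed. Only the extension of the client-supplied name
    is reused, so the name itself never reaches the file system."""
    ext = os.path.splitext(original_name)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return None
    stem = uuid.uuid4().hex
    return (os.path.join(folder, f"{stem}_input{ext}"),
            os.path.join(folder, f"{stem}_output{ext}"))


paths = make_upload_paths("photo.JPG")
print(paths is not None)                # True
print(make_upload_paths("notes.txt"))   # None
```

In the route, the two returned paths would replace the fixed `input_path` and `output_path`, and the template would receive the matching file names relative to `static/`.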

Front-end HTML template (simplified):

<!-- templates/index.html -->
<h1>🤸‍♂️ Custom Human Skeleton Detection</h1>
<form method="post" enctype="multipart/form-data">
  <input type="file" name="image" required />
  <select name="mode">
    <option value="full">Full body</option>
    <option value="upper">Upper body</option>
    <option value="lower">Lower body</option>
  </select>
  <button type="submit">Analyze pose</button>
</form>
{% if message %}
<p><strong>{{ message }}</strong></p>
{% if output_img %}
<img src="{{ url_for('static', filename=input_img) }}" width="400" />
<img src="{{ url_for('static', filename=output_img) }}" width="400" />
{% endif %}
{% endif %}

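3.4 Running the System and Optimization Tips

Steps to run

Start the service: python app.py
Open http://localhost:5000 in a browser
Upload a photo of a person
Choose a detection mode (for example, "Upper body")
View the generated skeleton image

Example output

Input image: a person doing push-ups
Mode selected: "Upper body"
Output image: only the shoulders, elbows, wrists, and hips (eight keypoints in total) are marked with red dots and joined by white lines into an upper-body skeleton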

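Performance optimization tips

Area	Recommendation
Reduce redundant computation	Keep `enable_segmentation=False` when only keypoint coordinates are needed; `smooth_landmarks` is ignored when `static_image_mode=True`
Improve responsiveness	For video streams, set `static_image_mode=False` so the person detector runs only on the first frame and after tracking is lost, with landmarks tracked in between
Reduce memory footprint	Use `model_complexity=0` to switch to the lightweight (Lite) model
Improve robustness	Add confidence filtering: skip keypoints with `visibility < 0.5`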
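The confidence filter in the last row could look like the sketch below. It assumes keypoints are stored as a dict from landmark index to (x, y, visibility) in pixels; `visible_edges` is an illustrative helper, not a MediaPipe function.

```python
def visible_edges(keypoints, connections, min_visibility=0.5):
    """Return the edges whose two endpoints both meet the visibility threshold.

    keypoints: dict mapping landmark index -> (x_px, y_px, visibility)
    connections: iterable of (start_index, end_index) pairs
    """
    visible = {i for i, (_, _, v) in keypoints.items() if v >= min_visibility}
    return [(a, b) for a, b in connections if a in visible and b in visible]


keypoints = {
    11: (210, 140, 0.99),  # left shoulder
    13: (190, 220, 0.95),  # left elbow
    15: (180, 300, 0.30),  # left wrist, mostly occluded
}
print(visible_edges(keypoints, [(11, 13), (13, 15)]))  # [(11, 13)]
```

Drawing only the returned edges, and only the dots that pass the same threshold, avoids lines that shoot off toward poorly estimated, occluded joints.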