Recently I had a project where the client needed human pose recognition, either for scoring stage performances or for assessing patients' recovery exercises at a rehabilitation center. The idea is to use a camera to track the limbs and compare body keypoints against a reference for grading. There is surprisingly little material on this; only a handful of experts have published partial technical guides, and the ones I found are quite good.
Alibaba Cloud: Visual Intelligence Open Platform (SDK service)
Tailored to the human body, it defines 15 custom body keypoints that can describe a person's pose precisely, and it is fairly robust to ambient light, motion blur and so on. Typical scenarios:
- Sports and fitness
- Interactive live streaming
After looking into it, though, the cost was more than the budget could bear (the monthly subscription runs into the tens of thousands of RMB, and the SDK also has to pass an approval review before it can be enabled), so I dropped it without hesitation.
Baidu AI Open Platform
Human body keypoint detection
It precisely locates 21 main keypoints of the human body, covering the top of the head, the facial features, the neck and the major limb joints, and it handles complex scenarios such as back views, side views, mid/low-angle oblique shots and large movements (a rough sketch of calling its REST API appears after the list below). Typical scenarios:
- Sports and fitness
- Entertainment
- Security
Baidu's service can be tried for free, which is a nice touch; presumably the two platforms simply have different focuses.
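For anyone who wants to try the Baidu route, here is a rough sketch of calling its body keypoint REST API from Node.js (18+, global fetch). This is not from the original post; the endpoint and response fields are written from memory of Baidu's documentation and should be verified against the current 人体关键点识别 docs before use.

```js
// Hedged sketch: Baidu body keypoint API. Endpoint and field names need verification.
const API_KEY = process.env.BAIDU_API_KEY;       // from the Baidu AI console (assumed env vars)
const SECRET_KEY = process.env.BAIDU_SECRET_KEY;

async function getAccessToken() {
  const url =
    "https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials" +
    `&client_id=${API_KEY}&client_secret=${SECRET_KEY}`;
  const res = await fetch(url, { method: "POST" });
  return (await res.json()).access_token;
}

async function bodyKeypoints(imageBase64) {
  const token = await getAccessToken();
  const res = await fetch(
    `https://aip.baidubce.com/rest/2.0/image-classify/v1/body_analysis?access_token=${token}`,
    {
      method: "POST",
      headers: { "Content-Type": "application/x-www-form-urlencoded" },
      body: `image=${encodeURIComponent(imageBase64)}`,
    }
  );
  const data = await res.json();
  // Expected shape (verify): data.person_info[i].body_parts.<part> = { x, y, score }
  return data.person_info?.[0]?.body_parts;
}
```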
First of all, thanks to 忠文老弟 on Zhihu.
Source: 纯手撸(js+网络摄像头)实现的丐版动捕 (a bare-bones motion-capture demo hand-rolled with JS and a webcam)
Video demo (in the original post): 纯手撸(js+网络摄像头)实现的丐版动捕
Technical dependencies
1. WebRTC is used to read the camera stream. It is a web standard and can be used directly; note that if you are not debugging on localhost (this includes previewing on a phone), the page must be served over HTTPS.
2. BlazePose, a neural-network model released by Google in 2020, detects a person in an image and returns both 2D and 3D coordinates for every keypoint. The 2D coordinates are in the input image's coordinate space; the 3D coordinates live in a 2×2×2 cube whose origin is the midpoint of the hips. A minimal loading sketch follows this list.
3. Three.js is used as the visualization library together with some basic math utilities; there are no other dependencies.
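The original post does not show how BlazePose itself is loaded. A minimal sketch, assuming the TensorFlow.js pose-detection wrapper (@tensorflow-models/pose-detection), which is one common way to get both keypoints and keypoints3D in the browser:

```js
import * as poseDetection from "@tensorflow-models/pose-detection";
import "@tensorflow/tfjs-backend-webgl";

// Create a BlazePose detector running on the TF.js WebGL backend.
const detector = await poseDetection.createDetector(
  poseDetection.SupportedModels.BlazePose,
  { runtime: "tfjs", modelType: "full" }
);

// Run it on a video frame (or an image); "video" is the element fed by getMedia() below.
const video = document.getElementById("video");
const poses = await detector.estimatePoses(video);
// poses[0].keypoints   -> 2D points in input-image coordinates
// poses[0].keypoints3D -> 3D points (roughly metres), centred on the hip midpoint
```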
Problems to solve and the core idea
With BlazePose in hand, recognising the body pose is already taken care of, and drawing the result onto a canvas is trivial because the 2D coordinates are given. But most demos stop there; even the "3D" ones simply render the raw 3D keypoints, which is only the end result. For a rigged character and its animation, knowing where a keypoint ends up is not enough: we need to know how the bones and joints rotated to arrive at that pose. Nor is this plain IK (inverse kinematics), where you set target positions for a few end bones and let a solver drive all the others. Since BlazePose already gives the position of every keypoint, we can compute each joint's rotation from those positions and use the rotations to drive the animated character.
So the core problems become:
0. BlazePose predicts each frame independently and has no notion of an initial state, so we first capture a T-Pose and use its keypoint positions as the initialization.
1. The length of a single bone segment never changes, so the body's pose can only change through bending and rotating at the joints.
2. For chains such as "upper arm drives forearm" or "thigh drives calf", each segment's rotation depends only on the two keypoints at its ends; the other keypoints are irrelevant.
3. The rotation of the upper body can be obtained by comparing the current left-to-right shoulder direction with its original direction.
4. Head rotation is the most complicated part:
 - the head keypoints have very small coordinate values, which easily introduces error;
 - the head is driven as one rigid unit, so its keypoints do not move relative to the head itself; everything has to be converted into coordinates relative to the body origin;
 - the keypoints alone do not give the head's original orientation, so the head direction has to be established some other way;
 - how do we build a base coordinate system for the head?
5. The whole computation has quite a few steps, so we need some debugging aids to check the logic and the intermediate results at each step (a small helper sketch follows this list).
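For point 5, here is a minimal sketch of the kind of debug aids meant, using Three.js built-ins. The name helperGroup is the group the later snippets add their debug lines to; scene and model come from the Three.js setup in step 2 below:

```js
// Debug aids: world axes, a ground grid, the model's skeleton, plus a group
// that the per-frame debug lines / triangles are added to.
const helperGroup = new THREE.Group();
scene.add(helperGroup);

scene.add(new THREE.AxesHelper(1));    // x = red, y = green, z = blue
scene.add(new THREE.GridHelper(4, 8)); // ground reference

const skeletonHelper = new THREE.SkeletonHelper(model); // draws every bone of the rig
scene.add(skeletonHelper);

// Clear the per-frame helpers before each update so old lines don't pile up.
function resetHelpers() {
  helperGroup.clear();
}
```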
Key code
1. Get the camera video stream via WebRTC
```js
function getMedia() {
  const width = window.innerWidth / 2;
  const height = window.innerHeight / 2;
  // Constraints: video only, sized to half the window
  const constraints = {
    video: { width, height },
    audio: false
  };
  // The <video> element that shows the camera feed
  const video = document.getElementById("video");
  video.width = width;
  video.height = height;
  // getUserMedia returns a Promise that resolves to a MediaStream
  navigator.mediaDevices.getUserMedia(constraints)
    .then(function (stream) {
      video.srcObject = stream;
      video.play();
    })
    .catch(function (err) {
      console.error("getUserMedia failed:", err);
    });
}
```
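One practical detail worth adding (not in the original post): the detector should only start sampling the <video> element once it actually has frames. A small helper for that, using standard DOM events:

```js
// Resolve once the <video> element has decoded at least one frame.
function waitForVideo(video) {
  return new Promise((resolve) => {
    if (video.readyState >= 2) return resolve(video); // HAVE_CURRENT_DATA or better
    video.addEventListener("loadeddata", () => resolve(video), { once: true });
  });
}

// Usage: await waitForVideo(document.getElementById("video")); then start detection.
```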
2. Load a skinned model with Three.js and set up the helpers
The Xbot.glb that ships with Three.js under /examples/models/gltf/Xbot.glb works out of the box. I won't paste the basic setup code here; the final result is shown in the original post.
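Since that boilerplate is omitted, here is a minimal sketch of a setup the later snippets would run against; the camera placement and lighting values are illustrative, not taken from the original:

```js
// Basic scene, camera, renderer and light (illustrative values).
const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(45, window.innerWidth / window.innerHeight, 0.1, 100);
camera.position.set(0, 1.5, 3);

const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);

scene.add(new THREE.HemisphereLight(0xffffff, 0x444444, 1));

// Render loop; the pose-driven bone updates run in the detection loop shown later.
function animate() {
  requestAnimationFrame(animate);
  renderer.render(scene, camera);
}
animate();
```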
3. Set the keypoints' initial bone positions
BlazePose only seems to recognise fairly realistic humans: a screenshot of an anime-style character posed in Blender was not detected at all, so I found a photo of a real person standing in a T-Pose and used it to establish the baseline keypoint positions.
Running BlazePose on that photo gives the keypoints we care about; we bind each one to the corresponding Xbot bone and record it as the initial position:
```js
let boneObj = {};
let viewModel;
let model, bones; // declared here so the callback below doesn't create implicit globals

const loader = new THREE.GLTFLoader();
loader.load('./Xbot.glb', function (gltf) {
  model = gltf.scene;
  scene.add(model);
  viewModel = model;
  bones = model.children[0].children[0];
  // Keypoints predicted from the T-Pose photo above, bound to the matching Xbot bones
  boneObj['left-shoulder'] = { bone: bones.getObjectByName('mixamorigLeftShoulder'), initPos: new THREE.Vector3(0.16728906333446503, -0.4775106608867645, -0.2042236328125) };
  boneObj['left-arm'] = { bone: bones.getObjectByName('mixamorigLeftArm'), initPos: new THREE.Vector3(0.38952040672302246, -0.4693129360675812, -0.207763671875) };
  boneObj['left-fore-arm'] = { bone: bones.getObjectByName('mixamorigLeftForeArm'), initPos: new THREE.Vector3(0.5944491624832153, -0.4565984904766083, -0.315185546875) };
  boneObj['right-shoulder'] = { bone: bones.getObjectByName('mixamorigRightShoulder'), initPos: new THREE.Vector3(-0.17201489210128784, -0.4690127372741699, -0.2266845703125) };
  boneObj['right-arm'] = { bone: bones.getObjectByName('mixamorigRightArm'), initPos: new THREE.Vector3(-0.40517494082450867, -0.43440765142440796, -0.2242431640625) };
  boneObj['right-fore-arm'] = { bone: bones.getObjectByName('mixamorigRightForeArm'), initPos: new THREE.Vector3(-0.6103491187095642, -0.4126957058906555, -0.3125) };
  boneObj['neck'] = { bone: bones.getObjectByName('mixamorigNeck'), initPos: new THREE.Vector3(1, 0.0, 0.0) };
  boneObj['waist'] = { bone: bones.getObjectByName('mixamorigSpine2'), initPos: new THREE.Vector3(1, 0.0, 0.0) };
  // Optionally draw (or hide) the skeleton helper
  // skeleton = new THREE.SkeletonHelper(model);
  // skeleton.visible = false;
  // scene.add(skeleton);
});
```
Each tracked keypoint is given an initial position (used later to compute the initial directions) and bound to the corresponding bone of the Xbot model, which makes the later steps straightforward.
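For reference, a sketch of how such initial positions could be captured from a T-Pose photo; the image path and the logging are illustrative and not part of the original post:

```js
// Hypothetical helper: run BlazePose once on a T-Pose photo and log the 3D
// keypoints so they can be pasted into the initPos values above.
async function captureTPose(detector) {
  const img = new Image();
  img.src = "./t-pose.jpg"; // illustrative path
  await img.decode();
  const poses = await detector.estimatePoses(img);
  if (!poses.length) return;
  // BlazePose indices: 11/12 shoulders, 13/14 elbows, 15/16 wrists
  [11, 12, 13, 14, 15, 16].forEach((i) => {
    const kp = poses[0].keypoints3D[i];
    console.log(i, kp.x, kp.y, kp.z);
  });
}
```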
4. Shoulder rotation
Principle:
In BlazePose the shoulders are keypoints 11 and 12. In the T-Pose, the direction from the right shoulder (12) to the left shoulder (11) is Vector3(1, 0, 0). For each frame we only need the current 12→11 direction; the rotation between the two directions is the one we apply. (Shoulder drop and similar effects are not considered here.)
Because the shoulders themselves cannot rotate the torso, the turning actually happens at the waist joint, so this rotation has to be applied to the model's waist bone.
(Figure: shoulder rotation diagram)
The rotation between two vectors is computed and stored as a quaternion: the cross product of the vectors gives the rotation axis, and Vector3.angleTo gives the angle:
```js
// Generic pattern: quaternion between two direction vectors
// (axis = cross product, angle = angleTo; setFromAxisAngle expects a unit axis)
const angle = newVec.angleTo(baseVec);
const cross = new THREE.Vector3().crossVectors(newVec, baseVec).normalize();
const quaternion = new THREE.Quaternion();
quaternion.setFromAxisAngle(cross, angle);

// Shoulder rotation
const leftShoulderM = currentPoses.keypoints3D[11];
const rightShoulderM = currentPoses.keypoints3D[12];
// Midpoint of the two shoulders (not used further in this excerpt)
const centerShoulderM = new THREE.Vector3().addVectors(leftShoulderM, rightShoulderM).divideScalar(2);
const shoulderOriginDir = new THREE.Vector3(1, 0, 0);
const shoulderCurrentDir = new THREE.Vector3().subVectors(leftShoulderM, rightShoulderM).normalize();
const shoulderAngle = shoulderCurrentDir.angleTo(shoulderOriginDir);
const shoulderCross = new THREE.Vector3().crossVectors(shoulderCurrentDir, shoulderOriginDir).normalize();
const shoulderQuaternion = new THREE.Quaternion();
// The original doubles the angle here
shoulderQuaternion.setFromAxisAngle(shoulderCross, shoulderAngle * 2);
boneObj['waist']['bone'].setRotationFromQuaternion(shoulderQuaternion);
```
5. Upper-arm / forearm rotation
Since the length of a bone segment is fixed, only rotation matters. Just as with the shoulders, the upper arm's rotation is computed from the shoulder keypoint (11) to the elbow keypoint (13), and the forearm's rotation from the elbow keypoint (13) to the wrist keypoint (15); the right side uses 12/14/16.
Upper arm driving the forearm:
```js
/**
 * Given the positions of a bone's two end keypoints (parent end and child end),
 * compute the bone's rotation relative to its T-Pose direction.
 */
function getBoneRotation(baseBoneName, baseBonePoints, currentBoneName, currentBonePoints) {
  const baseBone = boneObj[baseBoneName];
  const currentBone = boneObj[currentBoneName];
  const baseBonePos = new THREE.Vector3(baseBonePoints[0], baseBonePoints[1], baseBonePoints[2]);
  const currentBonePos = new THREE.Vector3(currentBonePoints[0], currentBonePoints[1], currentBonePoints[2]);
  const newVec = new THREE.Vector3().subVectors(currentBonePos, baseBonePos);
  const baseVec = new THREE.Vector3().subVectors(currentBone.initPos, baseBone.initPos);
  newVec.normalize();
  baseVec.normalize();
  const angle = newVec.angleTo(baseVec);
  // setFromAxisAngle expects a unit axis, so normalize the cross product
  const cross = new THREE.Vector3().crossVectors(newVec, baseVec).normalize();
  const quaternion = new THREE.Quaternion();
  quaternion.setFromAxisAngle(cross, angle);
  return quaternion;
}

// Left side: shoulder (11), elbow (13), wrist (15)
const left_nodes = currentPoses.keypoints3D.filter((item, index) => {
  return index === 11 || index === 13 || index === 15;
});
// Right side: shoulder (12), elbow (14), wrist (16)
const right_nodes = currentPoses.keypoints3D.filter((item, index) => {
  return index === 12 || index === 14 || index === 16;
});

const boneRotation1 = getBoneRotation('left-shoulder', [left_nodes[0].x, left_nodes[0].y, left_nodes[0].z], 'left-arm', [left_nodes[1].x, left_nodes[1].y, left_nodes[1].z]);
boneObj['left-arm']['bone'].setRotationFromQuaternion(boneRotation1);
const boneRotation2 = getBoneRotation('left-arm', [left_nodes[1].x, left_nodes[1].y, left_nodes[1].z], 'left-fore-arm', [left_nodes[2].x, left_nodes[2].y, left_nodes[2].z]);
boneObj['left-fore-arm']['bone'].setRotationFromQuaternion(boneRotation2);
const boneRotation3 = getBoneRotation('right-shoulder', [right_nodes[0].x, right_nodes[0].y, right_nodes[0].z], 'right-arm', [right_nodes[1].x, right_nodes[1].y, right_nodes[1].z]);
boneObj['right-arm']['bone'].setRotationFromQuaternion(boneRotation3);
const boneRotation4 = getBoneRotation('right-arm', [right_nodes[1].x, right_nodes[1].y, right_nodes[1].z], 'right-fore-arm', [right_nodes[2].x, right_nodes[2].y, right_nodes[2].z]);
boneObj['right-fore-arm']['bone'].setRotationFromQuaternion(boneRotation4);
```
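The snippets above assume currentPoses holds the latest BlazePose result. Here is a sketch of the per-frame loop that ties things together; updateBonesFromPose is a hypothetical name standing in for the shoulder / arm / neck updates shown in this post, and detector and video come from the earlier steps:

```js
let currentPoses = null;

async function detectLoop() {
  const poses = await detector.estimatePoses(video);
  if (poses.length > 0 && poses[0].keypoints3D) {
    currentPoses = poses[0];            // keypoints (2D) + keypoints3D (hip-centred)
    updateBonesFromPose(currentPoses);  // apply the shoulder / arm / neck rotations
  }
  requestAnimationFrame(detectLoop);
}
detectLoop();
```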
6. Rotate the whole head via the neck joint
Head motion is relatively complex, as mentioned earlier, and some extra work is needed to establish both the initial coordinate frame and the current one. Simplified, it comes down to this: build a triangle leftTriangle from the three keypoints on the left side of the face (left eye 2, left ear 7, left mouth corner 9) and take its normal; do the same on the right side (5, 8, 10); then sum the two normals to get a vector that approximates the direction the head is facing. The idea is illustrated in the figure in the original post.
```js
// Neck rotation
const noseM = currentPoses.keypoints3D[0];
const leftM = currentPoses.keypoints3D[2];    // left eye
const leftM2 = currentPoses.keypoints3D[7];   // left ear
const leftM3 = currentPoses.keypoints3D[9];   // left mouth corner
const rightM = currentPoses.keypoints3D[5];   // right eye
const rightM2 = currentPoses.keypoints3D[8];  // right ear
const rightM3 = currentPoses.keypoints3D[10]; // right mouth corner

// Triangles over each side of the face; scaled up because the raw values are tiny
const leftTriangle = new THREE.Triangle(
  new THREE.Vector3(leftM3.x, leftM3.y, leftM3.z).multiplyScalar(100),
  new THREE.Vector3(leftM2.x, leftM2.y, leftM2.z).multiplyScalar(100),
  new THREE.Vector3(leftM.x, leftM.y, leftM.z).multiplyScalar(100)
);
const rightTriangle = new THREE.Triangle(
  new THREE.Vector3(rightM2.x, rightM2.y, rightM2.z).multiplyScalar(100),
  new THREE.Vector3(rightM3.x, rightM3.y, rightM3.z).multiplyScalar(100),
  new THREE.Vector3(rightM.x, rightM.y, rightM.z).multiplyScalar(100)
);

// The sum of the two face normals approximates the head's facing direction
const leftNormal = leftTriangle.getNormal(new THREE.Vector3());
const rightNormal = rightTriangle.getNormal(new THREE.Vector3());
const mixNormal = new THREE.Vector3().addVectors(leftNormal, rightNormal);
mixNormal.normalize();

// --- Debug visualisation of the two face triangles ---
const scaleRate = 2;
const faceMaterial = new THREE.LineBasicMaterial({
  color: 0xffffff
});
const noseVector = new THREE.Vector3(noseM.x, noseM.y, noseM.z);

// Left face triangle
const leftFacePoints = [];
leftFacePoints.push(new THREE.Vector3().subVectors(new THREE.Vector3(leftM.x, leftM.y, leftM.z), noseVector).multiplyScalar(scaleRate));
leftFacePoints.push(new THREE.Vector3().subVectors(new THREE.Vector3(leftM3.x, leftM3.y, leftM3.z), noseVector).multiplyScalar(scaleRate));
leftFacePoints.push(new THREE.Vector3().subVectors(new THREE.Vector3(leftM2.x, leftM2.y, leftM2.z), noseVector).multiplyScalar(scaleRate));
const leftFaceGeometry = new THREE.BufferGeometry().setFromPoints(leftFacePoints);
const leftFaceMesh = new THREE.LineLoop(leftFaceGeometry, faceMaterial);
helperGroup.add(leftFaceMesh);

// Right face triangle
const rightFacePoints = [];
rightFacePoints.push(new THREE.Vector3().subVectors(new THREE.Vector3(rightM.x, rightM.y, rightM.z), noseVector).multiplyScalar(scaleRate));
rightFacePoints.push(new THREE.Vector3().subVectors(new THREE.Vector3(rightM3.x, rightM3.y, rightM3.z), noseVector).multiplyScalar(scaleRate));
rightFacePoints.push(new THREE.Vector3().subVectors(new THREE.Vector3(rightM2.x, rightM2.y, rightM2.z), noseVector).multiplyScalar(scaleRate));
const rightFaceGeometry = new THREE.BufferGeometry().setFromPoints(rightFacePoints);
const rightFaceMesh = new THREE.LineLoop(rightFaceGeometry, faceMaterial);
helperGroup.add(rightFaceMesh);

// Original (rest) neck direction
const neckOriginDir = new THREE.Vector3(0, 0, -1);
const neckOriginPoints = [];
neckOriginPoints.push(new THREE.Vector3());
neckOriginPoints.push(neckOriginDir);
const neckOriginGeometry = new THREE.BufferGeometry().setFromPoints(neckOriginPoints);
const neckOriginMesh = new THREE.Line(neckOriginGeometry, new THREE.LineBasicMaterial({
  color: 0xff0000
}));
helperGroup.add(neckOriginMesh);

// Current neck direction (the mixed face normal)
const neckCurrentPoints = [];
neckCurrentPoints.push(new THREE.Vector3());
neckCurrentPoints.push(mixNormal);
const neckCurrentGeometry = new THREE.BufferGeometry().setFromPoints(neckCurrentPoints);
const neckCurrentMesh = new THREE.Line(neckCurrentGeometry, faceMaterial);
helperGroup.add(neckCurrentMesh);

// Quaternion from the rest direction (0, 0, -1) to the current head direction
const oNormal = new THREE.Vector3(0, 0, -1);
const neckAngle = oNormal.angleTo(mixNormal);
const neckCross = new THREE.Vector3().crossVectors(oNormal, mixNormal).normalize();
const neckQuaternion = new THREE.Quaternion();
// As with the shoulders, the original doubles the angle here
neckQuaternion.setFromAxisAngle(neckCross, neckAngle * 2);
boneObj['neck']['bone'].setRotationFromQuaternion(neckQuaternion);
```
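As noted in problem 4, the head keypoints are tiny and noisy, so the computed neck quaternion tends to jitter from frame to frame. The original post does not address this; one hedged option is to blend toward the new rotation with a slerp each frame instead of applying it directly:

```js
// Optional smoothing (not in the original): ease the bone toward the newly
// computed quaternion instead of snapping to it, to damp per-frame jitter.
function applyRotationSmoothed(bone, targetQuaternion, alpha = 0.3) {
  bone.quaternion.slerp(targetQuaternion, alpha); // alpha in (0, 1]: higher = snappier
}

// e.g. instead of boneObj['neck']['bone'].setRotationFromQuaternion(neckQuaternion):
// applyRotationSmoothed(boneObj['neck']['bone'], neckQuaternion, 0.3);
```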
Thanks again to 忠文老弟 on Zhihu.
Source: 纯手撸(js+网络摄像头)实现的丐版动捕
Summary
Nobody can do the learning for you, and you can't stop: keep picking up new knowledge. Next I plan to put together a version of my own and will publish it later, for learning purposes only.
All the major cloud platforms have corresponding image-recognition products; I looked at Alibaba Cloud's.
The outlook for Web 3D development keeps getting brighter.