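MediaPipe Series 16: Post-Processing Calculators (A Complete Guide to Parsing Model Output)

Introduction: why is post-processing needed?

16.1 Why post-processing matters

Raw inference output is not directly usable; it has to be post-processed into final results: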

┌────────────────────────────────────────────────────────────┐
│                Why post-processing matters                 │
├────────────────────────────────────────────────────────────┤
│                                                            │
│   Problem: how does model output become usable detections? │
│                                                            │
│   ┌──────────────────────────────────────────────────┐     │
│   │  Challenges:                                     │     │
│   │                                                  │     │
│   │  • Model output is raw numbers, not detections   │     │
│   │  • Box coordinates must be decoded               │     │
│   │  • Duplicates must be removed (NMS)              │     │
│   │  • Keypoints must be extracted                   │     │
│   │  • Convert results to an IMS-usable format       │     │
│   │                                                  │     │
│   └──────────────────────────────────────────────────┘     │
│                                                            │
│   Solution: post-processing Calculators                    │
│   ┌──────────────────────────────────────────────────┐     │
│   │                                                  │     │
│   │   1. Box decoding Calculator                     │     │
│   │      • Anchor decoding                           │     │
│   │      • Coordinate conversion                     │     │
│   │                                                  │     │
│   │   2. NMS Calculator                              │     │
│   │      • Non-maximum suppression                   │     │
│   │      • IoU computation                           │     │
│   │                                                  │     │
│   │   3. Keypoint parsing Calculator                 │     │
│   │      • Coordinate normalization                  │     │
│   │      • ROI mapping                               │     │
│   │                                                  │     │
│   │   4. Fatigue-detection post-processing Calculator│     │
│   │      • Eye-state parsing                         │     │
│   │      • EAR computation                           │     │
│   │                                                  │     │
│   └──────────────────────────────────────────────────┘     │
│                                                            │
└────────────────────────────────────────────────────────────┘

16.2 Common post-processing tasks

┌────────────────────────────────────────────────────────────┐
│                Common post-processing tasks                │
├────────────────────────────────────────────────────────────┤
│                                                            │
│   ┌──────────────────────────────────────────────────┐     │
│   │  1. Box decoding                                 │     │
│   │     Input:  [x_center, y_center, w, h]           │     │
│   │     Output: [x1, y1, x2, y2]                     │     │
│   └──────────────────────────────────────────────────┘     │
│                            │                               │
│                            ▼                               │
│   ┌──────────────────────────────────────────────────┐     │
│   │  2. NMS filtering                                │     │
│   │     Input:  multiple overlapping boxes           │     │
│   │     Output: de-duplicated boxes                  │     │
│   └──────────────────────────────────────────────────┘     │
│                            │                               │
│                            ▼                               │
│   ┌──────────────────────────────────────────────────┐     │
│   │  3. Keypoint parsing                             │     │
│   │     Input:  normalized coordinates in [0, 1]     │     │
│   │     Output: image coordinates in pixels          │     │
│   └──────────────────────────────────────────────────┘     │
│                            │                               │
│                            ▼                               │
│   ┌──────────────────────────────────────────────────┐     │
│   │  4. Classification parsing                       │     │
│   │     Input:  [class0, class1, ...]                │     │
│   │     Output: [class_id, score]                    │     │
│   └──────────────────────────────────────────────────┘     │
│                                                            │
└────────────────────────────────────────────────────────────┘

17. Post-processing pipeline

┌────────────────────────────────────────────────────────────┐
│                  Post-processing pipeline                  │
├────────────────────────────────────────────────────────────┤
│                                                            │
│   Model output tensors                                     │
│   ┌──────────────────────────────────────────────────┐     │
│   │  [batch, num_anchors, 4]            # boxes      │     │
│   │  [batch, num_anchors, num_classes]  # scores     │     │
│   │  [batch, num_landmarks, 2]          # keypoints  │     │
│   └──────────────────────────────────────────────────┘     │
│                            │                               │
│                            ▼                               │
│   ┌──────────────────────────────────────────────────┐     │
│   │  Step 1: box decoding (anchor decoding)          │     │
│   │     [x_center, y_center, w, h] → [x1, y1, x2, y2]│     │
│   └──────────────────────────────────────────────────┘     │
│                            │                               │
│                            ▼                               │
│   ┌──────────────────────────────────────────────────┐     │
│   │  Step 2: confidence filtering                    │     │
│   │     score < threshold → discard                  │     │
│   └──────────────────────────────────────────────────┘     │
│                            │                               │
│                            ▼                               │
│   ┌──────────────────────────────────────────────────┐     │
│   │  Step 3: NMS filtering                           │     │
│   │     IoU > threshold → suppress                   │     │
│   └──────────────────────────────────────────────────┘     │
│                            │                               │
│                            ▼                               │
│   ┌──────────────────────────────────────────────────┐     │
│   │  Step 4: keypoint parsing                        │     │
│   │     normalized coordinates → image coordinates   │     │
│   └──────────────────────────────────────────────────┘     │
│                            │                               │
│                            ▼                               │
│   Final detections                                         │
│   ┌──────────────────────────────────────────────────┐     │
│   │  [Detection, Detection, ...]                     │     │
│   └──────────────────────────────────────────────────┘     │
│                                                            │
└────────────────────────────────────────────────────────────┘
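
The calculators in this article read those tensors through a flat `float` pointer, so it helps to see how a 3-D index maps to an offset. A minimal sketch, assuming a dense row-major layout (`FlatIndex` is a hypothetical helper, not a MediaPipe function):

```cpp
#include <cstddef>

// Offset of element (n, a, k) in a dense row-major tensor of shape
// [batch, num_anchors, channels].
std::size_t FlatIndex(std::size_t n, std::size_t a, std::size_t k,
                      std::size_t num_anchors, std::size_t channels) {
  return (n * num_anchors + a) * channels + k;
}
```

With `n = 0` this reduces to `a * channels + k`, which is the `i * 4 + 0` style of indexing used for the first batch element.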

18. Box decoding Calculator

18.1 SSD anchor decoding

// detection_decode_calculator.h
#ifndef MEDIAPIPE_CALCULATORS_DETECTION_DETECTION_DECODE_CALCULATOR_H_
#define MEDIAPIPE_CALCULATORS_DETECTION_DETECTION_DECODE_CALCULATOR_H_

#include <algorithm>
#include <cmath>
#include <vector>

#include "mediapipe/framework/calculator_framework.h"
#include "mediapipe/framework/formats/detection.pb.h"
#include "mediapipe/framework/formats/image_frame.h"
#include "mediapipe/framework/formats/landmark.pb.h"
#include "mediapipe/framework/formats/tensor.h"

namespace mediapipe {

// ========== Proto Options ==========
/*
// proto2, because explicit [default = ...] values are not allowed in proto3.
// To be referenced as [mediapipe.DetectionDecodeOptions.ext] in a graph, the
// message also needs an `extend mediapipe.CalculatorOptions` block that
// declares `ext` with a unique field number.
syntax = "proto2";
package mediapipe;

message DetectionDecodeOptions {
  optional float score_threshold = 1 [default = 0.5];
  optional int32 num_classes = 2 [default = 1];

  enum AnchorFormat {
    SSD = 0;      // SSD format
    YOLO = 1;     // YOLO format
  }
  optional AnchorFormat anchor_format = 3 [default = SSD];

  // SSD anchor configuration
  message SSDAnchor {
    optional int32 num_layers = 1 [default = 6];
    optional int32 num_anchors_per_layer = 2 [default = 6];
    repeated float strides = 3;
    repeated float scales = 4;
    repeated float aspect_ratios = 5;
  }
  optional SSDAnchor ssd_anchor = 6;
}
*/

// ========== Anchor ==========
struct Anchor {
  float x_center;
  float y_center;
  float width;
  float height;
  float x_scale;
  float y_scale;
  float w_scale;
  float h_scale;
};

// ========== Box decoding Calculator ==========
class DetectionDecodeCalculator : public CalculatorBase {
 public:
  static absl::Status GetContract(CalculatorContract* cc) {
    cc->Inputs().Tag("TENSORS").Set<std::vector<Tensor>>();
    cc->Inputs().Tag("ANCHORS").Set<std::vector<Anchor>>();
    cc->Outputs().Tag("DETECTIONS").Set<std::vector<Detection>>();
    cc->Options<DetectionDecodeOptions>();
    return absl::OkStatus();
  }

  absl::Status Open(CalculatorContext* cc) override {
    const auto& options = cc->Options<DetectionDecodeOptions>();
    
    score_threshold_ = options.score_threshold();
    num_classes_ = options.num_classes();
    anchor_format_ = options.anchor_format();
    
    LOG(INFO) << "DetectionDecodeCalculator initialized: "
              << "threshold=" << score_threshold_
              << ", classes=" << num_classes_;
    
    return absl::OkStatus();
  }

  absl::Status Process(CalculatorContext* cc) override {
    if (cc->Inputs().Tag("TENSORS").IsEmpty() ||
        cc->Inputs().Tag("ANCHORS").IsEmpty()) {
      return absl::OkStatus();
    }

    const auto& tensors = cc->Inputs().Tag("TENSORS").Get<std::vector<Tensor>>();
    const auto& anchors = cc->Inputs().Tag("ANCHORS").Get<std::vector<Anchor>>();

    // Two tensors are required: boxes [batch, num_anchors, 4] and
    // scores [batch, num_anchors, num_classes].
    if (tensors.size() < 2) {
      return absl::InvalidArgumentError("Expected box and score tensors.");
    }

    // Keep the CPU read views alive while the raw pointers are in use.
    const auto box_view = tensors[0].GetCpuReadView();
    const auto score_view = tensors[1].GetCpuReadView();
    const float* box_data = box_view.buffer<float>();
    const float* score_data = score_view.buffer<float>();

    // Only the first batch element is decoded, and never more boxes than
    // there are anchors.
    const int num_anchors = std::min(tensors[0].shape().dims[1],
                                     static_cast<int>(anchors.size()));
    VLOG(1) << "Decoding " << num_anchors << " anchors";

    std::vector<Detection> detections;
    for (int i = 0; i < num_anchors; ++i) {
      // ========== 1. Find the best class score ==========
      int max_class = 0;
      float max_score = score_data[i * num_classes_];
      for (int c = 1; c < num_classes_; ++c) {
        const float score = score_data[i * num_classes_ + c];
        if (score > max_score) {
          max_score = score;
          max_class = c;
        }
      }

      // ========== 2. Confidence filter ==========
      if (max_score < score_threshold_) {
        continue;
      }

      // ========== 3. Decode the box (SSD format) ==========
      // Raw layout per anchor: [dy, dx, dh, dw]. Center offsets are scaled by
      // the anchor size; sizes are decoded exponentially.
      const Anchor& anchor = anchors[i];
      const float* raw = box_data + i * 4;
      const float y_center =
          raw[0] / anchor.y_scale * anchor.height + anchor.y_center;
      const float x_center =
          raw[1] / anchor.x_scale * anchor.width + anchor.x_center;
      const float h = std::exp(raw[2] / anchor.h_scale) * anchor.height;
      const float w = std::exp(raw[3] / anchor.w_scale) * anchor.width;

      // ========== 4. Build the detection ==========
      // NOTE: the flat accessors (set_xmin, set_score, ...) are a
      // simplification used throughout this article. MediaPipe's Detection
      // proto keeps the box in location_data().relative_bounding_box() and
      // stores scores and label ids as repeated fields.
      Detection det;
      det.set_xmin(x_center - w / 2);
      det.set_ymin(y_center - h / 2);
      det.set_xmax(x_center + w / 2);
      det.set_ymax(y_center + h / 2);
      det.set_score(max_score);
      det.set_label_id(max_class);

      detections.push_back(det);
    }

    VLOG(1) << "Decoded " << detections.size() << " detections";

    cc->Outputs().Tag("DETECTIONS").AddPacket(
        MakePacket<std::vector<Detection>>(detections).At(cc->InputTimestamp()));

    return absl::OkStatus();
  }

 private:
  float score_threshold_ = 0.5f;
  int num_classes_ = 1;
  DetectionDecodeOptions::AnchorFormat anchor_format_ = 
      DetectionDecodeOptions::SSD;
};

REGISTER_CALCULATOR(DetectionDecodeCalculator);

}  // namespace mediapipe

#endif  // MEDIAPIPE_CALCULATORS_DETECTION_DETECTION_DECODE_CALCULATOR_H_
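
To make the SSD arithmetic easier to check by hand, here is a standalone sketch of the per-anchor decode with no MediaPipe dependency. It assumes the `[dy, dx, dh, dw]` raw layout used above, one scale shared by both center offsets and one shared by both sizes; `AnchorBox`, `Box` and `DecodeSsdBox` are hypothetical names for this example only.

```cpp
#include <cmath>

struct AnchorBox { float x_center, y_center, width, height; };
struct Box { float xmin, ymin, xmax, ymax; };

// Decodes one raw SSD prediction [dy, dx, dh, dw] against its anchor.
// Center offsets are scaled by the anchor size; sizes are exponential.
Box DecodeSsdBox(const float raw[4], const AnchorBox& a,
                 float xy_scale, float wh_scale) {
  const float yc = raw[0] / xy_scale * a.height + a.y_center;
  const float xc = raw[1] / xy_scale * a.width + a.x_center;
  const float h = std::exp(raw[2] / wh_scale) * a.height;
  const float w = std::exp(raw[3] / wh_scale) * a.width;
  return {xc - w / 2.0f, yc - h / 2.0f, xc + w / 2.0f, yc + h / 2.0f};
}
```

An all-zero prediction reproduces the anchor itself: with an anchor centered at (0.5, 0.5) of size 0.2 x 0.4, the decoded box is roughly (0.4, 0.3) to (0.6, 0.7).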

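18.2 YOLO format decoding

// YOLO decoding example. Assumes an output tensor of shape
// [batch, num_anchors, 5 + num_classes] whose rows are
// [x, y, w, h, conf, class scores...].
absl::Status Process(CalculatorContext* cc) override {
  const auto& tensors = cc->Inputs().Tag("TENSORS").Get<std::vector<Tensor>>();
  // Keep the read view alive while the raw pointer is in use.
  const auto view = tensors[0].GetCpuReadView();
  const float* data = view.buffer<float>();

  const int num_anchors = tensors[0].shape().dims[1];
  const int stride = tensors[0].shape().dims[2];
  const int num_classes = stride - 5;  // 5 = x, y, w, h, conf
  if (num_classes < 1) {
    return absl::InvalidArgumentError("Expected at least one class score.");
  }

  std::vector<Detection> detections;
  for (int i = 0; i < num_anchors; ++i) {
    const float* row = data + i * stride;
    const float x = row[0];
    const float y = row[1];
    const float w = row[2];
    const float h = row[3];
    const float conf = row[4];

    // Find the best class.
    int max_class = 0;
    float max_class_score = row[5];
    for (int c = 1; c < num_classes; ++c) {
      if (row[5 + c] > max_class_score) {
        max_class_score = row[5 + c];
        max_class = c;
      }
    }

    // Final score = objectness confidence * best class score.
    const float score = conf * max_class_score;
    if (score < score_threshold_) continue;

    // Convert to xmin, ymin, xmax, ymax.
    Detection det;
    det.set_xmin(x - w / 2);
    det.set_ymin(y - h / 2);
    det.set_xmax(x + w / 2);
    det.set_ymax(y + h / 2);
    det.set_score(score);
    det.set_label_id(max_class);

    detections.push_back(det);
  }

  cc->Outputs().Tag("DETECTIONS").AddPacket(
      MakePacket<std::vector<Detection>>(detections).At(cc->InputTimestamp()));
  return absl::OkStatus();
}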

19. NMS Calculator

19.1 Non-maximum suppression implementation

// nms_calculator.h
#ifndef MEDIAPIPE_CALCULATORS_DETECTION_NMS_CALCULATOR_H_
#define MEDIAPIPE_CALCULATORS_DETECTION_NMS_CALCULATOR_H_

#include <algorithm>
#include <cmath>
#include <numeric>
#include <vector>

#include "mediapipe/framework/calculator_framework.h"
#include "mediapipe/framework/formats/detection.pb.h"

namespace mediapipe {

// ========== NMS Calculator ==========
class NMSCalculator : public CalculatorBase {
 public:
  static absl::Status GetContract(CalculatorContract* cc) {
    cc->Inputs().Tag("DETECTIONS").Set<std::vector<Detection>>();
    cc->Outputs().Tag("DETECTIONS").Set<std::vector<Detection>>();
    cc->Options<NMSOptions>();
    return absl::OkStatus();
  }

  absl::Status Open(CalculatorContext* cc) override {
    const auto& options = cc->Options<NMSOptions>();
    
    iou_threshold_ = options.iou_threshold();
    max_detections_ = options.max_detections();
    sort_by_ = options.sort_by();
    
    LOG(INFO) << "NMSCalculator initialized: "
              << "iou_threshold=" << iou_threshold_
              << ", max_detections=" << max_detections_;
    
    return absl::OkStatus();
  }

  absl::Status Process(CalculatorContext* cc) override {
    if (cc->Inputs().Tag("DETECTIONS").IsEmpty()) {
      return absl::OkStatus();
    }

    const auto& detections = 
        cc->Inputs().Tag("DETECTIONS").Get<std::vector<Detection>>();

    if (detections.empty()) {
      return absl::OkStatus();
    }

    // ========== 1. Sort (by score by default) ==========
    std::vector<int> indices(detections.size());
    std::iota(indices.begin(), indices.end(), 0);
    
    switch (sort_by_) {
      case NMSOptions::SCORE:
        std::sort(indices.begin(), indices.end(),
                  [&detections](int a, int b) {
                    return detections[a].score() > detections[b].score();
                  });
        break;
      case NMSOptions::AREA:
        std::sort(indices.begin(), indices.end(),
                  [&detections](int a, int b) {
                    float area_a = (detections[a].xmax() - detections[a].xmin()) *
                                   (detections[a].ymax() - detections[a].ymin());
                    float area_b = (detections[b].xmax() - detections[b].xmin()) *
                                   (detections[b].ymax() - detections[b].ymin());
                    return area_a > area_b;
                  });
        break;
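      case NMSOptions::NONE:
        // Keep the input order.
        break;
    }

    // ========== 2. NMS filtering ==========
    std::vector<bool> suppressed(detections.size(), false);
    std::vector<Detection> result;

    for (size_t i = 0; i < indices.size(); ++i) {
      const int idx = indices[i];
      if (suppressed[idx]) continue;

      // Keep this detection.
      result.push_back(detections[idx]);

      // Cap the number of results.
      if (static_cast<int>(result.size()) >= max_detections_) {
        break;
      }

      // Suppress lower-ranked detections that overlap it too much. Starting
      // at i + 1 avoids comparing the detection with itself.
      for (size_t j = i + 1; j < indices.size(); ++j) {
        const int other = indices[j];
        if (suppressed[other]) continue;

        const float iou = ComputeIOU(detections[idx], detections[other]);
        if (iou > iou_threshold_) {
          suppressed[other] = true;
        }
      }
    }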

    LOG(INFO) << "NMS: " << detections.size() << " → " << result.size();

    cc->Outputs().Tag("DETECTIONS").AddPacket(
        MakePacket<std::vector<Detection>>(result).At(cc->InputTimestamp()));

    return absl::OkStatus();
  }

 private:
  float iou_threshold_ = 0.45f;
  int max_detections_ = 100;
  NMSOptions::SortBy sort_by_ = NMSOptions::SCORE;
  
  // ========== IoU computation ==========
  float ComputeIOU(const Detection& a, const Detection& b) {
    // Intersection rectangle
    float x1 = std::max(a.xmin(), b.xmin());
    float y1 = std::max(a.ymin(), b.ymin());
    float x2 = std::min(a.xmax(), b.xmax());
    float y2 = std::min(a.ymax(), b.ymax());
    
    float intersection = std::max(0.0f, x2 - x1) * std::max(0.0f, y2 - y1);
    
    // Areas of the two boxes
    float area_a = (a.xmax() - a.xmin()) * (a.ymax() - a.ymin());
    float area_b = (b.xmax() - b.xmin()) * (b.ymax() - b.ymin());
    
    // Union
    float union_area = area_a + area_b - intersection;
    
    // IoU; degenerate boxes have no meaningful overlap
    if (union_area <= 0.0f) return 0.0f;
    return intersection / union_area;
  }
  
  // ========== Soft-NMS (optional) ==========
  // Instead of discarding overlapping boxes, decay their scores by
  // exp(-iou^2 / sigma). Returns the rescored detections that stay at or
  // above score_threshold, in selection order.
  std::vector<Detection> SoftNMS(std::vector<Detection> detections,
                                 float iou_threshold,
                                 float sigma = 0.5f,
                                 float score_threshold = 0.5f) {
    std::vector<Detection> result;
    while (!detections.empty()) {
      // Pick the remaining detection with the highest (decayed) score.
      auto best = std::max_element(
          detections.begin(), detections.end(),
          [](const Detection& a, const Detection& b) {
            return a.score() < b.score();
          });
      if (best->score() < score_threshold) break;

      const Detection top = *best;
      detections.erase(best);
      result.push_back(top);

      // Decay the scores of the boxes that overlap the selected one.
      for (auto& det : detections) {
        const float iou = ComputeIOU(top, det);
        if (iou > iou_threshold) {
          det.set_score(det.score() * std::exp(-(iou * iou) / sigma));
        }
      }
    }
    return result;
  }
};

REGISTER_CALCULATOR(NMSCalculator);

}  // namespace mediapipe

#endif  // MEDIAPIPE_CALCULATORS_DETECTION_NMS_CALCULATOR_H_

20. Keypoint parsing Calculator

20.1 Face landmark parsing

// landmark_decode_calculator.h
#ifndef MEDIAPIPE_CALCULATORS_DETECTION_LANDMARK_DECODE_CALCULATOR_H_
#define MEDIAPIPE_CALCULATORS_DETECTION_LANDMARK_DECODE_CALCULATOR_H_

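#include <algorithm>
#include <vector>

#include "mediapipe/framework/calculator_framework.h"
#include "mediapipe/framework/formats/detection.pb.h"
#include "mediapipe/framework/formats/image_frame.h"
#include "mediapipe/framework/formats/landmark.pb.h"
#include "mediapipe/framework/formats/rect.pb.h"
#include "mediapipe/framework/formats/tensor.h"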

namespace mediapipe {

// ========== Proto Options ==========
/*
// proto2, because explicit [default = ...] values are not allowed in proto3.
syntax = "proto2";
package mediapipe;

message LandmarkDecodeOptions {
  optional int32 num_landmarks = 1 [default = 468];  // FaceMesh
  optional int32 input_width = 2 [default = 320];
  optional int32 input_height = 3 [default = 240];
  optional bool normalize = 4 [default = true];
}
*/

// ========== Landmark decoding Calculator ==========
class LandmarkDecodeCalculator : public CalculatorBase {
 public:
  static absl::Status GetContract(CalculatorContract* cc) {
    cc->Inputs().Tag("TENSORS").Set<std::vector<Tensor>>();
    cc->Inputs().Tag("ROIS").Set<std::vector<Rect>>();
    cc->Outputs().Tag("LANDMARKS").Set<std::vector<LandmarkList>>();
    cc->Options<LandmarkDecodeOptions>();
    return absl::OkStatus();
  }

  absl::Status Open(CalculatorContext* cc) override {
    const auto& options = cc->Options<LandmarkDecodeOptions>();
    
    num_landmarks_ = options.num_landmarks();
    input_width_ = options.input_width();
    input_height_ = options.input_height();
    normalize_ = options.normalize();
    
    LOG(INFO) << "LandmarkDecodeCalculator initialized: "
              << "num_landmarks=" << num_landmarks_
              << ", input_size=" << input_width_ << "x" << input_height_;
    
    return absl::OkStatus();
  }

  absl::Status Process(CalculatorContext* cc) override {
    if (cc->Inputs().Tag("TENSORS").IsEmpty()) {
      return absl::OkStatus();
    }

    if (cc->Inputs().Tag("ROIS").IsEmpty()) {
      return absl::OkStatus();
    }

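    const auto& tensors = cc->Inputs().Tag("TENSORS").Get<std::vector<Tensor>>();
    const auto& rois = cc->Inputs().Tag("ROIS").Get<std::vector<Rect>>();

    // Keep the read view alive while the raw pointer is in use.
    const auto view = tensors[0].GetCpuReadView();
    const float* landmark_data = view.buffer<float>();

    // Assumes a flattened layout [num_faces, num_landmarks * 2] of (x, y)
    // pairs. Never read more faces than the tensor holds.
    const auto& dims = tensors[0].shape().dims;
    int num_faces = std::min(static_cast<int>(rois.size()), dims[0]);
    int num_landmark_pairs = dims[1] / 2;

    std::vector<LandmarkList> all_landmarks;

    for (int f = 0; f < num_faces; ++f) {
      const Rect& roi = rois[f];
      // Rect stores its center, so derive the top-left corner first.
      const float roi_left = roi.x_center() - roi.width() / 2.0f;
      const float roi_top = roi.y_center() - roi.height() / 2.0f;

      LandmarkList landmarks;

      for (int l = 0; l < num_landmarks_; ++l) {
        if (l >= num_landmark_pairs) break;

        // Coordinates as produced by the model
        float x_norm = landmark_data[f * num_landmark_pairs * 2 + l * 2 + 0];
        float y_norm = landmark_data[f * num_landmark_pairs * 2 + l * 2 + 1];

        Landmark landmark;

        if (normalize_) {
          // ROI-normalized coordinates -> image coordinates
          landmark.set_x(x_norm * roi.width() + roi_left);
          landmark.set_y(y_norm * roi.height() + roi_top);
        } else {
          // Already in pixel coordinates
          landmark.set_x(x_norm);
          landmark.set_y(y_norm);
        }

        landmarks.add_landmark()->CopyFrom(landmark);
      }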
      
      all_landmarks.push_back(landmarks);
    }

    cc->Outputs().Tag("LANDMARKS").AddPacket(
        MakePacket<std::vector<LandmarkList>>(all_landmarks)
            .At(cc->InputTimestamp()));

    return absl::OkStatus();
  }

 private:
  int num_landmarks_ = 468;
  int input_width_ = 320;
  int input_height_ = 240;
  bool normalize_ = true;
};

REGISTER_CALCULATOR(LandmarkDecodeCalculator);

}  // namespace mediapipe

#endif  // MEDIAPIPE_CALCULATORS_DETECTION_LANDMARK_DECODE_CALCULATOR_H_
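
The ROI mapping is easy to get wrong when the ROI is described by its center, so here is a standalone sketch of that step. It assumes landmarks normalized to [0, 1] inside the ROI and, going slightly beyond the calculator above, an optional in-plane rotation in radians. `Roi`, `Point` and `RoiToImage` are hypothetical names for this example.

```cpp
#include <cmath>

struct Roi { float x_center, y_center, width, height, rotation; };
struct Point { float x, y; };

// Maps a landmark normalized to [0, 1] inside the ROI to image coordinates.
// The point is first expressed relative to the ROI center, then rotated by
// roi.rotation (radians), then translated to the ROI center.
Point RoiToImage(const Point& norm, const Roi& roi) {
  const float dx = (norm.x - 0.5f) * roi.width;
  const float dy = (norm.y - 0.5f) * roi.height;
  const float c = std::cos(roi.rotation);
  const float s = std::sin(roi.rotation);
  return {roi.x_center + dx * c - dy * s, roi.y_center + dx * s + dy * c};
}
```

For an unrotated 40 x 20 ROI centered at (100, 80), the normalized corners (0, 0) and (1, 1) land on (80, 70) and (120, 90).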

21. Segmentation mask parsing

21.1 Example implementation

// segmentation_decode_calculator.h
class SegmentationDecodeCalculator : public CalculatorBase {
 public:
  static absl::Status GetContract(CalculatorContract* cc) {
    cc->Inputs().Tag("TENSORS").Set<std::vector<Tensor>>();
    cc->Outputs().Tag("MASKS").Set<std::vector<std::vector<float>>>();
    cc->Options<SegmentationDecodeOptions>();
    return absl::OkStatus();
  }

  absl::Status Process(CalculatorContext* cc) override {
    if (cc->Inputs().Tag("TENSORS").IsEmpty()) {
      return absl::OkStatus();
    }

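    const auto& tensors = cc->Inputs().Tag("TENSORS").Get<std::vector<Tensor>>();
    // Keep the read view alive while the raw pointer is in use.
    const auto view = tensors[0].GetCpuReadView();
    const float* mask_data = view.buffer<float>();

    // Assumes an NCHW tensor: [batch, num_masks, height, width].
    const auto& dims = tensors[0].shape().dims;
    if (dims.size() != 4) {
      return absl::InvalidArgumentError("Expected a 4-D mask tensor.");
    }
    int num_masks = dims[1];
    int mask_height = dims[2];
    int mask_width = dims[3];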
    
    std::vector<std::vector<float>> masks;
    
    for (int m = 0; m < num_masks; ++m) {
      std::vector<float> mask(mask_height * mask_width);
      
      const float* mask_ptr = mask_data + m * mask_height * mask_width;
      std::memcpy(mask.data(), mask_ptr, mask.size() * sizeof(float));
      
      masks.push_back(mask);
    }

    cc->Outputs().Tag("MASKS").AddPacket(
        MakePacket<std::vector<std::vector<float>>>(masks)
            .At(cc->InputTimestamp()));

    return absl::OkStatus();
  }
};

REGISTER_CALCULATOR(SegmentationDecodeCalculator);
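
The calculator above only copies the float masks out of the tensor. A common next step, shown here as a sketch rather than as part of the calculator, is to threshold each mask into a binary one. `BinarizeMask` is a hypothetical helper, and the threshold value is up to the caller.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Turns a float mask into a binary mask: 1 where the value is at or above
// the threshold, 0 elsewhere.
std::vector<std::uint8_t> BinarizeMask(const std::vector<float>& mask,
                                       float threshold) {
  std::vector<std::uint8_t> out(mask.size());
  for (std::size_t i = 0; i < mask.size(); ++i) {
    out[i] = mask[i] >= threshold ? 1 : 0;
  }
  return out;
}
```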

22. IMS in practice: fatigue-detection post-processing

22.1 Eye-state parsing

// eye_state_decode_calculator.h
#ifndef MEDIAPIPE_CALCULATORS_FATIGUE_EYE_STATE_DECODE_CALCULATOR_H_
#define MEDIAPIPE_CALCULATORS_FATIGUE_EYE_STATE_DECODE_CALCULATOR_H_

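#include "mediapipe/framework/calculator_framework.h"
#include "mediapipe/framework/formats/detection.pb.h"
#include "mediapipe/framework/formats/tensor.h"

namespace mediapipe {

// ========== Eye-state message ==========
// This is a protobuf definition, not C++. It belongs in its own .proto file,
// and the generated header has to be included here.
/*
syntax = "proto3";
package mediapipe;

message EyeState {
  bool left_eye_open = 1;
  bool right_eye_open = 2;
  float left_score = 3;
  float right_score = 4;
  float ear_left = 5;
  float ear_right = 6;
}
*/

// ========== Eye-state decoding Calculator ==========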
class EyeStateDecodeCalculator : public CalculatorBase {
 public:
  static absl::Status GetContract(CalculatorContract* cc) {
    cc->Inputs().Tag("TENSORS").Set<std::vector<Tensor>>();
    cc->Outputs().Tag("EYE_STATE").Set<EyeState>();
    cc->Options<EyeStateDecodeOptions>();
    return absl::OkStatus();
  }

  absl::Status Open(CalculatorContext* cc) override {
    const auto& options = cc->Options<EyeStateDecodeOptions>();
    
    ear_threshold_ = options.ear_threshold();
    
    LOG(INFO) << "EyeStateDecodeCalculator initialized: "
              << "ear_threshold=" << ear_threshold_;
    
    return absl::OkStatus();
  }

  absl::Status Process(CalculatorContext* cc) override {
    if (cc->Inputs().Tag("TENSORS").IsEmpty()) {
      return absl::OkStatus();
    }

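    const auto& tensors = cc->Inputs().Tag("TENSORS").Get<std::vector<Tensor>>();
    // Keep the read view alive while the raw pointer is in use.
    const auto view = tensors[0].GetCpuReadView();
    const float* data = view.buffer<float>();

    // Model output format:
    // [left_eye_open_score, right_eye_open_score]
    // or
    // [left_ear, right_ear, left_pupil_ratio, right_pupil_ratio]
    
    float left_score = data[0];
    float right_score = data[1];
    
    // Decide whether each eye is open.
    // Note: ear_threshold_ is loaded in Open() but not applied here.
    bool left_eye_open = left_score > 0.5f;
    bool right_eye_open = right_score > 0.5f;
    
    // EAR (Eye Aspect Ratio), read from the second tensor when present
    float ear_left = 0.0f;
    float ear_right = 0.0f;
    
    if (tensors.size() > 1) {
      const auto ear_view = tensors[1].GetCpuReadView();
      const float* ear_data = ear_view.buffer<float>();
      ear_left = ear_data[0];
      ear_right = ear_data[1];
    }
    
    // Build the eye state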
    EyeState state;
    state.set_left_eye_open(left_eye_open);
    state.set_right_eye_open(right_eye_open);
    state.set_left_score(left_score);
    state.set_right_score(right_score);
    state.set_ear_left(ear_left);
    state.set_ear_right(ear_right);
    
    LOG(INFO) << "Eye state: left=" << left_eye_open
              << " (" << left_score << "), right=" << right_eye_open
              << " (" << right_score << ")"
              << ", EAR: " << ear_left << ", " << ear_right;
    
    cc->Outputs().Tag("EYE_STATE").AddPacket(
        MakePacket<EyeState>(state).At(cc->InputTimestamp()));

    return absl::OkStatus();
  }

 private:
  float ear_threshold_ = 0.2f;
};

REGISTER_CALCULATOR(EyeStateDecodeCalculator);

}  // namespace mediapipe

#endif  // MEDIAPIPE_CALCULATORS_FATIGUE_EYE_STATE_DECODE_CALCULATOR_H_
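
The calculator above takes EAR values straight from the model. When only eye landmarks are available, EAR is commonly computed from six points per eye as (|p2 - p6| + |p3 - p5|) / (2 |p1 - p4|). The sketch below assumes that six-point convention, with the points ordered p1 to p6 around the eye; `Point2` and `EyeAspectRatio` are hypothetical names, and which landmark indices to feed in depends on the landmark model.

```cpp
#include <array>
#include <cmath>

struct Point2 { float x, y; };

float Distance(const Point2& a, const Point2& b) {
  return std::hypot(a.x - b.x, a.y - b.y);
}

// Eye Aspect Ratio from six eye landmarks ordered p1..p6:
//   p[0], p[3]: horizontal eye corners
//   p[1], p[2]: upper lid;  p[5], p[4]: the lower-lid points below them
// Returns 0 for a degenerate eye with zero width.
float EyeAspectRatio(const std::array<Point2, 6>& p) {
  const float horizontal = Distance(p[0], p[3]);
  if (horizontal == 0.0f) return 0.0f;
  const float vertical = Distance(p[1], p[5]) + Distance(p[2], p[4]);
  return vertical / (2.0f * horizontal);
}
```

The ratio drops toward 0 as the lids close, which is why a threshold such as the `ear_threshold` option (0.2 by default in this article) can separate open from closed eyes.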

22.2 Head-pose parsing

// head_pose_decode_calculator.h
class HeadPoseDecodeCalculator : public CalculatorBase {
 public:
  static absl::Status GetContract(CalculatorContract* cc) {
    cc->Inputs().Tag("TENSORS").Set<std::vector<Tensor>>();
    cc->Outputs().Tag("HEAD_POSE").Set<HeadPose>();
    cc->Options<HeadPoseDecodeOptions>();
    return absl::OkStatus();
  }

  absl::Status Process(CalculatorContext* cc) override {
    if (cc->Inputs().Tag("TENSORS").IsEmpty()) {
      return absl::OkStatus();
    }

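    const auto& tensors = cc->Inputs().Tag("TENSORS").Get<std::vector<Tensor>>();
    // Keep the read view alive while the raw pointer is in use.
    const auto view = tensors[0].GetCpuReadView();
    const float* data = view.buffer<float>();

    // Model output: [pitch, yaw, roll]
    float pitch = data[0];
    float yaw = data[1];
    float roll = data[2];

    // Build the head pose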
    HeadPose pose;
    pose.set_pitch(pitch);
    pose.set_yaw(yaw);
    pose.set_roll(roll);

    cc->Outputs().Tag("HEAD_POSE").AddPacket(
        MakePacket<HeadPose>(pose).At(cc->InputTimestamp()));

    return absl::OkStatus();
  }
};

REGISTER_CALCULATOR(HeadPoseDecodeCalculator);

23. Complete post-processing pipeline

23.1 Graph configuration

# face_detection_with_landmarks_graph.pbtxt
# Assumed to be registered as the subgraph "FaceDetectionGraph".
type: "FaceDetectionGraph"

# ========== Inputs and outputs ==========
input_stream: "IMAGE:image"
output_stream: "DETECTIONS:faces"
output_stream: "LANDMARKS:landmarks"

# The inference nodes that turn "image" into raw_tensors, anchors and
# landmark_tensors are not shown here.

# ========== Step 1: box decoding ==========
node {
  calculator: "DetectionDecodeCalculator"
  input_stream: "TENSORS:raw_tensors"
  input_stream: "ANCHORS:anchors"
  output_stream: "DETECTIONS:raw_detections"
  options {
    [mediapipe.DetectionDecodeOptions.ext] {
      score_threshold: 0.5
      num_classes: 1
    }
  }
}

# ========== Step 2: NMS filtering ==========
node {
  calculator: "NMSCalculator"
  input_stream: "DETECTIONS:raw_detections"
  output_stream: "DETECTIONS:faces"
  options {
    [mediapipe.NMSOptions.ext] {
      iou_threshold: 0.45
      max_detections: 50
      sort_by: SCORE
    }
  }
}

# ========== Step 3: ROI extraction ==========
node {
  calculator: "RoiFromDetectionCalculator"
  input_stream: "DETECTIONS:faces"
  output_stream: "ROIS:rois"
}

# ========== Step 4: landmark decoding ==========
node {
  calculator: "LandmarkDecodeCalculator"
  input_stream: "TENSORS:landmark_tensors"
  input_stream: "ROIS:rois"
  output_stream: "LANDMARKS:landmarks"
  options {
    [mediapipe.LandmarkDecodeOptions.ext] {
      num_landmarks: 468
      input_width: 320
      input_height: 240
      normalize: true
    }
  }
}

# ---------- Parent graph (a separate .pbtxt file) ----------

# ========== Flow limiting ==========
# "faces" is fed back as the FINISHED signal so that a new frame is only
# admitted once the previous one has been processed.
node {
  calculator: "FlowLimiterCalculator"
  input_stream: "image"
  input_stream: "FINISHED:faces"
  input_stream_info: { tag_index: "FINISHED" back_edge: true }
  output_stream: "throttled_image"
}

# ========== Pipeline ==========
node {
  calculator: "FaceDetectionGraph"
  input_stream: "IMAGE:throttled_image"
  output_stream: "DETECTIONS:faces"
  output_stream: "LANDMARKS:landmarks"
}

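24. Summary

Post-processing task    Calculator                  Purpose
Box decoding            DetectionDecodeCalculator   Anchor decoding
NMS filtering           NMSCalculator               De-duplication
Keypoint parsing        LandmarkDecodeCalculator    Coordinate conversion
Eye state               EyeStateDecodeCalculator    Fatigue detection
Head pose               HeadPoseDecodeCalculator    Pose estimation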

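Next up

MediaPipe Series 17: Data Aggregation Calculators (Multi-Stream Synchronization)

An in-depth look at how to synchronize multiple input streams, aggregate data, and handle time-series data.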

参考资料

Google AI Edge. MediaPipe Detection Calculators
Redwood AI. YOLOv5 Post-Processing
Google AI Edge. MediaPipe Landmark Calculators

Series progress: 16/55
Last updated: 2026-03-12


https://dapalm.com/2026/03/13/MediaPipe系列16-后处理Calculator：解析模型输出/

Author: Mars

Published: March 13, 2026
