MediaPipe Series 10: A Complete Guide to the Threading Model and Scheduling Policies

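Preface: Why Understand the Threading Model?

10.1 Why the Threading Model Matters

Efficient thread scheduling is at the core of MediaPipe performance tuning: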

┌─────────────────────────────────────────────────────────────────┐
│                    Why the Threading Model Matters              │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   Problem: running a complex perception pipeline efficiently    │
│                                                                 │
│   ┌───────────────────────────────────────────────────────┐     │
│   │  Challenges:                                          │     │
│   │                                                       │     │
│   │  • How do multiple Calculators run in parallel?       │     │
│   │  • How do we avoid blocking and deadlock?             │     │
│   │  • How do we balance CPU and GPU load?                │     │
│   │  • How do we guarantee real-time behavior?            │     │
│   │  • How do we synchronize multiple inputs?             │     │
│   │                                                       │     │
│   └───────────────────────────────────────────────────────┘     │
│                                                                 │
│   Solution: the threading model and scheduling policies         │
│   ┌───────────────────────────────────────────────────────┐     │
│   │                                                       │     │
│   │  • Executor: runs Calculators (owns the threads)      │     │
│   │  • ThreadPool: the thread pool implementation         │     │
│   │  • Scheduler: decides execution order                 │     │
│   │  • Input Stream Handler: decides when inputs are ready│     │
│   │                                                       │     │
│   └───────────────────────────────────────────────────────┘     │
│                                                                 │
│   IMS DMS in practice:                                          │
│   ┌───────────────────────────────────────────────────────┐     │
│   │                                                       │     │
│   │  Tasks running concurrently:                          │     │
│   │  • Face detection (GPU, high priority)                │     │
│   │  • Landmark extraction (GPU, high priority)           │     │
│   │  • Head pose (CPU, medium priority)                   │     │
│   │  • Fatigue scoring (CPU, low priority)                │     │
│   │  • Logging (I/O, lowest priority)                     │     │
│   │                                                       │     │
│   │  Real-time behavior depends on sound thread scheduling│     │
│   │                                                       │     │
│   └───────────────────────────────────────────────────────┘     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

10.2 Core Concepts at a Glance

┌─────────────────────────────────────────────────────────────────┐
│                       Core Concepts at a Glance                 │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   ┌───────────────────────────────────────────────────────┐     │
│   │  Graph                                                │     │
│   │                                                       │     │
│   │  Holds the Calculators and the Scheduler              │     │
│   │                                                       │     │
│   └───────────────────────────────────────────────────────┘     │
│                          │                                      │
│                          ▼                                      │
│   ┌───────────────────────────────────────────────────────┐     │
│   │  Scheduler                                            │     │
│   │                                                       │     │
│   │  • Detects Calculators that are ready to run          │     │
│   │  • Adds them to a task queue                          │     │
│   │  • Hands them to an Executor to run                   │     │
│   │                                                       │     │
│   └───────────────────────────────────────────────────────┘     │
│                          │                                      │
│                          ▼                                      │
│   ┌───────────────────────────────────────────────────────┐     │
│   │  Executor                                             │     │
│   │                                                       │     │
│   │  • ThreadPoolExecutor: thread pool (default)          │     │
│   │  • ApplicationThreadExecutor: the caller's thread     │     │
│   │  • Custom Executor: user-defined                      │     │
│   │                                                       │     │
│   └───────────────────────────────────────────────────────┘     │
│                          │                                      │
│                          ▼                                      │
│   ┌───────────────────────────────────────────────────────┐     │
│   │  Input Stream Handler                                 │     │
│   │                                                       │     │
│   │  • SyncSetInputStreamHandler: timestamp-aligned       │     │
│   │  • ImmediateInputStreamHandler: no alignment          │     │
│   │  • BarrierInputStreamHandler: one packet per input    │     │
│   │                                                       │     │
│   └───────────────────────────────────────────────────────┘     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

11. Executors in Detail

11.1 Executor Types

┌─────────────────────────────────────────────────────────────────┐
│                          Executor Types                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   1. ThreadPoolExecutor (thread pool), the default              │
│      ┌────────────────────────────────────────────────────┐     │
│      │   • Runs Calculators on multiple threads           │     │
│      │   • Configurable thread count                      │     │
│      │   • Suited to CPU-bound work                       │     │
│      │                                                    │     │
│      │   Configuration:                                   │     │
│      │   executor {                                       │     │
│      │     name: "default_executor"                       │     │
│      │     type: "ThreadPoolExecutor"                     │     │
│      │     options {                                      │     │
│      │       [mediapipe.ThreadPoolExecutorOptions.ext] {  │     │
│      │         num_threads: 4                             │     │
│      │       }                                            │     │
│      │     }                                              │     │
│      │   }                                                │     │
│      └────────────────────────────────────────────────────┘     │
│                                                                 │
│   2. ApplicationThreadExecutor (application thread)             │
│      ┌────────────────────────────────────────────────────┐     │
│      │   • Runs on the thread that drives the graph       │     │
│      │   • Single-threaded                                │     │
│      │   • Suited to UI update tasks                      │     │
│      │                                                    │     │
│      │   Configuration (default executor, name ""):       │     │
│      │   executor {                                       │     │
│      │     name: ""                                       │     │
│      │     type: "ApplicationThreadExecutor"              │     │
│      │   }                                                │     │
│      └────────────────────────────────────────────────────┘     │
│                                                                 │
│   3. Custom Executor (user-defined)                             │
│      ┌────────────────────────────────────────────────────┐     │
│      │   • User-defined execution logic                   │     │
│      │   • Can implement a specific scheduling policy     │     │
│      │   • Suited to special hardware (DSP, NPU)          │     │
│      │                                                    │     │
│      │   Requires implementing a custom Executor class    │     │
│      └────────────────────────────────────────────────────┘     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

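11.2 Executor Configuration in Detail

# ========== Executor configuration in detail ==========

# ========== Default Executor (no explicit configuration needed) ==========
# Defaults to a thread pool whose size is derived from the system's
# capabilities (typically the number of CPU cores)

# ========== Custom thread-pool Executor ==========
executor {
  name: "cpu_executor"
  type: "ThreadPoolExecutor"
  options {
    [mediapipe.ThreadPoolExecutorOptions.ext] {
      # Number of worker threads
      num_threads: 4

      # Stack size in bytes (may not be honored on every platform)
      stack_size: 32768

      # OS nice level of the worker threads; 0 is the default and a
      # lower value means a higher priority
      nice_priority_level: 0

      # Best-effort hint for the kind of core the threads are bound to
      # NORMAL = 0, LOW = 1, HIGH = 2
      require_processor_performance: NORMAL

      # Thread name prefix
      thread_name_prefix: "cpu_pool"
    }
  }
}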

# ========== GPU Executor ==========
executor {
  name: "gpu_executor"
  type: "ThreadPoolExecutor"
  options {
    [mediapipe.ThreadPoolExecutorOptions.ext] {
      num_threads: 2
      require_processor_performance: HIGH
      thread_name_prefix: "gpu_pool"
    }
  }
}

# ========== IO Executor ==========
executor {
  name: "io_executor"
  type: "ThreadPoolExecutor"
  options {
    [mediapipe.ThreadPoolExecutorOptions.ext] {
      num_threads: 2
      require_processor_performance: LOW
      thread_name_prefix: "io_pool"
    }
  }
}

# ========== Assigning a Calculator to an Executor ==========
node {
  calculator: "GPUCalculator"
  input_stream: "IMAGE:image"
  output_stream: "OUTPUT:result"
  executor: "gpu_executor"    # runs on the GPU Executor
}

node {
  calculator: "CPUCalculator"
  input_stream: "INPUT:input"
  output_stream: "OUTPUT:result"
  executor: "cpu_executor"    # runs on the CPU Executor
}

node {
  calculator: "IOCalculator"
  input_stream: "INPUT:input"
  output_stream: "OUTPUT:result"
  executor: "io_executor"     # runs on the IO Executor
}
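
To make the node-to-executor routing above more concrete, here is a minimal Python sketch that mimics it with standard-library thread pools. It is an analogy only and does not use MediaPipe's API: the `EXECUTOR_THREADS` and `NODE_EXECUTOR` tables and the `run_nodes` function are hypothetical names, and the dummy task stands in for a Calculator's Process() call.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical mirror of the three `executor { ... }` blocks: name -> thread count.
EXECUTOR_THREADS = {"gpu_executor": 2, "cpu_executor": 4, "io_executor": 2}

# Hypothetical mirror of each node's `executor:` field.
NODE_EXECUTOR = {
    "GPUCalculator": "gpu_executor",
    "CPUCalculator": "cpu_executor",
    "IOCalculator": "io_executor",
}


def run_nodes():
    """Run one dummy task per node on its own pool; return node -> worker thread name."""
    pools = {
        name: ThreadPoolExecutor(max_workers=count, thread_name_prefix=name)
        for name, count in EXECUTOR_THREADS.items()
    }
    try:
        futures = {
            node: pools[executor].submit(lambda: threading.current_thread().name)
            for node, executor in NODE_EXECUTOR.items()
        }
        return {node: future.result() for node, future in futures.items()}
    finally:
        # Release the worker threads once every task has finished.
        for pool in pools.values():
            pool.shutdown()


if __name__ == "__main__":
    for node, thread_name in run_nodes().items():
        print(f"{node} ran on {thread_name}")
```

Each task reports the name of the worker thread it ran on, which shows that work submitted for one executor never lands on another executor's threads.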

11.3 Executor Priority

┌─────────────────────────────────────────────────────────────────┐
│                        Executor Priority                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   Thread settings in ThreadPoolExecutorOptions:                 │
│   ┌───────────────────────────────────────────────────────┐     │
│   │  nice_priority_level: OS nice value (default 0;       │     │
│   │    a lower value means a higher priority)             │     │
│   │  require_processor_performance: core-type hint        │     │
│   │    NORMAL = 0   # no preference (default)             │     │
│   │    LOW = 1      # prefer low-power cores              │     │
│   │    HIGH = 2     # prefer high-performance cores       │     │
│   └───────────────────────────────────────────────────────┘     │
│                                                                 │
│   Typical IMS DMS setup:                                        │
│   ┌───────────────────────────────────────────────────────┐     │
│   │                                                       │     │
│   │  GPU Executor (HIGH):                                 │     │
│   │  • Face detection                                     │     │
│   │  • Landmark extraction                                │     │
│   │  • Iris detection                                     │     │
│   │                                                       │     │
│   │  CPU Executor (NORMAL):                               │     │
│   │  • EAR computation                                    │     │
│   │  • PERCLOS computation                                │     │
│   │  • Head pose                                          │     │
│   │                                                       │     │
│   │  IO Executor (LOW):                                   │     │
│   │  • Logging                                            │     │
│   │  • File writes                                        │     │
│   │  • Network sends                                      │     │
│   │                                                       │     │
│   └───────────────────────────────────────────────────────┘     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

12. Scheduling Policies in Detail

12.1 Scheduling Flow

┌─────────────────────────────────────────────────────────────────┐
│                         Scheduling Flow                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   1. A Packet arrives on an Input Stream                        │
│      ┌────────────────────────────────────────────────────┐     │
│      │   The Input Stream receives the new Packet         │     │
│      └────────────────────────────────────────────────────┘     │
│                          │                                      │
│                          ▼                                      │
│   2. The Scheduler looks for ready Calculators                  │
│      ┌────────────────────────────────────────────────────┐     │
│      │   • The Input Stream Handler checks the inputs     │     │
│      │   • It decides whether the Calculator can run      │     │
│      └────────────────────────────────────────────────────┘     │
│                          │                                      │
│                          ▼                                      │
│   3. Ready Calculators join the task queue                      │
│      ┌────────────────────────────────────────────────────┐     │
│      │   • Queued on the node's Executor                  │     │
│      │   • Ordered by priority (nearer the outputs first) │     │
│      └────────────────────────────────────────────────────┘     │
│                          │                                      │
│                          ▼                                      │
│   4. The Executor takes a Calculator off the queue              │
│      ┌────────────────────────────────────────────────────┐     │
│      │   • A pool thread picks up the task                │     │
│      │   • It calls the Calculator's Process()            │     │
│      └────────────────────────────────────────────────────┘     │
│                          │                                      │
│                          ▼                                      │
│   5. The Calculator finishes and emits Output Packets           │
│      ┌────────────────────────────────────────────────────┐     │
│      │   • The Output Stream forwards the Packet          │     │
│      │   • Downstream Calculators are notified            │     │
│      └────────────────────────────────────────────────────┘     │
│                          │                                      │
│                          ▼                                      │
│   6. Downstream Calculators become ready                        │
│      ┌────────────────────────────────────────────────────┐     │
│      │   • Their inputs are now available                 │     │
│      │   • Return to step 2                               │     │
│      └────────────────────────────────────────────────────┘     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
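
The six steps above can be sketched as a small single-threaded simulation in Python. This is a simplified model rather than MediaPipe's scheduler: it pushes one packet through the graph, treats a node as ready once all of its upstream nodes have run, and uses a priority queue in which a larger number runs first. The `simulate_schedule` function, the example edges, and the priority values are all assumptions made for illustration.

```python
import heapq
from collections import defaultdict


def simulate_schedule(edges, priority):
    """Return the order in which nodes run for one input packet.

    edges:    list of (upstream, downstream) node names
    priority: node name -> int, larger values are scheduled first
    """
    downstream = defaultdict(list)
    pending_inputs = defaultdict(int)
    nodes = set()
    for up, down in edges:
        downstream[up].append(down)
        pending_inputs[down] += 1
        nodes.update((up, down))

    # Steps 1-3: nodes with nothing left to wait for enter the ready queue.
    ready = [(-priority[n], n) for n in nodes if pending_inputs[n] == 0]
    heapq.heapify(ready)

    order = []
    while ready:
        _, node = heapq.heappop(ready)  # step 4: take the highest-priority task
        order.append(node)              # stands in for Calculator::Process()
        for nxt in downstream[node]:    # steps 5-6: output wakes downstream nodes
            pending_inputs[nxt] -= 1
            if pending_inputs[nxt] == 0:
                heapq.heappush(ready, (-priority[nxt], nxt))
    return order


# Hypothetical graph: nodes nearer the outputs get a larger priority.
EDGES = [
    ("source", "detect"),
    ("detect", "landmarks"),
    ("landmarks", "pose"),
    ("landmarks", "ear"),
    ("ear", "perclos"),
]
PRIORITY = {"source": 0, "detect": 1, "landmarks": 2, "ear": 3, "pose": 4, "perclos": 5}

if __name__ == "__main__":
    print(simulate_schedule(EDGES, PRIORITY))
```

When `pose` and `ear` become ready at the same time, the queue hands out `pose` first because it carries the larger priority; with several worker threads both would simply run concurrently.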

12.2 Input Stream Handler Types

# ========== Input Stream Handler types ==========
# A node that sets no input_stream_handler uses DefaultInputStreamHandler,
# which aligns all of its inputs by timestamp.

# ========== 1. SyncSetInputStreamHandler (synchronized) ==========
# Characteristics:
#   • Inputs in the same sync set are aligned by timestamp
#   • Process() runs once a timestamp is settled on every input in the set
#     (an input may be empty at that timestamp)
#   • Packets are buffered while waiting for alignment
#   • With no sync sets configured, all inputs form a single set
node {
  calculator: "MergeCalculator"
  input_stream: "A:stream_a"
  input_stream: "B:stream_b"
  output_stream: "OUTPUT:output"
  input_stream_handler {
    input_stream_handler: "SyncSetInputStreamHandler"
  }
}

# ========== 2. ImmediateInputStreamHandler (asynchronous) ==========
# Characteristics:
#   • Runs as soon as any input has a packet
#   • Does not wait for the other inputs
#   • Suited to inputs that are processed independently
node {
  calculator: "AsyncCalculator"
  input_stream: "A:stream_a"
  input_stream: "B:stream_b"
  output_stream: "OUTPUT:output"
  input_stream_handler {
    input_stream_handler: "ImmediateInputStreamHandler"
  }
}

# The same handler on a single input: each packet is processed on arrival
node {
  calculator: "ImmediateCalculator"
  input_stream: "INPUT:input"
  output_stream: "OUTPUT:output"
  input_stream_handler {
    input_stream_handler: "ImmediateInputStreamHandler"
  }
}

# ========== 3. BarrierInputStreamHandler (barrier) ==========
# Characteristics:
#   • Runs once every input has a packet, regardless of timestamp
#   • Suited to batch-style processing
node {
  calculator: "BarrierCalculator"
  input_stream: "A:stream_a"
  input_stream: "B:stream_b"
  input_stream: "C:stream_c"
  output_stream: "OUTPUT:output"
  input_stream_handler {
    input_stream_handler: "BarrierInputStreamHandler"
  }
}

# ========== 4. MuxInputStreamHandler (multiplexing) ==========
# Characteristics:
#   • Processes one of the data inputs, chosen by a control stream
#   • Suited to conditional branches
node {
  calculator: "MuxCalculator"
  input_stream: "INPUT:0:branch_a"
  input_stream: "INPUT:1:branch_b"
  input_stream: "SELECT:select"     # control stream that picks the branch
  output_stream: "OUTPUT:output"
  input_stream_handler {
    input_stream_handler: "MuxInputStreamHandler"
  }
}
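
The difference between timestamp-aligned and immediate handling is easier to see on concrete packets. The Python sketch below is a deliberately simplified model, assuming that every packet is already known and ignoring how MediaPipe settles timestamp bounds at run time; `aligned_input_sets` and `immediate_input_sets` are hypothetical helper names, not framework functions.

```python
def aligned_input_sets(streams):
    """Timestamp-aligned handling: one Process() call per timestamp.

    streams: dict of stream name -> list of (timestamp, value)
    Returns a list of (timestamp, {name: value or None}); None marks an
    input that has no packet at that timestamp.
    """
    timestamps = sorted({ts for packets in streams.values() for ts, _ in packets})
    lookup = {name: dict(packets) for name, packets in streams.items()}
    return [
        (ts, {name: lookup[name].get(ts) for name in streams})
        for ts in timestamps
    ]


def immediate_input_sets(arrivals):
    """Immediate handling: one Process() call per packet, in arrival order.

    arrivals: list of (stream name, timestamp, value)
    """
    return [(ts, {name: value}) for name, ts, value in arrivals]


if __name__ == "__main__":
    streams = {"A": [(0, "a0"), (1, "a1")], "B": [(0, "b0")]}
    print(aligned_input_sets(streams))
    print(immediate_input_sets([("A", 0, "a0"), ("A", 1, "a1"), ("B", 0, "b0")]))
```

With aligned handling the Calculator sees `a0` and `b0` together and later sees `a1` next to an empty `B`; with immediate handling it is called three times, once per packet, in whatever order the packets arrive.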

13. The FlowLimiter Throttling Mechanism

13.1 Why Limit the Flow?

┌─────────────────────────────────────────────────────────────────┐
│                       Why Limit the Flow?                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   Problem: input frame rate > processing rate                   │
│                                                                 │
│   ┌───────────────────────────────────────────────────────┐     │
│   │  Input:      30 FPS                                   │     │
│   │  Processing: 10 FPS (slow inference)                  │     │
│   │                                                       │     │
│   │  Result:                                              │     │
│   │  • Queues back up                                     │     │
│   │  • Memory grows                                       │     │
│   │  • Latency increases                                  │     │
│   │  • Real-time behavior is lost                         │     │
│   └───────────────────────────────────────────────────────┘     │
│                                                                 │
│   Solution: FlowLimiterCalculator                               │
│   ┌───────────────────────────────────────────────────────┐     │
│   │                                                       │     │
│   │  • Caps the number of frames in flight                │     │
│   │  • Drops the excess frames                            │     │
│   │  • Keeps the pipeline real-time                       │     │
│   │                                                       │     │
│   └───────────────────────────────────────────────────────┘     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
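
The 30 FPS versus 10 FPS scenario above can be worked through with a short simulation. The Python sketch below is an assumption-laden model of the idea, not the FlowLimiterCalculator implementation: time is counted in integer ticks (one tick per incoming frame), there is no waiting queue, and `simulate_flow_limiter` is a hypothetical name.

```python
def simulate_flow_limiter(num_frames, arrival_interval, process_time, max_in_flight=1):
    """Return (processed, dropped) frame indices under a simple in-flight cap.

    A frame is admitted only if fewer than max_in_flight frames are still
    being processed when it arrives; otherwise it is dropped.
    """
    in_flight = []  # finish times of frames currently being processed
    processed, dropped = [], []
    for frame in range(num_frames):
        now = frame * arrival_interval
        # Frames whose finish time has passed have sent their "finished" signal.
        in_flight = [finish for finish in in_flight if finish > now]
        if len(in_flight) < max_in_flight:
            in_flight.append(now + process_time)
            processed.append(frame)
        else:
            dropped.append(frame)
    return processed, dropped


if __name__ == "__main__":
    # 30 frames arrive one tick apart; each takes 3 ticks to process,
    # which corresponds to 30 FPS input and 10 FPS processing.
    processed, dropped = simulate_flow_limiter(30, arrival_interval=1, process_time=3)
    print(f"processed {len(processed)} frames, dropped {len(dropped)}")
```

In this model every third frame is processed and the other two are dropped, so the processed frames are always recent ones and no backlog builds up.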

13.2 FlowLimiter Configuration

# ========== FlowLimiter configuration ==========

node {
  calculator: "FlowLimiterCalculator"
  input_stream: "input_image"
  input_stream: "FINISHED:detections"  # feedback signal (processing done)
  input_stream_info: {
    tag_index: "FINISHED"              # mark the feedback input as a back edge
    back_edge: true
  }
  output_stream: "throttled_image"
  options {
    [mediapipe.FlowLimiterCalculatorOptions.ext] {
      # Maximum number of frames being processed at once
      max_in_flight: 1

      # Maximum number of frames waiting in the queue; when the queue is
      # full, older waiting frames are dropped in favor of newer ones
      max_in_queue: 1
    }
  }
}

# ========== Complete example ==========
input_stream: "VIDEO:video"
output_stream: "DETECTIONS:detections"

# Throttling
node {
  calculator: "FlowLimiterCalculator"
  input_stream: "video"
  input_stream: "FINISHED:detections"
  input_stream_info: {
    tag_index: "FINISHED"
    back_edge: true
  }
  output_stream: "throttled_video"
  options {
    [mediapipe.FlowLimiterCalculatorOptions.ext] {
      max_in_flight: 1
      max_in_queue: 1
    }
  }
}

# Detection
node {
  calculator: "FaceDetector"
  input_stream: "IMAGE:throttled_video"
  output_stream: "DETECTIONS:detections"
}

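14. Parallel Optimization Strategies

14.1 Parallel Independent Branches

# ========== Parallel independent branches ==========
# Branches that do not depend on each other run in parallel automatically,
# provided the Executor has enough threads

input_stream: "IMAGE:image"

# Branch 1: face detection
node {
  calculator: "FaceDetector"
  input_stream: "IMAGE:image"
  output_stream: "FACES:faces"
}

# Branch 2: person detection
node {
  calculator: "PersonDetector"
  input_stream: "IMAGE:image"
  output_stream: "PERSONS:persons"
}

# Branch 3: vehicle detection
node {
  calculator: "CarDetector"
  input_stream: "IMAGE:image"
  output_stream: "CARS:cars"
}

# The three detectors run in parallel automatically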
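
As a rough analogy for the three branches above, the following Python sketch runs three stand-in detectors on a thread pool. It uses a `threading.Barrier` to show that all three are genuinely active at the same time: the barrier only opens once every branch has started. The function name `run_branches_in_parallel` and the string results are hypothetical placeholders for real detector output.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_branches_in_parallel(image, detectors, timeout_s=5.0):
    """Run every detector on the same image concurrently; return name -> result."""
    barrier = threading.Barrier(len(detectors))

    def run(name):
        # Passes only when all branches are running at once; a serial
        # execution would time out here and raise BrokenBarrierError.
        barrier.wait(timeout=timeout_s)
        return name, f"{name}({image})"

    # One worker per branch, so no branch has to wait for a free thread.
    with ThreadPoolExecutor(max_workers=len(detectors)) as pool:
        return dict(pool.map(run, detectors))


if __name__ == "__main__":
    results = run_branches_in_parallel(
        "frame_0", ["FaceDetector", "PersonDetector", "CarDetector"]
    )
    for name, result in results.items():
        print(name, "->", result)
```

If the pool had fewer workers than branches, the barrier would never open, which mirrors the point that branch parallelism depends on the Executor having enough threads.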

14.2 Executor Isolation

# ========== Executor isolation ==========
# Different kinds of work run on different Executors

# GPU Executor (high priority)
executor {
  name: "gpu_executor"
  type: "ThreadPoolExecutor"
  options {
    [mediapipe.ThreadPoolExecutorOptions.ext] {
      num_threads: 2
      require_processor_performance: HIGH
    }
  }
}

# CPU Executor (normal priority)
executor {
  name: "cpu_executor"
  type: "ThreadPoolExecutor"
  options {
    [mediapipe.ThreadPoolExecutorOptions.ext] {
      num_threads: 4
      require_processor_performance: NORMAL
    }
  }
}

# IO Executor (low priority)
executor {
  name: "io_executor"
  type: "ThreadPoolExecutor"
  options {
    [mediapipe.ThreadPoolExecutorOptions.ext] {
      num_threads: 2
      require_processor_performance: LOW
    }
  }
}

# GPU task
node {
  calculator: "FaceDetector"
  input_stream: "IMAGE:image"
  output_stream: "FACES:faces"
  executor: "gpu_executor"
}

# CPU task
node {
  calculator: "LandmarkCalculator"
  input_stream: "FACES:faces"
  output_stream: "LANDMARKS:landmarks"
  executor: "cpu_executor"
}

# IO task
node {
  calculator: "LogCalculator"
  input_stream: "LANDMARKS:landmarks"
  executor: "io_executor"
}

14.3 Pipelining

# ========== Pipelining ==========
# Split one large task into several smaller stages that run as a pipeline

# Stage 1: preprocessing
node {
  calculator: "PreprocessCalculator"
  input_stream: "IMAGE:image"
  output_stream: "PREPROCESSED:preprocessed"
}

# Stage 2: feature extraction
node {
  calculator: "FeatureExtractor"
  input_stream: "PREPROCESSED:preprocessed"
  output_stream: "FEATURES:features"
}

# Stage 3: inference
node {
  calculator: "InferenceCalculator"
  input_stream: "FEATURES:features"
  output_stream: "OUTPUT:output"
}

# Stage 4: postprocessing
node {
  calculator: "PostprocessCalculator"
  input_stream: "OUTPUT:output"
  output_stream: "RESULT:result"
}

# Pipelined execution:
# t=0: Preprocess(frame_0)
# t=1: Preprocess(frame_1), FeatureExtractor(frame_0)
# t=2: Preprocess(frame_2), FeatureExtractor(frame_1), Inference(frame_0)
# ...
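
The timeline above assumes every stage takes one time unit. When stage times differ, the slowest stage sets the pace, which the Python sketch below works out. It is a simplified model under stated assumptions: all frames are available at time 0, each stage handles one frame at a time, and the stage durations are made-up numbers. `pipeline_finish_times` is a hypothetical helper.

```python
def pipeline_finish_times(stage_ms, num_frames):
    """Return the time at which each frame leaves the last stage.

    stage_ms: per-stage processing time, in pipeline order
    A stage starts a frame once the previous stage has finished that frame
    and the stage itself has finished the previous frame.
    """
    finish = [[0] * num_frames for _ in stage_ms]
    for s, duration in enumerate(stage_ms):
        for f in range(num_frames):
            input_ready = finish[s - 1][f] if s > 0 else 0
            stage_free = finish[s][f - 1] if f > 0 else 0
            finish[s][f] = max(input_ready, stage_free) + duration
    return finish[-1]


if __name__ == "__main__":
    # Hypothetical durations for preprocess, features, inference, postprocess.
    stages = [1, 2, 4, 1]
    print("pipelined:", pipeline_finish_times(stages, 3))
    print("serial:   ", [sum(stages) * (i + 1) for i in range(3)])
```

With these numbers the first frame takes 8 ms end to end, and later frames then complete every 4 ms, the duration of the slowest stage, instead of every 8 ms as they would without pipelining.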

15. Profiling

15.1 Enabling Profiling

# ========== Enabling profiling ==========

# Graph level
profiler_config {
  # Collect per-Calculator runtime statistics
  enable_profiler: true

  # Record timing events in the tracer
  trace_enabled: true

  # Path prefix for the trace log files
  trace_log_path: "/tmp/profile_"
}

15.2 Profiling Output

# ========== Example profiling output (illustrative) ==========

Profile Summary:
============================================================
Calculator          Avg Time  Min Time  Max Time  Calls   Total
============================================================
FlowLimiter         0.05ms    0.03ms    0.10ms    1000    0.05s
FaceDetector        2.50ms    2.00ms    5.00ms    1000    2.50s
LandmarkCalculator  3.20ms    2.50ms    8.00ms    1000    3.20s
PostprocessCalc     1.00ms    0.80ms    2.00ms    1000    1.00s
============================================================
Total               6.75ms    5.33ms    15.10ms           6.75s

Analysis:
• Bottleneck: LandmarkCalculator (3.20ms avg)
• Performance: 148 FPS when the Calculators run back to back (1000 frames / 6.75s)
• Recommendation: Optimize LandmarkCalculator or parallelize
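
The figures in the analysis can be re-derived from the Avg Time column. The Python sketch below does the arithmetic; `AVG_MS` copies the averages from the table above, while `summarize` and the pipelined bound are additions for illustration. The bound assumes each Calculator can work on a different frame at the same time, so the slowest one limits throughput.

```python
# Average time per call in milliseconds, copied from the summary table.
AVG_MS = {
    "FlowLimiter": 0.05,
    "FaceDetector": 2.50,
    "LandmarkCalculator": 3.20,
    "PostprocessCalc": 1.00,
}


def summarize(avg_ms):
    """Derive latency, throughput, and the bottleneck from per-Calculator averages."""
    total = sum(avg_ms.values())
    bottleneck = max(avg_ms, key=avg_ms.get)
    return {
        "serial_latency_ms": total,                   # one frame through every Calculator
        "serial_fps": 1000.0 / total,                 # Calculators run back to back
        "bottleneck": bottleneck,
        "pipelined_fps_bound": 1000.0 / avg_ms[bottleneck],  # slowest stage sets the pace
    }


if __name__ == "__main__":
    for key, value in summarize(AVG_MS).items():
        print(f"{key}: {value}")
```

The serial figure reproduces the 148 FPS in the analysis, and the pipelined bound shows how much headroom remains if the Calculators overlap across frames.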

15.3 Finding the Bottleneck

┌─────────────────────────────────────────────────────────────────┐
│                     Finding the Bottleneck                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   Step 1: enable profiling                                      │
│      profiler_config {                                          │
│        enable_profiler: true                                    │
│      }                                                          │
│                                                                 │
│   Step 2: run the Graph                                         │
│      • Run long enough (at least 100 frames)                    │
│      • Record the profiling output                              │
│                                                                 │
│   Step 3: analyze the results                                   │
│      • Find the Calculator with the highest Avg Time            │
│      • Check whether it can be parallelized                     │
│      • Check whether it can be made asynchronous                │
│                                                                 │
│   Step 4: optimize                                              │
│      • Improve the algorithm                                    │
│      • Use a faster model                                       │
│      • Parallelize independent tasks                            │
│      • Make non-critical tasks asynchronous                     │
│                                                                 │
│   Step 5: measure again                                         │
│      • Rerun the profiler                                       │
│      • Compare before and after                                 │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

16. In Practice: Multi-Task Parallelism for IMS DMS

16.1 Complete Graph Design

# dms_graph.pbtxt
# Complete multi-task parallel design for IMS DMS

# ========== Inputs and outputs ==========
input_stream: "IR_IMAGE:ir_image"
input_stream: "VEHICLE_SPEED:speed"
output_stream: "DMS_RESULT:dms_result"

# ========== Executor configuration ==========
# GPU Executor (high priority)
executor {
  name: "gpu_executor"
  type: "ThreadPoolExecutor"
  options {
    [mediapipe.ThreadPoolExecutorOptions.ext] {
      num_threads: 2
      require_processor_performance: HIGH
    }
  }
}

# CPU Executor (normal priority)
executor {
  name: "cpu_executor"
  type: "ThreadPoolExecutor"
  options {
    [mediapipe.ThreadPoolExecutorOptions.ext] {
      num_threads: 4
      require_processor_performance: NORMAL
    }
  }
}

# IO Executor (low priority)
executor {
  name: "io_executor"
  type: "ThreadPoolExecutor"
  options {
    [mediapipe.ThreadPoolExecutorOptions.ext] {
      num_threads: 1
      require_processor_performance: LOW
    }
  }
}

# ========== Throttling ==========
node {
  calculator: "FlowLimiterCalculator"
  input_stream: "ir_image"
  input_stream: "FINISHED:dms_result"
  input_stream_info: {
    tag_index: "FINISHED"
    back_edge: true
  }
  output_stream: "throttled_ir_image"
  options {
    [mediapipe.FlowLimiterCalculatorOptions.ext] {
      max_in_flight: 1
      max_in_queue: 1
    }
  }
}

# ========== Branch 1: face detection (GPU, high priority) ==========
node {
  calculator: "FaceDetector"
  input_stream: "IMAGE:throttled_ir_image"
  output_stream: "FACES:faces"
  executor: "gpu_executor"
}

# ========== Branch 2: landmark extraction (GPU, high priority) ==========
node {
  calculator: "FaceMeshCalculator"
  input_stream: "IMAGE:throttled_ir_image"
  input_stream: "FACES:faces"
  output_stream: "LANDMARKS:landmarks"
  executor: "gpu_executor"
}

# ========== Branch 3: iris detection (GPU, high priority) ==========
node {
  calculator: "IrisCalculator"
  input_stream: "LANDMARKS:landmarks"
  input_stream: "IMAGE:throttled_ir_image"
  output_stream: "IRIS:iris"
  output_stream: "GAZE:gaze"
  executor: "gpu_executor"
}

# ========== Branch 4: head pose (CPU, normal priority) ==========
node {
  calculator: "HeadPoseCalculator"
  input_stream: "LANDMARKS:landmarks"
  output_stream: "POSE:head_pose"
  executor: "cpu_executor"
}

# ========== Branch 5: EAR computation (CPU, normal priority) ==========
node {
  calculator: "EARCalculator"
  input_stream: "LANDMARKS:landmarks"
  output_stream: "EAR:ear"
  executor: "cpu_executor"
}

# ========== Branch 6: PERCLOS computation (CPU, normal priority) ==========
node {
  calculator: "PERCLOSCalculator"
  input_stream: "EAR:ear"
  output_stream: "PERCLOS:perclos"
  executor: "cpu_executor"
}

# ========== Fusion (CPU, normal priority) ==========
node {
  calculator: "DMSFusionCalculator"
  input_stream: "GAZE:gaze"
  input_stream: "POSE:head_pose"
  input_stream: "EAR:ear"
  input_stream: "PERCLOS:perclos"
  input_stream: "SPEED:speed"
  output_stream: "DMS_RESULT:dms_result"
  executor: "cpu_executor"
  input_stream_handler {
    input_stream_handler: "SyncSetInputStreamHandler"
  }
}

# ========== Logging (IO, low priority) ==========
node {
  calculator: "LogCalculator"
  input_stream: "DMS_RESULT:dms_result"
  executor: "io_executor"
}

# ========== Profiling configuration ==========
profiler_config {
  enable_profiler: true
}
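
In a graph this size it is easy to reference an Executor that was never declared. The Python sketch below is a rough lint for that mistake, assuming the layout used in this article (each `executor { ... }` block starts with its `name:` field, and each node sets `executor:` on its own line). It relies on regular expressions rather than a real text-format parser, and `undeclared_executors` is a hypothetical helper, not part of MediaPipe.

```python
import re


def undeclared_executors(pbtxt):
    """Return the executor names used by nodes but never declared, sorted."""
    # `executor { name: "..."` declares an executor.
    declared = set(re.findall(r'executor\s*\{\s*name:\s*"([^"]*)"', pbtxt))
    # `executor: "..."` inside a node refers to one.
    used = set(re.findall(r'^\s*executor:\s*"([^"]*)"', pbtxt, flags=re.MULTILINE))
    return sorted(used - declared)


SAMPLE_CONFIG = """
executor {
  name: "gpu_executor"
  type: "ThreadPoolExecutor"
}
node {
  calculator: "FaceDetector"
  executor: "gpu_executor"
}
node {
  calculator: "LogCalculator"
  executor: "io_executor"
}
"""

if __name__ == "__main__":
    print(undeclared_executors(SAMPLE_CONFIG))
```

Running it on the sample reports `io_executor`, which a node uses without a matching declaration; an empty list means every reference resolves.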

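17. Summary

Concept	Description
Executor	Runs Calculators, usually on a thread pool
ThreadPoolExecutor	Default executor type
Input Stream Handler	Decides when a Calculator's inputs are ready (timestamp-aligned or immediate)
FlowLimiter	Throttling (caps the number of frames being processed)
Parallel optimization	Independent branches, Executor isolation, pipelining
Profiling	Performance analysis (finding the bottleneck)

Coming Up Next

MediaPipe Series 11: First Steps with a Custom Calculator, Hello World

Build a custom Calculator from scratch, covering the Proto definition, the Calculator implementation, the Graph configuration, the BUILD file, and compiling and running.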

Series progress: 10/55
Last updated: 2026-03-12

https://dapalm.com/2026/03/12/MediaPipe系列10-线程模型与调度策略/

Author: Mars
Published: March 12, 2026
