6.3.2 Implementation analysis of dma_resv_get_singleton: the aggregation magic of one fence to rule them all

Original article · last modified 2026-06-15 14:53:16

License: CC 4.0 BY-SA

First published 2026-06-15 05:00:00

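One of the more interesting functions in the dma_resv interface is dma_resv_get_singleton(). At first glance this getter makes you wonder what on earth it is for; on a second look it turns out to be rather neat. This article walks through what the function does and how it is implemented.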

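1. What it does: merging many fences into one

dma_resv_get_singleton() packs the fences stored in a dma_resv into a single fence and returns it. Once this singleton fence signals, all of the original fences have signaled.
The function is declared as follows:

int dma_resv_get_singleton(struct dma_resv *obj, enum dma_resv_usage usage,
                           struct dma_fence **fence);

Parameters:

obj: the target dma_resv object
usage: controls which fences are included (KERNEL/WRITE/READ/BOOKKEEP)
fence: output parameter that receives the aggregated fence

Return value: 0 on success, -ENOMEM if memory allocation fails. On success the caller owns a reference to the returned fence (when it is not NULL) and must drop it with dma_fence_put().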
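The usage parameter is easier to reason about with a small model. The kernel documents the ordering KERNEL < WRITE < READ < BOOKKEEP, and a query for one usage also returns the fences of all lower usages. The sketch below is a userspace illustration only: the model_ names are hypothetical and an int stands in for a fence pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the ordering of enum dma_resv_usage; model_ names are hypothetical. */
enum model_usage {
    MODEL_USAGE_KERNEL,
    MODEL_USAGE_WRITE,
    MODEL_USAGE_READ,
    MODEL_USAGE_BOOKKEEP,
};

struct model_slot {
    int fence_id;            /* stand-in for a struct dma_fence pointer */
    enum model_usage usage;  /* usage the fence was added with */
};

/*
 * Copy the ids of all slots whose usage is <= the queried usage into out[].
 * Returns the number of ids written (at most out_cap).
 */
static size_t model_collect(const struct model_slot *slots, size_t n,
                            enum model_usage query, int *out, size_t out_cap)
{
    size_t count = 0;

    assert(out != NULL || out_cap == 0);

    for (size_t i = 0; i < n; i++) {
        if (slots[i].usage <= query && count < out_cap)
            out[count++] = slots[i].fence_id;
    }
    return count;
}
```

In this model a READ query picks up the KERNEL, WRITE and READ slots, while a KERNEL query picks up only the KERNEL slot.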

2. Implementation logic

┌─────────────────────────────────────────────────────────────┐
│  dma_resv_get_fences() collects all fences matching usage   │
└─────────────────────┬───────────────────────────────────────┘
                      │
        ┌─────────────┼─────────────┐
        ▼             ▼             ▼
   count == 0    count == 1    count > 1
        │             │             │
        ▼             ▼             ▼
  return NULL    return         pack them with
                 fences[0]      dma_fence_array_create()
                                        │
                                        ▼
                                return &array->base

This gives three cases:

Case	Return value	Notes
0 fences	`*fence = NULL`	Nothing to wait for
1 fence	That fence itself	No wrapping, so no extra overhead
N fences	A `dma_fence_array`	Wrapped into one composite fence
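The three cases can be tried out in userspace with a minimal sketch. Everything here is a simplified stand-in: struct model_fence, struct model_array and model_get_singleton() are hypothetical names, the reference counting is a plain int, and the function takes the fence list directly instead of querying a dma_resv.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct model_fence {
    int refcount;
    int signaled;
};

struct model_array {
    struct model_fence base;      /* embedded base, as in dma_fence_array */
    unsigned num_fences;
    struct model_fence **fences;  /* the array takes ownership of this list */
};

/*
 * fences must be heap-allocated (or NULL when count == 0); ownership of the
 * list moves to this function, mirroring the kfree()/hand-over in the kernel.
 */
static int model_get_singleton(struct model_fence **fences, unsigned count,
                               struct model_fence **out)
{
    struct model_array *array;

    assert(out != NULL);

    if (count == 0) {            /* nothing to wait for */
        free(fences);
        *out = NULL;
        return 0;
    }
    if (count == 1) {            /* hand back the fence itself */
        *out = fences[0];
        free(fences);
        return 0;
    }

    array = calloc(1, sizeof(*array));
    if (!array) {                /* drop the references we were given */
        while (count--)
            fences[count]->refcount--;
        free(fences);
        return -ENOMEM;
    }
    array->base.refcount = 1;
    array->num_fences = count;
    array->fences = fences;
    *out = &array->base;         /* caller only ever sees the base fence */
    return 0;
}
```

Only the N-fence case allocates anything, which is why the single-fence shortcut is worth having.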

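3. dma_fence_array: the core of the aggregation magic

dma_fence_array is the key data structure behind the many-into-one behaviour. It aggregates several fences into one and uses an atomic pending counter plus per-fence callbacks to signal exactly once.

3.1 Design pattern: Composite

dma_fence_array is a textbook application of the Composite pattern: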

                    ┌──────────────────┐
                    │   dma_fence      │  ← abstract base (Component)
                    │  ─────────────   │
                    │  + signal()      │
                    │  + wait()        │
                    │  + add_callback()│
                    └────────┬─────────┘
                             │
            ┌────────────────┴────────────────┐
            │                                 │
            ▼                                 ▼
   ┌─────────────────┐              ┌─────────────────────┐
   │  concrete fence │              │  dma_fence_array    │
   │  (Leaf)         │              │  (Composite)        │
   │  ────────────   │              │  ──────────────     │
   │  GPU fence      │              │  base: dma_fence    │ ← embedded base ("inherits")
   │  software fence │              │  fences: dma_fence**│ ← holds multiple children
   │  ...            │              │                     │
   └─────────────────┘              └─────────────────────┘

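How the three roles of the Composite pattern map onto the code:

Role	Implementation	Notes
Component (abstract)	`struct dma_fence`	Defines the common interface: signal, wait, callback
Leaf	The various concrete fences	GPU fences, software fences, and so on
Composite	`dma_fence_array`	Embeds `base` and contains `fences[]`

The key code: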

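struct dma_fence_array {
    struct dma_fence base;       // "inheritance": it is a dma_fence
    struct dma_fence **fences;   // composition: it contains multiple dma_fences
    ...
};

// Returned as a pointer to the embedded base, so callers cannot tell the difference
*fence = &array->base;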

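The power of the Composite pattern:

// The caller does not need to know whether this is a single fence or an array;
// it just uses the common dma_fence interface
dma_fence_wait(fence, true);      // works for a single fence
dma_fence_wait(fence, true);      // works just as well for a fence_array

dma_fence_add_callback(fence, &cb, my_func);  // supported by both

This is the essence of the Composite pattern: client code does not have to distinguish between a single object and a group of objects, which simplifies the upper layers considerably.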
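To make the pattern concrete, here is a minimal userspace sketch of the same layout: a base struct with an ops table, a leaf, and a composite that embeds the base and is recovered with a container_of-style macro. All names (model_fence, model_leaf, model_array, model_container_of) are hypothetical, and for brevity the composite polls its children instead of using callbacks and a pending counter as the kernel does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_fence;

struct model_fence_ops {
    bool (*signaled)(struct model_fence *f);
};

/* Component: the only type client code needs to know about. */
struct model_fence {
    const struct model_fence_ops *ops;
};

/* Leaf: one concrete fence. */
struct model_leaf {
    struct model_fence base;
    bool done;
};

/* Composite: embeds the base and points at its children. */
struct model_array {
    struct model_fence base;
    size_t num_fences;
    struct model_fence **fences;
};

/* Userspace equivalent of the kernel's container_of(). */
#define model_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Uniform entry point: dispatches through the ops table. */
static bool model_fence_is_signaled(struct model_fence *f)
{
    assert(f != NULL && f->ops != NULL);
    return f->ops->signaled(f);
}

static bool leaf_signaled(struct model_fence *f)
{
    return model_container_of(f, struct model_leaf, base)->done;
}

/* The composite is signaled only when every child is signaled. */
static bool array_signaled(struct model_fence *f)
{
    struct model_array *a = model_container_of(f, struct model_array, base);

    for (size_t i = 0; i < a->num_fences; i++) {
        if (!model_fence_is_signaled(a->fences[i]))
            return false;
    }
    return true;
}

static const struct model_fence_ops leaf_ops = { .signaled = leaf_signaled };
static const struct model_fence_ops array_ops = { .signaled = array_signaled };
```

A caller holding only a struct model_fence pointer gets the right behaviour from model_fence_is_signaled() whether the pointer refers to a leaf or to the embedded base of an array.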

3.2 Data structure

struct dma_fence_array {
    struct dma_fence base;       // embedded dma_fence; looks like an ordinary fence from outside

    spinlock_t lock;
    unsigned num_fences;         // number of child fences
    atomic_t num_pending;        // how many child fences have not signaled yet (the key field)
    struct dma_fence **fences;   // array of child fences

    struct irq_work work;        // used to signal from a safe context
    struct dma_fence_array_cb callbacks[];  // one callback per child fence
};

3.3 How it works

                         dma_fence_array
                    ┌─────────────────────────┐
                    │   base (dma_fence)      │
                    │   num_pending = 3       │  ← atomic counter
                    └───────────┬─────────────┘
                                │
           ┌────────────────────┼────────────────────┐
           │                    │                    │
           ▼                    ▼                    ▼
    ┌─────────────┐      ┌─────────────┐      ┌─────────────┐
    │  fence[0]   │      │  fence[1]   │      │  fence[2]   │
    │  + callback │      │  + callback │      │  + callback │
    └──────┬──────┘      └──────┬──────┘      └──────┬──────┘
           │                    │                    │
           │ signal             │ signal             │ signal
           ▼                    ▼                    ▼
      num_pending--        num_pending--        num_pending--
         (3→2)                (2→1)                (1→0)
                                                     │
                                                     ▼
                                            atomic_dec_and_test()
                                            returns true!
                                                     │
                                                     ▼
                                          irq_work_queue(&array->work)
                                                     │
                                                     ▼
                                          dma_fence_signal(&array->base)
                                          the whole array fence is signaled!

3.4 The core callback

static void dma_fence_array_cb_func(struct dma_fence *f,
                                    struct dma_fence_cb *cb)
{
    struct dma_fence_array_cb *array_cb =
        container_of(cb, struct dma_fence_array_cb, cb);
    struct dma_fence_array *array = array_cb->array;

    // Propagate the child fence's error status
    dma_fence_array_set_pending_error(array, f->error);

    // Atomic decrement; when it reaches 0, signal the whole array
    if (atomic_dec_and_test(&array->num_pending))
        irq_work_queue(&array->work);  // signal from a safe context
    else
        dma_fence_put(&array->base);
}

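3.5 Why irq_work?

The child fence callbacks may run in interrupt context, typically with the child fence's lock held, and dma_fence_signal() on the array can in turn run further callback chains. To avoid deep call chains (stack overflow) and lock recursion (deadlock), the final signal is deferred through irq_work to a context where it is safe to run.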
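The deferral idea can be shown with a toy work queue in userspace. This is only a sketch of the pattern, not of irq_work itself: model_work, model_work_queue() and model_work_run_all() are hypothetical names, and "a safe context" is simply whoever calls model_work_run_all() later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_work {
    void (*func)(struct model_work *w);
    bool queued;
};

#define MODEL_MAX_WORK 8
static struct model_work *model_queue[MODEL_MAX_WORK];
static size_t model_queue_len;

/* Queue a work item once; returns false if it is already queued or no room. */
static bool model_work_queue(struct model_work *w)
{
    if (w->queued || model_queue_len == MODEL_MAX_WORK)
        return false;
    w->queued = true;
    model_queue[model_queue_len++] = w;
    return true;
}

/* Run everything that was queued; stands in for the later, safe context. */
static void model_work_run_all(void)
{
    for (size_t i = 0; i < model_queue_len; i++) {
        struct model_work *w = model_queue[i];

        assert(w->func != NULL);
        w->queued = false;
        w->func(w);
    }
    model_queue_len = 0;
}

struct model_array {
    struct model_work work;  /* first member, so a cast recovers the array */
    int num_pending;
    bool signaled;
};

/* Deferred part: this is where the array fence would be signaled. */
static void model_array_signal_work(struct model_work *w)
{
    struct model_array *a = (struct model_array *)w;

    a->signaled = true;
}

/* Child callback: never signals directly, only queues the work. */
static void model_child_cb(struct model_array *a)
{
    if (--a->num_pending == 0)
        model_work_queue(&a->work);
}
```

After the last child callback the array is still unsignaled; it flips only once the queued work runs, which keeps the signal out of the child's callback path.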

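3.6 The signal_on_any mode

The last parameter of dma_fence_array_create() is signal_on_any:

false (the common choice): the array signals only after all child fences have signaled
true: the array signals as soon as any child fence signals

dma_resv_get_singleton() passes false, that is, it waits for every operation to complete.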
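Both modes fall out of the same countdown if the pending counter starts at the number of children for "all" and at 1 for "any". The sketch below models that with C11 atomics. It assumes this is how the counter is seeded, which matches my reading of the kernel source but should be checked against the version you are using; the model_ names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct model_array {
    atomic_int num_pending;  /* child signals still needed */
    unsigned num_fences;
    bool signaled;
};

static void model_array_init(struct model_array *a, unsigned num_fences,
                             bool signal_on_any)
{
    assert(num_fences > 0);
    a->num_fences = num_fences;
    a->signaled = false;
    /* Assumption: "any" mode needs one signal, "all" mode needs every one. */
    atomic_init(&a->num_pending, signal_on_any ? 1 : (int)num_fences);
}

/* Called once per child fence when that child signals. */
static void model_child_signaled(struct model_array *a)
{
    /*
     * atomic_fetch_sub() returns the previous value, so a result of 1 means
     * this call took the counter to zero. Only that one caller signals,
     * which plays the role of atomic_dec_and_test() in the kernel callback.
     */
    if (atomic_fetch_sub(&a->num_pending, 1) == 1)
        a->signaled = true;
}
```

In "any" mode later child signals push the counter below zero, but the zero crossing happens only once, so the array is never signaled twice.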

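4. Core code walkthrough

With the principles above in mind, the implementation is straightforward, so it is shown here with step comments only and without further line-by-line analysis.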

int dma_resv_get_singleton(struct dma_resv *obj, enum dma_resv_usage usage,
                           struct dma_fence **fence)
{
    struct dma_fence_array *array;
    struct dma_fence **fences;
    unsigned count;
    int r;

    // Step 1: collect all fences that match the usage
    r = dma_resv_get_fences(obj, usage, &count, &fences);
    if (r)
        return r;

    // Step 2: the 0-fence case
    if (count == 0) {
        *fence = NULL;
        return 0;
    }

    // Step 3: the 1-fence case, returned directly without wrapping
    if (count == 1) {
        *fence = fences[0];
        kfree(fences);
        return 0;
    }

    // Step 4: multiple fences, so create a fence_array to wrap them
    array = dma_fence_array_create(count, fences,
                                   dma_fence_context_alloc(1),
                                   1, false);
    if (!array) {
        while (count--)
            dma_fence_put(fences[count]);
        kfree(fences);
        return -ENOMEM;
    }

    // Return the base fence of the fence_array
    *fence = &array->base;
    return 0;
}

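5. Usage and caveats

The function is very convenient when you need to wait for all operations on a BO in one go:

struct dma_fence *fence;
int r;

// Get one aggregated fence covering all operations up to READ usage
r = dma_resv_get_singleton(bo->resv, DMA_RESV_USAGE_READ, &fence);
if (r)
    return r;   // -ENOMEM: no fence was returned

if (fence) {
    // One wait covers everything, no need to iterate and wait one by one
    dma_fence_wait(fence, true);
    dma_fence_put(fence);
}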

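5.1 Typical use cases

Synchronizing before buffer export: before a dma-buf is exported to another device, make sure all GPU operations have completed
Userspace synchronization: export the aggregated fence to userspace through a sync_file
Cross-device synchronization: one device waits for all operations of another device to complete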

5.2 An important warning

The comment on the function in the kernel source carries a warning:

Warning: This can’t be used like this when adding the fence back to the resv object since that can lead to stack corruption when finalizing the dma_fence_array.

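Do not add the returned singleton fence back to the same dma_resv!

Why (a plausible reading of the comment rather than something it spells out):

The dma_fence_array holds references to the original fences
If it is added back, a later dma_resv_get_singleton() on the same object can wrap it in yet another dma_fence_array, so containers end up nested inside containers
Releasing such a nested fence_array releases its children recursively, and this is where the stack corruption during finalization can come from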
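The nesting concern can be illustrated with a small userspace model in which releasing a container recursively releases its children. This is an interpretation of the warning, not code taken from the kernel: model_fence, model_release_depth() and model_array_accepts() are hypothetical, and the "reject containers as children" check is only one possible guard.

```c
#include <assert.h>
#include <stddef.h>

struct model_fence {
    size_t num_children;            /* 0 for a leaf fence */
    struct model_fence **children;  /* NULL for a leaf fence */
};

/*
 * Recursion depth needed to release f when every container releases its
 * children from inside its own release routine. A leaf costs one frame.
 */
static size_t model_release_depth(const struct model_fence *f)
{
    size_t deepest = 0;

    assert(f != NULL);

    for (size_t i = 0; i < f->num_children; i++) {
        size_t d = model_release_depth(f->children[i]);

        if (d > deepest)
            deepest = d;
    }
    return deepest + 1;
}

static int model_is_container(const struct model_fence *f)
{
    return f->num_children > 0;
}

/* Guard: returns 1 only if none of the proposed children is a container. */
static int model_array_accepts(struct model_fence *const *children, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (model_is_container(children[i]))
            return 0;
    }
    return 1;
}
```

Each round of "get singleton, add it back, get singleton again" would add one level to this depth, so the recursion is bounded only by how often that cycle repeats.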

5.3 Relationship to other functions

dma_resv_get_singleton()
        │
        ├── calls dma_resv_get_fences()    // fetch the fence list
        │
        └── calls dma_fence_array_create() // wrap them in an array (if needed)