SQL Server执行计划操作员–第4部分

最新推荐文章于 2025-04-15 14:25:51 发布

翻译最新推荐文章于 2025-04-15 14:25:51 发布 · 301 阅读

本内容遵循CC 4.0 BY-SA版权协议

原文链接：https://www.sqlshack.com/sql-server-execution-plan-operators-part-4/

本文深入探讨了SQLServer执行计划的第四组操作符，包括Segment、TableSpool、MergeInterval、Filter、OnlineIndexInsert、SequenceProject、EagerSpool和Parallelism等，解析了它们的工作原理和应用场景。

In the previous articles of this series, we went through three sets of SQL Server Execution Plan operators that you will meet with while working with the different Execution Plan queries. We described the Table Scan, Clustered Index Scan, Clustered Index Seek, the Non-Clustered Index Seek, RID Lookup, Key Lookup, Sort, Aggregate – Stream Aggregate, Compute Scalar, Concatenation, Assert, Hash Match Join, Hash Match Aggregate , Merge Join and Nested Loops Join Execution Plan operators. In this article, we will dive in the fourth set of these SQL Server Execution Plan operators.

在本系列的前几篇文章中，我们介绍了三组SQL Server执行计划操作符，在使用不同的执行计划查询时您将遇到这些操作符。我们描述了表扫描，聚簇索引扫描，聚簇索引寻找，非聚簇索引寻找，RID查找，键查找，排序，汇总–流汇总，计算标量，串联，断言，哈希匹配联接，哈希匹配汇总，合并连接和嵌套循环连接执行计划运算符。在本文中，我们将深入探讨这些SQL Server执行计划操作符的第四组。

Before listing the fourth set of the SQL Server Execution Plan operators, let us prepare for the practical demos of this article by creating a new table and insert into that table 2K records for testing purposes. The table can be created and filled with the testing data using the T-SQL script below:

在列出第四组SQL Server执行计划运算符之前，让我们通过创建一个新表并将2K记录插入该表中以进行测试，为本文的实际演示做准备。可以使用以下T-SQL脚本创建表并用测试数据填充该表：

CREATE TABLE ExPlanOperator_P4 (
  ID INT IDENTITY(1, 1)
  ,CX_Name VARCHAR(50)
  ,CX_PhoneNum VARCHAR(50)
  ,CX_Address VARCHAR(MAX)
  ,CX_Credit INT
  )
GO
 
INSERT INTO ExPlanOperator_P4
VALUES (
  'Alen'
  ,'9625788954'
  ,'London'
  ,500
  ) GO 1000
 
INSERT INTO ExPlanOperator_P4
VALUES (
  'Frank'
  ,'962445785'
  ,'Germany'
  ,1400
  ) GO 1000

SQL Server段运算符 (SQL Server Segment Operator)

SQL Server Segment operators iare used to divide the input data into different groups based on their values. Assume that we run the below SELECT statement that uses the ROW_NUMBER() ranking function to rank the customers regarding their credits, including the Actual SQL Server Execution Plan of the query:

SQL Server Segment运算符用于根据其值将输入数据分为不同的组。假设我们运行下面的SELECT语句，该语句使用ROW_NUMBER（）排名函数对客户的信用进行排名，包括查询的“实际SQL Server执行计划”：

SELECT ID
  ,CX_Name
  ,CX_PhoneNum
  ,CX_Address
  ,ROW_NUMBER() OVER (
    PARTITION BY CX_Credit ORDER BY ID
    )
FROM ExPlanOperator_P4;

You can see from the generated SQL Server explian plan that, the SQL Server Engine will read all the requested data from the underlying table, sort these values bases on the Credit and ID values then use the Segment operator to partition the customers into groups depending on their credits, as shown in the SQL Server Execution Plan below:

您可以从生成SQL Server Explian计划中看到，SQL Server Engine将从基础表中读取所有请求的数据，根据Credit和ID值对这些值进行排序，然后使用Segment运算符将客户划分为多个组，具体取决于他们的信用，如下面SQL Server执行计划中所示：

SQL Server表假脱机操作符（惰性假脱机） (SQL Server Table Spool Operator (Lazy Spool))

Te SQL Server Lazy Spool is used to build a temporary table on the TempDB and fill it in lazy manner. In other words, it fills the table by reading and storing the data only when individual rows are required by the parent operator. Assume that we run the below query that returns all the customers with credit values greater than the average credit of all customers, using a sub select query that returns the average credit value for comparison and filtration purposes, including the Actual Execution Plan of the query:

SQL Server惰性缓冲池用于在TempDB上构建临时表并以惰性方式填充它。换句话说，仅当父运算符需要单独的行时，它才通过读取和存储数据来填充表。假设我们运行下面的查询，该查询返回的信用值大于所有客户的平均信用的所有客户，并使用子选择查询返回的平均信用值用于比较和筛选，包括查询的实际执行计划：

SELECT ID
  ,CX_Name
  ,CX_Credit
FROM ExPlanOperator_P4 CX1
WHERE CX_Credit > (
    SELECT AVG(CX_Credit)
    FROM ExPlanOperator_P4 CX2
    WHERE CX1.ID = CX2.ID
    )

From the generated SQL Server Execution Plan, you can see that the SQL Server Engine reads the data from the table first, sort the data before dividing it into segments, then create a temp table to store the data groups. On the other part of the explain plan, the SQL Server Engine reads from the Table Spool then calculate the average credit value for each group using the Stream Aggregate operator. The last Table Spool operator will read the grouped data and join it to retrieve the values higher than the average value. The three Table Spool operators will use the same temp table created at the first time, as shown in the SQL Server Explain Plan below:

从生成SQL Server执行计划中，您可以看到SQL Server Engine首先从表中读取数据，在将数据划分为段之前对数据进行排序，然后创建一个临时表来存储数据组。在解释计划的另一部分，SQL Server引擎从表假脱机读取，然后使用“流聚合”运算符计算每个组的平均功劳值。最后一个Table Spool运算符将读取分组的数据并将其合并，以检索高于平均值的值。这三个表假脱机操作符将使用第一次创建的相同临时表，如下面SQL Server说明计划所示：

SQL Server合并间隔运算符 (SQL Server Merge Interval Operator)

The SQL Server Merge Interval operator is used to perform a DISTINCT query by identifying the overlapping intervals and merge it to generate a non-overlapping interval with no duplicated predicates in a query, in order to avoid scanning the same values more than one time. Assume that we run the below SELECT statement that calculates the sum of the credits for a group of customers, including the Actual Execution Plan of the query, after creating an index on the ID column including the Credit column:

SQL Server合并间隔运算符用于通过标识重叠间隔来执行DISTINCT查询，并将其合并以生成不重叠的间隔，查询中没有重复的谓词，以避免多次扫描相同的值。假设我们在ID列（包括Credit列）上创建索引后，运行下面的SELECT语句来计算一组客户的信用额总和，包括查询的实际执行计划：

CREATE INDEX IX_ID ON ExPlanOperator_P4 (ID) INCLUDE (CX_Credit)
 
DECLARE @CredID1 INT = 10
  ,@CredID2 INT = 20
  ,@CredID3 INT = 50
  ,@CredID4 INT = 70
  ,@CredID5 INT = 90
 
SELECT SUM(CX.CX_Credit)
FROM ExPlanOperator_P4 CX
WHERE ID IN (
    @CredID1
    ,@CredID2
    ,@CredID3
    ,@CredID4
    ,@CredID5
    )

You can see from the generated SQL Server execution plan, how the SQL Server Engine uses the Merge Interval operator to identify the duplicates and speed up the data comparison and retrieval by performing one index seek operation rather than seeking the same data twice, as shown below:

您可以从生成SQL Server执行计划中看到，SQL Server Engine如何通过执行一次索引查找操作而不是两次查找相同的数据来使用Merge Interval运算符识别重复项并加快数据比较和检索的速度，如下所示：

SQL Server筛选器运算符 (SQL Server Filter Operator)

The SQL Server Filter operator is used to check the input data and return only the data that satisfies the predicate expression. Assume that we run the below SELECT statement that returns the number of duplicate credit values, including the Actual Execution Plan of that query:

SQL Server筛选器运算符用于检查输入数据，并仅返回满足谓词表达式的数据。假设我们运行下面的SELECT语句，该语句返回重复的信用值的数量，包括该查询的实际执行计划：

SELECT COUNT(cx_credit)
FROM ExPlanOperator_P4
HAVING COUNT(cx_credit) > 5

From the generated SQL Server Execution Plan, you can see that the SQL Server Engine uses the Filter operator to return the records that matches the HAVING clause predicate at the last stage, as in the Execution Plan in SQL Server below:

从生成SQL Server执行计划中，您可以看到SQL Server Engine使用筛选器运算符返回与最后一个阶段的HAVING子句谓词相匹配的记录，如下面SQL Server中的执行计划所示：

SQL Server在线索引插入运算符 (SQL Server Online Index Insert Operator)

When creating an index, SQL Server provides you with the ability to create or alter that index online, without preventing the clients from connecting to the underlying table during the index creation process. The SQL Server Online Index Insert operator is used for creating or altering the index online.

创建索引时，SQL Server可以在线创建或更改该索引，而不会阻止客户端在索引创建过程中连接到基础表。 SQL Server在线索引插入运算符用于在线创建或更改索引。

For more information about the index creation options, see the article SQL Server index operations.

有关索引创建选项的更多信息，请参见文章 SQL Server索引操作 。

Assume that we run the below CREATE INDEX T-SQL statement, including the Actual Execution Plan of the query:

假设我们运行下面的CREATE INDEX T-SQL语句，包括查询的实际执行计划：

CREATE INDEX IX_ID ON ExPlanOperator_P4 (ID) INCLUDE (CX_Credit)
  WITH (
      DROP_EXISTING = ON
      ,ONLINE = ON
      )

The generated SQL Server Execution Plan will show you that, the SQL Server Engine uses the Online Index Insert operator to perform the index creation process online, without holding a lock on the underlying table, as shown below:

生成SQL Server执行计划将向您显示，SQL Server引擎使用Online Index Insert操作符在线执行索引创建过程，而无需在基础表上保持锁定，如下所示：

SQL Server序列项目运算符 (SQL Server Sequence Project Operator)

The SQL Server Sequence Project operator is the closest friend of the Segment operator, that can be seen when using the ROW_NUMBER (), RANK () or DENSE_RANK() windowed functions. The Sequence Project operator is simply used while classifying the data into groups by adding 1 to the row count column when the Segment operator still working with the same group or resets the counter to 1 when start working with a new group. Let us revisit the same SELECT query that we used as an example for the Segment operator, including the Actual Execution Plan of that query:

SQL Server Sequence Project运算符是Segment运算符的最亲密的朋友，在使用ROW_NUMBER（），RANK（）或DENSE_RANK（）窗口函数时可以看到。当段运算符仍与同一个组一起使用时，在行计数列中加1时，或在开始与新组一起使用时将计数器重置为1时，在将数据分类到组中时，仅使用序列项目运算符。让我们重新访问我们作为段运算符示例使用的SELECT查询，包括该查询的实际执行计划：

SELECT ID
  ,CX_Name
  ,CX_PhoneNum
  ,CX_Address
  ,ROW_NUMBER() OVER (
    PARTITION BY CX_Credit ORDER BY ID
    )
FROM ExPlanOperator_P4;

The generated Execution Plan will show you the friendship between the Segment and Sequence Project operators, where the Segments will be used to classify the sorted data into groups and the Sequence Project operator will control completing with the current group or start counting with for a new group, as shown below:

生成的执行计划将向您显示“段”和“序列项目”运算符之间的友谊，其中“段”将用于将排序的数据分类为组，而“序列项目”运算符将控制当前组的完成或开始计算新组，如下所示：

SQL Server急切假脱机运算符 (SQL Server Eager Spool operator)

Te SQL Server Eager Spool operator is used to take all the records passed to it from another operator, read all the data at one time, blocking any access to the data during that one shot read and store it in a temp table in the tempdb database. Let us drop the previously created index on the ID column and replace it with Clustered index on the ID column and Non-Clustered index on the cx_credit column, using the T-SQL script below:

SQL Server Eager Spool运算符用于获取从另一运算符传递给它的所有记录，一次读取所有数据，在一次读取期间阻止对数据的任何访问，并将其存储在tempdb数据库的temp表中。让我们使用以下T-SQL脚本将先前创建的索引放在ID列上，并用ID列上的聚集索引和cx_credit列上的非聚集索引替换：

DROP INDEX IX_ID ON ExPlanOperator_P4
 
CREATE CLUSTERED INDEX IX_NEWID ON ExPlanOperator_P4 (ID)
 
CREATE INDEX IX_CX_Credit ON ExPlanOperator_P4 (CX_Credit)

Then execute the below UPDATE statement to modify the credit of a group if customers, including the Actual Execution Plan of the query:

然后，如果有客户（包括查询的实际执行计划），请执行以下UPDATE语句来修改组的信誉：

UPDATE ExPlanOperator_P4
SET cx_credit = cx_credit * 1.075
FROM ExPlanOperator_P4 WITH (INDEX = IX_CX_Credit)
WHERE cx_credit >= 900

The generated SQL Server Execution Plan will show you that the SQL Server Engine will seek for the requested data in the Non-Clustered index then it will use the Eager Spool operator to read the data from the index and write it to the temp table, blocking the access to that data during the read process, as shown below:

生成SQL Server执行计划将向您显示SQL Server Engine将在Non-Clustered索引中寻找请求的数据，然后它将使用Eager Spool运算符从索引中读取数据并将其写入临时表，从而阻止在读取过程中对数据的访问，如下所示：

SQL Server并行运算符 (SQL Server Parallelism Operator)

SQL Server will manage to execute a query using a parallel plan to speed up the execution of the expensive queries. The decision of using a parallel plan depends on multiple factors, such as whether the SQL Server should be installed on a multi-processor server, the requested number of threads should be available to be satisfied, the Maximum Degree of Parallelism option is not set to 1 and the cost of the query exceeds the previously configured Cost Threshold for Parallelism value. SQL Server performs that using the Parallelism operator. Assume that we run the below command to rebuild the highly fragmented clustered index on out demo table, including the Actual Execution Plan of the query:

SQL Server将设法使用并行计划执行查询，以加快昂贵查询的执行速度。使用并行计划的决定取决于多个因素，例如是否应在多处理器服务器上安装SQL Server，应满足所请求的线程数， 最大并行度选项未设置为1，并且查询的成本超过了先前配置的“并行度成本阈值”值。 SQL Server使用Parallelism运算符执行该操作。假设我们运行以下命令在演示表上重建高度分散的聚集索引，包括查询的实际执行计划：

ALTER INDEX IX_NEWID ON ExPlanOperator_P4 REBUILD

You will see from the generated SQL Server Execution Plan that the SQL Server Engine decided to speed up the execution of that heavy query using a parallel plan. This is clear from the existence of the Parallelism operator with yellow arrows identifying that operator, as shown below:

从生成SQL Server执行计划中，您将看到SQL Server引擎决定使用并行计划来加快该繁重查询的执行速度。通过存在带有黄色箭头的Parallelism运算符可以清楚地看出这一点，如下所示：

For now, we have covered most of the common SQL Server Explain Plan operators that you will deal with, while working with the Execution Plans in SQL Server. In the next article, we will jump another step in the SQL Server Execution Plans series. Stay tuned!

目前，在使用SQL Server中的执行计划时，我们已经介绍了您将要处理的大多数常见SQL Server解释计划运算符。在下一篇文章中，我们将在“ SQL Server执行计划”系列中迈出又一步。敬请关注！

目录 (Table of contents)

SQL Server Execution Plans Overview

SQL Server Execution Plans Types

How to Analyze SQL Execution Plan Graphical Components

SQL Server Execution Plans Operators – Part 1

SQL Server Execution Plans Operators – Part 2

SQL Server Execution Plans Operators – Part 3

SQL Server Execution Plans Operators – Part 4

SQL Execution Plan enhancements in SSMS 18.0

A new SQL Execution Plan viewer

Using the SQL Execution Plan for Query Performance Tuning

Saving your SQL Execution Plan

SQL Server执行计划概述

SQL Server执行计划类型

如何分析SQL执行计划图形组件

SQL Server执行计划操作员–第1部分

SQL Server执行计划操作员–第2部分

SQL Server执行计划操作员–第3部分

SQL Server执行计划操作员–第4部分

SSMS 18.0中SQL执行计划增强功能

新SQL执行计划查看器

使用SQL执行计划进行查询性能调整

保存您SQL执行计划

翻译自: https://www.sqlshack.com/sql-server-execution-plan-operators-part-4/

标签

#数据库 #mysql #java #python #索引