SSIS Performance - Parallelism
2020-11-09 15:14:14 Editor: 小采



Parallelism has shown up in almost every field since multi-core processors came into play, and SSIS is no exception. SSIS lets us configure parallelism at two different granularities:

Package Level

By setting the MaxConcurrentExecutables property on the package, we tell the SSIS engine how many executables can run simultaneously. The default value is -1, which means the number of processors plus 2.
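To make the meaning of the default concrete, here is a minimal Python sketch of the rule described above. It is not SSIS code: the function name `effective_max_concurrent` is hypothetical, and rejecting zero or other negative values is an assumption of this sketch.

```python
import os


def effective_max_concurrent(setting: int, logical_processors: int) -> int:
    """Resolve a MaxConcurrentExecutables-style setting to a concrete limit.

    Hypothetical helper for illustration only; it mirrors the rule that
    -1 means "number of processors plus 2".
    """
    if setting == -1:
        return logical_processors + 2
    if setting < 1:
        # Assumption of this sketch: only -1 or a positive integer is accepted.
        raise ValueError("setting must be -1 or a positive integer")
    return setting


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    print("default limit on this machine:", effective_max_concurrent(-1, cores))
    print("explicit limit:", effective_max_concurrent(2, cores))
```

On a machine with 4 processors, for example, the default would allow 6 executables to run at once.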

Now let's do a very simple exercise. I create three Data Flow Tasks in the package and set the MaxConcurrentExecutables property to 2, which means only 2 executables are allowed to run simultaneously. Then I set a breakpoint on each of them:

Then let's run the package. You will find that only two tasks are running now; the third one has to wait until one of them finishes:

Then let's set MaxConcurrentExecutables to 3 and execute the package again. Now we can see all three tasks running simultaneously:
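The experiment can be imitated outside SSIS with a small Python simulation. This is only a toy model under stated assumptions: a thread pool stands in for the SSIS engine, its size stands in for MaxConcurrentExecutables, and each dummy task sleeps instead of moving data. The name `peak_concurrency` is hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_concurrency(task_count: int, max_concurrent: int, task_seconds: float = 0.2) -> int:
    """Run task_count dummy tasks under a concurrency cap and return the peak overlap."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def task() -> None:
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(task_seconds)  # stand-in for the work a Data Flow Task would do
        with lock:
            running -= 1

    # The pool size plays the role of MaxConcurrentExecutables in this toy model.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        for _ in range(task_count):
            pool.submit(task)
    return peak


if __name__ == "__main__":
    print("3 tasks, limit 2:", peak_concurrency(3, 2))
    print("3 tasks, limit 3:", peak_concurrency(3, 3))
```

With a limit of 2 the third task has to queue until a slot frees up, and with a limit of 3 all three overlap, which matches what the breakpoints show in the package.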

Data Flow Level

Now we have 3 executables (Data Flow Tasks) in the package, and all of them run simultaneously once we set MaxConcurrentExecutables = 3. Next, let's look inside the Data Flow Task: its EngineThreads property indicates the number of threads that the Data Flow Task can use during execution.

This definition is a little obscure at first glance, so let me briefly explain the background. In general, the Data Flow Task is the only place where SSIS itself does E-T-L (you may say we can also do this with an Execute SQL Task, but in that case it is the SQL Server engine doing the ETL and SSIS just makes a call). In the simplest scenario, where the Data Flow just extracts data from a source and loads it into a destination, we need one buffer and two threads: one, called the Source Thread, extracts data from the source; the other, called the Worker Thread, handles transformation and the destination.
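The one-buffer, two-thread arrangement is essentially a producer/consumer pair. The Python sketch below illustrates that idea only; it does not reproduce SSIS internals. The bounded queue is a simplified stand-in for the buffer, and the names `run_pipeline`, `source`, and `worker` are hypothetical.

```python
import queue
import threading


def run_pipeline(rows, buffer_size: int = 2):
    """Move rows from a source thread to a worker thread through one shared buffer."""
    buffer = queue.Queue(maxsize=buffer_size)  # simplified stand-in for the buffer
    loaded = []
    done = object()  # sentinel telling the worker that the source is exhausted

    def source() -> None:
        # "Source Thread": extracts rows and puts them into the buffer.
        for row in rows:
            buffer.put(row)  # blocks while the buffer is full
        buffer.put(done)

    def worker() -> None:
        # "Worker Thread": takes rows from the buffer and loads them.
        while True:
            row = buffer.get()
            if row is done:
                break
            loaded.append(row)

    threads = [threading.Thread(target=source), threading.Thread(target=worker)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return loaded


if __name__ == "__main__":
    print(run_pipeline(["row-1", "row-2", "row-3"]))
```

Because the buffer is bounded, the source thread pauses whenever the worker falls behind, so the two threads stay loosely in step.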

But that is only the simplest scenario. In most cases the Data Flow performs some transformations (such as Union, Lookup, Derived Column, etc.) and therefore needs more threads. SSIS uses the concept of an Execution Tree for this: each Execution Tree means SSIS must create a buffer and needs a thread.

Now I create 4 Source -> Destination paths in every Data Flow Task, which means there are 4 execution trees per Data Flow Task, and SSIS needs 4 worker threads if we want all of them to run simultaneously.

If we set EngineThreads = 2, then only two of those Source -> Destination paths should be able to run simultaneously. (When I tried this on SQL Server 2012, I found that all 4 ran simultaneously. I am still wondering why, and will update this once I find the answer.)
