Implementing
Creation & Merging of Dynamic Cube Partitions using SSIS
In this
section I will explain step by step operations to create Dynamic Cube
Partitions.
We are using Analysis Services
Tutorial “Lesson 2 Complete” for this purpose.
This cube will look like this.
Deploy
SSAS Cube on your local server
Step 1: - Set Target Properties
Make sure that Target Server is
set to localhost and database is Analysis Services Tutorial.
Step
2: Create Template Partition
For this we need to create a default partition, which will act
like template partition
Double click on the cube and
navigate to Partition tab.
You will see a default partition as shown below:
Delete this default partition
and Click on new partition... to
click new partition. In Partition Wizard, Select “InternetSales”
as available table in Specify Source
Information page and click next. Select Specify a query to restrict row in Restrict Rows page and write WHERE condition
(WHERE 1 = 2) to restrict partition rows. This is just to create a partition
template with no data.
Click Next twice and enter Partition name as Internet_Sales_Partition_Template in the
Completing the Wizard page.
Select Design aggregation later
and check Deploy and Process now.
Finally click Finish.
Once cube is processed successfully, you can see a new database in Analysis Services:
We are done with one manual partition. Now it’s time to create the partitions dynamically and apply logic to process the new data every time.
First off all we need to create the create, process, alter and merge scripts.
Create Script
For this right click on the
Partition Internet_Sales_Partition_Template and
then select Script Partition as -> Create To -> New
Query Editor Window.
It will create XMLA scripts for
this partition. Save this file in the project with Create_InternetSales_Partition_Template.xmla name.
Process Script
Right click on the Partition Internet_Sales__Partition_Template and
then select Process. This
will open Process Partition wizard, click Script to
generate XMLA script and save this file in the project with Process_InternetSales_Partition_Template.xmla name.
Alter Script
For this right click on the
Partition Internet_Sales_Partition_Template and
then select Script Partition as ==> Alter To ==> New
Query Editor Window.
It will create XMLA scripts for this partition. Save this file
in the project with Alter_Internet_Sales_Partition_Template.xmla name.
Create
SQL Tables and Procs
We need some database objects need to
create in AdventureWorksDW2012 for this.
Below are details of these objects
S.No
|
Object Name
|
Type
|
Purpose
|
1
|
Check_Partition_Merge_Required
|
Proc
|
Use to check if partitions are required to merge or not.
|
2
|
Delete_PartitionLog
|
Proc
|
Use to delete the data in dbo.PartitionLog table.
|
3
|
Get_Cube_Partition_Details
|
Proc
|
Use to get the details for new partition need to create
|
4
|
Get_Cube_Processing_Details
|
Proc
|
Use to get the details of type of processing, i.e. normal run,
delete and process again or just process all again.
|
5
|
Get_Cube_XMLA_Scripts
|
Proc
|
Use to store the xmla scripts in variables.
|
6
|
Get_Current_Partition_Details
|
Proc
|
Get all the partitions details from dbo.PartitionLog table
|
7
|
Get_PartitionDetails_ToMerge
|
Proc
|
Use to get the details of Partition which need to merge
|
8
|
Insert_PartitionLog
|
Proc
|
Use to Insert the data in dbo.PartitionLog table.
|
9
|
Reset_Variables
|
Proc
|
Use to reset the variable values
|
10
|
Update_PartitionLog
|
Proc
|
Use to Update the data in dbo.PartitionLog table.
|
11
|
Insert Data in PartitionLog
|
Script
|
Insert Template partition detail in table.
|
12
|
Insert Data in PartitionQuery
|
Script
|
Insert xmla queries in table.
|
13
|
dbo.PartitionLog
|
Table
|
Use to store the Partitions details.
|
14
|
dbo.PartitionQuery
|
Table
|
Use to store the xmla queries (create, alter, merge and
process).
|
15
|
dbo.SSISLog
|
Table
|
Use to store the event logging of ssis package.
|
You can get the scripts of these
objects here.
Create
SSIS Package
We are done with SSAS and SQL development.
Now we have to create an SSIS
package to implement dynamic creation of cube partition and processing of that
partition.
Step
1: Create SSIS Package
Create a New Project of BIDS and Select Integration Services Project and enter the name of the SSIS Project as Tutorial_DynamicPartiton.
Rename the SSIS package1.dtsx by DynamicPartition.dtsx.
Step
2: Add connection managers
Add two connection Managers -
one OLE DB connection manager for database and another Analysis Services
connection manager for SSAS database.
Step 3: Add Package Variables
Add following Package variables:
Here is the brief description
of each variable:
Name
|
DataType
|
Description
|
Default Value
|
AlterPartitionXMLA
|
String
|
To store XMLA script to alter partition
|
|
CreatePartitionXMLA
|
String
|
To store XMLA script to create monthly partition
|
|
Cube_Name
|
String
|
Cube name
|
Analysis Services Tutorial
|
Database_Name
|
String
|
SSAS Database name
|
Analysis Services Tutorial
|
FromDateKey
|
String
|
To Store monthly partition start date key e.g. 2014-01-01
|
|
History_Flag
|
Int16
|
To check what kind of processing is required.
|
1
|
IsPartitionExists
|
Int16
|
To check whether monthly partition already exists or not
|
0
|
LastDate_Data
|
String
|
To store the last day of data in fact table for a particular
partition.
|
|
LoopIteration
|
Int16
|
Loop iteration variable set to 1
|
1
|
Measure_Group_Name
|
String
|
Measure group name of the cube
|
Internet Sales
|
Merge_Flag
|
Int16
|
To check whether monthly partition need to merge in yearly or
not.
|
0
|
MergePartitionXMLA
|
String
|
To store XMLA script to merge partitions
|
|
PartitionList
|
Object
|
To store the existing partition list
|
|
PartitionName
|
String
|
Monthly partition name e.g. InternetSales_20120101_20120131
|
|
PartitionToMerge
|
String
|
Name of Partition which need to merge in yearly partition.
|
|
Process_Flag
|
Int16
|
||
ProcessPartitionXMLA
|
String
|
To store XMLA script to process monthly partition
|
|
Success_Flag
|
Int16
|
To store the success flag, i.e. whether process need to run or
not, in case of monthly partition
|
|
Template_Name
|
String
|
Template name prefix for the partitions
|
Internet_Sales_
|
ToDateKey
|
String
|
To Store monthly partition end date key e.g. 2014-01-31
|
|
TotalMonthstoProcess
|
String
|
Numbers of months in the fact data to run the loop for all in
once.
|
Step 4 –Package Development
Generally there are 4 different
scenarios which can happens while processing a cube.
1)
Fresh data is available for processing
2)
All data need to be processed again (due to some data changes in
tables)
3)
Cube is altered and all partitions are need to be created again.
4)
No fresh data is available for processing.
We are handling all these
scenarios in this package.
Variable History_Flag is very
important, as this will tell us what kind of processing we required. This will
be passed as Input Parameter to the package.
There could be 3 values for
this i.e.
1 means normal / incremental
processing
2 means there is some data
changes and we need to process whole cube.
3 means there is schema changes
/ cube is altered which cause all partitions to be override and we need
to create all partitions again.
We are making 12 monthly partitions, of last 12 months and rest
will be merged into a yearly partition of that year.
Now you have to develop SSIS
package as shown below:
Task 0 - Exec_SQL_Tk - Check Processing type based
on History_Flag
First Execute SQL Task is to
check what path need to use for processing based on the @History_Flag value.
In Execute SQL Task Editor, select ResultSet Single row, ConnectionType OLE DB, Connection AdventureWorksDW2012, SQLSourceType
Direct input and
SQLStatement as:
Exec [Get_Cube_Processing_Details] ?,?,?
Click on Parameter Mapping and
map the parameters as shown below:
Click on Result Set and map the
result as shown below:
This will give data as below:
Note: in case of @History_Flag
= 2, TotalMonthstoProcess is null as in this case we just need to process the
whole cube and not partition wise.
If there is no fresh data to
process it will return TotalMonthstoProcess as null and History_Flag as 0.
Task 1.1 – Seq_Container - Re-Process Whole Cube
If
History_Flag = 2 then this container will execute and process the whole cube.
Task 1.1.a – AS_Processing_Tk - Re-Process Whole Cube
Task 1.1.b – Exec_SQL_Tk - Update info of Monthly Partition
Use
this to update the dbo.Partition table.
In Execute SQL Task Editor, select ResultSet Single row, ConnectionType OLE DB, Connection AdventureWorksDW2012, SQLSourceType
Direct input and
SQLStatement as:
exec Update_PartitionLog ?,?,?,? ,2
Click on Parameter Mapping and map the parameters as shown
below:
Task 1.2 – Seq_Container - Create fresh partitions
after Alter
If
there is any change in the schema of cube, an alter / drop create statement is
executed which will cause drop of all the partitions, except the template one.
So in this case we need to delete it from the dbo.PartitionLog table as well,
however we are still checking in SSAS cube and drop, if found and then delete
from the dbo.PartitionLog table as well.
Note:
this will run only if History_Flag value is 3
Task 1.2.a – Exec_SQL_Tk - Get List of All
Partitions to delete
First
task in this is to get a list of all the Partitions using proc
get_current_partition_details
In Execute SQL Task Editor, select ResultSet Full result set, ConnectionType OLE DB, Connection AdventureWorksDW2012, SQLSourceType
Direct input and
SQLStatement as:
exec Get_Current_Partition_Details ?,?
Click on Parameter
Mapping and map the parameters as shown below:
Click on Result Set
and map the result as shown below:
This will return the
list of Partitions (Active as well as already merged)
Task 1.2.b – Foreach_Loop_Container - Delete
Partitions from Cube
In Foreach Loop editor select Enumerator as Foreach ADO Enumerator and User:PartitionList in the ADO object
source variable
and
map the variable PartitionName to Index 0
In
this Foreach-Loop we have a script task
Task 1.2.b – Script_TK - Check and Delete Partitions from Cube
This
task is to check if partition exists, on the SSAS cube, and drop them if
exists.
ReadOnlyVariables User::Cube_Name,User::Database_Name,User::Measure_Group_Name,User::PartitionList,User::PartitionName
ReadWriteVariable - User::IsPartitionExists
Task 1.2.c – Exec_SQL_Tk - Delete Partitions
details from Log table
Finally
it will delete these partitions details from the table as well.
In Execute SQL Task Editor, select ResultSet None, ConnectionType OLE
DB, Connection AdventureWorksDW2012, SQLSourceType
Direct input and
SQLStatement as:
exec
[Delete_PartitionLog] @Database_Name, @Cube_Name
Click
on Parameter Mapping and map the parameters as shown below:
Note: it will not delete the Template partition.
Task 1.0 – For Loop Container - Create and Process Partitions
Now
in this loop container we create the monthly partition process it and if number
of active monthly partitions goes beyond to 12 then it will create and process
a yearly partition of the first month year and merge that to yearly partition.
This
will run only when History_Flag value is 1 or 3.
We
need to run the loop for all data available in the fact table. TotalMonthstoProcess
will store the total no of month’s data in the fact table.
Set
the For loop properties as below:
Task 1.0.1 – Exec_SQL_Tk - Check Partition Details
In this we’ll initialize FromDateKey, ToDateKey, PartitionName
and LastDate_Data for the next partition.
Note: As we are making monthly partitions
so in this case ToDateKey will return the last date of the FromDateKey month
and LastDate_Data will return the maximum date of that month, for which data is
available.
For example if it 1st of Jan 1014 and we have data
till 1st of Jan, then FromDateKey and LastDate_Data
will return 2014-01-01 and ToDateKey will return the last date
of jan i.e. 2014-01-31.
In Execute SQL Task Editor, select ResultSet Single row, ConnectionType OLE DB, Connection AdventureWorksDW2012, SQLSourceType
Direct input and
SQLStatement as:
Exec [Get_Cube_Partition_Details]
?,?,?
Click
on Parameter Mapping and map the parameters as shown below:
lick
on Result Set and map the result as shown below:
this will return below data
If Success_Flag = 1 then it means we need to process further.
Task 1.0.2 – Exec_SQL_Tk - Store XMLA to variables
In Execute SQL Task Editor, select ResultSet Single row, ConnectionType OLE DB, Connection AdventureWorksDW2012, SQLSourceType Direct input and SQLStatement as:
Exec
[Get_Cube_XMLA_Scripts] ?,?
Click
on Parameter Mapping and map the parameters as shown below:
Click
on Result Set and map the result as shown below:
this will return below data
Task 1.0.3 – Script_Tk - Generate XMLA scripts to create and process Monthly Partition
Script Task is used for two reasons -
1. To check whether current partition exists or not.2. To update XMLA content for current Partition.
In
script
User::Cube_Name,User::Database_Name,User::FromDateKey,User::Measure_Group_Name,User::PartitionName,User::ToDateKey
will be the ReadOnly Variables and
User::CreatePartitionXMLA,User::IsPartitionExists,User::ProcessPartitionXMLA
will be ReadWriteVariable
This is the most tricky and interesting part here. We need to replace Partition ID, Partition Name, and Source QueryDefinition for current Partition:
Task 1.0.4 – Seq_Container - Create Monthly Partition
Task 1.0.4a – AS_Exec_DDL_Tk - Create new monthly partition
Second task in the sequence
Container is Analysis Services Execute DDL Task. This task is used to create
new Partition. In Analysis Services Execute DDL Task Editor, select Analysis Services Tutorial as
Connection, Variable as SourceType, and User::CreatePartitionXMLA
as Source.
Also don't forget to set
precedence constraint. You should execute this task only when partition does
not exists. Open Precedence Constraint Editor and Select Expression
and Constraint in Evolution operation, Success in Value, and @IsPartitionExists
== 0 in Expression
Task 1.0.4.b – Exec_SQL_Tk - Insert monthly partition info in log table
This is
used to store newly created Partition information in the PartitionLog table.
In
Execute SQL Task, select ResultSet None, ConnectionType OLE DB,
Connection AdventureWorksDW2012,
SQLSourceType Direct input and SQLStatement as:
Exec
[Insert_PartitionLog] ?,?,?,?,?,?, ‘Monthly’
Click on Parameter Mapping and
map the parameters as shown below:
Task 1.0.5 – Sequence Container - Process Monthly Partition
Now we need to process the
current partition. This is independent of previous sequence
container where we are creating new partition if it is not exists.
We need to process the current partition every time we execute the package. New partition will be created on the first day of month but partition should be processed every day to include delta data.
In case you are in current month partition, if you execute the package every day in current month - it will not create any new partition till beginning of next month. It will only process Current Partition to include the data in the cube as shown below:
We need to process the current partition every time we execute the package. New partition will be created on the first day of month but partition should be processed every day to include delta data.
In case you are in current month partition, if you execute the package every day in current month - it will not create any new partition till beginning of next month. It will only process Current Partition to include the data in the cube as shown below:
This sequence
container contains three tasks.
Task 1.0.5.a – AS_Processing_Task - Process Dimensions (Monthly Partition)
First task is Analysis Services Processing Task - to
process the dimensions data. Select Analysis
Services Tutorial as connection manager and all the dimensions in Object
Name - which are not static in data. Select Process Update as Process Option.
Task 1.0.5.b – AS_Processing_Task - Process Monthly Partition
Second
Task is again Analysis Services Processing Task - to process Current
Partition. Select Analysis Services
Tutorial as connection manager and Internet_Sales_Partition_Template in Object Name - which is default partition (or Template
partition). Select Process Full as Process Option.
Task 1.0.5.c – Exec_SQL_Tk - Update info of Monthly Partition
Store
the Last Processed Date and Processed Count (number of time this partition has
processed) in the log table using Execute SQL task.
Select
ResultSet None, ConnectionType OLE DB, Connection ADVENTUREWORKSDW2012,
SQLSourceType Direct input and SQLStatement as:
Exec
[Update_PartitionLog] ?,?,?,?, 1
Click
on Parameter Mapping and map the parameters as
Now we are done with package development for creation of partitions.
Now we’ll see how to do the Dynamic Merging of Partitions.
Task 1.0.6 – Exec_SQL_Tk - Check Merge need to be done or not
As per requirement we are making last 12 months partitions and
then create and merge in yearly for rest of the partitions. So in this task we
check if we have 12 active partitions or not. It will goes further only if we
have more than 12 active monthly partitions.
In Execute SQL Task, select ResultSet None,
ConnectionType OLE DB, Connection AdventureWorksDW2012, SQLSourceType Direct input and
SQLStatement as:
Exec
[Check_Partition_Merge_Required] ?,?
Click
on Parameter Mapping and map the parameters as shown below:
Click
on Result Set and map the result as shown below:
Task 1.0.7– Seq_Container - Check details for Merge Partitions
Task 1.0.7.a – Exec_SQL_Tk - Set Current Partition name and Date Range for Merge Operation
First Execute SQL Task is used
to initialize FromDateKey, ToDateKey, PartitonName, Success_Flag and
LastDate_Data and PartitionToMerge for the partition which need to be merged.
Use below query in this Task:
Exec
[Get_PartitionDetails_ToMerge] ?,?,?
Click
on Parameter Mapping and map the parameters as shown below:
Click
on Result Set and map the result as shown below:
Output
of this will be like this
Task 1.0.7.b – Exec_SQL_Tk - Store XMLA to variables
In Execute SQL Task Editor, select ResultSet Single row, ConnectionType OLE DB, Connection AdventureWorksDW2012, SQLSourceType
Direct input and
SQLStatement as:
Exec
[Get_Cube_XMLA_Scripts] ?,?
Click
on Parameter Mapping and map the parameters as shown below:
Click
on Result Set and map the result as shown below:
this will return data like this
Task 1.0.7.c – Script_Tk - Generate XMLA script to Create and Process Yearly Partition
Script Task is used for below reasons -
1. To check whether current yearly partition exists or not.
2. To update XMLA content for yearly partition creation, merging, processing and altering of scripts.
2. To update XMLA content for yearly partition creation, merging, processing and altering of scripts.
ReadOnlyVariables
User::Cube_Name,User::Database_Name,User::FromDateKey,User::Measure_Group_Name,User::PartitionName,User::ToDateKey
ReadWriteVariables
User::AlterPartitionXMLA,User::CreatePartitionXMLA,User::IsPartitionExists,User::MergePartitionXMLA,User::PartitionToMerge,User::ProcessPartitionXMLA
We
need to replace Partition ID, Partition Name, and Source QueryDefinition
for current Partition which are highlighted below:
Task 1.0.8– Seq_Container - Create Yearly Partitions
Task 1.0.8.a – AS_Exec_DDL_Tk - Create New Yearly Partition
First task in the sequence container
is Analysis Services Execute DDL Task. This task is used to create new yearly partition.
In Analysis Services Execute
DDL Task Editor, select Analysis
Services Tutorial as Connection, Variable as SourceType, and User::CreatePartitionXMLA
as Source.
Also don't forget to set
precedence constraint. You should execute this task only when partition does
not exists. Open Precedence Constraint Editor and Select Expression
and Constraint in Evolution operation, Success in Value, and @IsPartitionExists
== 0 in Expression
Task 1.0.8.b – Exec_SQL_Tk - Insert Yearly Partition Info in Log Ttable
This is used to store newly created partition information in the PartitionLog table.
In
Execute SQL Task, select ResultSet None, ConnectionType OLE DB,
Connection AdventureWorksDW2012,
SQLSourceType Direct input and SQLStatement as:
Exec
[Insert_PartitionLog] ?,?,?,?,?,?, ‘Yearly’
Click on Parameter Mapping and
map the parameters as shown below:
Task 1.0.8.c – AS_Processing_Task - Process Dimension (Yearly Partition)
Now we need to process the newly
created year partition.
In Analysis Services Processing
Task select Analysis Services Tutorial as connection manager and all the
dimensions in Object Name - which are not static in data. Select Process
Update as Process Option.
Task 1.0.8.d – AS_Processing_Task - Process Yearly Partiton
Next Task
is again Analysis Services Processing Task - to process Current
Partition.
Select Analysis Services Tutorial as
connection manager and Internet_Sales_Partition_Template
in Object Name -
which is default partition (or Template
partition). Select Process Full as Process Option.
Task 1.0.8.e – Exec_SQL_Tk - Update info of Yearly Partition in Log Table
Now we
need to store the Last Processed Date and Processed Count (number of time this
partition has processed) in the log table using Execute SQL task.
Select
ResultSet None, ConnectionType OLE DB, Connection ADVENTUREWORKSDW2012,
SQLSourceType Direct input and SQLStatement as:
Exec
[Update_PartitionLog] ?,?,?,?, 1
Click
on Parameter Mapping and map the parameters as
Task 1.0.9– Seq_Container - Merge Monthly Partition
to Yearly Partition
Now we need to merge the first
monthly partition to Yearly Partition.
This is independent of previous
sequence container where we are creating new partition if it is not
exists.
However, we need to merge the last partition every time we execute the package. In most of the cases, new partition will be created only once but partition should be merged on first processing of every month.
This sequence container contains 4 tasks.
However, we need to merge the last partition every time we execute the package. In most of the cases, new partition will be created only once but partition should be merged on first processing of every month.
This sequence container contains 4 tasks.
Task 1.0.9.a – AS_Exec_DDL_Tk - Merge Monthly Partition to Yearly Partiton
Second task in the sequence Container is Analysis Services Execute DDL Task.
This task is used to create new Partition. In Analysis Services Execute DDL
Task Editor, select Analysis Services Tutorial as Connection, Variable
as SourceType, and User::MergePartitionXMLA as Source.
Task 1.0.9.b – Exec_SQL_Tk - Update info of Merging Partition to Log Table
Second Task in
the Sequence Container is Execute SQL Task - this is used to update newly merged Partition
information in the PartitionLog table.
In
Execute SQL Task, select ResultSet None, ConnectionType OLE DB,
Connection ADVENTUREWORKSDW2012, SQLSourceType Direct input and
SQLStatement as:
Exec
[Update_PartitionLog] ?,?,?,?, 3
Click on Parameter Mapping and map the parameters as shown below:
Task 1.0.9.c– AS_Exec_DDL_Tk - Alter Definition of Yearly Partition
Third task in the sequence Container is Analysis Services Execute DDL Task.
This task is used to alter the “QueryDefinition” of yearly partition. In Analysis
Services Execute DDL Task Editor, select Analysis Services Tutorial
as Connection, Variable as SourceType, and User::AlterPartitionXMLA
as Source.
Task 1.0.9.d – Execute SQL Task - Update LastDate_Data for Yearly_Partition
Last Task in the Sequence Container is Execute SQL Task - this is used to
update yearly Partition information in the PartitionLog
table.
In
Execute SQL Task, select ResultSet None, ConnectionType OLE DB,
Connection ADVENTUREWORKSDW2012, SQLSourceType Direct input and
SQLStatement as:
Exec
[Update_PartitionLog] ?,?,?,?, 4
Click on Parameter Mapping and map the parameters as shown below:
Task 1.0.10– Exec_SQL_Tk - Reset Variables
As
we are processing data in loop, we need to reset some variables.
In
Execute SQL Task Editor, select ResultSet Single row, ConnectionType OLE DB,
Connection AdventureWorksDW2012, SQLSourceType Direct input and SQLStatement
as:
Exec
[Reset_Variables] ?,?
Click
on Parameter Mapping and map the parameters as shown below:
Click
on Result Set and map the result as shown below:
We are done with final package
development. Now it’s time to execute the package and check the dynamic behavior
of Partition creation and processing. Before execution the package, there is
only one partition in the SSAS Analysis
Services Tutorial database.
Now Right Click on the package and execute it. Once package is execute successfully, you will see one more partition in the database and one more entry in PartitionLog table.
Now Right Click on the package and execute it. Once package is execute successfully, you will see one more partition in the database and one more entry in PartitionLog table.
If you query dbo.PartitionLog
then it will return data like this.
If you browse cube it will show
you all the partitions, you can see 12 monthly partitions -one partition for
each month and one yearly partition as shown below:
You can get the all scripts used in project here.
This is my first attempt to write a post, so there may be some formatting or small mistakes :).
If you require further details or want SSIS package, just drop me a mail.
If you require further details or want SSIS package, just drop me a mail.




























































No comments:
Post a Comment