Saturday 29 August 2015

Role Playing Dimension in Tabular Model(SSAS)

Role Playing dimension is a single dimension which has multiple references with the fact. Each of these references logically means a different role. Date dimension is the commonly used role playing dimension. The fact dimensions used to have different date columns and the date dimensions used to have reference to these columns.

In Tabular model there is no direct way to define different role to the a dimension. There are some work around to available to get role based dimension feature. 

Let's take the AdventureWorks example:

In this example we will take FactInternetSales dimension and DimDate dimesion. Following are the date keys those are present in the fact:

  1. OrderDateKey
  2. DueDateKey
  3. ShipDateKey
Create the first relationship between FactInternetSales and DimDate on the basis of Order Date.


This is the active relation between FactInternetSales and DimDate.

Now create the relation between FactInternetSales and DimDate on the basis of Ship Date.


You can find the active checkbox is unchecked and there a dotted line between FactInternetSales and DimDate which represent the inactive relationship.


This is how you can have multiple relationships between a fact and a dimension. Now we will use these relations while calculating measure.

Create a measure Internet Total Sales:=SUM([SalesAmount]).


Process the cube and analyse the data in an excel. After analyzing you will found the measure is calculated on the basis of active relationship which is Order Date.




Now let's create another measure by using the inactive relation Ship Date.

Internet Total Sales on Ship Date:=CALCULATE(SUM([SalesAmount]),USERELATIONSHIP(FactInternetSales[ShipDateKey],DimDate[DateKey]))




Process the cube and analyse the data in an excel. After analyzing you will found the measure is calculated on the basis of in-active relationship which is Ship Date.




Sunday 5 April 2015

Triggering a SSIS package by placing a file in a particular location

In SQL server 2012 you can trigger a SSIS package by placing a file in a particular location. This can be done by using a File table feature present in SQL Server 2012. This is a new feature introduced in SQL Server 2012.

File Table is a special table where you can store a file in SQL Server but the files can be accessed as if it is stored inside file system.

When a file is placed inside the File table location a record get inserted inside the File table. You can have after insert trigger over the File table and the trigger intern execute the package.

Enable File Steam feature inside the Server:

Step1: Execute the following query:

EXEC sp_configure filestream_access_level, 2
RECONFIGURE

Step2: Open the Sql Server Configuration Manager and go to the properties of the SQL Server. Open the FileStream tab.


Enable File Stream for the database

Step1: Run the following code for your database:

ALTER DATABASE database_name
SET FILESTREAM ( NON_TRANSACTED_ACCESS = FULL, DIRECTORY_NAME = N'directory_name' )

Step2: Open the properties of the database. Go to the Filegroup tab and add FileSTREAM:


Step3: Go to the Files tab and add a File in the file category as 'FILESTREAM Data':


Creation of the File Group:

Create the File Group by executing the following query:

CREATE TABLE FileDropLoc AS FileTable
WITH
(FileTable_Directory = 'FileDropLoc');

Then you can access the file location as:

\\<ip-address>\mssqlserver\DarkHorse\FileDropLoc

Place a file in the above path and run the following query:


You can find one record is present inside the File table. 

You can also find the file table created under the File table hierarchy 


Now you can write a after insert trigger on the File table and can execute the package that you want by using dtexec command line utility or you can run the SQL server job which intern run the package.

Saturday 4 April 2015

How to use fast load option in OLEDB destination(avoiding bulk data rejection due to violation of constraints)?

The common issue in OLEDB destination with fast load option is if some error occurs due to constraint violation the whole batch will fail and it will be redirected to the error flow. In this blog i will show you some work around.
  1. Remove all the constrains from the table.
  2. Have a trigger to validate the constrains like: not null, check constrain where the constraint only applied on the row only.
  3. Have a stored procedure to validate the constraints like: primary key, unique
Let me illustrate this by an example:

There is a table called Emp which is having the following columns and constraints:

DeptID column is having a reference to the DeptID column of the Dept table. Structure of the Dept table is below:


The data inside the Dept table is below:



For the work around i have created two tables Emp_Temp and Emp_Error.

Emp_Temp: Same structure as EMP without any constraints.
Emp_Error: Same structure as Emp_Temp with an extra column named ErrorDesc to store the error description.

Now we will populate the Emp table by picking data from a flat file source. Following is the data present inside the flat file:





The package should be like this:


There should be a trigger written on the top of EMP_Temp table. The trigger should be as follows:


CREATE TRIGGER dbo.Tri_EMP
   ON  dbo.Emp_Temp
   AFTER INSERT
AS 
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;

    -- Insert statements for trigger here
Insert into Emp_Error (
EMPID,
EMPNAME,
DeptID,
Salary,
ErrorDesc
)
SELECT 
EMPID,
EMPNAME,
DeptID,
Salary,
'Not null violation'
FROM inserted
Where Salary is null
OR EMPNAME is null
END
GO

Following property need to be set in the OLEDB Destination so that the trigger will be fired:


There should be a procedure as below which will enforce the PK and FK constraints:

ALTER PROCEDURE EMP_Constraint
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;

    Insert into [dbo].[Emp_Error](
EMPID,
EMPNAME,
DeptID,
Salary,
ErrorDesc
)
SELECT 
EMPID,
EMPNAME,
DeptID,
Salary,
'PK Violation.'
FROM dbo.Emp_Temp
WHERE EMPID IN (SELECT EMPID FROM Emp_Temp GROUP BY EMPID HAVING COUNT(1)>1)

Insert into [dbo].[Emp_Error](
EMPID,
EMPNAME,
DeptID,
Salary,
ErrorDesc
)
SELECT 
EMPID,
EMPNAME,
Emp_Temp.DeptID,
Salary,
'FK Violation.'
FROM dbo.Emp_Temp
LEFT
JOIN dbo.Dept
ON Emp_Temp.deptid=Dept.DeptID
WHERE Dept.DeptID IS NULL

DELETE dbo.Emp_Temp
WHERE EMPID IN (SELECT EMPID FROM Emp_Error)
END

Once the data moves into the EMP_Temp table then that need to be transferred to the actual EMP table.
Try it out for your scenario it even works for bulk amount of data. I have used it for data of size 50 million. Good luck.

Friday 12 September 2014

Execute SQL Task

Introduction:

  • Execute SQL task is a very common task used inside SSIS packages.
  • As the name says,it can execute a SQL query inside the control flow.

How to configure:

  • Execute SQL task is present inside the toolbox in the control flow tab of a package.
  • Drag the task and place it inside the control flow
  • Double click on the task.

  • The Execute SQL task editor will appear where you can see all different configurable properties of the task.
  • Both DML and DDL SQL scripts can be executed by using the Execute SQL task.
  • In this blog we will see how to execute a SQL script which doesn't return any result set.

Executing a SQL Statement which doesn't return any result set:

In this demo a create statement will be executed by Execute SQL task which will not return any result set.
  • Let's say there is a DB named test.
  • We are going to create a table named EMP with the following attributes.

  • To execute the above statement inside the Execute SQL task we need to create a connection to the TEST Database. Please refer the following post to get the details regarding creating a connection:Creating connection manager in ssis
  • Once the connection manager is created for the Database where the table need to be created go the Execute SQL Task editor.
  • Chose Connection manager name from the Connection drop down
  • Then place the SQL Statement inside the SQL Statement textbox.
  • After that click OK and execute the package
  • Once the package executed successfully you can see a EMP table is created inside the server mentioned in the OLEDB connection manager.
  • If you try to re-execute the package it will throw error as the table is already created.

Creating connection manager in SSIS

Introduction:

  • Creating a connection manager is the first step of creating a Package and is a very important part.
  • After creating connections only we can do operations inside databases and files.

OLEDB Connection:

  • OLEDB connection is required to do any operations inside a Database.
  • Here we will be creating a OLEDB connection to a SQL Server.
  • To create a connection manager go to the connection manager pan in a package.
  • Right click on it and choose 'New OLEDB connection' option.
  • Click on 'New'.
  • Fill the Server information like Server Name, Authentication type, choose the Database and then click test connection.
  • If all the information is correct you will get the following pup-up.
  • Click OK and a new connection manager will be created inside the Connection manager Pan.

Thursday 11 September 2014

Getting started with SSIS

What is ‘SSIS’:


  • ·         SQL Server integration services (SSIS) is a tool used to Extract, Transform and Load (ETL) data in different databases.
  • ·         SSIS is a Microsoft product lunch with SQL Server 2005. Prior to this it was known as DTS services.

Pre-requisite:


  • ·         Install SQL Server 2005 or any other version later to it. This will automatically install SSIS Engine to your system.
  • ·         Open SQL Server Business Intelligence Development studio (BIDS) to create SSIS package. 

Getting Started:


  • ·         To create a SSIS package first a Solution and a SSIS Project need to be created.
  • ·         A solution is a collection of different types of project like: SSIS project, SSMS project, SSAS project and different C# projects.
  • ·         A SSIS Project is a collection of SSIS packages and other files related to the project.
  • ·         To create a solution go to File->New->Project
  • ·         Click on Integration Services Project.



  • ·         Fill the Name and location. By default the solution name will be the Name of the project. If you want you can customize the name of the solution. Then press OK. 

  • ·         Once you click OK a solution and a project will be created along with a default package.

Getting familiar with BIDS:


  • ·         The following screen will appear once you click OK.
  • ·         The following panes are present inside the BIDS:
o   Solution Explorer
o   Properties
o   Toolbox
o   Variables
o   Package
§  Control Flow
§  Data Flow
§  Event Handlers
§  Package Explorer
§  Connection Manager

Solution Explorer:

·         Shows all the Data Sources, Data views, Packages and files preset inside the solution.

Properties:

·         Shows the property of the selected component.

Toolbox:

·         Contains different tasks when the Control flow tab is open.
·         Contains different transformations when the Data flow tab is open.

Variables:

·         Contains all the user defied and system defined variables present inside the package.

Package:

·         Package is a collection of task that is to be performed to do the ETL.
·         It’s a XML file containing all the ETL instructions inside the XML tags and saved with .dtsx extension.
There are four tabs present inside the package area:

Control Flow:

·         Contains all tasks present inside a package.
·         Task is the granular level of work that a package performs. For example: Execute SQL Task, Data Flow task, Send mail task

Data Flow:

·         Data flow contains source where the data need to be extracted, transformations which modifies data according to the business requirement and target where the transformed data going to reside.

Event Handler:

·         Event handler also contains task like the control flow. The only difference is the tasks will trigger only when the associated event happens.
·         The events can be associated on the package level, container level or on the task level.

Package Explorer


·         Contains the hierarchical structure of all the components present inside a package.

Tuesday 1 October 2013

How to handle source records containing delimiters in delimted Flat file connection manager

Introduction

It's common scenario in ETL while you find presence of delimiter character inside the source records while importing data into flat file. Though the flat file will be exported without any error while reading it back it will create problem as the number of columns will increase for that row.
 
Without wasting more words I will now show to avoid such situations.
 

Solution

To avoid such problems you need to add a text qualifier in the Flat file connection manager.
 
In the following example the source data is present inside SQL Server. Following is the data.
 
 

 
We will use double quotes(") as the text qualifier. So while importing the data we will replace the double quotes with two double quotes. Means
 
 
This query is placed in the OLEDB source. Then the records pulled will be exported to a comma delimited Flat file. The text qualifier should be set as double quotes(").
 
 
While reading the file also the same connection manager should be used. And following is the data viewer screen shot while reading the data back from the Flat file: