For XML PATH ROOT in SQL Server 2008

Now and Then I used to observe that when it comes to XML, .NET and SQL Developer start fearing. This is not because of It is VERY DIFFICULT but because of unawareness. Today I am going to show you the use of  “FOR XML PATH”  clause in SQL Server.
Suppose we have ID, FirstName and LastName columns in one of our SQL Server table and we want output something like this:
<Employees>
  <Employee>
    <ID value=1 />
    <FirstName value=Ritesh />
    <LastName value=Shah />
  Employee>
  <Employee>
    <ID value=2 />
    <FirstName value=Rajan />
    <LastName value=Jain />
  Employee>
Employees>
Let me give you TSQL to generate table in SQL Server and insert records in that.
Create Table EmployeeData
(
      ID INT Identity(1,1),
      FirstName varchar(10),
      LastName varchar(10)
)
insert into EmployeeData
select ‘Ritesh’,‘Shah’ UNION ALL
select ‘Rajan’,‘Jain’
GO
So now, here is the use of “FOR XML PATH ROOT” clause which will make our life easier in this kind of situation.
select ID as “ID/@value”
     , FirstName as “FirstName/@value”
     , LastName as “LastName/@value”
from (
   Select * from EmployeeData
) as t
for xml path(‘Employee’), root(‘Employees’);
Isn’t it easy to use?
BTW, below given are some links which will redirect you to my some of the past articles on XML subject.
Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com

For XML PATH ROOT in SQL Server 2008

Now and Then I used to observe that when it comes to XML, .NET and SQL Developer start fearing. This is not because of It is VERY DIFFICULT but because of unawareness. Today I am going to show you the use of  “FOR XML PATH”  clause in SQL Server.
Suppose we have ID, FirstName and LastName columns in one of our SQL Server table and we want output something like this:
<Employees>
  <Employee>
    <ID value=1 />
    <FirstName value=Ritesh />
    <LastName value=Shah />
  Employee>
  <Employee>
    <ID value=2 />
    <FirstName value=Rajan />
    <LastName value=Jain />
  Employee>
Employees>
Let me give you TSQL to generate table in SQL Server and insert records in that.
Create Table EmployeeData
(
      ID INT Identity(1,1),
      FirstName varchar(10),
      LastName varchar(10)
)
insert into EmployeeData
select ‘Ritesh’,‘Shah’ UNION ALL
select ‘Rajan’,‘Jain’
GO
So now, here is the use of “FOR XML PATH ROOT” clause which will make our life easier in this kind of situation.
select ID as “ID/@value”
     , FirstName as “FirstName/@value”
     , LastName as “LastName/@value”
from (
   Select * from EmployeeData
) as t
for xml path(‘Employee’), root(‘Employees’);
Isn’t it easy to use?
BTW, below given are some links which will redirect you to my some of the past articles on XML subject.
Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com

Character Map Transformation task in SSIS 2008

When it comes to string formatting in SSIS, “Character Map” transformation task is one of the very useful tasks. It generally used to convert data to Uppercase, lowercase, byte reversal etc. let us see one small demo for the same.
Before we start our package development, let us first create one small table in SQL Server and insert some data into it by following TSQL Script.
create table Employee
(
      FirstName varchar(10),
      LastName varchar(10)
)
insert into Employee
select ‘rITesH’,‘ShAh’ union all
select ‘Rajan’,‘SHAH’
GO
Now, create one new project in BIDS (Business Intelligence Development Studio) for SSIS package. Drag “Data Flow” task on your “Control Flow” tab and double click on that “Data Flow” to configure it which will forward you to “Data Flow” tab.
Drag “Ado Net Source” from tool box and drop it into your “Data Flow” tab.  Double click on “Ado Net Source” to configure it and select your Server, database and table, “Employee” table in our case.
Now, drag “Character Map” transformation task to your work area and join it with extended green arrow coming from “Ado Net Source”. Now double click on “Character Map” transformation task to configure it. We are going to convert FirstName column to UPPERCASE and LastName column to LOWERCASE. Look at screen capture below for more information.

Once you finish configuring “Character Map” transformation task, drag “Ado Net Destination” from tool box to right below “Character Map” transformation task and join it with extended green arrow coming from “Character Map” transformation task.
Double click on “Ado Net Destination” to configure it and select your SQL server, Database and table (we are going to use same Employee table as destination too) .
Once you finish all these steps, you are ready to launch your package. Hit F5 to run the package and see EMPLOYEE table in SQL Server whether it has two more rows with different case of alphabet.
Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com

Fuzzy lookup in SSIS 2008 to keep data integrity

“Human makes mistakes” which is quite obvious. While making a data entry it is possible to make typo but as a database professional, it is our duty to keep data consistent.  Fuzzy Lookup is helpful in this case. Before we start making package in SSIS, let us have some pre-preparation for that. We are going to create one source table (it could be any source like excel, csv file etc. but we are making it in SLQ Server), one reference table which is guaranteed to have proper data. Here is the TSQL to create, source and reference, table and insert some dummy data.
create table fuzzyLookupSource
(
      firstName varchar(10),
      LastName varchar(10),
      BirthDate datetime
)
insert into fuzzyLookupSource
select ‘Rites’,‘Shah’,’02/07/1980′ union all
select ‘Rajen’,‘Shah’,’03/31/1983′ union all
select ‘Dharmesh’,‘Kalaria’,’04/09/1980′  union all
select ‘Jesica’,‘Cruize’,’05/05/1980′  union all
select ‘Roger’,‘Moore’,’04/15/1980′ 
GO
create table fuzzyLookupReference
(
      firstName varchar(10),
      LastName varchar(10),
      BirthDate datetime
)
insert into fuzzyLookupReference
select ‘Ritesh’,‘Shah’,’02/07/1980′ union all
select ‘Rajan’,‘Shah’,’03/31/1983′ union all
select ‘Jessica’,‘Cruise’,’06/05/1980′  union all
select ‘Dharmesh’,‘Kalaria’,’04/09/1980′
GO
Observe the data in both table, in first, source, table, there are some typos which you can compare with your second, reference, tables and get the purified data.
Anyway, once you are ready with both the tables , create one new project in BIDS (Business Intelligence Development Studio) and drag one “DataFlow” task from tool box to your “Control Flow” tab. Double click on “DataFlow” task to configure it so that it would redirect you to “DataFlow” tab.
Now, create one “Ado Net Source” which will refer our “fuzzyLookupSource” table in sql server database. Double click on “Ado Net Source” to configure it and look at below image to have crystal clear idea about its configuration.

Now,  drag “fuzzy lookup” transformation task below your “Ado Net Source” and connect extended green arrow from “Ado Net Source” to your fuzzy lookup. Double click on “Fuzzy Lookup” task to configure it.

In “Reference Table” tab, give reference of your database and our reference table which is “FuzzyLookupReference” in our case. Look at image below for more idea.

Click on “Columns” tab to configure which column to check with reference from source table and select “firstName” and “lastName” column and connect it so that our fuzzy lookup task will compare these two fields from source to reference table.

Once you configure “columns”, you have to click on “Advanced” tab, you can set “Similarity Threshold” which will give you how much identical both fields are…. If it is 1 than it is perfect match, if it is 0 than no match or data not present in reference table so more near to 1, good match it is. We are not going to take any decision like if it is greater than .50 then do this otherwise do that so it would be ok if you don’t change “Similarity Threshold”.

Now, drag “SQL Server Destination” task so that this matched and unmatched data could fall in SQL Server table, though we have not created any SQL Server table for this so far. Connect green extended arrow from “Fuzzy Lookup” transformation task to “SQL Server Destination” task. Before we configure “SQL Server Destination” we would like to do one more thing. Double click on GREEN arrow between Fuzzy Lookup task and SQL Server destination task.

We would like to see data in grid while running this package and before it fall into our destination table, we are going to specify this only now.
As soon as you click on green arrow, it will open “Data Flow Path Editor”, click on “Data Viewer” tab and click on “Add” button to add “Grid”.

Now, double click on “SQL Server Destination” task to configure it. Give details of your SQL Server and database into “Connection Manager” name. since we don’t have destination table already created for our data, we are going to click on “New” button besides “Use a table or view” property which will create one destination table in our SQL Server.

Now you are ready to run your package, hit F5 to run it, when it crosses Fuzzy Lookup Task, it will show you data in grid, check it and click on “Green Arrow” above the grid in same dialog box so that data falls into our SQL Server table.

you can later on check the same data into SQL Server by executing TSQL Query, while generating new table in SQL Server, if you didn’t have rename the table, it would be by default [SQL Server Destination]. So you can execute something like

SELECT * FROM [SQL Server Destination]
Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com

Fuzzy lookup in SSIS 2008 to keep data integrity

“Human makes mistakes” which is quite obvious. While making a data entry it is possible to make typo but as a database professional, it is our duty to keep data consistent.  Fuzzy Lookup is helpful in this case. Before we start making package in SSIS, let us have some pre-preparation for that. We are going to create one source table (it could be any source like excel, csv file etc. but we are making it in SLQ Server), one reference table which is guaranteed to have proper data. Here is the TSQL to create, source and reference, table and insert some dummy data.
create table fuzzyLookupSource
(
      firstName varchar(10),
      LastName varchar(10),
      BirthDate datetime
)
insert into fuzzyLookupSource
select ‘Rites’,‘Shah’,’02/07/1980′ union all
select ‘Rajen’,‘Shah’,’03/31/1983′ union all
select ‘Dharmesh’,‘Kalaria’,’04/09/1980′  union all
select ‘Jesica’,‘Cruize’,’05/05/1980′  union all
select ‘Roger’,‘Moore’,’04/15/1980′ 
GO
create table fuzzyLookupReference
(
      firstName varchar(10),
      LastName varchar(10),
      BirthDate datetime
)
insert into fuzzyLookupReference
select ‘Ritesh’,‘Shah’,’02/07/1980′ union all
select ‘Rajan’,‘Shah’,’03/31/1983′ union all
select ‘Jessica’,‘Cruise’,’06/05/1980′  union all
select ‘Dharmesh’,‘Kalaria’,’04/09/1980′
GO
Observe the data in both table, in first, source, table, there are some typos which you can compare with your second, reference, tables and get the purified data.
Anyway, once you are ready with both the tables , create one new project in BIDS (Business Intelligence Development Studio) and drag one “DataFlow” task from tool box to your “Control Flow” tab. Double click on “DataFlow” task to configure it so that it would redirect you to “DataFlow” tab.
Now, create one “Ado Net Source” which will refer our “fuzzyLookupSource” table in sql server database. Double click on “Ado Net Source” to configure it and look at below image to have crystal clear idea about its configuration.

Now,  drag “fuzzy lookup” transformation task below your “Ado Net Source” and connect extended green arrow from “Ado Net Source” to your fuzzy lookup. Double click on “Fuzzy Lookup” task to configure it.

In “Reference Table” tab, give reference of your database and our reference table which is “FuzzyLookupReference” in our case. Look at image below for more idea.

Click on “Columns” tab to configure which column to check with reference from source table and select “firstName” and “lastName” column and connect it so that our fuzzy lookup task will compare these two fields from source to reference table.

Once you configure “columns”, you have to click on “Advanced” tab, you can set “Similarity Threshold” which will give you how much identical both fields are…. If it is 1 than it is perfect match, if it is 0 than no match or data not present in reference table so more near to 1, good match it is. We are not going to take any decision like if it is greater than .50 then do this otherwise do that so it would be ok if you don’t change “Similarity Threshold”.

Now, drag “SQL Server Destination” task so that this matched and unmatched data could fall in SQL Server table, though we have not created any SQL Server table for this so far. Connect green extended arrow from “Fuzzy Lookup” transformation task to “SQL Server Destination” task. Before we configure “SQL Server Destination” we would like to do one more thing. Double click on GREEN arrow between Fuzzy Lookup task and SQL Server destination task.

We would like to see data in grid while running this package and before it fall into our destination table, we are going to specify this only now.
As soon as you click on green arrow, it will open “Data Flow Path Editor”, click on “Data Viewer” tab and click on “Add” button to add “Grid”.

Now, double click on “SQL Server Destination” task to configure it. Give details of your SQL Server and database into “Connection Manager” name. since we don’t have destination table already created for our data, we are going to click on “New” button besides “Use a table or view” property which will create one destination table in our SQL Server.

Now you are ready to run your package, hit F5 to run it, when it crosses Fuzzy Lookup Task, it will show you data in grid, check it and click on “Green Arrow” above the grid in same dialog box so that data falls into our SQL Server table.

you can later on check the same data into SQL Server by executing TSQL Query, while generating new table in SQL Server, if you didn’t have rename the table, it would be by default [SQL Server Destination]. So you can execute something like

SELECT * FROM [SQL Server Destination]
Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com

Derived Column and Union All Transformation task in SSIS 2008

Many times when we are trying to get data from different sources, there is a possibility to have same key fields. To eliminate this kind of problem, “Derived Column” transformation task is very useful. For example one company has factory at one location and back office work performed from other office. Both locations has their own employee table with IDENTITY as a primary key. Obviously Identity key generally starts with 1 by default so both location would have 1,2,3 etc. in their employee table. When you try to move both the table at one single destination, you might come across with situation where employee ID 1 would have two employee and sample with many more IDs. To eliminate this situation, you can have one more column which is not exists in source table which describes where this data come from so you have unique combination of that new column and an existing EmployeeID column.
Let us see how we could perform this. Before we head towards SSIS package development, let us first create two different table in SQL Server and one blank table as a destination of these both table. You can use below script to generate those table with/without data in your SQL Server database.
create table ForDerivedColumn1
(
      EmpID INT Identity(1,1),
      EmpName varchar(10)
)
INSERT INTO ForDerivedColumn1
SELECT ‘Ritesh’ union all
SELECT ‘Rajan’
GO
create table ForDerivedColumn2
(
      EmpID INT Identity(1,1),
      EmpName varchar(10)
)
INSERT INTO ForDerivedColumn2
SELECT ‘Rushik’ union all
SELECT ‘Rajvi’ 
GO
create table ForDerivedColumnNewTable
(
      SourceID INT,
      EmpID INT,
      EmpName varchar(10)
)
GO
Now, open BIDS (Business Intelligence Studio) and create new SSIS project. Draw dataflow task in your “Control Tab” and double click on it to configure it which will redirect you to “Data Flow” tab. Since we have two different SQL Server table as an source, let us first create two “Ado Net Source” in our work place and select your database and table as a source. Below screen shot will show you how to configure “Ado NET Source”. I am showing example for FIRST “ADO NET Source” you have to do it for second “ADO NET Source” too with table name “ForDerivedColumn2”.
Now take two “Derived Column” transformation task and put them under “ADO NET Source” task, double click on it to configure.

Above screen capture showing you configuration for first “DerivedColumn” do it for second “DerivedColumn” task too with value 2 in SourceID column.

Once we have our dataset ready with Derived Column, we have to merge both dataset with “UnionALL” Transformation task. “UnionAll” work exactly same as “UNION ALL” in TSQL. So drag “UnionAll” transformation task below both “DerivedColumn” task and join extended green arrow from both “DerivedColumn” to “UnionAll” task.
Now, this is a time to convert the data in proper format with “DataConversion” transformation task as many time from different sources, same kind of data coming with different datatype, let us convert it with “DataConversion” task by putting it under “UnionAll” task and join extended green arrow from “UnionAll” task to “DataConversion”.

Now, we have to perform our last steps. Data is ready to go anywhere but we have to tell where it suppose to go. Since we want it to insert it in our SQL Server table, we are going to use “SQL Server Destination” task. Double click on it to configure it. Give your database and table reference to it as below image.

Once you give reference of your table, you have to go to “Mappings” tab in the same dialog box and set the mappings there so that proper data get inserted in proper place. Look at the below image for more detailed information.

Finally you are done with package now run it by hitting F5 key and check your database table whether you have actually received the data inside or not. After running if you get “Green Signal” in every task, J you are done.

Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com

Derived Column and Union All Transformation task in SSIS 2008

Many times when we are trying to get data from different sources, there is a possibility to have same key fields. To eliminate this kind of problem, “Derived Column” transformation task is very useful. For example one company has factory at one location and back office work performed from other office. Both locations has their own employee table with IDENTITY as a primary key. Obviously Identity key generally starts with 1 by default so both location would have 1,2,3 etc. in their employee table. When you try to move both the table at one single destination, you might come across with situation where employee ID 1 would have two employee and sample with many more IDs. To eliminate this situation, you can have one more column which is not exists in source table which describes where this data come from so you have unique combination of that new column and an existing EmployeeID column.
Let us see how we could perform this. Before we head towards SSIS package development, let us first create two different table in SQL Server and one blank table as a destination of these both table. You can use below script to generate those table with/without data in your SQL Server database.
create table ForDerivedColumn1
(
      EmpID INT Identity(1,1),
      EmpName varchar(10)
)
INSERT INTO ForDerivedColumn1
SELECT ‘Ritesh’ union all
SELECT ‘Rajan’
GO
create table ForDerivedColumn2
(
      EmpID INT Identity(1,1),
      EmpName varchar(10)
)
INSERT INTO ForDerivedColumn2
SELECT ‘Rushik’ union all
SELECT ‘Rajvi’ 
GO
create table ForDerivedColumnNewTable
(
      SourceID INT,
      EmpID INT,
      EmpName varchar(10)
)
GO
Now, open BIDS (Business Intelligence Studio) and create new SSIS project. Draw dataflow task in your “Control Tab” and double click on it to configure it which will redirect you to “Data Flow” tab. Since we have two different SQL Server table as an source, let us first create two “Ado Net Source” in our work place and select your database and table as a source. Below screen shot will show you how to configure “Ado NET Source”. I am showing example for FIRST “ADO NET Source” you have to do it for second “ADO NET Source” too with table name “ForDerivedColumn2”.
Now take two “Derived Column” transformation task and put them under “ADO NET Source” task, double click on it to configure.

Above screen capture showing you configuration for first “DerivedColumn” do it for second “DerivedColumn” task too with value 2 in SourceID column.

Once we have our dataset ready with Derived Column, we have to merge both dataset with “UnionALL” Transformation task. “UnionAll” work exactly same as “UNION ALL” in TSQL. So drag “UnionAll” transformation task below both “DerivedColumn” task and join extended green arrow from both “DerivedColumn” to “UnionAll” task.
Now, this is a time to convert the data in proper format with “DataConversion” transformation task as many time from different sources, same kind of data coming with different datatype, let us convert it with “DataConversion” task by putting it under “UnionAll” task and join extended green arrow from “UnionAll” task to “DataConversion”.

Now, we have to perform our last steps. Data is ready to go anywhere but we have to tell where it suppose to go. Since we want it to insert it in our SQL Server table, we are going to use “SQL Server Destination” task. Double click on it to configure it. Give your database and table reference to it as below image.

Once you give reference of your table, you have to go to “Mappings” tab in the same dialog box and set the mappings there so that proper data get inserted in proper place. Look at the below image for more detailed information.

Finally you are done with package now run it by hitting F5 key and check your database table whether you have actually received the data inside or not. After running if you get “Green Signal” in every task, J you are done.

Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com