Clustered Index in SQL Server

As per the promise I have made in “Somebasics about Index in SQL Server” article, I am coming up with each Indexesof SQL Server and first and most important is “Clustered Index”. 
Understanding ofClustered Index:
As I told in previous article too that Index is the key ofperformance, good and managed Index could boost up your speed of retrieving ofdata from table.
Clustered Index contains actual data of the table in leaflevel pages in logically sorted order (to understand root and leaf level page, clickhere). Since it is logically sorted, it doesn’t need to go for all datapresent there in Index. 
For example if you are looking for the phone number of “RiteshShah” in telephone directory, you can move to a page which has phone number of theperson whose first name starts with “RI” and once all instance of “RI” over indirectory, you don’t need to move on as there is NO CHANCE, you will get “Ritesh”anywhere in directory except those particular pages.
BTW, if you don’t have clustered index on your table, thanyour table would be called “HEAP”,which wouldn’t not have logically sorted data so if you are searching for “RiteshShah”, you can to check complete table as you never know, where you will find “RiteshShah”, just one method, go each and every row of table and check for matchingcriteria.
Like any other columns, you can define clustered index inmore than one field too and all the columns covered up under the index, calledkey column.
While choosing a prime candidate for Clustered Index columnin your table, you have to select the columns which meet few of the generalcriteria defined below. (you can say following criteria as a best practicewhile choosing index candidate)
–> You keycolumn or combination of key columns should be unique and not null. If your Youkey column or combination of key columns are not unique than SQL Server has toadd one more hidden column of 4-byte INT to make it unique. However, you can’tsee that hidden column neither can query it directly; it would be purely forSQL Server’s internal use.
–> It shouldbe short as wide key value would increase the depth of Clustered Index and willreduce the performance a bit and also increase the size of non-clustered indexas it is being there as a reference in all non-clustered index.
–> Selectless changing or no changing fields for you clustered index as Key valueindicates the location of page where actual data resides, if you change thiskey value, row has to be deleted from that page and has to move to anotherappropriate page which reduces the performance and increase unnecessaryoverhead to IO.
Generally whenever you make Primary Key in any of yourtable, SQL Server itself create clustered index on it but if you want to keepclustered index on any other column(s) due to high selectivity on thosecolumn(s), you can do it.
Have you got bored of so long theory? Let us do some practicaland check it out?
–create onedatabase which you can delete after running this example
create database SQLHub
GO
USE SQLHub
GO
–if orderstable is already there. you can delete it than create new one with name”Orders”
IF OBJECT_ID(‘orders’, ‘U’) IS NOT NULL BEGIN
      DROP TABLEorders
END
GO
–creating table
CREATE TABLE orders (OrderID INT IDENTITY, OrderDate DATETIME, Amount MONEY, Refno INT)
GO
–inserting100000 fack rows into table (BTW, thank to Jacob Sebastian, copy in INSERTscript from one of his demo to generate big table)
INSERT INTO orders (OrderDate, Amount, Refno)
SELECT TOP 100000
      DATEADD(minute, ABS(a.object_id % 50000 ), CAST(‘2010-02-01’ AS DATETIME)),
      ABS(a.object_id % 10),
      CAST(ABS(a.object_id) AS VARCHAR)
FROM sys.all_objects a
CROSS JOIN sys.all_objects b
GO
–creatingclustered index, however it is not meeting the criteria I have
–given inarticle, but this is ok for example
CREATE CLUSTERED INDEXidx_refno ON orders(refno)
GO
–run both ofthe following query with execution plan and see the results in execution plan
–you can seeexecution plan with the following steps
–first selectboth of the below given query
–Press Ctrl+M
–press F5
SELECT TOP 10 * from orders whererefno=4
SELECT TOP 10 * from orders whereOrderDate = ‘2010-02-0100:04:00.000’
GO
–if you wish,you can uncomment below code and delete SQLHub database
–use master
–go
–drop databasesqlhub
If you execute both the query with “Actual Execution Plan”,you will see first query is having Index Seek and second query is having IndexScan. 
Seek and Scan is really interesting topic which I will coverlater but just keep in mind that, Seek is good, Scan is bad as it will checkall records of the index.
if you want to refer all other articles related to index, click here.

Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a defaultreference of all articles but examples and explanations prepared by RiteshShah, founder of http://www.SQLHub.com
Ask me any SQL Server related question at my “ASK Profile
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: