Recently I felt pretty strongly that I needed to have a better understanding of how SQL Server worked. So I studied for (and passed) the Microsoft Exam 70-432; SQL Server 2008 Implementation and Maintenance. As a result of learning more about SQL Server, I have decided to write a series of articles that might help other developers understand some of the basic concepts of SQL Server. These articles are by no means meant to be comprehensive, but more of a brief look into some facet of SQL Server. Below is my first installment in this series, 'How does SQL Server store indexes.'
Microsoft’s SQL Server uses a B-tree data structure to organize indexes. This structure is hierarchical, with root node at the top of the hierarchy, leaf nodes (also called data pages) at the bottom, and intermediate nodes in the middle. The main idea behind the B-tree structure is that it is always balanced. There are always the same numbers of nodes on both the right and left hand sides of the tree. This allows the data structure to search quickly for data with a minimal number of disk reads.
How the row data is store in each page depends on what type (clustered or non-clustered) index is created. A clustered index (think phone book – all the data is present with the index entry) stores the entire row of data associated with the key value at the leaf level of the index. A table can only have one clustered index and clustered indexes are stored in either ascending or descending sort order. Whereas non-clustered indexes (think index in the back of a book – the index tells you where to go look for the data) only contain the indexed columns and row locators to the actual row data and are unsorted.
Index leaf nodes are referred to as data pages in SQL Server and each page can store up to 8kb of data. Rows on a page are stored in either ascending or descending sort order. As data is inserted it is placed in its appropriate sorted location in the tree structure. If data is consistently inserted into the middle of a page, eventually all of the rows on the index page will no longer fit on the single page and a page split occurs. When a page split occurs 2 new pages are created and the data is split between the newly created pages.
Remember, an index is only useful if it helps find data quickly in a table regardless of the amount of data in the table. Indexes aren’t free; they do come with a cost. Primarily, each index you create, while speeding up the selecting of data, has a negative affect on the the inserting, updating, or deleting of data because the database must update the indexes each time data changes.
Hopefully you've found something new in this article or maybe it's inspired you to go research SQL Server indexes, there is definitely a lot more to indexes than what I've presented here.
Thursday, September 17, 2009
Subscribe to:
Post Comments (Atom)
0 comments:
Post a Comment