How to Implement Table Partitioning in SQL Server

Introduction

Table partitioning is a technique used in SQL Server to physically organize data stored in a table in different storage structures. In essence one ends up having this large logical structure split into smaller parts physically. The result is that we can improve performance for certain kinds of queries and more importantly move data about using certain techniques.

Filegroups

It is important to discuss the concept filegroups in this article because filegroups are the layer of abstraction used to separate data physically. A filegroup is a logical construct that allows SQL Server to see a collection of physical data files as a single logical unit. This means that when SQL Server writes data to that file group, the data is spread across the files which belong to that filegroup. One can think of this as a File Group – a group of data files.

Tables Sit on File Groups

Whenever a table is created, they are created on a filegroup – the PRIMARY file group by default. Typically, when creating a table, you do not specify the filegroup. This means they are created on the PRIMARY filegroup. You can choose to sit a table on a different filegroup if you wish to. However, when creating a partitioned table, you should sit such a table on a Partition Scheme.

Partition Schemes and Functions

A Partition Scheme maps a table to a set of filegroups. A Partition Function defines the criteria by which data is distributed across the filegroups that belong to the desired partition scheme. Thus, it follows logically that in creating a partitioned table, we must create a Partition Function first and then a Partition Scheme.

Why Partitioned Tables?

Two main reasons have been proposed for partitioned tables. It is possible to observe performance benefits when filegroups are sitting on separate disks entirely and we are working with the appropriate degree of parallelism and queries that span one partition. Unfortunately, we will not demonstrate this in this article. Another reason for partitioning tables is maintenance, specifically data archiving which is achieved by switching out partitions. This article shows the entire process: Switching Out Table Partitions in SQL Server: A Walkthrough. Other benefits of partitioning include online index rebuilds, parallel operations, and piecemeal restores of filegroups.

Hands-on Practice

Let us walk through the process of creating a partitioned table. We start by creating a regular table as shown in Listing 1. Since we did not specify any filegroup or partition function, the table sits in the PRIMARY filegroup (See Figure 1).

-- Listing 1: CREATE TABLE Statement
use DB2
GO
CREATE TABLE memmanofarms (
fname VARCHAR(50)
,lname VARCHAR(50)
,city VARCHAR(50)
,PhoneNo bigint
,email VARCHAR(100) check (email like '%@%')
,gender char(1)
)

This is image title

Using the code in Listing 2, we populate the table with six unique records which are replicated a different number of times per row. We can confirm the row count for each city using the queries in Listing 3. Listing 3 helps us in one more way: we can get a baseline of the way the query is executed when our table is not partitioned.

-- Listing 2: Populate Table
USE DB2
GO
INSERT INTO memmanofarms VALUES ('Kenneth','Igiri','Accra','23320055444','kenneth@kennethigiri.com','M');
GO 1100
INSERT INTO memmanofarms VALUES ('Vivian','Akeredolu','Lagos','2348020055444','vivian@gmail.com','F');
GO 720
INSERT INTO memmanofarms VALUES ('Emelia','Okoro','Port Harcourt','2348030057324','emelia@yahoo.com','F');
GO 400
INSERT INTO memmanofarms VALUES ('Uche','Igiri','Enugu','2348030057324','uche@yahoo.com','M');
GO 1000
INSERT INTO memmanofarms VALUES ('Kweku','Annan','Kumasi','23354055884','kweku@ymail.com','M');
GO 150
INSERT INTO memmanofarms VALUES ('Aisha','Bello','Kano','2347088057324','aisha@gmail.com','F');
GO 890

-- Listing 3: Count Rows 

USE DB2
GO
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT COUNT(*) FROM memmanofarms;
SELECT COUNT(*) FROM memmanofarms WHERE city='Accra';
SELECT COUNT(*) FROM memmanofarms WHERE city='Lagos';
SELECT COUNT(*) FROM memmanofarms WHERE city='Port Harcourt';
SELECT COUNT(*) FROM memmanofarms WHERE city='Enugu';
SELECT COUNT(*) FROM memmanofarms WHERE city='Kumasi';
SELECT COUNT(*) FROM memmanofarms WHERE city='Kano';
Taking this further, we use the code in Listing 4 to set up objects required for Table Partitioning on the DB2 databases. Notice that for N partitions (and N filegroups), there will always be N-1 boundaries.

Taking this further, we use the code in Listing 4 to set up objects required for Table Partitioning on the DB2 databases. Notice that for N partitions (and N filegroups), there will always be N-1 boundaries.

-- Listing 4: Set Up Partitioning
-- Create a Partition Function

USE [DB2]
GO
CREATE PARTITION FUNCTION
PartFunc (VARCHAR(50))
AS RANGE RIGHT
FOR VALUES 
('Accra'
,'Enugu'
,'Kano'
,'Kumasi'
,'Lagos'
,'Port Harcourt'
)
GO

-- Create File Groups

USE [master]
GO
ALTER DATABASE [DB2] ADD FILEGROUP [AC]
ALTER DATABASE [DB2] ADD FILEGROUP [EN]
ALTER DATABASE [DB2] ADD FILEGROUP [KA]
ALTER DATABASE [DB2] ADD FILEGROUP [KU]
ALTER DATABASE [DB2] ADD FILEGROUP [LA]
ALTER DATABASE [DB2] ADD FILEGROUP [PH]
ALTER DATABASE [DB2] ADD FILEGROUP [OT]
GO

-- Add Files to the File Groups
USE [master]
GO
ALTER DATABASE [DB2] ADD FILE ( NAME = N'AC01', FILENAME = N'C:\MSSQL\Data\AC01.ndf' , SIZE = 102400KB , FILEGROWTH = 131072KB ) TO FILEGROUP [AC];
ALTER DATABASE [DB2] ADD FILE ( NAME = N'EN01', FILENAME = N'C:\MSSQL\Data\EN01.ndf' , SIZE = 102400KB , FILEGROWTH = 131072KB ) TO FILEGROUP [EN];
ALTER DATABASE [DB2] ADD FILE ( NAME = N'KA01', FILENAME = N'C:\MSSQL\Data\KA01.ndf' , SIZE = 102400KB , FILEGROWTH = 131072KB ) TO FILEGROUP [KA];
ALTER DATABASE [DB2] ADD FILE ( NAME = N'KU01', FILENAME = N'C:\MSSQL\Data\KU01.ndf' , SIZE = 102400KB , FILEGROWTH = 131072KB ) TO FILEGROUP [KU];
ALTER DATABASE [DB2] ADD FILE ( NAME = N'LA01', FILENAME = N'C:\MSSQL\Data\LA01.ndf' , SIZE = 102400KB , FILEGROWTH = 131072KB ) TO FILEGROUP [LA];
ALTER DATABASE [DB2] ADD FILE ( NAME = N'PH01', FILENAME = N'C:\MSSQL\Data\PH01.ndf' , SIZE = 102400KB , FILEGROWTH = 131072KB ) TO FILEGROUP [PH];
ALTER DATABASE [DB2] ADD FILE ( NAME = N'OT01', FILENAME = N'C:\MSSQL\Data\OT01.ndf' , SIZE = 102400KB , FILEGROWTH = 131072KB ) TO FILEGROUP [OT];
GO

-- Create a Partition Scheme
USE [DB2]
GO
CREATE PARTITION SCHEME PartSch 
AS PARTITION PartFunc TO
(
AC,
EN,
KA,
KU,
LA,
PH,
OT
)
GO

Once we have the foundation laid, we can then move our regular table from the PRIMARY filegroup to the Partition Function we created. We do this by rebuilding the clustered index as shown in Listing 5. Observe that the column which we used to create the partition column must be part of the clustered index column listing. Also notice that when we specify the Partition Scheme, we must also indicate this column – city.

-- Listing 5: Move table to New Partition
CREATE CLUSTERED INDEX [ClusteredIndexCity] ON [dbo].[memmanofarms]
(
	[city] ASC,
	[PhoneNo] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = ON, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) 
ON [PartSch](city);
GO

After we run this code, the table now sits on the Partition Scheme as shown in the following Figure.

This is image title

A Little on Performance

We are using the code in Listing 6 as a benchmark of sorts. When we run both questions with the partitioned and non-partitioned tables, we see little difference in performance without the index. In both cases, SQL Server uses an index seek when we have the clustered index in place and a full table scan otherwise. Worthy of note though is that we have more reads when running these queries on the partitioned table. This is expected since we have in effect forced a distribution of the data across “distant” pages.

-- Listing 6: Querying Partitioned and Non-Partitioned Tables
SELECT COUNT(*) FROM memmanofarms WHERE city='Accra';
SELECT * FROM memmanofarms WHERE city='Accra';

This is image title

Conclusion

We have seen in this article the steps for creating a partitioned table. The references section lists more resources that demonstrate the use of partitioning for archiving old data. We have also shown that partitioning does not necessarily introduce significant improvement in performance for most use cases without other enhancements such as the right CPU and proper MAXDOP configuration.

#sql #sql-server #partitions