Mastering Partitioned Views in SQL Server: Optimizing Data Management and Performance
A partitioned view is a technique in SQL Server where data from multiple tables (typically related to the same logical entity but stored across multiple physical tables) is presented as a single unified view. This is especially useful in scenarios where the data can be logically split into segments, such as time-based data, geographical data, or large-scale data sets that are distributed across multiple tables for performance, scalability, or management reasons.
In SQL Server, partitioned views use a UNION ALL
operation to combine data from multiple tables into a single view, which allows you to query the data as though it were in one large table. Each individual table in the view typically stores a subset of the data, and the partitioned view helps you access this data seamlessly.
The partitioned view allows us to logically divide large datasets into smaller, more manageable ranges based on specific column values. To implement this, we use a check constraint. Each participating table must define this check constraint on the column used for partitioning. The partitioned view itself employs a UNION ALL
to combine the data from all participating tables into a single result set.
When we apply a WHERE
condition to the partitioning column in the view, SQL Server can efficiently query the relevant table by leveraging the check constraint, thus avoiding unnecessary scans of other tables and improving query performance.
However, if the check constraint is not defined on the participating tables, the SQL Server Query Optimizer will need to scan all tables within the view to return the result, which can degrade performance.
Common Use Cases for Partitioned Views:
- Time-based Data: Partitioning data by date, such as sales records split by year or month. For example, each year’s data might be stored in a separate table.
- Geographical Data: Partitioning data by region, country, or territory. For example, each region might have its own table.
- Sharding: Horizontal partitioning where data is distributed across multiple servers or databases based on a partitioning key.
- Performance Optimization: By partitioning data, you can improve query performance as the system will only scan relevant partitions rather than a single large table.
Structure of a Partitioned View:
To create a partitioned view, you:
- Create multiple tables that store the partitions of the data.
- Create a view that combines these tables using
UNION ALL
. - Ensure that all the tables involved in the view have the same schema (i.e., same column names, types, etc.).
Keep in mind:
- The partitioning column MUST BE a part of the PRIMARY KEY of the table;
- Can’t be an identity, computed or timestamp column;
- If multiple constraints are defined on the same column in a member table, the Database Engine ignores all of them and does not take them into account when determining whether the view is partitioned. To ensure the view meets the partitioning requirements, make sure that only one partitioning constraint exists on the partitioning column;
Let’s getting one’s hands dirty!
Steps to create a partitioned view
1. Creating Tables
In this example, i’m using a sample employee table. We are creating 3 tables for employees based on the Employment Year. Our partitioned column are EmploymentYear. Here, we’re using 3 tables on years: 2024, 2023 and 2022.
CREATE DATABASE PartitionedDemo;
-- Creating Employee tables for years 2024, 2023, and 2022
USE PartitionedDemo;
GO
-- Employee year 2024
CREATE TABLE Employees_2024 (
EmployeeId INT NOT NULL,
EmployeeName VARCHAR(100) NOT NULL,
HireDate DATETIME NULL,
EmploymentYear INT NOT NULL,
CONSTRAINT PK_Employee_2024 PRIMARY KEY (EmployeeId, EmploymentYear)
);
GO
-- Employee year 2023
CREATE TABLE Employees_2023 (
EmployeeId INT NOT NULL,
EmployeeName VARCHAR(100) NOT NULL,
HireDate DATETIME NULL,
EmploymentYear INT NOT NULL,
CONSTRAINT PK_Employee_2023 PRIMARY KEY (EmployeeId, EmploymentYear)
);
GO
-- Employee year 2022
CREATE TABLE Employees_2022 (
EmployeeId INT NOT NULL,
EmployeeName VARCHAR(100) NOT NULL,
HireDate DATETIME NULL,
EmploymentYear INT NOT NULL,
CONSTRAINT PK_Employee_2022 PRIMARY KEY (EmployeeId, EmploymentYear)
);
GO
2. Adding our Check Constraint for Partitioned View
These constraints allow the SQL Server Query Optimizer to directly access the correct partition based on the filter in the view (or in queries). This can significantly improve query performance, as the query will only scan\seek the relevant table(s) instead of all partitions.
We are adding the EmploymentYear as a check constraint. So that the query will get the specific EmploymentYear record from the respective table and avoid table scans on other tables.
This, will allow SQL Server to ignore tables that are not needed in a particular query.
USE PartitionedDemo
GO
-- CHECK CONSTRAINTS FOR PARTITIONED VIEW
ALTER TABLE Employees_2022 ADD CONSTRAINT CK_Employee_2022 CHECK (EmploymentYear = 2022);
ALTER TABLE Employees_2023 ADD CONSTRAINT CK_Employee_2023 CHECK (EmploymentYear = 2023);
ALTER TABLE Employees_2024 ADD CONSTRAINT CK_Employee_2024 CHECK (EmploymentYear = 2024);
GO
3. Inserting a few rows to sample it
USE PartitionedDemo
GO
-- year 2024
INSERT INTO [dbo].[Employees_2024] ([EmployeeId], [EmployeeName], [HireDate], [EmploymentYear])
VALUES
(101, 'Alice Smith', '2024-02-15', 2024),
(102, 'Bob Johnson', '2024-03-20', 2024),
(103, 'Carol Davis', '2024-01-11', 2024),
(104, 'Dan Harris', '2024-04-05', 2024),
(105, 'Eve Taylor', '2024-05-30', 2024),
(106, 'Frank White', '2024-07-12', 2024),
(107, 'Grace Lewis', '2024-06-23', 2024),
(108, 'Hank Clark', '2024-08-15', 2024),
(109, 'Ivy King', '2024-09-02', 2024),
(110, 'Jackie Scott', '2024-10-14', 2024);
GO
-- year 2023
INSERT INTO [dbo].[Employees_2023] ([EmployeeId], [EmployeeName], [HireDate], [EmploymentYear])
VALUES
(201, 'Charlie Brown', '2023-05-10', 2023),
(202, 'David Lee', '2023-06-15', 2023),
(203, 'Emma Walker', '2023-01-22', 2023),
(204, 'George Hall', '2023-02-17', 2023),
(205, 'Helen Allen', '2023-03-25', 2023),
(206, 'Ian Young', '2023-04-14', 2023),
(207, 'Jack Williams', '2023-07-03', 2023),
(208, 'Kelly Brown', '2023-08-06', 2023),
(209, 'Liam Clark', '2023-09-12', 2023),
(210, 'Mona Evans', '2023-10-29', 2023);
GO
-- year 2022
INSERT INTO [dbo].[Employees_2022] ([EmployeeId], [EmployeeName], [HireDate], [EmploymentYear])
VALUES
(301, 'Eve Williams', '2022-08-25', 2022),
(302, 'Frank Moore', '2022-09-30', 2022),
(303, 'Grace Turner', '2022-03-11', 2022),
(304, 'Henry Moore', '2022-01-08', 2022),
(305, 'Ivy Scott', '2022-04-14', 2022),
(306, 'John Adams', '2022-05-06', 2022),
(307, 'Kathy Miller', '2022-06-19', 2022),
(308, 'Leo Carter', '2022-07-27', 2022),
(309, 'Maya Jackson', '2022-10-03', 2022),
(310, 'Nina Lewis', '2022-11-15', 2022);
GO
If you wanna check your data, just run:
USE PartitionedDemo
GO
SELECT * FROM [dbo].[Employees_2024]
SELECT * FROM [dbo].[Employees_2023]
SELECT * FROM [dbo].[Employees_2022]
GO
4. Creating a Partitioned View
After creating this view (VW_Employee
), we can query it as if it were a single table, even though the data is actually stored in multiple partitioned tables. The query optimizer will be able to efficiently retrieve data from the correct partition based on the EmploymentYear
column, improving query performance when the appropriate filters are applied.
USE PartitionedDemo
GO
CREATE VIEW VW_Employee
AS
SELECT EmployeeId, EmployeeName, HireDate, EmploymentYear FROM Employees_2024
UNION ALL
SELECT EmployeeId, EmployeeName, HireDate, EmploymentYear FROM Employees_2023
UNION ALL
SELECT EmployeeId, EmployeeName, HireDate, EmploymentYear FROM Employees_2022
GO
5. Selecting our Partitioned View
USE PartitionedDemo
GO
SELECT * FROM VW_Employee WHERE EmploymentYear = 2022
So, SQL Server returned or data with our filter as we can see above, but how they did it?
Deeping on SQL Server Execution Plan, we can see that SQL Server was smart enough to just access one of table inside the view! WOW.. This happens because we added on step 2 a constraint do validate!
Once these constraints are in place, when a query is run against the partitioned view (VW_Employee
), SQL Server will utilize the constraints to ensure that only the relevant tables are scanned. In our example, a query filtering by EmploymentYear = 2022
will only scan the Employees_2022
table, improving performance.
So now we have created the partitioned view. Whenever we apply the filter. It will directly fetch the data on the specific table and avoid table scans on other tables.
So, let’s remove our Constraints and execute the query again.
6. Selecting a normal View
First of all, let’s remove our constraint.
-- Remove CHECK constraints for the Employees_2023 table
ALTER TABLE Employees_2022 DROP CONSTRAINT CK_Employee_2022;
-- Remove CHECK constraints for the Employees_2023 table
ALTER TABLE Employees_2023 DROP CONSTRAINT CK_Employee_2023;
-- Remove CHECK constraints for the Employees_2024 table
ALTER TABLE Employees_2024 DROP CONSTRAINT CK_Employee_2024;
GO
Now, let’s execute select again:
USE PartitionedDemo
GO
SELECT * FROM VW_Employee WHERE EmploymentYear = 2022
Execution plan:
Now, to access our data, SQL Server was not smart enough (we didn’t help him) and access all tables inside a view looking for data based on filter.
7. Inserting Values using a Partitioned View
Using the Partitioned view we can insert the values on specific tables based on check conditions. If you’re inserting for example, 3 different EmploymentYear, it will insert the row based on the EmploymentYear to the respective table.
Let’s add our constraint again:
USE PartitionedDemo
GO
-- CHECK CONSTRAINTS FOR PARTITIONED VIEW
ALTER TABLE Employees_2022 ADD CONSTRAINT CK_Employee_2022 CHECK (EmploymentYear = 2022);
ALTER TABLE Employees_2023 ADD CONSTRAINT CK_Employee_2023 CHECK (EmploymentYear = 2023);
ALTER TABLE Employees_2024 ADD CONSTRAINT CK_Employee_2024 CHECK (EmploymentYear = 2024);
GO
Inserting Data:
USE PartitionedDemo
GO
INSERT INTO [dbo].[VW_Employee] ([EmployeeId], [EmployeeName], [HireDate], [EmploymentYear])
VALUES (111, 'Clark Kent', '2024-06-12', 2024);
INSERT INTO [dbo].[VW_Employee] ([EmployeeId], [EmployeeName], [HireDate], [EmploymentYear])
VALUES (211, 'Mike Tyson', '2023-11-10', 2023);
INSERT INTO [dbo].[VW_Employee] ([EmployeeId], [EmployeeName], [HireDate], [EmploymentYear])
VALUES (311, 'Joe Sad', '2022-01-04', 2022);
Checking data:
USE PartitionedDemo
GO
SELECT * FROM [dbo].[Employees_2024] order by 1 desc
SELECT * FROM [dbo].[Employees_2023] order by 1 desc
SELECT * FROM [dbo].[Employees_2022] order by 1 desc
Conclusion
Partitioned views in SQL Server provide a powerful way to manage large datasets that can be logically divided into smaller, more manageable segments. By using partitioning techniques, such as creating multiple tables based on a partitioning column (e.g., EmploymentYear
), you can optimize query performance and data management. The use of check constraints ensures that the query optimizer efficiently accesses only the relevant partition, eliminating unnecessary scans of unrelated tables.
Partitioned views are particularly beneficial when dealing with time-based data, geographical data, or any large-scale datasets that span multiple logical segments. The ability to query a unified view while maintaining efficient access to each partition based on specific conditions makes partitioned views an essential tool for improving performance and scalability.
By carefully designing your partitioned view, defining check constraints on the partitioning column, and ensuring that your tables adhere to these constraints, you can take full advantage of SQL Server’s capabilities to handle large volumes of data with high efficiency. This setup allows for easy querying, inserting, and managing data while maintaining performance even as data grows over time.
In summary:
- Partitioned views provide a way to combine multiple tables into a single, unified view for querying.
- Check constraints on the partitioning column ensure that SQL Server optimizes queries to target only the relevant partition, improving performance.
- When designing partitioned views, careful planning of partitioning columns and constraints is key to ensuring that data is efficiently managed and queried.
By following best practices and leveraging partitioned views, you can optimize the performance of your SQL Server database.