What is a Partitioned View?

Back when I first began working with SQL Server in 1998, the databases I worked with could be measured in megabytes. I remember being a bit concerned when a database for one of my customers approached 1 GB. In today’s world, I consider databases under 100 GB to be relatively small. I am not surprised when I work with customers who have databases measuring in the terabytes.

Over the years, Microsoft has added features that help database administrators handle large databases and large tables. The latest feature on the horizon, available with SQL Server 2016, is Stretch Databases. This feature will allow a table to be divided between on premises instances and the Azure cloud. Data that is not access frequently can be stored in Azure, and it is seamless!

Two of the features that have been around for years to help manage large tables are Partitioned Views and Partitioned Tables. The Partitioned Table feature is only available in Enterprise and Developer Editions. A partitioned table is one table but, under the covers, it is divided into multiple partitions that can be managed individually. You can quickly move data in and out of the table, store partitions on different storage tiers, compress partitions at different levels, and rebuild indexes on individual partitions. When you query the table, it looks like a regular table. If you filter on the partition key column, you can get a boost in performance because SQL Server can ignore partitions not needed in your results. It’s a great way to break a large problem down to smaller, more solvable pieces, but it’s not necessarily easy to set up. It also requires the often dreaded Enterprise Edition.

I was looking for a way to accomplish something similar that would still work without the cost of upgrading when I took a look at Partitioned Views. I had never really looked seriously at this feature since I thought it was mostly a convenience. I didn’t think there was any performance advantage. The idea behind partitioned views is that you will have two or more tables with identical schemas, and you will use UNION ALL to combine the tables in a view. Now you can use the view to select the data instead of specifying the actual tables.

When creating a partitioned view, you could place a table containing older data on slower, less expensive storage. You could place the most active data in a table living on faster storage, such as on solid state disks. Adding new data to the partitioned view could be pretty quick as well. It may mean just modifying the view definition to include a new table. What I had not realized, however, is that, if you design the tables correctly, you can also see a performance advantage. SQL Server can ignore tables not needed in the results of a SELECT query when filtering on the key that divides the data.

To demonstrate, run the following script to create and populate two tables. This example is simple, but we could place each table into a different file group and even a different storage tier.

IF EXISTS(SELECT * FROM sys.views WHERE name = ‘SalesData’)
DROP VIEW SalesData;
GO

IF EXISTS(SELECT * FROM sys.tables WHERE name = ‘DataArchive’)
DROP TABLE DataArchive;
GO

IF EXISTS(SELECT * FROM sys.tables WHERE name = ‘DataActive’)
DROP TABLE DataActive;
GO

CREATE TABLE DataArchive (
SalesDate DATE, OrderID INT
CONSTRAINT PK_DataArchive PRIMARY KEY (SalesDate, OrderID)
);

CREATE TABLE DataActive (
SalesDate DATE, OrderID INT
CONSTRAINT PK_DataActive PRIMARY KEY (SalesDate, OrderID)
);

INSERT INTO dbo.DataArchive( SalesDate, OrderID )
VALUES (‘2014-01-01’, 10),(‘2014-01-10’, 20),
(‘2014-03-15’, 30), (‘2014-07-20’, 40);

INSERT INTO dbo.DataActive( SalesDate, OrderID )
VALUES (‘2015-01-07’, 50),(‘2015-02-10’, 60),
(‘2015-03-18’, 70), (‘2015-10-28’, 80);
GO

Now you have two tables, one for older data that you won’t need to access frequently, and one for active data. The next step is to create the partitioned view.

CREATE VIEW SalesData WITH SCHEMABINDING AS (
SELECT SalesDate, OrderID
FROM dbo.DataArchive
UNION ALL
SELECT SalesDate, OrderID
FROM dbo.DataActive
);
GO

To test the performance of the partitioned view, use STATISTICS IO. It will show exactly which tables are being accessed. Run this query and check the Messages tab.

SET STATISTICS IO ON;
GO
SELECT SalesDate, OrderID
FROM SalesData
WHERE SalesDate = ‘2015-01-07’;

pv1

Even though the data returned by the query came from the DataActive table, SQL Server also searched the DataArchive table. To fix this issue, you must add check constraints to the tables which will allow SQL Server to ignore tables that are not needed in a particular query. You should also refresh the view definition. Make the changes and run the query again.

ALTER TABLE dbo.DataActive ADD CONSTRAINT
CC_DataActive_SalesDate CHECK (SalesDate >= ‘2015-01-01’);

ALTER TABLE dbo.DataArchive ADD CONSTRAINT
CC_DataArchive_SalesDate CHECK (SalesDate < ‘2015-01-01’);

GO

ALTER VIEW SalesData WITH SCHEMABINDING AS (
SELECT SalesDate, OrderID
FROM dbo.DataArchive
UNION ALL
SELECT SalesDate, OrderID
FROM dbo.DataActive
);

GO

 

SELECT SalesDate, OrderID
FROM SalesData
WHERE SalesDate = ‘2015-01-07’;

pv2

This time, SQL Server ignores the DataArchive table, and the query can run faster because SQL Server is doing less work.

What about data modifications? It seems like you would have to make those directly to the underlying tables, but that is not the case. As long as you filter on the column that divides the data, you can also make updates. I did see that it might be better to make the changes directly to the tables to get the best performance. I was also surprised to see that I could move a row from one table to another by changing the constrained column. Run the following code to try it out:

INSERT INTO SalesData(SalesDate, OrderID)
VALUES(‘2014-01-13’,26),(‘2015-01-08’,51);

UPDATE SalesData SET OrderID = 23
WHERE SalesDate = ‘2014-01-13’;

DELETE FROM SalesData
WHERE SalesDate = ‘2014-01-13’;

–Moves the row to DataActive
UPDATE salesdata
SET salesdate = ‘2015-01-01’
WHERE salesdate = ‘2014-01-01’;

Conclusion

Depending on how you design them, partitioned views can have some of the same advantages of partitioned tables. No matter which technique you decide to use, you need to spend the time up front designing and planning the solution. The most important decision in each case is the column that will be used to divide the data.

Kathi Kellenberger is the author of Beginning T-SQL, Expert T-SQL Window Functions on SQL Server, and the Pluralsight course T-SQL Window Functions.

Posted in T-SQL, T-SQL Window Functions | 1 Comment

The Many Skills Needed to Succeed

Recently, I had a problem to solve for a customer. I was surprised at how many different skills I had to use just to accomplish what seemed to be a simple task. I needed to get a file from one server to another as part of a SQL Server Agent job. The only means available to transfer between the two servers was FTP. FTP is an old protocol, something that I rarely need to work with, and I don’t believe I have ever had to automate transferring a file where using SSIS was not an option.

In a SQL Server Agent job step, you have some options for command line work. You can change the job step to the “Operating system (CmdExec)” type. This type of job step will only run one line and ignore the rest if you have added multiple lines. It is possible to concatenate multiple lines of code with the && symbols. I needed to run an FTP script that included a username and password. Concatenating FTP script lines didn’t work, but I found that I could launch an existing FTP batch file from a CmdExec step with this command:
FTP -s:<batch file path>

Another idea that I had was using PowerShell. PowerShell is the current standard when it comes to scripting in Windows, and there is also a PowerShell job step type in SQL Server Agent jobs. I found that I could launch FTP from PowerShell, but once FTP is launched, it becomes an FTP session. My hope was to just list all my FTP commands in the PowerShell job step, but it treats each command as a PowerShell command. The job step creates an FTP session, but further FTP commands error out.

Even though I didn’t want to create an FTP batch file, I was running out of ideas and time. I decided to use the CmdExec job step with concatenated lines to create the file right when it was needed, and then delete it when it was no longer required. I tried using the ECHO command to redirect the required commands to a file (> to create the file and >> to append to it). The file looked correct, but what I didn’t realize is that ECHO adds an extra space to the end of each line. Those extra spaces caused the username and password to be incorrect. OK, back to PowerShell. I tried the Set-Content and Add-Content commands to create my FTP batch file. I also ended up adding the -Encoding ASCII option to ensure that the file would be compatible with FTP. This time, the file worked!

Even if working with SQL Server is your job, it’s important to be able to do things like work with PowerShell and even DOS, and to learn about the operating system. It’s really important to make Bing or Google your best friend and to develop good troubleshooting techniques. What seemed like a simple task took me a couple of hours to get right. Of course, while writing this, I found that I may have been able to use the FTP URL instead of a batch file. If I get a chance to revisit this, I’ll try that method instead.

Posted in Life in IT, SQL Server Administration | Leave a comment

Almost back from vacation, but with a couple links

I just returned from a weeklong trip to Cancun, Mexico. Wow! I thought that I was spoiled on a cruise, but the treatment is even better at a Cancun resort! I still have a few days until I am back to work, but I couldn’t stay away from emails. I think that, at least in my case, it shows how much I love my work. I love working with data and helping my customers get the most out of SQL Server.

As I have announced a couple of months ago, I will be speaking at PASS Summit in October. Speaking at PASS Summit is an honor and a privilege. As part of the publicity leading up to the summit, PASS is interviewing the speakers. My interview  was published this week. The theme of the interviews is “super heroes.” Be sure to read my interview to find out my favorite super hero.

I would also like to remind my readers about my Pluralsight course. If you would like to learn more about T-SQL Window Functions, be sure to check it out!  I am currently thinking about my next Pluralsight course. If there is something that you would like to learn from me, drop me a line on the Contact Me  page and let me know.  I may just create a new course based on your suggestion!

Posted in Life in IT | Leave a comment