When the Performance of LAG Lags Behind

My favorite T-SQL windowing functions are LAG and LEAD. LAG and LEAD let you take a peek at another row and grab any column you need in your results. They are easy to use and perform great…except when they don’t.

There is one situation that kills the performance of LAG and LEAD, and that is by using a variable instead of a hard coded value for the offset. “What’s an offset?” you are probably asking.

By default, LAG lets you grab a value from the previous row, and LEAD lets you grab a value from the following row. That is an offset of 1 row. If you specify a value for the  optional offset parameter, you can access a row that is further away. Let’s take a look at an example:

SELECT CustomerID, SalesOrderID, OrderDate, LAG& (OrderDate)   OVER(PARTITION BY CustomerID ORDER BY OrderDate) AS   Prev1OrderDate,
LAG(OrderDate,2) OVER(PARTITION BY CustomerID ORDER BY OrderDate) AS Prev2OrderDate
FROM Sales.SalesOrderHeader
WHERE CustomerID IN (29825,30118);

lag1

If, for some reason, I needed to use a variable instead of a hard coded value for the offset the query will still work, but there will be a performance hit. In the previous example, I filtered to show a couple of customers with lots of orders. In the next example, I’ll remove the WHERE clause and compare the performance difference when a variable is used for the offset.

SET STATISTICS IO ON;
GO
SELECT CustomerID, SalesOrderID, OrderDate,
LAG(OrderDate, 1) OVER(PARTITION BY CustomerID ORDER BY OrderDate) AS Prev1OrderDate
FROM Sales.SalesOrderHeader;
DECLARE @offset INT = 1;
SELECT CustomerID, SalesOrderID, OrderDate,
LAG(OrderDate, @offset) OVER(PARTITION BY CustomerID ORDER BY OrderDate) AS Prev1OrderDate
FROM Sales.SalesOrderHeader;

To see the difference, let’s take a look at the IO used by each query:

lag3

The query with the hard coded offset used 689 logical reads while the query with the variable used 689 + 188791 logical reads. That’s a big difference!

So, what’s going on here? SQL Server creates the worktable in memory in the first query while it is created on disk in tempdb in the second. Under the covers, LAG and LEAD are actually modified FIRST_VALUE and LAST_VALUE functions. FIRST_VALUE and LAST_VALUE queries require framing. (To refresh your memory about framing see this post.)  When the database engine doesn’t know upfront how many rows will be in the frame, it creates the worktable in tempdb instead of in memory. When the worktable lives in memory, it’s almost magic, with no IO or locking.

Hopefully, using a variable as an offset is a pretty rare occurrence. When the default or a hard coded offset is used, the performance of LAG and LEAD rock!

 

 

 

 

 

Posted in T-SQL, T-SQL Window Functions | Leave a comment

Files for my PASS Summit Windowing Function Session

Thanks to those of you who attended my session. As promised, I am sharing the slides and code.

Posted in T-SQL Window Functions | 1 Comment

PASS Summit Memories

I am leaving Seattle in just a few hours after spending a week here at the MVP Summit and PASS Summit.  I managed to make it to the keynote on Wednesday where some of the features for the next version of SQL Server were announced and to my own session on windowing functions. I think this is the first year that I completely skipped out on educational sessions, but I can always catch the recordings later as needed.  I had good intentions but the chance to visit with my many friends in the SQL Server community and make business connections were much more important to me.

This is my 11th PASS Summit, and the memories of them are all starting to blur together. There are some highlights that I’ll never forget. Here is my top 10 list of PASS Summit memories.

10. Walking all over Seattle, especially to Pike Place Market
9. Parties,  parties, and more parties
8. Learning about new features of SQL Server
7. Participating in the Welcome Reception Quiz Bowl (year???)
6. Speaking at Summit
5. Being a member on the WIT Panel (2009?)
4. Winning the PASSion Award in 2008
3. Attending Wayne Snyder’s presentation “How to Make a Name for Yourself” in 2004
2. Being part of SQL Family

And the number 1 memory of PASS Summits for me is the evening in 2006 that Tom LaRock (@sqlrockstar), Allen Kinsel (@AllenKinsel), Todd Robinson (@DevSQL), and I went out for Karaoke at Bush Garden and began the phenomenon called #sqlkaraoke!

 

 

 

 

 

Posted in Life in IT | 1 Comment