I have presented my T-SQL Window function presentations for a couple of years. I love it when interesting questions come up during the session. Recently, during my intro session, I was showing the difference between ROW_NUMBER, RANK, and DENSE_RANK by using a non-unique ORDER BY column. Here is the query I ran:
SELECT SalesOrderID, OrderDate, CustomerID, ROW_NUMBER() OVER(ORDER BY OrderDate) As RowNum, RANK() OVER(PARTITION BY CustomerID ORDER BY OrderDate) As Rnk, DENSE_RANK() OVER(PARTITION BY CustomerID ORDER BY OrderDate) As DenseRnk FROM Sales.SalesOrderHeader WHERE CustomerID = 11078;
Here are the results. We were looking at rows 14 and 15 where the order date was the same.
Someone asked if the row numbers 14 and 15 could be switched and assigned to the opposite SalesOrderID rows. Sure, they could. I thought that the database engine is just assigning them the way that they are because it is the easiest. So, even with the same data and the same OVER clause, ROW_NUMBER could return something different. This is called non-deterministic behavior.
In order to prove this, let’s take a look at another query to see if we can get the row numbers to switch on 2014-03-15.
SELECT SalesOrderID, OrderDate, CustomerID, ROW_NUMBER() OVER(ORDER BY OrderDate) As RowNum, RANK() OVER(ORDER BY OrderDate) As Rnk, DENSE_RANK() OVER(PARTITION BY CustomerID ORDER BY OrderDate) As DenseRnk FROM Sales.SalesOrderHeader WHERE CustomerID = 11078 ORDER BY OrderDate, SalesOrderID DESC;
By changing the sort order, I was able to get those two row numbers to switch rows:
Now 68288 gets row number 14 instead of 15 with exactly the same OVER clause.
UPDATE: After seeing that someone on Twitter disagreed with me because I changed the query’s ORDER BY clause, I decided to do some research. I found that all of the Ranking functions, and, actually, all the windowing functions that depend on ORDER BY in the OVER clause are non-deterministic. Here is a list of SQL Server functions divided by determinism.