A window aggregate function is an easy way to perform an aggregate calculation without changing the query to an aggregate query. This means that you can calculate an aggregate over the entire dataset while keeping the detail in the results. You can also perform calculations on smaller windows, or partitions, of the data. Here is a query that provides an overall total, a total for each customer, and the average sale for each customer. Even though the query is returning sums and an average, the detail remains in the results.
SELECT CustomerID, TotalDue, SalesOrderID, SUM(TotalDue) OVER() AS GrandTotal,
SUM(TotalDue) OVER(PARTITION BY CustomerID) AS CustTotal,
AVG(TotalDue) OVER(PARTITION BY CustomerID) AS CustAvg
FROM Sales.SalesOrderHeader;
You can also use windows aggregate functions within aggregate queries with some caveats. When I first began using these functions back in the SQL Server 2005 days, I had mixed success with this technique. That was long before I began focusing on window functions in my writing and speaking. I have learned a lot since then.
Let’s start with an aggregate query grouped at the customer level.
SELECT CustomerID, SUM(TotalDue) AS CustTotal,AVG(TotalDue) AS CustAvg
FROM Sales.SalesOrderHeader
GROUP BY CustomerID;
Let’s say I wanted to add a grand total by adding this to the query: SUM(TotalDue) OVER() AS GrandTotal.
SELECT CustomerID, SUM(TotalDue) AS CustTotal,
AVG(TotalDue) AS CustAvg,
SUM(TotalDue) OVER() AS GrandTotal
FROM Sales.SalesOrderHeader
GROUP BY CustomerID;
Unfortunately, this doesn’t work. I see this error message:
This seems really strange since TotalDue is part of an aggregate function. The problem is that the window aggregate function must operate either on a column in the GROUP BY clause or on an aggregate expression. I can’t add TotalDue to the GROUP BY because that will change the results I expect, but I can operate on the sum of TotalDue:
SELECT CustomerID, SUM(TotalDue) AS CustTotal,
AVG(TotalDue) AS CustAvg,
SUM(SUM(TotalDue)) OVER() AS GrandTotal
FROM Sales.SalesOrderHeader
GROUP BY CustomerID;
It is also not possible to add a PARTITION BY option on a non-grouping column or non-aggregated column. Let’s say I want to get a total for each order year and try this:
SELECT CustomerID, SUM(TotalDue) AS CustTotal,
AVG(TotalDue) AS CustAvg,
SUM(SUM(TotalDue)) OVER() AS GrandTotal,
SUM(SUM(TotalDue))
OVER(PARTITION BY YEAR(OrderDate)) AS YearTotal
FROM Sales.SalesOrderHeader
GROUP BY CustomerID;
At this point, I can go back to a non-aggregate query and use all window functions to actually get the results I am looking for.
SELECT CustomerID, YEAR(OrderDate) AS OrderYear,
SUM(TotalDue) OVER(PARTITION BY CustomerID) AS CustTotal,
AVG(TotalDue) OVER(PARTITION BY CustomerID) AS CustAvg,
SUM(TotalDue) OVER() AS GrandTotal,
SUM(TotalDue) OVER(PARTITION BY YEAR(OrderDate)) AS YearTotal
FROM Sales.SalesOrderHeader;
So, how can you tell when adding a window aggregate function to an aggregate query will work? Well, think about how the window function works. SQL Server will create a work table of each partition and perform the calculation on the work table. It can only work with what is available in the result set.
If my query is a non-aggregate query, then the window will contain all the columns and the window aggregate function can work with any of the columns in the table depending on data type. If the query is an aggregate query, then the window function can only work on what could normally be listed in the SELECT list. That means that the window function can operate on any of the grouping columns or on an aggregate function.
Window aggregate functions are extremely useful. They make some queries very easy to write and maintain.