News from Aunt Kathi

Now that writing and editing are a big part of my job at Redgate, I rarely seem to find time to post anything here. Since there is a lot going on over the next two weeks, I thought this would be a great time to post. You can find most of what I write now days here.

new book

Once I’m home from PASS Summit, that will probably be the end of my speaking engagements for the year, but who knows? Contact me to speak at your user group remotely if you need a speaker in December.

PASS_19_Summit_Speaking_Generic_Architecture_Banners_v1-2_Email_Signature

Posted in Life in IT | Leave a comment

May 2019 T-SQL Tuesday #114: Puzzles

This month’s T-SQL Tuesday is hosted by Matthew McGiffen. His challenged is to come up with a puzzle.  My entry involves a simple request that is more difficult to solve than it seems on the surface.

TSQL2SDAY-300x300

I was recently asked by one of my siblings during a family party how to find the row with the maximum count from a dataset. Yes, I do have that geeky of a family. He was working with MySQL, but I figured the solution would be similar to T-SQL. Since neither type of database was handy during dinner, I just explained that you have to separate the logic with something like a subquery.

He had his own data, but I’ll explain with a similar scenario in AdventureWorks. The requirement is to find the customer and order count for the customer or customers who have placed the most orders. You could write a query like this which returns the rows first along with all the other rows.

 

SELECT CustomerID, COUNT(*) CountOfOrders
FROM Sales.SalesOrderHeader
GROUP BY CustomerID
ORDER BY COUNT(*) DESC;

One row is returned for each customer, not exactly what is needed.

puzzle1

You could use TOP to return just one row, but this is still not accurate since two customers have the maximum count, and only one of them will show up in these results.

SELECT TOP(1) CustomerID, COUNT(*) CountOfOrders
FROM Sales.SalesOrderHeader
GROUP BY CustomerID
ORDER BY COUNT(*) DESC;

puzzle2

If nesting aggregate functions was possible, you could filter in the HAVING clause. This query won’t work!

SELECT TOP(1) CustomerID, COUNT(*) CountOfOrders
FROM Sales.SalesOrderHeader
GROUP BY CustomerID
HAVING COUNT(*) = MAX(COUNT(*));

puzzle3

What about using a CTE (common table expression) or derived table to find the customers and counts first? You might think a query like this would work:

WITH Counts AS (
SELECT CustomerID, COUNT(*) CountOfOrders
FROM Sales.SalesOrderHeader
GROUP BY CustomerID
)
SELECT CustomerID, CountOfOrders, MAX(CountOfOrders) AS MaxCount
FROM Counts
GROUP BY CustomerID, CountOfOrders
HAVING CountOfOrders = MAX(CountOfOrders);

puzzle4

Unfortunately, since it’s grouped by CustomerID and CountOfOrders, the Max(CountOfOrders) is returned for each group. This is not the solution.

What about my favorite type of function, windowing functions? By adding an OVER clause to MAX(CountOfOrders), the overall max count can be returned.

WITH Counts AS (
SELECT CustomerID, COUNT(*) CountOfOrders
FROM Sales.SalesOrderHeader
GROUP BY CustomerID
)
SELECT CustomerID, CountOfOrders, MAX(CountOfOrders) OVER() AS MaxCount
FROM Counts
GROUP BY CustomerID, CountOfOrders
HAVING CountOfOrders = MAX(CountOfOrders);

puzzle5

The problem is that you cannot directly filter a windowing function. This returns an error:

WITH Counts AS (
SELECT CustomerID, COUNT(*) CountOfOrders
FROM Sales.SalesOrderHeader
GROUP BY CustomerID
)
SELECT CustomerID, CountOfOrders, MAX(CountOfOrders) OVER() AS MaxCount
FROM Counts
GROUP BY CustomerID, CountOfOrders
HAVING CountOfOrders = MAX(CountOfOrders) OVER();

puzzle6

The beauty of window aggregates is that you can have multiple levels of aggregation in the same query. What about moving the MAX(Count) to the CTE? You may think this looks strange since you can’t nest aggregate functions, but in this case, the window function is operating on an aggregate function!

WITH Counts AS (
SELECT CustomerID, COUNT(*) CountOfOrders,
MAX(COUNT(*)) OVER() AS MaxCount
FROM Sales.SalesOrderHeader
GROUP BY CustomerID
)
SELECT CustomerID, CountOfOrders, MaxCount
FROM Counts;

puzzle7

The only thing left to do is to add a filter, and this time it will work since the aggregations are completed in the CTE.

WITH Counts AS (
SELECT CustomerID, COUNT(*) CountOfOrders,
MAX(COUNT(*)) OVER() AS MaxCount
FROM Sales.SalesOrderHeader
GROUP BY CustomerID
)
SELECT CustomerID, CountOfOrders, MaxCount
FROM Counts
WHERE CountOfOrders = MaxCount;

puzzle8

There are many other ways of solving this. You could use a view or a derived table (subquery) for example, but I think this is the best method. Another thing to note is that the solution will require three levels if you don’t use the windowing function. Here is an example:

WITH Counts AS (
SELECT CustomerID, COUNT(*) CountOfOrders
FROM Sales.SalesOrderHeader
GROUP BY CustomerID
),
MaxCount AS (
SELECT MAX(CountOfOrders) AS MaxCount
FROM Counts
)
SELECT CustomerID, CountOfOrders, MaxCount
FROM Counts
CROSS APPLY MaxCount
WHERE CountOfOrders = MaxCount;

puzzle9

The first level returns the list of customers and counts. The second finds the overall maximum count. The outer query uses CROSS APPLY to join the two CTEs. Now that the aggregation is completed before the outer query, the filter works.

I thought this was a nice puzzle for T-SQL Tuesday because the initial requirements sound so simple. The solution is not difficult but also not intuitive.

 

Posted in Uncategorized | 4 Comments

Celebrating Women in Tech

NOTE: This post was originally published on Simple-Talk.

NASA recently named a facility in honor of Katherine Johnson, the mathematician who performed the calculations for space flights in the 50s and 60s. She was one of the “human computers” whose stories were made famous by the movie Hidden Figures just a few years ago. Until that movie was released, the contributions of these women were not well known. Imagine making those calculations by hand today!

There are many women pioneers in technology whose accomplishments outshine anything most of us could do these days. Ada Lovelace wrote the first computer program for a theoretical machine about 100 years before the first real computer existed. Admiral Grace Hopper developed one of the first compilers. The inventions of Hedy Lamarr during World War II lead to the Wi-Fi technologies we all now use.

At the dawn of the computing age, women coders were the norm. Most of the programmers on the ENIAC system were women, for example. Programming was considered low-prestige, unglamorous, and tedious work most suited for women, but even then, women didn’t get credit for their achievements.

Many things have changed over the decades, including the attitudes towards programming. It’s now a highly prized and well-paying skill. Development teams are more likely to be comprised solely of men. There are many factors that have led to this disparity, but the turning point seems to be the 1980s during the rise of home computers. Since then, the number of women in computer science university programs has decreased with a subsequent drop in women working in technology fields.

Women have always had the brain power to work in technology, but they are often discouraged from pursing tech careers by the very people who should be supporting them. Toxic “brogrammer” cultures or inflexible work environments can also contribute to making the women who work in tech drop out. It’s not easy working on a team where you don’t feel welcome, and fixing these problems is good for everyone, not just women.

When you think about how much easier coding is today with modern languages, you must realize these pioneering women had mad skills. Women have made a big difference in tech throughout history, but their stories are not always told. While the numbers are smaller, women continue to innovate and lead.

Do you enjoy your technology career? Be sure to thank the women tech pioneers whose accomplishments have made your life easier!

Posted in Uncategorized | Leave a comment