This was a lot easier than I thought it would be. For some back story – but without revealing too much about my industry – in our company, we have individuals who provide us with a great deal of business. However, they provide our competitors with a great deal of business as well. They are not contracted to give us all of their business, and there’s really nothing we can do aside from creating solid business relationships in order to assure that they do give us at least most of their business.
So in order to easily determine which individuals are giving us most of their business (and we define that as over 70% of their business), I wrote something very similar to the following simple SQL statement (for MySQL):
1. SELECT individual_id, count(1) AS total_volume, 2. sum(case business when 'Us' then 1 else 0 end) as our_volume 3. FROM main_data_table 4. GROUP BY individual_id 5. HAVING our_volume / total_volume >= 0.7;
Note: the line items are not part of the SQL code – they were added to make explaining it easier.
I don’t normally do a lot with the HAVING keyword, but I suppose I ought to. It comes in really handy in this example. Here’s the breakdown of what’s going on:
1 : This is a stupid way of doing it (not the actual way I chose) but it essentially gives you a total number of line items for the individual_id.
2 : Two things are going on here. First, we tell MySQL that when the line item represents our business (i.e. the center where the transaction happened was one of our centers), count it as 1. Otherwise, count it as 0. That way, when they are aggregated (using the sum() function), the result is the total number of line items for the individual_id that were completed at one of our centers.
5: This limits the results to those individual_id’s that produced at least 70% of their business at one of our centers. You must use the HAVING keyword when you want to limit the results to a calculated aggregate qualifier.