Calculated Items in Pivot Tables

Work with Excel pivot tables long enough and you will find yourself suffering many frustrations. For starters, there’s no way to create your own AutoFormats, so unless your company wants to use Microsoft’s idea of good-looking tables as its standard format, you’re prett much stuck with copy/pasting and doing your own formatting. This is as of Excel 2003, of course.

But one of my biggest pet peeves, and one that I’ve actually been able – to some extent – to overcome, is the idea of creating my own formulas that will pivot and calculate the way any other data element in the report would. That’s a little vague, so I’ll give an example.

Right now my department is working on yearly assesments of the market shifts and demographics for 2007 (yes, it’s half-way through 2008, but that’s how long it takes to get the needed data from the state). There are a few calculations that have to be done on this data, including, but not limited to:

  1. Since the data is still only current through the third quarter of 2007, the year of 2007 needs to be multiplied by 4/3 to annualize it for proper trending.
  2. Each campus has its own overview, and groups competing hospitals differently. For my books, I need to look at our main campus, all of our campuses as a group, the three main competing hospitals as one group, and all the competition as one group.
  3. For many of the campuses, we are looking at the tri-county and seven-county regions surrounding our hospital as a grouping for high-level analyses. The smaller campuses define primary and secondary service areas by zip codes (far more annoying).

All of these caluculations and groupings can be done using “Calculated Fields” and “Calculated Items”. Seems simple enough, but this feature hid itself from me for a long time.

Ok, so lets say you have a pivot table, created from a very simple data set of foodstuffs (carrots, milk, oranges, peas, tofu, and yogurt), some states where people eat these foodstuffs, and the number of people from each state who eat each foodstuff. It might look like this:

A very simple pivot table.

Sample Pivot Table

Now, say you want to group these states differently. You want to see FL, GA, AL, and LA grouped as the southeastern states. If your dataset is small enough and is already housed in excel, you can just overwrite the values of the states FL, GA, AL, and LA as “Southeast”, or create another column titled “Region” and fill in “Southeast” wherever applicable. But if you’re using a pivot table with over 65k records, this isn’t an option (Excel 2007 features an unrestricted number of rows, which will make a lot of this obsolete).

Pivot Table Toolbar

PivotTable Toolbar

Ever seen this window before? It’s the pivot table toolbar. It’s shown here as a modal window, but it can also be embedded into the main toolbar, of course, which is where I usually have it. If you don’t have this showing, just right-click on your pivot table and select “Show PivotTable Toolbar”.

Even if you have used this toolbar before, you might not have noticed that “PivotTable” drop-down on the left. There are some interesting things in there, like this:

PivotTable drop-down

You may have to click on appropriate fields in your pivot table first in order to access both Calculated Fields and Calculated Items. In my case, I clicked on one of the states first. Once you select Calculated Item, you get the following dialog box:

Calculated Field options

This shows the fields in my pivot table on the left, and the values available for the selected field on the right. This is not the same dialog box you get for a calculated item, because calculated fields assume you are using the values of a field to create a calculation, whereas calculated items use the items themselves. It’s a very fine line between the two, and I urge you to examine for yourself the subtle differences.

Anyway, on with the show. This is how I would create a grouping of the southeastern states:

Modified Calculated Item options

Essentially, the field “Southeast” will contain the added values of whatever would have shown for AL, FL, GA, and LA. This remains true no matter how the table is pivotted (another distinction between this and calculated items). Once this is created, it shows as a regular item under the “State” field, as such:

Completed pivot table

Updated PivotTable

Now I am free to pivot as I see fit, and the “Southeast” value will always act as the simple sum of those four states, whether nested or not.

You may have noticed that adding a calculated item causes the totals to be “double-dipped”. This is yet another annoyance of pivot tables, but, really, how else would it be done? You are creating basically another item, and it needs to be included in the “Grand Total”. In our situation, we don’t need to see the southeastern states twice, we only need to see their grouped value. Simple enough, all we need to do is filter out the individual states by clicking the “State” drop-down and unchecking them, as such:

Pivot Table Filter

Now your pivot table will only show the non-southeast states and the Southeast grouping, ensuring that your totals will be correct, and all will be right with the universe.

Advertisements

18 thoughts on “Calculated Items in Pivot Tables

  1. I couldn’t help but notice that when you created the “Southeast” field, and added it to your table, you’re essentially “double-dipping” the southeastern states. I.e. their foodstuffs are counted twice, once as individual states and again as part of the “Southeast” total.

    Is there anyway around this? I’m not very familiar with pivot tables, but can they take “IF”s as inputs? For example, outside of the pivot table, you could have an input, something like “Demographic” which could be either “State”, “Region”, etc. Then for the pivot table, you could say “=IF(demographic=state, list foodstuff by state, IF(demographic=region, list foodstuff by region,’No Demographic Selected’))”. That’s sort of a mix of algorithm and actual code, but you get the point. Anyway, if that is possible, it seems like it would avoid the issue of double-dipping. You could also avoid using pivot tables altogether and stick to LOOKUP tables (which I’m also not very familiar with), which definitely can take IF inputs. I’m not entirely sure if ALL of the functionality of pivot tables can be duplicated in a LOOKUP table (which could then be subsequently filtered as desired), but it’s just a thought.

  2. Excellent point. I forgot to mention that calculated items will be included in the total, and that you would need to look out for that. I have updated the post to include this information, as well as a much easier way to exclude the individual southeastern states than the way you had in mind.

    That said, yes, you can use formulas for Calculated Items. For example, if you only want to include circumstances when the total eaters Florida is greater than 1000, your Calculated Item would be “=if(FL > 1000, FL)”. I’m not sure about the real-world application of this, though, but if I ever find a time when I use this, I will surely post about it.

    Lookup tables are great, but have nowhere near the possibilites of a pivot table. Simply put, lookup tables would be a good way of formatting the data before putting it in a pivot table for usability. Think of a pivot table as a dynamic way to view static information. You can have a spreadsheet with 50 columns and 50,000 rows, throw it in a pivot table, and mold all that data any way you see fit. It really is phenomenal.

  3. The newly posted solution to the “Double Dipping Dilemma” or DDD as I like to call it, is in fact valid. However it is quite brute force, and requires the user to know which states are in which region. The “Demographic” input suggested in the first post is more automated and only requires that the programmer know the children of each category. I guess it depends on the application of the spreadsheet, but in my experience, it’s very beneficial to make all spreadsheets as user friendly as possible. Even if you expect to be the only person to use it, because you never know when you may want to move to a different city and some poor shmuck is going to have to take over your (incredibly convoluted Excel) work.

  4. In this case, I have two thoughts.

    1) The pivot table will remember which fields you deselected last time you saved the spreadsheet. Hence, each time you include the “State” field in the table, your southeastern states will automatically be deselected. If you have done the work of creating all of the state subgroupings, then the work of deselecting the individual states from the drop-down list is child’s play.

    2) The data fed to Excel could be preformatted into groupings. For example, a SQL statement like “SELECT state, food, sum(eaters) FROM table;” can be changed to read “SELECT state, case when state in ( ‘FL’,’GA’,’AL’,’LA’ ) then ‘Southeast’ else ‘Other’ as region, food, sum(eaters) FROM table;”. Note that both SQL statements require a GROUP BY clause, but I’m lazy enough not to type it in, but not too lazy to type in this explanation. My point is that the second statement will give you your subgrouping in a separate field, which is more valuable than a calculated item anyway. The problem is, often times we don’t have the ability to alter how we get our data and have to do what we can with it.

  5. In my pivot table, I insert a calculated field, ex : Quantity*Price.

    Type Quantity Price
    Rose 5 1 $
    Tulipe 2 5
    Rose 10 1$

    The calculated field will return for Rose 30$, because it adds the Price, multiply by the total quantity.

    How can I correct this.

    Thank you for your time and hope you understand my English,
    Micheline

  6. @Micheline,

    I know you understand, but for those reading this who might not, the reason you’re getting $30 is because the item “Rose” will only appear once in your pivot table, with a quantity of 15 and a price of $2. $2 x 15 = $30.

    Pivot tables are really only good at rolling up data (giving you a total value for prices and quantities of roses), and performing calculations based on those rolled up figures. What you need in your situation is a calculation to be applied on a line-by-line basis. As far as I know, this is beyond the abilities of pivot tables.

    My only suggestion is to access the source data if possible and add a field or column that would represent the product of those two fields. I apoplogize for not being able to solve your problem using only the pivot table. If anyone else has any suggestions to solve this problem, I’d love to hear them!

  7. Hi Micheline,

    Try adding this as a calculated field (quantity*(price/price))

    this will only work if you will not change prices. :)

  8. Here’s a twist…

    I have a pivot table with two row fields (Brand and Boutique). When I add a calculated item to the data field, EVERY Boutique shows up for EVERY Brand even when none of the source data has these combinations. Thus, each boutique is listed multiply times – within the Brand that it truly belongs, as well as in each of the other Brands. Of course the values for all data in these unwanted data fields is blank with the exception of the calculated items which are #DIV/0.

    When I take the Brand row field out of the pivot table, the boutiques each show up once. But I really need the Brand rows, too.

    Any ideas???

    • I have the same problem as Tania with Calculated Items and wonder if anyone has found an answer or can confirm that Microsoft intended for this to happen.

      • Tania & Elena,

        that is indeed the twist, as far as I can see the Calculated Value that is created is not linked back to the structure in the original data

        So it cant see that a particular brand only appears in a particular Boutique and displays it in all categories resulting in DIV errors. I tried using if and iserr statements to strip out these div errors but it only results in value errors that cant be removed

        This is the real deep point from this article since it pretty much makes pivot tables with more than two classificatory items (row items) unusable. This is a critical problem since most users will require more than two row items

        Tania, I dont think Microsoft intended it to happen but its a flaw in the logic of how we are trying to use pivot tables, instead we should be manipulating the dataset rather than complicating the pivot

        Andrew

  9. What I did is manualy nest the fields to create a herarchy model.

    That’s the only way… no way to nest field coming from a db.

    Hope this help.

    • I had exactly the same problems resulting in a DIV0 when trying to set up a Index. Would it be posible, like in access, to assign a “Null” value in the if condition so whet the divisor si = 0 then asign a Null value?

      • Peter, I’m not 100% sure what you’re trying to do. However, it is certainly possible to use IF/THEN logic within a formula to escape from the horror of the division by zero error.

        If you are dividing cell A2 by cell A1, for instance, your formula might look like this:

        =A2/A1

        But if you are worried that A1 might contain a zero and return a Div/0 error, you might do this instead:

        =IF(A1=0, “-“, A2/A1)

        In this case, Excel will first check to see if your divisor is zero, then perform the appropriate operation. The only null-style value that I am aware Excel allows is simply two quotation marks with nothing between them: “”. The contents of the cell would be empty, so the colloquial value is null, however, excel wouldn’t treat it differently than a regular cell (as a database would, for instance, treat a NULL value differently than it would a blank value).

        I hope this helps!

  10. Pingback: How to insert a calculated field within the data section?

  11. Hi, I am using an old version of Excel (2000).

    I am having a problem with a Calculated Item in Pivot Table.
    My data includes five fields:

    Manager
    Client
    Project
    Type
    Amount

    “Type” has three designations, which I use as column headings for the amount:

    Billed 2012
    Collected 2012 (always a negative number)
    Billed 2013

    Manager, Client, and Project are row headings.

    The Grand Total column picks up the sum of Billed 2012 + Billed 2013 – Collected 2012.

    I want only the difference of Billed 2012 and Collected 2012, but I want the Billed 2013 column to show in the pivot table.

    My calculated field is Billed 2012+Collected 2012, which should give me the difference of just those two fields.

    This works if I only use “Manager” as a row heading, but if I add Clients, my calculated field duplicates all projects and all clients for each manager, and does limit it to each manager’s particular projects and clients.

    Worse yet, if I add Clients AND Projects, the program stops responding, presumably because it has too much work to do by not limiting projects and clients to the corresponding manager.

    Is there to do this?

    Thanks,
    Michael

  12. Very great post. I just stumbled upon your blog and
    wanted to mention that I have truly loved browsing
    your weblog posts. After all I’ll be subscribing on your feed and I’m hoping you
    write once more very soon!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s