Are Averaged Aggregated Percentages % Straight Averages or Weighted Averages?

I’m a new user to Domo but experienced with Tableau, QlikSense, and PowerBI. I’m excited to keep learning with this new platform!

Aggregating Percentages

Does Domo calculate weighted average percentages on “Beast Mode” calculated fields when aggregating by averaging to a higher dimension?

  • For example, my data is row by Zip Code with numerical columns. Creating a calculated field that divides # New Activated Patients by # Patients to get % New Activated Patients provides a percentage per Zip Code. This works well on a Zip Code map.
  • In the above scenario, when the map is instead based on counties/FIPS, is the county’s percentage a weighted average percentage or a simple average of the Zip Code percentages?

Take the following example:

  • Zip 12345 has 5 activated users out of 10 possible for 50% activations in a "Beast Mode" calculated field.
  • Zip 98765 has 400 activated users out of 1000 possible for 40% activations in a "Beast Mode" calculated field.

County A contains both Zip Codes. When an Aggregated (Average) percentage for the County is calculated, is it a Weighted Average (40.1%) or a straight average (45%)?

The Weighted Average is BEST, while the straight average is IMPRECISE. But…which is it?

Tagged:

Best Answer

  • GrantSmith
    GrantSmith Indiana 🥷
    Answer ✓

    Domo aggregates your data based on the groupings you're using in your card. So in your example if you're doing it at a Zip code level it would have 5/10 (50%) and 400/100 (40%)

    If you remove the zip code field and replace it with the county it would pull all the values that match that criteria / buckets you've defined so you'd have (40 + 5) / (1000 + 10).


    Short answer: weighted average - It doesn't do average of average.

    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**

Answers

  • jaeW_at_Onyx
    jaeW_at_Onyx Budapest / Portland, OR 🔴

    @BrianRowe / @GrantSmith

    Neither of you are 'wrong' but it is inaccurate to say "it doesn't do average of an average"

    i'd agree "it's misleading to take the average of an average" but it can be done.

    this distinction is important because in many situations, new developers get tripped up about logic that's applied before or after the aggregation. I.e. at the row level vs. after the GROUP BY clause has been applied.


    -- row level calculation (average of an average)
    
    select
    avg( colA/colB )
    
    from table
    group by ...
    

    in the above example we are taking an average at the row level (a / b) and then aggregating it again after the group by. on most of days of the week we'd agree that's wrong.


    -- aggregate calculation (the 'mathematically correct' approach)
    sum(colA) / sum(colB)
    from table
    group by ...
    

    here there all the math happens AFTER the group by clause. this is usually 'right'.


    i'm spending time on this b/c i've seen people try to write

    CASE WHEN DATE < today()-30 THEN sum(colA) / sum(colB) END 
    

    this is 'wrong' b/c usually we want the CASE statement evaluated BEFORE aggregation (i.e. i only want activity older than 30 days). but b/c the inner function contains the aggregation (sum), the aggregation already happend (i.e. i've taken the average of all my data and lumped it into one number). it's impossible to test if the activity occured more than 30 days ago at this point.

    to calculate "only activity that occured 30 days ago" you have to apply the logic on the row level.

    sum(case when ... then amount) / sum(case when ... then denominatorAmount)
    
    Jae Wilson
    Check out my 🎥 Domo Training YouTube Channel 👨‍💻

    **Say "Thanks" by clicking the ❤️ in the post that helped you.
    **Please mark the post that solves your problem by clicking on "Accept as Solution"