Average of a calculated field that divides two columns

Hi,

I have a calculated field that divides two columns for each row. We can call this calculated field 'row total' :

SUM(`Total A`) / NULLIF(SUM(`Total B`),0)


I am trying to get the overall average of all the Row Totals:

avg(NULLIF(`Total Views`, 0) / NULLIF(`Total Conversions`,0))


I have been checking the dataset and average in Excel, and it does not match the average I get from the above calculation.


Thanks for your help!

Best Answer

  • GrantSmith
    GrantSmith Indiana 🔴
    Accepted Answer

    Hi @emmanuel_fabre

    It appears that the NULLs in your data are causing some issues which is odd because the AVG should handle nulls correctly. In any case I was able to utilize this beast mode to get around the nulls or 0 values as outlined in your excel document:

    AVG(CASE WHEN `Total Conversions` > 0 and `Total Views` > 0 THEN `Total Views`/`Total Conversions` END)
    

Answers

  • GrantSmith
    GrantSmith Indiana 🔴

    Hi @emmanuel_fabre

    How does the data not line up? Are you calculating an average of the averages or an average of the entire dataset in Excel?

    Simply put do you want:

    AVG(Total Views / Total Conversions)

    OR

    SUM(Total Views) / SUM(Total Conversions)?

  • Wow, what a quick response @GrantSmith !

    I would like an average of the entire dataset.

    When I try AVG(Total Views / Total Conversions):

    It does not provide any result (stays blank) for the overall group/dataset.


    If I add each user to the table, then it provides the 'average' for that row, which is the same as the total. Side note, it only does this for rows where conversions are 1.

    -------------------------------------------------------------------------------------------------------------------------------

    The second formula you suggested - SUM(Total Views) / SUM(Total Conversions) - is just the sum of the calculation, not an average.


  • GrantSmith
    GrantSmith Indiana 🔴
    edited July 28

    @emmanuel_fabre

    Is your average higher or lower than the Excel value? With your 4-line example would your average be 78678.0625?


    When using AVG it excludes NULL values from its calculation. If your Beast mode value is higher than the excel value it may be because it's excluding those. Should the conversion rate be 0 if there aren't any conversions?

  • emmanuel_fabre
    edited July 28

    @GrantSmith

    The conversion rate is NULL when there aren't any conversions for that row.

    Right now, I am not getting any average, so I cannot tell you whether it's higher or lower than the excel value.

    The average from AVG(Total Views / Total Conversions) returns blank in domo.

    The 'average' from SUM(Total Views) / SUM(Total Conversions) returns the total/overall 'conversion rate' (screenshot above shows the 'total' and 'avg' columns have the same result, 1,814.63.

  • jaeW_at_Onyx
    jaeW_at_Onyx Budapest / Portland, OR 🟤

    In general it sounds like you're struggling with 3 major issues


    SUM(`Total A`) / NULLIF(SUM(`Total B`),0)
    

    1, don't convert your denominator to null if it's 0. you can't divide by NULL. that's bad math.

    2, understand that this math is taking the sum of the entire column and dividing by the sum of Total B. In your description it sounds like you want to take the average of each row's calculation for a / b. If so, that should be written as

    avg( total_a / total_b )
    

    In other words, where you apply your aggregate functions (Avg) and (SUM) are very important because it determines if the math is applied before or after row consolidation.

    3. You don't need any fancy NULL handling b/c the AVG function ignores NULLs. That said you should build in handling for div by 0 because N/0 is technically not null is undefined.


    avg(NULLIF(`Total Views`, 0) / NULLIF(`Total Conversions`,0))
    

    again... don't convert your denominator to null. you don't even need to convert your numerator to null, b/c 0 divided by anything is zero. you might want to develop NULL or 0 handling if you want to avoid the div 0 error.

  • @jaeW_at_Onyx

    With regards to the first issue, I incorrectly thought I was converting NULL to 0. I understand now that I am converting 0 to NULL. I've fixed that.

    It sounds like you're telling me to use --> avg( total_a / total_b )

    However, that does not return any value in my 'aggregate' row. The 'aggregate' row is the total of that group's data. Each group has multiple users, and each user has a 'total' (group names are on a hidden column I excluded from the image to make data private).

    When I try what you are saying above --> AVG(Total Views / Total Conversions), it returns blank for the average


    I think I may need to explain myself a bit more. Perhaps, this is impossible in domo without working it into the ETL?

    I have groups: 1, 2, and 3.

    I have people within these groups: 1a, 1b, 1c, 2d, 2e, 2f, 3g, 3h, 3i.

    Each of these people has a 'total', which is --> total views / total conversions.

    I then need the average for each group (1,2, and 3).

    for example:

    Group 1 includes: 1a - total 2. 1b - total 10. 1c - total 9.

    then, the overall total = 21 (2+10+9) . And the overall average = 7 (21/3).


    Does that make sense?

  • emmanuel_fabre
    edited July 28

    @jaeW_at_Onyx


    With regards to converting the denominator, I incorrectly thought I was converting NULL to 0, but I understand now I was converting 0 to NULL. I'll fix that.

    As I mentioned above, when I use avg(total_a/total_b), it returns a blank

    I may need to explain myself more. The row above is an 'aggregate' of a group. I excluded the column showing the group name for data privacy reasons.

    The data includes several groups (1,2, and 3). Each group has users (1a, 1b, 1c, 2d, 2e, 2f, 3g, 3h, 3i). Each user has a total (their individual total views/ total conversions).

    The aggregate row above needs to be the average of user totals within a group.

    Example:

    1a total = 2, 1b total = 9, 1c total =10. The overall group 1 total is 21 (2+9+10). The overall group average is 7 (21/3). This is what I am trying to get. The average 'total' within group 1.

    Unfortunately, vg(total_a/total_b) returns a blank. Is this possible in domo? do I need to work it in the ETL by creating the 'total' column in the ETL?

  • jaeW_at_Onyx
    jaeW_at_Onyx Budapest / Portland, OR 🟤

    if Total is an average. And then you want to take the average of Total, then you're trying to take the average of an average. on most days of the week that's considered mathematically problematic.

    if that's your requirement though avg( colA/colB ) should yield the correct result; however IF the values for ColA and ColB are not on the same row then the required calc is not possible.

    IF colB sometimes contains 0 or NULL you need to implement error handling because you will have errors.


  • @jaeW_at_Onyx

    Total is not an average, it's a simple calculation/division (views/conversions -- 29,993,987.5/16,529 = 1,814.63).

    My dataset is made up of different groups. I want the average for that group. Each row has a ColA/ColB (views/conversions). However, I want the average of all rows.

    Ex: Here, I need the average 'myDivision' for Group A (which is 11). The myDivision column is a calculated field, and is not an original column of the dataset -->

    I apologize if I haven't explained myself well.


    Now, the line I have been trying to display has a simple table is the aggregate values for Group A: the 'Total Views', 'Total Conversions', and 'Average Conversion Rate' (A.K.A. Average myDivision). In the image below, I labeled myDivision as 'total', which I recognize is misleading.

    Since myDivision is a calculated field, I need to reproduce it when creating the average column you see above. What I mean is, domo does not allow me to create a column by simply creating a calculated field comprised of avg(myDivision), aka avg(total). So, I write it out like this -->

    avg(Total Views/Total Conversions). However, that gives me the NULL value you see above.


    Is this because of error handling? Or am I trying to do something that domo can't handle? I assume it's my ignorance, rather than domo ;)


    Lastly, I can't thank you enough for this help!!!

  • jaeW_at_Onyx
    jaeW_at_Onyx Budapest / Portland, OR 🟤

    Post a sample dataset. I'm not convinced the data is actually shaped the way you described, otherwise you should get a result.

  • emmanuel_fabre
    edited July 29

    @jaeW_at_Onyx

    Certainly. I assume you mean in an Excel doc? The sample dataset includes the mydesiredCalcfield, which is not in the original dataset. This column is the beast mode I create in domo, which is a simple division --> SUM(`Total Views`) / SUM(`Total Conversions`).

    I need to find the average of Group A's mydesiredCalcfield. When I create a new calculated field called AVGmydesiredCalcfield with --> avg(`Total Views` / `Total Conversions`) ... I get a blank. Please note, I cropped out the column with the Group A's real name for data privacy (col to left of Total Views):

    If I try AVGmydesiredCalcfield with --> avg(SUM(`Total Views`) / SUM(`Total Conversions`)), I get a processing error.


    I hope the sample dataset is what you requested :)

  • emmanuel_fabre
    edited July 30

    @GrantSmith


    Hi Grant Smith,

    I am genuinely perplexed. This is very similar (if not identical) to a beast mode I previously tried. The good news is it doesn't return a blank. The bad news, it is still giving incorrect averages. I QA'd against a few different groups, exporting the domo table result to Excel, making sure to filter out where total views = 0 and where total conversions = 0. The results still vary, whether it is small groups (<50 rows) or large groups (>1000 rows). Here is an example, where the exported table (filtered in excel) gives an average of 8,665.26 compared to domo's average of 7,898.28.


    I am honestly at a loss for words to explain what could be happening.


    Happy Friday!


    P.S. Coming from years of Tableau work, and based on a few months experience in Domo, I would NEVER recommend Domo. It is unreal how much time is wasted trying to get simple functions to...function. End rant.

  • emmanuel_fabre
    edited August 2

    @GrantSmith

    @jaeW_at_Onyx

    I figured it out!! Thank you SO MUCH for your help.

    The issue was that Domo was utilizing a count of users that included rows where user views were NULL and rows where user views were 0. It worked correctly once I created this calculation:

    CASE when `Total Views` IS NOT NULL AND `Total Views` > 0 AND `Total Conversions` IS NOT NULL and `Total Conversions` > 0

    THEN `Total Views`/`Total Conversions`

    END


    The last issue is that my QA included a table of measures that had to be aggregated. When I would sum views or conversions it would combine views/conversions for user 2 when user 2 had more than one row, which ultimately three off the average for obvious reasons:

    User 2. 11 views

    User 2. 5 views

    User 3. 20 views

    It should average to 12 views, but it was averaging to 18 (as if User 2 was one row with 16 views).


    I am relieved and embarrassed by how easy it was. In any case, thank you for your patience and good-will!

  • jaeW_at_Onyx
    jaeW_at_Onyx Budapest / Portland, OR 🟤

    glad you figured it out!

    "IF colB sometimes contains 0 or NULL you need to implement error handling because you will have errors."

    CASE
    when `Total Views` IS NOT NULL
    AND `Total Views` > 0
    AND `Total Conversions` IS NOT NULL
    and `Total Conversions` > 0
    THEN `Total Views`/`Total Conversions`
    END
    

    SIMPLIFY YOUR MATH. The root problem is if the denominator is null or zero. (you can always divide if the numerator is 0.


    Take off your technical hat. Is it fair to assume that if you don't have a measurement (NULL) that it's the same thing as 0?

    if i take the avg of 5,4,1,2,3, 0 , 0 that's a different metric than the avg of 5,4,1,2, 3, NULL, NULL.

    in plain english. If i take a daily measurement, Mon through Sunday and I actually have no sales on sat and sunday, that's a different average than if my store is closed on saturday and sunday (hence the nulls in v2).


    Make sure you interpret your results. Don't just report the number b/c 'it works'.

    In your math, you are assuming that 0 and NULL is the same (in the numerator) and you probably should not.

    If you're doing a spend analysis and you had thousands of views of an add and zero conversions that means you did a really bad job at marketing or have a terrible website. that does not mean you should ignore that data point...

  • @jaeW_at_Onyx


    Thanks for the input.

    This is a special case due to incomplete/inaccurate data. In short, the number of conversions, clicks, etc does not always get inputted, so we have nulls or zeros, which the people doing the data entry seem to think is interchangeable. As a result, we have a few metrics like Views: Conversion which is caveated to be based just on those with conversions for individual Views: Conversions ratios. For the overall ratio, we include all rows, including those with nulls/zeros.

    Wonky, I know.