Using Beast Mode, how can I use the last value when a value equals zero?

I joined two datasets with Magic ETL, one a view of daily website visits, the other a weekly snapshot of subscribers. I would like to create a Single Value gauge card that shows visits per subscriber (2.15 in my example below) with the latest subscriber number in the summary line (3,988 in my example). I would prefer a Beast Mode solution as I will need the gauge card for this month as well as last month.

Any ideas please?

Answers

  • GrantSmith
    GrantSmith Indiana 🔴

    Hi @rachelchorlton

    There's not a simple way to do this with a beast mode. You'd need to do it within your ETL to reliably get the most recent non-zero number in your dataset. You could hack a solution together using CASE statements and the LAG window function; however, you'd be restricted to a fixed number of rows back, and if the last non-zero value falls outside that window, it won't be found.


    COALESCE(
     NULLIF(`Last Subscriber Number`, 0),
     NULLIF(LAG(MAX(`Last Subscriber Number`), 1) OVER (PARTITION BY `Website1` ORDER BY `Date Name`), 0),
     NULLIF(LAG(MAX(`Last Subscriber Number`), 2) OVER (PARTITION BY `Website1` ORDER BY `Date Name`), 0),
     NULLIF(LAG(MAX(`Last Subscriber Number`), 3) OVER (PARTITION BY `Website1` ORDER BY `Date Name`), 0),
     NULLIF(LAG(MAX(`Last Subscriber Number`), 4) OVER (PARTITION BY `Website1` ORDER BY `Date Name`), 0),
     NULLIF(LAG(MAX(`Last Subscriber Number`), 5) OVER (PARTITION BY `Website1` ORDER BY `Date Name`), 0)
    )
    

    If you have a span of 6 weeks where you don't have an actual value this won't work, but you can expand it out to look more weeks back. This is back-of-the-napkin code that hasn't been tested, but it should give you a rough idea of how to accomplish what you're asking for in a beast mode.


    Window functions (like LAG) aren't enabled by default in your instance. You'll need to talk with your CSM to get them enabled.
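    The limited-lookback behavior of that COALESCE/LAG beast mode can be sketched in plain Python (the data and the `last_nonzero` helper are invented for illustration; this is not Domo code):

```python
# Sketch of the COALESCE/LAG idea: for each row, keep the value if it's
# non-zero, otherwise fall back to the nearest earlier non-zero value,
# looking back at most `max_lag` rows. Purely illustrative.

def last_nonzero(values, max_lag=5):
    """For each position, return the value if non-zero, otherwise the
    nearest earlier non-zero value within max_lag rows (else None)."""
    result = []
    for i in range(len(values)):
        filled = None
        for lag in range(0, max_lag + 1):     # mirrors the 6 COALESCE arms
            j = i - lag
            if j >= 0 and values[j] != 0:
                filled = values[j]
                break
        result.append(filled)
    return result

subs = [3800, 0, 0, 3950, 0, 3988]
print(last_nonzero(subs))  # [3800, 3800, 3800, 3950, 3950, 3988]
```

    Note how a gap longer than `max_lag` rows still produces no value, which is exactly the restriction described above.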

  • @GrantSmith if nothing else, you have saved me additional days just trying to figure this out, so thanks! I'll try your code above in my ETL later today, fingers crossed

  • GrantSmith

    The code above is for a beast mode.

    If you want to do it in an ETL:

    1. Use two separate Add Constant tiles to add a constant column called Join Column and set it to 1 in both.
    2. Add a Filter tile after one of those Add Constant tiles to filter out any records with a 0 value.
    3. LEFT join the two flows with a Join tile (your main dataset on the left, the filtered one on the right).
    4. Filter the joined data so the snapshot date isn't in the future relative to each row's date.
    5. Feed the result into a Rank & Window tile to calculate the row number, ordered by the prior date in descending order. Make sure you partition your data correctly (likely by Website1).
    6. Filter the records for ROW NUMBER = 1. That'll get the most recent value for you.
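    Outside of Domo, the same join-filter-rank recipe can be sketched in plain Python (dates, values, and the `most_recent_snapshot` helper are invented for illustration; this is not Domo code):

```python
# Rough equivalent of the ETL recipe: for every row in the main dataset,
# consider all non-zero snapshots on or before the row's date, rank them
# by snapshot date descending, and keep only rank 1. Illustrative only.

def most_recent_snapshot(rows, snapshots):
    """rows: list of (date, visits); snapshots: list of (date, subscribers).
    Returns a list of (date, visits, latest_nonzero_subscribers_or_None)."""
    nonzero = [s for s in snapshots if s[1] != 0]          # the Filter tile
    out = []
    for date, visits in rows:
        candidates = [s for s in nonzero if s[0] <= date]  # no future dates
        candidates.sort(key=lambda s: s[0], reverse=True)  # Rank & Window
        best = candidates[0][1] if candidates else None    # ROW NUMBER = 1
        out.append((date, visits, best))
    return out

rows = [("2021-01-05", 120), ("2021-01-12", 95)]
snaps = [("2021-01-04", 3950), ("2021-01-11", 0), ("2021-01-12", 3988)]
print(most_recent_snapshot(rows, snaps))
```

    Unlike the beast mode, there is no fixed lookback window here: each row simply inherits the newest non-zero snapshot that precedes it.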

  • GrantSmith

    The ETL would be more accurate and flexible in terms of how far back your last value occurred whereas the beast mode would only go back as far as you have defined.

  • jaeW_at_Onyx
    jaeW_at_Onyx Budapest / Portland, OR 🟤

    This is a mind bender, but a little more bulletproof than @GrantSmith 's approach.

    His approach works as long as you never have more than a 6-day gap in the data; if you did, you'd return the wrong result. Consider the following example.


    Assume you have gaps on Jan 6, 7, and 9 (and 3 and 4, but I forgot to include them).


    Create a new binary column, "isNewGroup", that simply flags whether there is a value for Measurement:

    case when Measurement is not null then 1 else 0 end
    

    Then calculate the cumulative sum of the isNewGroup column to identify groups of rows where the Measurement should be the same. Use a Rank & Window tile. This is pseudocode:

    sum(isNewGroup) OVER (PARTITION BY Website ORDER BY Date ASC)
    

    Then use a Group By tile to calculate MAX(Measurement), grouped by Website and the cumulative-sum column (the group ID).

    Then JOIN back to the transactions ON Website and that group ID.


    Win.
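    The cumulative-sum grouping trick above can be sketched in plain Python (the data and the `forward_fill` helper are invented for illustration; this is not Domo code):

```python
# Sketch of the cumulative-sum grouping: rows with no Measurement inherit
# the group of the last row that had one, so a per-group MAX fills every
# gap no matter how long it is. Illustrative only.

def forward_fill(measurements):
    """measurements: list of values, where None means 'no snapshot today'."""
    group_ids = []
    gid = 0
    for m in measurements:
        if m is not None:     # isNewGroup = 1 starts a new group...
            gid += 1          # ...and the running SUM assigns the group ID
        group_ids.append(gid)
    # Group By tile: MAX(Measurement) per group (one non-None value each)
    group_max = {}
    for g, m in zip(group_ids, measurements):
        if m is not None:
            group_max[g] = m
    # JOIN back to the rows on the group ID
    return [group_max.get(g) for g in group_ids]

vals = [3950, None, None, None, None, None, None, 3988]
print(forward_fill(vals))  # the 6-row gap is filled with 3950
```

    Because the group ID keeps counting regardless of gap length, this works for an arbitrarily long run of missing values, which is the advantage over the fixed-lookback beast mode.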


    ----- THAT SAID ----

    JOINING daily measurements to weekly snapshots is exactly what you should NOT do. You are doing 'crazy math' to solve what is inherently a poor data design.

    It's a totally different use case, but the idea is the same:

    https://www.youtube.com/watch?v=PVbOeLSae9o&t=820s

    Instead, consider APPENDING the two datasets with a different constant ("activity type" = 'daily measurement' or 'weekly snapshot').

    Then, for the daily measurements, just add two columns: "date_end of week" and "isEndOfWeek".

    In Analyzer, you can just filter to 'isEndOfWeek' and you'll always have the end-of-week values. It would be inaccurate to look at intra-week numbers because on Monday you'd show that your subscriber count was the end-of-week value, which is ... wrong.

    Alternatively, if you need to aggregate intra-week numbers, just use date_endOfWeek on the axis.