Grab the most current entry

tobypenn
tobypenn ⚪️
edited September 2 in Dataflows

I have a dataset with a list of activities. For each opportunity I want to grab the most current update/activity show those notes.

so for a proof of concept, I might have the following:

Activity Date. type description

1/1/2021. environment buildout. Built stuff to support POC

1/5/2021 project kickoff things are going swimmingly

1/7/2021 ongoing check-ins prospect thinks we're awesome

1/14/2021. ongoing check-ins crushing success criteria

etc..

For a weekly update, I'd like to have a table view, with just the most recent update. Is there a way to do that?

Answers

  • GrantSmith
    GrantSmith Indiana 🔴

    Hi @tobypenn

    You can utilize a window function in a beast mode to get the last date for a week and then compare it to the activity date and filter when they're the same

    CASE WHEN `Activity Date` = MAX(MAX(`Activity Date`)) OVER (PARTITION BY YEARWEEK(`Activity Date`)) THEN
      1
    ELSE 0
    END
    

    YEARWEEK returns the year and week in the YYYYWWW format for the supplied date.

    A couple of caveats:

    • If there is more than one activity on that date, all activities on that date will be listed
    • This assumes your weeks go Sun-Sat
    • You'll need window functions to be enabled in your instance - talk with your CSM.
    • This also assumes that you don't have any other dimensions/partitions in your data

    Alternatively you could utilize a MagicETL dataflow to calculate the year and week number, group by whatever dimensions you might need and the year and week, get the max date, join back to your original dataset based on your group by columns then filter your dataset where the date matches the max date.


    Another ETL method would be to use a rank and window function to calculate the row number partitioned by the dimension fields you want the ordering to be contained within and the year and week of your date, order by your date field descending and calculate the rank. Then filter where rank = 1.


    This last option is likely the cleanest and easiest to understand option for you and future users.

  • jaeW_at_Onyx
    jaeW_at_Onyx Budapest / Portland, OR 🟤
    CASE WHEN `Activity Date` = MAX(MAX(`Activity Date`)) OVER (PARTITION BY YEARWEEK(`Activity Date`)) THEN
      1
    ELSE 0
    END
    


    @GrantSmith and @tobypenn from my experience, using calculated columns in window functions will not work as expected in analyzer. Best I can tell, Domo broke that functionality in the Spring release and haven't fixed it since.


    in other words, you need to materialize yearweek(activity date) if you're going to take that approach.


    also, if it were me, i'd be wary of the MAX(MAX()) approach. it only makes sense depending on what you show on the axis.


    instead i would use a rank() function or sum(sum(1)) as a proxy for rank()

    sum(sum(1)) over ( partition by ... order by ... )
    

    if you're trying to do ranking you must have an explicit order by clause.


    Actually, the best route forward might be outlined here:

    https://www.youtube.com/watch?v=39VHAQEYHAc&t=3s