Beast mode: find most recent date for distinct ID and compare

I am trying to create the "Type" column for data in the format below. The logic: Find the row with the most recent "Date" for a distinct "ID" (second column) and compare it to the "Became Date". If "Became date" is NULL or newer than "Date":  "Type" column is set to Prospect. If "Became Date" is older than "Date": "Type" column is set to Customer. 

 

Is this even possible in a Beast Mode? 

 

DateIDBecame DateType
10/1/202011/1/2019 
10/2/2020110/3/2020 
10/3/20201 Prospect
10/1/202021/1/2019 
10/2/202021/1/2019 
10/3/202021/1/2019 
10/6/202021/1/2019Customer
9/1/2020310/1/2020 
9/3/2020310/1/2020Prospect

Best Answers

  • MarkSnodgrass
    MarkSnodgrass Portland, Oregon 🟤
    Accepted Answer

    By default, no you can't do this type of work in a beast mode. You can ask your CSM to enable "window functions in beast mode" and then you can use the OVER clause to accomplish. 

    You could do this in Magic ETL very easily with its built-in functionality. You could use a Group By tile to group by ID and MAX(date) and then join that tile to the original data on ID and date. Then use the new Formula tile to set Type to Customer if became date is less than date, otherwise set it to Prospect.

    Hope this helps.

  • jaeW_at_Onyx
    jaeW_at_Onyx Budapest / Portland, OR 🟤
    Accepted Answer

    re issue 1:  use a LEFT JOIN to keep all rows from the LEFT side.  then as you said, add an isMatch flag = 1 or 0 based on if it matched on the right.

    You could output them all to one dataset and then use a Dataset View (also a beta) to subset the data by 'with matches' vs. 'all data'.

     

     

    re:issue2: instead of doing a GROUP BY and a JOIN,

    use a RANK tile to calculate a row_number by ID order by date DESCENDING. 

    the descending is important.  because when row_number = 1 you know that's the most recent row.  Filter where Row_number = 1 then join THAT to your raw dataset then confidently grab other columns of interest.

     

    @GrantSmith , this design pattern may be relevant to your interests

Answers

  • Thanks so much. You saved me lots of thrashing in the beast mode. 

     

    I will try this in Magic ETL. I will take awhile, I have a learning curve. But I will come back and mark as solution as soon as it works. 

  • Yeah – my first Magic ETL flow. Success! Thanks so much, you saved me tons of time. Just a couple of questions:

    Question #1:

    I couldn't find a formula tile, is the formula tile in all of the releases yet? 

    I was able to work around it though, since the Group by and Join gave me what I needed and I could do the rest in a Beast Mode. 

    One question I have about the Group By and Join - I want to make the flow useful to lots of users with different use cases so I kept all the original rows after the join. I was able to do my customer/prospect check by first checking for NULL values in the new “Max date for ID column” but it may be confusing for others building cards to know to do that.  

    Question #2:

    When you do a group by is there a way to pull a value from a different column that is in the same row where the Max value was found? In this case, I like to pull the "Became Date" and put that value into a new column on the same row.

    Thanks for your help!