How would I filter date field for only the most recent date?

MichaelClark
MichaelClark Wichita Falls, Texas ⚪️

How would I configure a BeastMode to filter only the most recent date?

My data is machine name / last_sync, which is creating duplicates for the machine name and each reported last_sync date. I only want to capture the most recent date, which should also display one machine name per date.

Thanks,

Michael


Answers

  • MarkSnodgrass
    MarkSnodgrass Portland, Oregon 🟤

    You could create a beast mode that is MAX('last_sync') . This will give you the latest sync date based on the other columns that you are choosing to display in your card.

  • GrantSmith
    GrantSmith Indiana 🔴

    Alternatively you can use a beast mode and window functions to determine which row is the latest with something like:


    SUM(SUM(1)) OVER (PARTITION BY machine_name ORDER BY last_sync DESC)
    

    Then you can just filter this field for records with a value of 1.

  • jaeW_at_Onyx
    jaeW_at_Onyx Budapest / Portland, OR 🟤

    @GrantSmith you can't filter on window functions in analyzer unless you turn on the feature switch "enable filter on aggregates".


    also, while sum(sum(1)) .. should work, you may consider sum(max(1)) over ...

    this will give you a dense_rank instead of the count of rows (just in case you have multiple rows per machine on a given day.


    Honestly, if it was me though, i would recommend implementing the row-duplication as either a dataset view or as a separate dataset, current_data using Magic. calculating this window function in analyzer is unnecessarily expensive.

  • MichaelClark
    MichaelClark Wichita Falls, Texas ⚪️

    @jaeW_at_Onyx so I actually have a dataflow in order to bring in last_sync from the inventory report (as well as other fields) since the patch report does not have last_sync field. So your suggestion would be to remove duplicate machines with older sync dates in the dataflow?

    Michael

  • MichaelClark
    MichaelClark Wichita Falls, Texas ⚪️

    @jaeW_at_Onyx also, where is the switch to enable filter on aggregates? I believe it would be a useful function for the dashboards we are developing.

  • GrantSmith
    GrantSmith Indiana 🔴

    @user030156

    It's a Beta feature. Talk with your CSM to get the process started about getting it enabled in your instance

  • MarkSnodgrass
    MarkSnodgrass Portland, Oregon 🟤

    @user030156

    To your original question, I would advocate for implementing the simplest and most straightforward solution so that others are easily able to understand if they have to manage the card or dataset down the road. Did you try just choosing MAX in the aggregate options when you dragged in the date field into your table card and clicked on the pencil icon? If that didn't get you what you wanted, using the ETL to remove duplicates as @jaeW_at_Onyx would be the next simplest solution.

    I always to try not overthink things and see what straightforward options Domo is already giving me so that I am not wondering how a card is working six months down the road.

  • MichaelClark
    MichaelClark Wichita Falls, Texas ⚪️

    @MarkSnodgrass I dragged last_sync and chose MAX aggregation as well as created a BeastMode with MAX('last_sync') and I am still seeing the same output.

    Michael


  • MarkSnodgrass
    MarkSnodgrass Portland, Oregon 🟤

    @MichaelClark

    Try breaking things down. Start with only having the last sync field in the table card and setting the aggregation to max. This should give you one row with the latest sync data in your entire dataset. If you are seeing more than one row, than a setting in Analyzer is causing the issue. The most common reason is what is in the sorting properties. Make sure you are not sorting by something that would cause your card to not be able to group properly. Also, check the date range filter and make sure it is using the last sync for the date range. Start adding fields back in until you experience the duplication issue. This should reveal what is causing the issue.

  • jaeW_at_Onyx
    jaeW_at_Onyx Budapest / Portland, OR 🟤

    @MichaelClark and @MarkSnodgrass


    1) Michael's card looks the same because while he is showing max(date) he also has date on the axis hence why the data isn't changing.

    to put on the technical lens: your SQL query has become

    select
       machine,
       date,
       max(date)
    from
       table
    
    group by
       machine,
       date
    

    date and max date will always be the same until you remove date from the axis.


    2) regardless, Mark, using max(date) won't help because it will just return the max date. you won't be able to return other stuff related to the record with the max last_sync date.

    select
    machine
    max(date)
    
    from 
    table
    
    group by 
    machine
    

    to get the other related rows the query would have had to be written as

    select
       machine
       date
       ... and other stuff....
    from 
       table a
    where
       date = (select max(date) from table b where a.machine = b.machine )
    

    OR

    select
        machine,
        date
        ... and other stuff ...
    
    from table
    where
       date = max(date) over (partition by machine)
    


    but analyzer doesn't really support either of those options gracefully. so working it into ETL would probably be the cleaner solution without overcomplicating your beast modes.

  • MichaelClark
    MichaelClark Wichita Falls, Texas ⚪️

    @jaeW_at_Onyx @MarkSnodgrass it looks like removing the additional fields, adding back last_sync with MAX aggregation and adding NAME gives me the most recent sync date per machine!

    Michael