Only include the first instance and ignore others in beastmode

Hi all,

I'm trying to calculate between signup and the first sale:

DATEDIFF(`sale_date`,`signup_date`)


Which works fine. Only problem is, is when a person makes multiple sales it does counts towards the average. Is there a way to ignore any other instance and pick only the earliest date?


Thanks in advance

Best Answer

  • jaeW_at_Onyx
    jaeW_at_Onyx Budapest / Portland, OR 🟀
    Accepted Answer

    add a column to your transactional data, n_th sale, using a rank & window function.

    you can use Row_Number or Rank depending on the granularity of the data.


    Now just alter your beast mode.

    avg( case when n_th sale = 1 then DATEDIFF(`sale_date`,`signup_date`) end )

Answers

  • MarkSnodgrass
    MarkSnodgrass Portland, Oregon 🟀

    Have you tried using the MIN function on the date field you want the earliest date from?

  • I would recommend adding a first_sale_date field to your dataset. You could do this by using a group by tile and grouping by your cust_id and then calculating the min(`sale_date`) for each cust_id. You would then join that table back to your main table by joining on cust_id. This would give you the first_sale_date as a field for each line of data. I would then also calculate the DATEDIFF(first_sale_date,`signup_date`) within the dataflow as well.

    The downside to this approach is if you are wanting to filter out certain products. If you want to be able to show the time from sign up to when the customer first purchases a shirt instead of just any purchase, and then also be able to change a filter and show how long it took them to buy shoes, then this approach would not work.

    I offer this as a solution because I am not sure of how to accomplish it within a calculated field. Do you know if you have windowed functions enabled for calculated fields in your instance?

  • user046467
    user046467 βšͺ️

    @ST_-Superman-_ thanks I got that working.

    Only sidenote I've here is that if someone buys multiple things the average is still off.

    For example:

    4 sales | 61 days difference

    1 sale | 1 day difference

    Gives an average of 37 days which should be 30 days.

    When they just have one order everything is fine.

    Did I miss a step in your explanation?

Sign In or Register to comment.