Case aggregate within aggregate


I'm struggling with a beast mode. I'd like to know the count of customers that placed an order multiple times (repeat customers). The only way I can think of is the following formula:


SUM( (CASE when COUNT(distinct `order_id`)>1 then 1 else 0 end) )


But I know that it is not possible because there already is an aggregate in the CASE-statement. And I'm not able to group it by customer_id. Can somebody help? Or do I need to create some kind of flag in the dataset?

Best Answer

  • ST_-Superman-_
    Answer ✓

    I think you may need to calculate the number of orders per customer in a data flow using a windowed function (ETL or redshift).  You can then use a beastmode to count the number of customers with more than one order.


    You will want to consider the timeframe here though.  Once you build the number of orders per customer in the dataset, you will not be able to change the time frame that you are looking at.  Are you wanting to identify customers with multiple orders over all time, or this year, or the last 12 months, or last 90 days, etc.?


    I'm not aware of a way to accomplish both calculations via a beastmode.

    “There is a superhero in all of us, we just need the courage to put on the cape.” -Superman


  • Thank you, that was the easiest way and worked out fine. I only want to know for all time, so the loss of dates won't be a problem.


    I hoped to do it all in beastmode because I'm at the limit of my data storage.