# Rolling Average for Each Category

I have a dataset containing total monthly units by month and by person. I'd like to calculate the 3 month rolling average for each person. How should I group the data so that the rolling average calculation goes by person and not by date?

• This is how my dataset looks so far

• Hi @JR_S1L

You'd need to keep your data as-is and partition by name to get the lag for the prior months for your rolling average. You can use a Rank & Window tile in Magic ETL to calculate the lag; just make sure you're partitioning by the name and that you're sorting/ordering by the month.

**Did this solve your problem? Accept it as a solution!**
• Hi @GrantSmith ....I am new to lag functions, can you explain a bit further? This is what I have so far when calculating the lag using Magic ETL...how would I then calculate the rolling average?

• I've done a writeup of lag functions and rolling averages before utilizing Beast Modes, but the idea at the end is the same concept. You just need to calculate the lag for 1 month ago and for 2 months ago in two separate fields. Then add all three together (current month, 1 month ago, and 2 months ago) and divide the total by 3 using a Formula tile. That will get your rolling 3-month average.

This depends on how you want to define the rolling three months: is it the current month and the prior 2 months, or the prior 3 months excluding the current month? If you want the prior 3 months, also calculate the lag for 3 months ago and add that value instead of the current month's value.
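As a sketch of the same arithmetic outside Domo, here is a minimal pandas example (the column names `person`, `month`, and `units` are assumptions for illustration, not the poster's actual schema): lag each person's series by 1 and 2 months, add the current month, and divide by 3.

```python
import pandas as pd

# Hypothetical monthly totals by person.
df = pd.DataFrame({
    "person": ["A", "A", "A", "B", "B", "B"],
    "month": pd.to_datetime(["2021-01-01", "2021-02-01", "2021-03-01",
                             "2021-01-01", "2021-02-01", "2021-03-01"]),
    "units": [10, 20, 30, 5, 15, 25],
})

# Sort by month within each person, then compute the lag-based average:
# current month plus the two prior months, divided by 3.
df = df.sort_values(["person", "month"])
g = df.groupby("person")["units"]
df["rolling_3mo"] = (df["units"] + g.shift(1) + g.shift(2)) / 3
```

The `groupby(...).shift(n)` calls are the pandas equivalent of a lag partitioned by person; the first two months of each person come out empty because there aren't enough prior months yet.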

If you're interested, here's the writeup using a Beast Mode (but if your rolling average won't need filters applied to it dynamically, I'd recommend keeping it in the Magic ETL): https://dojo.domo.com/discussion/52679/domo-ideas-conference-beast-modes-rolling-averages#latest

**Did this solve your problem? Accept it as a solution!**
• +1 on Grant's solution using Lag(1), Lag(2), Lag(3) IF YOU DON'T HAVE GAPS IN YOUR DATA.

The only thing I strongly dislike about this approach is that you have to create 3 functions (not efficient) AND it must be true that there are no gaps in the data.

Computationally, @GrantSmith, I would say this is probably the only good use case for MySQL, because it allows you to calculate LAG without the assumption of contiguous data (cc @MarkSnodgrass).

NOTE: THIS IS ALL UNTESTED CODE. There are probably missing commas; when you set variables (as opposed to displaying them) you must use :=, and variable names are case sensitive.

if you write

```sql
SELECT
t.*,
r.*
FROM
table t
, (SELECT @person := '', @date := NULL) r
ORDER BY t.Date
```

You are cross joining a generated table with one row (r) onto every row of table t, where table r contains the variables person and date, which are instantiated as blank and null respectively.
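To illustrate just that one-row cross join, here's a Python/sqlite3 sketch (sqlite has no user variables, so only the join behavior is shown; table and column names are hypothetical):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (d TEXT, v INTEGER)")
con.executemany("INSERT INTO t VALUES (?, ?)",
                [("2021-01-01", 1), ("2021-01-02", 2)])

# Cross join t with a one-row derived table r: every row of t
# gets paired with r's single row of initialized values.
rows = con.execute(
    "SELECT t.d, t.v, r.person, r.dt "
    "FROM t, (SELECT '' AS person, NULL AS dt) r "
    "ORDER BY t.d"
).fetchall()
```

Because r has exactly one row, the cross product has the same number of rows as t; the initialized person/date values just ride along on each row.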

from there if you write

```sql
SELECT
@date as prevDate,
t.*,
@date := t.date as _setDateVariable

FROM
table t
, (SELECT @person := '', @date := NULL) r

ORDER BY t.Date
```

When MySQL reads this query it will process each column of the SELECT statement in order (top to bottom). So if I was on row 3 of my table...

prevDate is the value assigned to @date (which happens to be the value of t.date from row 2; I'll explain in a second), then it reads all the values for that row of table t, then it assigns @date the value of t.date for the current row (row 3).

when i SELECT row 4...

prevDate is the value of @date (row 3) , t.* is all the contents of row 4, then i assign ( using := ) the value of t.date (row 4) to @date.

and so on.
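That read-then-assign sequence can be mimicked procedurally. A minimal Python sketch of the same idea, where the variable prev_date plays the role of @date (the sample rows are hypothetical):

```python
# Rows arrive in date order, just like the SQL's ORDER BY t.Date.
rows = [{"date": "2021-01-01"}, {"date": "2021-01-02"}, {"date": "2021-01-03"}]

prev_date = None   # @date := null (the one-row init table)
out = []
for row in sorted(rows, key=lambda r: r["date"]):
    out.append({"prevDate": prev_date, **row})  # read @date first...
    prev_date = row["date"]                     # ...then @date := t.date
```

Each row reads the variable before reassigning it, so it always sees the previous row's date; the first row sees the initialized null.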

Next, implement partitioning and test for the previous day:

```sql
SELECT
-- execute tests: testing @person implements the partition clause,
-- testing @lag1Date enforces the no-gaps rule (this is the key part that Magic doesn't give you)

case when t.person = @person and @lag1Date = date_add(t.date, interval -1 day) then @lag1Value end as prevDayValue,
t.*,

-- partitioning assumes the ORDER BY clause below; the read-time test on
-- @person handles partition boundaries, so the assignments can simply
-- capture the current row's values
@lag1Value := t.value as _setLag1Value,
@lag1Date := t.date as _setLag1Date,

@person := t.person as _setPersonVariable

FROM
table t
, (SELECT @person := '', @lag1Value := NULL, @lag1Date := NULL) r

ORDER BY t.person, t.Date
```

```sql
SELECT
-- execute tests: testing @person implements the partition clause,
-- testing the lag dates enforces the no-gaps rule

case when t.person = @person and @lag1Date = date_add(t.date, interval -1 day) then @lag1Value end as prevDayValue,
case when t.person = @person and @lag2Date = date_add(t.date, interval -2 day) then @lag2Value end as prev2DayValue,

t.*,

-- cascade the values of lag1 to lag2, or reset to null at a partition boundary
@lag2Value := case when t.person = @person then @lag1Value else null end as _setLag2Value,
@lag2Date := case when t.person = @person then @lag1Date else null end as _setLag2Date,

-- capture the current row's values (the read-time @person test guards lag1)
@lag1Value := t.value as _setLag1Value,
@lag1Date := t.date as _setLag1Date,

@person := t.person as _setPersonVariable

FROM
table t
, (SELECT @person := '', @lag1Value := NULL, @lag1Date := NULL, @lag2Value := NULL, @lag2Date := NULL) r

ORDER BY t.person, t.Date
```
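Under the same assumptions (rows sorted by person then date, a "gap" meaning a missing calendar day), the read-time tests in the query above amount to this procedural logic, sketched in Python with hypothetical sample rows:

```python
from datetime import date, timedelta

# Hypothetical rows; note the gap (Jan 3 is missing for person A).
rows = [
    {"person": "A", "date": date(2021, 1, 1), "value": 10},
    {"person": "A", "date": date(2021, 1, 2), "value": 20},
    {"person": "A", "date": date(2021, 1, 4), "value": 40},
    {"person": "B", "date": date(2021, 1, 2), "value": 5},
]

prev_person = None              # @person
lag1_date = lag1_value = None   # @lag1Date / @lag1Value
out = []
for r in sorted(rows, key=lambda x: (x["person"], x["date"])):
    # read phase: same person AND exactly one day earlier, else NULL
    same_person = r["person"] == prev_person
    no_gap = lag1_date == r["date"] - timedelta(days=1)
    out.append({**r, "prevDayValue": lag1_value if same_person and no_gap else None})
    # assign phase: mirror the := assignments at the end of the SELECT list
    lag1_value, lag1_date = r["value"], r["date"]
    prev_person = r["person"]
```

The person test keeps person B's first row from picking up person A's last value, and the date test returns null across the Jan 3 gap instead of silently using a stale lag.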

Jae Wilson
Check out my 🎥 Domo Training YouTube Channel 👨‍💻

**Say "Thanks" by clicking the ❤️ in the post that helped you.**
**Please mark the post that solves your problem by clicking on "Accept as Solution".**
• Thinking it through, another set-based way of approaching this without variables would be to do what SQL has to do under the covers (generate windows): for each row in your dataset we'll generate 4 rows, then we'll assign the value of the transaction for those lagged rows if they exist, then we'll aggregate.

This assumes we have a numbers table, sys_num, that contains the integers 0 through 3 in a column called num.

NOTE: this is untested code, but I think it works.

P.S. You could implement this in Magic if you wanted to avoid writing SQL; it would likely perform faster since we haven't indexed the MySQL tables.

Duplicate (using a cross join) the rows in table t 4 times:

```sql
CREATE TABLE referenceTable as

SELECT distinct
date_add(t.date, interval n.num day) as report_date,
t.person,
t.value as lag_value,
n.num as lag_num,
t.date as actual_transaction_date
FROM
table t
, (SELECT num from sys_num where num between 0 and 3) n
```

For each row in table t, referenceTable will contain matching rows wherever a transaction exists at a 0, 1, 2, or 3 day offset (report_date).

```sql
CREATE TABLE blowOut as

SELECT
t.person,
t.date,
r.lag_num,
r.lag_value,
case when r.lag_num = 1 then r.lag_value end as lag1day,
case when r.lag_num = 2 then r.lag_value end as lag2day,
case when r.lag_num = 3 then r.lag_value end as lag3day,
case when r.lag_num = 0 then r.lag_value end as lag0day
FROM
table t
JOIN
referenceTable r

on t.person = r.person
and t.date = r.report_date
```

deduplicate rows

```sql
create table lag_table as

SELECT
max(lag0day) as lag0day,
max(lag1day) as lag1day,
max(lag2day) as lag2day,
max(lag3day) as lag3day,

date,
person

from blowOut

GROUP BY
date,
person
```

Then join in the data for that person and that row:

```sql
select
t.*,
<lt.columns>

from table t
inner join
lag_table lt
on t.person = lt.person
and t.date = lt.date
```
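Here's a hedged end-to-end sketch of this set-based pipeline, rewritten as CTEs and run against sqlite from Python (the table and column names are assumptions, and sqlite's date() stands in for MySQL's date_add):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- hypothetical sample data; note the gap (Jan 3 is missing)
CREATE TABLE t (person TEXT, d TEXT, value INTEGER);
INSERT INTO t VALUES
  ('A', '2021-01-01', 10),
  ('A', '2021-01-02', 20),
  ('A', '2021-01-04', 40);

-- stand-in for the sys_num numbers table (offsets 0 through 3)
CREATE TABLE sys_num (num INTEGER);
INSERT INTO sys_num VALUES (0), (1), (2), (3);
""")

rows = con.execute("""
WITH blow_out AS (
    -- each transaction is projected forward onto the dates it lags behind
    SELECT t.person,
           date(t.d, '+' || n.num || ' day') AS report_date,
           n.num, t.value
    FROM t, sys_num n
),
lag_table AS (
    -- deduplicate: one row per person/date with the lagged values pivoted out
    SELECT person, report_date,
           MAX(CASE WHEN num = 1 THEN value END) AS lag1day,
           MAX(CASE WHEN num = 2 THEN value END) AS lag2day,
           MAX(CASE WHEN num = 3 THEN value END) AS lag3day
    FROM blow_out
    GROUP BY person, report_date
)
-- join back so only real transaction dates survive
SELECT t.person, t.d, t.value, lt.lag1day, lt.lag2day, lt.lag3day
FROM t
INNER JOIN lag_table lt
        ON t.person = lt.person AND t.d = lt.report_date
ORDER BY t.person, t.d
""").fetchall()
```

The offset-0 rows ensure every transaction date appears in lag_table so the final inner join keeps it, and the Jan 3 gap shows up naturally: Jan 4's lag1day is null because no transaction landed one day before it.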
• @GrantSmith , @jaeW_at_Onyx thank you both so much!!! I realized it's because we don't have window functions enabled in our domo instance that this was proving impossible for me...I've reached out to our rep to have them enabled so that I can apply the options that you provided! Again, thank you!!

• @GrantSmith , @jaeW_at_Onyx just to confirm...in order to be able to dynamically filter by name, these calculations need to be done in beast mode correct?