Hi All, I need help since I am not familiar with sql, but I need to a column with the calculated average for a particular product which does not have averages available in my datasource.
This is a small sample of my data:
As you can see the AUM product does not have an average balance in the source system hence it has to be a calculated field where the calculation is based on the spots balances of that particular account and product from prior month and current month / 2.
Thanks in advance for your help! have a great day.
yikes... this is ... not the ideal way to structure your data ... but we can make it work.
Goal: identify the record that = the first day of the month and the last day of the month then caculate the average between the two.
If it were me, I would recommend adding these collumns to a date dimension and then joining the date dimension to your dataset, so that you can recycle your code everywhere, but most people take the lazy solution and just build the logic into the card...
-- is it the start of the month??
WHEN 'date' = DATE_SUB(date, INTERVAL DAYOFMONTH(date)-1 DAY) then 'Amount'
-- is it the end of the month??
WHEN 'date' = LAST_DAY(date) then 'Amount'
) / 2
What I had in mind was not to add this calculation as a beast mode but to add the column in my current database, since once it is done I have to append it to the databases from other countries. That is what I thought about addressing it via sql.
I think I got lost when you say identify the first day of the month, since my data only has the last day of the reporting month.
oh ... well that makes it easier i guess. sorry i'm half looking at this at 9PM 😛
You can do this with a LAG function in a beast mode.
Contact your CSM to get windowed functions enabled on your instance.
You can actually get this field populated pretty easily using Magic ETL. Here's a picture of what your ETL would generally look like:
Here's a breakdown of the steps in the ETL
1. Split your input dataset into to two datasets by using the filter tiles. Filter one to Product equals AUM. Filter the other one to not equal to AUM.
2. Not pictured above, but you need to add a select columns tile on your AUM filtered data set and add all columns except for the average balance column.
3. Add a Rank & Window tile on the AUM filtered data. Use the AVG function and sort by date ascending and use the preceding and following values of 1 and 1. This will create a rolling average based off the month and the prior month. Name the column average balance (the same name that is in your main dataset).
4. Append the data back together using shared columns.
5. Add and name your output dataset.
This should do it for you.
Thanks Mark! I added a partition to include the account and it work perfectly. Quick question, do you have any tip to overcome the months with no data.
Ie: I have cases that the account haven't had any balance for more that 1 month hence is is not being fed into the datasource. The average then it is being calculated since the last month that particular account had balance.
Yes, you can overcome this with a few additional steps. Here is what you would want to do:
1. Add a 3rd connection from the input to dataset to a Select Columns tile and just select the month column.
2. Connect a Remove Duplicates tile to that tile and choose Month to get a distinct list of months from your dataset.
3. Add a join tile to connect the Remove Duplicates tile with the Filter to AUM tile and do a Left Join with Remove Duplicates on the left and join on Month.
4. Add a Value Mapper tile and replace nulls with 0 on the spot balance column.
5. Connect to the Rank & Window tile you already configured.
It should look something like this:
@MarkSnodgrass what if you have NO data for any account in August? Do you want to skip the month of August from your percentage calc?