Graphing a [Case Statement + Window Function] calculated field

Reply
Highlighted
White Belt

Graphing a [Case Statement + Window Function] calculated field

Hey everyone! 

 

I've run into an issue while attempting to create a card to graph a calculated field that involves a case statement and a window function but I'm not sure if my implementation is wrong or if this is a Domo limitation. I have a database that records the Employee's ID, Date, Hour of Availability (24H format) during the day that the employee has input and the availability status (e.g. if Employee 123 is going to work on 10/22/2020 for 3 hours we would see a 1 when he's availabile, 0 otherwise as below): 

 

 courier_id    | availability_date|  Hour  |  Nr_Hrs_Availibility

     123          |    10/22/2020    |     9     |              1

     123          |    10/22/2020    |    12    |              1

     123          |    10/22/2020    |    17    |              1

     124          |    10/22/2020    |    10    |              1

     124          |    10/22/2020    |    11    |              0

     124          |    10/22/2020    |    13    |              1

 

What now I'm trying to do is to categorize the employees into the ones that input 8 hours or less, between 9 and 20 hours and more than 20 hours and visualize weekly what percentage of the employee base work within these three ranges. I've created the attached Calculated Field that works and populates my data table correctly but when I attempt to do a COUNT(DISTINCT 'courier_id') in order to avoid double counting the Employee IDs that are repeated and attempt to visualize it (be it a Table, Graph or anything) I get a message saying "An issue has occurred during processing. We are unable to complete the request at this time." Do you have any idea why this might be happening?


Accepted Solutions
Highlighted
Black Belt

Magic will work in a pinch.  Dataset Views will work as well.  you can even construct your CASE statement within the DSV.

 

NOTE.  You CAN implement a window function inside the DSV, but be advised it can have some ... unexpected interactions in cards,

ALSO the rules of what you can and cannot do in a card extend to the rules you should follow when designing DSVs for consumption in analyzer cards.  

 

ex.  you can't filter on aggregate functions in analyzer**, so don't built an aggregation into the DSV and then expect to filter on the aggregate column in anlayzer.  

 

same for window functions.  if you implement a window function in a DSV everything used in the PARTITION or ORDER BY clause must still be an axis on the visualization.

 

** there is a beta that allows filtering on aggregates, but it's ... early beta and hasn't been actively developed past that point AFAIK.


Jae Wilson
Check out my Domo Training YouTube Channel

**Say "Thanks" by clicking the heart in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"

View solution in original post


All Replies
Highlighted
White Belt

Anyone? 🙁

Highlighted
Black Belt

@GrantSmith  is highly skilled at Window functions and dates and can probably figure it out for you.




**Make sure to like any users posts that helped you and accept the ones who solved your issue.**
Highlighted
Major Red Belt

Gee thanks @MarkSnodgrass  🙂

 

@AbelMM  - The issue resides with your windowing function. Domo is close but isn't exactly the same as SQL windowing functions.

 

Because you're adding an aggregate function (COUNT distinct) it's expecting your other columns in a group by statement under the hood. To make a long story short you should be able to fix your issue by changing your window function beast mode. Replace all of your instances of SUM(`nr_hours_availability`) to SUM(SUM(`nr_hours_availability`))

 

To get a deeper understanding of what's happening under the hood I encourage you to view @jaeW_at_Onyx 's video at https://www.youtube.com/watch?v=eifSYZIcPzg and also check out his channel in general. He's got some great videos.

 

Also, for future assistance it's generally a good idea to include your beast mode as text to allow others to easily copy and paste it and test it. This will get you a quicker response.



**Was this post helpful? Click the heart icon**

**Did this solve your problem? Accept it as a solution!**
Original Domo post:https://dojo.domo.com/t5/Beast-Mode-ETL-Dataflow/Calculate-average-time-to-accumulate-to-a-value/m-p/50668// more windowed function videos...
Highlighted
White Belt

Thank you so much for the video and the tips, @GrantSmith ! Unfortunately I'm still struggling a little to understand what's going on under the hood apparently. I made the suggested changes in my calculation and it seems to go a little better, since it is actually showing something for a change but not quite the expected result. I attempted both partitioning by Employee ID and by Week and also partitioning by Employee ID and ordering by Week but I think the latter yields a cumulative sum and the first works better at showing the expected result. Either way when I attempt to graph with the COUNT(DISTINCT 'courier_id'), I get two series, one for the Employee IDs and one for my calculated field. I also tried directly counting the 'courier_id' given that as per the logic since the calculated field is already grouping by Employee ID and Week, the use of DISTINCT was not needed anymore but that's not yielding better results either.  Here below is the code used for the calculated field and attached an image of what I'm getting in the Card Builder. Let me know if you have any other suggestions or need any more info and again thank you so much 😊

 

CASE
WHEN SUM(SUM(`nr_hours_availability`)) OVER(PARTITION BY `courier_id`, WEEKOFYEAR(`availability_date`,22)) <=8 THEN '<=8 HoA'
WHEN SUM(SUM(`nr_hours_availability`)) OVER(PARTITION BY `courier_id`, WEEKOFYEAR(`availability_date`,22)) >8 AND SUM(SUM(`nr_hours_availability`)) OVER(PARTITION BY `courier_id`, WEEKOFYEAR(`availability_date`,22)) <=20 THEN '<=20 HoA'
WHEN SUM(SUM(`nr_hours_availability`)) OVER(PARTITION BY `courier_id`, WEEKOFYEAR(`availability_date`,22)) >20 THEN '>20 HoA'
ELSE 'NA'
END

Highlighted
Major Red Belt

Hi @AbelMM 

From what I can tell your configuration should work. Have you tried graphing it as a table and validating that the beast mode is returning the correct data?

I'm also wondering if it's a Domo limitation because it's a windowing function you're attempting to plot as a series but it's treating it as a separate column and not a list of series values (typically you'll see this with multiple metric fields listed in the series)



**Was this post helpful? Click the heart icon**

**Did this solve your problem? Accept it as a solution!**
Highlighted
Black Belt

i'm reading very quickly, but i'm pretty confident it won't work as set up.

 

MY ASSUMPTION OF YOUR HIGH LEVEL GOAL

A list of availabilities and the number of employees who have that xyz total availability given a certain time frame.

<8, 5

<15, 23

<21, 100

>=21, 14

 

If that's your goal, you can't get it using window functions because you're trying to create a window function based on employee_ID, and to use a window function you must include everything that's in the ORDER BY and the PARTITION BY clause on the card axis.  ... so you can't categorize users with your CASE statement, you must show each individual user.


Jae Wilson
Check out my Domo Training YouTube Channel

**Say "Thanks" by clicking the heart in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"
Highlighted
White Belt

Hey @jaeW_at_Onyx 

 

Yes, the ultimate goal is to create either a graph or table that shows the weekly number of employees who fall within each of these categories, i.e. I would need to know the total number of weekly hours input by each employee in order to count him as one of the employees that input less than 8 hours on week 40, for example. So in the end I want to end up with one column per week that is divided into 3 series that represent the ranges for the hours input in the system, in which each series is counting the employee only once based on his availability. But then again I'm including the week in one of the axis, the courier on the other one, which are the two things I'm partitioning/ordering on, so that's a bit confusing. 

 

What would be the way to go? I'm thinking the DataSet Views would be the best way since we could create an intermediate dataset with one line per employee per week (right?) but I'm waiting for the functionality to be activated on our instance. Any other suggestions? Thanks! 

Highlighted
Black Belt

Magic will work in a pinch.  Dataset Views will work as well.  you can even construct your CASE statement within the DSV.

 

NOTE.  You CAN implement a window function inside the DSV, but be advised it can have some ... unexpected interactions in cards,

ALSO the rules of what you can and cannot do in a card extend to the rules you should follow when designing DSVs for consumption in analyzer cards.  

 

ex.  you can't filter on aggregate functions in analyzer**, so don't built an aggregation into the DSV and then expect to filter on the aggregate column in anlayzer.  

 

same for window functions.  if you implement a window function in a DSV everything used in the PARTITION or ORDER BY clause must still be an axis on the visualization.

 

** there is a beta that allows filtering on aggregates, but it's ... early beta and hasn't been actively developed past that point AFAIK.


Jae Wilson
Check out my Domo Training YouTube Channel

**Say "Thanks" by clicking the heart in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"

View solution in original post

Announcements
Domopalooza 2021 Call for Presenters: We want to hear how Domo is revolutionizing the way you do business!

Click here to submit your story.