Finding the latest date with specific criteria

Reply
Highlighted
Yellow Belt

Finding the latest date with specific criteria

I work in the training area at my organization.  We are trying to determine "What is the latest date they have completed all of their required training?"  For example, a new agent comes in, and they are assigned Program A, Program B, and Program C.  We want to know what the latest completion date is when they have completed all 3.  I have tried ETL (image attached) and SQL and cannot seem to get the output I need.  We would also add a few more "filters" such as "Is licensed = Y."

 

SQL:

select `Agent Code`, `Program Title`, MAX(`Program Enrollment Completion Date`) as "RTS Date"

from `bridge_program_enrollment_status`
WHERE `Program ID` = '16'
and `Program ID` = '12'
and `Program ID` = '64'
and `Program Enrollment Status 3` = 'Complete'
group by `Agent Code`

 

Any help would be much appreciated.

Greggory S. Wright, PhD
Highlighted
Black Belt

Re: Finding the latest date with specific criteria

Without seeing the actual data (or a mockup of it) it is hard to give complete guidance, but here are a few things to note, followed by a cleaned up SQL statement:

1) Any field this in your select statement that is not aggregated, must be included in your group by clause. I noticed Program Title was not included in the group by clause.

2) I'm guessing you aren't returning any rows with this select statement. This is likely due to the fact that you are requiring each row to be equal to 3 different program id's. This would not be possible. You should use the IN statement instead if you are looking for rows with these 3 id's. 

Here's a suggested SQL statement:

SELECT `Agent Code`, `Program Title`, MAX(`Program Enrollment Completion Date`) as "RTS Date"
FROM `bridge_program_enrollment_status`
WHERE `Program ID` IN ('16','12','64')
AND `Program Enrollment Status 3` = 'Complete'
GROUP BY `Agent Code`, `Program Title`

Hope this helps




**Make sure to like any users posts that helped you and accept the ones who solved your issue.**
Highlighted
Yellow Belt

Re: Finding the latest date with specific criteria

Yes, @MarkSnodgrass , that's what I was missing was the "IN" statement.  I haven't worked with SQL in several years and completely forgot about "IN." Thank you SO much!

Greggory S. Wright, PhD
Highlighted
Yellow Belt

Re: Finding the latest date with specific criteria

@MarkSnodgrass when I'm validating the data, I'm seeing that the "in" statement is allowing "any of the program completions" to count instead of they must have "all of the programs complete."  Any thoughts?

Greggory S. Wright, PhD
Highlighted
Black Belt

Re: Finding the latest date with specific criteria

@DrGWright 

What constitutes all three being complete? Do you have a 'Program Enrollment Status 1' and 'Program Enrollment Status 2' fields as well that say Complete or Not Complete? If so, I would add to your WHERE clause that 

WHERE `Program ID` IN ('16','12','64')
AND `Program Enrollment Status 3` = 'Complete' 
AND `Program Enrollment Status 2` = 'Complete'
AND `Program Enrollment Status 1` = 'Complete'

Again, hard to say for sure without knowing what your data actually looks like. If you mock up some sample data of what an incomplete record and a complete record looks like that would help.




**Make sure to like any users posts that helped you and accept the ones who solved your issue.**
Highlighted
Yellow Belt

Re: Finding the latest date with specific criteria

Thanks, @MarkSnodgrass , and example would be as follows:

Agent #     Program ID     Completion Date

1                 12                     1/1/2020

1                  16                     1/7/2020

1                  84                      1/5/2020

 

What I want to show is:

Agent #     Latest Completion Date

1                 1/7/2020

 

I'm wondering if I took out the "Program Title" from the SQL if that would take the duplicate records away, because I don't really care what prorgram they finished last, I just want to know the latest date.  Does that help?

 

As far as the "Completion Status" field, there is a 1, 2, and 3, but they are all built within the ETL and show "Complete" as the status, so there's no change in status that should affect this.

 

 

Greggory S. Wright, PhD
Highlighted
Yellow Belt

Re: Finding the latest date with specific criteria

Also, the Program Enrollment Status 1-3 do not correspond to each specific program, it's just an overall status within our LMS.

Greggory S. Wright, PhD
Highlighted
Black Belt

Re: Finding the latest date with specific criteria

Yes, dropping the Program Title from your select and group by statement will definitely help.




**Make sure to like any users posts that helped you and accept the ones who solved your issue.**
Highlighted
Yellow Belt

Re: Finding the latest date with specific criteria

@MarkSnodgrass what I'm getting in the SQL results is any of the program's completion dates, not necessarily having all of the programs complete.  

 

So, for example, what I'm seeing is:

Agent 1 is completes Program 1 on 2/2/2020, and hasn't completed Program 2 or 3, the data is still giving me the completed date of "Any program" instead of "all programs."  Any additional thoughts?

Greggory S. Wright, PhD
Highlighted
Major Brown Belt

Re: Finding the latest date with specific criteria

EDIT:: video response (20 min) https://youtu.be/9uNv1_0XXao

 

EDIT:: oops sorry ... just saw that this was a magicETL dataflow.  read my SQL example and i'll post the Magic version below.

 

OOOOH a fun brain teaser!

 

so step 1) build a query that just extracts the last time i completed the 3 programs.  we GROUP BY userID and Program, just in case it's possible to complete a survey multiple times.

 

in step 2) we'll use a HAVING clause to only keep the rows where you have 3 responses, indicating that you completed 'the right number' of training.

SELECT
userID
, count(*) as NumberOfCompletedPrograms
, max(CompletionDate) as 'Date all 3 completed'

FROM (
// step 1 find the last date you completed each certification SELECT userID , max(completion date) as CompletionDate , programID FROM table WHERE
programID in (list of programs) GROUP BY programID, userID ) as result
// group by user b/c ... i want one row per user.
GROUP BY UserID

// only keep the rows where you completed a total of 3 programs HAVING
count(*) = 3

side note... a lazy developer would build a lookup table for all the trainings with a 1 or a 0 for 'isRequired.  That way you don't have to maintain the code, all you have to do is add new ProgramIDs to the list and change the isRequired flag..

 

then you could rewrite the query as 

SELECT
userID
, count(*) as NumberOfCompletedPrograms
, max(CompletionDate) as 'Date all 3 completed'

FROM (
// step 1 find the last date you completed each certification
SELECT
userID
, max(completion date) as CompletionDate
, programID

FROM 
training_table tt
JOIN
lu_listOfPrograms lu
ON
tt.ProgramID = lu.programID
WHERE
lu.isRequired = 1

GROUP BY
programID, userID
) as result

// group by user b/c ... i want one row per user.
GROUP BY
UserID

// only keep the rows where you completed a total of 3 programs
HAVING
count(*) = (select count(*) from lu_listOfPrograms where isRequired = 1)

 

 

Magic Version

So actually, for Magic the process is still the same.

your HAVING clause is just a FILTER after you apply a GROUP BY.

 

If it were me, I would avoid aggregating the data like this because you can't answer 'what am i missing from my training'?

OPTION 1)

Use the PIVOT / rows-to-columns transform to spread all the required courses as a columnName with the Date Completed as the Value.  That way you can create a visualization that shows the ones you haven't completed.

 

OPTION 2)

Just Output my stage 1 subquery.  That way you can create a filtered (and actionable) list of people and the courses they still need to complete.  You can do most of what you initially asked for in Cards without pre-aggregating the data.  (Pre-aggregating like your requirement is not recommended because your data won't respond to filters (which users completed training 1 but did not complete training 2.)

https://dojo.domo.com/t5/Beast-Mode-ETL-Dataflow/Finding-the-latest-date-with-specific-criteria/m-p/47408#M8055 Response to #Domo Dojo post. "What is the lat...
Announcements
Want to join our weekly online meet up? Please visit this board here and let me know. Click here for more details!