Combine DataSets to a Filtered Count/Aggregated DataSet with different colums

TLDR: This is a complicated dataset (for me) that I am trying to build. Basically, I am taking data from two datasets, with different columns, aggregating them to columns that is listed correctly in rows... If that makes sense. I need help going about it the right way.

 

Enviroment:

Below is sample data I have and what I am trying to build from it... so you have an idea of what I am working with.

Input DataSet 1:

 

DateA:DateB:Employee:Sales Stage:
3/3/20183/5/2018John DoeClosed Won
3/25/20183/27/2018Jayne DoeClosed Lost
3/16/20183/18/2018John DoeClosed Won
4/1/20184/5/2018John DoeClosed Won
4/15/20184/17/2018Jayne DoeClosed Lost
4/20/20184/21/2018Jayne DoeClosed Won

 

Input DataSet 2:

 

DateC:Employee:Comm Structure:Sales:
3/31/2018John DoeConsultant100
3/31/2018Jayne DoeCoach200
4/30/2018John DoeConsultant300
4/30/2018Jayne DoeCoach350

 

Please note that the 'Input DataSet 2' entires listed are monthly summery stats. Each employee will have one entry for the month (the date is just listed as the 'last day of the month').

 

Output DataSet (What I am trying to achieve):

 

Date:Employee:Comm Structure:Sales:Total Ops:Ops Won:Avg Op Sales:Avg Op Sales Won:
3-2018John DoeConsultant100225050
3-2018Jayne DoeCoach200102000
4-2018John DoeConsultant30011300300
4-2018Jayne DoeCoach35021175350

 

What I currently have setup is using a dataflow to append the rows together with all columns from both tables. I then have calculated field(s) via 'Beast Mode' to 'Count' the 'Closed Lost' and 'Closed Won' sales stages. I am having a hard time using the count function by grouping the months together, but also inserting it on the same row as the summery entry for that specific month and employee. Below is what I currently got:

 

CASE WHEN DATE_FORMAT(`DateB`,'%M %Y') = DATE_FORMAT(`DateC`,'%M %Y') THEN
COUNT(`Sales Stage`)
END

 

Once I get the counts inserted correctly in the row, I can get the averages from there.

 

Question:

Am I going about this all wrong? I have been working on this for a couple days, but I am getting no where. Should I be filtering/agregatting the data with the options availible in the dataflow first? Any direction on how I may accomplish this would be appreciated.

 

Best Answer

  • DataJake
    DataJake

    domo

    💎

    Accepted Answer

    @user02364,

     

    You can accomplish this using a SQL Dataflow with two steps:

    1) Summarize the Won/Lost counts as column values at a monthly level by Employee

    2) JOIN the Summary data to the Sales data on the Employee and Date.

     

    (These steps are combined in the statement below) 

    -- Aggregate all to Monthly Level and Pivot out Won and Lost columns
    SELECT
    last_day(a.`dateb`) as `EndofMonth`, -- attribute to last day of month
    a.`Employee`,
    b.`Comm Structure`,
    b.`Sales`,
    count(case when `Sales Stage` = 'Closed Won' THEN a.`Employee` END) as `Won`,
    count(case when `Sales Stage` = 'Closed Lost' THEN a.`Employee` END) as `Lost`
    FROM dataset1 a
    JOIN dataset2 b on a.`Employee` = b.`Employee` and last_day(a.`dateb`) = b.`datec`
    GROUP BY last_day(a.`dateb`),
    a.`Employee`,
    b.`Comm Structure`,
    b.`Sales`

    wonlost.png

     

     

    This assumes you have both Won/Lost entries and Sales in the same month for all employees.

     

    It could also be done in Magic ETL if you aren't familiar with SQL syntax, but it would take more steps to accomplish.

     

     

     

Answers

  • DataJake
    DataJake

    domo

    💎

    @user02364,

    I am happy to help with this and can see how it's been a challenge. Can you clarify for me on the top row of dataset1, what is DateA vs DateB and how do they relate to the Sales Stage...

     

    DateA:DateB:Employee:Sales Stage:
    3/3/20183/5/2018John DoeClosed Won

     

    I want to make sure I understand how to attribute the Ops Won to a single date column.

    Thanks,

  • Hi DataJake,

     

    DateA: Is the date the lead was obtained, or when the opportunity was created

    DateB: Is the date when the employee determined the opportunity was either 'closed' or 'won'

     

    To my understanding, DateA is irrelevant for what I am trying to accomplish. I was trying to take DateB, and check if it occurs within the same month/year of DateC.... and then somehow, count all those that occurred within that month/year for that particular employee.

     

    Thanks though for taking a look at this.

  • DataJake
    DataJake

    domo

    💎

    @user02364,

     

    Ok, great. I understand what you are trying to do.. So let's remove DateA from the equation.

     

    Solution 1 - Append Datasets

    In your Magic ETL job where you append data, rename your Date Columns so they have the same name, this will allow you to retain the granularity of the closed/won activity and also the monthly sales details.

     

    DateB = Date

    DateC = Date

     

    Your output should look like this:

     

    DateEmployeeComm StructureSalesSales Stage
    3/31/18John DoeConsultant100 
    3/31/18Jayne DoeCoach200 
    4/30/18John DoeConsultant300 
    4/30/18Jayne DoeCoach350 
    3/5/18John Doe  Closed Won
    3/27/18Jayne Doe  Closed Lost
    3/18/18John Doe  Closed Won
    4/5/18John Doe  Closed Won
    4/17/18Jayne Doe  Closed Lost
    4/21/18Jayne Doe  Closed Won

     

    LIMITATION: this will still not apply the comm structure to the Employees in the Sales Stage Metrics. You will need a join for that instead of the append.

     

    BEASTMODE: Closed Lost Count

    COUNT(CASE WHEN `Sales Stage` = 'Closed Lost' THEN `Employee` END)

     

    Consideration

    Does the Comm struture for a given employee change from month to month? If not, then you could have a reference table that you join the appended data to in order to have it apply to all employees. If it can change from month to month, then you will need to join on Employee + End of Month column, which would require a little more logic in your dataflow.

     

    Here is a sample workflow for the Select Columns (Rename) and Append, folllowed up with a branch that gets the Comm Structure per employee (with Remove Duplicates so there is one row per employee)dojo.png

     

     

     

     

     

     

     

     

     

     

     

     

     

     

     

     

     

    This now will have a comm structure reference for all rows of an employee.


    dojo2.png

     

     

     

     

     

     

     

     

     

     

     

    Does this help?

     

     

     

     

     

     

  • Hello DataJake,

     

    Thank you for the through response. To follow up, I replicated what you provided, and I think we are closer to what I am trying to achieve, but not quite there yet. The issue I have currently, is similiar to what I had before my OP.

     

    In our example, the beast mode 'Closed Lost Count' does count all occurances for each row in the table, however, I am trying to insert the sum total of these "counts" in the same row(s) as the 'Sales' data (Or at least, that was my original plan). Then my plan was to do a quick beast mode average calculation for those rows (`Sales` / (Sum of total count for each employee/month/year)).

     

    Currently my output dataset with BM column you provided looks similiar to below:

     

    DateEmployeeComm StructureSalesSales StageBM: SS Counts
    3/31/18John DoeConsultant100 0
    3/31/18Jayne DoeCoach200 0
    4/30/18John DoeConsultant300 0
    4/30/18Jayne DoeCoach350 0
    3/5/18John DoeConsultant Closed Won1
    3/27/18Jayne DoeCoach Closed Lost1
    3/18/18John DoeConsultant Closed Won1
    4/5/18John DoeConsultant Closed Won1
    4/17/18Jayne DoeCoach Closed Lost1
    4/21/18Jayne DoeCoach Closed Won1

     

    I was attempting to do something similiar to below:

     

    DateEmployeeComm StructureSalesSales StageBM: SS CountBM: SS SumBM: Avg Sale
    3/31/18John DoeConsultant100 0250
    3/31/18Jayne DoeCoach200 01200
    4/30/18John DoeConsultant300 01300
    4/30/18Jayne DoeCoach350 02175
    3/5/18John DoeConsultant Closed Won1  
    3/27/18Jayne DoeCoach Closed Lost1  
    3/18/18John DoeConsultant Closed Won1  
    4/5/18John DoeConsultant Closed Won1  
    4/17/18Jayne DoeCoach Closed Lost1  
    4/21/18Jayne DoeCoach Closed Won1  

     

     

    The end goal here is to have a card display a 'grouped bar graph', grouped by employee of their average sales per month.

     

    So my question is, is it possible to take these counts from the various rows and insert their sums into the same rows as the 'Sales'? Or is there another way to accoplish my end goal?

     

    Thanks again for your help so far.