Grouping into one Row

I want to be able to group the parent_wID field so that it is one row per entry. As you can see, the Retail/Wholesale columns are build from beast modes from a separate field. This is the syntax i used to separate those. 

 

"Retail"

(CASE
when `Loan_Originator_Type` = 'Retail' THEN 'Yes'
else 0

end)

 

"Wholesale"

(CASE
when `Loan_Originator_Type` = 'Wholesale' THEN 'Yes'
else 0

end)

 

Unfortunately this spits out multiple rows, but I would like to aggregate and group everything into just the parent_wID.

image.png

Best Answer

  • GrantSmith
    GrantSmith Indiana 🔴
    Accepted Answer
    "Retail"
    
    MAX(CASE
    when `Loan_Originator_Type` = 'Retail' THEN 'Yes'
    else 0
    
    end)
    
     
    
    "Wholesale"
    
    MAX(CASE
    when `Loan_Originator_Type` = 'Wholesale' THEN 'Yes'
    else 0
    
    end)

Answers

  • GrantSmith
    GrantSmith Indiana 🔴

    Hi @user063136 

    A quick hack would be to aggregate these columns in Analyzer with a max() function which would cause Yes to override your 0 value. This is assuming the rest of the data you're displaying isn't unique or it's aggregated.

  • Could you explain how that works? Not sure what you mean.

  • oh that worked thank you so much!

  • jaeW_at_Onyx
    jaeW_at_Onyx Budapest / Portland, OR 🟤

    @GrantSmith and @user063136 ,

    judging from the results of your original post it looks like, for one parentID there can be both Retail and Wholsale loan originator types. If that weren't the case, then you shouldn't get multiple rows per parent.

     

    "Retail"

    (CASE
    when `Loan_Originator_Type` = 'Retail' THEN 'Yes'
    else 0

    end)

     

    "Wholesale"

    (CASE
    when `Loan_Originator_Type` = 'Wholesale' THEN 'Yes'
    else 0

    end)

     

    Given that you only want ONE response, I would put my aggregation INSIDE the CASE statement.

    CASE

    WHEN SUM (CASE WHEN LoanType = 'retail' THEN 1 else 0 END) > 0 THEN 'Retail'

    END

     

    Why is this better?  Improved transparency and increased code recycling

     

    With MAX(CASE ...) you can't answer the follow-up question: "how many retail loans are there" with the same logic.  You might say, "but Jae each parent only has 1 loan originator." that may be true, but when i roll my data up to a higher level of aggregation, i want to build logic that still works.

     

    By starting with SUM(CASE WHEN 'retail'  then 1 else 0 end) as the base metric, i can validate (without hacks) the assumptions about my data AND aggregate reliably to different levels.