Filtering based on distinct count values

Hello,  

 

The discussions I've found that are similar to this seem to suggest using ETL, which I think I might end up having to do, but just wanted to see if there are any beastmode ideas out there.

 

I'll refer to the column headers as Date, Collection Code, Donor, Donation ID, Data Entered?, Distinct Count of Data Entered, Distinct Count of Donations.

 

Each row will always have a Donation ID.  Some Donation ID's were or were not Data Entered.

 

What I am trying to do is compare the number of Data Entered to the total number of donations.

 

Donation ID is a unique value, but there are necessary duplications of donation ID's sprinkled throughout our data, hence the Distinct Count of Data Entered ( count distinct case when "data entered" = "yes" then "donation id") and Distinct Count of Donation ID ( count distinct "donation_id).

 

These are then grouped by collection date and collection code in a column chart.   Up to this point, all has worked fine.  I have attached some sample data, with the first two charts being the manipulation of the raw data, and the 3rd chart being how it looks now.

 

The problem I am having is that if there were 0 data entries for a grouped date/collection code, I do not want that row to show at all.  But, if I filter my Data Entry to 0's, then obviously this effects my count of donations because it removes all donations that were not data entered.

 

I also cant filter Distinct Count of Data Entered.  

 

 

Best Answer

  • ST_-Superman-_
    Accepted Answer

    Unfortunately, you can't filter a card based on an agregate value.  That means that if you want to filter out the combinations on Date and Collection Code that contain no Data entries, then you would need to first work that into a dataflow.  Either ETL or MySQL, so that the card can then filter on it.  

     

    Let me know if you want any help on the dataflow.  I don't see a solution using beastmodes.

Answers

  • I don't see how you can use filters in this scenario without impacting the count of distinct donation ID. I think you'll have to include those counts in your dataset as additional columns. 

  • You could potentially highlight the dates when you have data entered if you change to a heatmap.

    I'm not sure what your specifice use case is, but see if this would work for you:2020-03-04_12-08-02.png

     

  • jaeW_at_Onyx
    jaeW_at_Onyx Budapest / Portland, OR 🟤

    'Filtering on aggregates' is a Feature switch that can be enabled by your CSM.  It comes with the caveat that it is a beta feature that hasn't been fully developed and integrated into analyzer so some functionality around exports etc. won't work.

     

    Because of the caveats, I personally don't use it often and instead try to build dataflows that allow me to avoid COUNT DISTINCT.