Is there a way to perform a countifs-like function in Beastmode?

Basically, I have a dataset that is appended with new data regularly. I am trying to see how many accounts are associated with date a that are no longer showing up in the most recent (date b) appended data. I can write a meastmode variable that can show what lines qualify as date data a and date data b, and I can figure out the nominal difference, but that difference is a function of two variables: new accounts and closed accounts). I need to separate out new from closed. A countif function would allow me to search a field, using criteria from other fields. Anything like that? Good workarounds?

Comments

  • You can do a case statement inside a Count() function and that should be equivalent to a countif()

     

    Count(Case when `Field1` = 'text' and  `DateField` = '2016-04-01' then `Value` else 0 end)

  • Thank you for your prompt response! Unfortunately, this doesn't seem to be a dynamic solution, meaning it will only work with the one name that I hard code into the formula. I need it to work independently for each name.

     

    As an example, I uploaded a small dataset that has two fields: Month and Name, each with 22 lines. 11 of those Month lines say January and 11 say February. There are 11 unique names in the first 11 lines (January). The names in the next 11 lines (February) are copied from the first 11 lines, and then one name is changed. This simulates an event in which we have 11 accounts in January and 11 accounts in February, but we lost an account and gained an account between these two snapshots.

     

    Ideally, I would love to create a Beast mode field in which it looks at each name and associated month, figures out if it is month n or month n+1, and then looks to see if the name is present in both months. If the name is present in only month n it might say “Account Closed”.  If the name is present in only month n+1 it might say “New Account”. If the name is present in both months, it might just be blank.

     

    Am I missing something in your previous comment?

  • @Godzilla, are you able to provide any further insight? 

  • Was anyone able to find a solution to this problem? Running into the same thing right now.

  • Hello, 

    Sadly there is not a way to accomplish this through a beastmode. Beastmode runs data on a row by row comparison. This means that we cannot compare data from one row to another row. 

    To accomplish this we will need to use a MySQL dataflow. The query is slightly complex. 

    Lets say we start with a table such as: 

    monthmonth_numbername
    Jan1Albus Dumbledore
    Jan1Harry Potter
    Jan1Hermione Granger
    Jan1Ronald Weasley
    Jan1Rubeus Hagrid
    Jan1Severus Snape
    Feb2Gandalf
    Feb2Hermione Granger
    Feb2Ronald Weasley
    Feb2Rubeus Hagrid
    Feb2Samwise Gamgee
    Feb2Severus Snape
    Mar3Gandalf
    Mar3Hermione Granger
    Mar3Ronald Weasley
    Mar3Rubeus Hagrid
    Mar3Samwise Gamgee
    Mar3Severus Snape


    We have several names with people who are 'New Accounts', 'Continuing Accounts' and 'Closed Accounts'

    The first step in this is to create the base of our table. 

    SELECT
    a.`month`
    ,a.`month_number`
    ,a.`name`

    ,@prev:= `name`

    FROM input_dataset a
    ,(SELECT @prev:='') b

    order by `name`, `month_number`

    Here we create a variable '@prev'. This will be used later to determine if our value from the last row is the same for this row. 
    We also are ordering first by the name, then the date. (For simplicity I made a month name and month number column)

    Now that we have the base of our query we can create our statement to calculate what we need. 
    Lets break the statement we will need into pieces to get a better understanding of what it does. 

    If(@prev=a.`name` ,'','New Account')

    This will look at the previous name in our table and determine if it is new or not new. 

    We now know when an account/name is new or not new. 
    Next we will need to determine if ' ' results to a closed account or a continuing account. We will need to add a comparison: 

    a.`month_number` = (SELECT max(`month_number`) month from input_dataset where a.`name` = `name` )

    This is going to compare the current month, with the last month created for that account. 

    WIth these two statements we can make our final statement:

    IF(If(@prev=a.`name` ,'','New Account') = '' AND a.`month_number` = (SELECT max(`month_number`) month from input_dataset where a.`name` = `name` ),'Account Closed',If(@prev=a.`name` ,'','New Account')) as 'Example'


    Our transform/query will look similar to: 

    SELECT
    a.`month`
    ,a.`month_number`
    ,a.`name`

    ,IF(If(@prev=a.`name` ,'','New Account') = '' AND a.`month_number` = (SELECT max(`month_number`) month from input_dataset where a.`name` = `name` ),'Account Closed',If(@prev=a.`name` ,'','New Account')) as 'Example'

    ,@prev:= `name`

    FROM input_dataset a
    ,(SELECT @prev:='') b

    order by `name`, `month_number`

    2016-08-08_1449.png

    After running your dataflow with this query you will create a table that will look like: 

     

    monthmonth_numbernameExample
    Jan1Albus DumbledoreNew Account
    Feb2GandalfNew Account
    Mar3GandalfAccount Closed
    Jan1Harry PotterNew Account
    Jan1Hermione GrangerNew Account
    Feb2Hermione Granger 
    Mar3Hermione GrangerAccount Closed
    Jan1Ronald WeasleyNew Account
    Feb2Ronald Weasley 
    Mar3Ronald WeasleyAccount Closed
    Jan1Rubeus HagridNew Account
    Feb2Rubeus Hagrid 
    Mar3Rubeus HagridAccount Closed
    Feb2Samwise GamgeeNew Account
    Mar3Samwise GamgeeAccount Closed
    Jan1Severus SnapeNew Account
    Feb2Severus Snape 
    Mar3Severus SnapeAccount Closed


    Please let me know if you would like me to go into further detail on any of the steps this query is doing. 

  • @shaanarora @nlclayville, tagging you to check out ilikenno's reply. 

  • Thanks for the update @ilikenno and @kshah008. The issue I'm seeing here is that the current or most recent month will always say account closed. I forgot to update earlier but I used the LAG function in Redshift to accomplish what I needed. It may not be applicable in this situation but might be worth checking out to @nlclayville

  • @shaanarora

    Glad you were able to use the LAG function. 

    Just for additional reference. As mentioned the most recent will always return "Account Closed". We can use another formula to avoid this. 
    In the original "Example" query we will replace 'Account Closed' with another formula. 

    IF(If(@prev=a.`name` ,'','New Account') = '' AND a.`month_number` = (SELECT max(`month_number`) month from input_dataset where a.`name` = `name` ),'Account Closed',If(@prev=a.`name` ,'','New Account')) as 'Example'

    Replace with: 

    CASE 
    WHEN MONTH(`Date`) = (SELECT MONTH(MAX(`date`)) from input_dataset)
    AND YEAR(`Date`) = (SELECT YEAR(MAX(`date`)) from input_dataset)
    THEN 'Account Closed'
    ELSE 'Current Account'
    END



    @nlclayville
    If you have any additional questions please let me know. 

  • worked, thanks!