Correlation Coefficient in Beastmode

Reply
Yellow Belt

Correlation Coefficient in Beastmode

Has anyone successfully calculated a Pearson Correlation Coefficient in Beastmode?  My idea is to create a scatter plot that graphically displays the relationship between two fields in a dataset, and display the Correlation Coefficient for those two datasets as the Summary Number.

 

For example, if the card is filtered for a date range of March 1 - March 31, the key fields in the underlying dataset might look like the following table, which would result in a Correlation Coefficient of 0.5669 (this is the value that I would want to display as the Summary Number for the card).  Is this possible?

Thanks!

RevenueDuration
$603185
$262123
$906287
$429179
$872266
$302160
$185126
$596293
$481234
$173104
$62811
$296194
$27246
Black Belt

Re: Correlation Coefficient in Beastmode

You can do it in a data transform, but couldn't in a beastmode as far as my testing.

 

You'll need to create a data transform that calculates the Pearson R using the following logic:

 (Avg(`Revenue` * `Duration`) – (Avg(`Revenue`) * Avg(`Duration`))) / (STDDEV_POP(`Revenue`) * STDDEV_POP(`Duration`))

 

You have to break it into 2 selects, the first:

SELECT 	 Avg(`Revenue` * `Duration`) AS a,
(Avg(`Revenue`) * Avg(`Duration`)) AS b,
(STDDEV_POP(`Revenue`) * STDDEV_POP(`Duration`)) as c
FROM testing

And the 2nd:

SELECT (`a`-`b`)/`c` as 'Pearson r'
FROm transform_data_1

As you can see, this will give you the correct result:

image.png

From there you can basically do the following

SELECT *, (SELECT `Pearson R` from PearsonCalculation) as `Pearson`
FROM dataset

Then use a beastmode for MAX(`Pearson R`) as your summary number. It won't be dynamic where you can change the date range, but maybe someone else can figure out a way to get the beastmode portion to work.

 

Sincerely,

ValiantSpur

 

**Please mark "Accept as Solution" if this post solves your problem
**Say "Thanks" by clicking the "heart" in the post that helped you.

Black Belt

Re: Correlation Coefficient in Beastmode

Second option is to use the Corr function (which does a Pearson Correlation for you)

corr(`Duration`,`Revenue`)

You can use that as a beastmode but it doesn't let you put it in the summary number.

 

Yellow Belt

Re: Correlation Coefficient in Beastmode

Thanks for your ideas!  I'll try the first one later on.  I did try using the Corr() function in beastmode, but regardless of which fields I use I keep receiving the error "An issue has occurred during processing. We are unable to complete the request at this time."  I noticed that Corr does not appear in the Functions list - perhaps it is in beta version?

Black Belt

Re: Correlation Coefficient in Beastmode

I don't believe so, we aren't running any beta versions in our production environment and I was able to get it to work. Perhaps make sure both fields are integers?

Yellow Belt

Re: Correlation Coefficient in Beastmode

I've tried multiple times to make the corr() function work, even creating a very simple test dataset that matches the one I provided earlier.  No matter what I try, the formula validates, but I always receive the error

Have you gotten the corr() function to work?

 

Thanks again for all your help

Black Belt

Re: Correlation Coefficient in Beastmode

Yes, I was actually able to use your test dataset, create a beastmode with the corr function and then it worked fine. It sounds like it may be worth reaching out to Domo Support at this point. There may be a quirk going on with your instance.

White Belt

Re: Correlation Coefficient in Beastmode

@nitot @Valiant 

 

I'm also struggling with getting the corr() beastmode to manifest in a card without throwing back that error. Any thoughts on why the error occurs as I reach out to support in the meantime? 

Announcements
Looking for the latest Community solutions? Please visit our accepted solutions board here!