Has anyone successfully calculated a Pearson Correlation Coefficient in Beastmode? My idea is to create a scatter plot that graphically displays the relationship between two fields in a dataset, and display the Correlation Coefficient for those two datasets as the Summary Number.
For example, if the card is filtered for a date range of March 1 - March 31, the key fields in the underlying dataset might look like the following table, which would result in a Correlation Coefficient of 0.5669 (this is the value that I would want to display as the Summary Number for the card). Is this possible?
You can do it in a data transform, but couldn't in a beastmode as far as my testing.
You'll need to create a data transform that calculates the Pearson R using the following logic:
(Avg(`Revenue` * `Duration`) – (Avg(`Revenue`) * Avg(`Duration`))) / (STDDEV_POP(`Revenue`) * STDDEV_POP(`Duration`))
You have to break it into 2 selects, the first:
SELECT Avg(`Revenue` * `Duration`) AS a, (Avg(`Revenue`) * Avg(`Duration`)) AS b, (STDDEV_POP(`Revenue`) * STDDEV_POP(`Duration`)) as c FROM testing
And the 2nd:
SELECT (`a`-`b`)/`c` as 'Pearson r' FROm transform_data_1
As you can see, this will give you the correct result:
From there you can basically do the following
SELECT *, (SELECT `Pearson R` from PearsonCalculation) as `Pearson` FROM dataset
Then use a beastmode for MAX(`Pearson R`) as your summary number. It won't be dynamic where you can change the date range, but maybe someone else can figure out a way to get the beastmode portion to work.
**Please mark "Accept as Solution" if this post solves your problem
**Say "Thanks" by clicking the "heart" in the post that helped you.
Second option is to use the Corr function (which does a Pearson Correlation for you)
You can use that as a beastmode but it doesn't let you put it in the summary number.
Thanks for your ideas! I'll try the first one later on. I did try using the Corr() function in beastmode, but regardless of which fields I use I keep receiving the error "An issue has occurred during processing. We are unable to complete the request at this time." I noticed that Corr does not appear in the Functions list - perhaps it is in beta version?
I don't believe so, we aren't running any beta versions in our production environment and I was able to get it to work. Perhaps make sure both fields are integers?
I've tried multiple times to make the corr() function work, even creating a very simple test dataset that matches the one I provided earlier. No matter what I try, the formula validates, but I always receive the error
Have you gotten the corr() function to work?
Thanks again for all your help
Yes, I was actually able to use your test dataset, create a beastmode with the corr function and then it worked fine. It sounds like it may be worth reaching out to Domo Support at this point. There may be a quirk going on with your instance.