DataFusion - need to see what ISN'T included

I love DataFusion. It's a great way to combine data sources and avoid doing VLOOKUPs in Excel, etc.

 

However, I think we need a way to find out what is NOT being included in the fusion. I have my fusions set to combine two (or more) sources based on unique criteria, like Product Name and Launch Date. Both sources must show the same values in both fields for a row to come into the fusion. This works great: I do want to omit data that doesn't match in those fields.

 

However, if the Launch Date for a product gets changed in one source, but the other source doesn't know about that change, that product will be missing from my fusion when the data is refreshed. If I'm then reviewing cards about projected revenue, for example, that product will be missing from the calculation. That is not obvious to me if my card is showing fiscal year or month rather than the more granular list of products and dates.

 

I need a way to find out what is NOT being fused, so I can spot check for missing data. Right now the only way to know that a product fell out of the data set is to review the fused data very carefully. My current method: I created a table card that just pulls in all the columns from the fusion, so I can export it to Excel each week and compare last week's fusion to this week's. My data is refreshed weekly, and once the new data is in Domo I can't compare what is to what was. I know I can see all the data by drilling through my cards, but I really need the whole picture in table format to put it side by side with the prior version, so this is my workaround.
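For anyone using the same weekly-export workaround, the side-by-side comparison can be scripted instead of done by eye. This is only a rough sketch using Python's standard library: the column names (`Product Name`, `Launch Date`, `Projected Revenue`), the sample rows, and the `load_keys` helper are made up for illustration, and in practice the two CSV strings would be the files exported from the table card.

```python
import csv
import io

def load_keys(csv_text, key_columns=("Product Name", "Launch Date")):
    """Return the set of (Product Name, Launch Date) pairs found in an export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {tuple(row[col] for col in key_columns) for row in reader}

# Hypothetical exports of the fused table card, one per week.
last_week = """Product Name,Launch Date,Projected Revenue
Widget,2024-03-01,1000
Gadget,2024-04-15,2500
"""
this_week = """Product Name,Launch Date,Projected Revenue
Widget,2024-03-01,1000
"""

# Keys present last week but gone now are the rows that fell out of the fusion.
dropped = sorted(load_keys(last_week) - load_keys(this_week))
# Keys that are new this week (for completeness).
added = sorted(load_keys(this_week) - load_keys(last_week))

print("Dropped since last week:", dropped)
print("New this week:", added)
```

This only tells me that a product disappeared, not why, so it still doesn't replace seeing the non-fused rows directly.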

 

I would recommend that the page where you administer your fusion also offer an option to export the non-fused data into a table, so admins can confirm that the data not included is intended to be omitted. The export could be either (a) one table with all the fused columns, where some rows will be partially blank (populated for data source A, but blank in the columns belonging to the source it doesn't match), or (b) one file with multiple tabs, one per data source, so there are no gaps in the table and all data is shown in its original format.
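To make the request concrete, here is a minimal sketch of what I mean by "non-fused data", in the one-list-per-data-source layout. It is plain Python and not how Domo implements DataFusion; the field names, the sample rows, and the `split_fused_and_unfused` helper are all assumptions for illustration.

```python
def key_of(row, key_columns):
    """Build the match key for a row, e.g. (product, launch_date)."""
    return tuple(row[col] for col in key_columns)

def split_fused_and_unfused(source_a, source_b,
                            key_columns=("product", "launch_date")):
    """Return the matched keys plus the rows of each source that have no match."""
    keys_a = {key_of(row, key_columns) for row in source_a}
    keys_b = {key_of(row, key_columns) for row in source_b}
    matched = keys_a & keys_b  # keys that would make it into the fusion
    only_a = [r for r in source_a if key_of(r, key_columns) not in matched]
    only_b = [r for r in source_b if key_of(r, key_columns) not in matched]
    return matched, only_a, only_b

# Hypothetical data: Gadget's launch date was changed in source A only.
source_a = [
    {"product": "Widget", "launch_date": "2024-03-01", "projected_revenue": 1000},
    {"product": "Gadget", "launch_date": "2024-05-01", "projected_revenue": 2500},
]
source_b = [
    {"product": "Widget", "launch_date": "2024-03-01", "region": "EMEA"},
    {"product": "Gadget", "launch_date": "2024-04-15", "region": "APAC"},
]

matched, only_a, only_b = split_fused_and_unfused(source_a, source_b)
print("Fused keys:", sorted(matched))
print("Not fused, source A:", only_a)
print("Not fused, source B:", only_b)
```

In this made-up example, Gadget shows up in both "not fused" lists with two different launch dates, which is exactly the kind of mismatch an admin would want to see at a glance.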

 

Happy to discuss this use case if anyone from Domo wants to chat. Thanks for considering!

Broadway + Data

Comments

  • It sounds like this use case could also benefit from some kind of anomaly detection that tracks the data set size over time. If I join data source A with data source B every day to get DataFusion C, and the size of C suddenly drops by 60 percent while the sizes of A and B are stable, that would be a red flag that something odd has happened, and you could alert based on that.

     

    At that point, the user would be directed towards the anomalous fusion, where the new ability to see the excluded data would allow them to diagnose the problem. I guess the tricky part is coming up with a configuration scheme/default setting that could be useful to people who experience this without being an annoyance to users for whom the anomaly detection's assumptions are a poor match.
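
The size check described above could look roughly like the sketch below. It is an illustration only: the function name, the row counts, and both thresholds (`drop_threshold`, `input_tolerance`) are assumptions, and picking sensible defaults is the tricky configuration question already mentioned.

```python
def fusion_shrank_unexpectedly(prev, curr,
                               drop_threshold=0.5, input_tolerance=0.1):
    """Flag a fusion whose row count dropped sharply while its inputs stayed stable.

    prev and curr are dicts of row counts with keys "a", "b" and "fused".
    """
    def relative_change(old, new):
        return (new - old) / old if old else 0.0

    # Inputs count as stable if neither source moved by more than the tolerance.
    inputs_stable = all(
        abs(relative_change(prev[name], curr[name])) <= input_tolerance
        for name in ("a", "b")
    )
    # Positive value means the fused data set got smaller.
    fused_drop = -relative_change(prev["fused"], curr["fused"])
    return inputs_stable and fused_drop >= drop_threshold

# Hypothetical row counts from two consecutive refreshes.
yesterday = {"a": 1000, "b": 980, "fused": 950}
today = {"a": 1010, "b": 975, "fused": 380}

if fusion_shrank_unexpectedly(yesterday, today):
    print("Alert: fused row count dropped sharply while sources stayed stable")
```

Requiring the inputs to be stable is what separates "the join stopped matching" from "the sources themselves shrank", which would need a different kind of alert.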

  • Thank you for this idea. I am assigning this to our product manager @mattchandler for review.

  • Thanks @RobynLinden! This would be a great suggestion for DataFlow too—while it's currently possible for DataFlows, we could make it easier.

    Best,
    Matt Chandler
    Domo
  • @cr1ckt Thanks for your suggestion too—that would also be a great improvement.

    Best,
    Matt Chandler
    Domo
This discussion has been closed.