Data Fusion not running

GrantSmith Indiana 🥷

I've got two datasets of 6.3M and 14.7M records. It's simple order and customer information tied together by a LONG (numeric data type) ID field. Between the two datasets there's a total of only 64 columns, so they're not huge, but for whatever reason I can't get the Data Fusion to work. Occasionally I'll get the preview to work, but sometimes it will fail and state that I can still save my Data Fusion. I've reviewed the knowledge base and I'm following all of the guidelines.


I've tried both inner and left joins and also swapping the input order of the datasets but nothing seems to work.


Has anyone else run into an issue with a Data Fusion not running after saving?


Best Answer

  • GrantSmith
    GrantSmith Indiana 🥷
    Answer ✓

    Thanks @jaeW_at_Onyx 


    I was aware of the 60-second limit; it was just odd to me because I've had wider and deeper tables in fusions return instantaneously. These together only had 69 columns, so they weren't wide by any means. It's a one-to-many relationship, so I didn't need to worry about a Cartesian product.


    For whatever reason I was able to just delete and recreate the fusion (after multiple attempts prior) and it was able to run successfully.


  • jaeW_at_Onyx
    jaeW_at_Onyx Budapest / Portland, OR 🟤

    @GrantSmith ,

    Just do the JOIN on a small handful of columns, start with 2 columns from the LEFT and the RIGHT side.

    Keep in mind SELECT * across two deep tables can be ugly ... especially if you have wide tables.


    Validate that you get the expected number of rows (14.7M) in the output.  


    It's wild to me to have 14M transactions across 6.3M customers ...


    BUT, that said, once you've confirmed that your JOIN isn't producing a Cartesian product (which can happen when the same customerID value appears on multiple rows on both sides of the JOIN) and your OUTPUT returns the correct number of rows, you can ask the support team to "optimize your query".
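
    As a rough illustration of that check outside of Domo, here is a minimal Python sketch. It assumes the two datasets are available as lists of dicts and that the join column is called customer_id; the function name check_join and the sample rows are hypothetical, not anything from Domo. It counts duplicate keys on the customer side and compares the expected inner-join row count with the number of order rows.

```python
from collections import Counter


def check_join(customers, orders, key="customer_id"):
    """Summarise whether a customer/order join behaves as one-to-many.

    `customers` and `orders` are lists of dicts; `key` is the (assumed)
    name of the join column present in both.
    """
    # How many customer rows carry each key value.
    customer_counts = Counter(row[key] for row in customers)

    # Keys that appear more than once on the customer side fan out the join.
    duplicate_keys = sorted(k for k, n in customer_counts.items() if n > 1)

    # Each order matches every customer row sharing its key, so the
    # inner-join row count is the sum of those per-order match counts.
    inner_rows = sum(customer_counts.get(o[key], 0) for o in orders)

    # Orders with no matching customer are dropped by an inner join.
    unmatched = sum(1 for o in orders if o[key] not in customer_counts)

    return {
        "duplicate_customer_keys": duplicate_keys,
        "order_rows": len(orders),
        "inner_join_rows": inner_rows,
        "unmatched_orders": unmatched,
    }


# Made-up sample rows, purely for illustration.
customers = [
    {"customer_id": 1, "name": "A"},
    {"customer_id": 2, "name": "B"},
]
orders = [
    {"customer_id": 1, "order_id": 10},
    {"customer_id": 1, "order_id": 11},
    {"customer_id": 2, "order_id": 12},
]

report = check_join(customers, orders)
print(report)
```

    If duplicate_customer_keys is empty and inner_join_rows equals order_rows, the join is one-to-many and every order found a customer. If inner_join_rows is larger than order_rows, duplicate keys on the customer side are multiplying rows.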


    They can do 'back end magic' to make sure your Fusion returns a result.


    THE ROOT CAUSE of the problem is that Fusions must return a result within 60 seconds or Domo will force quit the query.  


    If I were you, I'd build a Fusion on a subset of the customer dimension, then build out your visualizations /  dashboards.  This will give the DBAs a little more to work with as they're optimizing your fusion.  They'll be able to understand 'what's important' and make more meaningful projections than just on your JOIN column.


    Keep in mind that Optimizations can break if your schema changes (i.e. column names change).  You can add new columns (they just won't be optimized).


    hope that helps.

  • jaeW_at_Onyx
    jaeW_at_Onyx Budapest / Portland, OR 🟤

    Part of the issue can be related to caching.


    When you query a dataset in Adrenaline, Domo starts caching some of the results.  In my mind, I think of it as analogous to the statistics tables you build in a SQL database or the cache you have in MS SSAS Multidimensional.


    So, the more questions you ask the faster Domo gets at returning an answer.

    When you update a dataset in Domo, Adrenaline clears the cache, so all the stats etc. go away and have to slowly build back up as you query it.


    That's partially why (as I understand it) you might tinker with a fusion a few times, and after the 3rd try, you finally get a response.


    Have support / engineering take a look.

  • mine isn't running either. I swear it worked a few days ago.