Best practices for performance with the Snowflake federated connector

Hi,

I wanted to ask about any existing best practices for performance when building cards on top of a Snowflake federated connector.

Also, I have two concrete questions:

  • Are primary key constraints on the Snowflake side meaningful in Domo? Snowflake itself doesn't enforce those constraints, but they might be meaningful to Domo in some performance-significant way.
  • Suppose we have a table on the Snowflake side with a column of "categorical" data stored as VARCHARs: for example, the priority of an issue stored as 'LOW', 'MEDIUM', 'HIGH', or each store in the USA (say, fewer than 1,000 of them) identified by its name. If there are lots of rows, my intuition is that shoveling all those strings through the network might waste bandwidth and slow things down. Perhaps passing a compact numeric identifier would be better, but that creates the need to "enrich" the incoming data with the original descriptions on the Domo side, possibly using a DataSet view, which is one of the few transformation options currently available for Snowflake DataSets. Would this be a useful optimization?

Answers

  • Hi @danidiaz, how did you go with this?

    There's a recent article that sheds some light on how to set up a Snowflake Federated Connector [https://domohelp.domo.com/hc/en-us/articles/360049429094-Snowflake-Federated-Data-Setup] and mentions that "You can create Cards using Analyzer from this [Federated] DataSet, but ETL, SQL DataFlows and DataSet Alerts are not available to operate on Federated DataSets."

    If you are after something more specific, to address a business problem/scenario, then it's worth bringing up in the Dojo along with some anonymised/sample data, so that we can help.

  • jaeW_at_Onyx

    @danidiaz

    1) No, primary keys have little to no impact on Analyzer's performance.

    2) If you can implement DataSet views in Analyzer, you could try using a webform (or some other DataSet stored in Domo) as a dimension table / surrogate key solution, where you JOIN in the attributes after the data is pulled into Domo.
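
    As a rough illustration of the surrogate key idea, here is a minimal sketch that uses Python's built-in sqlite3 module as a local stand-in. It is not Domo's or Snowflake's API. The table and column names (issues, priority_dim, priority_id, priority_name) are hypothetical. The fact table carries only a compact numeric key, and the names are joined in afterwards from a small dimension table.

```python
import sqlite3


def enrich_with_names(fact_rows, dim_rows):
    """Join compact numeric keys back to their descriptions.

    fact_rows: iterable of (issue_id, priority_id) tuples
    dim_rows:  iterable of (priority_id, priority_name) tuples
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical fact table: only the numeric key travels with each row.
    conn.execute("CREATE TABLE issues (issue_id INTEGER, priority_id INTEGER)")
    # Hypothetical dimension table: one row per category, holding the name.
    conn.execute(
        "CREATE TABLE priority_dim ("
        "priority_id INTEGER PRIMARY KEY, priority_name TEXT)"
    )
    conn.executemany("INSERT INTO issues VALUES (?, ?)", fact_rows)
    conn.executemany("INSERT INTO priority_dim VALUES (?, ?)", dim_rows)
    rows = conn.execute(
        "SELECT i.issue_id, d.priority_name "
        "FROM issues AS i "
        "LEFT JOIN priority_dim AS d ON d.priority_id = i.priority_id "
        "ORDER BY i.issue_id"
    ).fetchall()
    conn.close()
    return rows


facts = [(101, 1), (102, 3), (103, 2)]
dims = [(1, "LOW"), (2, "MEDIUM"), (3, "HIGH")]
enriched = enrich_with_names(facts, dims)
print(enriched)
```

    In Domo the equivalent join would be defined in the DataSet view rather than in code; the sketch only shows the shape of the data on each side.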


    Keep in mind that your Snowflake Connector will determine how long Domo caches the data, so if you have short cache windows, Adrenaline (Domo's database layer) will have to reprocess your newly arriving data AND the JOIN.


    You could also look at implementing Beast Modes with CASE statements to convert your data.
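
    For a sense of what such a CASE expression might look like, here is a small sketch. The column name priority_id and the id-to-label mapping are assumptions, and the expression is run through Python's sqlite3 module only so that it can be checked locally. In a Beast Mode you would paste in just the CASE ... END part, adjusted to your own column names.

```python
import sqlite3

# Hypothetical mapping from a compact numeric id to its label.
CASE_EXPR = """
CASE
    WHEN priority_id = 1 THEN 'LOW'
    WHEN priority_id = 2 THEN 'MEDIUM'
    WHEN priority_id = 3 THEN 'HIGH'
    ELSE 'UNKNOWN'
END
"""


def label_priorities(priority_ids):
    """Apply the CASE expression to each id, preserving input order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE issues (priority_id INTEGER)")
    conn.executemany(
        "INSERT INTO issues VALUES (?)", [(p,) for p in priority_ids]
    )
    rows = conn.execute(
        f"SELECT {CASE_EXPR} FROM issues ORDER BY rowid"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


labels = label_priorities([1, 3, 2, 9])
print(labels)
```

    The ELSE branch is there so that an id missing from the mapping shows up as a visible label instead of a NULL.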

    ---


    That said, Analyzer can only represent 255k data points (or so), so your query to Snowflake will include a GROUP BY clause that reduces the data to only several hundred or a few thousand rows. Passing in a Name column therefore probably won't be that intensive on the API / network. You're not passing your 1B row dataset into Domo's viz layer.
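
    To make the effect of the GROUP BY concrete, here is a small sketch, again using sqlite3 as a local stand-in and a hypothetical sales table with store_name and amount columns. The row counts are made up for the illustration. However many detail rows exist, the aggregated result has only one row per store, and that is all that would need to cross the network.

```python
import sqlite3


def summarize_by_store(n_rows, n_stores):
    """Build n_rows detail rows over n_stores names, then aggregate."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (store_name TEXT, amount INTEGER)")
    # Synthetic detail rows: each row repeats one of n_stores store names.
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        ((f"Store {i % n_stores}", 1) for i in range(n_rows)),
    )
    summary_rows = conn.execute(
        "SELECT store_name, SUM(amount) FROM sales GROUP BY store_name"
    ).fetchall()
    conn.close()
    return summary_rows


summary = summarize_by_store(n_rows=100_000, n_stores=1000)
print(len(summary))  # one aggregated row per distinct store name
```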