White Belt
Posts: 6
Registered: ‎08-25-2016

Asynchronously run custom workbench plugin

Hi,

 

I am currently working on a custom plugin for Workbench 4.5. My goal is to query multiple data sources within a single DataReader. This is required because the data in my organization is sharded across multiple databases, but the tables across the databases all have the same schemas. I have successfully created a DataReader and DataProvider to handle this, installed the plugin on the Workbench instance, and loaded the data from all of the databases into Domo.

 

The issue I am facing is the time it takes to run the new plugin. Importing 950k records takes 20 minutes with the new plugin, compared to about 2 minutes total when using the standard ODBC DataReader against each of the sharded databases. I believe the ODBC DataReader is faster because it uses multiple threads, and I'd like to do something similar in my custom plugin.

 

If anyone has any idea how to do this (possibly some more info on how the "ExecutionCharacteristics" property is used on the DataReader, or which DLL contains the source code for the ODBCDataReader), I would be extremely grateful.

 

Thanks!


All Replies
Moderator
Posts: 123
Registered: ‎01-22-2017

Re: Asynchronously run custom workbench plugin

Is anyone able to help with this topic?

Solution
Accepted by topic author Property_Ninja
‎08-10-2017 05:10 AM
White Belt
Posts: 11
Registered: ‎05-14-2015

Re: Asynchronously run custom workbench plugin

You could avoid the Domo Workbench software altogether and use Python and dataframes to manipulate and join your data from the multiple database queries you mentioned. Dataframes are convenient because they allow SQL-like joining of data from within your script.
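A minimal sketch of the shard-union idea: run the same query against every shard and stack the results into one DataFrame. The in-memory SQLite databases, the `orders` table, and its columns here are stand-ins for illustration; in practice these would be connections to your real shards (e.g. via pyodbc or SQLAlchemy).

```python
import sqlite3
import pandas as pd

def make_shard(rows):
    # Stand-in for one sharded database; every shard shares the same schema.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn

shards = [
    make_shard([(1, 10.0), (2, 20.0)]),
    make_shard([(3, 30.0), (4, 40.0)]),
]

# Run one query per shard, then stack the frames; pd.concat works here
# because the schemas are identical across shards.
frames = [pd.read_sql_query("SELECT id, amount FROM orders", c) for c in shards]
combined = pd.concat(frames, ignore_index=True)
print(len(combined))  # 4 rows across both shards
```

If you need SQL-style joins rather than a union, `pd.merge` on the joined frames does the equivalent of an INNER/LEFT/OUTER JOIN.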

 

Once you've joined the data in a pandas dataframe, you could programmatically push it straight to Domo from your script using Domo's DataSet API or Streams API.

 

The Streams API is nice because it lets you script an asynchronous push of data to Domo, and it is much faster than Workbench.
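To illustrate what the asynchronous push looks like, here's a hedged sketch that fans gzipped CSV parts out over a thread pool. `upload_part` is a stub standing in for the actual HTTP PUT of each part to the Streams API (authentication, stream/execution creation, and the real endpoint URL are omitted; see Domo's developer docs for those details).

```python
import gzip
from concurrent.futures import ThreadPoolExecutor

def upload_part(part_number, payload):
    # Stub: in a real script this would PUT the gzipped CSV bytes to the
    # Streams API part endpoint for the current execution. Here we just
    # report what would be sent.
    return part_number, len(payload)

# Pretend these are gzipped CSV chunks produced from the query results.
parts = [gzip.compress(f"id,amount\n{i},{i * 10}\n".encode()) for i in range(1, 5)]

# Push all parts concurrently instead of one at a time -- this is where
# the speedup over a serial send comes from.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(upload_part, range(1, len(parts) + 1), parts))

print(results)
```

Once every part has uploaded, the execution is committed in a single final call, which is what makes the per-part uploads safe to run in parallel.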

 

Hope this helps.

 

 

 

 

White Belt
Posts: 6
Registered: ‎08-25-2016

Re: Asynchronously run custom workbench plugin

I completely forgot about the Streams API. We actually have an in-house .NET service for pulling from different data sources and pushing the data to other locations like Domo. I was worried about performance and about having to hold overly large chunks of data in memory before sending them over to Domo, but it looks like the Streams API handles all of those concerns.

White Belt
Posts: 11
Registered: ‎05-14-2015

Re: Asynchronously run custom workbench plugin

Yeah, you'll still have to split and gzip the query results, but once you have created the parts, the Streams API is very quick if you program it to push asynchronously.
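The split-and-gzip step can be done with nothing but the standard library. A small sketch (the chunk size, column names, and function name are arbitrary examples, not anything Domo-specific):

```python
import csv
import gzip
import io

def split_and_gzip(rows, header, chunk_size):
    """Split rows into chunks of chunk_size and gzip each chunk as CSV.

    Returns a list of gzipped byte strings, one per Streams part.
    Each part carries its own header row.
    """
    parts = []
    for start in range(0, len(rows), chunk_size):
        buf = io.StringIO()
        writer = csv.writer(buf)
        writer.writerow(header)
        writer.writerows(rows[start:start + chunk_size])
        parts.append(gzip.compress(buf.getvalue().encode("utf-8")))
    return parts

rows = [(i, i * 10) for i in range(10)]
parts = split_and_gzip(rows, ("id", "amount"), chunk_size=4)
print(len(parts))  # 10 rows at 4 per chunk -> 3 parts
```

Keeping the chunk size bounded also addresses the memory concern: only one chunk's worth of rows is ever buffered before compression.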

 

For example, we've written a small Python command-line application for pushing data up to Domo using the Streams API.

 

On one dataset, using the Streams API we sent data to Domo 57% faster than Domo Workbench could.

 

What you are really optimizing with the Streams API is the "send" portion of the job execution on the workbench.

 

For the send portion specifically, we sent data 95% faster than Domo Workbench did (an 8M-row, 130-column file in 3.5 minutes).
