Extract 1 row / few rows from big dataset (>100M rows)

White Belt

When I try to extract the _BATCH_LAST_RUN_ from a MySQL dataset with 106M rows, the ETL takes 3h20min. Is there a way to extract only one row from the dataset? At the moment, I have to wait until the full 106M-row dataset is loaded, in both Magic ETL and SQL.

Major Brown Belt

Re: Extract 1 row / few rows from big dataset (>100M rows)

Hi,

 

It depends on how your dataset is configured. For a standard dataset, the full dataset must load. If you can partition your dataset, you may be able to speed this up.

 

Jarvis

Major Brown Belt

Re: Extract 1 row / few rows from big dataset (>100M rows)

Yeah this is a tough one...

 

In a nutshell, any ETL tool (except Fusion) has to transfer your data into the transformation engine (a SQL database or Magic's ... data processing environment) before you can transform it. That is why your ETLs take so long.

 

Your goal should be to use VIEWS, which go directly to Adrenaline (our database layer), to transform / subset your data.
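
To show the general idea outside of Domo, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for a database layer. It is not Domo's API and says nothing about how Adrenaline works internally; the table name big_dataset, the column batch_last_run, and the view name latest_batch are all made up for illustration. The point is only that a view stores a query definition, so asking for one row does not require copying the whole table somewhere else first.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; this is NOT Domo or Adrenaline.
conn = sqlite3.connect(":memory:")

# Hypothetical table playing the role of the large dataset.
conn.execute("CREATE TABLE big_dataset (id INTEGER, batch_last_run TEXT)")
conn.executemany(
    "INSERT INTO big_dataset VALUES (?, ?)",
    [(i, "2020-01-01" if i < 900 else "2020-01-02") for i in range(1000)],
)

# The view holds only the query definition; no rows are copied when it is created.
conn.execute(
    """
    CREATE VIEW latest_batch AS
    SELECT MAX(batch_last_run) AS batch_last_run
    FROM big_dataset
    """
)

# Querying the view returns a single row, computed where the data already lives.
rows = conn.execute("SELECT batch_last_run FROM latest_batch").fetchall()
print(rows)
```

In an ETL, by contrast, all of the rows would be transferred into the transformation engine before the same aggregate could be computed.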

 

Up to about 5 months ago, your options for this were VERY limited... but ... stay tuned ...

 

There are product updates in the pipeline that will make this story MUCH BETTER.  Contact your Domo Customer Success Manager (CSM) and ask how you can create views in Domo.  We've got new features coming to the UI that are really going to help you.

 

That said, if you're pretty nerdy, you can use the JavaCLI to create a view without a pretty user interface.

Use get-schema to get the schema of an existing dataset in Domo, then use create-dataview... to create a dataview based on the schema you pulled, with filters added.
