Joining two datasets on a BETWEEN condition

Hello,

I have two datasets that I would like to join on a BETWEEN condition. The "Join data" option available in the ETL shows only inner and outer joins. Where can I do custom joins?

Here is the detailed scenario.

Dataset A has two columns: clicks and date.

Dataset B has ID, start date, and end date.

I need an output with B.ID and sum(A.clicks), where A.date is between B.start date and B.end date.

Each ID has separate start and end dates that don't overlap with other IDs.

Thanks in advance!

Best Answer

  • AS
    Accepted Answer

    Hi Srujana,

    You'll want to use a SQL dataflow for this instead. As you noted, ETL only offers limited join options.
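For reference, here is a minimal sketch of the kind of query a SQL dataflow could run for this scenario. It is wrapped in Python's built-in sqlite3 module only so it can be run and checked anywhere; the SELECT itself is standard SQL and should carry over to a MySQL dataflow largely unchanged. The table names (dataset_a, dataset_b), the column names (start_date, end_date), and the sample rows are all assumptions made for illustration, not names from the actual datasets.

```python
import sqlite3

# In-memory database standing in for the two input datasets.
# Table and column names here are hypothetical placeholders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dataset_a (clicks INTEGER, date TEXT);
    CREATE TABLE dataset_b (id TEXT, start_date TEXT, end_date TEXT);
""")

# Made-up sample rows; ISO-formatted dates compare correctly as text.
conn.executemany("INSERT INTO dataset_a VALUES (?, ?)", [
    (10, "2019-01-05"),
    (20, "2019-01-20"),
    (5, "2019-02-10"),
    (7, "2019-03-15"),  # falls outside every date range in dataset_b
])
conn.executemany("INSERT INTO dataset_b VALUES (?, ?, ?)", [
    ("X", "2019-01-01", "2019-01-31"),
    ("Y", "2019-02-01", "2019-02-28"),
])

# Join each click row to the ID whose date range contains it,
# then total the clicks per ID. BETWEEN is inclusive on both ends.
QUERY = """
    SELECT b.id, SUM(a.clicks) AS total_clicks
    FROM dataset_b AS b
    JOIN dataset_a AS a
      ON a.date BETWEEN b.start_date AND b.end_date
    GROUP BY b.id
    ORDER BY b.id
"""
rows = conn.execute(QUERY).fetchall()
print(rows)  # [('X', 30), ('Y', 5)]
```

Because this uses an inner join, an ID with no clicks in its date range is dropped from the output; switching to a LEFT JOIN would keep such IDs with a NULL total. Since the date ranges don't overlap between IDs, each click row matches at most one ID and is never counted twice.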

Answers

  • If you would like help with the MySQL code, feel free to ask. But @AS is right: I'm not aware of a way to do this inside an ETL at the moment. I have heard of some planned additions to ETL that will let you write Python or R code directly in the ETL, but for now I think MySQL is your solution.

  • Eventually we'll probably see SQL transforms in Magic ETL (so the rumor goes), which means we could have the best of both worlds. I have my fingers crossed, my hopes high, and my expectations in check.