MySQL Data Flows clarification: MySQL version & window functions (row_number)

I've neglected MySQL Data Flows because I'd passingly assumed that they were for MySQL-derived DataSets. It seems not, it looks like a MySQL DataFlow is a DataFlow that uses an embedded version of MySQL. Is this correct?

I've seen mention of Window Functions in passing, and assumed the MySQL must be 8.0+, as that's when MySQL added support for row_number(), lead(), lag(), etc. However, here's the version string I get out of MySQL inside of Domo:

5.6.28-76.1-56

So, no Window Function support.

Does anyone know if the internal MySQL is scheduled for an upgrade to 8.0+?

(Better: How about embedding Postgres as well....it's got a range of wonderful features not found in the MySQL family of SQL engines. :: alone for casting is worth the price of admission.)

Thanks for confirmation/clarification.

Tagged:

Answers

  • MarkSnodgrass
    MarkSnodgrass Portland, Oregon 🟀

    @DataSquirrel I have not heard any talk of a pending upgrade of MySQL, but I would assume they would at some point. Domo is going to focus its resources on Magic ETL and Adrenaline Dataflows and will try and steer people to those offering as much as possible. It is cheaper for them if people use those more, and I would encourage you to look into it as Magic ETL 2.0 runs really fast and can do all the window functions that you need it to do. If you don't have Magic ETL 2.0, ask your CSM to turn it on for you. I think you will be pleased with it.

  • @Mark , thanks for the answer. I'm checking into getting access to Magic ETL 2.0, it sounds quite good.

    Are there some docs on Adrenaline Dataflows? I feel like I'm missing a lot of information somewhere about subjects including Adrenaline.


    Thanks again!

  • MarkSnodgrass
    MarkSnodgrass Portland, Oregon 🟀

    @DataSquirrel there is a Domo user group Slack channel that I think you would enjoy being a part of. We also have weekly zoom meetings on various topics.

    @jaeW_at_Onyx can you get @DataSquirrel added to both?

  • Thanks a lot, Jae has been helping me out too, and I've ben getting some great info from his videos.

    I'm on Sydney time, so not sure about times, but...hopefully!

  • jaeW_at_Onyx
    jaeW_at_Onyx Budapest / Portland, OR 🟀

    Let’s work together in Slack! It’s a faster, simpler way to talk business, share files and get work done. Join here: https://join.slack.com/t/domousergroup/shared_invite/zt-n22nno2s-erzBJFTX3TdYOM_jb4058Q


    @DataSquirrel

    Domo spins up a VM (or similar) to process MySQL etls. They've known they're behind for YEARS. I do not believe we're ever getting an update. You can create a poor man's window function use user defined variables, but over massive datasets it wouldn't be performant b/c #indexing. You'd be better off with Magic 2.0. Their window functions are semi comprehensive but very performant.

    Redshift is an option for massive datasets but the time it takes to actually START a redshift dataflow (i usually tell people the jobs take 20 minutes in a queue before execution even begins) can kill the performance gain of a faster database engine.

    Adrenaline Dataflows is a premium feature (i'm making up a number but i think in the 50k region) that functionally allows you to write SQL against the Adrenaline database layer similar to how DataSetViews are creating SQL views against Adrenaline, with the key difference being that ADR dataflows just materializes the data as a dataset that gets stored in Vault (think S3 + automated disaster recovery).


    Hope that helps.