Applying calculations to hundreds of columns in a dataset - best practice?

I have a dataset that has ~1,200 columns and I need to perform calculations on 495 of those columns (simple multiplication and division) and create a new column for each with the calculated value...
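For scale, the calculation itself is trivial to script if a scripting step is available. A minimal pandas sketch; the column names and the 0.5 rate are illustrative placeholders, not the real 495-column schema:

```python
import pandas as pd

# Sketch only: "col1"/"col2" and rate=0.5 are made-up stand-ins for the
# real 495 columns and their actual multipliers/divisors.
df = pd.DataFrame({"col1": [10, 20], "col2": [5, 15]})
calc_cols = ["col1", "col2"]   # in practice, the 495 target columns
rate = 0.5

for c in calc_cols:
    # One new column per source column (simple multiplication)
    df[f"{c}_calc"] = df[c] * rate
```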

 

Doing this in ETL would take forever to set up (I'd need 3 calculators, 2 of which would have all 495 columns specified in them), so I thought I'd do it as a SQL transform, but I've run into the "Identifier name must be 64 characters or less" problem. I can't edit the field names before the data gets into DOMO, as it comes from a 3rd-party source, so I figured I could use an ETL to rename the columns that were too long, BUT the "Select Columns" transformation screen hangs my browser (presumably because there are ~1,200 columns; I'm on a new MacBook Pro and have tried Chrome, Firefox and Safari with the same result).
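One possible workaround when the Select Columns tile won't load: build the rename map in code and apply it in a scripting step. A sketch, assuming a Python step is available; the helper name and collision handling are mine, not a DOMO feature:

```python
# Sketch: truncate column names to the 64-character identifier limit,
# de-duplicating any collisions with a numeric suffix.
def truncate_names(columns, limit=64):
    """Return {old_name: new_name} with every new name <= limit chars."""
    rename, seen = {}, set()
    for col in columns:
        new = col[:limit]
        n = 2
        while new in seen:              # collision after truncation
            suffix = f"_{n}"
            new = col[: limit - len(suffix)] + suffix
            n += 1
        seen.add(new)
        rename[col] = new
    return rename

cols = ["short_name", "x" * 80, "x" * 80 + "y"]   # illustrative names
mapping = truncate_names(cols)
```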

 

So I figured I'd do the calculations in Beast Mode and save the results back to the dataset - but surely this isn't best practice?!

 

Suggestions of the "correct" way to tackle this most welcome!

Best Answer

  • DataMaven 🟢
    Accepted Answer

    Trick to rename the columns with long names:

    In Magic ETL:

    1. Add Constant (with your new column name).

    2. Set Column Value to the values in the column with the name that's too long.

    3. Go to SQL and use these new columns instead of the old ones.

    This assumes it's just a few that are too long.

     

    Have you tried bringing the data in via workbench and playing with the schema?

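    If hand-typing 495 expressions in the SQL tile is the remaining pain point, the SELECT text can also be generated by a short script and pasted in. A sketch; the table name, column names, and multiplier are placeholders, not from this thread:

    ```python
    # Sketch: generate SQL for many calculated columns instead of
    # hand-writing 495 expressions. All names are illustrative.
    calc_cols = ["col_a", "col_b"]   # stand-ins for the (renamed) columns
    rate = 0.5                        # example multiplier

    parts = ["t.*"] + [f"`{c}` * {rate} AS `{c}_calc`" for c in calc_cols]
    sql = "SELECT\n  " + ",\n  ".join(parts) + "\nFROM my_table t"
    print(sql)
    ```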

Answers

  • Thanks for the ideas @DataMaven 

     

    There are actually 130 columns whose names are too long, so your method isn't impossible, but it's still not great. It might end up being the answer, though.

     

    Workbench is Windows only, so I've never used it. Looks like I might need to hit the boss up for a PC.

  • A Windows virtual machine is a cheaper alternative to a PC, and you can manage it from anywhere.