Preview Unique Values in ETL

I often use the Replace Text and Value Mapper functions in ETL to manipulate text strings. This helps ensure the filters on the card builder side pick up all the right rows. For example, if I have a column with...

 

ProductA

product a

Product A

 

...I can standardize them via ETL before I even get to card builder.
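For readers outside Domo, here is a minimal Python sketch of what that exact-match standardization amounts to. It is not Magic ETL itself; the `VALUE_MAP` dictionary and the `standardize` helper are hypothetical names made up for illustration.

```python
# Hypothetical exact-match mapping, similar in spirit to a value-mapping step.
# Each known variant is listed explicitly and mapped to one canonical label.
VALUE_MAP = {
    "ProductA": "Product A",
    "product a": "Product A",
    "Product A": "Product A",
}

def standardize(value: str) -> str:
    """Return the canonical label, or the value unchanged if it is not mapped."""
    return VALUE_MAP.get(value, value)

rows = ["ProductA", "product a", "Product A"]
standardized = [standardize(v) for v in rows]
print(standardized)  # ['Product A', 'Product A', 'Product A']
```

Because the mapping is exact-match, any variant that is not listed passes through unchanged.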

 

I think a great improvement would be the ability to preview unique values in ETL, to make sure I have caught all of them. If I had the three strings above mapped as I intended, but a new one, "product A", came into the data set, it would not be caught at the ETL stage and might be omitted from cards without my realizing. With data coming from various people and places in my organization, there is no way to ensure I am aware of all unique values before the data is imported. I have workarounds, but they are time-consuming, and I think an ETL solution would be very smooth and powerful.
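To make the gap concrete, the check I have in mind could be sketched in Python roughly as follows. This is only an illustration under my own assumptions, not an existing Domo feature; `MAPPED` and `unmapped_values` are hypothetical names.

```python
# Hypothetical set of values that the ETL mapping already handles.
MAPPED = {"ProductA", "product a", "Product A"}

def unmapped_values(column, mapped):
    """Return the sorted distinct values in `column` that no mapping covers."""
    return sorted(set(column) - set(mapped))

# Incoming data containing one variant that was never mapped.
incoming = ["ProductA", "product a", "Product A", "product A", "product A"]
print(unmapped_values(incoming, MAPPED))  # ['product A']
```

A list of distinct values like this is short enough to review by eye, which scrolling through every row is not.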

 

Currently you can preview your data at each stage in ETL (which is GREAT), but you can only scroll through rows in the preview. You can't sort or filter the preview, so there is no real way to confirm that all unique values have been addressed as needed.

 

The idea above would cover identifying unique values while an ETL is being built, but it leads me to another case: an ETL is built and running consistently, and new values appear later on. I would still need a way to find out that this has occurred, and could envision this being part of the DataSet details in the Data Center (e.g., the current tabs are Overview and Cards; perhaps lists of unique values could live there, too?).
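As a rough sketch of that second case, comparing the unique values from one run to the next could look like this in Python. It assumes the previous run's values are saved somewhere; the `new_values` helper and the sample data are hypothetical.

```python
def new_values(previous_run, current_run):
    """Return the sorted distinct values in the current run that were absent last run."""
    return sorted(set(current_run) - set(previous_run))

# Hypothetical snapshots of one column from two consecutive runs.
previous = ["Product A", "Product B"]
current = ["Product A", "Product B", "Product C", "product A"]

print(new_values(previous, current))  # ['Product C', 'product A']
```

Here "Product C" might be an expected new item while "product A" is a variant that needs mapping, so the list still needs a human to review it.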

 

Happy to discuss with Domo team members if anyone wants to get in touch.

Broadway + Data

Comments

  • Thank you for submitting this idea.  I am assigning to our product manager @mattchandler for review

  • That's a great suggestion, thanks @RobynLinden! Seeing all unique values would be a big help, and it would also be great to be able to set an alert when new values come in.

     

    In the case you are discussing, would you want the Magic ETL DataFlow to fail and notify you when new values come in, or to succeed and notify you, or just to have a list of new values available in a different place? I'm just wondering how serious new values would be in your mind (core data integrity issue where you would want to fix it right away, vs optimization where you want to clean up individual values)

     

    Best,

    Matt Chandler

    Product Manager

    Domo

     

  • @mattchandler good questions. I would want it to succeed and notify me, because some new values might be expected: when we add a new unique item, it should show up. Maybe, though, it would be best if the user could choose on the backend of ETL: RUN regardless of new values, or DO NOT RUN when new values appear -- because my response may actually differ from one ETL to the next.

     

    It would be GREAT if you could do something cosmetic to whatever the output is to indicate new values - bold, cell shading, something like that, so it draws the eye to whatever it found new this time that was not there previously. That would save a ton of time and help me feel confident I have caught everything. I don't know if that would help users who share admin responsibilities on data flows, but I'm the only person pushing the button in my instance, so it would help me ensure I know what is "new" and what is coming in as expected. 

    Broadway + Data
  • @RobynLinden Thanks for the additional clarification, that makes sense and is a great suggestion.

    Best,
    Matt Chandler
    Domo
  • I like this suggestion a lot.

This discussion has been closed.