Data Lineage - Figuring Out All of the Datasets That Are Inputs for Dataflow(s)

Hi,

We are about to conduct our first dataset cleanup, and I am having trouble figuring out the best way to determine which datasets are inputs to a dataflow.

The datasets that are inputs don't have any cards associated with them, and nothing is represented in the data lineage.

We are trying to avoid deleting a dataset that looks dormant but is actually feeding many other datasets. I have accidentally done this in the past. Any best practices or suggestions?

Best Answer

  • VictorReyes
    Accepted Answer

    Hi,

    This seems like a maintenance nightmare without this functionality.

    Here is the link to upvote this idea. I was not able to find a similar idea already posted in the Ideas Exchange forum: LINK

Answers

  • To have such functionality would be a great help in our team as well!

  • Darius
    domo

    Thank you for your question regarding data lineage, VictorReyes. If you have concerns, you may reach out to support to confirm that a dataset is not used as an input to a dataflow. Otherwise, the current best practice is to keep an ongoing record of which datasets are used by dataflows as inputs. I understand that can be hard to scale, so this is also a feature that our product team is considering for future releases, and we will provide updates through post-update notifications in your Domo instance. If you have ideas about what you would like to see in this feature, please let me know so I can pass them on to the product team. Thank you!

  • Hi,

    How is support able to determine this? Is it possible to pass back whatever log or information they are viewing as a dataset, or to update the data lineage functionality to show all of the dataflows that a dataset is a part of? This would be similar to viewing the data lineage for a dataflow output.

  • Darius
    domo

    Hello VictorReyes, thank you for your response. Under extenuating circumstances, support can manually check whether a dataset is used as an input to dataflows. However, we are working with the product team to expand that into a feature in the Domo platform so that it is available to all users.

    Thank you for the suggestion, which is a great idea! We currently do not have such an offering. Please visit the Dojo Ideas Exchange at http://dojo.domo.com/t5/Ideas-Exchange-suggest-and-vote/idb-p/Ideas. You can search existing ideas and vote for one if it matches what you have in mind. If you cannot find one, please create a new idea, which your fellow Dojo members can then vote on by clicking the white arrow next to the idea.

  • Thanks @VictorReyes for the post and the idea submission.

    I voted your idea up as a follow-up to the thread. The product management team should assign this to a PM for review shortly.

    Regards,

  • Hi,

    Thank you for doing so.
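
Until the lineage view shows dataflow inputs, the record-keeping practice suggested earlier in the thread could be kept as a small script rather than a spreadsheet. The following is only a minimal sketch in Python under stated assumptions: the dataflow and dataset names are made up, the mapping is maintained by hand, and nothing here calls a Domo API.

```python
from collections import defaultdict

# Hypothetical, hand-maintained record: dataflow name -> input dataset names.
# These names are illustrative only and do not come from any real instance.
DATAFLOW_INPUTS = {
    "sales_rollup": ["raw_orders", "raw_customers"],
    "churn_model_prep": ["raw_customers", "support_tickets"],
}


def consumers_by_dataset(dataflow_inputs):
    """Invert the record: dataset name -> sorted list of dataflows that read it."""
    consumers = defaultdict(set)
    for dataflow, inputs in dataflow_inputs.items():
        for dataset in inputs:
            consumers[dataset].add(dataflow)
    return {dataset: sorted(flows) for dataset, flows in consumers.items()}


def safe_to_delete(candidates, dataflow_inputs):
    """Return the candidates that no recorded dataflow lists as an input."""
    used = consumers_by_dataset(dataflow_inputs)
    return [name for name in candidates if name not in used]


if __name__ == "__main__":
    # Show which dataflows depend on each dataset, then screen cleanup candidates.
    print(consumers_by_dataset(DATAFLOW_INPUTS))
    print(safe_to_delete(["raw_orders", "old_export"], DATAFLOW_INPUTS))
```

A check like this is only as reliable as the record behind it, so a dataset that passes it is a candidate for deletion, not a confirmed one. Asking support to confirm is still the safer route for anything that matters.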