Is there a way to identify orphan datasets that have no flows or cards dependent on them?

I'm looking to do two things. 1) Clean up unused datasets 2) Identify the most used datasets. Are there cards that will help with this?

Comments

  • GrantSmith (Indiana)

    Hi @user036002,

     

    You'll want to look at the Domo Governance third-party datasets. They contain information about the different objects within Domo, along with their metadata.

     

    Specifically, check the Data Set Details dataset for rows where card count = 0.

    You'll also need some ETL work to join the datasets to dataflows and confirm that a dataset isn't an input dataset on any dataflow.

     

    The inverse would help you identify the most used datasets: sum(card count) + count(dataflows with the dataset as an input).
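
    As a rough sketch of the logic above (not Domo's own API), here is the same join done in plain Python on a few made-up rows. The column names (dataset_id, card_count, input_dataset_id, dataflow_id) are assumptions; check them against the actual governance datasets in your instance.

```python
# Hypothetical rows standing in for the governance datasets.
# All column names and values here are assumptions for illustration.
dataset_details = [
    {"dataset_id": "ds_a", "name": "Sales", "card_count": 12},
    {"dataset_id": "ds_b", "name": "Old Export", "card_count": 0},
    {"dataset_id": "ds_c", "name": "Staging", "card_count": 0},
]
dataflow_inputs = [
    {"dataflow_id": "df_1", "input_dataset_id": "ds_a"},
    {"dataflow_id": "df_2", "input_dataset_id": "ds_a"},
    {"dataflow_id": "df_2", "input_dataset_id": "ds_c"},
]


def usage_scores(details, inputs):
    """Return {dataset_id: card_count + number of distinct dataflows reading it}."""
    flows_by_dataset = {}
    for row in inputs:
        flows_by_dataset.setdefault(row["input_dataset_id"], set()).add(row["dataflow_id"])
    return {
        d["dataset_id"]: d["card_count"] + len(flows_by_dataset.get(d["dataset_id"], set()))
        for d in details
    }


def find_orphans(details, inputs):
    """Datasets with no cards and no dataflow that uses them as an input."""
    scores = usage_scores(details, inputs)
    return sorted(ds for ds, score in scores.items() if score == 0)


scores = usage_scores(dataset_details, dataflow_inputs)
orphans = find_orphans(dataset_details, dataflow_inputs)
most_used = sorted(scores, key=scores.get, reverse=True)
```

    With this sample data, ds_b is the only orphan, because ds_c has no cards but still feeds a dataflow. In Domo itself the same thing would be a left join plus a filter in the ETL.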

     

     

  • jaeW_at_Onyx (Budapest / Portland, OR)

    The Domo Governance dataset DG Dataflow Details will tell you which datasets are used as inputs to a dataflow. JOIN that to the DataSet dataset to identify which datasets are not used in ETL. Be careful with how you treat fusions if you use them, because they don't follow the same dataset input/output pattern.

     

    Consider metrics like 'not used in dataflow', 'last time dataset updated', and 'last time dataflow was modified'. Some people set up fire-and-forget update schemes, so the data looks current but it's actually a stale, unused pipeline. Consider counting the number of cards and/or the number of pages a card is shared to (DG Cards and Pages).
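
    One way those metrics could be combined, as a sketch only: the field names (last_updated, dataflow_last_modified, used_in_dataflow, card_count) and the 7-day and 365-day thresholds are all assumptions, not anything Domo defines.

```python
from datetime import datetime, timedelta


def flag_dataset(row, now, fresh_days=7, untouched_days=365):
    """Build review flags for one dataset row (field names are hypothetical)."""
    recently_updated = now - row["last_updated"] <= timedelta(days=fresh_days)
    flow_untouched = (
        row["dataflow_last_modified"] is not None
        and now - row["dataflow_last_modified"] > timedelta(days=untouched_days)
    )
    unused = row["card_count"] == 0 and not row["used_in_dataflow"]
    return {
        "dataset_id": row["dataset_id"],
        "not_used_in_dataflow": not row["used_in_dataflow"],
        "recently_updated": recently_updated,
        "dataflow_untouched": flow_untouched,
        # Data looks current, yet nothing consumes it.
        "stale_pipeline_suspect": recently_updated and unused,
    }


now = datetime(2024, 1, 31)
rows = [
    {"dataset_id": "ds_x", "last_updated": datetime(2024, 1, 30),
     "dataflow_last_modified": datetime(2021, 6, 1),
     "card_count": 0, "used_in_dataflow": False},
    {"dataset_id": "ds_y", "last_updated": datetime(2024, 1, 30),
     "dataflow_last_modified": datetime(2023, 12, 1),
     "card_count": 5, "used_in_dataflow": True},
]
flags = {r["dataset_id"]: flag_dataset(r, now) for r in rows}
```

    Here ds_x gets flagged as a suspect: it refreshed yesterday, but nothing reads it and its dataflow hasn't been touched in years. The page count from DG Cards and Pages could be added as one more field in the same way.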

     

    Also consider tying in the Activity Log to count how often cards built on a dataset were viewed in the last 90 days.
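
    A small sketch of that 90-day view count, rolled up from cards to datasets. The event fields (card_id, action, timestamp), the "VIEWED" action value, and the card-to-dataset mapping are assumptions; the real Activity Log columns may differ.

```python
from collections import Counter
from datetime import datetime, timedelta


def views_per_dataset(activity, card_to_dataset, now, window_days=90):
    """Count card view events per dataset within the last `window_days` days."""
    cutoff = now - timedelta(days=window_days)
    counts = Counter()
    for event in activity:
        if event["action"] != "VIEWED":  # assumed action label
            continue
        if event["timestamp"] < cutoff:
            continue
        dataset_id = card_to_dataset.get(event["card_id"])
        if dataset_id is not None:
            counts[dataset_id] += 1
    return counts


# Hypothetical sample data.
card_to_dataset = {"card_1": "ds_a", "card_2": "ds_a", "card_3": "ds_b"}
activity = [
    {"card_id": "card_1", "action": "VIEWED", "timestamp": datetime(2024, 3, 15)},
    {"card_id": "card_1", "action": "VIEWED", "timestamp": datetime(2023, 11, 1)},
    {"card_id": "card_2", "action": "VIEWED", "timestamp": datetime(2024, 3, 20)},
    {"card_id": "card_3", "action": "EDITED", "timestamp": datetime(2024, 3, 21)},
    {"card_id": "card_3", "action": "VIEWED", "timestamp": datetime(2024, 2, 1)},
]
views = views_per_dataset(activity, card_to_dataset, now=datetime(2024, 4, 1))
```

    A dataset that has cards but zero views in the window is probably a cleanup candidate too, even though its card count isn't 0.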