Field Dependencies Documentation

There have been several requests for ERD-style maps of our datasets; however, I'd like to see these dependency maps taken to the field level.

 

If a field in a dataset is the result of a series of dataflows, we need to know how it's derived: what transforms have acted on that field from the raw data to the final dataset used to build cards. This is not just for general knowledge but to see how changing one dataflow may impact the final dataset.

 

Breaking down a dataflow into multiple steps (some MySQL, some Magic ETL) as opposed to a wall of SQL is great for creating the flows and visualizing the relationships; however, it creates a lack of transparency when determining cascading dependencies from one flow to the next.
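
To illustrate the kind of cascading, field-level lookup I mean, here is a minimal sketch in Python. It is only an illustration: the dataset names, field names, and the `LINEAGE` structure are all made up, and it assumes the field-level lineage has already been collected somehow (which is exactly the part that is hard today).

```python
from collections import defaultdict, deque

# Hypothetical field-level lineage: each derived (dataset, field) pair lists
# the upstream (dataset, field) pairs it is computed from. All names are made up.
LINEAGE = {
    ("clean_orders", "amount_usd"): [("raw_orders", "amount"), ("raw_orders", "currency")],
    ("order_summary", "total_usd"): [("clean_orders", "amount_usd")],
    ("order_summary", "region"): [("raw_customers", "region")],
    ("card_dataset", "avg_order_usd"): [("order_summary", "total_usd")],
}


def downstream_fields(lineage, changed):
    """Return every field that directly or indirectly depends on `changed`."""
    # Invert the upstream map so each source field points at the fields derived from it.
    children = defaultdict(list)
    for derived, sources in lineage.items():
        for source in sources:
            children[source].append(derived)

    # Breadth-first walk from the changed field through all dependent fields.
    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


if __name__ == "__main__":
    for dataset, field in downstream_fields(LINEAGE, ("raw_orders", "amount")):
        print(f"{dataset}.{field}")
```

With a map like this, changing `raw_orders.amount` would flag every derived field along the chain down to the dataset that feeds the cards, while unrelated fields such as `order_summary.region` would be left out.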

 

We've resorted to using a separate documentation tool to achieve this, but it is work-intensive and in constant danger of being out of date.

12 votes


Comments

  • Thank you for submitting this, @anafziger. I am assigning it to our product manager, @JSharp, to review and comment.

  •  @JSharp - We've been using a really great documentation tool, and given API endpoints to pull in DataFlow metadata and the different transform operations (or SQL code), we would be able to completely automate this process.

     

    Unless this is already in the works internally, it could be easier to integrate with them directly.

  • Understanding where data comes from and how it's derived is very important. This is a great callout and a fantastic idea. Thank you!

  •  @JSharp any update on this? Thanks

This discussion has been closed.