Use Domo cards to monitor, manage and optimise ETLs and data sets to improve data quality
Problem: Domo has a lot of information about ETLs and data sets, but it is in disparate locations and much of it can't be monitored and optimised using Domo cards and alerts. While there are some custom views (Data Warehouse) and some alerts, we want the full power of Domo to be available to monitor, manage and optimise ETLs and data sets. Without this, it's difficult to ensure that the data is timely and accurate in complex sites.
Proposal: Existing operational data about ETLs be written to a log file, which can then be used as a data source in Domo to create collections/cards such as:
- ETL operations: A Gantt chart of select ETLs showing the start/end time and status. This chart would enable customers to confirm that all the ETLs had run as expected. It would also show sequences of ETLs that may not finish before the end of the day.
- ETL performance: A chart of the run times for select ETLs. This would enable customers to create alerts for ETLs that take longer, or shorter, than normal, indicating an underlying problem in the data.
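To make the ETL performance idea concrete, here is a minimal Python sketch of the kind of check such an alert would perform. The log layout (name, start, end, status), the sample rows, and the 50% tolerance are all assumptions for illustration; Domo does not currently expose a log in this form.

```python
from datetime import datetime
from statistics import mean

FMT = "%Y-%m-%d %H:%M"

# Hypothetical log rows: (ETL name, start time, end time, status).
ETL_LOG = [
    ("Campaign ETL", "2017-10-10 02:00", "2017-10-10 02:20", "Success"),
    ("Campaign ETL", "2017-10-11 02:00", "2017-10-11 02:22", "Success"),
    ("Campaign ETL", "2017-10-12 02:00", "2017-10-12 02:21", "Success"),
    ("Campaign ETL", "2017-10-13 02:00", "2017-10-13 02:19", "Success"),
    ("Campaign ETL", "2017-10-14 02:00", "2017-10-14 02:45", "Success"),
]


def run_minutes(start, end):
    """Run time of one ETL execution, in minutes."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60


def is_unusual(history, latest, tolerance=0.5):
    """True if the latest run time is more than `tolerance` (as a
    fraction of the historical mean) longer or shorter than normal."""
    baseline = mean(history)
    return abs(latest - baseline) > tolerance * baseline


durations = [run_minutes(start, end) for _, start, end, _ in ETL_LOG]
# Compare the most recent run against all earlier runs.
latest_is_unusual = is_unusual(durations[:-1], durations[-1])
print(latest_is_unusual)
```

With the log available as a data set, the same comparison could be expressed as a card with an alert on the run time, rather than as a script.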
Existing operational data about Data Sets be written to a log file, which can then be used as a data source in Domo to create collections/cards such as:
- Data Freshness: A chart showing how current select data sets are. Alerts can be used to create a warning if the data has become stale. This can show problems with the ETLs and connections.
- Data Growth: A chart showing the number of rows in select data sets. Alerts can be used to create a warning if the data size changes significantly. A much smaller or larger data set might indicate a problem in the transforms and connectors.
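The freshness and growth checks could look something like the Python sketch below. The function names, the 24-hour staleness limit, and the 25% growth limit are hypothetical choices, not existing Domo settings.

```python
from datetime import datetime, timedelta


def is_stale(last_updated, now, max_age=timedelta(hours=24)):
    """True if the data set has not been updated within `max_age`."""
    return now - last_updated > max_age


def size_changed_significantly(previous_rows, current_rows, limit=0.25):
    """True if the row count grew or shrank by more than `limit`
    (a fraction of the previous row count)."""
    if previous_rows == 0:
        # Any rows appearing in a previously empty data set count as a change.
        return current_rows != 0
    change = (current_rows - previous_rows) / previous_rows
    return abs(change) > limit


now = datetime(2017, 10, 14, 9, 0)
print(is_stale(datetime(2017, 10, 12, 9, 0), now))   # updated two days ago
print(size_changed_significantly(1000, 400))         # lost 60% of its rows
```

A sudden drop like the second case is the sort of change that would point to a failed connector or an over-restrictive filter in a transform.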
ETLs (Magic and SQL) be extended to support a validation/assertion logging framework. When an ETL is created, custom validations/assertions for each step can optionally be created, which are written to a log, e.g.,
- confirm that the number of rows before and after a transform step is the same.
- confirm that an aggregate (e.g., sum) of a column (e.g., media cost) is the same before and after the transform step.
The log would include sufficient details to support understanding the problem, e.g.,
2017/10/14, Error, "LOOKUP Campaign Name", Size, 150, "The number of rows should remain constant in this transform. An increase indicates an error introduced in the JOIN."
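As a rough illustration of the two assertions above, the Python sketch below checks a transform step and produces log lines in the same layout as the example. Everything here is hypothetical: the function names, the "Sum" check label, and the reading of the numeric field as the difference between the before and after values are assumptions, since no such framework exists in Domo today.

```python
from datetime import date
from math import isclose


def log_line(day, level, step, check, value, message):
    """Format one entry in the layout of the example log line."""
    return f'{day:%Y/%m/%d}, {level}, "{step}", {check}, {value}, "{message}"'


def check_row_count(step, rows_before, rows_after, day):
    """Return a log line if the step changed the row count, else None."""
    diff = len(rows_after) - len(rows_before)
    if diff == 0:
        return None
    return log_line(day, "Error", step, "Size", diff,
                    "The number of rows should remain constant in this transform.")


def check_sum(step, rows_before, rows_after, column, day):
    """Return a log line if the column total changed, else None."""
    before = sum(row[column] for row in rows_before)
    after = sum(row[column] for row in rows_after)
    if isclose(before, after):
        return None
    return log_line(day, "Error", step, "Sum", after - before,
                    f"The sum of {column} should be unchanged by this transform.")


# A JOIN that accidentally duplicates campaign 1.
before = [{"campaign_id": 1, "media_cost": 100},
          {"campaign_id": 2, "media_cost": 50}]
after = [{"campaign_id": 1, "media_cost": 100},
         {"campaign_id": 1, "media_cost": 100},
         {"campaign_id": 2, "media_cost": 50}]

day = date(2017, 10, 14)
size_entry = check_row_count("LOOKUP Campaign Name", before, after, day)
sum_entry = check_sum("LOOKUP Campaign Name", before, after, "media_cost", day)
print(size_entry)
print(sum_entry)
```

Both checks fire on the duplicated row, which shows why the two assertions complement each other: the row count catches the fan-out, and the sum shows how much the media cost was inflated by it.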