Strategies for monitoring or alerting about failed updates
For production data, we've got two paths for our data to get into Domo:
1) Scripts build Excel workbooks that are imported via Workbench.
2) Centralized data is stored in Postgres, exposed through a view, and then pulled in via the Postgres SSH connector.
The question is, how do people go about detecting late jobs? This came up the other day with our Workbench jobs. We noticed that several DataSets had not been updated for some time. It turned out that Workbench was no longer set to run these jobs: the schedule had been set to "never", and I have no clue why. I rescheduled everything and now it's flowing again. Workbench can notify you if a job fails, but it has no reason to notify you about a job that is never scheduled to run in the first place.
So what I'm after is a server-side monitoring strategy - something that would detect that a Workbench job hasn't run in longer than expected.
On the Postgres side, I think I've got options in the Advanced Scheduling section.
It would be nice to have a single strategy for monitoring that all of our flows are working as expected. This doesn't have to be a Domo feature; it could just be something that pulls data from Domo to feed into something else. For example, the old Domo DataSet API returns 'last updated' information. I could potentially script something that pulls the DataSet definitions on a regular schedule and compares them against per-DataSet thresholds to build a custom alert system. That seems a bit over the top for what must be a quite common requirement. Can anyone point me in the right direction for a way to detect late jobs that does not rely on the settings section of the connectors themselves?
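For what it's worth, here is a minimal sketch of the "script it yourself" approach described above. It assumes you have already fetched a list of DataSet objects shaped like the Domo v1 DataSet API response (each with a `name` and an ISO-8601 `updatedAt` field); the DataSet names and thresholds are hypothetical placeholders:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical per-DataSet freshness thresholds; anything not listed
# falls back to the default. Tune these to your job schedules.
THRESHOLDS = {
    "Daily Sales": timedelta(hours=26),      # daily job, ~2h of slack
    "Weekly Inventory": timedelta(days=8),   # weekly job, ~1 day of slack
}
DEFAULT_THRESHOLD = timedelta(days=2)

def find_stale(datasets, now=None):
    """Return (name, age) pairs for DataSets whose last update is older
    than their allowed threshold.

    `datasets` is a list of dicts with 'name' and 'updatedAt' keys,
    where 'updatedAt' is an ISO-8601 timestamp like '2024-05-01T07:00:00Z'.
    """
    now = now or datetime.now(timezone.utc)
    stale = []
    for ds in datasets:
        # fromisoformat in older Pythons doesn't accept a trailing 'Z'
        updated = datetime.fromisoformat(ds["updatedAt"].replace("Z", "+00:00"))
        age = now - updated
        if age > THRESHOLDS.get(ds["name"], DEFAULT_THRESHOLD):
            stale.append((ds["name"], age))
    return stale
```

Run it on a schedule (cron, a small Lambda, whatever you have), fetch the DataSet list from the Domo API first, and pipe anything `find_stale` returns into email or Slack. The nice part is that it catches the "schedule set to never" failure mode, because it only looks at when data last arrived, not at whether a job reported an error.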