Re-running a failed job in workbench using automation

We are currently using Domo Workbench to pull in data from various third-party property management information systems, which requires a large number of jobs to be run. The problem comes when a job fails for one reason or another: there seems to be no way to automatically set it to rerun, so someone has to go in and manually restart the job.

Does anyone know of a way to set these jobs to run again if they run into an error and fail?

We are looking into using PowerShell and the command line to check every 30 minutes whether a job has failed or run into an error. This might be possible because every job that runs produces a log file, and within that log file is a unique ID number for the job itself. If a job fails, the word "error" appears throughout the log file, so in theory you could use PowerShell to automatically parse the most recent logs for "error" and then restart any failed jobs in Workbench. We need to look further into PowerShell's capabilities, as we are not very familiar with it, but it is a potential solution that was brought up.
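As a rough sketch of that log-scanning idea (written in Python here for illustration rather than PowerShell, and assuming the logs live in one directory with the job ID embedded in each file name, e.g. `job_12345.log` — both assumptions, since the real layout will depend on your Workbench install):

```python
import re
from pathlib import Path

def find_failed_jobs(log_dir, max_recent=20):
    """Scan the most recently modified .log files for the word 'error'
    and return the job IDs pulled from the matching files' names.

    The file-naming convention (numeric job ID in the name) is a
    hypothetical -- check how your Workbench logs are actually named.
    """
    failed = []
    logs = sorted(Path(log_dir).glob("*.log"),
                  key=lambda p: p.stat().st_mtime, reverse=True)
    for log in logs[:max_recent]:
        text = log.read_text(errors="ignore")
        if re.search(r"\berror\b", text, re.IGNORECASE):
            match = re.search(r"(\d+)", log.name)
            if match:
                failed.append(match.group(1))
    return failed
```

The IDs this returns would then be handed off to whatever restarts the job (the Workbench command line, in the approach described above). The equivalent search in PowerShell itself would be `Select-String`, which does the same pattern matching over files.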

If anyone knows more about this topic or would be interested in collaborating on a solution please feel free to reach out.

Answers

  • MarkSnodgrass (Portland, Oregon)

    Here are a couple of ideas for you:

    • There currently isn't a retry-on-fail option in Workbench, so you should submit this in the Ideas Exchange section of the Dojo; maybe it will be added in a future release.
    • Consider increasing the frequency of your job schedule. Domo is smart enough to not process if no data has changed. You could move from daily to hourly, which would be a built-in way to automatically retry.
    • You can use the Group Scheduling feature and have the first job be one that tests connectivity; if it succeeds, the rest of the jobs run. You could have that first job run every hour, for example.

    Not sure if any of these will work for you, but they are easy to implement.

    Hope this helps.

  • GrantSmith (Indiana)

    You could leverage a scripting language and the web.exe executable to run the jobs and then determine whether each one succeeded or failed. To do that you’d likely need to read the JSON export (you can produce the export with the web.exe executable). Essentially you’d be writing another program to monitor Workbench and have it kick the job off again if it failed. Really, this is something Workbench should support, and it would be good product feedback for them.
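    As a rough illustration of the monitor idea (Python here; the JSON field names `jobs`, `id`, and `lastStatus` are invented placeholders — inspect a real Workbench export to find the actual structure):

```python
import json

def jobs_to_rerun(export_text):
    """Given the text of a Workbench job-status export (JSON), return
    the IDs of jobs whose last run failed.

    The keys 'jobs', 'id', and 'lastStatus' are hypothetical -- the
    real export's shape will differ, so adjust after inspecting one.
    The actual re-run would then shell out to the Workbench executable
    for each returned ID, using whatever run command it supports.
    """
    data = json.loads(export_text)
    return [job["id"] for job in data.get("jobs", [])
            if job.get("lastStatus", "").lower() == "failed"]
```

    A scheduled task could run this every 30 minutes: export the statuses, feed the text to `jobs_to_rerun`, and kick off anything it returns.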

    It depends on your use case whether the job can be rerun every hour even if it succeeds, or whether it just needs to run successfully once a day. If your use case is OK with multiple executions and they won’t affect your pipeline, then Mark’s solution may work: just run the job more frequently. If not, you’d likely need to write another script to export the job status, check it, and kick the job off again if necessary. Take your pick of programming languages. Not ideal.
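    For the once-a-day case, the monitoring script would need to remember what has already succeeded so it doesn’t re-run a job that completed earlier the same day. A minimal sketch of that guard (the `success_log` dict is state the monitoring script itself would maintain; nothing here is a Workbench feature):

```python
import datetime

def should_rerun(job_id, last_status, success_log):
    """Decide whether to kick a job off again.

    success_log maps job ID -> date of the last successful run, as
    tracked by the monitoring script. For once-a-day pipelines this
    prevents re-running a job that has already succeeded today.
    """
    if success_log.get(job_id) == datetime.date.today():
        return False
    return last_status.lower() == "failed"
```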