Queuing of dataflow in case of parallel trigger requests (manual and automatic)

Problem statement: 

Let us consider a dataflow task scheduled to run once daily (on datasource refresh) that takes over 2 hours to complete.

There are instances when I had to force-run the dataflow, and if a scheduled run falls during this window, it is simply ignored. Although we can say the developer needs to be diligent in handling such scenarios, in a bigger team such collisions can easily happen. This can lead to data loss (especially because the dataflow is triggered right after fresh data is loaded by the datasource). It would also help if the aborted automatic run were logged in the dataflow's history, but currently it is not tracked at all.


Suggestions: 

As in many ETL tools, the dataflow could be queued if it is triggered while a run is already in progress, and the "history" tab could include a feature to track/monitor the queue. If needed, the number of queued runs per dataflow could be restricted to one (the latest trigger).
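A minimal sketch of the suggested behavior, assuming a "run, coalesce into at most one pending trigger" policy (the class, method names, and in-memory history list here are hypothetical illustrations, not an actual product API):

```python
import threading


class CoalescingRunner:
    """Runs a dataflow job; if triggered while a run is in progress,
    keeps at most ONE pending run (the latest trigger wins), instead
    of silently ignoring the second trigger."""

    def __init__(self, job):
        self._job = job
        self._lock = threading.Lock()
        self._running = False
        self._pending = False
        # Stand-in for the "history" tab: records queued/started/finished events.
        self.history = []

    def trigger(self, source):
        with self._lock:
            if self._running:
                # Coalesce: only one queued run is kept, but every
                # queued trigger is still logged in the history.
                self._pending = True
                self.history.append(("queued", source))
                return
            self._running = True
        self._run(source)

    def _run(self, source):
        while True:
            self.history.append(("started", source))
            self._job()
            self.history.append(("finished", source))
            with self._lock:
                if self._pending:
                    # A trigger arrived mid-run: execute one more pass.
                    self._pending = False
                    source = "queued"
                    continue
                self._running = False
                return
```

With this policy, a scheduled trigger that arrives during a manual run is executed once the manual run finishes, and both the queued event and the follow-up run appear in the history instead of being lost.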


Thanks!
