Queue dataflow runs when parallel triggers occur (manual and automatic)

Problem statement: 

Let us consider a dataflow task scheduled to run once daily (on datasource refresh) that takes over 2 hours to complete.

There are instances when I had to force-run the dataflow, and if a scheduled run fires during that window, it is simply ignored. Although we can say the developer needs to be diligent about such scenarios, on a bigger team these collisions can easily happen. They can lead to data loss, especially because the dataflow is triggered right after fresh data is loaded by the datasource. It would also help if the aborted automatic run were logged in the dataflow's history, but it is not tracked currently.



As in many ETL tools, the dataflow could be queued if it is triggered while a run is already in progress, and the "History" tab could let us track/monitor the queue. If needed, the number of queued runs per dataflow could be capped at one (keeping only the latest trigger).
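To make the proposal concrete, the keep-latest queuing policy could look something like the sketch below. This is purely illustrative pseudocode-made-runnable, not the platform's actual engine: `DataflowRunner`, `trigger`, and the history tuples are all hypothetical names I made up for this sketch.

```python
import threading

class DataflowRunner:
    """Runs a dataflow, coalescing triggers that arrive mid-run.

    Hypothetical sketch: at most one trigger is queued; a newer
    trigger replaces the older queued one (keep-latest policy),
    and every trigger outcome is recorded in a history log.
    """

    def __init__(self, run_fn):
        self._run_fn = run_fn
        self._lock = threading.Lock()
        self._running = False
        self._pending = None   # at most one queued trigger source
        self.history = []      # audit trail: (source, event) tuples

    def trigger(self, source):
        with self._lock:
            if self._running:
                # Queue (or replace) the pending trigger instead of
                # silently dropping it -- the key change proposed above.
                if self._pending is not None:
                    self.history.append((self._pending, "superseded"))
                self._pending = source
                self.history.append((source, "queued"))
                return
            self._running = True
        self._execute(source)

    def _execute(self, source):
        # Run to completion, then drain the (single-slot) queue.
        while True:
            self.history.append((source, "started"))
            self._run_fn()
            self.history.append((source, "finished"))
            with self._lock:
                if self._pending is None:
                    self._running = False
                    return
                source, self._pending = self._pending, None
```

Capping the queue at one slot (superseding older queued triggers) avoids a pile-up when many triggers arrive during a 2-hour run, while the history entries (`queued`, `superseded`, `started`, `finished`) give the audit trail that is missing today.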





This discussion has been closed.