Performance Issue MySQL DataFlow
I have a bit of an performance issue.
I am basically trying to create a table for our sales department, which includes all sales countries (as a filter), sales towns (column 1) and sales (Total) from this year (column 2), last year sales (column 3) and a variance column (column 4). Furthermore I want to insert a date picker and a possibility for YTD figures.
Here is my existing code:
1) Select created_at_trans,shop_town_id,
Sum(Total) AS Total
From buchungsliste_sh GROUP BY created_at_trans,shop_town_id
-Generate Output Table: proc_group
SELECT DISTINCT shop_town_id from buchungsliste_sh
-Generate Output Table: townsliste
select curdate() - INTERVAL (a.a + (10 * b.a) + (100 * c.a)+ (1000*d.a)) DAY as Dates,townsliste.shop_town_id
from (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as c
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as d
CROSS JOIN townsliste
where a.Dates between '2010-01-01' and '2025-01-01'
order by dates
-Generate Output Table: proc1
SELECT dt.Dates as datum,dt.shop_town_id,
DATE_SUB(dt.Dates, interval WEEK(STR_TO_DATE(CONCAT('12-31-',YEAR(dt.Dates)-1),'%m-%d-%Y'),0) week) AS LY_Date
From proc1 dt LEFT JOIN proc_group d ON dt.dates = d.created_at_trans AND dt.shop_town_id = d.shop_town_id
order by dt.dates
-Generate Output Table: proc_join
n1.total AS Ly_Total
From proc_join n LEFT JOIN proc_group n1 on n1.created_at_trans = n.ly_date
-Generate Output Table: proc2
ALTER TABLE proc2
ADD Season text;
SET season = (CASE
WHEN datum BETWEEN '2005-05-01' and '2006-04-30'
WHEN datum BETWEEN '2006-05-01' and '2007-04-30'
WHEN datum BETWEEN '2007-05-01' and '2008-04-30'
WHEN datum BETWEEN '2008-05-01' and '2009-04-30'
WHEN datum BETWEEN '2009-05-01' and '2010-04-30'
WHEN datum BETWEEN '2010-05-01' and '2011-04-30'
WHEN datum BETWEEN '2011-05-01' and '2012-04-30'
WHEN datum BETWEEN '2012-05-01' and '2013-04-30'
WHEN datum BETWEEN '2013-05-01' and '2014-04-30'
WHEN datum BETWEEN '2014-05-01' and '2015-04-30'
WHEN datum BETWEEN '2015-05-01' and '2016-04-30'
WHEN datum BETWEEN '2016-05-01' and '2017-04-30'
WHEN datum BETWEEN '2017-05-01' and '2018-04-30'
WHEN datum BETWEEN '2018-05-01' and '2019-04-30'
WHEN datum BETWEEN '2019-05-01' and '2020-04-30'
WHEN datum BETWEEN '2020-05-01' and '2021-04-30'
WHEN datum BETWEEN '2021-05-01' and '2022-04-30'
WHEN datum BETWEEN '2022-05-01' and '2023-04-30'
WHEN datum BETWEEN '2023-05-01' and '2024-04-30'
WHEN datum BETWEEN '2024-05-01' and '2025-04-30'
WHEN datum BETWEEN '2025-05-01' and '2026-04-30'
As you can see it is a very large and complex statement. Unfortunately, it is necessary to include all dates per town, because otherwise I have incorrect last year figures.
Town A had sales last year, but not on the same day this year. The figure won´t show up, if I change the code.
Step 4 took about 6 hours and step 5 about 10h.
Does someone could advise me how to have the same result, but in a more efficient way?
Many thanks for your help!
You can improve the performance by adding an index prior to the joins. You have to add a transform for each table you are indexing.
ALTER TABLE proc1 ADD INDEX(`dates`,`shop_town_id`);
ALTER TABLE proc_group ADD INDEX(`created_at_trans`,`shop_town_id`);
then run your join statement (#4)
ALTER TABLE proc_join ADD INDEX(`created_at_trans`,`shop_town_id`);
ALTER TABLE proc_group ADD INDEX(`ly_date`,`shop_town_id`);
then run your join statement (#5)
Let me know if that speeds up the process. My next thought would be to create a "join" field in those data tables by concating the two fields you are joining on CONCAT(`dates`,`shop_town_id`) as `join`
You would then want to index that new field prior to the join.
Good luck to you, let me know how it turns out.
“There is a superhero in all of us, we just need the courage to put on the cape.” -Superman
@ST_-Superman-_ gives very solid advice.
I offer one more suggestion to possibly speed things up for you. I have a similar transform process in one of my dataflows where I combine a list of all days with a list of all customers to get an array of all possible customer-day combinations. Instead of doing date math using your self-generated list of n days ago, I used Domo's Domo Dimensions connector (https://yourinstance.domo.com/appstore/connectors/com.domo.connector.domodimension) to bring to our data center a calendar file that has all days in it. Then I just join that calendar against a list of all customers to generate the customer-day combinations:
(select distinct customer_id from customers) a
--notice no join condition since you want a cartesian product, but you could still use CROSS JOIN if you wanted
I think it's really user-friendly this way.Aaron
MajorDomo @ Merit Medical
**Say "Thanks" by clicking the heart in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"1
- 7.6K All Categories
- 913 Connectors
- 241 Workbench
- 470 Transform
- 1.8K Magic ETL
- 60 SQL DataFlows
- 445 Datasets
- 24 Visualize
- 194 Beast Mode
- 2K Charting
- 6 Variables
- 14 Cards, Dashboards, Stories
- 1 Automate
- 348 APIs & Domo Developer
- 82 Apps
- 14 Predict
- 3 Jupyter Workspaces
- 11 R & Python Tiles
- 241 Distribute
- 59 Domo Everywhere
- 241 Scheduled Reports
- 14 Manage
- 35 Governance & Security
- 18 Product Ideas
- 1.1K Ideas Exchange
- Community Forums
- 15 Getting Started
- 1 Community Member Introductions
- 49 Community News
- 18 Event Recordings
- 579 日本支部