Reply
Highlighted
White Belt
Posts: 8
Registered: ‎02-07-2017
Accepted Solution

Creating rankings based on a column in the Data Center (ETl or SQL)

Hi Guys,

 

While writing the ETL/SQL functions, to load our data and prep it for building cards later, we need to create a set of ranks based on different variables (columns). We don't want this ranking to be done in the cards (on the spot), we want to materialize it in our datasets.

 

So in short I'm looking for the code to do this : (Employee, Age and Score are existing fields).

 

Employee     Age      Rank_Age      Score   Rank_Score

John            31              2               120         3

Joe              29              1               59           4

Jack            44              3               255          1

Jim              57              4               200         2 

 

Thanks!

 

Joeri


Accepted Solutions
Solution
Accepted by topic author JoeriDP
‎02-21-2017 04:52 AM
Green Belt
Posts: 55
Registered: ‎11-18-2015

Re: Creating rankings based on a column in the Data Center (ETl or SQL)

You can do this in a MySQL query, but you'll probably need to do a different query for each column you want a ranking for. Here's a sample for a single column (the DESC in the first FROM statement means descending, you can use ASC or leave it blank for the age column):

 

SELECT 
a.*, @rank:=@rank+1 AS Rank
FROM( SELECT * FROM sample_table ORDER BY `Score` DESC)a, (SELECT @rank:=0)b;


 

This will order the table by descending score and start at 1 for the top row and add 1 per row going down. From your example, it will output:

EmployeeAgeScoreRank
Jack442551
Jim572002
John311203
Joe29594

View solution in original post

Solution
Accepted by topic author JoeriDP
‎02-21-2017 04:52 AM
Green Belt
Posts: 55
Registered: ‎11-18-2015

Re: Creating rankings based on a column in the Data Center (ETl or SQL)

You're correct about the @rank assumption, the @ symbol in MySQL sets a variable. So you wouldn't have to use @rank but you could use @ any word as your variable.

 

In the FROM statement, you have you name the output of every SELECT. You can try removing the b and it will give you this error: "Every derived table must have its own alias". In SQL itself you could use a SET statement to initialize the variable @rank as 0 but in MySQL this is how you would initialize a variable.

 

 

SELECT a.*, IF(@prev <> `job_id`, @rank := 1, @rank := @rank + 1) AS rank, @prev := `job_id`
FROM (select * from test_dataset order by `job_id`, `candidate_id`)a, (select @prev := NULL)b, (select @rank := 0)c;

 

Here's a sample I did for your last question, it first orders by job_id then by candidate_id (or whatever you would want to rank by) and then checks to see if the job_id has changed row over row to start over the ranking. Let me know if that helps!

View solution in original post


All Replies
Solution
Accepted by topic author JoeriDP
‎02-21-2017 04:52 AM
Green Belt
Posts: 55
Registered: ‎11-18-2015

Re: Creating rankings based on a column in the Data Center (ETl or SQL)

You can do this in a MySQL query, but you'll probably need to do a different query for each column you want a ranking for. Here's a sample for a single column (the DESC in the first FROM statement means descending, you can use ASC or leave it blank for the age column):

 

SELECT 
a.*, @rank:=@rank+1 AS Rank
FROM( SELECT * FROM sample_table ORDER BY `Score` DESC)a, (SELECT @rank:=0)b;


 

This will order the table by descending score and start at 1 for the top row and add 1 per row going down. From your example, it will output:

EmployeeAgeScoreRank
Jack442551
Jim572002
John311203
Joe29594
White Belt
Posts: 8
Registered: ‎02-07-2017

Re: Creating rankings based on a column in the Data Center (ETl or SQL)

Hi Shaanarora,

 

Thanks this works great, I can make subqueries for the different rankings, so that's perfect.

As an oldschool ANSI/Oracle SQL guy, could you explain me what the folowing syntax does, just so I understand what I'm building.

 

@rank:=@rank+1 AS Rank

 

I don't know the @rank expression, however I assume this line will set a variable @rank and do a +1 every row in the output of the query (hence rows excluded by a where closes will not impact the counter) and name it Rank.

 

(SELECT @rank:=0)b

 

The FROM statement is clear, except for this part, I can guess more or less what it does, but would like to understand the syntax.

 

In the query below, would it be possible to restart the counter every 'new' job_id. Basically I want a ranking for candidate_id, based on the close_date, for each job_id ?

 

2017-02-20_1142.png

 

Thanks,

 

J.

 

Solution
Accepted by topic author JoeriDP
‎02-21-2017 04:52 AM
Green Belt
Posts: 55
Registered: ‎11-18-2015

Re: Creating rankings based on a column in the Data Center (ETl or SQL)

You're correct about the @rank assumption, the @ symbol in MySQL sets a variable. So you wouldn't have to use @rank but you could use @ any word as your variable.

 

In the FROM statement, you have you name the output of every SELECT. You can try removing the b and it will give you this error: "Every derived table must have its own alias". In SQL itself you could use a SET statement to initialize the variable @rank as 0 but in MySQL this is how you would initialize a variable.

 

 

SELECT a.*, IF(@prev <> `job_id`, @rank := 1, @rank := @rank + 1) AS rank, @prev := `job_id`
FROM (select * from test_dataset order by `job_id`, `candidate_id`)a, (select @prev := NULL)b, (select @rank := 0)c;

 

Here's a sample I did for your last question, it first orders by job_id then by candidate_id (or whatever you would want to rank by) and then checks to see if the job_id has changed row over row to start over the ranking. Let me know if that helps!

White Belt
Posts: 8
Registered: ‎02-07-2017

Re: Creating rankings based on a column in the Data Center (ETl or SQL)

Thanks for the explanation, I understand it now.

I'm going to experiment with the new code you added. I'll let you know how it goes Smiley Wink

 

Joeri

White Belt
Posts: 8
Registered: ‎02-07-2017

Re: Creating rankings based on a column in the Data Center (ETl or SQL)

Worked like a charm, thanks a lot for the help.

 

Joeri

Announcements
Important! If you would like to update or change your Dojo user name simply click on My Settings here! Thank you!