How to remove special characters from cells and parse data in sql

Reply
Highlighted
White Belt

How to remove special characters from cells and parse data in sql

I have data stored as text and am wanting to convert it to int with-in SQL.  Currently the text reads (8'-6 11/16").  I am needing to remove the foot symbol ('), the dash (-), and the inches (") symbol.  Then take each number (stored as text) and convert to int.  At which point I would rebuild the cell to read 102 11/16".  If it is possible to convert each number (stored as text) to decimal, then I could do a easy conversion and rebuild into fraction format. The SUBSTRING() command is not recognized in the "Transform" of MySQL DataFLow.


Accepted Solutions
White Belt

Re: How to remove special characters from cells and parse data in sql

Thanks for your help,

 

I was able to use the SUBSTRING_INDEX() to get the specific substring I was looking for.  One issue I had was that my data does not always have a fraction (/) character.  So, I had to run a SUBSTRING_INDEX() to get everything between the (-) and (").

SUBSTRING_INDEX()_1.png

 

I then needed to get the fraction seperated.  So I used the same SUBSTRING_INDEX() method, but added the the commands: WHERE and REGEXP '/'.  With the WHERE and REGEXP command added I was able to pull only the fraction and not the whole number.

SUBSTRING_INDEX()_2.png

 

 

In line 18, 19, and 20 of my output table I was able to get only the fraction and skipped the rows that did not contain a fraction.  

 

Thanks for your help with the SUBSTRING_INDEX().


All Replies
AS Major Blue Belt
Major Blue Belt

Re: How to remove special characters from cells and parse data in sql

What kinds of problems are you having with SUBSTRING()?  I've used that plenty of times in dataflows, so I'm surprised to see you're having an issue.

What kind of variety do you see in your string data?  Do each of these parameters (feet, inches, fraction numerator and denominator) exist every time (like 8'-0 0/16" vs 8'  for eight feet)?  Or are the excluded if they won't exist as part of that measurement?

If they are consistent, a series of substring functions should work great.  SUBSTRING_INDEX() could work really well here, too.

Once you get SUBSTRING() or SUBSTRING_INDEX() working, using a dataflow to convert or cast to decimal should be simple.

 

substring.PNG

 

Aaron
MajorDomo @ Merit Medical

**Say "Thanks" by clicking the thumbs up in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"
Tags (3)
White Belt

Re: How to remove special characters from cells and parse data in sql

The SUBSTRING() does not work in my situation because I am trying to use it on a column with varying information.  In my column I will have information in just inches (8 1/2") as well as information in ft and in (5'-6 1/8").  I have been successful in using the LEFT() command to remove all the (") special characters.  I am unable to get the RIGHT() command to work properly.  I want to use RIGHT() to keep everything to the right of the (-) in 5'-6 1/8.  The LEFT() command removed the (") but when I try the RIGHT() command the result I get is (1/8).  I am missing the 6.

 

LEFT() CODE:

     SELECT LEFT(height, LOCATE('"',height)-1) AS length_in, name
     FROM 96280_shipcon_table

 

RIGHT() CODE:

      SELECT RIGHT(length_in, LOCATE('-',length_in)+1) AS length_in_fin, name
      FROM length_in

 

LEFT() Command.pngRIGHT() Command.png

 

Thanks for your help.

AS Major Blue Belt
Major Blue Belt

Re: How to remove special characters from cells and parse data in sql

Variability makes it more complex for sure.  Will there always be a fraction?  That could matter in your approach as well.

I tinkered a little bit.  There has to be a better way but see if some of this helps (also see attached):

select
'8''-4 3/16"' as length
, case when '8''-4 3/16"' like '%''%' -- if there are feet measurements
then cast(substring_index('8''-4 3/16"','''',1) as unsigned) -- take the digits until the feet character
else 0 -- otherwise no feet
end feet
, case when '8''-4 3/16"' not like '%''%' -- if there are no feet measurements
then cast(substring_index('8''-4 3/16"',' ',1) as unsigned) -- take everything up to the space
else TRIM(SUBSTRING('8''-4 3/16"', LOCATE('-','8''-4 3/16"')+1,2)) -- if there are feet measurements, take everything between the - and the space
end inches
, SUBSTRING('8''-4 3/16"', LOCATE(' ','8''-4 3/16"')+1, length('8''-4 3/16"')-LOCATE(' ','8''-4 3/16"')-1) inches_fraction
Aaron
MajorDomo @ Merit Medical

**Say "Thanks" by clicking the thumbs up in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"
White Belt

Re: How to remove special characters from cells and parse data in sql

Thanks for your help,

 

I was able to use the SUBSTRING_INDEX() to get the specific substring I was looking for.  One issue I had was that my data does not always have a fraction (/) character.  So, I had to run a SUBSTRING_INDEX() to get everything between the (-) and (").

SUBSTRING_INDEX()_1.png

 

I then needed to get the fraction seperated.  So I used the same SUBSTRING_INDEX() method, but added the the commands: WHERE and REGEXP '/'.  With the WHERE and REGEXP command added I was able to pull only the fraction and not the whole number.

SUBSTRING_INDEX()_2.png

 

 

In line 18, 19, and 20 of my output table I was able to get only the fraction and skipped the rows that did not contain a fraction.  

 

Thanks for your help with the SUBSTRING_INDEX().

Announcements
Domopalooza 2018! Pre-conference training registration is now open! Click here!