I have a table with ~500k rows; varchar(255) UTF8 column filename
contains a file name;
I'm trying to strip out various strange characters out of the filename - thought I'd use a character class: [^a-zA-Z0-9()_ .\-]
Now, is there a function in MySQL that lets you replace through a regular expression? I'm looking for a similar functionality to REPLACE() function - simplified example follows:
SELECT REPLACE('stackowerflow', 'ower', 'over');
Output: "stackoverflow"
/* does something like this exist? */
SELECT X_REG_REPLACE('Stackoverflow','/[A-Zf]/','-');
Output: "-tackover-low"
I know about REGEXP/RLIKE, but those only check if there is a match, not what the match is.
(I could do a "SELECT pkey_id,filename FROM foo WHERE filename RLIKE '[^a-zA-Z0-9()_ .\-]'
" from a PHP script, do a preg_replace
and then "UPDATE foo ... WHERE pkey_id=...
", but that looks like a last-resort slow & ugly hack)
Answer
With MySQL 8.0+ you could use natively REGEXP_REPLACE
function.
REGEXP_REPLACE(expr, pat, repl[, pos[, occurrence[, match_type]]])
Replaces occurrences in the string expr that match the regular expression specified by the pattern pat with the replacement string repl, and returns the resulting string. If expr, pat, or repl is
NULL
, the return value isNULL
.
and Regular expression support:
Previously, MySQL used the Henry Spencer regular expression library to support regular expression operators (
REGEXP
,RLIKE
).
Regular expression support has been reimplemented using International Components for Unicode (ICU), which provides full Unicode support and is multibyte safe. The
REGEXP_LIKE()
function performs regular expression matching in the manner of theREGEXP
andRLIKE
operators, which now are synonyms for that function. In addition, theREGEXP_INSTR()
,REGEXP_REPLACE()
, andREGEXP_SUBSTR()
functions are available to find match positions and perform substring substitution and extraction, respectively.
SELECT REGEXP_REPLACE('Stackoverflow','[A-Zf]','-',1,0,'c');
-- Output:
-tackover-low
No comments:
Post a Comment