Wednesday 8 May 2019

find files with non-ascii chars in file name




Is there a way to find files with non-ASCII characters in their names? I could of course pipe the output of find through a Perl filter, but for efficiency I'd like to do it all in find. I tried the following:



find . -type f -name '*[^[:ascii:]]*'


It doesn't work at all.



Edit:




I'm now trying to make use of



find . -type f -regex '.*[^[:ascii:]].*'


The default -regex syntax is Emacs-style, and Emacs regexps have an [:ascii:] class. But the expression I'm trying to use doesn't work.



Edit 2:




LC_COLLATE=C find . -type f -regex '.*[^!-~].*'


This matches files with non-ASCII characters (complete voodoo...), but it also matches files with a space in the name.
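
A likely reason for the space matches: in the C locale a bracket range follows byte values, and the space character sorts just below "!", so it falls outside the range !-~ and is caught by the negation. A minimal sketch to check the byte values, where code_of is a hypothetical helper name:

```shell
# code_of is a hypothetical helper, not a standard command.
# POSIX printf: an argument that starts with a quote character is
# converted to the numeric value of the character that follows it.
code_of() { printf '%d\n' "'$1"; }

code_of ' '   # prints 32, one below the start of the range !-~
code_of '!'   # prints 33
code_of '~'   # prints 126
```

Starting the range at the space character instead, as in ' -~', would cover every printable ASCII character.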


Answer



This seems to work for me in both default and posix-extended mode:



LC_COLLATE=C find . -regex '.*[^ -~].*'



There could be locale-related issues, though. I don't have a large corpus of non-ASCII filenames to test it on, but it catches the ones I have.
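
For anyone who wants to try this on throwaway data first, here is a rough sketch. The function name find_non_ascii and the demo file names are made up for illustration. It also sets LC_ALL=C rather than only LC_COLLATE, which is an assumption on my part: it forces find to treat names as raw bytes, which may sidestep some of the locale issues mentioned above.

```shell
#!/bin/sh
# find_non_ascii is a hypothetical helper name, not part of find.
# Assumption: LC_ALL=C is used instead of LC_COLLATE=C alone, so that file
# names are compared byte by byte and ' ' to '~' covers bytes 0x20-0x7E.
# -type f is added here to report regular files only.
find_non_ascii() {
    LC_ALL=C find "${1:-.}" -type f -regex '.*[^ -~].*'
}

# Build a throwaway directory with three sample names.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.txt" "$demo_dir/with space.txt"
touch "$demo_dir/$(printf 'caf\303\251.txt')"   # UTF-8 bytes for an accented e

# Only the name containing bytes outside the printable ASCII range is listed.
find_non_ascii "$demo_dir"

rm -rf "$demo_dir"
```

Note that -regex is matched against the whole path, so a file with a plain name is also reported if one of its parent directories has a non-ASCII name.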

