Monday, 8 January 2018

r - How to trim leading and trailing whitespace?


I am having some trouble with leading and trailing whitespace in a data.frame.

For example, I want to look at a specific row in a data.frame based on a certain condition:



> myDummy[myDummy$country == c("Austria"),c(1,2,3:7,19)]

[1] codeHelper     country        dummyLI        dummyLMI       dummyUMI
[6] dummyHInonOECD dummyHIOECD    dummyOECD
<0 rows> (or 0-length row.names)


I was wondering why I didn't get the expected output, since the country Austria obviously existed in my data.frame. After looking through my code history and trying to figure out what went wrong, I tried:




> myDummy[myDummy$country == c("Austria "),c(1,2,3:7,19)]
   codeHelper  country dummyLI dummyLMI dummyUMI dummyHInonOECD dummyHIOECD
18        AUT Austria        0        0        0              0           1
   dummyOECD
18         1


All I changed in the command was adding a trailing space after Austria.




Further annoying problems obviously arise, for example when I want to merge two data frames based on the country column. One data.frame uses "Austria " while the other has "Austria", so the matching doesn't work.




  1. Is there a nice way to 'show' the whitespace on my screen so that I am aware of the problem?

  2. And can I remove the leading and trailing whitespace in R?



So far I have used a simple Perl script to remove the whitespace, but it would be nice if I could somehow do it inside R.


Answer



Probably the best way is to handle the trailing whitespace when you read your data file. If you use read.csv or read.table, you can set the parameter strip.white=TRUE.
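
As a rough sketch of what strip.white does at read time: the CSV text below is made up for illustration (only the codeHelper and country column names come from the question), and text= is used instead of a file so the snippet runs on its own.

```r
# Hypothetical CSV content with padded country names (not the asker's real data)
csv_text <- "codeHelper,country\nAUT,Austria \nBEL,  Belgium\n"

# Default behaviour: whitespace inside character fields is kept
raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# strip.white = TRUE trims leading and trailing whitespace from unquoted fields
cleaned <- read.csv(text = csv_text, stringsAsFactors = FALSE, strip.white = TRUE)

print(raw$country)
print(cleaned$country)
```

Note that strip.white only applies to unquoted fields, so values stored in the file as "Austria " inside quotes would still need trimming afterwards.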




If you want to clean strings afterwards, you could use one of these functions:



# returns string w/o leading whitespace
trim.leading <- function (x) sub("^\\s+", "", x)

# returns string w/o trailing whitespace
trim.trailing <- function (x) sub("\\s+$", "", x)

# returns string w/o leading or trailing whitespace
trim <- function (x) gsub("^\\s+|\\s+$", "", x)


To use one of these functions on myDummy$country:

myDummy$country <- trim(myDummy$country)
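
For the merge problem from the question, here is a minimal sketch with two invented data frames (frame names a and b and their values are assumptions, not the asker's data). It uses base R's trimws(), available since R 3.2.0, which is an alternative to the helper functions above rather than part of the original answer.

```r
# Hypothetical data frames; one has a trailing space in "Austria "
a <- data.frame(country = c("Austria ", "Belgium"), codeHelper = c("AUT", "BEL"),
                stringsAsFactors = FALSE)
b <- data.frame(country = c("Austria", "Belgium"), dummyOECD = c(1, 1),
                stringsAsFactors = FALSE)

# Before trimming, "Austria " and "Austria" do not match, so only Belgium merges
n_before <- nrow(merge(a, b, by = "country"))

# trimws() removes whitespace on both sides by default (which = "both")
a$country <- trimws(a$country)
merged <- merge(a, b, by = "country")

print(n_before)
print(merged)
```

Use which = "left" or which = "right" if only one side should be trimmed.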




To 'show' the whitespace you could use:

paste(myDummy$country)

which will show you the strings surrounded by quotation marks ("), making whitespace easier to spot.
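
If you would rather list only the affected values, something along these lines might help. It is a sketch on a made-up vector (country here is a stand-in for myDummy$country), and the bracket wrapping is just one way to make the padding visible.

```r
# Hypothetical values standing in for myDummy$country
country <- c("Austria ", " Belgium", "Chile")

# TRUE for entries that start or end with whitespace
has_padding <- grepl("^\\s+|\\s+$", country)

# Wrap each value in brackets so leading/trailing spaces stand out when printed
bracketed <- sprintf("[%s]", country)

print(bracketed[has_padding])
```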

