Monday 13 November 2017

r - Convert data.frame columns from factors to characters


I have a data frame. Let's call him bob:



> head(bob)
                 phenotype                         exclusion
GSM399350 3- 4- 8- 25- 44+ 11b- 11c- 19- NK1.1- Gr1- TER119-
GSM399351 3- 4- 8- 25- 44+ 11b- 11c- 19- NK1.1- Gr1- TER119-
GSM399352 3- 4- 8- 25- 44+ 11b- 11c- 19- NK1.1- Gr1- TER119-
GSM399353 3- 4- 8- 25+ 44+ 11b- 11c- 19- NK1.1- Gr1- TER119-
GSM399354 3- 4- 8- 25+ 44+ 11b- 11c- 19- NK1.1- Gr1- TER119-
GSM399355 3- 4- 8- 25+ 44+ 11b- 11c- 19- NK1.1- Gr1- TER119-


I'd like to concatenate the rows of this data frame (this will be another question). But look:



> class(bob$phenotype)
[1] "factor"



Bob's columns are factors. So, for example:



> as.character(head(bob))
[1] "c(3, 3, 3, 6, 6, 6)"       "c(3, 3, 3, 3, 3, 3)"
[3] "c(29, 29, 29, 30, 30, 30)"


I don't begin to understand this, but I guess these are indices into the levels of the factors of the columns (of the court of King Caractacus) of bob? Not what I need.
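A minimal sketch that checks this guess on a small, made-up factor (the vector f and its labels are hypothetical, not taken from bob): a factor stores integer codes plus a table of levels, and as.character on the factor itself looks each code up in that table.

```r
# Hypothetical stand-in for a single factor column.
f <- factor(c("neg", "neg", "pos"))

as.integer(f)    # the underlying integer codes: 1 1 2
levels(f)        # the labels the codes index into: "neg" "pos"
as.character(f)  # the labels themselves: "neg" "neg" "pos"

# Looking the codes up in the levels gives the same result as as.character().
identical(levels(f)[as.integer(f)], as.character(f))  # TRUE
```

This is consistent with the manual, column-by-column conversion working: applied to one factor, as.character returns labels rather than codes.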



Strangely, I can go through the columns of bob by hand and do:




bob$phenotype <- as.character(bob$phenotype)


which works fine. And, after some typing, I can get a data.frame whose columns are characters rather than factors. So my question is: how can I do this automatically? How do I convert a data.frame with factor columns into a data.frame with character columns without having to manually go through each column?



Bonus question: why does the manual approach work?



Answer




Just following on from Matt and Dirk: if you want to convert your existing data frame without changing the global stringsAsFactors option, you can recreate it with an lapply call:




bob <- data.frame(lapply(bob, as.character), stringsAsFactors=FALSE)


This will convert all variables to class "character", including any numeric or logical columns. If you want to convert only the factors, see Marek's solution.
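One way to restrict the conversion to factor columns is sketched below. This is an illustration under assumptions, not necessarily Marek's exact code: the data frame df and its columns are made up, and is_fac is a hypothetical helper name.

```r
# Hypothetical example data with one factor column and one integer column.
df <- data.frame(
  phenotype = factor(c("25-", "25+")),
  count     = c(10L, 20L),
  stringsAsFactors = FALSE
)

# Logical vector marking which columns are factors.
is_fac <- vapply(df, is.factor, logical(1))

# Convert only those columns; the others are left untouched.
df[is_fac] <- lapply(df[is_fac], as.character)

str(df)
```

After this, phenotype is a character column while count is still an integer column.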



As @hadley points out, the following is more concise.



bob[] <- lapply(bob, as.character)



In both cases, lapply outputs a list. In the second case, however, assigning into bob[] replaces the columns of bob rather than overwriting the object itself, so bob keeps its data.frame class and its row names. This eliminates the need to convert back to a data.frame using data.frame with the argument stringsAsFactors = FALSE.
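The difference between plain assignment and assignment into [] can be seen in a small sketch. The data frames bob1 and bob2 are hypothetical two-row stand-ins for bob, built here only so the example runs on its own.

```r
# Hypothetical stand-ins for bob, with explicit factor columns.
bob1 <- data.frame(
  phenotype = factor(c("25-", "25+")),
  exclusion = factor(c("11b-", "11c-")),
  row.names = c("GSM399350", "GSM399353")
)
bob2 <- bob1

bob1   <- lapply(bob1, as.character)  # plain assignment: bob1 becomes a list
bob2[] <- lapply(bob2, as.character)  # [] assignment: columns replaced, class kept

class(bob1)         # "list"
class(bob2)         # "data.frame"
sapply(bob2, class) # both columns are now "character"
rownames(bob2)      # row names survive the [] assignment
```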


