I need to sample 800 of my 1000 rows for a training set, but indexing the data frame goes by column.
For example, df[1] returns the first column.
Q2dat = read.csv("Q2in.csv")
Q2dat = as.data.frame(Q2dat)
Q2datTrain = sample(Q2dat,0.8*nrow(Q2dat)) # this only lets me sample columns, so 800 is too many
Q2datTrain = sample(nrow(Q2dat),0.8*nrow(Q2dat)) # this returns 800 numbers between 1 and nrow(Q2dat), but not whole rows
I'm not sure how to change the data frame so that it indexes by rows instead of columns, or how to sample whole rows directly.
Turning the data frame into a matrix just creates 8000 values, and when I specify the number of rows for the matrix, the argument is reported as unused.
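One possible approach, as a minimal sketch: the numbers returned by sample(nrow(Q2dat), ...) are row positions, so they can be placed before the comma in Q2dat[rows, ] to pull out whole rows. The data below is a made-up stand-in for Q2in.csv (assumed here to be 1000 rows by 8 numeric columns), and the names n_train, train_idx, and Q2datTest are illustrative, not from the original code.

```r
# Hypothetical stand-in for Q2in.csv: 1000 rows, 8 numeric columns.
set.seed(42)  # only so the sketch is reproducible
Q2dat <- as.data.frame(matrix(rnorm(8000), nrow = 1000, ncol = 8))

# sample() on a single number n draws from 1:n without replacement,
# so this yields 800 distinct row numbers, not data values.
n_train <- floor(0.8 * nrow(Q2dat))
train_idx <- sample(nrow(Q2dat), n_train)

# Row numbers go before the comma; the empty column slot keeps every column.
Q2datTrain <- Q2dat[train_idx, ]

# Negative indices drop the sampled rows, leaving the other 200 as a test set.
Q2datTest <- Q2dat[-train_idx, ]

dim(Q2datTrain)  # 800 8
dim(Q2datTest)   # 200 8
```

With the real file, the first two lines would be replaced by the read.csv call. read.csv already returns a data frame, so the as.data.frame step should not be needed.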