R Pivot Tables

3/7/2015

People love graphs and plots but I once the data begins to collect many factors and volume increases then it's a must for pivot tables. Now, the main reason I love R is because you can do the same-thing multiple ways. Depending on how I am feeling and the size of the data my methods change.

Pivot Methods

sql (sqldf)
data frame table function
reshapes cast
combination of data.table and reshape

# lets build some data roughly 20 million records will suffice.
test =structure(list(ID = c(23L, 23L, 23L, 23L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L),
CT = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 3L),
.Label = c("A", "B", "D"), class = "factor")), .Names = c("ID", "CT"),
class = "data.frame", row.names = c(NA, -18L))

for(i in 1:20){
test = rbind(test,test)
print(formatC(nrow(test), format="d", big.mark=','))
}

# R df table function
start = Sys.time()
df = as.data.frame.matrix(table(test$ID,test$CT))
df$ID = rownames(df)
df = df[ ,c(4,2,3,1)]
end = Sys.time()
end - start

# reshape cast
start = Sys.time()
test$CTVal = 1
subjmeans <- cast(test, ID~CT)
end = Sys.time()
end - start

0 Comments

R Pivot Tables

Leave a Reply.

Author

Archives

Categories