Tuesday, 17 September 2013

How do I apply a function in R to certain columns of a data frame grouped by another column?

I've been looking at the help pages for tapply and by, and I'm not sure
whether they are the right tools for this. For example, suppose I have a data
frame with the columns Name, Value1 and Value2, and I want to apply a
function, say f <- function(x, y) { do_something }, to Value1 and Value2
grouped by Name. The result should be a data frame with the columns Name and
f(Value1, Value2). How should I go about that?
I can get tapply to work in a simple case like this:
tapply(df$Value1, df$Name, mean)
But what if my function also takes df$Value2 as input and is not as simple
as mean? In other words, pseudo-notation (not valid R) for what I'm trying
to do would be:
tapply(c(df$Value1, df$Value2), df$Name, function(x, y) { x + y + bla... })
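
One possible way to get the result described above is to split the data frame by Name and call the function on the two columns of each piece. The sketch below uses only base R. The toy values and the body of f are made up for illustration, and it assumes f returns a single number per group.

```r
# Toy data: the column names follow the question, the values are invented.
df <- data.frame(
  Name   = c("a", "a", "b", "b", "b"),
  Value1 = c(1, 2, 3, 4, 5),
  Value2 = c(10, 20, 30, 40, 50),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the question's f(x, y).
# Assumption: it reduces each group to exactly one number.
f <- function(x, y) sum(x * y)

# split() yields one sub-data-frame per Name, so both columns stay together.
pieces <- split(df, df$Name)

# Apply f to Value1 and Value2 of each piece and collect one row per group.
result <- data.frame(
  Name = names(pieces),
  f    = vapply(pieces, function(d) f(d$Value1, d$Value2), numeric(1)),
  row.names = NULL,
  stringsAsFactors = FALSE
)

print(result)  # f is 50 for group "a" and 500 for group "b"
```

The same per-group call can also be written as by(df, df$Name, function(d) f(d$Value1, d$Value2)), which passes each group's rows to the function as a data frame but returns a "by" object rather than a data frame.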
