Assign multiple columns using := in data.table, by group

Question

What is the best way to assign to multiple columns using data.table? For example:

f <- function(x) {c("hi", "hello")}
x <- data.table(id = 1:10)

I would like to do something like this (of course, this syntax is incorrect):

x[ , (col1, col2) := f(), by = "id"]

To extend this, I may have many columns whose names are stored in a variable (say col_names), and I would like to do:

x[ , col_names := another_f(), by = "id", with = FALSE]

What is the right way to do something like this?

Matt Dowle · Accepted Answer

This now works in v1.8.3 on R-Forge. Thanks for highlighting it!

library(data.table)

x <- data.table(a = 1:3, b = 1:6)
# f returns a list: one element per column being assigned
f <- function(x) {list("hi", "hello")}

x[ , c("col1", "col2") := f(), by = a][]
#    a b col1  col2
# 1: 1 1   hi hello
# 2: 2 2   hi hello
# 3: 3 3   hi hello
# 4: 1 4   hi hello
# 5: 2 5   hi hello
# 6: 3 6   hi hello

x[ , c("mean", "sum") := list(mean(b), sum(b)), by = a][]
#    a b col1  col2 mean sum
# 1: 1 1   hi hello  2.5   5
# 2: 2 2   hi hello  3.5   7
# 3: 3 3   hi hello  4.5   9
# 4: 1 4   hi hello  2.5   5
# 5: 2 5   hi hello  3.5   7
# 6: 3 6   hi hello  4.5   9

# Column names held in a variable: wrap the variable in parentheses
mynames = c("Name1", "Longer%")
x[ , (mynames) := list(mean(b) * 4, sum(b) * 3), by = a][]
#    a b col1  col2 mean sum Name1 Longer%
# 1: 1 1   hi hello  2.5   5    10      15
# 2: 2 2   hi hello  3.5   7    14      21
# 3: 3 3   hi hello  4.5   9    18      27
# 4: 1 4   hi hello  2.5   5    10      15
# 5: 2 5   hi hello  3.5   7    14      21
# 6: 3 6   hi hello  4.5   9    18      27

# Older form using the `with` argument (deprecated in later data.table
# releases; prefer the parenthesized `(mynames) :=` form)
x[ , mynames := list(mean(b) * 4, sum(b) * 3), by = a, with = FALSE][] # same
#    a b col1  col2 mean sum Name1 Longer%
# 1: 1 1   hi hello  2.5   5    10      15
# 2: 2 2   hi hello  3.5   7    14      21
# 3: 3 3   hi hello  4.5   9    18      27
# 4: 1 4   hi hello  2.5   5    10      15
# 5: 2 5   hi hello  3.5   7    14      21
# 6: 3 6   hi hello  4.5   9    18      27

x[ , get("mynames") := list(mean(b) * 4, sum(b) * 3), by = a][] # same
#    a b col1  col2 mean sum Name1 Longer%
# 1: 1 1   hi hello  2.5   5    10      15
# 2: 2 2   hi hello  3.5   7    14      21
# 3: 3 3   hi hello  4.5   9    18      27
# 4: 1 4   hi hello  2.5   5    10      15
# 5: 2 5   hi hello  3.5   7    14      21
# 6: 3 6   hi hello  4.5   9    18      27

x[ , eval(mynames) := list(mean(b) * 4, sum(b) * 3), by = a][] # same
#    a b col1  col2 mean sum Name1 Longer%
# 1: 1 1   hi hello  2.5   5    10      15
# 2: 2 2   hi hello  3.5   7    14      21
# 3: 3 3   hi hello  4.5   9    18      27
# 4: 1 4   hi hello  2.5   5    10      15
# 5: 2 5   hi hello  3.5   7    14      21
# 6: 3 6   hi hello  4.5   9    18      27
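To tie this back to the second half of the question (names stored in col_names, values from some another_f()), here is a minimal sketch. The body of another_f, the sample data, and the column names b_min, b_max, b_n are made up for illustration; the only parts taken from the answer are that the right-hand side is a list with one element per target column and that the variable holding the names is wrapped in parentheses.

```r
library(data.table)

# Hypothetical stand-in for the question's another_f(): returns a list with
# one element per target column, computed from one group's values.
another_f <- function(v) {
  list(min(v), max(v), length(v))
}

# Made-up sample data: two groups of different sizes.
x <- data.table(id = c(1L, 1L, 2L, 2L, 2L), b = c(10, 20, 1, 2, 3))

# Column names held in a variable, as in the question.
col_names <- c("b_min", "b_max", "b_n")

# Parentheses make data.table use the names stored in col_names instead of
# creating a single column literally called "col_names".
x[, (col_names) := another_f(b), by = id]

print(x)
```

Each list element is a single value per group here, so it is recycled across the rows of that group.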

Gerry · Answer

The following shorthand notation might be useful. All credit goes to Andrew Brooks, specifically this article.

dt[,`:=`(avg=mean(mpg), med=median(mpg), min=min(mpg)), by=cyl]
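The snippet above does not say what dt is. The column names mpg and cyl suggest the built-in mtcars data set, so the runnable sketch below assumes that; treat the definition of dt as a guess rather than part of the original answer.

```r
library(data.table)

# Assumption: dt is the built-in mtcars data set converted to a data.table.
dt <- as.data.table(mtcars)

# Functional form of `:=`: each name = value pair adds one column,
# computed separately for each group of cyl.
dt[, `:=`(avg = mean(mpg), med = median(mpg), min = min(mpg)), by = cyl]

# Show one row per cyl group to inspect the new columns.
print(unique(dt[, .(cyl, avg, med, min)]))
```

This form keeps each column name next to its expression, which can be easier to read than matching a vector of names on the left against a list on the right.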