Difference between as.data.frame(x) and data.frame(x)

Question

What is the difference between as.data.frame(x) and data.frame(x)?

In the following example, the result is identical except for the column names.

> x <- matrix(data=rep(1,9),nrow=3,ncol=3)
> x
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    1
[3,]    1    1    1
> data.frame(x)
  X1 X2 X3
1  1  1  1
2  1  1  1
3  1  1  1
> as.data.frame(x)
  V1 V2 V3
1  1  1  1
2  1  1  1
3  1  1  1

Brandon Bertelsen · Answer

As Jaap mentioned, data.frame() calls as.data.frame(), but there is a reason for that:

as.data.frame() is an S3 generic for coercing other objects to class data.frame. If you are writing your own package, you would store your method for converting an object of your_class under as.data.frame.your_class(). Here are a few examples.

methods(as.data.frame)
 [1] as.data.frame.AsIs            as.data.frame.Date
 [3] as.data.frame.POSIXct         as.data.frame.POSIXlt
 [5] as.data.frame.aovproj*        as.data.frame.array
 [7] as.data.frame.character       as.data.frame.complex
 [9] as.data.frame.data.frame      as.data.frame.default
[11] as.data.frame.difftime        as.data.frame.factor
[13] as.data.frame.ftable*         as.data.frame.integer
[15] as.data.frame.list            as.data.frame.logLik*
[17] as.data.frame.logical         as.data.frame.matrix
[19] as.data.frame.model.matrix    as.data.frame.numeric
[21] as.data.frame.numeric_version as.data.frame.ordered
[23] as.data.frame.raw             as.data.frame.table
[25] as.data.frame.ts              as.data.frame.vector

   Non-visible functions are asterisked
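To make the as.data.frame.your_class() idea concrete, here is a minimal sketch. The class name temperature_log and its fields are hypothetical and exist only for illustration; in a real package the method would also need to be registered in the NAMESPACE file.

```r
# Hypothetical S3 class, invented for this example only.
new_temperature_log <- function(times, values) {
  structure(list(times = times, values = values), class = "temperature_log")
}

# S3 method: as.data.frame() dispatches here for "temperature_log" objects.
# The signature mirrors the generic: as.data.frame(x, row.names, optional, ...).
as.data.frame.temperature_log <- function(x, row.names = NULL,
                                          optional = FALSE, ...) {
  data.frame(time = x$times, value = x$values, row.names = row.names)
}

tl <- new_temperature_log(times = 1:3, values = c(20.5, 21.0, 19.8))
df <- as.data.frame(tl)  # calls as.data.frame.temperature_log()
str(df)
```

Users of the class can then call the familiar as.data.frame() without knowing how the object is stored internally.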

Elliott · Answer

data.frame() can be used to build a data frame, whereas as.data.frame() can only be used to coerce another object to a data frame.

For example:

# data.frame()
df1 <- data.frame(matrix(1:12,3,4),1:3)

# as.data.frame()
df2 <- as.data.frame(matrix(1:12,3,4),1:3)

df1
#   X1 X2 X3 X4 X1.3
# 1  1  4  7 10    1
# 2  2  5  8 11    2
# 3  3  6  9 12    3

df2
#   V1 V2 V3 V4
# 1  1  4  7 10
# 2  2  5  8 11
# 3  3  6  9 12
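One way to see why the second argument disappears in df2: the second positional argument of as.data.frame() is row.names, not another column. The sketch below uses character row names so the effect is visible; the variable names and the id column name are made up for the example.

```r
m <- matrix(1:12, 3, 4)

# data.frame(): every extra argument becomes a column.
df_cols <- data.frame(m, id = 1:3)

# as.data.frame(): the second positional argument is row.names.
df_rows <- as.data.frame(m, c("a", "b", "c"))

names(df_cols)     # four matrix columns plus "id"
rownames(df_rows)  # the supplied row names
ncol(df_rows)      # still only the four matrix columns
```

In the original example the row names 1:3 coincide with the default ones, which is why the argument appears to be silently ignored.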

James · Answer

As you noted, the result differs slightly, which means the two are not exactly equal:

identical(data.frame(x),as.data.frame(x))
[1] FALSE

So you may need to take care over which one you use.

But it is also worth noting that as.data.frame is faster:

library(microbenchmark)
microbenchmark(data.frame(x),as.data.frame(x))
Unit: microseconds
             expr    min     lq median      uq     max neval
    data.frame(x) 71.446 73.616  74.80 78.9445 146.442   100
 as.data.frame(x) 25.657 27.631  28.42 29.2100  93.155   100

y <- matrix(1:1e6,1000,1000)
microbenchmark(data.frame(y),as.data.frame(y))
Unit: milliseconds
             expr      min       lq   median       uq       max neval
    data.frame(y) 17.23943 19.63163 23.60193 41.07898 130.66005   100
 as.data.frame(y) 10.83469 12.56357 14.04929 34.68608  38.37435   100
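A quick way to check where the two results differ is to compare the pieces separately. This is only a sketch for the unnamed 3x3 matrix from the question; the variable names are arbitrary.

```r
x <- matrix(1, nrow = 3, ncol = 3)
a <- data.frame(x)
b <- as.data.frame(x)

names(a)  # X1 X2 X3
names(b)  # V1 V2 V3

# The cell values and the dimensions agree; only the column names differ.
same_values <- all(as.matrix(a) == as.matrix(b))
same_dims <- identical(dim(a), dim(b))
same_values
same_dims
```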

user3276000 · Answer

Try

colnames(x) <- c("C1","C2","C3")

and then both will give the same result

identical(data.frame(x), as.data.frame(x))

What is more surprising are things like the following:

list(x)

gives a one-element list, the element being the matrix x; whereas

as.list(x)

gives a list with 9 elements, one for each matrix entry
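The difference is easy to confirm by checking lengths. A small sketch, using the same 3x3 matrix of ones as in the question (the variable names are arbitrary):

```r
x <- matrix(1, nrow = 3, ncol = 3)

wrapped <- list(x)      # wraps the whole matrix as a single element
split_up <- as.list(x)  # coerces: one list element per matrix cell

length(wrapped)          # 1
length(split_up)         # 9
is.matrix(wrapped[[1]])  # TRUE: the element is still the matrix
```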

MM

jtr13 · Answer

Looking at the code, as.data.frame fails faster. data.frame will issue warnings and do things like drop the row names if there are duplicates:

> x <- matrix(data=rep(1,9),nrow=3,ncol=3)
> rownames(x) <- c("a", "b", "b")
> data.frame(x)
  X1 X2 X3
1  1  1  1
2  1  1  1
3  1  1  1
Warning message:
In data.row.names(row.names, rowsi, i) :
  some row.names duplicated: 3 --> row.names NOT used
> as.data.frame(x)
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, :
  duplicate row.names: b