2016-12-22 7 views
-4

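Create multiple output data frames from multiple input data frames in a generic way

I have n input data frames, each with a TimeStamp column + k numeric columns. I want to convert them into k output data frames, each with one TimeStamp column + n numeric columns, such that numeric column i of output data frame j holds the values of numeric column j of input data frame i (the column indices exclude the TimeStamp column, which is always the first column). Missing entries must be filled with NAs.

The first column of these data frames is always the TimeStamp column. The input data frames may have different numbers of rows and different TimeStamps, and TimeStamps may overlap between data frames.

For example, with n=2 we have data frames d1, d2 with the following structure (the sample data below uses k=4; k can be arbitrary but is the same for every data frame). Each data frame is stored in a separate CSV file: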

d1 <- structure(list(TimeStamp = structure(1:6, .Label = c("2016-12-20 10:17:20", "2016-12-20 10:19:20", "2016-12-20 10:19:40", "2016-12-20 10:20:00", "2016-12-20 10:20:20", "2016-12-20 10:20:40", "2016-12-20 10:21:00", 
"2016-12-20 10:21:20", "2016-12-20 10:21:40", "2016-12-20 10:22:00", 
"2016-12-20 10:22:20", "2016-12-20 10:22:40", "2016-12-20 10:23:00", 
"2016-12-20 10:23:20", "2016-12-20 10:23:40", "2016-12-20 10:24:00", 
"2016-12-20 10:24:20", "2016-12-20 10:24:40", "2016-12-20 10:25:00", 
"2016-12-20 10:25:20", "2016-12-20 10:25:40", "2016-12-20 10:26:00", 
"2016-12-20 10:26:20", "2016-12-20 10:26:40", "2016-12-20 10:27:00", 
"2016-12-20 10:27:20", "2016-12-20 10:27:40", "2016-12-20 10:28:00", 
"2016-12-20 10:28:20", "2016-12-20 10:28:40", "2016-12-20 10:29:00", 
"2016-12-20 10:29:20", "2016-12-20 10:29:40", "2016-12-20 10:30:00", 
"2016-12-20 10:30:20", "2016-12-20 10:30:40", "2016-12-20 10:31:00", 
"2016-12-20 10:31:20", "2016-12-20 10:31:40", "2016-12-20 10:32:00", 
"2016-12-20 10:32:20", "2016-12-20 10:32:40", "2016-12-20 10:33:00", 
"2016-12-20 10:33:20", "2016-12-20 10:33:40", "2016-12-20 10:34:00", 
"2016-12-20 10:34:20", "2016-12-20 10:34:40", "2016-12-20 10:35:00", 
"2016-12-20 10:35:20", "2016-12-20 10:35:40", "2016-12-20 10:36:00", 
"2016-12-20 10:37:00", "2016-12-20 10:37:20", "2016-12-20 10:37:40", 
"2016-12-20 10:38:00", "2016-12-20 10:38:20", "2016-12-20 10:40:40", 
"2016-12-20 10:41:20", "2016-12-20 10:41:40", "2016-12-20 10:44:20", 
"2016-12-20 10:44:40", "2016-12-20 10:46:00", "2016-12-20 10:49:40", 
"2016-12-20 10:50:00", "2016-12-20 10:50:20", "2016-12-20 10:55:00", 
"2016-12-20 10:56:00", "2016-12-20 10:57:20", "2016-12-20 10:59:20", 
"2016-12-20 10:59:40", "2016-12-20 11:00:20", "2016-12-20 11:01:20", 
"2016-12-20 11:05:40", "2016-12-20 11:06:00", "2016-12-20 11:07:20", 
"2016-12-20 11:08:20", "2016-12-20 11:08:40", "2016-12-20 11:11:40", 
"2016-12-20 11:12:00", "2016-12-20 11:14:20", "2016-12-20 11:14:40", 
"2016-12-20 11:15:00", "2016-12-20 11:15:20", "2016-12-20 11:15:40", 
"2016-12-20 11:16:00", "2016-12-20 11:16:20", "2016-12-20 11:18:20", 
"2016-12-20 11:18:40", "2016-12-20 11:19:00", "2016-12-20 11:19:20", 
"2016-12-20 11:19:40", "2016-12-20 11:21:20", "2016-12-20 11:21:40", 
"2016-12-20 11:22:20", "2016-12-20 11:22:40", "2016-12-20 11:23:00", 
"2016-12-20 11:23:20", "2016-12-20 11:25:00", "2016-12-20 11:25:20", 
"2016-12-20 11:26:00", "2016-12-20 11:26:40", "2016-12-20 11:27:00", 
"2016-12-20 11:27:20", "2016-12-20 11:27:40", "2016-12-20 11:28:00", 
"2016-12-20 11:28:20", "2016-12-20 11:28:40", "2016-12-20 11:34:40", 
"2016-12-20 11:36:20", "2016-12-20 11:36:40", "2016-12-20 11:41:00", 
"2016-12-20 11:41:20", "2016-12-20 11:42:20", "2016-12-20 11:42:40", 
"2016-12-20 11:46:40", "2016-12-20 11:47:00", "2016-12-20 11:47:20", 
"2016-12-20 11:47:40", "2016-12-20 11:48:00", "2016-12-20 11:48:20", 
"2016-12-20 11:48:40", "2016-12-20 11:54:00", "2016-12-20 11:54:20", 
"2016-12-20 11:57:40", "2016-12-20 12:00:00", "2016-12-20 12:00:40", 
"2016-12-20 12:01:00", "2016-12-20 12:01:20", "2016-12-20 12:01:40", 
"2016-12-20 12:02:20", "2016-12-20 12:02:40", "2016-12-20 12:03:00", 
"2016-12-20 12:03:20", "2016-12-20 12:03:40", "2016-12-20 12:07:00", 
"2016-12-20 12:07:20", "2016-12-20 12:07:40", "2016-12-20 12:08:00", 
"2016-12-20 12:08:20", "2016-12-20 12:10:20", "2016-12-20 12:10:40" 
), class = "factor"), b1 = c(-76L, 0L, 0L, -76L, -80L, -81L), 
    b2 = c(0L, -74L, -79L, -73L, -79L, -77L), b3 = c(0L, 0L, 
    -88L, -88L, -91L, 0L), b4 = c(0L, 0L, 0L, -78L, -80L, -78L 
    )), .Names = c("TimeStamp", "b1", "b2", "b3", "b4"), row.names = c(NA, 
6L), class = "data.frame") 

head(d1) 
#   TimeStamp b1 b2 b3 b4 
#1 2016-12-20 10:17:20 -76 0 0 0 
#2 2016-12-20 10:19:20 0 -74 0 0 
#3 2016-12-20 10:19:40 0 -79 -88 0 
#4 2016-12-20 10:20:00 -76 -73 -88 -78 
#5 2016-12-20 10:20:20 -80 -79 -91 -80 
#6 2016-12-20 10:20:40 -81 -77 0 -78 

d2 <- structure(list(TimeStamp = structure(137:142, .Label = c("2016-12-20 10:17:20", 
"2016-12-20 10:19:20", "2016-12-20 10:19:40", "2016-12-20 10:20:00", 
"2016-12-20 10:20:20", "2016-12-20 10:20:40", "2016-12-20 10:21:00", 
"2016-12-20 10:21:20", "2016-12-20 10:21:40", "2016-12-20 10:22:00", 
"2016-12-20 10:22:20", "2016-12-20 10:22:40", "2016-12-20 10:23:00", 
"2016-12-20 10:23:20", "2016-12-20 10:23:40", "2016-12-20 10:24:00", 
"2016-12-20 10:24:20", "2016-12-20 10:24:40", "2016-12-20 10:25:00", 
"2016-12-20 10:25:20", "2016-12-20 10:25:40", "2016-12-20 10:26:00", 
"2016-12-20 10:26:20", "2016-12-20 10:26:40", "2016-12-20 10:27:00", 
"2016-12-20 10:27:20", "2016-12-20 10:27:40", "2016-12-20 10:28:00", 
"2016-12-20 10:28:20", "2016-12-20 10:28:40", "2016-12-20 10:29:00", 
"2016-12-20 10:29:20", "2016-12-20 10:29:40", "2016-12-20 10:30:00", 
"2016-12-20 10:30:20", "2016-12-20 10:30:40", "2016-12-20 10:31:00", 
"2016-12-20 10:31:20", "2016-12-20 10:31:40", "2016-12-20 10:32:00", 
"2016-12-20 10:32:20", "2016-12-20 10:32:40", "2016-12-20 10:33:00", 
"2016-12-20 10:33:20", "2016-12-20 10:33:40", "2016-12-20 10:34:00", 
"2016-12-20 10:34:20", "2016-12-20 10:34:40", "2016-12-20 10:35:00", 
"2016-12-20 10:35:20", "2016-12-20 10:35:40", "2016-12-20 10:36:00", 
"2016-12-20 10:37:00", "2016-12-20 10:37:20", "2016-12-20 10:37:40", 
"2016-12-20 10:38:00", "2016-12-20 10:38:20", "2016-12-20 10:40:40", 
"2016-12-20 10:41:20", "2016-12-20 10:41:40", "2016-12-20 10:44:20", 
"2016-12-20 10:44:40", "2016-12-20 10:46:00", "2016-12-20 10:49:40", 
"2016-12-20 10:50:00", "2016-12-20 10:50:20", "2016-12-20 10:55:00", 
"2016-12-20 10:56:00", "2016-12-20 10:57:20", "2016-12-20 10:59:20", 
"2016-12-20 10:59:40", "2016-12-20 11:00:20", "2016-12-20 11:01:20", 
"2016-12-20 11:05:40", "2016-12-20 11:06:00", "2016-12-20 11:07:20", 
"2016-12-20 11:08:20", "2016-12-20 11:08:40", "2016-12-20 11:11:40", 
"2016-12-20 11:12:00", "2016-12-20 11:14:20", "2016-12-20 11:14:40", 
"2016-12-20 11:15:00", "2016-12-20 11:15:20", "2016-12-20 11:15:40", 
"2016-12-20 11:16:00", "2016-12-20 11:16:20", "2016-12-20 11:18:20", 
"2016-12-20 11:18:40", "2016-12-20 11:19:00", "2016-12-20 11:19:20", 
"2016-12-20 11:19:40", "2016-12-20 11:21:20", "2016-12-20 11:21:40", 
"2016-12-20 11:22:20", "2016-12-20 11:22:40", "2016-12-20 11:23:00", 
"2016-12-20 11:23:20", "2016-12-20 11:25:00", "2016-12-20 11:25:20", 
"2016-12-20 11:26:00", "2016-12-20 11:26:40", "2016-12-20 11:27:00", 
"2016-12-20 11:27:20", "2016-12-20 11:27:40", "2016-12-20 11:28:00", 
"2016-12-20 11:28:20", "2016-12-20 11:28:40", "2016-12-20 11:34:40", 
"2016-12-20 11:36:20", "2016-12-20 11:36:40", "2016-12-20 11:41:00", 
"2016-12-20 11:41:20", "2016-12-20 11:42:20", "2016-12-20 11:42:40", 
"2016-12-20 11:46:40", "2016-12-20 11:47:00", "2016-12-20 11:47:20", 
"2016-12-20 11:47:40", "2016-12-20 11:48:00", "2016-12-20 11:48:20", 
"2016-12-20 11:48:40", "2016-12-20 11:54:00", "2016-12-20 11:54:20", 
"2016-12-20 11:57:40", "2016-12-20 12:00:00", "2016-12-20 12:00:40", 
"2016-12-20 12:01:00", "2016-12-20 12:01:20", "2016-12-20 12:01:40", 
"2016-12-20 12:02:20", "2016-12-20 12:02:40", "2016-12-20 12:03:00", 
"2016-12-20 12:03:20", "2016-12-20 12:03:40", "2016-12-20 12:07:00", 
"2016-12-20 12:07:20", "2016-12-20 12:07:40", "2016-12-20 12:08:00", 
"2016-12-20 12:08:20", "2016-12-20 12:10:20", "2016-12-20 12:10:40" 
), class = "factor"), b1 = c(-76L, 0L, 0L, 0L, -82L, -74L), b2 = c(-87L, 
-76L, 0L, 0L, 0L, -69L), b3 = c(0L, 0L, -84L, -84L, 0L, -85L), 
    b4 = c(-75L, 0L, 0L, 0L, 0L, 0L)), .Names = c("TimeStamp", 
"b1", "b2", "b3", "b4"), row.names = c(NA, 6L), class = "data.frame") 

head(d2)  
#    TimeStamp b1 b2 b3 b4 
# 1 2016-12-20 12:07:20 -76 -87 0 -75 
# 2 2016-12-20 12:07:40 0 -76 0 0 
# 3 2016-12-20 12:08:00 0 0 -84 0 
# 4 2016-12-20 12:08:20 0 0 -84 0 
# 5 2016-12-20 12:10:20 -82 0 0 0 
# 6 2016-12-20 12:10:40 -74 -69 -85 0 

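Now I want to have k data frames, each with n numeric columns (to be saved as separate CSV files). For example, from the inputs d1, d2 above I want the following output data frames b1, b2, b3, b4 (two of them are shown):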

b1  
    #   TimeStamp d1 d2 
    #2016-12-20 10:17:20 -76 NA 
    #2016-12-20 10:19:20 0 NA 
    #2016-12-20 10:19:40 0 NA 
    #2016-12-20 10:20:00 -76 NA 
    #2016-12-20 10:20:20 -80 NA 
    #2016-12-20 10:20:40 -81 NA 
    #2016-12-20 12:07:20 NA -76 
    #2016-12-20 12:07:40 NA 0 
    #2016-12-20 12:08:00 NA 0 
    #2016-12-20 12:08:20 NA 0 
    #2016-12-20 12:10:20 NA -82 
    #2016-12-20 12:10:40 NA -74 

    b2  
    #   TimeStamp d1 d2 
    #2016-12-20 10:17:20 0 NA 
    #2016-12-20 10:19:20 -74 NA 
    #2016-12-20 10:19:40 -79 NA 
    #2016-12-20 10:20:00 -73 NA 
    #2016-12-20 10:20:20 -79 NA 
    #2016-12-20 10:20:40 -77 NA 
    #2016-12-20 12:07:20 NA -87 
    #2016-12-20 12:07:40 NA -76 
    #2016-12-20 12:08:00 NA 0 
    #2016-12-20 12:08:20 NA 0 
    #2016-12-20 12:10:20 NA 0 
    #2016-12-20 12:10:40 NA -69 

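In the given example the timestamps of the different data frames are disjoint, but in general timestamps from different data frames will overlap. In that case the entry does not need to be filled with NA (since a numeric value is present).

What is the simplest, most efficient and most generic way to do this (without loops, using base R / dplyr / tidyr / data.table)? n, k and the data frames can be arbitrarily large.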

+1

Maybe something like `Map(data.frame, d1 = d1, d2 = d2)`? – Sotos

+0

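That gives the following error, because the data frames do not have the same number of rows: `Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 185, 142`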
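
The row-count error above can be reproduced on toy data. The following is a minimal sketch with two made-up data frames `x` and `y` (hypothetical names and values, not the asker's files); it only illustrates that `Map(data.frame, ...)` pairs columns positionally and therefore needs inputs with equal numbers of rows.

```r
# Two toy inputs (assumed for illustration) with the same columns
# but different numbers of rows
x <- data.frame(TimeStamp = c("t1", "t2", "t3"), b1 = 1:3, stringsAsFactors = FALSE)
y <- data.frame(TimeStamp = c("t4", "t5"), b1 = 4:5, stringsAsFactors = FALSE)

# Map() walks the columns of x and y in parallel and calls
# data.frame(d1 = <column of x>, d2 = <column of y>) on each pair;
# tryCatch() captures the error message instead of stopping
msg <- tryCatch(
  Map(data.frame, d1 = x, d2 = y),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Even with equal row counts this pairing ignores the TimeStamp values, so rows would be matched by position rather than by time.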

+1

Yes, you need to update your sample – Sotos

Answers

1

Maybe you can try this:

# read d1 data from PATH1
d1_df <- read.table("PATH1", header = TRUE, sep = "\t", stringsAsFactors = FALSE)
# store d1 column names (excluding TimeStamp)
d1_colname <- colnames(d1_df)[-1]
# read d2 data from PATH2
d2_df <- read.table("PATH2", header = TRUE, sep = "\t", stringsAsFactors = FALSE)
# store d2 column names (excluding TimeStamp)
d2_colname <- colnames(d2_df)[-1]
# concatenate the timestamps of the two data frames
TimeStamp <- c(d1_df[, 1], d2_df[, 1])
# pair up the column names: one matrix column per output data frame
merge_colname <- rbind(d1_colname, d2_colname)
# build one output data frame, padding the other input's rows with NA
merge_df <- function(vec_colname) {
    d1 <- c(d1_df[, vec_colname[1]], rep(NA, nrow(d2_df)))
    d2 <- c(rep(NA, nrow(d1_df)), d2_df[, vec_colname[2]])
    data.frame(TimeStamp, d1, d2)
}
# get the result, which is a list of data frames
res_list <- apply(merge_colname, 2, merge_df)
# create data frames b1, b2, ... from the list
for (i in seq_along(res_list)) {
    assign(paste0("b", i), res_list[[i]])
}

And the result:

> b1 
      TimeStamp d1 d2 
1 2016-12-20 10:17:20 -76 NA 
2 2016-12-20 10:19:20 0 NA 
3 2016-12-20 10:19:40 0 NA 
4 2016-12-20 10:20:00 -76 NA 
5 2016-12-20 10:20:20 -80 NA 
6 2016-12-20 10:20:40 -81 NA 
7 2016-12-20 12:07:20 NA -76 
8 2016-12-20 12:07:40 NA 0 
9 2016-12-20 12:08:00 NA 0 
10 2016-12-20 12:08:20 NA 0 
11 2016-12-20 12:10:20 NA -82 
12 2016-12-20 12:10:40 NA -74 
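
For inputs whose timestamps overlap, or for more than two inputs, stacking rows and padding with NA is not enough. One possible alternative is a full outer join on TimeStamp for each numeric column. The following is a minimal base R sketch, not the answerer's method: `split_by_column` is a hypothetical helper name, the toy data are made up, and it assumes every input has the same numeric column names and no duplicated timestamps within a single input.

```r
# Hypothetical helper (an assumption for illustration): takes a named list
# of input data frames and returns one output data frame per numeric column
split_by_column <- function(dfs, key = "TimeStamp") {
  value_cols <- setdiff(names(dfs[[1]]), key)
  out <- lapply(value_cols, function(col) {
    # From each input keep the key and the current column,
    # naming the value column after the input (d1, d2, ...)
    pieces <- Map(function(df, nm) {
      piece <- data.frame(as.character(df[[key]]), df[[col]],
                          stringsAsFactors = FALSE)
      names(piece) <- c(key, nm)
      piece
    }, dfs, names(dfs))
    # Full outer join on the key: timestamps missing from an input become NA
    Reduce(function(a, b) merge(a, b, by = key, all = TRUE), pieces)
  })
  names(out) <- value_cols
  out
}

# Toy inputs (assumed): one timestamp is shared by both data frames
d1 <- data.frame(TimeStamp = c("2016-12-20 10:17:20", "2016-12-20 10:19:20"),
                 b1 = c(-76L, 0L), b2 = c(0L, -74L), stringsAsFactors = FALSE)
d2 <- data.frame(TimeStamp = c("2016-12-20 10:19:20", "2016-12-20 12:07:20"),
                 b1 = c(-80L, -76L), b2 = c(-79L, -87L), stringsAsFactors = FALSE)

res <- split_by_column(list(d1 = d1, d2 = d2))
print(res$b1)
```

Here `res$b1` has three rows: the shared timestamp carries a value from both inputs, while the other two rows have NA in the column of the input that lacks them. Each element of `res` could then be saved with `write.csv(..., row.names = FALSE)` under a file name of your choosing.
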
+0

Please see the updated post and take the timestamps into account as well.

+0

@sandipan I have updated my answer. You can check it.

+0

@docendodiscimus Sorry, I will edit the code.
