R - use the name of a list in a for loop

I am working in R with 10 lists (files1, files2, files3, ... files10). Each list contains multiple data frames.

Now I want to extract some values from each data frame in each list.

I was going to use this for loop:

nt = c("A", "C", "G", "T") 
for (i in files1) { 
    for (j in nt) { 
     name = paste(j, i, sep = "-") # here I want as output name = "files1-A". However this doesn't work. How can I get the name of the list "files1"? 
     colname = paste("percentage", j, sep = "") # here I want as output colname = percentageA. This works 
     assign(name, unlist(lapply(i, function(x) x[here I want to use the column with the name "percentageA", so 'colname'][x$position==1000]))) 
    } 
}

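but I am having trouble using the name of the list when assigning to a variable.

I am only looping through the first list here; can I loop through all of my lists at once as well?

In other words, how can I put the code below into a for loop?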

A_files1 = unlist(lapply(files1, function(x) x$percentageA[x$position==1000])) 
C_files1 = unlist(lapply(files1, function(x) x$percentageC[x$position==1000])) 
G_files1 = unlist(lapply(files1, function(x) x$percentageG[x$position==1000])) 
T_files1 = unlist(lapply(files1, function(x) x$percentageT[x$position==1000])) 

A_files2 = unlist(lapply(files2, function(x) x$percentageA[x$position==1000])) 
C_files2 = unlist(lapply(files2, function(x) x$percentageC[x$position==1000])) 
G_files2 = unlist(lapply(files2, function(x) x$percentageG[x$position==1000])) 
T_files2 = unlist(lapply(files2, function(x) x$percentageT[x$position==1000])) 

.... 

A_files10 = unlist(lapply(files10, function(x) x$percentageA[x$position==1000])) 
C_files10 = unlist(lapply(files10, function(x) x$percentageC[x$position==1000])) 
G_files10 = unlist(lapply(files10, function(x) x$percentageG[x$position==1000])) 
T_files10 = unlist(lapply(files10, function(x) x$percentageT[x$position==1000]))

Source

2016-12-29 user1987607

Does `names(files1)` return `NULL`?

@joel.wilson: Yes, it does. – user1987607

It would be great if you posted sample data. See [How to make a great R reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5965451#5965451). In general, to read multiple files I write a function(variable1, variable2) that returns a data frame from a single file. I then use the `dplyr` package, with `group_by(variable1, variable2)` followed by `do(myfunction(.$variable1, .$variable2))`, to read the files. This works well for getting all the data into a single data frame.

To answer your question, I created a fake list containing data frames:

n = data.frame(andrea=c(1983, 11, 8),paja=c(1985, 4, 3)) 
s = data.frame(col1=c("aa", "bb", "cc", "dd", "ee")) 
b = data.frame(col1=c(TRUE, FALSE, TRUE, FALSE, FALSE)) 
x = list(n, s, b, 3) # x contains copies of n, s, b 
names(x) <- c("dataframe1","dataframe2","dataframe3","dataframe4") 
files1 = x

Now let's step into what happens in your loop:

i = files1 
j = "A"

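If you want the name of each data frame combined with the suffix contained in nt (in this case nt = "A"), you need to use names(i). Pasting i itself pastes the contents of the data frames, not their names.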

name_wrong = paste(j, i, sep = "-") 
name  = paste(names(i),j,sep = "-")

So you get:

> name 
[1] "dataframe1-A" "dataframe2-A" "dataframe3-A" "dataframe4-A"

I think this is what you need.
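As a small illustration of the point above: a `for` loop over a list hands you each element without its name, so one option is to loop over `names()` instead and look the element up by name. This is only a sketch; the list `x` and its element names are made up for the example.

```r
# Hypothetical named list; the element names are invented for illustration
x <- list(dataframe1 = data.frame(a = 1:3),
          dataframe2 = data.frame(a = 4:6))

labels <- character(0)
for (nm in names(x)) {
  df <- x[[nm]]                                   # the element itself
  labels <- c(labels, paste(nm, "A", sep = "-"))  # a label built from its name
}
labels
```

Inside the loop both the name (`nm`) and the data frame (`df`) are available, which is not the case when looping over the elements directly.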

Source

2016-12-29 11:39:46

This is not what I want. I don't intend to name every data frame; I just want to use the name of the list. – user1987607

Try putting your list inside a list: `biglist <- list(files1 = files1)`. Then `names(biglist)` returns `[1] "files1"`.
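The suggestion in the comment above could be extended to all the lists roughly as follows. This is a sketch under assumptions, not the commenter's code: `make_df` and the tiny `files1`/`files2` are invented stand-ins for the real data, and the results are collected in a list called `results` rather than created with `assign()`.

```r
# Made-up stand-ins for the real lists (two lists, two data frames each)
make_df <- function(scale) {
  data.frame(position    = c(999, 1000),
             percentageA = c(0.1, 0.2) * scale,
             percentageC = c(0.3, 0.4) * scale,
             percentageG = c(0.5, 0.6) * scale,
             percentageT = c(0.7, 0.8) * scale)
}
files1 <- list(make_df(1), make_df(2))
files2 <- list(make_df(3), make_df(4))

# Collect the lists in one named list so that their names are available
biglist <- list(files1 = files1, files2 = files2)
nt <- c("A", "C", "G", "T")

results <- list()
for (listname in names(biglist)) {
  for (j in nt) {
    colname <- paste0("percentage", j)       # e.g. "percentageA"
    name <- paste(j, listname, sep = "_")    # e.g. "A_files1"
    # x[[colname]] selects a column by the name stored in a variable
    results[[name]] <- unlist(lapply(biglist[[listname]],
                                     function(x) x[[colname]][x$position == 1000]))
  }
}
results$A_files1
```

Keeping the vectors in one named list avoids scattering forty variables across the workspace; `assign(name, value)` would still work if separate variables are really wanted. With the ten lists already defined, `mget(paste0("files", 1:10))` is one way to build `biglist` without typing each name.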

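I think this data would be easier to work with if you flattened the data structure. Instead of 10 lists of data frames, you could use a single data frame containing all observations, indexed by name and file name.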

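Generate sample data

Simplified data with only 10 or 11 points per item, followed by the code from your question. I assume the items in your lists have different numbers of rows?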

files1 <- list(item1 = data.frame(position = 1:10, 
            percentageA = 1:10/10, 
            percentageC = 1:10/10, 
            percentageG = 1:10/10, 
            percentageT = 1:10/10), 
       item2 = data.frame(position = 1:11, 
            percentageA = 1:11/20, 
            percentageC = 1:11/20, 
            percentageG = 1:11/20, 
            percentageT = 1:11/20)) 
str(files1) # inspect the structure of the list

# Select the 9th position using your code 
A_files1 = unlist(lapply(files1, function(x) x$percentageA[x$position==9])) 
C_files1 = unlist(lapply(files1, function(x) x$percentageC[x$position==9])) 
G_files1 = unlist(lapply(files1, function(x) x$percentageG[x$position==9])) 
T_files1 = unlist(lapply(files1, function(x) x$percentageT[x$position==9]))

Flatten a list of data frames into a single data frame

# Add name to each data frame
# Inspired by this answer
# http://stackoverflow.com/a/18434780/2641825

# For information l[1] creates a single list item
# l[[1]] extracts the data frame from the list
#' @param i index
#' @param listoffiles list of data frames
addname <- function(i, listoffiles){
    dtf <- listoffiles[[i]] # Extract the dataframe from the list
    dtf$name <- names(listoffiles[i]) # Add the name inside the data frame
    return(dtf)
}
# Add the name inside each data frame
files1 <- lapply(seq_along(files1), addname, files1)
str(files1) # look at the structure of the list
files1table <- Reduce(rbind,files1)

# Get the values of interest with
files1table$percentageA[files1table$position == 9]
# [1] 0.90 0.45

# Get all Letters of interest with
subset(files1table,position==9)

# position percentageA percentageC percentageG percentageT name
# 9   9  0.90  0.90  0.90  0.90 item1
# 19  9  0.45  0.45  0.45  0.45 item2

Flatten all the lists of lists of data frames into one data frame

# Now create another list, files2, duplicate just for the sake of the example
files2 <- files1
# files1 and files2 both have a name column inside their dataframes already
# Create a list of list of dataframes
lolod <- list(files1 = files1, files2 = files2)
str(lolod) # a list of lists
# Flatten to a list of dataframes
# Use sapply to keep names based on this answer http://stackoverflow.com/a/9469981/2641825
lod <- sapply(lolod, Reduce, f=rbind, simplify = FALSE, USE.NAMES = TRUE)
# Add the name inside each data frame again
addfilename <- function(i, listoffiles){
    dtf <- listoffiles[[i]] # Extract the dataframe from the list
    dtf$filename <- names(listoffiles[i]) # Add the name inside the data frame
    return(dtf)
}
lod <- lapply(seq_along(lod), addfilename, lod)

# Flatten to a dataframe
d <- Reduce(rbind, lod)
# Now the data structure is flattened and much easier to deal with

subset(d,position==9)
# position percentageA percentageC percentageG percentageT name filename
# 9   9  0.90  0.90  0.90  0.90 item1 files1
# 19  9  0.45  0.45  0.45  0.45 item2 files1
# 30  9  0.90  0.90  0.90  0.90 item1 files2
# 40  9  0.45  0.45  0.45  0.45 item2 files2

This answer is much longer than I expected. I hope I didn't scare you. Inspired by tidy data, simplifying the data structure makes later work easier. Renaming this complicated list is probably unnecessary if you store the names inside the original data.
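To connect the flattened table back to the per-list vectors asked for in the question, one possibility is to filter once and then split by `filename`. This is a sketch only: the small `d` below is made-up data with the same kind of columns as the `d` built above, and `at9` and `A_by_file` are names invented for the example.

```r
# Made-up flattened table with the same kind of columns as `d` above
d <- data.frame(position    = rep(c(8, 9), times = 4),
                percentageA = c(0.80, 0.90, 0.40, 0.45, 0.80, 0.90, 0.40, 0.45),
                name        = rep(c("item1", "item1", "item2", "item2"), times = 2),
                filename    = rep(c("files1", "files2"), each = 4))

# Keep only the position of interest
at9 <- subset(d, position == 9)

# One vector of percentageA values per original list
A_by_file <- split(at9$percentageA, at9$filename)
A_by_file$files1
```

Here `A_by_file$files1` plays the role of `A_files1` from the question, and the same `split()` call can be repeated for the other percentage columns.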

Source

2016-12-29 23:58:12
