Binding rows of data frames with some factor columns

I would like to create a souped-up version of dplyr::bind_rows that avoids the "Unequal factor levels: coercing to character" warnings when the data frames we are trying to combine contain factor columns. Here is an example:

df1 <- dplyr::data_frame(age = 1:3, gender = factor(c("male", "female", "female")), district = factor(c("north", "south", "west"))) 
df2 <- dplyr::data_frame(age = 4:6, gender = factor(c("male", "neutral", "neutral")), district = factor(c("central", "north", "east")))

Then bind_rows_with_factor_columns(df1, df2) should return (without warnings):

dplyr::data_frame(
    age = 1:6, 
    gender = factor(c("male", "female", "female", "male", "neutral", "neutral")), 
    district = factor(c("north", "south", "west", "central", "north", "east")) 
)

Here is what I have so far:

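# Assumes the pipe operator %>% is available, e.g. via library(dplyr).
bind_rows_with_factor_columns <- function(...) {
    # Collect the data frames passed as separate arguments into one list,
    # so that a call like bind_rows_with_factor_columns(df1, df2) works.
    dfs <- list(...)

    # Names of the factor columns in each data frame.
    factor_columns <- purrr::map(dfs, function(df) {
        colnames(dplyr::select_if(df, is.factor))
    })

    if (length(unique(factor_columns)) > 1) {
        stop("All factor columns in dfs must have the same column names")
    }

    # Convert factors to character so that bind_rows() does not warn.
    df_list <- purrr::map(dfs, function(df) {
        purrr::map_if(df, is.factor, as.character) %>% dplyr::as_data_frame()
    })

    # Bind the rows, then turn the original factor columns back into factors.
    dplyr::bind_rows(df_list) %>%
        purrr::map_at(factor_columns[[1]], as.factor) %>%
        dplyr::as_data_frame()
}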

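If anyone has ideas, I was wondering how to incorporate the forcats package so as to avoid having to coerce the factors to character. More generally, any suggestions for improving the performance of this while keeping the same functionality would be welcome (I would like to stick to tidyverse syntax). Thanks!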

Source

2017-02-16 Nick Resnick

Why not just 'do.call(rbind, list(df1, df2))'? – Sotos

'suppressWarnings' or 'purrr::quietly'? – Axeman
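
For reference, the base R approach raised in the first comment can be sketched as follows. This is only an illustration under the assumption that plain data.frames (rather than tibbles) are acceptable, which the thread does not state; rbind() on data frames expands the levels of matching factor columns as needed instead of coercing them to character.

```r
# Same example data as in the question, built as base data.frames
# (an assumption of this sketch; the question uses tibbles).
df1 <- data.frame(
    age = 1:3,
    gender = factor(c("male", "female", "female")),
    district = factor(c("north", "south", "west"))
)
df2 <- data.frame(
    age = 4:6,
    gender = factor(c("male", "neutral", "neutral")),
    district = factor(c("central", "north", "east"))
)

# rbind() on data frames merges the level sets of matching factor columns.
combined <- do.call(rbind, list(df1, df2))

is.factor(combined$gender)    # the factor columns stay factors
levels(combined$gender)       # levels from both inputs are present
```

Note that this stays outside tidyverse syntax, which the question explicitly wants to keep.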

Going to answer my own question, based on a great solution from a friend:

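bind_rows_with_factor_columns <- function(...) {
    # pmap_df() walks the data frames column by column (matched by position).
    purrr::pmap_df(list(...), function(...) {
        cols_to_bind <- list(...)
        if (all(purrr::map_lgl(cols_to_bind, is.factor))) {
            # Combine factors while keeping the union of their levels.
            # Recent forcats versions take the factors through `...`,
            # so the list is spliced in with !!!.
            forcats::fct_c(!!!cols_to_bind)
        } else {
            # Non-factor columns are simply concatenated.
            unlist(cols_to_bind)
        }
    })
}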
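
The helper this answer relies on, forcats::fct_c, can be tried on its own to see why no coercion to character is needed. The following is a minimal sketch, assuming a forcats version in which fct_c() accepts the factors through `...`; the variable names f1, f2, combined and same are only illustrative.

```r
library(forcats)

f1 <- factor(c("male", "female", "female"))
f2 <- factor(c("male", "neutral", "neutral"))

# fct_c() concatenates factors and keeps the union of their levels.
combined <- fct_c(f1, f2)

# A list of factors can be spliced into the call with !!!.
same <- fct_c(!!!list(f1, f2))

levels(combined)            # levels from both f1 and f2
identical(combined, same)   # both call styles give the same factor
```

Because the result is already a factor, the round trip through as.character and as.factor in the question's version is not needed.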

Source

2017-02-16 15:42:25
