Counting the frequency of n-word phrases in open-ended question responses

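I want to analyze the responses to an open-ended question. I started with a single-word word cloud, then ran into a problem when I tried to count the frequency of 2-3 word phrases. Here is my code: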

library('tm')
library('tau')  # provides textcnt()
# Tokenizer meant to return the n-word phrases found in x
tokenize_ngrams <- function(x, n = 2) return(rownames(as.data.frame(unclass(textcnt(x, method = "string", n = n)))))
corpus <- Corpus(VectorSource(texts))
matrix <- TermDocumentMatrix(corpus, control = list(tokenize = tokenize_ngrams))
inspect(matrix[1:4, 1:3])

I expected the result to list 2-word phrases with their frequencies. However, I got the following result:

          Docs
Terms      1 2 3
  document 1 0 0
  first    1 0 0
  the      1 1 1
  this     1 1 1


2017-07-17 周文妍

I don't know the answer using tm, but this works fine:

require(quanteda) 
matrix <- dfm(texts, ngrams = 2) 
head(matrix)
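As far as I know, more recent quanteda releases no longer accept the ngrams argument in dfm(), so the call above may fail on a current installation. A minimal sketch of the same idea, assuming quanteda 3.x or later, is shown below. The sample texts and the variable names toks, bigrams, bigram_dfm and phrase_counts are made up for illustration and are not from the question.

```r
# Sketch only: assumes a recent quanteda release (3.x or later) is installed.
library(quanteda)

# Hypothetical example responses; replace with your own character vector.
texts <- c(
  "this is the first document",
  "this is the second document",
  "and this is the third one"
)

# Split each response into word tokens, dropping punctuation
toks <- tokens(texts, remove_punct = TRUE)

# Build 2-word phrases; the words are joined with "_" by default
bigrams <- tokens_ngrams(toks, n = 2)

# Document-feature matrix: one row per response, one column per phrase
bigram_dfm <- dfm(bigrams)

# Overall frequency of each phrase, most frequent first
phrase_counts <- topfeatures(bigram_dfm, n = 10)
print(phrase_counts)
```

To count 2-word and 3-word phrases together, tokens_ngrams() also accepts a vector such as n = 2:3.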


2017-07-28 15:49:00
