2016-04-13 45 views
3

I am trying to read a file containing JSON content and convert it to tabular data based on a few fields. How do I convert the JSON entries in the file into a data frame?

The file contains content like this:

{"senderDateTimeStamp":"2016/04/08 10:03:18","senderHost":null,"senderCode":"web_app","senderUsecase":"appinternalstats_prod","destinationTopic":"web_app_appinternalstats_realtimedata_topic","correlatedRecord":false,"needCorrelationCacheCleanup":false,"needCorrelation":false,"correlationAttributes":null,"correlationRecordCount":0,"correlateTimeWindowInMills":0,"lastCorrelationRecord":false,"realtimeESStorage":true,"receiverDateTimeStamp":1460124283554,"payloadData":{"timestamp":"2016-04-08T10:03:18.244","status":"get","source":"MSG1","ITEM":"TEST1","basis":"","pricingdate":"","content":"","msgname":"","idlreqno":"","host":"web01","Webservermember":"Web"},"payloadDataText":"","key":"web_app:appinternalstats_prod","destinationTopicName":"web_app_appinternalstats_realtimedata_topic","esindex":"web_app","estype":"appinternalstats_prod","useCase":"appinternalstats_prod","Code":"web_app"} 

For each line, I need to convert the timestamp, source, host, and status fields within the payloadData section into a data frame in R. I tried this:

library(rjson)
d <- fromJSON(file = "file.txt")

dput(d) 
structure(list(senderDateTimeStamp = "2016/04/08 10:03:18", senderHost = NULL, 
        senderAppcode = "web", senderUsecase = "appinternalstats_prod", 
        destinationTopic = "web_appinternalstats_realtimedata_topic", 
        correlatedRecord = FALSE, needCorrelationCacheCleanup = FALSE, 
        needCorrelation = FALSE, correlationAttributes = NULL, correlationRecordCount = 0, 
        correlateTimeWindowInMills = 0, lastCorrelationRecord = FALSE, 
        realtimeESStorage = TRUE, receiverDateTimeStamp = 1460124283554, 
        payloadData = structure(list(timestamp = "2016-04-08T10:03:18.244", 
               status = "get", source = "MSG1", 
               region = "", evetid = "", osareqid = "", basis = "", 
               pricingdate = "", content = "", msgname = "", recipient = "", 
               objid = "", idlreqno = "", host = "web01", webservermember = "webSingleton"), 
              .Names = c("timestamp", 
              "status", "source", "region", "evetid", 
              "osareqid", "basis", "pricingdate", "content", "msgname", 
              "recipient", "objid", "idlreqno", "host", "webservermember" 
               )), payloadDataText = "", key = "web:appinternalstats_prod", 
        destinationTopicName = "web_appinternalstats_realtimedata_topic", 
        hdfsPath = "web/appinternalstats_prod", esindex = "web", 
        estype = "appinternalstats_prod", useCase = "appinternalstats_prod", 
        appCode = "web"), .Names = c("senderDateTimeStamp", "senderHost", 
               "senderAppcode", "senderUsecase", "destinationTopic", "correlatedRecord", 
               "needCorrelationCacheCleanup", "needCorrelation", "correlationAttributes", 
               "correlationRecordCount", "correlateTimeWindowInMills", "lastCorrelationRecord", 
               "realtimeESStorage", "receiverDateTimeStamp", "payloadData", 
               "payloadDataText", "key", "destinationTopicName", "hdfsPath", 
               "esindex", "estype", "useCase", "appCode")) 

Is there a way to convert the payloadData section of the JSON entries into a data frame?

+1

This gives an error: Error in structure(list(timestamp = "2016-04-08T10:03:18.244", status = "get", : 'names' attribute [16] must be the same length as the vector [15] – user1357015

+0

@user1357015, I have updated the post with working dput output – user1471980

Answers

1

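This might be what you want:

library(rjson)
d <- fromJSON(file = "file.txt")
# Build one row per entry from its payloadData fields, then stack the rows.
# This assumes d is a list of entries, each with a payloadData element.
myDf <- do.call("rbind", lapply(d, function(x) {
  data.frame(TimeStamp = x$payloadData$timestamp,
             Source = x$payloadData$source,
             Host = x$payloadData$host,
             Status = x$payloadData$status)}))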
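The `lapply(d, ...)` call above only works when `d` is a list of entries. If `file.txt` instead holds one JSON object per line, as the sample in the question suggests, one option is to parse each line separately. The following is a minimal sketch under that assumption; the two shortened sample records, the temporary file, and the names `path`, `records`, and `p` are made up for illustration and are not from the original post.

```r
library(rjson)

# Hypothetical sample: two shortened records, one JSON object per line.
path <- tempfile(fileext = ".txt")
writeLines(c(
  '{"senderCode":"web_app","payloadData":{"timestamp":"2016-04-08T10:03:18.244","status":"get","source":"MSG1","host":"web01"}}',
  '{"senderCode":"web_app","payloadData":{"timestamp":"2016-04-08T10:03:19.101","status":"get","source":"MSG2","host":"web02"}}'
), path)

# Parse each non-empty line on its own, so every element of `records`
# is one complete entry (a named list with a payloadData element).
lines <- readLines(path)
lines <- lines[nzchar(trimws(lines))]
records <- lapply(lines, function(l) fromJSON(json_str = l))

# Build one data-frame row per record from its payloadData fields.
myDf <- do.call(rbind, lapply(records, function(x) {
  p <- x$payloadData
  data.frame(TimeStamp = p$timestamp,
             Source    = p$source,
             Host      = p$host,
             Status    = p$status,
             stringsAsFactors = FALSE)
}))
print(myDf)
```

A record that lacks one of these four fields would make `data.frame()` fail, so real data may need a check for `NULL` values before the row is built.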
+0

Error: unexpected '}' in: "Source = d$payloadData$source, Status = d$payloadData$status}" > ) Error: unexpected ')' in ")" > ) Error: unexpected ')' in ")" – user1471980

+1

Sorry, I missed a ")". It should work now. – Psidom

+0

Error in d$payloadData : $ operator is invalid for atomic vectors – user1471980

1

Consider the tidyjson package:

library(tidyjson) 
library(magrittr) 

json <- '{"senderDateTimeStamp":"2016/04/08 10:03:18","senderHost":null,"senderCode":"web_app","senderUsecase":"appinternalstats_prod","destinationTopic":"web_app_appinternalstats_realtimedata_topic","correlatedRecord":false,"needCorrelationCacheCleanup":false,"needCorrelation":false,"correlationAttributes":null,"correlationRecordCount":0,"correlateTimeWindowInMills":0,"lastCorrelationRecord":false,"realtimeESStorage":true,"receiverDateTimeStamp":1460124283554,"payloadData":{"timestamp":"2016-04-08T10:03:18.244","status":"get","source":"MSG1","ITEM":"TEST1","basis":"","pricingdate":"","content":"","msgname":"","idlreqno":"","host":"web01","Webservermember":"Web"},"payloadDataText":"","key":"web_app:appinternalstats_prod","destinationTopicName":"web_app_appinternalstats_realtimedata_topic","esindex":"web_app","estype":"appinternalstats_prod","useCase":"appinternalstats_prod","Code":"web_app"}' 

json %>% 
    gather_keys() 

# head() of above
#   document.id                 key
# 1           1 senderDateTimeStamp
# 2           1          senderHost
# 3           1          senderCode
# 4           1       senderUsecase
# 5           1    destinationTopic
# 6           1    correlatedRecord

json %>% 
    enter_object("payloadData") %>% 
    gather_keys() %>% 
    append_values_string() 

# head() of above
#   document.id         key                  string
# 1           1   timestamp 2016-04-08T10:03:18.244
# 2           1      status                     get
# 3           1      source                    MSG1
# 4           1        ITEM                   TEST1
# 5           1       basis
# 6           1 pricingdate
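When the JSON lives in a file, tidyjson expects the raw JSON text rather than a list that has already been parsed. Below is a minimal sketch, assuming the file has one JSON object per line; it reads the lines with `readLines()` and uses `spread_values()` with `jstring()` to pull the four fields into columns. The shortened sample records, the temporary file, and the names `path`, `json_lines`, and `payload` are illustrative assumptions, not part of the original post.

```r
library(tidyjson)
library(magrittr)

# Hypothetical sample: two shortened records, one JSON object per line.
path <- tempfile(fileext = ".txt")
writeLines(c(
  '{"senderCode":"web_app","payloadData":{"timestamp":"2016-04-08T10:03:18.244","status":"get","source":"MSG1","host":"web01"}}',
  '{"senderCode":"web_app","payloadData":{"timestamp":"2016-04-08T10:03:19.101","status":"get","source":"MSG2","host":"web02"}}'
), path)

# Keep the JSON as text: a character vector with one document per element.
json_lines <- readLines(path)

# Step into payloadData and spread the wanted keys into columns.
payload <- json_lines %>%
  enter_object("payloadData") %>%
  spread_values(
    timestamp = jstring("timestamp"),
    source    = jstring("source"),
    host      = jstring("host"),
    status    = jstring("status")
  )
print(payload)
```

The result has one row per line of the file, with a `document.id` column followed by the four spread columns.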
+0

@JasonAiskalns, the JSON data is in a file. I first ran data <- fromJSON(file = "file.txt") to read it into an object. When I then run your code, I get this error: Error in UseMethod("as.tbl_json") : no applicable method for 'as.tbl_json' applied to an object of class "list" – user1471980
