2016-11-30 15 views

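R - Eliminating duplicate values with a while loop

I have a time series dataset in which the first observations come from monthly data. I converted the dates to daily dates and placed each value at the beginning of its month. Now I would like to add one day to each duplicated value until there are no duplicate dates left in the dataset. This step is important for the subsequent analysis and plotting.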

This generates a dataset similar to mine:

sample <- rbind("2007-01-01","2007-02-01","2007-03-01","2007-05-01", 
      "2007-06-01","2007-07-01","2007-09-01","2007-10-01", 
      "2007-11-01","2007-12-01","2008-01-01","2008-02-01", 
      "2008-03-01","2008-05-01","2008-06-01","2008-07-01", 
      "2008-09-01","2008-10-01","2008-11-01","2008-12-01", 
      "2009-02-01","2009-04-01","2009-05-01","2009-06-01", 
      "2009-07-01","2009-09-01","2009-10-01","2009-11-01", 
      "2009-12-01","2010-01-01","2010-02-01","2010-03-01", 
      "2010-04-01","2010-05-01","2010-05-01","2010-05-01", 
      "2010-05-01","2010-05-01","2010-06-01","2010-06-01", 
      "2010-06-01","2010-06-01","2010-07-01","2010-07-01", 
      "2010-07-01","2010-07-01","2010-07-01","2010-08-01", 
      "2010-08-01","2010-08-01","2010-08-01","2010-09-01", 
      "2010-09-01","2010-09-01","2010-09-01","2010-09-01", 
      "2010-10-01","2010-10-01","2010-10-01","2010-10-01", 
      "2010-10-01","2010-11-01","2010-11-01","2010-11-01", 
      "2010-11-01","2010-11-01","2010-12-01","2010-12-01", 
      "2010-12-01","2010-12-01","2010-12-01","2011-01-01", 
      "2011-01-01","2011-01-01","2011-01-01","2011-02-01", 
      "2011-02-01","2011-02-01","2011-02-01","2011-03-01", 
      "2011-03-01","2011-03-01","2011-03-01","2011-04-01", 
      "2011-04-01","2011-04-01","2011-04-01","2011-04-01", 
      "2011-05-01","2011-05-01","2011-05-01","2011-05-01", 
      "2011-05-01","2011-06-01","2011-06-01","2011-06-01", 
      "2011-06-01","2011-06-01","2011-07-01","2011-07-01", 
      "2011-07-01","2011-07-01","2011-08-01","2011-08-01", 
      "2011-08-01","2011-09-01","2011-09-01","2011-09-01", 
      "2011-09-01","2011-10-01","2011-10-01","2011-10-01", 
      "2011-10-01","2011-10-01","2011-11-01","2011-11-01", 
      "2011-11-01","2011-11-01","2011-11-01","2011-12-01", 
      "2011-12-01","2011-12-01","2011-12-01","2011-12-01", 
      "2012-01-01","2012-01-01","2012-01-01","2012-01-01", 
      "2012-01-01","2012-02-01","2012-02-01","2012-02-01", 
      "2012-02-01","2012-02-01","2012-03-01","2012-03-01", 
      "2012-03-01","2012-03-01","2012-03-01","2012-04-01", 
      "2012-04-01","2012-04-01","2012-04-01","2012-05-01", 
      "2012-05-01","2012-05-01","2012-05-01","2012-05-01", 
      "2012-06-01","2012-06-01","2012-06-01","2012-06-01", 
      "2012-06-01","2012-07-01","2012-07-01","2012-07-01", 
      "2012-07-01","2012-07-01","2012-08-01","2012-08-01", 
      "2012-08-01","2012-09-01","2012-09-01","2012-09-01", 
      "2012-09-01","2012-09-01","2012-10-01","2012-10-01", 
      "2012-10-01","2012-10-01","2012-10-01","2012-11-01", 
      "2012-11-01","2012-11-01","2012-11-01","2012-11-01", 
      "2012-12-01","2012-12-01","2012-12-01","2013-01-01", 
      "2013-01-01","2013-01-01","2013-01-01","2013-01-01", 
      "2013-02-01","2013-02-01","2013-02-01","2013-02-01", 
      "2013-02-01","2013-03-01","2013-03-01","2013-03-01", 
      "2013-03-01","2013-03-01","2013-04-01","2013-04-01", 
      "2013-04-01","2013-04-01","2013-04-01","2013-05-01", 
      "2013-05-01","2013-05-01","2013-05-01","2013-05-01", 
      "2013-06-01","2013-06-01","2013-06-01","2013-06-01", 
      "2013-07-01","2013-07-01","2013-07-01","2013-07-01", 
      "2013-08-01","2013-08-01","2013-08-01","2013-09-01", 
      "2013-09-01","2013-09-01","2013-09-01","2013-09-01", 
      "2013-10-01","2013-10-01","2013-10-01","2013-10-01", 
      "2013-10-01","2013-11-01","2013-11-01","2013-11-01", 
      "2013-11-01","2013-11-01","2013-12-01","2013-12-01", 
      "2013-12-01","2013-12-01","2013-12-01","2014-01-01", 
      "2014-01-01","2014-01-01","2014-01-01","2014-01-01", 
      "2014-02-01","2014-02-01","2014-02-01","2014-02-01", 
      "2014-02-01","2014-03-01","2014-03-01","2014-03-01", 
      "2014-03-01","2014-03-01","2014-05-01","2014-05-01", 
      "2014-05-01","2014-05-01","2014-05-01","2014-06-01", 
      "2014-06-01","2014-06-01","2014-07-01","2014-07-01", 
      "2014-07-01","2014-07-01","2014-08-01","2014-08-01", 
      "2014-09-01","2014-09-01","2014-09-01","2014-09-01", 
      "2014-09-01","2014-10-01","2014-10-01","2014-10-01", 
      "2014-10-01","2014-11-01","2014-11-01","2014-11-01", 
      "2014-11-01","2014-12-01","2014-12-01","2014-12-01", 
      "2015-01-01","2015-01-01","2015-01-01","2015-01-01", 
      "2015-02-01","2015-02-01","2015-02-01","2015-02-01", 
      "2015-03-01","2015-03-01","2015-03-01","2015-03-01", 
      "2015-04-01","2015-04-01","2015-04-01","2015-04-01", 
      "2015-05-01","2015-05-01","2015-06-01","2015-06-01", 
      "2015-06-01","2015-07-01","2015-07-01","2015-08-01", 
      "2015-08-01","2015-09-01","2015-09-01","2015-09-01", 
      "2015-10-01","2015-10-01","2015-11-01","2015-11-01", 
      "2015-12-01","2016-01-01","2016-01-01","2016-01-01", 
      "2016-01-01","2016-02-01","2016-02-01","2016-02-01", 
      "2016-02-01","2016-03-01","2016-04-01","2016-04-01", 
      "2016-04-01","2016-04-01","2016-05-01","2016-05-01", 
      "2016-06-01","2016-06-01","2016-06-01","2016-06-01", 
      "2016-07-01","2016-07-01","2016-07-01","2016-07-01", 
      "2016-08-01","2016-08-01","2016-08-01","2016-08-01", 
      "2016-08-01","2016-08-01","2016-08-01","2016-08-01", 
      "2016-08-01","2016-08-01","2016-09-01","2016-09-01", 
      "2016-09-01","2016-09-01","2016-10-01","2016-10-01", 
      "2016-10-01","2016-11-01","2016-11-01") 
sample <- as.data.frame(sample) 
sample$Value <- (1:355) 
colnames(sample)[1] <- c("Date") 
View(sample) 

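After reading a bit about this, I came to the conclusion that what I need is a while loop that goes through the values and adds one day to each value if it is a duplicate: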

library(lubridate)  
while(sample$Date==sample$Date[-1]) {sample$Date <- sample$Date+days(1); print(sample$Date);} 

However, this attempt, which uses the lubridate package, does not run as a loop and generates a lot of warnings. Do you have any idea how to solve this? I assume this is a fairly simple question. I am new to loops.

Thank you!


Please share a smaller dataset and the expected output.

Answers


You can achieve this with data.table. First, we set things up, including converting the dates from the factor class:

library(data.table) 
setDT(sample) 
sample[ , Date := as.Date(Date) ] 

Then we perform your transformation:

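sample[ , Date := Date + (seq_len(.N) - 1L), by = Date ] 

What we are doing here is isolating each subset of matching Date values and adding a sequence vector to it. For example, a subset with four matching date values has c(0, 1, 2, 3) days added to its Date vector, so the first value stays the same and the subsequent values are incremented as you described.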
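
To make the effect concrete, here is a minimal sketch on a made-up six-row miniature of the data, written in base R rather than data.table. It is an illustration under assumptions, not the code from the answer: the variable names `dates`, `offsets`, `shifted`, and `looped` are hypothetical, and `ave()` stands in for the grouped `:=` step. It also includes a while loop closer to what the question attempted; `while()` expects a single TRUE/FALSE condition, which `anyDuplicated()` provides.

```r
# Made-up miniature of the question's data: repeated first-of-month dates.
dates <- as.Date(c("2010-05-01", "2010-05-01", "2010-05-01", "2010-05-01",
                   "2010-06-01", "2010-06-01"))

# Grouped approach: within each set of identical dates, build offsets 0, 1, 2, ...
offsets <- ave(seq_along(dates), format(dates),
               FUN = function(i) seq_along(i) - 1L)
shifted <- dates + offsets

# Loop approach: push the later duplicates forward by one day at a time
# until no duplicates remain. The loop condition is a single TRUE/FALSE.
looped <- dates
while (anyDuplicated(looped) > 0) {
  dup <- duplicated(looped)
  looped[dup] <- looped[dup] + 1
}

print(shifted)
# "2010-05-01" "2010-05-02" "2010-05-03" "2010-05-04" "2010-06-01" "2010-06-02"
print(all(shifted == looped))
```

On this small input both approaches give the same dates. The grouped version does the work in one pass, while the loop repeats once per extra duplicate in the largest group.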


Thank you! I was wondering how I could do this so that, instead of just adding one day, the duplicated values are spread out at equal intervals within each month? – eborbath


You're welcome! If this answered your question, please mark it as accepted... For an evenly spaced sequence, you will probably want 'seq.Date'. If you need help with 'seq.Date', you can ask a new question. – rosscova
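
One possible reading of the even-spacing idea from the comments is sketched below in base R. It is an assumption about what is wanted, not something either commenter posted: the helper `spread_in_month()` and the variable names are hypothetical, and `seq()` on a Date with `by = "month"` is used only to find the length of the month. Offsets are rounded down to whole days, so a month with more duplicates than days would still produce repeated dates.

```r
# Hypothetical helper: place n values at evenly spaced whole-day offsets
# within the month that starts on first_day.
spread_in_month <- function(first_day, n) {
  # First day of the following month; by = "month" handles year rollover.
  next_month <- seq(first_day, by = "month", length.out = 2)[2]
  days_in_month <- as.numeric(next_month) - as.numeric(first_day)
  # Offsets 0, d/n, 2d/n, ... rounded down to whole days.
  first_day + floor((seq_len(n) - 1) * days_in_month / n)
}

# Made-up miniature of the question's data.
dates <- as.Date(c("2010-05-01", "2010-05-01", "2010-05-01", "2010-05-01",
                   "2010-06-01", "2010-06-01"))

# Replace each group of identical dates with its evenly spread version.
result <- dates
for (d in unique(format(dates))) {
  idx <- which(format(dates) == d)
  result[idx] <- spread_in_month(dates[idx[1]], length(idx))
}

print(result)
# "2010-05-01" "2010-05-08" "2010-05-16" "2010-05-24" "2010-06-01" "2010-06-16"
```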
