Remove square brackets [] from an array in Scala

I have a date enclosed in square brackets, [2014-11-08 06:27:00.0], and I want to remove the brackets.

The expected output is 2014-11-08 06:27:00.0

val conf = new SparkConf(true) 
    .set("spark.cassandra.connection.host", "127.0.0.1").setAppName("CasteDate").setMaster("local[*]") 
    .set("spark.cassandra.connection.port", "9042") 
    .set("spark.driver.allowMultipleContexts", "true") 
    .set("spark.streaming.receiver.writeAheadLog.enable", "true") 

val sc = new SparkContext(conf) 

val ssc = new StreamingContext(sc, Seconds(1)) 
val csc=new CassandraSQLContext(sc) 

val sqlContext = new org.apache.spark.sql.SQLContext(sc) 

var input: SimpleDateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.S") 
input.setTimeZone(TimeZone.getTimeZone("GMT")) 
var dia: SimpleDateFormat = new SimpleDateFormat("dd") 
var mes: SimpleDateFormat = new SimpleDateFormat("MM") 
var ano: SimpleDateFormat = new SimpleDateFormat("yyyy") 
var horas: SimpleDateFormat = new SimpleDateFormat("HH") 
var minutos: SimpleDateFormat = new SimpleDateFormat("mm") 

val data=csc.sql("SELECT timecol from smartgrids.analyzer_temp").collect() 

import sqlContext.implicits._ 

val result = data.map(row => { 
         val day = dia.format(input.parse(row.toString())) 
         val month = mes.format(input.parse(row.toString())) 
         val year = ano.format(input.parse(row.toString())) 
         val hour = horas.format(input.parse(row.toString())) 
         val minute = minutos.format(input.parse(row.toString())) 
          }) 

val collection = sc.parallelize(Seq(("day", 2), ("month", 2), ("year", 4), ("hour", 2), ("minute", 2))) 
collection.saveToCassandra("features", "datepart", SomeColumns("day", "month", "year", "hour", "minute")) 
sc.stop()

When I run this code, I get the following error:

java.text.ParseException: Unparseable date: "[2015-08-20 21:01:00.0]" 
    at java.text.DateFormat.parse(DateFormat.java:366)

I think the error occurs because the date is wrapped in square brackets, so I want to remove them.
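The brackets most likely come from `Row.toString`, which renders a Spark `Row` as its values wrapped in `[` and `]`. The following is a minimal sketch in plain Scala, not the code from the question: the hard-coded `rowText` is a hypothetical stand-in for `row.toString()`, and `BracketParseDemo` and `newInput` are illustrative names. It shows the parser rejecting the bracketed text and accepting the same text once the brackets are gone.

```scala
import java.text.SimpleDateFormat
import java.util.TimeZone
import scala.util.Try

object BracketParseDemo {
  // Same pattern and time zone as the `input` formatter in the question.
  def newInput(): SimpleDateFormat = {
    val fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.S")
    fmt.setTimeZone(TimeZone.getTimeZone("GMT"))
    fmt
  }

  // Hypothetical stand-in for row.toString() on a single-column Row.
  val rowText = "[2015-08-20 21:01:00.0]"

  // Fails with a ParseException: the text starts with '[' instead of a year.
  val withBrackets = Try(newInput().parse(rowText))

  // Succeeds once the first and last characters are dropped.
  val withoutBrackets =
    Try(newInput().parse(rowText.substring(1, rowText.length - 1)))

  def main(args: Array[String]): Unit = {
    println(withBrackets.isFailure)
    println(withoutBrackets.isSuccess)
  }
}
```

Reading the value directly from the row (for example with `row.get(0)`) instead of going through `toString` may avoid the brackets altogether; whether that fits the column type in this table is an assumption.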

val result = data.map(row => {
    val day = dia.format(input.parse(row.toString().replace("[", "").replace("]", "").replace("(", "").replace(")", "")))
    val month = mes.format(input.parse(row.toString().replace("[", "").replace("]", "").replace("(", "").replace(")", "")))
    val year = ano.format(input.parse(row.toString().replace("[", "").replace("]", "").replace("(", "").replace(")", "")))
    val hour = horas.format(input.parse(row.toString().replace("[", "").replace("]", "").replace("(", "").replace(")", "")))
    val minute = minutos.format(input.parse(row.toString().replace("[", "").replace("]", "").replace("(", "").replace(")", "")))
    // Return the parts; without this last expression the block yields Unit
    List(day, month, year, hour, minute)
})

I tested it and it works.

2016-10-10 Marisa Cruz

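What problem are you having with your current approach? If you use .distinct.as[String].collect, the result prints as an array of strings. – Yawar

What type do you have? Is it 'String'?

I added my code to the question, so it may be easier to understand what I am trying to do. After running the code I get an array of dates. When I map over these dates, I want to remove the square brackets.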

You can use .replaceAll with a regular expression to remove the unwanted characters.

str.replaceAll("[\\[\\]]","")

This removes the square brackets from the string.
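As a minimal sketch of that regex in use (the `StripBrackets` object and `strip` helper are illustrative names, not part of the answer): the outer `[...]` is a character class, and the inner `\\[` and `\\]` are the escaped bracket characters it matches. Note that `replaceAll` interprets its first argument as a regular expression, whereas plain `replace` matches it literally.

```scala
object StripBrackets {
  // Character class matching either '[' or ']'; both are escaped because
  // they are regex metacharacters.
  def strip(s: String): String = s.replaceAll("[\\[\\]]", "")

  def main(args: Array[String]): Unit = {
    println(strip("[2015-08-20 21:01:00.0]")) // 2015-08-20 21:01:00.0
  }
}
```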

2016-10-17 20:37:07

Thanks for the regex. By the way, it needs an extra closing parenthesis at the end – Davos

Thank you. Updated.

The solution is as follows.

Input date:

data: Array[org.apache.spark.sql.Row] = Array([2015-08-20 21:01:00.0]

Output:

Array(List(20, 08, 2015, 21, 01)
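The input and output above can be reproduced without Spark. The sketch below is one possible shape, not the poster's exact code: `DateParts` and `parts` are hypothetical names, a plain string stands in for `row.toString()`, the brackets are stripped once before parsing, and a single output pattern replaces the five separate formatters. Both formatters are pinned to GMT (an assumption) so the hour does not shift with the machine's local time zone.

```scala
import java.text.SimpleDateFormat
import java.util.TimeZone

object DateParts {
  private val gmt = TimeZone.getTimeZone("GMT")

  // Returns List(day, month, year, hour, minute) for a bracketed timestamp.
  def parts(rowText: String): List[String] = {
    // SimpleDateFormat is not thread-safe, so new instances are created per call.
    val input = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.S")
    val output = new SimpleDateFormat("dd MM yyyy HH mm")
    input.setTimeZone(gmt)
    output.setTimeZone(gmt)

    // Strip the brackets once, parse once, then split the formatted parts.
    val cleaned = rowText.replaceAll("[\\[\\]]", "")
    val date = input.parse(cleaned)
    output.format(date).split(" ").toList
  }

  def main(args: Array[String]): Unit = {
    val data = Array("[2015-08-20 21:01:00.0]")
    println(data.map(parts).toList) // List(List(20, 08, 2015, 21, 01))
  }
}
```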

2016-10-10 21:43:38

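Hint: 'String.replaceAll' accepts a regular expression (plain 'replace' matches its argument literally). Rather than chaining '.replace.replace.replace...', you can replace all of the characters in one call.

It's nice that it works, but it is terribly verbose. If you are just removing every character from a specific set, why not filter them out? In other words: '...toString().filterNot("[]".contains(_))' – jwvh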
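A minimal sketch of the filtering idea from the comment above, written as a small variant: it uses a `Set[Char]` as the predicate instead of `"[]".contains(_)`, and also drops parentheses to match the chained `replace` calls. `FilterBrackets`, `unwanted`, and `strip` are illustrative names.

```scala
object FilterBrackets {
  // Characters to drop; a Set[Char] can be passed directly as a Char => Boolean.
  val unwanted: Set[Char] = Set('[', ']', '(', ')')

  // Keeps every character that is not in the unwanted set.
  def strip(s: String): String = s.filterNot(unwanted)

  def main(args: Array[String]): Unit = {
    println(strip("[2015-08-20 21:01:00.0]")) // 2015-08-20 21:01:00.0
  }
}
```

Compared with the regex, this avoids escaping metacharacters, at the cost of only handling single characters rather than patterns.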

Thanks for the help.

If you are using Spark 2.0: dataset.select("ID")

2017-08-28 08:22:55
