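How do I rename a column that has dots in its name?

I am using Spark 1.5.

I am struggling with columns whose names contain dots (e.g. param.x.y). I first had trouble selecting them, but then I read that I need to use the ` character (`param.x.y`).

Now I run into a problem when I try to rename the columns. I am using a similar approach, but it does not seem to work: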

df.withColumnRenamed("`param.x.y`", "param_x_y")

So I would like to check: is this really a bug, or am I doing something wrong?

2016-12-26 Marko

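Looking at your code, the problem is the ` (backtick) in the original column name. I removed it and it worked for me. Here is a working code sample that renames a column in a DataFrame.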

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{Row, SQLContext}
// Spark SQL data types used to build the schema
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object RenameColumn extends Serializable {

  val conf = new SparkConf().setAppName("read local file")
  conf.set("spark.executor.memory", "100M")
  conf.setMaster("local")

  val sc = new SparkContext(conf)
  val sqlContext = new SQLContext(sc)

  def main(args: Array[String]): Unit = {
    // Create an RDD from a local text file (each line is split on commas below)
    val people = sc.textFile("C:/Users/User1/Documents/test")

    // The schema is encoded in a string; the first field name contains a dot
    val schemaString = "name.rename age"

    // Generate the schema based on the string of schema
    val schema = StructType(
      schemaString.split(" ").map(fieldName => StructField(fieldName, StringType, true)))

    // Convert records of the RDD (people) to Rows.
    val rowRDD = people.map(_.split(",")).map(p => Row(p(0), p(1).trim))

    // Apply the schema to the RDD.
    val peopleDataFrame = sqlContext.createDataFrame(rowRDD, schema)
    peopleDataFrame.printSchema()

    // Rename using the plain column name, without backticks
    val renamedSchema = peopleDataFrame.withColumnRenamed("name.rename", "name_renamed")
    renamedSchema.printSchema()

    sc.stop()
  }
}

Its output:

16/12/26 16:53:48 INFO SparkContext: Created broadcast 0 from textFile at RenameColumn.scala:28 
root 
|-- name.rename: string (nullable = true) 
|-- age: string (nullable = true) 

root 
|-- name_renamed: string (nullable = true) 
|-- age: string (nullable = true) 

16/12/26 16:53:49 INFO SparkUI: Stopped Spark web UI at http://XXX.XXX.XXX.XXX:<port_number> 
16/12/26 16:53:49 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
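
The difference between selecting and renaming can be sketched as follows. This is a minimal sketch, not a definitive implementation: the object name `DottedColumnSketch`, the helper `exampleFrame`, and the in-memory sample rows are all hypothetical. It assumes a Spark version in which `withColumnRenamed` accepts the plain field name, as the test above suggests; it is worth confirming on the exact version you run.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, Row, SQLContext}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object DottedColumnSketch {

  // Build a tiny DataFrame whose first column name contains dots.
  // The data and the names "param.x.y" / "age" are illustrative only.
  def exampleFrame(sqlContext: SQLContext): DataFrame = {
    val schema = StructType(Seq(
      StructField("param.x.y", StringType, true),
      StructField("age", StringType, true)))
    val rows = sqlContext.sparkContext.parallelize(Seq(Row("a", "1"), Row("b", "2")))
    sqlContext.createDataFrame(rows, schema)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("dotted column sketch").setMaster("local")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    val df = exampleFrame(sqlContext)

    // select treats its argument as a column expression, so the dotted
    // name is quoted with backticks to keep it from being read as nested access.
    df.select("`param.x.y`").printSchema()

    // withColumnRenamed is given the plain field name, with no backticks.
    df.withColumnRenamed("param.x.y", "param_x_y").printSchema()

    sc.stop()
  }
}
```

If the second schema still shows `param.x.y`, the rename did not match any column on that version, and the name passed in should be compared against what `printSchema()` reports.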

For more details, you can check the Spark DataFrame documentation.

Update: I tested with a backtick-quoted string and got the expected output. See the code below and its output.

val schemaString = "`name.rename` age"

// Generate the schema based on the string of schema
val schema = StructType(
  schemaString.split(" ").map(fieldName => StructField(fieldName, StringType, true)))

// Convert records of the RDD (people) to Rows.
val rowRDD = people.map(_.split(",")).map(p => Row(p(0), p(1).trim))

// Apply the schema to the RDD.
val peopleDataFrame = sqlContext.createDataFrame(rowRDD, schema)
peopleDataFrame.printSchema()

val renamedSchema = peopleDataFrame.withColumnRenamed("`name.rename`", "name_renamed")
renamedSchema.printSchema()
sc.stop()

Its output:

16/12/26 20:24:24 INFO SparkContext: Created broadcast 0 from textFile at RenameColumn.scala:28 
root 
|-- `name.rename`: string (nullable = true) 
|-- age: string (nullable = true) 

root 
|-- name_renamed: string (nullable = true) 
|-- age: string (nullable = true) 

16/12/26 20:24:25 INFO SparkUI: Stopped Spark web UI at http://xxx.xxx.xxx.x:<port_number> 
16/12/26 20:24:25 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
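
Whether a quoted or unquoted name matches appears to depend on what the field is literally called; in the schema printed above, the backticks are part of the field name itself. One way to sidestep the lookup is to rename by position with `toDF`, which takes the full list of new column names. The sketch below is only an illustration under stated assumptions: `RenameAllDots`, `sanitize`, and `renameDots` are hypothetical names, the sample data is made up, and the behaviour has not been verified on Spark 1.5 specifically.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, Row, SQLContext}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object RenameAllDots {

  // Pure helper: the new name for one column (every dot becomes an underscore).
  def sanitize(name: String): String = name.replace(".", "_")

  // Rebuild the DataFrame with sanitized names, assigned by column position.
  def renameDots(df: DataFrame): DataFrame =
    df.toDF(df.columns.map(sanitize): _*)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("rename all dots").setMaster("local")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Illustrative schema and rows; not taken from the original data file.
    val schema = StructType(Seq(
      StructField("param.x.y", StringType, true),
      StructField("age", StringType, true)))
    val rows = sc.parallelize(Seq(Row("a", "1")))
    val df = sqlContext.createDataFrame(rows, schema)

    renameDots(df).printSchema()

    sc.stop()
  }
}
```

Note that two columns such as `a.b` and `a_b` would end up with the same name after this replacement, so the result should be checked for duplicates when column names are not under your control.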

2016-12-26 11:37:16

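Thanks. So there really is a bug, and I need to do it manually? – Marko

There is no problem with the dot here. I think the problem is the quotes. It needs more checking. –

From my latest test, there is no bug here. There may be a problem in your implementation. Please see the updated answer for clarification. –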
