2016-06-26 12 views
0

Create a matrix from a column in an array of hashes

I have a dataset stored as an array of hashes:

[ 
    {"id" => "1", "fruit" => "grape", "amount" => 10}, 
    {"id" => "2", "fruit" => "banana", "amount" => 6}, 
    ... 
] 

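To build a matrix (the constraint matrix for a linear optimization problem using Rglpk), I need to convert the data into a format like the one below. For example, given this data: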

id fruit  amount 
1 grape  10 
2 banana  6 
3 grape  7 
4 mango  15 
5 strawberry 5 

which is stored as the array of hashes shown at the top, I first want one indicator column per fruit:

id is_grape is_banana is_mango is_strawberry 
1  1   0   0   0 
2  0   1   0   0 
3  1   0   0   0 
4  0   0   1   0 
5  0   0   0   1 

and then, by swapping rows and columns, get something like this:

[ 
    #1 #2 #3 #4  #5 # each column for id 1, 2, ... 
    1  0  1  0  0  # row is_grape 
    0  1  0  0  0  # row is_banana 
    0  0  0  1  0  # row is_mango 
    0  0  0  0  1  # row is_strawberry 
] 

A column can contain any number of categories. I want to create the is_grape, is_mango style category values dynamically rather than hard-coding them. How can I get the data into this matrix form?
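For orientation, the two steps described in the question (indicator columns, then swapping rows and columns) can be sketched roughly as below. This is only a minimal illustration, not taken from any of the answers: `indicator_matrix` is a hypothetical helper name, and the three-record `data` array is a shortened copy of the example data.

```ruby
# Hypothetical helper: one indicator row per record, then transpose so that
# each category becomes a row and each record becomes a column.
def indicator_matrix(records, key)
  # Categories are discovered from the data, in order of first appearance.
  categories = records.map { |r| r[key] }.uniq
  # One row per record: 1 in the position of its own category, 0 elsewhere.
  rows = records.map do |r|
    categories.map { |c| r[key] == c ? 1 : 0 }
  end
  [categories.map { |c| "is_#{c}" }, rows.transpose]
end

data = [
  {"id" => "1", "fruit" => "grape",  "amount" => 10},
  {"id" => "2", "fruit" => "banana", "amount" => 6},
  {"id" => "3", "fruit" => "grape",  "amount" => 7}
]

labels, matrix = indicator_matrix(data, "fruit")
p labels  # ["is_grape", "is_banana"]
p matrix  # [[1, 0, 1], [0, 1, 0]]
```

Columns follow the order of the records in the input array, so the ids are never parsed here.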

Answers

2
arr = [ 
    {"id" => "1", "fruit" => "grape", "amount" => 10}, 
    {"id" => "2", "fruit" => "banana", "amount" => 6} 
] 

# fruits = arr.group_by { |h| h['fruit'] }.keys.map { |e| "is_#{e}" } 
fruits = arr.map { |e| "is_#{e['fruit']}" }.uniq 
#⇒ [ "is_grape", "is_banana" ] 
arr.each_with_object([]) do |h, memo| 
    e = fruits.zip([0] * fruits.size).to_h 
    e['id'] = h['id'] 
    e["is_#{h['fruit']}"] += 1 
    # e["is_#{h['fruit']}"] += h['amount'].to_i # that seems meaningful 
    memo << e 
end 

This gives:

#⇒ [ 
# [0] { 
#   "id" => "1", 
# "is_banana" => 0, 
#  "is_grape" => 1 
# }, 
# [1] { 
#   "id" => "2", 
# "is_banana" => 1, 
#  "is_grape" => 0 
# } 
# ] 
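The result above is still one hash per id, while the question asks for one row per category. A possible follow-up step, sketched here under the assumption that the hashes have exactly the shape shown above, is to pick the indicator values in a fixed key order and transpose. `dummies_to_matrix` is a hypothetical name, not part of the answer.

```ruby
# Hypothetical helper: dummy_rows is an array of hashes like the output above,
# columns is the list of "is_..." keys in the desired row order.
def dummies_to_matrix(dummy_rows, columns)
  # values_at keeps the key order stable and ignores the "id" entry.
  dummy_rows.map { |h| h.values_at(*columns) }.transpose
end

dummy_rows = [
  {"is_grape" => 1, "is_banana" => 0, "id" => "1"},
  {"is_grape" => 0, "is_banana" => 1, "id" => "2"}
]

p dummies_to_matrix(dummy_rows, ["is_grape", "is_banana"])
# [[1, 0], [0, 1]]
```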
1
a = [ 
    {"id" => "1", "fruit" => "grape", "amount" => 10}, 
    {"id" => "2", "fruit" => "banana", "amount" => 6}, 
    {"id" => "3", "fruit" => "grape", "amount" => 7}, 
    {"id" => "4", "fruit" => "mango", "amount" => 15}, 
    {"id" => "5", "fruit" => "strawberry", "amount" => 5}, 
] 

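# Distinct fruit names in order of first appearance; each one becomes a row.
fruits = a.map{|h| h["fruit"]}.uniq 
# One zero-filled row per fruit, one column per record.
m = Array.new(fruits.length){[0] * a.length} 
# The column index is id - 1, so this assumes the ids are exactly 1..a.length.
a.each{|h| m[fruits.index(h["fruit"])][h["id"].to_i - 1] = 1} 
p m 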

Output:

[ 
    [1, 0, 1, 0, 0], 
    [0, 1, 0, 0, 0], 
    [0, 0, 0, 1, 0], 
    [0, 0, 0, 0, 1] 
] 
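The code above takes the column index from `h["id"].to_i - 1`, which fits the sample data because its ids run from 1 to 5. If the ids might have gaps or start elsewhere, one possible variant is to look the column up from the ids actually present. The sketch below is an assumption-laden illustration rather than part of the answer: `matrix_by_id` is a hypothetical name, and it assumes the ids are unique numeric strings.

```ruby
# Hypothetical variant: columns are ordered by numeric id, but positions come
# from a lookup table instead of id - 1.
def matrix_by_id(records)
  fruits = records.map { |h| h["fruit"] }.uniq
  ids    = records.map { |h| h["id"] }.uniq.sort_by(&:to_i)
  col    = ids.each_with_index.to_h      # id    => column index
  row    = fruits.each_with_index.to_h   # fruit => row index
  m = Array.new(fruits.length) { Array.new(ids.length, 0) }
  records.each { |h| m[row[h["fruit"]]][col[h["id"]]] = 1 }
  [ids, m]
end

sample = [
  {"id" => "10", "fruit" => "grape"},
  {"id" => "3",  "fruit" => "banana"},
  {"id" => "7",  "fruit" => "grape"}
]

ids, m = matrix_by_id(sample)
p ids  # ["3", "7", "10"]
p m    # [[0, 1, 1], [1, 0, 0]]
```

Returning `ids` alongside the matrix keeps track of which id each column stands for.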
0
arr = [ 
    {"id" => "1", "fruit" => "grape", "amount" => 10}, 
    {"id" => "2", "fruit" => "banana", "amount" => 6}, 
    {"id" => "3", "fruit" => "mango", "amount" => 4}, 
    {"id" => "7", "fruit" => "banana", "amount" => 3}, 
    {"id" => "5", "fruit" => "strawberry", "amount" => 7}, 
    {"id" => "6", "fruit" => "banana", "amount" => 1}, 
    {"id" => "4", "fruit" => "banana", "amount" => 3} 
] 

fruit_to_row = arr.map { |h| h["fruit"] }.uniq.each_with_index. 
    with_object({}) { |(f,i),h| h[f] = i } 
    #=> {"grape"=>0, "banana"=>1, "mango"=>2, "strawberry"=>3} 

arr.each_with_index. 
    with_object(Array.new(fruit_to_row.size) {Array.new(arr.size) {0}}) { |(h,i),a| 
     a[fruit_to_row[h["fruit"]]][i] = 1 } 
    #=> [[1, 0, 0, 0, 0, 0, 0], grape 
    # [0, 1, 0, 1, 0, 1, 1], banana 
    # [0, 0, 1, 0, 0, 0, 0], mango 
    # [0, 0, 0, 0, 1, 0, 0]] strawberry 
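In the result above, columns follow the position of each hash in `arr`, not the numeric id (the fourth column belongs to id "7"). If the labels are needed later, for example when naming rows and columns of a constraint matrix, they could be kept next to the matrix. The following is a minimal sketch; `labelled` is a hypothetical helper, and it assumes the matrix was built in array order with rows in order of first appearance, as in the code above.

```ruby
# Hypothetical helper: attach row and column labels to a matrix whose rows are
# fruits (in order of first appearance) and whose columns are records (in
# array order).
def labelled(records, matrix)
  fruits = records.map { |h| h["fruit"] }.uniq
  {
    "columns" => records.map { |h| h["id"] },
    "rows"    => fruits.map { |f| "is_#{f}" }.zip(matrix).to_h
  }
end

records = [
  {"id" => "1", "fruit" => "grape"},
  {"id" => "7", "fruit" => "banana"},
  {"id" => "4", "fruit" => "banana"}
]
matrix = [[1, 0, 0], [0, 1, 1]]

result = labelled(records, matrix)
p result["columns"]            # ["1", "7", "4"]
p result["rows"]["is_banana"]  # [0, 1, 1]
```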