Implementing an LSTM regression model with TensorFlow

I am trying to implement a TensorFlow LSTM regression model for a list of input numbers. My code is:

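# Imports are not shown in the original post; these are assumed (TensorFlow 1.x API,
# mean_squared_error presumably from scikit-learn).
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error

TIMESTEPS = 20
num_hidden = 20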

Xd, yd = load_data() 

train_input = Xd['train'] 
train_input = train_input.reshape(-1,20,1) 
train_output = yd['train'] 

# train_input = [[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20],.. 
# train_output = [[21],[22],[23].... 

test_input = Xd['test'] 
test_output = yd['test'] 

X = tf.placeholder(tf.float32, [None, 20, 1]) 
y = tf.placeholder(tf.float32, [None, 1]) 

cell = tf.nn.rnn_cell.LSTMCell(num_hidden, state_is_tuple=True) 

val, state = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32) 
val = tf.Print(val, [tf.argmax(val,1)], 'argmax(val)=' , summarize=20, first_n=7) 

val = tf.transpose(val, [1, 0, 2]) 
val = tf.Print(val, [tf.argmax(val,1)], 'argmax(val2)=' , summarize=20, first_n=7) 

# Take only the last output after 20 time steps 
last = tf.gather(val, int(val.get_shape()[0]) - 1) 
last = tf.Print(last, [tf.argmax(last,1)], 'argmax(val3)=' , summarize=20, first_n=7) 

# define variables for weights and bias 
weight = tf.Variable(tf.truncated_normal([num_hidden, int(y.get_shape()[1])])) 
bias = tf.Variable(tf.constant(0.1, shape=[y.get_shape()[1]])) 

# Prediction is matmul of last output and weight, plus bias
prediction = tf.matmul(last, weight) + bias 

# Cost function using softmax 
# y is the true distribution and prediction is the predicted one
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(prediction), reduction_indices=[1])) 
#cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y)) 

optimizer = tf.train.AdamOptimizer() 
minimize = optimizer.minimize(cost) 

from tensorflow.python import debug as tf_debug 
inita = tf.global_variables_initializer()  # replaces the deprecated tf.initialize_all_variables()
sess = tf.Session() 
sess.run(inita) 

batch_size = 100 
no_of_batches = int(len(train_input)/batch_size) 
epoch = 10 
test_size = 100 
for i in range(epoch): 
    for start, end in zip(range(0, len(train_input), batch_size), range(batch_size, len(train_input)+1, batch_size)): 
        sess.run(minimize, feed_dict={X: train_input[start:end], y: train_output[start:end]})

    test_indices = np.arange(len(test_input)) # Get A Test Batch 
    np.random.shuffle(test_indices) 
    test_indices = test_indices[0:test_size] 
    print (i, mean_squared_error(np.argmax(test_output[test_indices], axis=1), sess.run(prediction, feed_dict={X: test_input[test_indices]}))) 

print ("predictions", prediction.eval(feed_dict={X: train_input}, session=sess)) 
y_pred = prediction.eval(feed_dict={X: test_input}, session=sess) 
sess.close() 
test_size = test_output.shape[0] 
ax = np.arange(0, test_size, 1) 
plt.plot(ax, test_output, 'r', ax, y_pred, 'b') 
plt.show()

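However, I cannot minimize the cost: the computed MSE increases at each step instead of decreasing. For reference, the inputs and targets are built from the number list like this: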

input_data = [1, 2, 3, 4, 5] 
time_steps = 2 
    -> X == [[1, 2], [2, 3], [3, 4]] 
    -> y == [3, 4, 5]
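
The windowing in the example above can be sketched in plain Python as follows. This is only an illustration: `make_windows` is a hypothetical helper name, not a function from the post (the asker's own `load_data()` is not shown), and it assumes each target is simply the value that follows its window.

```python
def make_windows(series, time_steps):
    """Split a sequence into (window, next value) pairs.

    Hypothetical helper for illustration; not part of the original post.
    """
    X, y = [], []
    # The last usable window starts at len(series) - time_steps - 1,
    # because every window needs one following value as its target.
    for start in range(len(series) - time_steps):
        X.append(series[start:start + time_steps])
        y.append(series[start + time_steps])
    return X, y


X, y = make_windows([1, 2, 3, 4, 5], time_steps=2)
print(X)  # [[1, 2], [2, 3], [3, 4]]
print(y)  # [3, 4, 5]
```

For the model in the question, the same idea would be applied with 20 time steps, and the windows reshaped to `(-1, 20, 1)` before being fed to the placeholder.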

The code is shown above. I suspect there is a problem with the cost function I am using.

Any thoughts or suggestions on what I am doing wrong?

Thanks.

Source

2017-11-14 Praveen

Try using mean squared error instead of cross-entropy; you are not doing classification here. cost = 0.5 * tf.square(y - prediction), then cost = tf.reduce_mean(cost).
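
A small sketch of why the suggestion above matters, using only the Python standard library rather than TensorFlow. It assumes a single positive target (21, as in the question's example data) and compares the question's cross-entropy-style cost, `-y * log(prediction)`, with the squared error proposed in the comment. The function names and candidate values are illustrative only.

```python
import math


def cross_entropy_style_cost(y_true, y_pred):
    # Same form as the cost in the question: -y * log(prediction)
    return -y_true * math.log(y_pred)


def squared_error(y_true, y_pred):
    # Same form as the cost suggested in the comment: 0.5 * (y - prediction)^2
    return 0.5 * (y_true - y_pred) ** 2


target = 21.0
candidates = [1.0, 21.0, 100.0, 1000.0]  # illustrative prediction values

ce_costs = [cross_entropy_style_cost(target, p) for p in candidates]
se_costs = [squared_error(target, p) for p in candidates]

# Which candidate does each cost prefer?
best_for_ce = candidates[ce_costs.index(min(ce_costs))]
best_for_se = candidates[se_costs.index(min(se_costs))]

print(best_for_ce)  # the largest candidate: the cost keeps falling as the prediction grows
print(best_for_se)  # the candidate equal to the target
```

With a positive target, the cross-entropy-style cost has no minimum at the target, so an optimizer that lowers it pushes predictions ever higher, which would be consistent with the rising MSE described in the question. The squared error is smallest exactly when the prediction equals the target.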

Hi Anthony, thanks for the input. I tried using MSE as the cost as you suggested, but the MSE computed after each epoch is still increasing. – Praveen

I see. Have you tried lowering the learning rate in the Adam optimizer?
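
To illustrate why the learning rate can matter, here is a minimal sketch that swaps in plain gradient descent on a one-dimensional quadratic instead of Adam on the LSTM. Adam scales its steps differently, so this shows only the general effect of step size, not the behavior of the question's model. The function name, the step count, and both learning rates are assumptions chosen for illustration.

```python
def final_error(learning_rate, steps=50, start=0.0, target=3.0):
    """Run plain gradient descent on f(w) = (w - target)**2.

    Returns the distance |w - target| after the given number of steps.
    Illustrative only; the question uses Adam, not plain gradient descent.
    """
    w = start
    for _ in range(steps):
        grad = 2.0 * (w - target)  # derivative of (w - target)**2
        w -= learning_rate * grad
    return abs(w - target)


small_step = final_error(0.1)  # distance to the target shrinks by a factor of 0.8 per step
large_step = final_error(1.1)  # distance to the target grows by a factor of 1.2 per step

print(small_step < 1e-3)  # True
print(large_step > 1e3)   # True
```

In the question's code, `tf.train.AdamOptimizer()` is created with its default learning rate of 0.001; a smaller value can be tried by passing the `learning_rate` argument, for example `tf.train.AdamOptimizer(learning_rate=1e-4)`.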

As mentioned in the comments, you had to change your loss function to MSE and reduce your learning rate. Is your error converging to zero?

Source

2017-11-14 11:59:58

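I have now tested on a toy dataset using the sine function. The MSE fluctuates within a small range (between 0.6 and 0.5 in this case) but never drops below that. For a simple case like the sine function, shouldn't it converge to 0? – Praveen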
