Understanding Keras LSTM weights

I understand how to multiply the Dense layer weights to get the predicted output, but how can I interpret the matrices from an LSTM model?

from keras.models import Model 
from keras.layers import Input, Dense, LSTM 
import numpy as np 
np.random.seed(42) 

X = np.array([[1, 2], [3, 4]]) 

I = Input(X.shape[1:]) 
D = Dense(2)(I) 
linear_model = Model(inputs=[I], outputs=[D]) 
print('linear_model.predict:\n', linear_model.predict(X)) 

weight, bias = linear_model.layers[1].get_weights() 
print('bias + X @ weights:\n', bias + X @ weight)

Output:

linear_model.predict: 
[[ 3.10299015 0.46077788] 
[ 7.12412453 1.17058146]] 
bias + X @ weights: 
[[ 3.10299003 0.46077788] 
[ 7.12412441 1.17058146]]

LSTM example:

X = X.reshape(*X.shape, 1) 
I = Input(X.shape[1:]) 
L = LSTM(2)(I) 
lstm_model = Model(inputs=[I], outputs=[L]) 
print('lstm_model.predict:\n', lstm_model.predict(X)) 
print('weights I don\'t understand:\n', lstm_model.layers[1].get_weights())

These are just toy examples. In the Dense example, ignore the fitting; it is only about the matrix multiplication.

Output:

lstm_model.predict: 
[[ 0.27675897 0.15364291] 
[ 0.49197391 0.04097994]] 

weights I don't understand: 
[array([[ 0.11056691, 0.03153521, -0.78214532, 0.04079598, 0.32587671, 
      0.72789955, 0.58123612, -0.57094401]], dtype=float32), 
array([[-0.16277026, -0.43958429, 0.30112407, 0.07443386, 0.70584315, 
      0.17196879, -0.14703408, 0.36694485], 
     [-0.03672785, -0.55035251, 0.27230391, -0.45381972, -0.06399836, 
     -0.00104597, 0.14719161, -0.62441903]], dtype=float32), 
array([ 0., 0., 1., 1., 0., 0., 0., 0.], dtype=float32)]

Source

2017-10-26 Alex Ozerov

You can get the names of the weights from the tensor objects:

weight_tensors = lstm_model.layers[1].weights 
weight_names = list(map(lambda x: x.name, weight_tensors)) 
print(weight_names)

Output:

['lstm_1/kernel:0', 'lstm_1/recurrent_kernel:0', 'lstm_1/bias:0']

In the source code you can see that those weights are split into weights for the input gate, forget gate, cell state, and output gate:

self.kernel_i = self.kernel[:, :self.units]
self.kernel_f = self.kernel[:, self.units: self.units * 2]
self.kernel_c = self.kernel[:, self.units * 2: self.units * 3]
self.kernel_o = self.kernel[:, self.units * 3:]

self.recurrent_kernel_i = self.recurrent_kernel[:, :self.units]
self.recurrent_kernel_f = self.recurrent_kernel[:, self.units: self.units * 2]
self.recurrent_kernel_c = self.recurrent_kernel[:, self.units * 2: self.units * 3]
self.recurrent_kernel_o = self.recurrent_kernel[:, self.units * 3:]

if self.use_bias:
    self.bias_i = self.bias[:self.units]
    self.bias_f = self.bias[self.units: self.units * 2]
    self.bias_c = self.bias[self.units * 2: self.units * 3]
    self.bias_o = self.bias[self.units * 3:]
else:
    self.bias_i = None
    self.bias_f = None
    self.bias_c = None
    self.bias_o = None
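The same slicing can be tried on a plain NumPy array. This is only a small sketch: `split_gates` is a helper name made up for this illustration, not part of Keras, and the bias values are copied from the `get_weights()` output in the question.

```python
import numpy as np

def split_gates(weights, units):
    """Split the last axis of an LSTM weight array into four gate blocks.

    Hypothetical helper for illustration; it mirrors the i, f, c, o
    slicing order shown in the source excerpt above.
    """
    names = ("input", "forget", "cell", "output")
    return {name: weights[..., k * units:(k + 1) * units]
            for k, name in enumerate(names)}

units = 2
# Bias array as printed by get_weights() in the question.
bias = np.array([0., 0., 1., 1., 0., 0., 0., 0.], dtype=np.float32)

bias_gates = split_gates(bias, units)
for name, block in bias_gates.items():
    print(name, block)
```

Only the forget block is non-zero here, which is consistent with the layer's `unit_forget_bias` option initializing the forget-gate bias to one.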

How those weights are used depends on the implementation. I always refer to Christopher Olah's blog for the formulation.
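As a rough illustration of how the blocks are used, here is a minimal NumPy sketch of the standard LSTM recurrence, fed with the weights printed in the question. The names `lstm_forward` and `gate_activation` are made up for this sketch and are not Keras API. The `hard_sigmoid` gate activation is an assumption based on the Keras defaults of that period; newer releases default to a plain sigmoid, in which case a different `gate_activation` should be passed.

```python
import numpy as np

def hard_sigmoid(z):
    # Piecewise-linear approximation of the sigmoid: clip(0.2 * z + 0.5, 0, 1).
    return np.clip(0.2 * z + 0.5, 0.0, 1.0)

def lstm_forward(X, kernel, recurrent_kernel, bias, units,
                 gate_activation=hard_sigmoid):
    """Return the last hidden state for X of shape (batch, timesteps, features)."""
    h = np.zeros((X.shape[0], units))  # hidden state
    c = np.zeros((X.shape[0], units))  # cell state
    for t in range(X.shape[1]):
        # One affine map for all four gates, then split in i, f, c, o order.
        z = X[:, t, :] @ kernel + h @ recurrent_kernel + bias
        z_i, z_f, z_c, z_o = np.split(z, 4, axis=1)
        i = gate_activation(z_i)        # input gate
        f = gate_activation(z_f)        # forget gate
        o = gate_activation(z_o)        # output gate
        c = f * c + i * np.tanh(z_c)    # new cell state
        h = o * np.tanh(c)              # new hidden state
    return h

# Weights copied from the get_weights() output in the question.
kernel = np.array([[0.11056691, 0.03153521, -0.78214532, 0.04079598,
                    0.32587671, 0.72789955, 0.58123612, -0.57094401]])
recurrent_kernel = np.array([
    [-0.16277026, -0.43958429, 0.30112407, 0.07443386,
     0.70584315, 0.17196879, -0.14703408, 0.36694485],
    [-0.03672785, -0.55035251, 0.27230391, -0.45381972,
     -0.06399836, -0.00104597, 0.14719161, -0.62441903]])
bias = np.array([0., 0., 1., 1., 0., 0., 0., 0.])

# Same input as in the question: 2 samples, 2 timesteps, 1 feature.
X = np.array([[1, 2], [3, 4]], dtype=float).reshape(2, 2, 1)

manual_prediction = lstm_forward(X, kernel, recurrent_kernel, bias, units=2)
print('manual LSTM forward pass:\n', manual_prediction)
```

With these weights the result should agree with the `lstm_model.predict` output in the question to roughly four decimal places. If it does not on your setup, check which `recurrent_activation` your layer is actually using.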

Source

2017-10-26 12:26:38
