
RNN Starter Demo

​ I spent a lot of time explaining how RNNs and LSTMs work, and this part really is somewhat hard to grasp. Practice is the sole criterion for testing truth, so without further ado, here is the TensorFlow code for an RNN improved with an LSTM cell, along with its training output.


#author: victor
#What is a Recurrent Neural Network (RNN)?

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
tf.reset_default_graph()

#MNIST data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

#network parameters
lr = 0.001                #learning rate
training_iters = 1000000  #how many training samples to see in total
batch_size = 128
display_step = 10

n_inputs = 28         #MNIST data input (image shape: 28*28); one pixel row per step
n_steps = 28          #time steps: the 28 rows of each image are fed as 28 steps
n_hidden_units = 128  #neurons in the hidden layer
n_classes = 10        #MNIST classes (digits 0-9)

#tf Graph input
x = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
y = tf.placeholder(tf.float32, [None, n_classes])

#Define weights
weights = {
    #(28, 128)
    'in': tf.Variable(tf.random_normal([n_inputs, n_hidden_units])),
    #(128, 10)
    'out': tf.Variable(tf.random_normal([n_hidden_units, n_classes]))
}

#Define biases
biases = {
    #(128,)
    'in': tf.Variable(tf.constant(0.1, shape=[n_hidden_units, ])),
    #(10,)
    'out': tf.Variable(tf.constant(0.1, shape=[n_classes, ]))
}


#Define the RNN
def RNN(X, weights, biases):
    #hidden layer from input to cell
    #X has shape (128 batch, 28 steps, 28 inputs)
    #flatten X to (128*28, 28 inputs)
    X = tf.reshape(X, [-1, n_inputs])
    #project to (128*28, 128 hidden); note the bias is added AFTER the matmul
    X_in = tf.matmul(X, weights['in']) + biases['in']
    #reshape back to (128 batch, 28 steps, 128 hidden)
    X_in = tf.reshape(X_in, [-1, n_steps, n_hidden_units])

    #cell
    #use an LSTM (Long Short-Term Memory) cell: a plain RNN suffers from
    #exploding/vanishing gradients, which the LSTM gates mitigate.
    #RNNs usually use tanh() as the activation; late in training its
    #saturation drives gradients toward 0 -- the "vanishing gradient" problem.
    lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(n_hidden_units,
                                             forget_bias=1.0,
                                             state_is_tuple=True)  #return the state as a tuple
    #the LSTM state is split into two parts (c_state, m_state):
    #c_state is the cell ("main line") state, m_state is the hidden ("branch line") state
    _init_state = lstm_cell.zero_state(batch_size, dtype=tf.float32)
    #dynamic_rnn is preferable to the static rnn: it handles sequences of
    #different lengths and reduces graph-construction cost
    #time_major=False means the 28 steps sit on axis 1, not axis 0
    outputs, states = tf.nn.dynamic_rnn(lstm_cell, X_in, initial_state=_init_state, time_major=False)

    #hidden layer producing the final results
    #method 1: use the final hidden state (states[1] is m_state, i.e. the last output)
    results = tf.matmul(states[1], weights['out']) + biases['out']
    #method 2:
    #unstack the outputs tensor into a list [(batch, outputs), ...] * steps
    #outputs = tf.unstack(tf.transpose(outputs, [1, 0, 2]))
    #then take the outputs of the last step, i.e. index -1
    #results = tf.matmul(outputs[-1], weights['out']) + biases['out']
    return results

#prediction
pred = RNN(x, weights, biases)
#cost
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
#train op
train_op = tf.train.AdamOptimizer(lr).minimize(cost)

correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

#important step (tf.initialize_all_variables is deprecated)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    step = 0
    while step * batch_size < training_iters:
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        #reshape each flat 784-pixel image into 28 rows * 28 columns, plus the batch dimension
        batch_xs = batch_xs.reshape([batch_size, n_steps, n_inputs])
        sess.run([train_op], feed_dict={x: batch_xs, y: batch_ys})
        if step % 20 == 0:
            print(sess.run(accuracy, feed_dict={x: batch_xs, y: batch_ys}))
        step += 1
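To make the `(c_state, m_state)` tuple and `forget_bias=1.0` concrete, here is a minimal NumPy sketch of a single LSTM step (my own illustration, not code from the demo): the three gates are sigmoids, the candidate uses tanh, and the forget bias is added to the forget gate before the sigmoid, which matches how `BasicLSTMCell` applies it.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, c_prev, h_prev, W, b, forget_bias=1.0):
    """One LSTM step; W maps [x, h_prev] to the 4 stacked pre-activations (i, j, f, o)."""
    z = np.concatenate([x, h_prev], axis=-1) @ W + b
    i, j, f, o = np.split(z, 4, axis=-1)  # input gate, candidate, forget gate, output gate
    # main line (c_state): forget part of the old cell, add the gated candidate
    c = sigmoid(f + forget_bias) * c_prev + sigmoid(i) * np.tanh(j)
    # branch line (m_state): the hidden state / output of this step
    h = sigmoid(o) * np.tanh(c)
    return c, h

# toy sizes: 28 inputs (one image row), 4 hidden units
rng = np.random.default_rng(0)
n_in, n_hidden = 28, 4
W = rng.standard_normal((n_in + n_hidden, 4 * n_hidden)) * 0.1
b = np.zeros(4 * n_hidden)
c, h = np.zeros(n_hidden), np.zeros(n_hidden)
x_t = rng.standard_normal(n_in)
c, h = lstm_step(x_t, c, h, W, b)
print(c.shape, h.shape)  # both (4,)
```

Because `h = sigmoid(o) * tanh(c)`, the hidden state stays strictly inside (-1, 1), which is part of why the gated cell is better behaved than a plain tanh RNN.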

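The `cost` and `accuracy` lines can be checked in plain NumPy. The sketch below (my own illustration, with made-up toy logits) reproduces softmax cross-entropy with a mean reduction, and the argmax-equality accuracy, on a batch of 3 examples where the third prediction is deliberately wrong.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# toy batch: 3 examples, 10 classes; the third prediction does not match its label
logits = np.array([[5.0] + [0.0] * 9,
                   [0.0, 5.0] + [0.0] * 8,
                   [0.0] * 9 + [5.0]])
labels = np.eye(10)[[0, 1, 2]]  # one-hot labels, like the one_hot=True MNIST labels

# softmax_cross_entropy_with_logits, then reduce_mean over the batch
cost = -np.mean(np.sum(labels * np.log(softmax(logits)), axis=1))

# tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1)) -> cast to float -> reduce_mean
accuracy = np.mean(np.argmax(logits, axis=1) == np.argmax(labels, axis=1))
print(round(accuracy, 3))  # 0.667 -- 2 of the 3 predictions match
```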
Results

  • Since I am running the CPU build of TensorFlow, training is fairly slow, so I just have to wait it out.

[figure: RNN training output]

  • Accuracy rises steadily as the number of training steps increases.

    [figure: RNN training accuracy]

  • Training was configured for 1,000,000 iterations, printing the accuracy every 20 steps; since the run takes so long I won't screenshot every output. After training finished, accuracy reached 99%.
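As a final sanity check on the input pipeline: the reshape in the training loop turns each flat 784-pixel image into 28 time steps of 28 inputs, where step t is pixel row t of the image. A NumPy sketch (with a synthetic stand-in for `mnist.train.next_batch`) makes this explicit:

```python
import numpy as np

batch_size, n_steps, n_inputs = 128, 28, 28

# stand-in for mnist.train.next_batch(): a batch of flat 784-pixel images
batch_xs = np.arange(batch_size * 784, dtype=np.float32).reshape(batch_size, 784)

# the same reshape as in the training loop
batch_rnn = batch_xs.reshape(batch_size, n_steps, n_inputs)

# time step t of image i is row t of that image, i.e. pixels [t*28, (t+1)*28)
t = 3
assert np.array_equal(batch_rnn[0, t], batch_xs[0, t * 28:(t + 1) * 28])
print(batch_rnn.shape)  # (128, 28, 28)
```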