{"id":1061,"date":"2021-10-20T08:41:56","date_gmt":"2021-10-20T08:41:56","guid":{"rendered":"https:\/\/salarydistribution.com\/machine-learning\/2021\/10\/20\/adding-a-custom-attention-layer-to-recurrent-neural-network-in-keras\/"},"modified":"2021-10-20T08:41:56","modified_gmt":"2021-10-20T08:41:56","slug":"adding-a-custom-attention-layer-to-recurrent-neural-network-in-keras","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2021\/10\/20\/adding-a-custom-attention-layer-to-recurrent-neural-network-in-keras\/","title":{"rendered":"Adding A Custom Attention Layer To Recurrent Neural Network In Keras"},"content":{"rendered":"<div id=\"\">\n<p id=\"last-modified-info\">Last Updated on October 12, 2021<\/p>\n<p>Deep learning networks have gained immense popularity in the past few years. The \u2018attention mechanism\u2019 is integrated with the deep learning networks to improve their performance. Adding attention component to the network has shown significant improvement in tasks such as machine translation, image recognition, text summarization and similar applications.<\/p>\n<p>This tutorial shows how to add a custom attention layer to a network built using a recurrent neural network. We\u2019ll illustrate an end to end application of time series forecasting using a very simple dataset. The tutorial is designed for anyone looking for a basic understanding of how to add user defined layers to a deep learning network and use this simple example to build more complex applications.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>Which methods are required to create a custom attention layer in Keras<\/li>\n<li>How to incorporate the new layer in a network built with SimpleRNN<\/li>\n<\/ul>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_12956\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/yahya-ehsan-L895sqROaGw-unsplash-scaled.jpg\"><img aria-describedby=\"caption-attachment-12956\" loading=\"lazy\" class=\"wp-image-12956\" data-cfsrc=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/yahya-ehsan-L895sqROaGw-unsplash-300x189.jpg\" alt=\"Adding A Custom Attention Layer To Recurrent Neural Network In Keras &lt;br&gt; Photo by \" width=\"661\" height=\"417\"><img decoding=\"async\" aria-describedby=\"caption-attachment-12956\" loading=\"lazy\" class=\"wp-image-12956\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/yahya-ehsan-L895sqROaGw-unsplash-300x189.jpg\" alt=\"Adding A Custom Attention Layer To Recurrent Neural Network In Keras &lt;br&gt; Photo by \" width=\"661\" height=\"417\"><\/a><\/p>\n<p id=\"caption-attachment-12956\" class=\"wp-caption-text\">Adding A Custom Attention Layer To Recurrent Neural Network In Keras <br \/>Photo by Yahya Ehsan, some rights reserved.<\/p>\n<\/div>\n<h2 id=\"Tutorial-Overview\">Tutorial Overview<\/h2>\n<p>This tutorial is divided into three parts; they are:<\/p>\n<ul>\n<li>Preparing a simple dataset for time series forecasting<\/li>\n<li>How to use a network built via SimpleRNN for time series forecasting<\/li>\n<li>Adding a custom attention layer to the SimpleRNN network<\/li>\n<\/ul>\n<h2 id=\"Prerequisites\">Prerequisites<\/h2>\n<p>It is assumed that you are familiar with the following topics. You can click the links below for an overview.<\/p>\n<h2 id=\"The-Dataset\">The Dataset<\/h2>\n<p>The focus of this article is to gain a basic understanding of how to build a custom attention layer to a deep learning network. For this purpose, we\u2019ll use a very simple example of a Fibonacci sequence, where one number is constructed from previous two numbers. The first 10 numbers of the sequence are shown below:<\/p>\n<p>0, 1, 1, 2, 3, 5, 8, 13, 21, 34, \u2026<\/p>\n<p>When given the previous \u2018t\u2019 numbers, can we get a machine to accurately reconstruct the next number? This would mean discarding all the previous inputs except the last two and performing the correct operation on the last two numbers.<\/p>\n<p>For this tutorial, we\u2019ll construct the training examples from <code>t<\/code> time steps and use the value at <code>t+1<\/code> as the target. For example, if <code>t=3<\/code>, then the training examples and the corresponding target values would look as follows:<\/p>\n<p><a href=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/fib.png\"><img loading=\"lazy\" class=\" wp-image-12955 aligncenter\" data-cfsrc=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/fib-300x209.png\" alt=\"\" width=\"472\" height=\"329\"><img decoding=\"async\" loading=\"lazy\" class=\" wp-image-12955 aligncenter\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/fib-300x209.png\" alt=\"\" width=\"472\" height=\"329\"><\/a><\/p>\n<h2 id=\"The-SimpleRNN-Network\">The SimpleRNN Network<\/h2>\n<p>In this section, we\u2019ll write the basic code to generate the dataset and use a SimpleRNN network for predicting the next number of the Fibonacci sequence.<\/p>\n<h3 id=\"The-Import-Section\">The Import Section<\/h3>\n<p>Let\u2019s first write the import section:<\/p>\n<div id=\"urvanov-syntax-highlighter-616fd3fb50caf780604859\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\nfrom pandas import read_csv<br \/>\nimport numpy as np<br \/>\nfrom keras import Model<br \/>\nfrom keras.layers import Layer<br \/>\nimport keras.backend as K<br \/>\nfrom keras.layers import Input, Dense, SimpleRNN<br \/>\nfrom sklearn.preprocessing import MinMaxScaler<br \/>\nfrom keras.models import Sequential<br \/>\nfrom keras.metrics import mean_squared_error<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">read_csv<\/span><\/p>\n<p><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">numpy <\/span><span class=\"crayon-st\">as<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">np<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">keras <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">Model<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">keras<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">layers <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">Layer<\/span><\/p>\n<p><span class=\"crayon-e\">import <\/span><span class=\"crayon-v\">keras<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">backend <\/span><span class=\"crayon-st\">as<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">K<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">keras<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">layers <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-v\">Input<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Dense<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">SimpleRNN<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">preprocessing <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">MinMaxScaler<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">keras<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">models <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">Sequential<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">keras<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">metrics <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-v\">mean_squared_error<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<h3 id=\"Preparing-The-Dataset\">Preparing The Dataset<\/h3>\n<p>The following function generates a sequence of n Fibonacci numbers (not counting the starting two values). If <code>scale_data<\/code> is set to True, then it would also use the <code>MinMaxScaler<\/code> from scikit-learn to scale the values between 0 and 1. Let\u2019s see its output for <code>n=10<\/code>.<\/p>\n<div id=\"urvanov-syntax-highlighter-616fd3fb50cb4832004777\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\ndef get_fib_seq(n, scale_data=True):<br \/>\n    # Get the Fibonacci sequence<br \/>\n    seq = np.zeros(n)<br \/>\n    fib_n1 = 0.0<br \/>\n    fib_n = 1.0<br \/>\n    for i in range(n):<br \/>\n            seq[i] = fib_n1 + fib_n<br \/>\n            fib_n1 = fib_n<br \/>\n            fib_n = seq[i]<br \/>\n    scaler = []<br \/>\n    if scale_data:<br \/>\n        scaler = MinMaxScaler(feature_range=(0, 1))<br \/>\n        seq = np.reshape(seq, (n, 1))<br \/>\n        seq = scaler.fit_transform(seq).flatten()<br \/>\n    return seq, scaler<\/p>\n<p>fib_seq = get_fib_seq(10, False)[0]<br \/>\nprint(fib_seq)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<div class=\"urvanov-syntax-highlighter-nums-content\">\n<p>1<\/p>\n<p>2<\/p>\n<p>3<\/p>\n<p>4<\/p>\n<p>5<\/p>\n<p>6<\/p>\n<p>7<\/p>\n<p>8<\/p>\n<p>9<\/p>\n<p>10<\/p>\n<p>11<\/p>\n<p>12<\/p>\n<p>13<\/p>\n<p>14<\/p>\n<p>15<\/p>\n<p>16<\/p>\n<p>17<\/p>\n<p>18<\/p>\n<\/div>\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">get_fib_seq<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scale_data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">True<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-p\"># Get the Fibonacci sequence<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">seq<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">zeros<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">fib_n1<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">0.0<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">fib_n<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1.0<\/span><span class=\"crayon-h\"> <\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">i<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">seq<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">i<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">fib_n1<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">+<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">fib_n<\/span><\/p>\n<p><span class=\"crayon-e\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">fib_n1<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">fib_n<\/span><\/p>\n<p><span class=\"crayon-e\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">fib_n<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">seq<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">i<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-h\"> <\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">scaler<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scale_data<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">scaler<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">MinMaxScaler<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">feature_range<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">seq<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">seq<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">seq<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scaler<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit_transform<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">seq<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">flatten<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">seq<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">scaler<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-v\">fib_seq<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">get_fib_seq<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">10<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-t\">False<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">fib_seq<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<div id=\"urvanov-syntax-highlighter-616fd3fb50cb5090913378\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n[ 1.  2.  3.  5.  8. 13. 21. 34. 55. 89.]<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">[<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1.<\/span><span class=\"crayon-h\">\u00a0\u00a0<\/span><span class=\"crayon-cn\">2.<\/span><span class=\"crayon-h\">\u00a0\u00a0<\/span><span class=\"crayon-cn\">3.<\/span><span class=\"crayon-h\">\u00a0\u00a0<\/span><span class=\"crayon-cn\">5.<\/span><span class=\"crayon-h\">\u00a0\u00a0<\/span><span class=\"crayon-cn\">8.<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">13.<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">21.<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">34.<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">55.<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">89.<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p>Next, we need a function <code>get_fib_XY()<\/code> that reformats the sequence into training examples and target values to be used by the Keras input layer. When given <code>time_steps<\/code> as a parameter, <code>get_fib_XY()<\/code> constructs each row of the dataset with <code>time_steps<\/code> number of columns. This function not only constructs the training set and test set from the Fibonacci sequence, but also shuffles the training examples and reshapes them to the required TensorFlow format, i.e., <code>total_samples x time_steps x features<\/code>. Also, the function returns the <code>scaler<\/code> object that scales the values if <code>scale_data<\/code> is set to <code>True<\/code>.<\/p>\n<p>Let\u2019s generate a small training set to see what it looks like. We have set <code>time_steps=3<\/code>, <code>total_fib_numbers=12<\/code>, with approximately 70% examples going towards the test points. Note the training and test examples have been shuffled by the <code>permutation()<\/code> function.<\/p>\n<div id=\"urvanov-syntax-highlighter-616fd3fb50cb6040137465\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\ndef get_fib_XY(total_fib_numbers, time_steps, train_percent, scale_data=True):<br \/>\n    dat, scaler = get_fib_seq(total_fib_numbers, scale_data)<br \/>\n    Y_ind = np.arange(time_steps, len(dat), 1)<br \/>\n    Y = dat[Y_ind]<br \/>\n    rows_x = len(Y)<br \/>\n    X = dat[0:rows_x]<br \/>\n    for i in range(time_steps-1):<br \/>\n        temp = dat[i+1:rows_x+i+1]<br \/>\n        X = np.column_stack((X, temp))<br \/>\n    # random permutation with fixed seed<br \/>\n    rand = np.random.RandomState(seed=13)<br \/>\n    idx = rand.permutation(rows_x)<br \/>\n    split = int(train_percent*rows_x)<br \/>\n    train_ind = idx[0:split]<br \/>\n    test_ind = idx[split:]<br \/>\n    trainX = X[train_ind]<br \/>\n    trainY = Y[train_ind]<br \/>\n    testX = X[test_ind]<br \/>\n    testY = Y[test_ind]<br \/>\n    trainX = np.reshape(trainX, (len(trainX), time_steps, 1))<br \/>\n    testX = np.reshape(testX, (len(testX), time_steps, 1))<br \/>\n    return trainX, trainY, testX, testY, scaler<\/p>\n<p>trainX, trainY, testX, testY, scaler = get_fib_XY(12, 3, 0.7, False)<br \/>\nprint(&#8216;trainX = &#8216;, trainX)<br \/>\nprint(&#8216;trainY = &#8216;, trainY)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<div class=\"urvanov-syntax-highlighter-nums-content\">\n<p>1<\/p>\n<p>2<\/p>\n<p>3<\/p>\n<p>4<\/p>\n<p>5<\/p>\n<p>6<\/p>\n<p>7<\/p>\n<p>8<\/p>\n<p>9<\/p>\n<p>10<\/p>\n<p>11<\/p>\n<p>12<\/p>\n<p>13<\/p>\n<p>14<\/p>\n<p>15<\/p>\n<p>16<\/p>\n<p>17<\/p>\n<p>18<\/p>\n<p>19<\/p>\n<p>20<\/p>\n<p>21<\/p>\n<p>22<\/p>\n<p>23<\/p>\n<p>24<\/p>\n<p>25<\/p>\n<p>26<\/p>\n<\/div>\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">get_fib_XY<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">total_fib_numbers<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">time_steps<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">train_percent<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scale_data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">True<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">dat<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scaler<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">get_fib_seq<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">total_fib_numbers<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scale_data<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">Y_ind<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">arange<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">time_steps<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">len<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">dat<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">Y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">dat<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">Y_ind<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">rows_x<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">len<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">Y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">dat<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-v\">rows_x<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">i<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">time_steps<\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">temp<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">dat<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">i<\/span><span class=\"crayon-o\">+<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-v\">rows_x<\/span><span class=\"crayon-o\">+<\/span><span class=\"crayon-v\">i<\/span><span class=\"crayon-o\">+<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">column_stack<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">temp<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-p\"># random permutation with fixed seed\u00a0\u00a0 <\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">rand<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">random<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">RandomState<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">seed<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">13<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">idx<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">rand<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">permutation<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">rows_x<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">split<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-t\">int<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-e \">train_percent*<\/span><span class=\"crayon-v\">rows_x<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">train_ind<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">idx<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-v\">split<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">test_ind<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">idx<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">split<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">train_ind<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">trainY<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Y<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">train_ind<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">testX<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">test_ind<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">testY<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Y<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">test_ind<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-e\">len<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">time_steps<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">testX<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">testX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-e\">len<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">testX<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">time_steps<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainY<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">testX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">testY<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">scaler<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainY<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">testX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">testY<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scaler<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">get_fib_XY<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">12<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">3<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">0.7<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-t\">False<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8216;trainX = &#8216;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8216;trainY = &#8216;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainY<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<div id=\"urvanov-syntax-highlighter-616fd3fb50cb7171718342\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\ntrainX =  [[[ 8.]<br \/>\n  [13.]<br \/>\n  [21.]]<\/p>\n<p> [[ 5.]<br \/>\n  [ 8.]<br \/>\n  [13.]]<\/p>\n<p> [[ 2.]<br \/>\n  [ 3.]<br \/>\n  [ 5.]]<\/p>\n<p> [[13.]<br \/>\n  [21.]<br \/>\n  [34.]]<\/p>\n<p> [[21.]<br \/>\n  [34.]<br \/>\n  [55.]]<\/p>\n<p> [[34.]<br \/>\n  [55.]<br \/>\n  [89.]]]<br \/>\ntrainY =  [ 34.  21.   8.  55.  89. 144.]<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<div class=\"urvanov-syntax-highlighter-nums-content\">\n<p>1<\/p>\n<p>2<\/p>\n<p>3<\/p>\n<p>4<\/p>\n<p>5<\/p>\n<p>6<\/p>\n<p>7<\/p>\n<p>8<\/p>\n<p>9<\/p>\n<p>10<\/p>\n<p>11<\/p>\n<p>12<\/p>\n<p>13<\/p>\n<p>14<\/p>\n<p>15<\/p>\n<p>16<\/p>\n<p>17<\/p>\n<p>18<\/p>\n<p>19<\/p>\n<p>20<\/p>\n<p>21<\/p>\n<p>22<\/p>\n<p>23<\/p>\n<p>24<\/p>\n<\/div>\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>trainX =\u00a0\u00a0[[[ 8.]<\/p>\n<p>\u00a0\u00a0[13.]<\/p>\n<p>\u00a0\u00a0[21.]]<\/p>\n<p>\u00a0<\/p>\n<p> [[ 5.]<\/p>\n<p>\u00a0\u00a0[ 8.]<\/p>\n<p>\u00a0\u00a0[13.]]<\/p>\n<p>\u00a0<\/p>\n<p> [[ 2.]<\/p>\n<p>\u00a0\u00a0[ 3.]<\/p>\n<p>\u00a0\u00a0[ 5.]]<\/p>\n<p>\u00a0<\/p>\n<p> [[13.]<\/p>\n<p>\u00a0\u00a0[21.]<\/p>\n<p>\u00a0\u00a0[34.]]<\/p>\n<p>\u00a0<\/p>\n<p> [[21.]<\/p>\n<p>\u00a0\u00a0[34.]<\/p>\n<p>\u00a0\u00a0[55.]]<\/p>\n<p>\u00a0<\/p>\n<p> [[34.]<\/p>\n<p>\u00a0\u00a0[55.]<\/p>\n<p>\u00a0\u00a0[89.]]]<\/p>\n<p>trainY =\u00a0\u00a0[ 34.\u00a0\u00a021.\u00a0\u00a0 8.\u00a0\u00a055.\u00a0\u00a089. 144.]<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<h3 id=\"Setting-Up-The-Network\">Setting Up The Network<\/h3>\n<p>Now let\u2019s setup a small network with two layers. The first one being the <code>SimpleRNN<\/code> layer and the second one being the <code>Dense<\/code> layer. Below is a summary of the model.<\/p>\n<div id=\"urvanov-syntax-highlighter-616fd3fb50cb8386660470\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# Set up parameters<br \/>\ntime_steps = 20<br \/>\nhidden_units = 2<br \/>\nepochs = 30<\/p>\n<p># Create a traditional RNN network<br \/>\ndef create_RNN(hidden_units, dense_units, input_shape, activation):<br \/>\n    model = Sequential()<br \/>\n    model.add(SimpleRNN(hidden_units, input_shape=input_shape, activation=activation[0]))<br \/>\n    model.add(Dense(units=dense_units, activation=activation[1]))<br \/>\n    model.compile(loss=&#8217;mse&#8217;, optimizer=&#8217;adam&#8217;)<br \/>\n    return model<\/p>\n<p>model_RNN = create_RNN(hidden_units=hidden_units, dense_units=1, input_shape=(time_steps,1),<br \/>\n                   activation=[&#8216;tanh&#8217;, &#8216;tanh&#8217;])<br \/>\nmodel_RNN.summary()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># Set up parameters<\/span><\/p>\n<p><span class=\"crayon-v\">time_steps<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">20<\/span><\/p>\n<p><span class=\"crayon-v\">hidden_units<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">2<\/span><\/p>\n<p><span class=\"crayon-v\">epochs<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">30<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Create a traditional RNN network<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">create_RNN<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">hidden_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">dense_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">model<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">Sequential<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">model<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">add<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-e\">SimpleRNN<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">hidden_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">model<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">add<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-e\">Dense<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">units<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">dense_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">model<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">compile<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">loss<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;mse&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">optimizer<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;adam&#8217;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">model<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-v\">model_RNN<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">create_RNN<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">hidden_units<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">hidden_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">dense_units<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">time_steps<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 <\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8216;tanh&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8216;tanh&#8217;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">model_RNN<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">summary<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<div id=\"urvanov-syntax-highlighter-616fd3fb50cb9952225235\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\nModel: &#8220;sequential_1&#8221;<br \/>\n_________________________________________________________________<br \/>\nLayer (type)                 Output Shape              Param #<br \/>\n=================================================================<br \/>\nsimple_rnn_3 (SimpleRNN)     (None, 2)                 8<br \/>\n_________________________________________________________________<br \/>\ndense_3 (Dense)              (None, 1)                 3<br \/>\n=================================================================<br \/>\nTotal params: 11<br \/>\nTrainable params: 11<br \/>\nNon-trainable params: 0<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>Model: &#8220;sequential_1&#8221;<\/p>\n<p>_________________________________________________________________<\/p>\n<p>Layer (type)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Output Shape\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Param #\u00a0\u00a0 <\/p>\n<p>=================================================================<\/p>\n<p>simple_rnn_3 (SimpleRNN)\u00a0\u00a0\u00a0\u00a0 (None, 2)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 8\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 <\/p>\n<p>_________________________________________________________________<\/p>\n<p>dense_3 (Dense)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0(None, 1)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 3\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 <\/p>\n<p>=================================================================<\/p>\n<p>Total params: 11<\/p>\n<p>Trainable params: 11<\/p>\n<p>Non-trainable params: 0<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<h3 id=\"Train-The-Network-And-Evaluate\">Train The Network And Evaluate<\/h3>\n<p>The next step is to add code that generates a dataset, trains the network, and evaluates it. This time around, we\u2019ll scale the data between 0 and 1. We don\u2019t need to pass <code>scale_data<\/code> parameter as its default value is <code>True<\/code>.<\/p>\n<div id=\"urvanov-syntax-highlighter-616fd3fb50cba220555699\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# Generate the dataset<br \/>\ntrainX, trainY, testX, testY, scaler  = get_fib_XY(1200, time_steps, 0.7)<\/p>\n<p>model_RNN.fit(trainX, trainY, epochs=epochs, batch_size=1, verbose=2)<\/p>\n<p># Evalute model<br \/>\ntrain_mse = model_RNN.evaluate(trainX, trainY)<br \/>\ntest_mse = model_RNN.evaluate(testX, testY)<\/p>\n<p># Print error<br \/>\nprint(&#8220;Train set MSE = &#8220;, train_mse)<br \/>\nprint(&#8220;Test set MSE = &#8220;, test_mse)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># Generate the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainY<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">testX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">testY<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scaler<\/span><span class=\"crayon-h\">\u00a0\u00a0<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">get_fib_XY<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">1200<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">time_steps<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">0.7<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-v\">model_RNN<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainY<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">epochs<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">epochs<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">batch_size<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">verbose<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">2<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Evalute model<\/span><\/p>\n<p><span class=\"crayon-v\">train_mse<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">model_RNN<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">evaluate<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainY<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">test_mse<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">model_RNN<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">evaluate<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">testX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">testY<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Print error<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;Train set MSE = &#8220;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">train_mse<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;Test set MSE = &#8220;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">test_mse<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p>As output you\u2019ll see the progress of training and the following values of mean square error:<\/p>\n<div id=\"urvanov-syntax-highlighter-616fd3fb50cbb360207523\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\nTrain set MSE =  5.631405292660929e-05<br \/>\nTest set MSE =  2.623497312015388e-05<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>Train set MSE =\u00a0\u00a05.631405292660929e-05<\/p>\n<p>Test set MSE =\u00a0\u00a02.623497312015388e-05<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<h2 id=\"Adding-A-Custom-Attention-Layer-To-The-Network\">Adding A Custom Attention Layer To The Network<\/h2>\n<p>In Keras, it is easy to create a custom layer that implements attention by subclassing the\u00a0<code>Layer<\/code>\u00a0class. The Keras guide lists down clear steps for <a href=\"https:\/\/keras.io\/guides\/making_new_layers_and_models_via_subclassing\/\" target=\"_blank\" rel=\"noopener\">creating a new layer via subclassing<\/a>. We\u2019ll use those guidelines here. All the weights and biases corresponding to a single layer are encapsulated by this class. We need to write the <code>__init__<\/code> method as well as override the following methods:<\/p>\n<ul>\n<li><code>build()<\/code>: Keras guide recommends adding weights in this method once the size of the inputs is known. This method \u2018lazily\u2019 creates weights. The builtin function <code>add_weight()<\/code> can be used to add weights and biases of the attention layer.<\/li>\n<li><code>call()<\/code>: The <code>call()<\/code> method implements the mapping of inputs to outputs. It should implement the forward pass during training.<\/li>\n<\/ul>\n<h3 id=\"The-Call-Method-For-Attention-Layer\">The Call Method For Attention Layer<\/h3>\n<p>The call method of the attention layer has to compute the alignment scores, weights, and context. You can go through the details of these parameters in Stefania\u2019s excellent article on <a href=\"https:\/\/machinelearningmastery.com\/the-attention-mechanism-from-scratch\/\" target=\"_blank\" rel=\"noopener\">The Attention Mechanism from Scratch<\/a>. We\u2019ll implement the Bahdanau attention in our <code>call()<\/code> method.<\/p>\n<p>The good thing about inheriting a layer from the Keras <code>Layer<\/code> class and adding the weights via <code>add_weights()<\/code> method is that weights are automatically tuned. Keras does an equivalent of \u2018reverse engineering\u2019 of the operations\/computations of the <code>call()<\/code> method and calculates the gradients during training. It is important to specify <code>trainable=True<\/code> when adding the weights. You can also add a <code>train_step()<\/code> method to your custom layer and specify your own method for weight training if needed.<\/p>\n<p>The code below implements our custom attention layer.<\/p>\n<div id=\"urvanov-syntax-highlighter-616fd3fb50cbc843095096\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# Add attention layer to the deep learning network<br \/>\nclass attention(Layer):<br \/>\n    def __init__(self,**kwargs):<br \/>\n        super(attention,self).__init__(**kwargs)<\/p>\n<p>    def build(self,input_shape):<br \/>\n        self.W=self.add_weight(name=&#8217;attention_weight&#8217;, shape=(input_shape[-1],1),<br \/>\n                               initializer=&#8217;random_normal&#8217;, trainable=True)<br \/>\n        self.b=self.add_weight(name=&#8217;attention_bias&#8217;, shape=(input_shape[1],1),<br \/>\n                               initializer=&#8217;zeros&#8217;, trainable=True)<br \/>\n        super(attention, self).build(input_shape)<\/p>\n<p>    def call(self,x):<br \/>\n        # Alignment scores. Pass them through tanh function<br \/>\n        e = K.tanh(K.dot(x,self.W)+self.b)<br \/>\n        # Remove dimension of size 1<br \/>\n        e = K.squeeze(e, axis=-1)<br \/>\n        # Compute the weights<br \/>\n        alpha = K.softmax(e)<br \/>\n        # Reshape to tensorFlow format<br \/>\n        alpha = K.expand_dims(alpha, axis=-1)<br \/>\n        # Compute the context vector<br \/>\n        context = x * alpha<br \/>\n        context = K.sum(context, axis=1)<br \/>\n        return context<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<div class=\"urvanov-syntax-highlighter-nums-content\">\n<p>1<\/p>\n<p>2<\/p>\n<p>3<\/p>\n<p>4<\/p>\n<p>5<\/p>\n<p>6<\/p>\n<p>7<\/p>\n<p>8<\/p>\n<p>9<\/p>\n<p>10<\/p>\n<p>11<\/p>\n<p>12<\/p>\n<p>13<\/p>\n<p>14<\/p>\n<p>15<\/p>\n<p>16<\/p>\n<p>17<\/p>\n<p>18<\/p>\n<p>19<\/p>\n<p>20<\/p>\n<p>21<\/p>\n<p>22<\/p>\n<p>23<\/p>\n<p>24<\/p>\n<p>25<\/p>\n<\/div>\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># Add attention layer to the deep learning network<\/span><\/p>\n<p><span class=\"crayon-t\">class<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">attention<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">Layer<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">__init__<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-o\">*<\/span><span class=\"crayon-o\">*<\/span><span class=\"crayon-v\">kwargs<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-r\">super<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">attention<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">__init__<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-o\">*<\/span><span class=\"crayon-o\">*<\/span><span class=\"crayon-v\">kwargs<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">build<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">W<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">add_weight<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">name<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;attention_weight&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 <\/span><span class=\"crayon-v\">initializer<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;random_normal&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainable<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">True<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">b<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">add_weight<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">name<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;attention_bias&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 <\/span><span class=\"crayon-v\">initializer<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;zeros&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainable<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">True<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-r\">super<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">attention<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">build<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">call<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-p\"># Alignment scores. Pass them through tanh function<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">e<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">K<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">tanh<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">K<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">dot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">W<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">+<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">b<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-p\"># Remove dimension of size 1<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">e<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">K<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">squeeze<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">e<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">axis<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\">\u00a0\u00a0 <\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-p\"># Compute the weights<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">alpha<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">K<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">softmax<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">e<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-p\"># Reshape to tensorFlow format<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">alpha<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">K<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">expand_dims<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">alpha<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">axis<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-p\"># Compute the context vector<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">context<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e \">x *<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">alpha<\/span><\/p>\n<p><span class=\"crayon-e\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">context<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">K<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">sum<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">context<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">axis<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">context<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<h3>RNN Network With Attention Layer<\/h3>\n<p>Let\u2019s now add an attention layer to the RNN network we created earlier. The function <code>create_RNN_with_attention()<\/code> now specifies an RNN layer, attention layer and Dense layer in the network. Make sure to set <code>return_sequences=True<\/code> when specifying the SimpleRNN. This will return the output of the hidden units for all the previous time steps.<\/p>\n<p>Let\u2019s look at a summary of our model with attention.<\/p>\n<div id=\"urvanov-syntax-highlighter-616fd3fb50cbd096730467\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\ndef create_RNN_with_attention(hidden_units, dense_units, input_shape, activation):<br \/>\n    x=Input(shape=input_shape)<br \/>\n    RNN_layer = SimpleRNN(hidden_units, return_sequences=True, activation=activation)(x)<br \/>\n    attention_layer = attention()(RNN_layer)<br \/>\n    outputs=Dense(dense_units, trainable=True, activation=activation)(attention_layer)<br \/>\n    model=Model(x,outputs)<br \/>\n    model.compile(loss=&#8217;mse&#8217;, optimizer=&#8217;adam&#8217;)<br \/>\n    return model    <\/p>\n<p>model_attention = create_RNN_with_attention(hidden_units=hidden_units, dense_units=1,<br \/>\n                                  input_shape=(time_steps,1), activation=&#8217;tanh&#8217;)<br \/>\nmodel_attention.summary()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">create_RNN_with_attention<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">hidden_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">dense_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-e\">Input<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">RNN_layer<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">SimpleRNN<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">hidden_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">return_sequences<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">True<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">attention_layer<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">attention<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">RNN_layer<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">outputs<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-e\">Dense<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">dense_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainable<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">True<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">attention_layer<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">model<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-e\">Model<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-v\">outputs<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">model<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">compile<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">loss<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;mse&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">optimizer<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;adam&#8217;<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">model\u00a0\u00a0\u00a0\u00a0<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-v\">model_attention<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">create_RNN_with_attention<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">hidden_units<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">hidden_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">dense_units<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">time_steps<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;tanh&#8217;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">model_attention<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">summary<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<div id=\"urvanov-syntax-highlighter-616fd3fb50cbe306861253\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\nModel: &#8220;model_1&#8221;<br \/>\n_________________________________________________________________<br \/>\nLayer (type)                 Output Shape              Param #<br \/>\n=================================================================<br \/>\ninput_2 (InputLayer)         [(None, 20, 1)]           0<br \/>\n_________________________________________________________________<br \/>\nsimple_rnn_2 (SimpleRNN)     (None, 20, 2)             8<br \/>\n_________________________________________________________________<br \/>\nattention_1 (attention)      (None, 2)                 22<br \/>\n_________________________________________________________________<br \/>\ndense_2 (Dense)              (None, 1)                 3<br \/>\n=================================================================<br \/>\nTotal params: 33<br \/>\nTrainable params: 33<br \/>\nNon-trainable params: 0<br \/>\n_________________________________________________________________<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>Model: &#8220;model_1&#8221;<\/p>\n<p>_________________________________________________________________<\/p>\n<p>Layer (type)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Output Shape\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Param #\u00a0\u00a0 <\/p>\n<p>=================================================================<\/p>\n<p>input_2 (InputLayer)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 [(None, 20, 1)]\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 <\/p>\n<p>_________________________________________________________________<\/p>\n<p>simple_rnn_2 (SimpleRNN)\u00a0\u00a0\u00a0\u00a0 (None, 20, 2)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 8\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 <\/p>\n<p>_________________________________________________________________<\/p>\n<p>attention_1 (attention)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0(None, 2)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 22\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/p>\n<p>_________________________________________________________________<\/p>\n<p>dense_2 (Dense)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0(None, 1)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 3\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 <\/p>\n<p>=================================================================<\/p>\n<p>Total params: 33<\/p>\n<p>Trainable params: 33<\/p>\n<p>Non-trainable params: 0<\/p>\n<p>_________________________________________________________________<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<h3>Train And Evaluate The Deep Learning Network With Attention<\/h3>\n<p>It\u2019s time to train and test our model and see how it performs on predicting the next Fibonacci number of a sequence.<\/p>\n<div id=\"urvanov-syntax-highlighter-616fd3fb50cbf609618932\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\nmodel_attention.fit(trainX, trainY, epochs=epochs, batch_size=1, verbose=2)<\/p>\n<p># Evalute model<br \/>\ntrain_mse_attn = model_attention.evaluate(trainX, trainY)<br \/>\ntest_mse_attn = model_attention.evaluate(testX, testY)<\/p>\n<p># Print error<br \/>\nprint(&#8220;Train set MSE with attention = &#8220;, train_mse_attn)<br \/>\nprint(&#8220;Test set MSE with attention = &#8220;, test_mse_attn)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-v\">model_attention<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainY<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">epochs<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">epochs<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">batch_size<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">verbose<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">2<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Evalute model<\/span><\/p>\n<p><span class=\"crayon-v\">train_mse_attn<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">model_attention<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">evaluate<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainY<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">test_mse_attn<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">model_attention<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">evaluate<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">testX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">testY<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Print error<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;Train set MSE with attention = &#8220;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">train_mse_attn<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;Test set MSE with attention = &#8220;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">test_mse_attn<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p>You\u2019ll see the training progress as output and the following:<\/p>\n<div id=\"urvanov-syntax-highlighter-616fd3fb50cc0019033450\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\nTrain set MSE with attention =  5.3511179430643097e-05<br \/>\nTest set MSE with attention =  9.053358553501312e-06<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>Train set MSE with attention =\u00a0\u00a05.3511179430643097e-05<\/p>\n<p>Test set MSE with attention =\u00a0\u00a09.053358553501312e-06<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p>We can see that even for this simple example, the mean square error on the test set is lower with the attention layer. You can achieve better results with hyper-parameter tuning and model selection. Do try this out on more complex problems and adding more layers to the network. You can also use the <code>scaler<\/code>\u00a0object to scale the numbers back to their original values.<\/p>\n<p>You can take this example one step further by using LSTM instead of SimpleRNN or you can build a network via convolution and pooling layers. You can also change this to an encoder decoder network if you like.<\/p>\n<h2>Consolidated Code<\/h2>\n<p>The entire code for this tutorial is pasted below if you would like to try it. Note that your outputs would be different from the ones given in this tutorial because of the stochastic nature of this algorithm.<\/p>\n<div id=\"urvanov-syntax-highlighter-616fd3fb50cc1754964755\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# Prepare data<br \/>\ndef get_fib_seq(n, scale_data=True):<br \/>\n    # Get the Fibonacci sequence<br \/>\n    seq = np.zeros(n)<br \/>\n    fib_n1 = 0.0<br \/>\n    fib_n = 1.0<br \/>\n    for i in range(n):<br \/>\n            seq[i] = fib_n1 + fib_n<br \/>\n            fib_n1 = fib_n<br \/>\n            fib_n = seq[i]<br \/>\n    scaler = []<br \/>\n    if scale_data:<br \/>\n        scaler = MinMaxScaler(feature_range=(0, 1))<br \/>\n        seq = np.reshape(seq, (n, 1))<br \/>\n        seq = scaler.fit_transform(seq).flatten()<br \/>\n    return seq, scaler<\/p>\n<p>def get_fib_XY(total_fib_numbers, time_steps, train_percent, scale_data=True):<br \/>\n    dat, scaler = get_fib_seq(total_fib_numbers, scale_data)<br \/>\n    Y_ind = np.arange(time_steps, len(dat), 1)<br \/>\n    Y = dat[Y_ind]<br \/>\n    rows_x = len(Y)<br \/>\n    X = dat[0:rows_x]<br \/>\n    for i in range(time_steps-1):<br \/>\n        temp = dat[i+1:rows_x+i+1]<br \/>\n        X = np.column_stack((X, temp))<br \/>\n    # random permutation with fixed seed<br \/>\n    rand = np.random.RandomState(seed=13)<br \/>\n    idx = rand.permutation(rows_x)<br \/>\n    split = int(train_percent*rows_x)<br \/>\n    train_ind = idx[0:split]<br \/>\n    test_ind = idx[split:]<br \/>\n    trainX = X[train_ind]<br \/>\n    trainY = Y[train_ind]<br \/>\n    testX = X[test_ind]<br \/>\n    testY = Y[test_ind]<br \/>\n    trainX = np.reshape(trainX, (len(trainX), time_steps, 1))<br \/>\n    testX = np.reshape(testX, (len(testX), time_steps, 1))<br \/>\n    return trainX, trainY, testX, testY, scaler<\/p>\n<p># Set up parameters<br \/>\ntime_steps = 20<br \/>\nhidden_units = 2<br \/>\nepochs = 30<\/p>\n<p># Create a traditional RNN network<br \/>\ndef create_RNN(hidden_units, dense_units, input_shape, activation):<br \/>\n    model = Sequential()<br \/>\n    model.add(SimpleRNN(hidden_units, input_shape=input_shape, activation=activation[0]))<br \/>\n    model.add(Dense(units=dense_units, activation=activation[1]))<br \/>\n    model.compile(loss=&#8217;mse&#8217;, optimizer=&#8217;adam&#8217;)<br \/>\n    return model<\/p>\n<p>model_RNN = create_RNN(hidden_units=hidden_units, dense_units=1, input_shape=(time_steps,1),<br \/>\n                   activation=[&#8216;tanh&#8217;, &#8216;tanh&#8217;])<\/p>\n<p># Generate the dataset for the network<br \/>\ntrainX, trainY, testX, testY, scaler  = get_fib_XY(1200, time_steps, 0.7)<br \/>\n# Train the network<br \/>\nmodel_RNN.fit(trainX, trainY, epochs=epochs, batch_size=1, verbose=2)<\/p>\n<p># Evalute model<br \/>\ntrain_mse = model_RNN.evaluate(trainX, trainY)<br \/>\ntest_mse = model_RNN.evaluate(testX, testY)<\/p>\n<p># Print error<br \/>\nprint(&#8220;Train set MSE = &#8220;, train_mse)<br \/>\nprint(&#8220;Test set MSE = &#8220;, test_mse)<\/p>\n<p># Add attention layer to the deep learning network<br \/>\nclass attention(Layer):<br \/>\n    def __init__(self,**kwargs):<br \/>\n        super(attention,self).__init__(**kwargs)<\/p>\n<p>    def build(self,input_shape):<br \/>\n        self.W=self.add_weight(name=&#8217;attention_weight&#8217;, shape=(input_shape[-1],1),<br \/>\n                               initializer=&#8217;random_normal&#8217;, trainable=True)<br \/>\n        self.b=self.add_weight(name=&#8217;attention_bias&#8217;, shape=(input_shape[1],1),<br \/>\n                               initializer=&#8217;zeros&#8217;, trainable=True)<br \/>\n        super(attention, self).build(input_shape)<\/p>\n<p>    def call(self,x):<br \/>\n        # Alignment scores. Pass them through tanh function<br \/>\n        e = K.tanh(K.dot(x,self.W)+self.b)<br \/>\n        # Remove dimension of size 1<br \/>\n        e = K.squeeze(e, axis=-1)<br \/>\n        # Compute the weights<br \/>\n        alpha = K.softmax(e)<br \/>\n        # Reshape to tensorFlow format<br \/>\n        alpha = K.expand_dims(alpha, axis=-1)<br \/>\n        # Compute the context vector<br \/>\n        context = x * alpha<br \/>\n        context = K.sum(context, axis=1)<br \/>\n        return context<\/p>\n<p>def create_RNN_with_attention(hidden_units, dense_units, input_shape, activation):<br \/>\n    x=Input(shape=input_shape)<br \/>\n    RNN_layer = SimpleRNN(hidden_units, return_sequences=True, activation=activation)(x)<br \/>\n    attention_layer = attention()(RNN_layer)<br \/>\n    outputs=Dense(dense_units, trainable=True, activation=activation)(attention_layer)<br \/>\n    model=Model(x,outputs)<br \/>\n    model.compile(loss=&#8217;mse&#8217;, optimizer=&#8217;adam&#8217;)<br \/>\n    return model    <\/p>\n<p># Create the model with attention, train and evaluate<br \/>\nmodel_attention = create_RNN_with_attention(hidden_units=hidden_units, dense_units=1,<br \/>\n                                  input_shape=(time_steps,1), activation=&#8217;tanh&#8217;)<br \/>\nmodel_attention.summary()    <\/p>\n<p>model_attention.fit(trainX, trainY, epochs=epochs, batch_size=1, verbose=2)<\/p>\n<p># Evalute model<br \/>\ntrain_mse_attn = model_attention.evaluate(trainX, trainY)<br \/>\ntest_mse_attn = model_attention.evaluate(testX, testY)<\/p>\n<p># Print error<br \/>\nprint(&#8220;Train set MSE with attention = &#8220;, train_mse_attn)<br \/>\nprint(&#8220;Test set MSE with attention = &#8220;, test_mse_attn)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<div class=\"urvanov-syntax-highlighter-nums-content\">\n<p>1<\/p>\n<p>2<\/p>\n<p>3<\/p>\n<p>4<\/p>\n<p>5<\/p>\n<p>6<\/p>\n<p>7<\/p>\n<p>8<\/p>\n<p>9<\/p>\n<p>10<\/p>\n<p>11<\/p>\n<p>12<\/p>\n<p>13<\/p>\n<p>14<\/p>\n<p>15<\/p>\n<p>16<\/p>\n<p>17<\/p>\n<p>18<\/p>\n<p>19<\/p>\n<p>20<\/p>\n<p>21<\/p>\n<p>22<\/p>\n<p>23<\/p>\n<p>24<\/p>\n<p>25<\/p>\n<p>26<\/p>\n<p>27<\/p>\n<p>28<\/p>\n<p>29<\/p>\n<p>30<\/p>\n<p>31<\/p>\n<p>32<\/p>\n<p>33<\/p>\n<p>34<\/p>\n<p>35<\/p>\n<p>36<\/p>\n<p>37<\/p>\n<p>38<\/p>\n<p>39<\/p>\n<p>40<\/p>\n<p>41<\/p>\n<p>42<\/p>\n<p>43<\/p>\n<p>44<\/p>\n<p>45<\/p>\n<p>46<\/p>\n<p>47<\/p>\n<p>48<\/p>\n<p>49<\/p>\n<p>50<\/p>\n<p>51<\/p>\n<p>52<\/p>\n<p>53<\/p>\n<p>54<\/p>\n<p>55<\/p>\n<p>56<\/p>\n<p>57<\/p>\n<p>58<\/p>\n<p>59<\/p>\n<p>60<\/p>\n<p>61<\/p>\n<p>62<\/p>\n<p>63<\/p>\n<p>64<\/p>\n<p>65<\/p>\n<p>66<\/p>\n<p>67<\/p>\n<p>68<\/p>\n<p>69<\/p>\n<p>70<\/p>\n<p>71<\/p>\n<p>72<\/p>\n<p>73<\/p>\n<p>74<\/p>\n<p>75<\/p>\n<p>76<\/p>\n<p>77<\/p>\n<p>78<\/p>\n<p>79<\/p>\n<p>80<\/p>\n<p>81<\/p>\n<p>82<\/p>\n<p>83<\/p>\n<p>84<\/p>\n<p>85<\/p>\n<p>86<\/p>\n<p>87<\/p>\n<p>88<\/p>\n<p>89<\/p>\n<p>90<\/p>\n<p>91<\/p>\n<p>92<\/p>\n<p>93<\/p>\n<p>94<\/p>\n<p>95<\/p>\n<p>96<\/p>\n<p>97<\/p>\n<p>98<\/p>\n<p>99<\/p>\n<p>100<\/p>\n<p>101<\/p>\n<p>102<\/p>\n<p>103<\/p>\n<p>104<\/p>\n<p>105<\/p>\n<p>106<\/p>\n<p>107<\/p>\n<p>108<\/p>\n<p>109<\/p>\n<p>110<\/p>\n<p>111<\/p>\n<p>112<\/p>\n<p>113<\/p>\n<p>114<\/p>\n<p>115<\/p>\n<p>116<\/p>\n<p>117<\/p>\n<p>118<\/p>\n<p>119<\/p>\n<p>120<\/p>\n<p>121<\/p>\n<\/div>\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># Prepare data<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">get_fib_seq<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scale_data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">True<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-p\"># Get the Fibonacci sequence<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">seq<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">zeros<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">fib_n1<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">0.0<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">fib_n<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1.0<\/span><span class=\"crayon-h\"> <\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">i<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">seq<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">i<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">fib_n1<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">+<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">fib_n<\/span><\/p>\n<p><span class=\"crayon-e\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">fib_n1<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">fib_n<\/span><\/p>\n<p><span class=\"crayon-e\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">fib_n<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">seq<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">i<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-h\"> <\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">scaler<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scale_data<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">scaler<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">MinMaxScaler<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">feature_range<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">seq<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">seq<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">seq<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scaler<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit_transform<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">seq<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">flatten<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">seq<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">scaler<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">get_fib_XY<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">total_fib_numbers<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">time_steps<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">train_percent<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scale_data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">True<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">dat<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scaler<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">get_fib_seq<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">total_fib_numbers<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scale_data<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">Y_ind<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">arange<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">time_steps<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">len<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">dat<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">Y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">dat<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">Y_ind<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">rows_x<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">len<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">Y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">dat<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-v\">rows_x<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">i<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">time_steps<\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">temp<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">dat<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">i<\/span><span class=\"crayon-o\">+<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-v\">rows_x<\/span><span class=\"crayon-o\">+<\/span><span class=\"crayon-v\">i<\/span><span class=\"crayon-o\">+<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">column_stack<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">temp<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-p\"># random permutation with fixed seed\u00a0\u00a0 <\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">rand<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">random<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">RandomState<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">seed<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">13<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">idx<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">rand<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">permutation<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">rows_x<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">split<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-t\">int<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-e \">train_percent*<\/span><span class=\"crayon-v\">rows_x<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">train_ind<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">idx<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-v\">split<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">test_ind<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">idx<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">split<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">train_ind<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">trainY<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Y<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">train_ind<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">testX<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">test_ind<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">testY<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Y<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">test_ind<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-e\">len<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">time_steps<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">testX<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">testX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-e\">len<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">testX<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">time_steps<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainY<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">testX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">testY<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">scaler<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Set up parameters<\/span><\/p>\n<p><span class=\"crayon-v\">time_steps<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">20<\/span><\/p>\n<p><span class=\"crayon-v\">hidden_units<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">2<\/span><\/p>\n<p><span class=\"crayon-v\">epochs<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">30<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Create a traditional RNN network<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">create_RNN<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">hidden_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">dense_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">model<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">Sequential<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">model<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">add<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-e\">SimpleRNN<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">hidden_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">model<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">add<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-e\">Dense<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">units<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">dense_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">model<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">compile<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">loss<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;mse&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">optimizer<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;adam&#8217;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">model<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-v\">model_RNN<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">create_RNN<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">hidden_units<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">hidden_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">dense_units<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">time_steps<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 <\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8216;tanh&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8216;tanh&#8217;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Generate the dataset for the network<\/span><\/p>\n<p><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainY<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">testX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">testY<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scaler<\/span><span class=\"crayon-h\">\u00a0\u00a0<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">get_fib_XY<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">1200<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">time_steps<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">0.7<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># Train the network<\/span><\/p>\n<p><span class=\"crayon-v\">model_RNN<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainY<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">epochs<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">epochs<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">batch_size<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">verbose<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">2<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Evalute model<\/span><\/p>\n<p><span class=\"crayon-v\">train_mse<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">model_RNN<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">evaluate<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainY<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">test_mse<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">model_RNN<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">evaluate<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">testX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">testY<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Print error<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;Train set MSE = &#8220;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">train_mse<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;Test set MSE = &#8220;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">test_mse<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Add attention layer to the deep learning network<\/span><\/p>\n<p><span class=\"crayon-t\">class<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">attention<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">Layer<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">__init__<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-o\">*<\/span><span class=\"crayon-o\">*<\/span><span class=\"crayon-v\">kwargs<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-r\">super<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">attention<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">__init__<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-o\">*<\/span><span class=\"crayon-o\">*<\/span><span class=\"crayon-v\">kwargs<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">build<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">W<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">add_weight<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">name<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;attention_weight&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 <\/span><span class=\"crayon-v\">initializer<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;random_normal&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainable<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">True<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">b<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">add_weight<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">name<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;attention_bias&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 <\/span><span class=\"crayon-v\">initializer<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;zeros&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainable<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">True<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-r\">super<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">attention<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">build<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">call<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-p\"># Alignment scores. Pass them through tanh function<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">e<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">K<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">tanh<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">K<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">dot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">W<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">+<\/span><span class=\"crayon-r\">self<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">b<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-p\"># Remove dimension of size 1<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">e<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">K<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">squeeze<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">e<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">axis<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\">\u00a0\u00a0 <\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-p\"># Compute the weights<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">alpha<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">K<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">softmax<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">e<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-p\"># Reshape to tensorFlow format<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">alpha<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">K<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">expand_dims<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">alpha<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">axis<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-p\"># Compute the context vector<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">context<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e \">x *<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">alpha<\/span><\/p>\n<p><span class=\"crayon-e\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">context<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">K<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">sum<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">context<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">axis<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">context<\/span><\/p>\n<p><span class=\"crayon-e\">\u00a0\u00a0\u00a0\u00a0<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">create_RNN_with_attention<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">hidden_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">dense_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-e\">Input<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">RNN_layer<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">SimpleRNN<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">hidden_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">return_sequences<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">True<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">attention_layer<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">attention<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">RNN_layer<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">outputs<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-e\">Dense<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">dense_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainable<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">True<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">attention_layer<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">model<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-e\">Model<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-v\">outputs<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">model<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">compile<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">loss<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;mse&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">optimizer<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;adam&#8217;<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">model<\/span><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Create the model with attention, train and evaluate<\/span><\/p>\n<p><span class=\"crayon-v\">model_attention<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">create_RNN_with_attention<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">hidden_units<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">hidden_units<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">dense_units<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">input_shape<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">time_steps<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">activation<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;tanh&#8217;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">model_attention<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">summary<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-v\">model_attention<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainY<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">epochs<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">epochs<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">batch_size<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">verbose<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">2<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Evalute model<\/span><\/p>\n<p><span class=\"crayon-v\">train_mse_attn<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">model_attention<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">evaluate<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">trainX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">trainY<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">test_mse_attn<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">model_attention<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">evaluate<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">testX<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">testY<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Print error<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;Train set MSE with attention = &#8220;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">train_mse_attn<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;Test set MSE with attention = &#8220;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">test_mse_attn<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<h2 id=\"Further-Reading\">Further Reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h3 id=\"Books\">Books<\/h3>\n<h3 id=\"Papers\">Papers<\/h3>\n<h3 id=\"Articles\">Articles<\/h3>\n<h2 id=\"Summary\u00b6\">Summary<\/h2>\n<p>In this tutorial, you discovered how to add a custom attention layer to a deep learning network using Keras.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>How to override the Keras <code>Layer<\/code> class.<\/li>\n<li>The method <code>build()<\/code> is required to add weights to the attention layer.<\/li>\n<li>The <code>call()<\/code> method is required for specifying the mapping of inputs to outputs of the attention layer.<\/li>\n<li>How to add a custom attention layer to the deep learning network built using SimpleRNN.<\/li>\n<\/ul>\n<p>Do you have any questions about RNNs discussed in this post? Ask your questions in the comments below and I will do my best to answer.<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<\/p><\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/machinelearningmastery.com\/adding-a-custom-attention-layer-to-recurrent-neural-network-in-keras\/<\/p>\n","protected":false},"author":0,"featured_media":1062,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/1061"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=1061"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/1061\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/1062"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=1061"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=1061"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=1061"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}