Deep Learning with Pytorch – Sequence Modeling – LSTMs – 3.2

Posted on June 20, 2019 by Aritra Sen

In the previous article on RNNs, we walked through the RNN architecture. RNNs struggle when we need to maintain long-term dependencies, because the earlier layers of an RNN suffer from the vanishing gradient problem.
This problem is largely solved by the LSTM (Long Short-Term Memory) architecture, which introduces the gates listed below:

1. Learn Gate.
2. Forget Gate.
3. Remember Gate.
4. Use Gate.

Using these gates, the LSTM learns from and forgets parts of the short-term and long-term memory, and outputs a new long-term and short-term memory, as shown below –

[Figure: LSTM with different gates]

Now let’s go through each of these gates and the mathematical equations behind these gates.

STM – Short-Term Memory
LTM – Long-Term Memory

Learn Gate:

Inputs – Short-Term Memory & Event
Output – N_t * i_t

Step 1:
It combines the STM and the event, multiplies the result by a weight matrix, adds a bias, and squashes it through a tanh. This step generates new candidate information N_t from the STM and the event.
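The equation image from the original post is missing here; from the description, Step 1 can be reconstructed as follows (with W_n and b_n the learn gate's weight matrix and bias, E_t the current event, and [STM_{t-1}, E_t] their concatenation):

N_t = \tanh(W_n [STM_{t-1}, E_t] + b_n)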

Step 2:
It then ignores parts of this new information by multiplying it with the ignore factor i_t, computed by the equation below. The sigmoid activation squashes the combination of the STM and the event to a value between 0 and 1: one means keep the required information, zero means ignore the unnecessary information.
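Reconstructing the missing equation from the description, with W_i and b_i as the ignore term's own weight matrix and bias:

i_t = \sigma(W_i [STM_{t-1}, E_t] + b_i)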

Final output of the Learn gate:
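The original equation image is missing; element-wise multiplying (\odot) the new information by the ignore factor gives:

N_t \odot i_t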

Forget Gate:

Input: LTM
Output: LTM * f_t

This gate works mainly on the long-term memory: it decides what should be forgotten from the long-term memory, using the short-term memory information.

Step 1:
First, a forget factor f_t is calculated as shown below. The factor is computed from the previous short-term memory and the current event. A sigmoid activation is again used to squash the value between 0 and 1.
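A reconstruction of the missing equation, with W_f and b_f as the forget gate's weight matrix and bias:

f_t = \sigma(W_f [STM_{t-1}, E_t] + b_f)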

Step 2:
The forget factor is then element-wise multiplied with the long-term memory to throw away some of the long-term information, as shown below –
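That is, with LTM_{t-1} denoting the previous long-term memory:

LTM_{t-1} \odot f_t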

Remember Gate:

Inputs: Outputs of the Learn Gate and Forget Gate
Output: New Long Term Memory

It simply adds the outputs of the Learn gate and the Forget gate and produces the new long-term memory as output, shown below (LTM_t is the new long-term memory) –
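In equation form, combining the two gate outputs derived above:

LTM_t = LTM_{t-1} \odot f_t + N_t \odot i_t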

Use Gate (output gate):

Inputs: Output of the Forget gate, STM & Event
Output: New short-term memory

Step 1:
It applies a tanh to the output of the Forget gate, as shown below –
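Taking the description literally (the original equation image is missing), Step 1 applies only a tanh to the Forget gate's output; note that some renditions of this formulation also put a learned weight matrix inside the tanh:

U_t = \tanh(LTM_{t-1} \odot f_t)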

Step 2:
It applies a small neural network with a sigmoid activation function to the short-term memory and the event, as shown below –
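A reconstruction of the missing equation, with W_v and b_v as this small network's weight matrix and bias:

V_t = \sigma(W_v [STM_{t-1}, E_t] + b_v)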

Step 3:
The Use gate then multiplies the outputs of Step 1 and Step 2 to produce the output, i.e. the new short-term memory, as shown below –
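In equation form:

STM_t = U_t \odot V_t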

Putting it all together, it looks like below –
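The full diagram image is missing here, so as a substitute, below is a minimal PyTorch sketch of one time step of an LSTM cell organized as the four gates above. This is an illustration of this post's equations, not PyTorch's built-in nn.LSTMCell (which uses an equivalent but differently organized parameterization), and all layer names here are made up for the sketch:

import torch
import torch.nn as nn

class GateLSTMCell(nn.Module):
    # One time step of an LSTM, written as the four gates described above.
    def __init__(self, input_size, hidden_size):
        super().__init__()
        combined = input_size + hidden_size             # size of [STM_{t-1}, E_t]
        self.learn = nn.Linear(combined, hidden_size)   # W_n, b_n
        self.ignore = nn.Linear(combined, hidden_size)  # W_i, b_i
        self.forget = nn.Linear(combined, hidden_size)  # W_f, b_f
        self.use = nn.Linear(combined, hidden_size)     # W_v, b_v

    def forward(self, event, stm, ltm):
        combined = torch.cat([stm, event], dim=1)       # [STM_{t-1}, E_t]
        # Learn gate: new information N_t, scaled by the ignore factor i_t
        n_t = torch.tanh(self.learn(combined))
        i_t = torch.sigmoid(self.ignore(combined))
        # Forget gate: forget factor f_t applied to the old long-term memory
        f_t = torch.sigmoid(self.forget(combined))
        # Remember gate: new long-term memory LTM_t
        new_ltm = ltm * f_t + n_t * i_t
        # Use gate: new short-term memory STM_t (the cell's output)
        u_t = torch.tanh(ltm * f_t)
        v_t = torch.sigmoid(self.use(combined))
        new_stm = u_t * v_t
        return new_stm, new_ltm

# Usage sketch: batch of 4 events, input size 10, hidden size 20
cell = GateLSTMCell(10, 20)
event = torch.randn(4, 10)
stm, ltm = torch.zeros(4, 20), torch.zeros(4, 20)
stm, ltm = cell(event, stm, ltm)

Note that the standard LSTM formulation (and PyTorch's nn.LSTMCell) applies the Use gate's tanh to the new long-term memory LTM_t rather than to the Forget gate's output; the sketch follows the description in this post.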

Image Credits: Udacity Deep Learning LSTM video tutorial created by Luis Serrano.

Do like, share and comment if you have any questions.
