This blog is not what you think you are going to read, looking at the subject line.
Long short-term memory (LSTM) is a type of neural network that learns order dependence in sequence prediction problems.
It’s a complex process to get your head around because it doesn’t have two dimensions like a roadmap or three dimensions like gaming.
It has thousands of dimensions.
Think of how Google makes its search intuitive to help you find what you are looking for.
Linguistically unrelated words can fall in the same space, so that when you type one word, others appear intuitively to help you refine your search or search a bit more quickly.
“Harvard”, for example, is closely related to “Ivy” and “university”.
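The idea of words "falling in the same space" can be sketched with cosine similarity over word vectors. The four-dimensional vectors below are made up purely for illustration; real embedding models such as word2vec or GloVe learn hundreds of dimensions from text.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means similar direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "embeddings" (hypothetical values, for illustration only).
vectors = {
    "harvard":    np.array([0.9, 0.8, 0.1, 0.0]),
    "ivy":        np.array([0.8, 0.9, 0.2, 0.1]),
    "university": np.array([0.7, 0.7, 0.3, 0.1]),
    "banana":     np.array([0.0, 0.1, 0.9, 0.8]),
}

print(cosine_similarity(vectors["harvard"], vectors["ivy"]))     # high: nearby in the space
print(cosine_similarity(vectors["harvard"], vectors["banana"]))  # low: far apart
```

In a trained model, "Harvard" and "Ivy" end up near each other not because the letters are similar, but because they appear in similar contexts.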
Come to think of it, this is exactly how human beings frame a sentence in real life, even though you might not pay attention to the process because it is deeply ingrained in our subconscious.
Yoshua Bengio, et al., wrote about this in their 1994 paper “Learning Long-Term Dependencies with Gradient Descent is Difficult”.
The paper defines 3 basic requirements of a recurrent neural network:
- That the system be able to store information for an arbitrary duration.
- That the system be resistant to noise (i.e. fluctuations of the inputs that are random or irrelevant to predicting a correct output).
- That the system parameters be trainable (in reasonable time).
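LSTM's gating mechanism is the standard answer to these three requirements. Below is a minimal NumPy sketch of one step of the standard LSTM equations; the weights are random and untrained, purely to show the mechanics, not a real trained model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step (standard formulation).

    The forget gate f decides what to keep in the cell state (storing
    information for an arbitrary duration), the input gate i filters what
    new information enters (resistance to noise), and every operation is
    differentiable (trainable by gradient descent).
    """
    z = W @ np.concatenate([x, h_prev]) + b   # all four gates in one matmul
    H = h_prev.size
    f = sigmoid(z[0:H])        # forget gate
    i = sigmoid(z[H:2*H])      # input gate
    o = sigmoid(z[2*H:3*H])    # output gate
    g = np.tanh(z[3*H:4*H])    # candidate cell update
    c = f * c_prev + i * g     # new cell state
    h = o * np.tanh(c)         # new hidden state
    return h, c

# Tiny demo with random weights (shapes and mechanics only).
rng = np.random.default_rng(0)
X_DIM, H_DIM = 3, 4
W = rng.normal(size=(4 * H_DIM, X_DIM + H_DIM))
b = np.zeros(4 * H_DIM)
h, c = np.zeros(H_DIM), np.zeros(H_DIM)
for x in rng.normal(size=(5, X_DIM)):   # feed a sequence of five inputs
    h, c = lstm_step(x, h, c, W, b)
print(h.shape)  # (4,)
```

The key design choice is the additive cell-state update `c = f * c_prev + i * g`, which lets gradients flow over many time steps far better than repeated matrix multiplication alone.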
Context is key.
Recurrent neural networks must use context when making predictions, but the context required must also be learned.
… recurrent neural networks contain cycles that feed the network activations from a previous time step as inputs to the network to influence predictions at the current time step. These activations are stored in the internal states of the network which can in principle hold long-term temporal contextual information. This mechanism allows RNNs to exploit a dynamically changing contextual window over the input sequence history.
— Haşim Sak, et al., Long Short-Term Memory Recurrent Neural Network Architectures for Large Scale Acoustic Modeling, 2014
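The "cycle" the quote describes can be sketched in a few lines of NumPy: a vanilla RNN whose hidden state from the previous step is fed back in at the current step. Weights are random and untrained, for illustration only.

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, b_h):
    """Vanilla RNN: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b).

    The W_hh @ h term is the cycle: the previous step's activations
    feed back in, so h carries contextual information across the
    whole sequence so far.
    """
    h = np.zeros(W_hh.shape[0])
    states = []
    for x in xs:
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        states.append(h)
    return states

rng = np.random.default_rng(1)
X_DIM, H_DIM = 2, 3
states = rnn_forward(
    rng.normal(size=(4, X_DIM)),       # a sequence of four inputs
    rng.normal(size=(H_DIM, X_DIM)),   # input-to-hidden weights
    rng.normal(size=(H_DIM, H_DIM)),   # hidden-to-hidden weights (the cycle)
    np.zeros(H_DIM),
)
print(len(states), states[-1].shape)  # 4 (3,)
```

In this plain form, context fades quickly as the sequence gets longer; that is exactly the difficulty Bengio et al. identified, and the motivation for the LSTM.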
Context is Key
This, and the statement that follows from it, that context needs to be learned too, are very powerful statements.
For a lot of us, getting information is what matters.
Yes, information is important; it is the starting point of how you will think, react, create, design, and solve a situation.
However, the latter part (thinking, reacting, creating, designing) requires context, and context comes not from linear, single-, two- or even three-dimensional thinking but from the ability to bring multiple dimensions into the picture.
Making “Complex simple”
Google made search simple.
Apple made its phone intuitive and interface simpler.
Warren Buffett made investing boring and simple.
Look behind any product, process, or philosophy that has received a positive audience reception or outcome over time, and you will realise that it is the simplest that gained the most ground.
However, making things simple and intuitive takes complex steps, and that is as true for Apple products and Google search as it is for investing.
Understanding the context is not simple.
Whether it is the narrative, or numbers, business case or sentiments, everything comes together.
Nothing can be achieved by looking at one element and taking a call.
Think about a company listing in its IPO at a P/E of 1600 (NSE: NYKAA). Looking at that one number alone might make you wonder what is happening here, until you look at multiple parameters to arrive at a reasonable picture of the future and decide whether the price makes sense or is actually wacky.
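The metric itself is simple arithmetic: P/E is the share price divided by earnings per share. The figures below are hypothetical, purely to show why the single number can mislead without context.

```python
def pe_ratio(share_price, earnings_per_share):
    """Price-to-earnings ratio: price divided by earnings per share."""
    return share_price / earnings_per_share

# A young, barely profitable company can post an eye-watering P/E:
print(pe_ratio(1600.0, 1.0))   # 1600.0
# The same share price with mature earnings looks ordinary:
print(pe_ratio(1600.0, 80.0))  # 20.0
```

The number on its own says nothing about growth, margins, or the business case; those are the other dimensions you have to bring into the picture.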
The beauty of understanding neural networks, or long short-term memory, is that it is no different from how we actually think.
The difficult part is realising that this is how you think, and then putting that realisation to work to understand something new.
It is like driving: initially you keep looking all around you and drive at a slower speed because you are not sure of your skills.
Over time, however, it becomes intuitive.
Like driving, you will need to learn the whole process and practise until it becomes intuitive.
There is no short-cut.