This topic doesn't really match the theme of this site: exodotes.com is a content site about investing, retiring, studying, and emigrating overseas, while deep learning is one of the most active research and application areas in artificial intelligence today. So why am I writing about deep learning here? My reasoning is this: I currently work in deep learning research, and a project required me to understand the inner workings of LSTMs and GRUs. Online I came across a comment whose explanation was unlike any other I had read, yet could convey the gist even to someone who knows nothing about neural networks, so I decided to collect that comment, along with other readers' questions, and publish it here. If you feel this article doesn't fit the site's theme, think of it this way: if you understood LSTMs and could put them to good use, you could precisely predict everything in the world, including the stock market, the currency markets, the US Powerball jackpot numbers (over US$1 billion), and so on; then you could have whatever you wish for and get away with anything in this world. Isn't it cool, dude? (A form of address Americans are especially fond of.)
Ignoring the sequential information in the images, though, loses important cues; treating each frame in isolation is more like a beginner's deep learning exercise. If you see a beach scene, you should boost beach-related activities in future frames: an image of someone in the water should probably be labeled swimming, not bathing, and an image of someone lying with their eyes closed is probably sunbathing. And if you remember that Bob just arrived at a supermarket, then even without any distinctive supermarket features, an image of Bob holding a slice of bacon should be classified as shopping rather than cooking.
So what you’d like is to let your model track the state of the world:
- After seeing each image, the model outputs a label and also updates the knowledge it’s been learning. For example, the model might learn to automatically discover and track information like location (are scenes currently in a house or beach?), time of day (if a scene contains an image of the moon, the model should remember that it’s nighttime), and within-movie progress (is this image the first frame or the 100th?). Importantly, just as a neural network automatically discovers hidden patterns like edges, shapes, and faces without being fed them, your model should automatically discover useful information by itself.
- When given a new image, the model should incorporate the knowledge it’s gathered to do a better job.
This, then, is a recurrent neural network. Instead of simply taking an image and returning an activity, an RNN also maintains internal memories about the world (weights assigned to different pieces of information) to help perform its classifications.
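To make this concrete, here is a minimal sketch of a single RNN step in NumPy. The names (`rnn_step`, `W_xh`, `W_hh`, `W_hy`) and the sizes are illustrative assumptions for this sketch, not part of the original discussion or any particular library's API:

```python
import numpy as np

# Illustrative sizes (assumptions for this sketch)
input_size, hidden_size, num_labels = 64, 128, 10

rng = np.random.default_rng(0)
W_xh = rng.normal(0, 0.01, (hidden_size, input_size))   # input -> hidden
W_hh = rng.normal(0, 0.01, (hidden_size, hidden_size))  # hidden -> hidden (the "memory")
W_hy = rng.normal(0, 0.01, (num_labels, hidden_size))   # hidden -> activity label scores
b_h = np.zeros(hidden_size)
b_y = np.zeros(num_labels)

def rnn_step(x, h_prev):
    """One RNN step: blend the new image's features x with the
    previous internal memory h_prev, then output label scores."""
    h = np.tanh(W_xh @ x + W_hh @ h_prev + b_h)  # updated memory of the world
    y = W_hy @ h + b_y                           # scores for each activity label
    return y, h

# Process a movie: feed frames one by one, carrying memory forward.
h = np.zeros(hidden_size)                  # no knowledge before the first frame
frames = rng.normal(size=(5, input_size))  # stand-in for per-frame image features
for x in frames:
    scores, h = rnn_step(x, h)
```

Notice that the memory `h` is overwritten wholesale at every frame; the next paragraphs explain why that uncontrolled updating is a problem.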
Let’s think about how your model updates its knowledge of the world. So far, you’ve placed no constraints on this update, so its knowledge can change pretty chaotically: at one frame it thinks the characters are in the US, at the next frame it sees the characters eating sushi and thinks they’re in Japan, and at the next frame it sees polar bears and thinks they’re on Hydra Island. Or perhaps it has a wealth of information to suggest that Alice is an investment analyst, but decides she’s a professional assassin after seeing her cook.
This chaos means information quickly transforms and vanishes, and it’s difficult for the model to keep a long-term memory. So what you’d like is for the network to learn how to update its beliefs (scenes without Bob shouldn’t change Bob-related information, scenes with Alice should focus on gathering details about her), in a way that its knowledge of the world evolves more gently.
This is how you do it.
- Adding a forgetting mechanism. If a scene ends, for example, the model should forget the current scene location and the time of day, and reset any scene-specific information; however, if a character dies in the scene, it should continue remembering that he's no longer alive. Thus, you want the model to learn a separate forgetting/remembering mechanism: when new inputs come in, it needs to know which beliefs to keep or throw away.
- Adding a saving mechanism. When the model sees a new image, it needs to learn whether any information about the image is worth using and saving. Maybe your mom sent you an article about the Kardashians, but who cares?
- So when a new input comes in, the model first forgets any long-term information it decides it no longer needs. Then it learns which parts of the new input are worth using, and saves them into its long-term memory.
- Focusing long-term memory into working memory. Finally, the model needs to learn which parts of its long-term memory are immediately useful. For example, Bob's age may be a useful piece of information to keep in the long term (children are more likely to be crawling, adults are more likely to be working), but is probably irrelevant if he's not in the current scene. So instead of using the full long-term memory all the time, it learns which parts to focus on instead. (In standard LSTM terms, these three mechanisms are the forget, input, and output gates; see the equations sketched right after this list.)
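In the standard LSTM notation (this formulation isn't in the original text, but it is the common one), the three mechanisms above correspond to the forget gate $f_t$, the input gate $i_t$, and the output gate $o_t$, acting on a long-term memory $C_t$ and a working memory $h_t$:

$$
\begin{aligned}
f_t &= \sigma\!\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{forgetting: what to keep or throw away} \\
i_t &= \sigma\!\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{saving: which new information to store} \\
\tilde{C}_t &= \tanh\!\left(W_C\,[h_{t-1}, x_t] + b_C\right) && \text{candidate information from the new input} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{updated long-term memory} \\
o_t &= \sigma\!\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{focusing: which memories to use now} \\
h_t &= o_t \odot \tanh(C_t) && \text{working memory for the current prediction}
\end{aligned}
$$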
And now you have a long short-term memory network. Whereas an RNN can overwrite its memory at each time step in a fairly uncontrolled fashion, an LSTM transforms its memory in a very precise way: by using specific learning mechanisms that help it keep track of information over longer periods of time.
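Here is the same idea as a minimal NumPy sketch, in the style of the RNN step above; again, all names, sizes, and initialization details are illustrative assumptions:

```python
import numpy as np

hidden_size, input_size = 128, 64
rng = np.random.default_rng(1)

def small_random(shape):
    # Small random weights; initialization details are an assumption here.
    return rng.normal(0, 0.01, shape)

# One weight matrix and bias per gate, acting on [h_prev, x] concatenated.
W_f, b_f = small_random((hidden_size, hidden_size + input_size)), np.zeros(hidden_size)  # forget
W_i, b_i = small_random((hidden_size, hidden_size + input_size)), np.zeros(hidden_size)  # save
W_C, b_C = small_random((hidden_size, hidden_size + input_size)), np.zeros(hidden_size)  # candidate
W_o, b_o = small_random((hidden_size, hidden_size + input_size)), np.zeros(hidden_size)  # focus

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, C_prev):
    """One LSTM step: gate what to forget, what to save, and what
    to focus on, instead of overwriting memory wholesale."""
    hx = np.concatenate([h_prev, x])
    f = sigmoid(W_f @ hx + b_f)        # forgetting mechanism: keep (≈1) or drop (≈0)
    i = sigmoid(W_i @ hx + b_i)        # saving mechanism: which new info is worth storing
    C_tilde = np.tanh(W_C @ hx + b_C)  # candidate information from the new input
    C = f * C_prev + i * C_tilde       # long-term memory evolves gently, not chaotically
    o = sigmoid(W_o @ hx + b_o)        # focusing mechanism: which memories matter right now
    h = o * np.tanh(C)                 # working memory used for the current prediction
    return h, C

h = np.zeros(hidden_size)
C = np.zeros(hidden_size)
for x in rng.normal(size=(5, input_size)):
    h, C = lstm_step(x, h, C)
```

Compare this with the RNN step earlier: the only structural change is that the memory update is now additive and gated rather than a full overwrite, which is what lets a piece of information (Bob's age, a character's death) survive across many frames.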