They are different. Very deep nets suffer from the exploding/vanishing gradient problem. The authors of the ResNet paper observed that simply stacking many convolutional and dense layers did not improve learning, even though they used ReLU activations and batch normalization. They introduced the skip connection, which lets the network learn whether the input to a given layer should be preserved as-is or transformed by that layer. This allowed them to increase the number of layers without worrying about vanishing/exploding gradients. That is the original idea behind residual nets. The paper applies this concept to spatial data, but recently I've seen people debating its use in temporal cases too, i.e. time series data.
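Here is a minimal sketch of that idea, assuming PyTorch (the layer sizes and channel counts are arbitrary, not from the ResNet paper): the block adds its unchanged input back to the transformed output, so it only has to learn the residual.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """A minimal residual block: output = ReLU(F(x) + x)."""
    def __init__(self, channels):
        super().__init__()
        # Two conv layers form the residual function F(x)
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # Skip connection: add the unchanged input back before the activation,
        # so the block can fall back to passing x through untouched
        return self.relu(out + x)

# Example: a batch of 8 feature maps, 64 channels, 32x32 (arbitrary sizes)
block = ResidualBlock(64)
x = torch.randn(8, 64, 32, 32)
y = block(x)
print(y.shape)  # torch.Size([8, 64, 32, 32])
```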

Recurrent nets are used in temporal domains; tasks like sequence classification are typical examples of their usage. In this domain the net needs to carry information from previously seen data. The best-known examples of these nets are LSTMs. Early recurrent nets had the vanishing/exploding gradient problem too, but over the years LSTMs became popular among deep-learning practitioners. They introduced the concept of gates, which learn when to forget and when to keep the previous data.
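A minimal sketch of such a sequence classifier, again assuming PyTorch (the class name, feature sizes, and number of classes below are hypothetical): the LSTM's gates decide what to carry forward across time steps, and the final hidden state is used for classification.

```python
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    """A minimal LSTM-based sequence classifier (hypothetical sizes)."""
    def __init__(self, input_size, hidden_size, num_classes):
        super().__init__()
        # The LSTM's input, forget and output gates learn what to keep
        # from previous time steps and what to discard.
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):            # x: (batch, seq_len, input_size)
        _, (h_n, _) = self.lstm(x)   # h_n: final hidden state per layer
        return self.fc(h_n[-1])      # classify from the last hidden state

# Example: classify 16 sequences of length 50 with 10 features each
model = LSTMClassifier(input_size=10, hidden_size=32, num_classes=3)
x = torch.randn(16, 50, 10)
logits = model(x)
print(logits.shape)  # torch.Size([16, 3])
```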
