Deep learning is about learning features; it is the only tool we know of that generates sensible features. The classifiers we use on top are actually pretty simple. The features are the power.
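A rough sketch of this point: if the features are good, a plain linear classifier on top is usually enough (a "linear probe"). The feature arrays here are made-up stand-ins for activations of a pretrained network; the shapes and data are just for illustration.

```python
# Sketch: with good features, a very simple classifier suffices (linear probe).
# The "features" below are stand-ins for activations from a trained network;
# shapes and the synthetic data are assumptions for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Pretend these came from the penultimate layer of a trained network.
n_samples, n_features = 1000, 512
features = rng.normal(size=(n_samples, n_features))

# Labels that are (noisily) linearly separable in feature space --
# roughly what good learned features give you.
w_true = rng.normal(size=n_features)
labels = (features @ w_true + 0.5 * rng.normal(size=n_samples) > 0).astype(int)

# The classifier itself is trivial: a linear model.
clf = LogisticRegression(max_iter=1000).fit(features, labels)
print("train accuracy:", clf.score(features, labels))
```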

In a way, deep learning is a brute-force approach to learning: it is not clever, it just keeps going down the hill (gradient descent). That works fine, because the features it finds really are good. But other sparse representations could also be used; JPEG compression, for example, is quite a good sparse representation of features.
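The "going down the hill" is just gradient descent. A toy sketch; the quadratic loss, starting point, and step size are arbitrary illustration choices:

```python
# Sketch of "going down the hill": plain gradient descent on a toy loss.
import numpy as np

def loss(w):
    return np.sum((w - 3.0) ** 2)        # minimum at w = 3

def grad(w):
    return 2.0 * (w - 3.0)               # analytic gradient of the loss

w = np.array([10.0, -4.0])               # arbitrary starting point
lr = 0.1                                 # step size
for step in range(100):
    w -= lr * grad(w)                    # step downhill along the negative gradient

print("final w:", w, "loss:", loss(w))   # w ends up near [3, 3]
```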

The human brain has on the order of 100 billion neurons, with the cortex organized into six layers. Also, its neurons are spiking and not exact.

What we want to skip in DL is erroneous human explanation: humans suck at transferring knowledge. We may, however, have good domain knowledge we want to put into the models, and that is the place for hybrid modelling. But at perception and at defining what a word means, we suck. Therefore, classifying whether something falls under a given word works well with neural networks; logic and rule application may work too, but there are better tools for that.

Batch normalization is essentially division by an L2-type quantity (the batch standard deviation). You may substitute different norms there; we just want to bound the values to some range, so cheaper calculations can be used.
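A minimal sketch of that idea in numpy: center each feature over the batch and divide by some per-feature scale, where the scale can be the usual standard deviation or something cheaper. The function name and the `kind` switch are my own labels, not a reference implementation, and the learnable scale/shift parameters of real batch norm are omitted.

```python
# Sketch: normalize each feature over the batch; the divisor can be the std
# (L2-based, as in classic batch norm) or a cheaper bound. Illustrative only.
import numpy as np

def normalize_batch(x, kind="std", eps=1e-5):
    """x: (batch, features). Center and rescale each feature over the batch."""
    centered = x - x.mean(axis=0)
    if kind == "std":                    # classic batch norm: divide by std
        scale = centered.std(axis=0)
    elif kind == "l1":                   # cheaper: mean absolute deviation
        scale = np.abs(centered).mean(axis=0)
    elif kind == "max":                  # even cheaper: max absolute value
        scale = np.abs(centered).max(axis=0)
    else:
        raise ValueError(kind)
    return centered / (scale + eps)

x = np.random.default_rng(0).normal(loc=5.0, scale=3.0, size=(32, 4))
for kind in ("std", "l1", "max"):
    y = normalize_batch(x, kind)
    print(kind, "range:", float(y.min()), float(y.max()))
```

Whatever the norm, the effect is the same: the activations end up bounded in a sensible range, which is all we really need here.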

It took Geoffrey Hinton roughly 25 years before his work finally paid off.

"We are like half awake." There are people who kill it. I was always intimidated by that. But as the common rule says, you can spend 20% of the effort to get 80% of those people's results if you pick the right bits. And maybe that is not so awful. Generally, as long as you don't think and just act automatically, you devolve. Humans are not general intelligences when they don't consciously think. So to think, a person has to be conscious. And that is a hard feat in this world.

Questions for today: