Learning Spark
We’ll look at how the training of deep learning models can be significantly accelerated with distributed computing on GPUs, discuss some of the challenges, and examine current research on the topic. The original post is here; I am re-copying it because its equations and mine both fail to render properly in the browser because…
In this post we will learn the foundations behind sequence to sequence models and how neural networks can be used to build powerful models capable of analyzing data that varies over time.
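In this part we’ll give a brief overview of BPTT and explain how it differs from traditional backpropagation. We will then try to understand the vanishing gradient problem, which has led to the development of LSTMs and GRUs, two of the most popular and powerful models currently used in NLP (and other areas). My reference/learning materials
Not started. Goal: write a source-code analysis of the MXNet version of DCGAN.
Some of the screenshots in this article come from “Learn Deep Learning in One Day” (一天学会深度学习), a 301-page slide deck that explains the main ideas of deep learning in an accessible way (I personally recommend it highly to beginners). There is also a lot of shared experience on Zhihu.
C/C++ study notes.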
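To make the vanishing gradient problem mentioned above a little more concrete, here is a minimal sketch rather than the referenced post's own code. It assumes a hypothetical scalar RNN `h_t = tanh(w * h_{t-1} + x_t)`; the function name `bptt_gradient_norms` and the chosen values of `w` and the inputs are illustrative assumptions. BPTT multiplies one local derivative per time step, so when each factor is smaller than 1 in magnitude the gradient with respect to early states shrinks roughly geometrically.

```python
import math


def bptt_gradient_norms(w, inputs, h0=0.0):
    """Return |d h_T / d h_k| for k = T-1, T-2, ..., 0 in a scalar tanh RNN.

    Hypothetical toy model (an assumption for illustration):
        h_t = tanh(w * h_{t-1} + x_t)
    """
    # Forward pass: store every hidden state so the backward pass can reuse it.
    hs = [h0]
    for x in inputs:
        hs.append(math.tanh(w * hs[-1] + x))

    # Backward pass through time: the chain rule multiplies the local
    # derivatives d h_t / d h_{t-1} = w * (1 - h_t^2), one per time step.
    norms = []
    g = 1.0
    for t in range(len(inputs), 0, -1):
        g *= w * (1.0 - hs[t] ** 2)
        norms.append(abs(g))
    return norms


# With |w| < 1 every factor is below 1 in magnitude, so the product shrinks
# as we go further back in time.
norms = bptt_gradient_norms(w=0.5, inputs=[0.1] * 20)
for steps_back, n in enumerate(norms, start=1):
    print(f"{steps_back:2d} steps back: {n:.3e}")
```

Gated architectures such as LSTMs and GRUs are designed to ease this shrinkage by giving the gradient a more direct path through time.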
internals = model.symbol.get_internals()  # list all internal symbols (assumes `model` is an already-loaded MXNet model)
>>> import mxnet as mx
>>> net = mx.symbol.Variable('data')
>>> net = mx.symbol.FullyConnected(data=net, name='fc1', num_hidden=10)
>>> arg_shape, out_shape, aux_shape = net.infer_shape(data=(100, 100))
>>> dict(zip(net.list_arguments(), arg_shape))
{'data': (100, 100), 'fc1_weight': (10, 100), 'fc1_bias': (10,)}
>>> out_shape
[(100, 10)]
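The shapes reported by `infer_shape` above follow a simple rule, which the sketch below reproduces in plain Python without MXNet. The helper `fully_connected_shapes` is a hypothetical name introduced here for illustration, and it assumes a 2-D input of shape `(batch, in_dim)`; it is not part of the MXNet API.

```python
def fully_connected_shapes(data_shape, num_hidden, name="fc1"):
    """Mimic the shapes of a fully connected layer for a 2-D input.

    Hypothetical helper (not an MXNet function). Assumes data_shape is
    (batch, in_dim) and the layer computes data @ weight.T + bias.
    """
    batch, in_dim = data_shape
    arg_shapes = {
        "data": (batch, in_dim),
        # One row of weights per hidden unit, one column per input feature.
        f"{name}_weight": (num_hidden, in_dim),
        # One bias value per hidden unit.
        f"{name}_bias": (num_hidden,),
    }
    # Each of the `batch` rows is mapped to `num_hidden` outputs.
    out_shapes = [(batch, num_hidden)]
    return arg_shapes, out_shapes


arg_shapes, out_shapes = fully_connected_shapes((100, 100), num_hidden=10)
print(arg_shapes)
print(out_shapes)
```

For the `(100, 100)` input and `num_hidden=10` used in the session above, this gives the same weight, bias, and output shapes that MXNet reports.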
Visualizing what ConvNets learn.
ConvNet architectures make the explicit assumption that the inputs are images, which allows us to encode certain properties into the architecture.
The code for this work is on GitHub. The way we chose the hyperparameters follows the CS231n class.