For learning neural networks and deep learning, I strongly recommend this book. It takes a high-level view, analyzes many topics in real depth, and is far ahead of comparable books. Highly recommended.
Most books offer no deep analysis and no coherent knowledge system; their material is scattered and their chapters stand in isolation. And to those who fawn over so-called authorities: wake up.
https://udlbook.github.io/udlbook/
2 Supervised learning
- 2.1 Supervised learning overview
- 2.2 Linear regression example
- 2.3 Summary
3 Shallow neural networks
- 3.1 Neural network example
- 3.2 Universal approximation theorem
- 3.3 Multivariate inputs and outputs
- 3.4 Shallow neural networks: general case
- 3.5 Terminology
- 3.6 Summary
4 Deep neural networks
- 4.1 Composing neural networks
- 4.2 From composing networks to deep networks
- 4.3 Deep neural networks
- 4.4 Matrix notation
- 4.5 Shallow vs. deep neural networks
- 4.6 Summary
5 Loss functions
- 5.1 Maximum likelihood
- 5.2 Recipe for constructing loss functions
- 5.3 Example 1: univariate regression
- 5.4 Example 2: binary classification
- 5.5 Example 3: multiclass classification
- 5.6 Multiple outputs
- 5.7 Cross-entropy loss
- 5.8 Summary
6 Fitting models
- 6.1 Gradient descent
- 6.2 Stochastic gradient descent
- 6.3 Momentum
- 6.4 Adam
- 6.5 Training algorithm hyperparameters
- 6.6 Summary
7 Gradients and initialization
- 7.1 Problem definitions
- 7.2 Computing derivatives
- 7.3 Toy example
- 7.4 Backpropagation algorithm
- 7.5 Parameter initialization
- 7.6 Example training code
- 7.7 Summary
8 Measuring performance
- 8.1 Training a simple model
- 8.2 Sources of error
- 8.3 Reducing error
- 8.4 Double descent
- 8.5 Choosing hyperparameters
- 8.6 Summary
9 Regularization
- 9.1 Explicit regularization
- 9.2 Implicit regularization
- 9.3 Heuristics to improve performance
- 9.4 Summary
10 Convolutional networks
- 10.1 Invariance and equivariance
- 10.2 Convolutional networks for 1D inputs
- 10.3 Convolutional networks for 2D inputs
- 10.4 Downsampling and upsampling
- 10.5 Applications
- 10.6 Summary
11 Residual networks
- 11.1 Sequential processing
- 11.2 Residual connections and residual blocks
- 11.3 Exploding gradients in residual networks
- 11.4 Batch normalization
- 11.5 Common residual architectures
- 11.6 Why do nets with residual connections perform so well?
- 11.7 Summary
12 Transformers
- 12.1 Processing text data
- 12.2 Dot-product self-attention
- 12.3 Extensions to dot-product self-attention
- 12.4 Transformer layers
- 12.5 Transformers for natural language processing
- 12.6 Encoder model example: BERT
- 12.7 Decoder model example: GPT3
- 12.8 Encoder-decoder example: machine translation
- 12.9 Transformers for long sequences
- 12.10 Transformers for images
- 12.11 Summary
13 Graph neural networks
- 13.1 What is a graph?
- 13.2 Graph representation
- 13.3 Graph neural networks, tasks, and loss functions
- 13.4 Graph convolutional networks
- 13.5 Example: graph classification
- 13.6 Inductive vs. transductive models
- 13.7 Example: node classification
- 13.8 Layers for graph convolutional networks
- 13.9 Edge graphs
- 13.10 Summary
14 Unsupervised learning
- 14.1 Taxonomy of unsupervised learning models
- 14.2 What makes a good generative model?
- 14.3 Quantifying performance
- 14.4 Summary
15 Generative adversarial networks
- 15.1 Discrimination as a signal
- 15.2 Improving stability
- 15.3 Progressive growing, minibatch discrimination, and truncation
- 15.4 Conditional generation
- 15.5 Image translation
- 15.6 StyleGAN
- 15.7 Summary
16 Normalizing flows
- 16.1 1D example
- 16.2 General example
- 16.3 Invertible network layers
- 16.4 Multi-scale flows
- 16.5 Applications
- 16.6 Summary
17 Variational autoencoders
- 17.1 Latent variable models
- 17.2 Nonlinear latent variable model
- 17.3 Training
- 17.4 ELBO properties
- 17.5 Variational approximation
- 17.6 The variational autoencoder
- 17.7 The reparameterization trick
- 17.8 Applications
- 17.9 Summary
18 Diffusion models
- 18.1 Overview
- 18.2 Encoder (forward process)
- 18.3 Decoder model (reverse process)
- 18.4 Training
- 18.5 Reparameterization of loss function
- 18.6 Implementation
- 18.7 Summary
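To give a taste of how hands-on the material is: the fitting-models chapter opens with plain gradient descent (section 6.1). The sketch below is purely illustrative and is not code from the book; it minimizes a toy one-dimensional loss f(w) = (w - 3)², whose gradient is 2(w - 3), by repeatedly stepping opposite the gradient.

```python
def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)**2."""
    return 2.0 * (w - 3.0)

def gradient_descent(w0, lr=0.1, steps=100):
    """Start at w0 and take `steps` updates of size lr against the gradient."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

w_star = gradient_descent(w0=0.0)
print(w_star)  # approaches the minimum at w = 3
```

The book builds from this starting point to stochastic gradient descent, momentum, and Adam in the same chapter.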