[Hands-on BERT series] Residual connections and residual blocks (add & norm) in BERT

Code: https://github.com/chunhuizhang/bilibili_vlogs/blob/master/fine_tune/bert/tutorials/07_add_norm_residual_conn.ipynb
Hands-on BERT series: https://space.bilibili.com/59807853/channel/collectiondetail?sid=496538
PyTorch series: https://space.bilibili.com/59807853/channel/collectiondetail?sid=446911
