[SIGGRAPH 2023] GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents


GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents
Tenglong Ao, Zeyi Zhang, Libin Liu
SIGGRAPH 2023 (Journal Track)

Project Page: https://pku-mocca.github.io/GestureDiffuCLIP-Page/

Abstract:
The automatic generation of stylized co-speech gestures has recently received increasing attention. Previous systems typically allow style control via predefined text labels or example motion clips, which are often not flexible enough to convey user intent accurately. In this work, we present GestureDiffuCLIP, a neural network framework for synthesizing realistic, stylized co-speech gestures with flexible style control. We leverage the power of the large-scale Contrastive-Language-Image-Pre-training (CLIP) model and present a novel CLIP-guided mechanism that extracts efficient style representations from multiple input modalities, such as a piece of text, an example motion clip, or a video. Our system learns a latent diffusion model to generate high-quality gestures and infuses the CLIP representations of style into the generator via an adaptive instance normalization (AdaIN) layer. We further devise a gesture-transcript alignment mechanism that ensures a semantically correct gesture generation based on contrastive learning. Our system can also be extended to allow fine-grained style control of individual body parts. We demonstrate an extensive set of examples showing the flexibility and generalizability of our model to a variety of style descriptions. In a user study, we show that our system outperforms the state-of-the-art approaches regarding human likeness, appropriateness, and style correctness.
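The abstract mentions that style representations are infused into the generator via an adaptive instance normalization (AdaIN) layer. The following is a minimal NumPy sketch of the generic AdaIN operation, not the authors' implementation: the feature layout (channels by frames), the linear map `style_to_affine`, and all dimensions and weights are hypothetical and chosen only for illustration.

```python
import numpy as np


def adain(features, scale, shift, eps=1e-5):
    """Generic AdaIN: normalize each channel over time, then re-scale and shift.

    features: array of shape (channels, frames)
    scale, shift: arrays of shape (channels,), derived from a style embedding
    """
    mean = features.mean(axis=1, keepdims=True)
    std = features.std(axis=1, keepdims=True)
    normalized = (features - mean) / (std + eps)
    return scale[:, None] * normalized + shift[:, None]


def style_to_affine(style_embedding, weight, bias):
    """Hypothetical linear map from a style embedding to per-channel (scale, shift)."""
    out = weight @ style_embedding + bias  # shape: (2 * channels,)
    scale, shift = np.split(out, 2)
    return scale, shift


# Illustrative sizes and random values only (assumptions, not from the paper).
rng = np.random.default_rng(0)
channels, frames, style_dim = 8, 64, 16
features = rng.normal(size=(channels, frames))   # stand-in for gesture latents
style = rng.normal(size=style_dim)               # stand-in for a CLIP style embedding
weight = rng.normal(size=(2 * channels, style_dim)) * 0.1
bias = np.concatenate([np.ones(channels), np.zeros(channels)])

scale, shift = style_to_affine(style, weight, bias)
stylized = adain(features, scale, shift)
print(stylized.shape)
```

After this operation, the per-channel mean and standard deviation of the features are set by the style-derived shift and scale, which is how a style embedding can modulate the generator's intermediate features.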
