[Paper Brief Analysis] PolyFormer: Referring Image Seg. as Sequential Polygon Gen [2302.07387]
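Paper title: PolyFormer: Referring Image Segmentation as Sequential Polygon Generation
Paper URL: http://arxiv.org/abs/2302.07387
Project page: https://polyformer.github.io/
* Owing to the uploader's limitations, the video often mixes Chinese and English and slips into broken English; please bear with it. If my reading of the paper is off, feel free to call it out.
** For new paper recommendations or to look up past papers, feel free to edit this document: https://docs.qq.com/sheet/DSUdOTG9xWUdydVB6
*** Slides are uploaded every 1-2 months to the link in the pinned post.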
[Paper Brief Analysis] Location-Aware Self-Supervised Transformers for Semantic Seg. [2212.02400]
[Paper Quick Look] Autoregressive Image Generation using Residual Quantization [2203.01941]
[Paper Quick Look] A Simple LLM Framework for Long-Range Video Question-Answering [2312.17235]
[Paper Quick Look] NeRF-RL: Reinforcement Learning with Neural Radiance Fields [2206.01634]
[Paper Brief Analysis] Patching Open-Vocabulary Models by Interpolating Weights [2208.05592]
[Paper Brief Analysis] NeRF in the Wild: NeRF for Unconstrained Photo Collections [2008.02268]
[Paper Brief Analysis] NeRF: Representing Scenes as Neural Radiance Fields... [2003.08934]
[Paper Brief Analysis] DiffSeg: Unsupervised Zero-Shot Seg. using Stable Diffusion [2308.12469]
[Paper Quick Look] RetNet: A Successor to Transformer for Large Language Models [2307.08621]
[Paper Quick Look] Diffusion Policy: Visuomotor Policy Learning via Action Diff. [2303.04137]
[Paper Brief Analysis] DAT: Vision Transformer with Deformable Attention [2201.00520]
[Paper Quick Look] Theia: Distilling Diverse Vision Foundation Models for Robot.. [2407.20179]
[Paper Brief Analysis] TokenLearner: What Can 8 Learned Tokens Do for Images and vids [2106.11297]
[Paper Brief Analysis] XCiT: Cross-Covariance Image Transformers [2106.09681]
[Paper Brief Analysis] XSkill: Cross Embodiment Skill Discovery [2307.09955]
[Paper Brief Analysis] TAN: Temporal Alignment Networks for Long-term Video [2204.02968]
[Paper Brief Analysis] Point Transformer V2 [2210.05666]
[Paper Quick Look] CRG: Improving Grounding in VLM w/o training [2403.02325]
[Paper Quick Look] Open Vocab. Semantic Seg. with Patch Aligned Contrastive... [2212.04994]
[Paper Brief Analysis] Improving fine-grained understanding in image-text pre-training [2401.0986]
[Paper Quick Look] LLaRA: Supercharging Robot Learning Data for VLM Policy [2406.20095]
[Paper Quick Look] Bootstrapping Language-Image Pre-training... [2201.12086]
[Paper Brief Analysis] RetinaGAN: An Object-aware Approach to Sim-to-Real Transfer [2011.03148]
[Paper Brief Analysis] TransRank: SS Video...Ranking-based Transformation Recognition [2205.02028]
[Paper Brief Analysis] A Laplacian Pyramid Translation Network [2105.09188]
[Paper Quick Look] RegMixup: Mixup as a Regularizer Can Surprisingly Improve... [2206.14502]
[Paper Quick Look] Personalizing Text2Img Generation using Textual Inversion [2208.01618]
[Paper Brief Analysis] DeiT: Data-efficient Image Transformers [2012.12877]
[Paper Quick Look] Flamingo: a Visual Language Model for Few-Shot Learning [2204.14198]
[Paper Brief Analysis] CLIP Dense Inference Yields Open-Vocab ... For-Free [2309.14289]
[Paper Brief Analysis] Deconstructing Denoising Diffusion Models for SSL [2401.14404]
[Paper Quick Look] Open-vocabulary Object Segmentation with Diffusion Models [2301.05221]
[Paper Brief Analysis] End-to-End Learning... from Uncurated Instructional Videos [1912.06430]
[Paper Quick Look] Decision Transformer: RL via Sequence Modeling [2106.01345]
[Paper Quick Look] Generative Modeling by Estimating Gradients of the Data Dist [1907.05600]
[Paper Revisited] Diffusion Models for Robotics
[Paper Quick Look] Denoising Diffusion Implicit Models / DDIM [2010.02502]
[Paper Quick Look] Taming Transformers for High-Resolution Image Synthesis [2012.09841]
[Paper Brief Analysis] Transf. Meta-learners for Implicit Neural Representations [2208.02801]