Orca 2: Teaching Small Language Models How to Reason
【Join the Group】 Come browse arXiv with us — add WeChat: pwbot02 (please note: b站arxiv)
【Easter Egg】 Try /ask + your question to discuss this paper
【Paper Title】 Orca 2: Teaching Small Language Models How to Reason
【Paper Summary】 The paper examines how to improve the training signals used for small language models in order to enhance their reasoning abilities. Prior work on small models has typically relied on imitation learning, aiming to replicate the outputs of more capable models. However, an excessive emphasis on imitation may limit a small model's potential. This work instead teaches small models to adopt different solution strategies for different tasks, strategies that may differ from those used by larger models. By training on multiple reasoning techniques, including step-by-step reasoning, recall-then-generate, recall-reason-generate, and direct answering, the model also learns to determine the most effective solution strategy for each task. Orca 2 was evaluated on 15 diverse benchmarks (covering roughly 100 tasks and over 36,000 unique prompts). The results show that Orca 2 significantly surpasses similarly sized models in zero-shot settings on complex tasks, and reaches performance comparable to or better than models 5-10 times its size. Orca 2 is open-sourced to encourage further research on the development, evaluation, and alignment of small language models.
【Guiding Question】 How can improved training signals enhance the reasoning abilities of small language models?
【Paper Link】 https://arxiv.org/pdf/2311.11045
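The strategy-specific training described in the summary can be sketched in a few lines. This is a minimal illustration of the idea, not code from the Orca 2 release: the teacher model answers under a detailed, task-specific strategy instruction, but the student is trained against a generic instruction, so it must internalize the strategy itself. All names here (`STRATEGIES`, `make_training_example`) are hypothetical.

```python
# Illustrative sketch of strategy-conditioned teaching with "instruction
# erasing": the teacher sees a detailed strategy prompt; the student's
# training example replaces it with a generic one.

STRATEGIES = {
    "step_by_step": "Solve the problem step by step, explaining each step.",
    "recall_then_generate": "First recall the relevant facts, then answer.",
    "recall_reason_generate": "Recall facts, reason over them, then answer.",
    "direct_answer": "Answer directly and concisely.",
}

GENERIC_INSTRUCTION = "You are a helpful assistant."

def make_training_example(task_prompt, strategy, teacher_response):
    """Pair the teacher's strategy-conditioned response with a generic
    system instruction, erasing the detailed strategy prompt."""
    assert strategy in STRATEGIES
    return {
        "system": GENERIC_INSTRUCTION,  # detailed strategy prompt erased
        "user": task_prompt,
        "assistant": teacher_response,
    }

ex = make_training_example(
    "What is 17 * 24?",
    "step_by_step",
    "17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408.",
)
print(ex["system"])  # the student never sees the step-by-step instruction
```

Trained on many such pairs across strategies, the student learns both the strategies and, implicitly, when each one pays off.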
Retrieval meets Long Context Large Language Models
Amortizing intractable inference in large language models
How Do Large Language Models Capture the Ever-changing World Knowledge? A Review
Memory Augmented Language Models through Mixture of Word Experts
FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation
BitNet: Scaling 1-bit Transformers for Large Language Models
AutoMix: Automatically Mixing Language Models
Are Large Language Models Post Hoc Explainers?
Interactive Task Planning with Language Models
Aligning Text-to-Image Diffusion Models with Reward Backpropagation
LayoutPrompter: Awaken the Design Ability of Large Language Models
Controlled Decoding from Language Models
How FaR Are Large Language Models From Agents with Theory-of-Mind?
Large Language Models Cannot Self-Correct Reasoning Yet
NEWTON: Are Large Language Models Capable of Physical Reasoning?
Can Large Language Models be Good Path Planners? A Benchmark and Investigation o
Simple and Scalable Strategies to Continually Pre-train Large Language Models
Gemma 2 Released! 9B vs 27B Parameter Showdown, Test Video [Chinese-English Subtitles]
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
Toward Joint Language Modeling for Speech Units and Text
Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild
Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-S
TEAL: Tokenize and Embed ALL for Multi-modal Large Language Models
LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Language Models can be Logical Solvers
Can a student Large Language Model perform as well as it's teacher?
OpenWebMath: An Open Dataset of High-Quality Mathematical Web Text
CogVLM: Visual Expert for Pretrained Language Models
TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models
Tuna: Instruction Tuning using Feedback from Large Language Models
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning
Making Large Language Models Perform Better in Knowledge Graph Completion
TiC-CLIP: Continual Training of CLIP Models
Llemma: An Open Language Model For Mathematics
Video Language Planning
TrustLLM: Trustworthiness in Large Language Models
Large Language Model Cascades with Mixture of Thoughts Representations for Cost-
More Agents Is All You Need
SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents