A Theoretical Understanding of Chain-of-Thought: Coherent Reasoning and Error-Aware Demonstration

October 21, 2024

Yingqian Cui, Pengfei He, Xianfeng Tang, Qi He, Chen Luo, Jiliang Tang, Yue Xing

Computer Science

Statistics

Computation and Language

Artificial Intelligence

Machine Learning

Few-shot Chain-of-Thought (CoT) prompting has demonstrated strong performance in improving the reasoning capabilities of large language models (LLMs). While theoretical investigations have been conducted to understand CoT, the underlying transformer used in these studies isolates the CoT reasoning process into separate in-context learning steps (Stepwise ICL). In this work, we theoretically show that, compared to Stepwise ICL, the transformer gains better error-correction ability and makes more accurate predictions when the reasoning from earlier steps is integrated (Coherent CoT). Given that this coherent reasoning changes the behavior of the transformer, we further investigate the sensitivity of the transformer with Coherent CoT when the demonstration examples are corrupted at the inference stage. Our theoretical results indicate that the transformer is more sensitive to errors in intermediate reasoning steps than to errors in the final outcome. Building upon this observation, we propose an improvement on CoT by incorporating both correct and incorrect reasoning paths in the demonstration. Our experiments validate the effectiveness of the proposed approach.
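
The idea of placing both an incorrect and a correct reasoning path in each demonstration can be illustrated with a small prompt builder. This is only a sketch under stated assumptions: the abstract does not specify a prompt format, so the `Demo` fields, the section labels ("Incorrect reasoning", "Why it is wrong", "Correct reasoning"), and the function names below are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Demo:
    """One few-shot demonstration (hypothetical structure, not from the paper)."""
    question: str
    wrong_steps: str    # a flawed intermediate reasoning path
    error_note: str     # short explanation of the flaw
    correct_steps: str  # the corrected reasoning path
    answer: str


def format_demo(d: Demo) -> str:
    # Show the incorrect path first, then the explanation and the correct path,
    # so the model sees the error and its correction side by side.
    return (
        f"Q: {d.question}\n"
        f"Incorrect reasoning: {d.wrong_steps}\n"
        f"Why it is wrong: {d.error_note}\n"
        f"Correct reasoning: {d.correct_steps}\n"
        f"A: {d.answer}"
    )


def build_prompt(demos: List[Demo], query: str) -> str:
    # Join the demonstrations and leave the final reasoning slot open
    # for the model to complete.
    blocks = [format_demo(d) for d in demos]
    blocks.append(f"Q: {query}\nCorrect reasoning:")
    return "\n\n".join(blocks)


demo = Demo(
    question="What is 2 + 3 * 4?",
    wrong_steps="2 + 3 = 5, then 5 * 4 = 20.",
    error_note="Multiplication must be done before addition.",
    correct_steps="3 * 4 = 12, then 2 + 12 = 14.",
    answer="14",
)
prompt = build_prompt([demo], "What is 3 + 4 * 2?")
print(prompt)
```

The resulting string can be passed to any LLM completion interface. Whether the incorrect path should precede or follow the correct one is a design choice that this sketch does not settle.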

Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters

December 20, 2022

95% Match

Boshi Wang, Sewon Min, Xiang Deng, Jiaming Shen, You Wu, ... , Sun Huan

Computation and Language

Chain-of-Thought (CoT) prompting can dramatically improve the multi-step reasoning abilities of large language models (LLMs). CoT explicitly encourages the LLM to generate intermediate rationales for solving a problem, by providing a series of reasoning steps in the demonstrations. Despite its success, there is still little understanding of what makes CoT prompting effective and which aspects of the demonstrated reasoning steps contribute to its performance. In this paper, we...


Unveiling the Statistical Foundations of Chain-of-Thought Prompting Methods

August 25, 2024

94% Match

Xinyang Hu, Fengzhuo Zhang, ... , Yang Zhuoran

cs.AI

cs.CL

cs.LG

math.ST

stat.ML

stat.TH

Chain-of-Thought (CoT) prompting and its variants have gained popularity as effective methods for solving multi-step reasoning problems using pretrained large language models (LLMs). In this work, we analyze CoT prompting from a statistical estimation perspective, providing a comprehensive characterization of its sample complexity. To this end, we introduce a multi-step latent variable model that encapsulates the reasoning process, where the latent variable encodes the task i...


Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models

April 23, 2023

94% Match

Jiashuo Sun, Yi Luo, Yeyun Gong, Chen Lin, Yelong Shen, ... , Duan Nan

Computation and Language

Artificial Intelligence

Large language models (LLMs) can achieve highly effective performance on various reasoning tasks by incorporating step-by-step chain-of-thought (CoT) prompting as demonstrations. However, the reasoning chains of demonstrations generated by LLMs are prone to errors, which can subsequently lead to incorrect reasoning during inference. Furthermore, inappropriate exemplars (overly simplistic or overly complex) can affect overall performance across varying levels of difficulty. We introd...


A Hopfieldian View-based Interpretation for Chain-of-Thought Reasoning

June 18, 2024

94% Match

Lijie Hu, Liang Liu, Shu Yang, Xin Chen, Hongru Xiao, Mengdi Li, Pan Zhou, ... , Wang Di

Computation and Language

Artificial Intelligence

Human-Computer Interaction

Machine Learning

Chain-of-Thought (CoT) plays a significant role in improving the reasoning performance of large language models (LLMs). While some studies focus on improving CoT accuracy through methods like retrieval enhancement, a rigorous explanation for why CoT achieves such success is still lacking. In this paper, we analyze CoT methods under two different settings by asking the following questions: (1) For zero-shot CoT, why does prompting the model with "let's think step by step...


February 18, 2025

93% Match

Yingqian Cui, Pengfei He, Jingying Zeng, Hui Liu, Xianfeng Tang, Zhenwei Dai, Yan Han, Chen Luo, Jing Huang, Zhen Li, Suhang Wang, Yue Xing, ... , He Qi

Computation and Language

Artificial Intelligence

Machine Learning

Chain-of-Thought (CoT) reasoning, which breaks down complex tasks into intermediate reasoning steps, has significantly enhanced the performance of large language models (LLMs) on challenging tasks. However, the detailed reasoning process in CoT often incurs long generation times and high computational costs, partly due to the inclusion of unnecessary steps. To address this, we propose a method to identify critical reasoning steps using perplexity as a measure of their importa...
