Graph of Thoughts: Enhancing Reasoning with Explicit Relational Structures

Introduction

Jesse Jing
Towards NeSy
2 min read · Nov 11, 2023


In this post, I present a preprint by researchers from Shanghai Jiao Tong University titled “Beyond Chain-of-Thought: Effective Graph-of-Thought Reasoning in Large Language Models.” The paper is a notable step beyond chain-of-thought prompting, which reframes a question and decomposes it into a sequence of smaller steps. Instead of relying on that purely step-by-step approach, the authors introduce an intermediate graph-construction stage and use a graph neural network to incorporate relational information as an additional modality.

Figure 2 from https://arxiv.org/pdf/2305.16582.pdf

Main Idea

The main workflow, depicted in Figure 2 of the paper, is largely self-explanatory. The model answers questions in two stages: in the first stage it generates rationales for the problem, and in the second stage it produces the final answer. On top of the two input modalities (text and an optional image), the authors add a Graph-of-Thought constructor that builds an intermediate graph, which is then fed into a Graph Neural Network (GNN).
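To make the constructor step concrete, here is a minimal sketch of turning a rationale into a thought graph. This is not the paper's actual extraction pipeline (the triples and the `build_thought_graph` helper are hypothetical, assumed for illustration); it only shows the data shape a GNN would then consume.

```python
# Hypothetical sketch: build a thought graph (adjacency list) from
# (head, relation, tail) triples extracted from a generated rationale.
# The extraction method here is assumed, not the paper's constructor.
from collections import defaultdict

def build_thought_graph(triples):
    """Map each head node to the set of tail nodes it points to."""
    graph = defaultdict(set)
    for head, _relation, tail in triples:
        graph[head].add(tail)
    return graph

# Toy rationale triples for illustration only.
rationale_triples = [
    ("ice", "is", "frozen water"),
    ("frozen water", "melts into", "liquid water"),
]
graph = build_thought_graph(rationale_triples)
```

In the paper's pipeline, nodes like these would be embedded and passed through a GNN, whose output is fused with the text (and image) features before the answer-generation stage.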

The graph modality is meant to mimic basic deductive patterns, such as inferring A → C when A → B and B → C hold. The reported results are strong.
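The deduction pattern above amounts to reachability over the thought graph: anything reachable from A by following edges is an implicit conclusion about A. A minimal breadth-first sketch (edge data here is illustrative, not from the paper):

```python
# Sketch of the deductive pattern the graph modality is meant to capture:
# given edges A -> B and B -> C, node C is reachable from A (i.e., A -> C).
from collections import deque

def reachable(graph, start):
    """Return every node reachable from `start` via directed edges (BFS)."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

edges = {"A": ["B"], "B": ["C"]}
assert "C" in reachable(edges, "A")  # the A -> C conclusion
```

The GNN does not run an explicit search like this, of course; the point is that message passing over such a graph lets relational information flow along exactly these paths.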

Performance

On a challenging math dataset (text modality only), the proposed model, at a larger parameter size, outperformed GPT-3.5 and GPT-3. It falls about 10% short of GPT-4 in accuracy, but GPT-4 is a complex system that may call mathematical APIs in the backend, which makes the comparison arguably unfair (in my opinion).

For the ScienceQA dataset, which incorporates both text and image modalities, the large version of GoT-T5 achieved state-of-the-art performance.

Notes at the end

The relational structure extracted from the rationales makes it easier for the pipeline to grasp the essential information in the related data. In both stages, the Large Language Model (LLM) serves more as an interpreter than a reasoner, relieving it of the burden of multi-step reasoning. The graph of thoughts instead flattens the reasoning process into a single relational structure.

Reference

Beyond Chain-of-Thought: Effective Graph-of-Thought Reasoning in Large Language Models. https://arxiv.org/abs/2305.16582
