CMU & Inspired Cognition’s DocPrompting Improves Code Generation by Retrieving Relevant Documentation

Published in SyncedReview · 3 min read · Mar 1, 2023

The ability of large language models (LLMs) to generate computer code from natural language (NL) prompts has revolutionized the programming domain. Most contemporary models, however, can only generate code for libraries and function calls they saw during training, and they struggle with the new libraries and functions that are constantly being introduced. A human programmer facing such a challenge would typically look up user manuals and other relevant documents to become familiar with the new library or function. Could LLMs be taught to do the same?

In the new paper DocPrompting: Generating Code by Retrieving the Docs, a research team from Carnegie Mellon University and Inspired Cognition presents DocPrompting, a novel NL-to-code generation approach. Given an NL intent that calls for unseen functions or libraries, DocPrompting retrieves the corresponding code documentation and provides it to the model, so that the model can generate code based on both the intent and the retrieved docs.
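The retrieve-then-generate idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny `DOCS` pool, the token-overlap scoring, and the helper names `retrieve` and `build_prompt` are hypothetical and are not the retriever or prompt format used in the paper. The sketch only shows the general shape of the pipeline: rank documentation entries against the NL intent, then place the top entries in front of the intent before passing the prompt to a code generator.

```python
import re
from collections import Counter

# Hypothetical documentation pool: entry name -> short description.
DOCS = {
    "tar -x": "extract files from an archive",
    "tar -c": "create a new archive",
    "grep -r": "search files recursively for a pattern",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(intent, docs, k=2):
    """Return the k (name, description) pairs sharing the most tokens with the intent.

    Token overlap is a stand-in for a real retriever; it is used here only
    to keep the example self-contained.
    """
    query = Counter(tokenize(intent))

    def score(item):
        _, description = item
        return sum((query & Counter(tokenize(description))).values())

    return sorted(docs.items(), key=score, reverse=True)[:k]


def build_prompt(intent, retrieved):
    """Place the retrieved documentation ahead of the NL intent."""
    doc_lines = [f"# doc: {name}: {description}" for name, description in retrieved]
    return "\n".join(doc_lines + [f"# intent: {intent}", "# code:"])


intent = "extract files from the archive backup.tar"
prompt = build_prompt(intent, retrieve(intent, DOCS, k=2))
print(prompt)  # this prompt would then be passed to a code generation model
```

Because the documentation pool is separate from the generator, new entries can be added to it without retraining the model, which is the property that makes this kind of approach attractive for unseen libraries.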

DocPrompting is inspired by programmers’ use of manuals and documentation when encountering unseen/unused functions or libraries. The approach…

Written by Synced