Impact of Generative AI on Software Development

Raghav · Published in b8125-fall2023 · Nov 15, 2023

1) Context:

The Software Development Lifecycle (SDLC) is a complex, multifaceted process that unfolds across four critical stages: Plan, Develop, Operate, and Secure, each encompassing a myriad of tasks and responsibilities. In the “Plan” phase, which involves wireframing and designing mockups, Large Language Models (LLMs) can support UI/UX design: they can build a first version of front-end UI prototypes and visualizations from plain-English instructions, which designers then refine and iterate on. For instance, an AI can generate a preliminary draft of a website from the instruction “build a website like apple.com for an online chat platform.” Because UI/UX design is primarily visual and creative, though, the impact of LLMs here is moderate, and they add only marginal incremental value over existing low/no-code tools.
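As a concrete sketch of that workflow, the snippet below sends a plain-English design instruction to a chat-completion API and saves the returned HTML as a first-draft prototype. The model name, prompt, and file name are illustrative assumptions, not a prescription:

```python
# Hypothetical sketch: generate a first-draft UI from an English
# instruction. Model choice and prompt are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4",  # any capable chat model would do here
    messages=[{
        "role": "user",
        "content": "Build a single-page website like apple.com "
                   "for an online chat platform. Return only HTML/CSS.",
    }],
)

# Save the draft so a designer can open, critique, and iterate on it.
with open("prototype.html", "w") as f:
    f.write(response.choices[0].message.content)
```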

Moving to the “Develop” phase, where developers use Integrated Development Environments (IDEs) and other tools to write and test code, LLMs like OpenAI’s Codex come into play. They offer significant differentiation by generating code, documentation, and unit tests. Codex can, for example, produce code snippets, auto-complete repetitive edits, and draft unit tests, yielding meaningful time savings on boilerplate code and test creation in particular. The impact here is high: these tools take on the most time-consuming and repetitive aspects of coding, enhancing overall productivity.
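To illustrate the kind of boilerplate this saves, consider an invented example: given the small slugify function below, a Codex-style assistant can draft a serviceable first round of unit tests. Everything in the sketch, from the function to the test cases, is hypothetical output, not verbatim Codex:

```python
# Invented example: given the small function below, Codex-style tools
# can draft boilerplate unit tests like the ones that follow.
import unittest

def slugify(title: str) -> str:
    """Lowercase a title and replace spaces with hyphens."""
    return title.strip().lower().replace(" ", "-")

class TestSlugify(unittest.TestCase):
    # Representative of tool-generated test cases.
    def test_basic(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_strips_whitespace(self):
        self.assertEqual(slugify("  Padded Title "), "padded-title")

    def test_already_slug(self):
        self.assertEqual(slugify("ready"), "ready")

if __name__ == "__main__":
    unittest.main()
```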

2) Large Language Model Evolution & Landscape:

The evolution of LLMs for software development has undergone distinct phases. In the initial period from 2018 to 2020, large-scale language models, such as OpenAI’s GPT and Google’s BERT, were launched, addressing a wide array of use cases, from content generation to data analytics. However, their efficacy in software development-focused tasks was limited due to constraints related to model training and data.

A significant turning point occurred with the launch of Codex in 2021. Unlike OpenAI’s earlier GPT series, which served as general-purpose models, Codex was specifically designed for software development, leveraging the GPT-3 base. Trained primarily on public GitHub codebases (approximately 160GB), Codex marked a departure from the broader open-ended use cases to focus on the intricacies of software development.

Following Codex’s introduction, multiple third-party solutions were built on its capabilities, helping developers save time on relatively simple software development use cases. These solutions, most notably GitHub Copilot, demonstrated Codex’s practical value by suggesting code snippets inline, streamlining the coding process.

The landscape of language models in software development continued to evolve, with several big tech players and research organizations launching their own models focused on addressing the specific challenges of software development. This proliferation indicated a growing recognition of the significance of language models in enhancing various aspects of the development process.

3) Feedback from Developers Using Codex-like Tools:

Feedback from developers using Codex-powered tools, exemplified by GitHub Copilot, has been predominantly positive. Developers have highlighted the substantial time savings, productivity enhancements, and simplification of work processes. Copilot, in particular, has been praised for its ability to generate a good first version of code, acting as a coding wingman for developers.

The positive feedback emphasizes several key aspects:

  • Generates a Good First Version: Copilot excels at suggesting routine content, such as repetitive sentences or code, almost perfectly. It learns and improves with usage, significantly boosting developers’ productivity.
  • Expedited Documentation: Developers no longer need to spend countless hours explaining code snippets for documentation and future maintenance. Copilot can generate a decent explanation of what a snippet of code does, streamlining the documentation process.
  • Autocompletion of Repetitive Edits: Copilot takes the drudgery out of coding by autocompleting repetitive edits, especially where a few lines must be repeated with small differences (see the sketch after this list).
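A hypothetical before/after makes these points concrete. In the sketch below (the function, data, and suggestion are all invented), the developer types only the comment and the signature, and a Copilot-style assistant fills in the body, the docstring, and the near-identical lines that follow:

```python
# The developer writes the comment and signature; everything after
# is the kind of completion a Copilot-style tool suggests (invented).

# map a numeric score (0-100) to a letter grade
def letter_grade(score: int) -> str:
    """Return the letter grade for a 0-100 numeric score."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"

# Repetitive edits: after the first assignment is typed, the
# remaining near-identical lines are usually autocompleted.
user = {"name": "Ada", "email": "ada@example.com", "role": "admin"}
name = user["name"]
email = user["email"]
role = user["role"]

print(letter_grade(85), name, email, role)
```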

While the feedback indicates tangible benefits, it’s essential to acknowledge certain limitations in the current state of Codex-powered tools:

  • Limited Accuracy in Generated Code: Code generated by Codex contains errors and may not work correctly in 50–60% of cases. The model predicts the next token from patterns observed in its training codebases, which sometimes yields plausible-looking but inaccurate suggestions.
  • Low-Quality Code: Since Codex is trained on publicly available code, the quality is not always consistent, and it may not follow the best path to a solution. It can include verbose code and potentially expose software to security vulnerabilities.
  • Copyright Infringement Concerns: There are instances where the code generated by Codex may raise copyright or licensing issues if it includes snippets from the training set, limiting its applicability for enterprise software.
  • Code Length and Complexity: Codex is most effective for writing small functions or pieces of code that perform specific tasks. Its performance decreases with the complexity of code, making it less suitable for handling large, intricate applications.
  • Bug/Error Propagation: If the surrounding code contains subtle bugs, Codex may suggest code with similar issues, inadvertently propagating errors (see the illustration after this list).
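Here is a contrived illustration of that failure mode (both functions are invented): the surrounding file contains an off-by-one loop bound, and a completion model conditioned on it may mirror the same bug in new code:

```python
# Contrived illustration: the surrounding file already contains an
# off-by-one bug in its loop bound.
def sum_first_n(values, n):
    total = 0
    for i in range(n - 1):   # BUG: skips the nth element
        total += values[i]
    return total

# A completion model conditioned on the file above may reproduce
# the same wrong bound in its suggestion:
def max_first_n(values, n):
    best = values[0]
    for i in range(n - 1):   # propagated bug: same wrong bound
        best = max(best, values[i])
    return best

print(sum_first_n([1, 2, 3], 3))  # prints 3, not the expected 6
```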

4) Conclusion: Future of LLMs in Software Development:

Looking ahead, the future of Large Language Models (LLMs) in software development holds both promise and challenges. Continued gains in accuracy, and remedies for the limitations above, are expected to come from more training parameters, higher-quality training data, and larger datasets.

While current applications of LLMs in software development focus primarily on efficiency gains, helping junior developers write code, tests, and documentation faster, market participants envision a broader role for LLMs over the next decade. The anticipation is that AI will not only let business users with minimal coding skills write software, but will also enable the creation of full-fledged applications from natural-language descriptions, minimizing the need for human intervention in code creation.

The structural transformation of IT teams is anticipated to occur within the next five years, with a shift towards significantly leaner teams. The distribution of talent is expected to evolve from a pyramid shape to a diamond shape, as AI takes on the generation of boilerplate code traditionally handled by junior developers.

This transformation is expected not only to revolutionize the coding landscape but also to reshape the market for software development tools. Spend on these tools is predicted to decrease as AI advances, even as the output of software engineers potentially increases tenfold by 2030. This speculative glimpse into the future underscores the transformative potential of LLMs in reshaping the software development landscape.

Key Data Sources:

“Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning” by Gordon, A., Duh, K., Andrews, N., et al. February 2020

“Language Models are Few-Shot Learners” by Brown, T., Mann, B., Ryder, N., et al. May 2020

“Dataset Cartography” by Swayamdipta, S., Schwartz, R., Lourie, N., et al. October 2020

“Evaluating Large Language Models Trained on Code” by Chen, M., Tworek, J., Jun, H., Yuan, Q., et al. July 2021

“Unified Scaling Laws for Routed Language Models” by Paganini, M., Hoffmann, J., Damoc, B., et al. February 2022

“Competition-Level Code Generation with AlphaCode” by Li, Y., Choi, D., Chung, J., et al. February 2022

“Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer” by Yang, G., Hu, E., Babuschkin, I., et al. March 2022

“Training Compute-Optimal Large Language Models” by Hoffmann, J., Borgeaud, S., Mensch, A., et al. March 2022
