The key to software development success with ChatGPT: Use it only for what it’s good at
Pretty much everything in life can be used for good or evil — from fire and water to machines, weapons, and yes, artificial intelligence (AI). Which should make the answer to the question, “Is AI good or bad?” obvious. It’s both.
The version of AI getting most of the media blowtorch over the past few months is, of course, ChatGPT, a large language model (LLM) chatbot launched at the end of November 2022 by OpenAI that rocketed to more than 100 million users and more than 620 million website visits in two months.
Along with the fascination over what it can or can’t do is the basic question of what it is. Is it a toy, a tool, an asset, a threat, something else? Yes, yes, yes, yes, and yes. Like many digital products, what it is depends in large measure on how it’s used.
On the plus side, one university professor said its responses to questions on philosophy, religion, and science are as good as or better than what he gets from undergrad students.
Plenty of users celebrate that it can churn out in seconds the kind of boilerplate writing (standard legal contracts or press releases) that would take a human at least minutes, if not hours. They say that if trained properly, it will be able to write error-free software code, which would make any software product much more secure.
But then, LLMs can also enable cheating, as college students have already demonstrated. Cybercriminals can enlist them to write malware or to help launch more effective attacks.
All that is not only possible but already happening, though not at all perfectly: plenty of critics note that the free version of ChatGPT only contains data going up to 2021 and makes plenty of mistakes, some of them major.
Author, blogger, and activist Cory Doctorow, who writes frequently on cybersecurity topics, dismisses AI in general with the declaration that “AI isn’t ‘artificial’ and it’s not ‘intelligent.’ ‘Machine learning’ doesn’t learn,” and ChatGPT specifically as “best understood as a sophisticated form of autocomplete — not our new robot overlord.”
Who’s using whom?
But what matters is not so much what its capabilities are now. As plenty of experts say, LLMs will only get better and more powerful, for better and worse. As ZDNet put it about ChatGPT last month, “many see it, and other AI tools, as the future of how we’ll use the internet.”
Or, cynics might say, how the internet will use us. There are serious discussions over how powerful LLMs could become. Ars Technica reported last week that OpenAI recently gave early access to GPT-4 to a research consultancy to see if it could do things like “make high-level plans, set up copies of itself, acquire resources, hide itself on a server, and conduct phishing attacks.”
While the researchers concluded that the latest version of the chatbot was “ineffective at autonomously replicating, acquiring resources, and avoiding being shut down ‘in the wild’,” they did note that it was able to trick a worker at the online marketplace TaskRabbit into solving a CAPTCHA (designed to prove a user isn’t a robot) for it.
But for now, and probably for the foreseeable future, organizations that hope to reap the benefits and mitigate the risks of AI should remember that it needs human oversight, and lots of it. Consider that despite speculation several years ago that fully autonomous vehicles would be arriving soon, they are nowhere near ready for prime time. AI will probably need that oversight for a long time.
Its limits and liabilities are right in its name — its “intelligence” is artificial. It’s only as “smart” as its dataset and the training provided by humans.
And, as Doctorow notes, it’s not terribly creative. “Like all statistical inference tools, autocomplete is profoundly conservative,” he wrote. “It wants you to do the same thing tomorrow as you did yesterday.”
Maybe that’s a problem if you’re thinking it’s going to help you write the best song ever. But what if that’s exactly what software developers want? Jamie Boote, senior consultant with the Synopsys Software Integrity Group, has said since ChatGPT’s release that its value isn’t that it can break a lot of new ground but that it can rely on the massive amount of data it contains to do what has been done before much more quickly — as in do the same thing tomorrow that you did yesterday, only much faster.
“You used to need a human brain to do a lot of programming grunt work,” Boote said, “but because ChatGPT has been trained, probably months and months or years and years of training of this model, the result is that all that upfront uploaded work means ChatGPT can respond in seconds.”
Not to mention that it doesn’t need to spend time tapping a computer keyboard. The text just flows in as if it had been copied and pasted — which it sort of has.
The 24/7 ‘employee’
For the software industry, that could have enormous productivity implications. Instead of junior developers doing programming grunt work, LLMs could do it 24/7 without needing a salary, benefits, or lunch breaks.
It would amount to “an infinitely large team of developers who all interpret spoken requests and turn them into code,” Boote said.
It could also, over time, yield essentially flawless code — the benefit of automation is that it can do the same exact thing time after time.
That doesn’t mean AI tools are risk free. Jeff Delaney, vice president of engineering with the Synopsys Software Integrity Group, said he thinks ChatGPT has value today but still requires human oversight since it can “generate protected source code without telling you.”
Delaney also demonstrated that another AI tool, Copilot, generated code but failed to catch a licensing conflict in an open source component of it. If licensing conflicts make it into production, they could prove very expensive for an organization.
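To make the licensing concern concrete, here is a minimal sketch of the kind of check a human reviewer might run before generated code ships. It is an illustration only, not Delaney's demonstration or any vendor's tool: the allowlist, the component names, and the function `find_license_conflicts` are all hypothetical, and which licenses count as a conflict depends entirely on an organization's own policy.

```python
# Hypothetical policy: licenses this (imaginary) organization permits in shipped code.
ALLOWED_LICENSES = {"MIT", "BSD-3-Clause", "Apache-2.0"}


def find_license_conflicts(components, allowed=ALLOWED_LICENSES):
    """Return sorted (name, license) pairs whose license is not on the allowlist."""
    return sorted(
        (name, lic) for name, lic in components.items() if lic not in allowed
    )


# Made-up inventory for illustration; a real one would come from a
# software composition analysis of the generated code.
components = {
    "left-pad-clone": "MIT",
    "fastparse": "GPL-3.0-only",
    "tinyhttp": "Apache-2.0",
}

conflicts = find_license_conflicts(components)
print(conflicts)
```

In this made-up inventory, only the component whose license falls outside the assumed policy is flagged for a human to review.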
Boote, in a recent post in The New Stack, noted that AI tools come with “unique risks” arising from “how the AI is trained, built, hosted and used differently than other software tools.”
That list is essentially a roadmap of things that need to be addressed to mitigate those risks and to make sure the tool is going to do what you want it to do and not do what you don’t want it to do. That requires some vetting.
For starters, prospective users of an AI chatbot should forget that it’s the latest shiny new tech toy and ask if it’s really a good fit for what they are trying to do. “The further a task can be broken down into ‘making a decision according to learned rules’ or ‘writing content that passes learned rules,’ the better the AI will be at it,” Boote wrote. “As the problem changes beyond that, the AI will get worse at it.”
Assuming it is a good fit, there are still other issues to address. Among them:
- Most organizations won’t have a data center to provide the “dedicated and expensive hardware” an AI tool requires. That means it will be hosted remotely and require the same security measures that human remote workers do.
- An AI tool will also be dealing with, and helping to create, intellectual property (IP), which means it will need “safeguards in place to prevent IP loss as code leaves the boundary,” Boote wrote.
- Know whether the tool’s vendor is using you as you use the tool, or at least using the input your organization creates to train the AI model. “Owners, for example, might want to keep advertisers from influencing the AI bot in a direction that benefits their clients,” he wrote.
- Check the accuracy of its results. ChatGPT and other AI chatbots are already notorious for making false declarations, a flaw known as AI hallucination. “Understanding how and where an AI may hallucinate can help manage it when it does,” Boote wrote.
- Humans still have to supervise. As noted in the previous point, AI chatbots can automate numerous repetitive tasks but are not yet autonomous. As Boote put it, “Until AI matures further and its drawbacks are understood or mitigated, humans will still need to be kept in the loop. Due to limitations in this [ChatGPT] beta version’s training and model, the code it produces isn’t always perfect. It often contains business logic flaws that can change the way the software operates, syntax errors where it may blend different versions of software and other problems that appear human in origin.”
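One of the safeguards in the list above, preventing IP loss as code leaves the boundary, can be sketched in a few lines. What follows is an assumption-laden illustration rather than anything Boote prescribes: the single regular expression and the `redact` helper are hypothetical, and a real deployment would need patterns tuned to its own secrets and a far more thorough review.

```python
import re

# Hypothetical pattern for illustration: assignments to names that look like credentials.
SECRET_PATTERNS = [
    re.compile(r"(?i)(api[_-]?key|password|token)\s*=\s*['\"][^'\"]+['\"]"),
]


def redact(prompt):
    """Mask credential-like assignments before a prompt is sent to a hosted AI tool."""
    for pattern in SECRET_PATTERNS:
        prompt = pattern.sub(lambda m: f"{m.group(1)} = '<REDACTED>'", prompt)
    return prompt


snippet = 'API_KEY = "sk-12345"\nprint("hello")'
print(redact(snippet))
```

The point of the sketch is the placement of the check, on the organization's side of the boundary, not the pattern itself, which will miss plenty of real-world secrets.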
In some ways, as Boote has said a number of times, ChatGPT is similar to a junior developer. It’s much faster but it still needs to be managed. “Every line of code will have to be tested, and some will have to be fixed. However, initial reports are that this proofreading process is faster and easier than writing code from scratch,” he said.
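What that proofreading might look like in practice is sketched below, under stated assumptions: `apply_discount` is a hypothetical stand-in for a function a chatbot could produce, and `review_checks` holds the human-written checks that probe it for the business logic flaws Boote describes. Neither comes from the article's sources.

```python
def apply_discount(price, percent):
    """Hypothetical stand-in for a function a chatbot might generate."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


def review_checks():
    """Human-written checks run against the generated function."""
    # Typical case and both boundary cases.
    assert apply_discount(200.0, 25) == 150.0
    assert apply_discount(80.0, 0) == 80.0
    assert apply_discount(80.0, 100) == 0.0
    # An out-of-range discount must be rejected, not silently applied.
    try:
        apply_discount(80.0, 150)
    except ValueError:
        pass
    else:
        raise AssertionError("out-of-range percent should be rejected")
    return True


print(review_checks())
```

Writing the checks still takes a human who knows what the business rule is supposed to be, which is the supervision the junior-developer comparison implies.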
And faster and easier is a win-win.
Obviously it’s still early in what is likely to be another revolution, or at least evolution, in technology. Malicious actors will keep seeking to exploit AI’s weaknesses, as they already do, while the law-abiding community will seek to mitigate them.
For software developers, the best advice for the moment is likely coming from experts like Boote: “Understanding what AI is and isn’t good at is key.”