Chatbot Showdown: Typelevel Scala Edition

Kacper Korban
VirtusLab
Jun 23, 2023

Recently, chatbots have been discussed even more heavily than direct style in Scala, so I decided to test whether they can help write Scala code. More specifically, whether they can write an application using the Typelevel stack.

We’ll conduct an experiment to test the capabilities of each chatbot. Let’s call it: “Chatbot Showdown”. The rules will be as follows:

  • Every chatbot will be given a programming task
  • Every chatbot will be asked 3 times to generate the solution
  • The prompts will include functional requirements and technical details (e.g. library versions)
  • They will be scored based on how much work one has to put in to get the examples working

The chatbots we are going to be testing are:

  • ChatGPT-4
  • Google Bard
  • Microsoft Bing Chat
  • Sourcegraph Cody
  • CatGPT

The task

The functionality we want to implement is a lawsuit prevention system from the Rust Foundation. In other words, a program that, for a given text, will check if it contains the string “rust”, since that is obviously a serious crime.

For the interface, we’ll choose something widespread — an HTTP API. We want to use http4s for the server implementation and circe for JSON (de)serialization.

The prompt I wrote for this task is as follows:

I want to create an application in Scala 3. The application should be written using the Typelevel stack. Given some text, this application will check if the text contains the word “rust”. It should be case-insensitive. The application should provide its API using HTTP requests. Write this application for me. Use http4s 0.23.10 and circe 0.14.1.
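
Stripped of HTTP and JSON, the core requirement in the prompt is a one-line check. As a minimal sketch using only the Scala standard library (the name `containsRust` is mine, not taken from any generated solution), it could look like this:

```scala
// Hypothetical helper name; only the Scala standard library is used.
// Note: this is a substring match, so "trust" and "Rusty" also count,
// the same approach as the snippets shown in this post.
def containsRust(text: String): Boolean =
  text.toLowerCase.contains("rust")
```

Everything else the chatbots have to produce is wiring: decoding the request, calling a check like this one, and encoding the response.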

Bard

We’ll start with Bard. None of the 3 generated snippets worked from the start; each gave between 3 and 6 compilation errors.

The first one gives errors of this sort:

```
-- Error: bard1.scala:16:63 -----
16 | implicit val requestDecoder: Decoder[Request] = deriveDecoder
| ^
|No given instance of type deriving.Mirror.Of[A] was found for parameter A of method deriveDecoder in trait AutoDerivation
|
|where: A is a type variable with constraint
|. Failed to synthesize an instance of type deriving.Mirror.Of[A]:
| * class Nothing is not a generic product because it is not a case class
| * class Nothing is not a generic sum because it is not a sealed class
-- [E006] Not Found Error: bard1.scala:19:35
19 | def run(port: Int): Resource[IO, BlazeServer] = {
| ^^^^^^^^^^^
| Not found: type BlazeServer
|
| longer explanation available when compiling with `-explain`
-- [E008] Not Found Error: bard1.scala:21:21
21 | app <- HttpApp.of[IO] {
| ^^^^^^^^^^
| value of is not a member of object org.http4s.HttpApp
```

The first type of error is among the most loathed by Scala programmers, because it involves a missing implicit. In this case, it looks even more disgusting: what is missing is an implicit instance of a type that should be generated by the compiler, a `Mirror`. My guess is that the error is caused by Bard using automatic derivation in a semiautomatic way.
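
For context, a `Mirror` is compile-time metadata that Scala 3 synthesizes for case classes and sealed hierarchies, and derivation libraries build on it. The sketch below uses only the standard library (the `TextRequest` class is a made-up example, and this is not circe's implementation) to show that the compiler provides a mirror for a case class on demand. In the error above it was asked for a mirror of `Nothing`, which it cannot provide.

```scala
import scala.deriving.Mirror

// Hypothetical request type, used only for illustration.
final case class TextRequest(text: String)

// The compiler synthesizes this instance because TextRequest is a case class.
val mirror = summon[Mirror.ProductOf[TextRequest]]

// A mirror can rebuild the case class from a product of its fields.
val rebuilt: TextRequest = mirror.fromProduct(Tuple1("Rust is fine"))

// By contrast, `summon[Mirror.Of[Nothing]]` does not compile,
// since Nothing is neither a case class nor a sealed type.
```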

The last error seems to be the use of the wrong name, since routes are created using `HttpRoutes` instead of `HttpApp`. So even though that error seems relatively easy to fix, fixing the first one will suck all the joy out of my existence.

Onto the second snippet. One error is similar to what we saw before. The other ones are somewhat different.

```
-- [E008] Not Found Error: /home/kpi/chatbot-battle-royale/typelevel/http/bard2.scala:19:12
19 | HttpApp.serve(app).run(new HttpApp.Listener[IO](8080))
| ^^^^^^^^^^^^^
|value serve is not a member of object org.http4s.HttpApp - did you mean HttpApp.pure?
-- [E008] Not Found Error: /home/kpi/chatbot-battle-royale/typelevel/http/bard2.scala:25:13
25 | contains.pure[IO]
| ^^^^^^^^^^^^^
|value pure is not a member of Boolean, but could be made available as an extension method.
```

The second error looks like an import issue and is probably easy to fix. But the first one is enough to pronounce this program doomed: creating the server is the most essential part of the program, and Bard’s solution doesn’t look remotely similar to how it’s done in http4s.

The last hope is the file with the most errors. Fortunately, all the errors are about wrong imports, and the mistake is a very curious one: Bard removed almost all the package prefixes, e.g. changing `org.http4s` to `http4s`.

The raw file looks like so:

```scala
import cats.effect.{ExitCode, IO}
import cats.implicits._
import circe.generic.auto._
import circe.literal._
import http4s._
import http4s.circe._
import http4s.dsl.io._
import http4s.server.blaze.BlazeServerBuilder

object RustDetector extends IOApp {
  def run(args: List[String]): IO[ExitCode] = {
    val routes = HttpRoutes.of[IO] {
      case GET -> Root / "check" / text =>
        for {
          contains <- text.toLowerCase.contains("rust")
          response <- Ok(Json.fromString(contains.toString))
        } yield response
    }

    BlazeServerBuilder[IO]
      .bindHttp(port = 8080)
      .withHttpApp(routes)
      .serve
      .compile
      .drain
      .as(ExitCode.Success)
  }
}
```

After fixing the imports, we are down to just 3 errors.

```
-- [E006] Not Found Error: bard.scala:9:28
9 |object RustDetector extends IOApp {
| ^^^^^
| Not found: type IOApp
|
| longer explanation available when compiling with `-explain`
-- [E008] Not Found Error: bard.scala:14:22
14 | contains <- text.toLowerCase.contains("rust")
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| value flatMap is not a member of Boolean
-- [E007] Type Mismatch Error: bard.scala:21:19
21 | .withHttpApp(routes)
| ^^^^^^
| Found: (routes : org.http4s.HttpRoutes[cats.effect.IO])
| Required: org.http4s.HttpApp[cats.effect.IO]
```

The first one can be fixed automatically by Metals by adding a single import. The last one is also relatively easy if you know a bit about the http4s API: it can be fixed by adding `.orNotFound`. The middle error is just a misunderstanding of how `for` comprehensions work in Scala. The fix is also quite easy and requires moving the definition out of the `for` into a `val` before it. Fixing this one reveals one last error, but fortunately one that can also be fixed by Metals.
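
The `for` comprehension fix can be illustrated without http4s. In the sketch below, `Option` stands in for `IO` (an assumption made to keep the example dependency-free, and `respond` is a hypothetical name). Every `<-` desugars to `flatMap` or `map`, so a plain `Boolean` has to be bound with a `val` outside the comprehension instead.

```scala
// Option stands in for IO here; the desugaring rules are the same.
def respond(text: String): Option[String] = {
  // A plain Boolean has no flatMap/map, so it cannot sit on the right of `<-`.
  val contains = text.toLowerCase.contains("rust")
  for {
    response <- Some(contains.toString) // only wrapped values use `<-`
  } yield response
}
```

Binding the value inside the comprehension with `contains = ...` would also work, as long as at least one `<-` generator comes before it.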

After all that work, the service finally works, and we can curl it successfully.

Bing

Bing is up next. Out of the 3 generated code snippets, 2 worked straight out of the gate. And the one that didn’t was missing a single import. The other noticeable thing about Bing’s snippets is that they were consistent. Bard’s snippets varied in many aspects, like the length of code and the existence of request/response classes. Almost all of Bing’s answers looked the same, only differing in naming or the order of imports.

Here is what an example snippet looks like:

```scala
import cats.effect._
import org.http4s._
import org.http4s.dsl.io._
import org.http4s.implicits._
import org.http4s.server.blaze._
import io.circe.generic.auto._
import org.http4s.circe.CirceEntityCodec._

object Main extends IOApp {
  case class Text(text: String)
  case class Result(containsRust: Boolean)

  val service = HttpRoutes.of[IO] {
    case req @ POST -> Root / "check" =>
      for {
        text <- req.as[Text]
        response <- Ok(Result(text.text.toLowerCase.contains("rust")))
      } yield response
  }.orNotFound

  def run(args: List[String]): IO[ExitCode] =
    BlazeServerBuilder[IO]
      .bindHttp(8080, "localhost")
      .withHttpApp(service)
      .serve
      .compile
      .drain
      .as(ExitCode.Success)
}
```

Not much else to say; incredible performance from Bing.

ChatGPT

Moving on to ChatGPT-4. Out of the 3 generated pieces of code, only 1 works without requiring changes. The other two, on the other hand, give 4 errors each. Fortunately, both of the failing snippets can be fixed by changing the same import:

```scala
import org.http4s.dsl.Http4sDsl // -> import org.http4s.dsl.io._
```

Using `Http4sDsl` explicitly and extending our service with it is a valid option. But to make it work, `Http4sDsl` has to actually be extended somewhere, and the generated code never does that. So the import is correct, just mismatched with the implementation.
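
The underlying Scala mechanism is easy to reproduce in isolation. In this sketch, `MiniDsl`, `dsl`, and both services are hypothetical stand-ins, not http4s code. Members of a trait only come into scope if you extend the trait or import them from an object that extends it, which is the role `org.http4s.dsl.io` plays for `Http4sDsl[IO]`.

```scala
// Hypothetical miniature DSL; it only mimics the shape of the pattern.
trait MiniDsl {
  def ok(body: String): String = s"200 $body"
}

// Option 1: an object that extends the trait, whose members get imported.
object dsl extends MiniDsl

object ImportingService {
  import dsl._
  def handle(text: String): String = ok(text)
}

// Option 2: the service extends the trait itself.
object ExtendingService extends MiniDsl {
  def handle(text: String): String = ok(text)
}

// Importing only the trait name and then calling `ok(...)` without
// either option above fails to compile, because `ok` is not in scope.
```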

The working snippet looks like this:

```scala
import cats.effect._
import io.circe._
import io.circe.generic.auto._
import io.circe.syntax._
import org.http4s._
import org.http4s.dsl.io._
import org.http4s.implicits._
import org.http4s.server.blaze._
import org.http4s.circe._

object TextAnalyzer extends IOApp {

  case class TextRequest(text: String)
  case class TextResponse(containsRust: Boolean)

  implicit val textRequestDecoder: EntityDecoder[IO, TextRequest] = jsonOf[IO, TextRequest]
  implicit val textResponseEncoder: EntityEncoder[IO, TextResponse] = jsonEncoderOf[IO, TextResponse]

  val textAnalyzerService = HttpRoutes.of[IO] {
    case req @ POST -> Root / "analyze" =>
      for {
        textRequest <- req.as[TextRequest]
        containsRust = textRequest.text.toLowerCase.contains("rust")
        response <- Ok(TextResponse(containsRust).asJson)
      } yield response
  }.orNotFound

  override def run(args: List[String]): IO[ExitCode] =
    BlazeServerBuilder[IO]
      .bindHttp(8080, "localhost")
      .withHttpApp(textAnalyzerService)
      .resource
      .use(_ => IO.never)
      .as(ExitCode.Success)
}
```

What distinguishes ChatGPT from the other bots is that it explicitly defines the entity encoders and decoders every single time, whereas the two previous bots usually relied on automatic derivation.

Cody

Next, it’s Cody’s turn. None of the 3 generated programs worked out of the box. Fortunately, after adding the same import to all the snippets, two start working. And the other one is only missing one thing: the same conversion from `HttpRoutes` to `HttpApp` as with Bard. Besides Cody’s hatred for importing `BlazeServerBuilder`, it’s also interesting that Cody is the only chatbot that generated solutions with different HTTP methods (GET/POST).
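
Why that conversion is needed can be shown with a simplified model. The types below are assumptions that ignore effects and the real request and response types (none of them are http4s definitions). Routes may decline a request, an app must answer every request, and an `orNotFound`-style function closes the gap with a 404 fallback.

```scala
// Simplified, hypothetical stand-ins for the http4s types.
final case class Request(path: String)
final case class Response(status: Int, body: String)

type Routes = Request => Option[Response] // may not match a request
type App    = Request => Response         // must always respond

// Turn partial routes into a total app by falling back to 404.
def orNotFound(routes: Routes): App =
  request => routes(request).getOrElse(Response(404, "Not found"))

val routes: Routes = {
  case Request("/check") => Some(Response(200, "ok"))
  case _                 => None
}

val app: App = orNotFound(routes)
```

A server builder that expects an `App` cannot accept `Routes` directly, which mirrors the type mismatch between `HttpRoutes` and `HttpApp`.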

The best solution generated by Cody looks like this:

```scala
import cats.effect._
import org.http4s._
import org.http4s.circe._
import org.http4s.dsl.io._
import io.circe.generic.auto._

object Main extends IOApp {
  def checkForRust(text: String): Boolean =
    text.toLowerCase.contains("rust")

  val api = HttpRoutes.of[IO] {
    case req @ POST -> Root / "check" =>
      for {
        text <- req.as[String]
        resp <- if (checkForRust(text)) Ok("Text contains rust")
                else Ok("Text does not contain rust")
      } yield resp
  }
  val server = BlazeServerBuilder[IO]
    .bindHttp(8080)
    .withHttpApp(api.orNotFound)
    .resource

  override def run(args: List[String]): IO[ExitCode] =
    server.use(_ => IO.never).as(ExitCode.Success)
}
```

This solution is probably the one I would be least embarrassed about. The logic is extracted to a separate function, and the server is defined as a resource, which is useful when combining multiple services.

CatGPT

Finally, it’s time for CatGPT. I have high hopes for this one, as CatGPT should be a cat expert. CatGPT’s answers are the most consistent of them all. Each contains a cat gif and a sequence of meows.

Summary

Using some subjective terms, here is what the experiment results look like.



| Chatbot | working OOTB | minimal effort | more effort | cat gifs |
|---------|--------------|------------------|-------------|----------|
| Bard | 0 | 1 | 2 | 0 |
| Bing | 2 | 1 | 0 | 0 |
| ChatGPT | 1 | 2 | 0 | 0 |
| Cody | 0 | 3 | 0 | 0 |
| CatGPT | 0 | 0 | 0 | 3 |


Conclusions

The chatbots are great at generating code that looks like it could work but often doesn’t. And the problems you are left with can often be harder to solve than just opening a good tutorial. Even though the task was quite simple and essentially only required copying a snippet from a basic tutorial, none of the chatbots achieved a perfect score.

Also, because of the knowledge cut-off, they are limited to older library versions. And even though there have been a few breaking changes in http4s, that isn’t the case for all libraries. Similarly, no Scala 3 features are used in the code, even though the prompt explicitly asked for Scala 3. And so, it will be highly beneficial for library maintainers to keep the API as stable as possible.
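
For comparison, here is a rough idea of what Scala 3 syntax could have looked like for the same logic. It is a dependency-free sketch, and `TextRequest`, `TextResponse`, `Render`, `analyze`, and `renderAll` are illustrative names rather than anything the chatbots produced.

```scala
final case class TextRequest(text: String)
final case class TextResponse(containsRust: Boolean)

// Extension method instead of an implicit class.
extension (request: TextRequest)
  def analyze: TextResponse =
    TextResponse(request.text.toLowerCase.contains("rust"))

// A tiny type class with a `given` instance instead of an `implicit val`.
trait Render[A]:
  def render(value: A): String

given Render[TextResponse] = response =>
  s"""{"containsRust":${response.containsRust}}"""

// `using` clause instead of an implicit parameter list.
def renderAll[A](values: List[A])(using r: Render[A]): List[String] =
  values.map(r.render)
```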
