Explorations in AI Tooling: Going Deeper

Marissa Biesecker
Nov 16, 2023

But the social order is a sacred right which is the basis of all other rights. Nevertheless, this right does not come from nature, and must therefore be founded on conventions. —Jean-Jacques Rousseau

Feeling a little lost in AI Wonderland

Previously, I wrote about my initial explorations of AI tooling built for programmers and how I was struggling to trust these tools. Before I could trust one enough to use it, I needed to do more research, which I thought might be helpful to share with others who think along similar lines. The following is the outline of my research, with a little summary at the end.

GitHub’s Copilot:

  • Parent company/major investors: Microsoft
    - utilizes OpenAI’s Codex* model (which is in private beta)
    * which is a descendant of GPT-3
  • IDE Integrations: Visual Studio Code, Visual Studio, Vim, Neovim, the JetBrains suite of IDEs, and Azure Data Studio
  • Resources to check out:
    - Copilot Explorer: A brilliant deep dive and reverse engineering effort by Parth.
    - Quickstart Guide: the official docs.
  • Pricing:
    - Trial: Free for 30 days.
    - Individuals: USD $10/month or $100 per year. Free for verified students, teachers, and maintainers of popular open source projects.
    - Business: USD $19/user/month
  • Privacy Policies
    - TL;DR: User events and code snippets, including prompts* and suggestions, are encrypted and transmitted to GitHub (unless disallowed) and to the Codex model, both of which have very strict access control.
    * again, check out Parth’s deep dive into the explorer to understand what prompts are and how they are used!
  • Pros
    - An existing company you already likely trust and use.
    - A very popular choice with a lot of resources means there are great docs and lots of help content already out there to understand and get started quickly.
    - Supports my code editor of choice (VSCode) and Vim, which my coding pairs like to use.
    - The file types that Copilot can have access to can be configured, creating more trust that secret files and their contents remain so.
    - Includes chat window and code completion.
    - Devs can opt out of snippets being sent to GitHub and used for product improvements.
  • Cons
    - Supporting a large, established tech company, at the expense of supporting a smaller competitor.
    - The financial cost could be a barrier, depending on your circumstances, and there are other, free options available.
    - It is a bit limited in code editor support, and may not support yours.
    - There is an ongoing copyright infringement lawsuit alleging that training on open source code violated its licenses.
    - Devs need to opt out of snippets being sent to GitHub and used for product improvements, so code privacy is not automatic.
  • Initial Opinion: It’s GitHub, and as said before, I’m already trusting it to an extent. The extensive documentation, and my better understanding of how it works thanks to others also using and writing about it, make me want to try the free trial and see how it goes, but I have more to think about!

Tabnine

  • Parent company/major investors: Qualcomm Ventures, OurCrowd, Samsung NEXT, Atlassian Ventures, and Telstra Ventures
  • IDE Integrations: VSCode, IntelliJ, WebStorm, PyCharm, GoLand, Eclipse, Sublime, RubyMine, CLion, Neovim, PhpStorm, Android Studio, AppCode, Rider, Visual Studio
  • Pricing:
    - Basic: Free for 1 user. Short code completions (2 to 3 words).
    - Pro: USD $12/month. For pros and small teams. Whole-line and full-function code completions and natural-language-to-code completions. Free 14-day trial.
    - Enterprise
  • Privacy Policies: Trust Center
    - They publish all security related information here, including overviews, policies, and audits.
    - Code Privacy
  • Resources to check out:
    - A developer’s hands-on review
  • Pros
    - They claim to be the first to bring a coding assistant to market and that they are the largest independent company focused on AI for software development.
    - They are very transparent about training their models on permissively licensed open source code.
    - Includes chat window (in beta) and code completion.
    - Code privacy is automatic: code is never stored or shared.
    - They are very security oriented with many certificates and audits as proof.
    - On the Enterprise level, private models that can be trained on private code are available.
  • Cons
    - The free version is very limited and doesn’t feel very useful, and the paid tier is a bit more expensive than other options.
    - This product and true power seems to be focused on teams, not individuals, which doesn’t feel as ideal for me as an individual.
    - Compared to the docs for GitHub Copilot, these docs offer much less information and guidance. That might not really be necessary, but it doesn’t give me as much confidence in getting started.
    - There is also not as much community driven content available as for others.
    - No Vim support. This is not a true dealbreaker for me, but it would be nice to have it available to accommodate pairing sessions.
  • Initial Opinion: I was initially excited about this option. The privacy descriptions and company independence made me more at ease in giving my trust. But the deeper I’ve dived in, the less sure I am, and with a slightly higher price point, the trial had better prove it easier to use or better at suggestions than the competitors.

Codeium

  • Parent company/major investors: Exafunction, Green Oaks, Founders Fund
  • IDE Integrations: VSCode, JetBrains, Neovim/Vim, Jupyter, Visual Studio, Emacs, Sublime, Xcode, and more
  • Pricing:
    - Individual: Free
    - Teams: $15/user/month, or $12/user/month with a yearly subscription
    - Enterprise: talk to a sales rep
  • Privacy Policies: Security
    - TL;DR: Sounds like it functions similarly to Copilot. All data is encrypted. Telemetry data is collected, but never shared, and opting out of code snippet collection is an option. They don’t train on private data.
  • Resources to check out:
    - They did a product comparison to most of the other competitors I’m also looking into in this article.
  • Pros
    - A lot of IDE integrations means pairing will be easy with everyone.
    - Has a browser demo playground to trial out, so I can gauge how much I might like it or find it useful without downloading anything or sending any information.
    - Free for individual users, forever, which makes it feel like an easy choice because the cost should never be a concern. And they are very open, transparent and convincing about how they are able to provide their service for free.
    - Removed GPL licensed code from their training data, meaning less to worry about (unlike Copilot, which they have a lot to say about in this post).
    - I like that their comparisons with competitors include latency and quality, something that I couldn’t really include in this analysis without downloading and trying them out myself.
  • Cons
    - I’m honestly struggling to come up with a con here. Although, I suppose the most obvious is trusting what they are saying, especially about how it will continue to be free for individuals. I would also love to verify the opt-out code snippet sending like Parth was able to do in his explorations of Copilot, but that would be quite a lot of work, and it’s all a bit of an exercise in trust, as we’ve established anyway.
  • Initial Opinion: I think this might be the winner that I ultimately download and use. I was excited to hear about this from the AI Podcast, and after looking into it more and struggling to find cons, I’m excited to try this out and feel reassured I’m moving forward in a way that is reasonably responsible.

Replit AI (Ghostwriter Chat)

  • Parent company/major investors: Andreessen Horowitz’s Growth Fund led their latest round of funding. Others included Khosla Ventures, Coatue, SV Angel, Y Combinator, Bloomberg Beta, Naval Ravikant, ARK Ventures, and Hamilton Helmer. Over 2,500 Replit community members also participated in crowdfunding in 2022. They are also partnered with Google Cloud.
  • IDE Integrations: Browser
  • Pricing: (same as normal Replit use)
    - Free: (with account)
    - Hacker: $7/month, $74/year
    - Pro: $20/month, $220/year. Includes advanced AI features such as a more advanced model and unlimited messages
  • Privacy Policies:
    - Public Repls may be used for training. Private (which means having a paid account) is the only way to ensure code won’t be used for training.
  • Resources to check out:
    - Docs
    - A version of their code generation model is available to explore and use on Hugging Face.
  • Pros
    - Most developers, myself included, are already familiar with Replit and have used it (and you can turn the AI off as well).
    - Integrated in a popular browser IDE (Replit), so you don’t need to download an extension.
    - Includes chat window, real time debugger, and test-run code; features which seem to be further along than the competitors and a much more integrated and immersive experience.
    - Replit has always been mobile minded, and it seems like they’ve taken great care in thinking how AI will and could be used on smaller devices and on the go as well.
  • Cons
    - Trained on publicly available code, which, as with GitHub, means code from users of its platform in its public repos (and likely also data from its bounty service and software hosting).
    - The AI is exclusive to Replit, so if you want to work locally, or in a different IDE, it just won’t be an option.
  • Initial Opinion: While I admire Replit and think that they will have quite an advantage with all their data as a complete developer pipeline, I’m not currently a big Replit user, and don’t know that I want to move my workflow to this product. I will probably still try some experimentation though, especially to compare deeper. I’m also excited that they’ve released their model as open source, but I’m uneasy that details about how the model was trained don’t seem to be easy to find.

Amazon CodeWhisperer

  • Parent company/major investors: Amazon
  • IDE Integrations: VS Code, IntelliJ IDEA, AWS Cloud9, AWS Lambda console, JupyterLab and Amazon SageMaker Studio
  • Pricing:
    - Individual: free with an account and AWS Builder ID. Includes code suggestions, reference tracking, and security scans.
    - Professional: $19/user/month. For organizations.
  • Privacy Policies:
    - CodeWhisperer uses content, such as code snippets, comments, cursor location, and contents from files open in the IDE, as inputs to provide code suggestions. Data is encrypted.
  • Resources to check out:
    - Docs
  • Pros
    - Available as an inclusion with other AWS services including AWS Cloud9, the AWS Lambda console, and Amazon SageMaker Studio.
    - Unlimited code suggestions.
    - Nice features including customization, security scans, flagging for open source training data.
    - Trained on Amazon and publicly available code (which could be a con as well, and is somewhat mitigated by the point above).
    - Good resources and docs.
  • Cons
    - It’s another Amazon product, giving more money to one of the world’s biggest companies and richest men.
    - Great if you’re already using or want to use AWS products, but could feel like just another pushy sales tactic, as it is trained to provide suggestions for AWS APIs.
  • Initial Opinion: I’ll say it again. It’s another Amazon product. I’m sure it will be given all the resources to make it a mighty competitor. And just as Apple made a cult-like, cozy environment, if you use other Amazon services, I’m sure this will give you the same warm fuzzies that Apple gives to their users. It’s not necessarily a bad thing, but it’s not what I’m valuing and looking for personally. I won’t say that I won’t give it a try though.

And a drumroll for the results…

Top Picks:

  • Best IDE integration options: Codeium
  • Low trust: Replit, Codeium
  • Best for Teams: Tabnine, CodeWhisperer
  • Small devices and on the go: Replit
  • Price: Codeium
  • My choice: Codeium

I ultimately chose to go with Codeium. They won my trust because they are a smaller company and they are taking concerns over training and open source licensing seriously, both by talking about it and by taking action to remove GPL licensed work from their training data. These are two things that I value and am looking for in the product that I will use and support. I also really liked the docs and articles they had available. While they might be smaller and not have as much community as many of the others, they are making information easily available and transparent. And then of course, there is the price. Not only will it be free for me as an individual user, but they’ve set up an ambitious business model to keep it so, which, if I like it and it works for me, will work in their favor, as I’ll be more likely to recommend it to future clients and projects. And sure, the pricing might change, as changing business climates necessitate, but given their level of transparency thus far, I’m more willing to take that risk. Now it’s time to take the next plunge, download it and start using it.

With use in mind, I’d like to call out a few things. Remember, as developers, we are ultimately responsible for the security and quality of our code. While all of these tools are great for getting us going, coding faster, and learning, they are just that: tools. With great power comes great responsibility! So, name things well and pay attention: these tools don’t truly understand the intent or the outcomes and consequences, even if their suggestions are presented well and confidently. If you wouldn’t accept an enthusiastic junior’s suggestion without some critical thinking, don’t just hit Tab to accept the AI’s completion without doing the same! Stay on guard and let’s see what we can build faster but still responsibly.

PS: As I really start to get going and experiment more with these models and tools, I’ll probably write more, either adding to this article or writing new ones. Feel free to keep an eye out for those. Happy coding, y’all.
