Memory Leak — #17

Astasia Myers
Published in Memory Leak
4 min read · Feb 17

VC Astasia Myers’ perspectives on machine learning, cloud infrastructure, developer tools, open source, and security. Sign up here.

🚀 Products

Coda AI: A Sneak Peek

Coda started a waitlist for the alpha version of Coda AI, which summarizes meeting notes and transcripts in a snap using GPT-3. It's stackable with Coda's other building blocks like tables, controls, text, and formulas.

Why does this matter? Incumbents are quickly adopting foundation models to enhance existing products. We believe that there will also be a wave of generative-AI-native SaaS companies that will win. SaaS companies that don’t adopt foundation models will not have the same fatality rate as on-premise software companies that didn’t move to SaaS.

Databricks Announces Multi-Cloud Support for Security Analysis Tool (SAT)

Last November, Databricks announced the availability of the Security Analysis Tool (SAT) for AWS. Recently they announced that SAT is also available for Databricks customers on Azure and GCP. SAT helps customers harden their Databricks environments by reviewing current deployments against Databricks' security best practices. It uses a checklist that prioritizes observed deviations by severity and provides links to resources that help resolve outstanding issues. SAT can be run as a routine scan across all workspaces in an environment to help establish continuous adherence to best practices, and health reports can be scheduled to provide continual confidence in the security of all data, including sensitive datasets.

Why does this matter? A handful of data security posture management companies have sprung up over the past year. Data security posture is particularly important when sensitive data is involved. Many infrastructure companies, like Cisco and VMware, evolve to have security product lines over time. It will be interesting to see how Databricks broadens its security solutions.

Source: https://www.databricks.com/blog/2023/02/03/announcing-multi-cloud-support-security-analysis-tool-sat.html

GitHub Copilot CLI

Ever have trouble remembering shell commands and flags for this or that? Ever wish you could just say what you want the shell to do? Don’t worry: GitHub is building GitHub Copilot assistance right into your terminal.

Why does this matter? In January 2023, Microsoft Chief Executive Satya Nadella said that more than 1 million people had used Copilot to date. We’ve discussed the rise of terminal technologies in the past here. It’s smart and unsurprising that GitHub wants to target the 60 million developers who use the terminal with Copilot, which has been wildly successful.

📰 Content

Big Data Is Dead

Jordan Tigani, co-founder and CEO of MotherDuck, makes the case that the era of Big Data is over. It had a good run, but now we can stop worrying about data size and focus on how we’re going to use it to make better decisions. A couple of years ago he did an analysis of BigQuery queries, looking at customers spending more than $1,000 per year. 90% of queries processed less than 100 MB of data.

Why does this matter? MotherDuck helps teams use DuckDB, an in-process SQL OLAP database management system. The thesis is that most companies don’t need to scale out data processing instances because they actually aren’t processing a lot of data. It is an alternative perspective to Spark. DuckDB has become a very popular open source project, and according to PyPI stats it was downloaded ~900K times last month.

Source: https://motherduck.com/blog/big-data-is-dead/

LangChain + Chroma

LangChain announced an integration with Chroma, a vector store and embeddings database designed from the ground up to make it easy to build AI applications with embeddings.

Why does this matter? ML practitioners use OpenAI’s Embedding API to generate language embeddings, and then index those embeddings in vector databases for fast and scalable vector search. This is a powerful and common combination for building semantic search, question-answering, threat-detection, and other applications that rely on NLP and search over a large corpus of text data. LangChain provides foundation model orchestration to enable these pipelines, and Chroma is a new vector database, so the integration makes a ton of sense.
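For readers who want to see the embed, index, and search pattern in miniature, here is a rough sketch in plain Python. It is not the OpenAI, LangChain, or Chroma API: the `embed` function is a toy bag-of-words stand-in for a real embedding model, and `ToyVectorStore` is a hypothetical in-memory class invented for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory index; real vector databases use approximate search."""

    def __init__(self):
        self._items = []  # list of (document, embedding) pairs

    def add(self, text):
        self._items.append((text, embed(text)))

    def query(self, text, k=1):
        # Embed the query, then rank stored documents by similarity to it.
        q = embed(text)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


store = ToyVectorStore()
store.add("DuckDB is an in-process SQL OLAP database")
store.add("Kubernetes schedules containers across a cluster")
store.add("Chroma is a vector store for embeddings")

top = store.query("vector store for embeddings", k=1)
print(top)
```

A production pipeline would swap the toy pieces for a learned embedding model and a real vector database, but the flow of embedding documents, storing the vectors, and ranking by similarity at query time stays the same.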

Why Kubernetes Has Emerged as the ‘OS’ of the Cloud

Seventy-one percent of all organizations run databases and caches in Kubernetes, representing a 48% year-on-year increase. Together with messaging systems (36% growth), organizations were increasingly using databases and caches to persist application workload states.

Why does this matter? Kubernetes continues to grow in popularity, although sentiment toward it is mixed, according to our Twitter survey of 306 people. One cache solution that can run with Kubernetes is Dragonfly.

💼 Jobs

⭐️ Claypot — Founding Engineer (Infra)

⭐️ Grit — Design Engineer

⭐️ Speakeasy — Founding UX Lead

Astasia Myers

Enterprise Partner @ Quiet Capital, previously Investor @ Redpoint Ventures and Cisco Investments