Cursor: The Team and Vision Behind the AI Coding Tool
After studying the stories, functional design, technical decisions, and thinking behind the Cursor team, what struck me once again was not the team’s youthful ambition, nor some grand blueprint for AI Coding, but an intuitive simplicity and purity.
🧑💻 About Cursor
Cursor is an AI-powered code editor designed to improve developer productivity. Released in January 2023, it is a fork of VS Code, which allows for more flexible AI integration. As of August 2024, it has over 40,000 customers, including startups, research labs, and enterprises.
- Cursor is developed by the Anysphere team. The co-founders, Michael Truell, Sualeh Asif, Arvid Lunnemark, and Aman Sanger, all graduated from MIT in 2022.
- On August 22, 2024, Cursor announced on its official blog that it had raised $60 million in Series A funding. The round was backed by Andreessen Horowitz (a16z), Thrive Capital, OpenAI, Jeff Dean (Google’s Chief Scientist), Noam Brown, and the founders of Stripe, GitHub, Ramp, and Perplexity. The company is rumored to be valued at $400 million, with annual recurring revenue (ARR) of over $10 million. The investor list is a strong lineup: top-tier VCs like a16z, plus Jeff Dean, a legend among engineers. On the other hand, you can also spot potential “competitors” such as GitHub and OpenAI among them, at once rivals and friends.
🤩 Product Features
Cursor’s features and technical optimizations are all designed around two goals: a “fast” and “accurate” product experience. This is a very intuitive, essential need, and the team’s continued deep digging and refinement of it is what made me feel, once again, a kind of simple purity and focus.
Efficient caching technology:
- KV Caching: A pre-populated cache reduces the number of tokens that need to be computed on each keystroke, significantly reducing response time.
KV Caching in Cursor
- KV caching in Transformers: In Transformers, the key-value (KV) cache stores the keys and values of previously processed tokens so that each new token can attend to them without recomputation. Without it, every token would need to be forward-propagated through the model again, which involves a lot of matrix multiplication and slows down processing.
- Advantages of the KV cache: Cursor stores the previously computed keys and values on the GPU, avoiding the need to re-run the entire model on each keystroke. Reusing the KV cache reduces latency, computational cost, and GPU load.
- Prompt design with the KV cache in mind: Prompts sent to the model are designed with caching in mind to maximize its effectiveness.
- Cache warm-up: Cursor proactively prepares the KV cache by pre-populating it with likely context (e.g., the current file contents), reducing the number of tokens that must be processed once the user finishes typing and resulting in faster response times.
- Advanced Caching Heuristics: Cursor combines speculative and caching techniques to anticipate user actions (e.g., accepting suggestions) and pre-cache the results to create a seamless, fast user experience.
- Speculative Edits: Cursor uses a technique called “Speculative Edits” to anticipate the user’s next action and pre-compute the corresponding code changes. These precomputed changes are then cached and applied instantly when the user’s action matches the prediction. This is a form of speculative caching that preserves predicted and generated edits for future use.
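The saving that KV caching provides can be sketched in a few lines. This is a toy simulation, not Cursor’s implementation: the `ToyAttention` class and its projection counter are illustrative stand-ins for the expensive key/value matrix multiplications a real Transformer performs.

```python
# A minimal sketch of KV caching. All names here are illustrative.

class ToyAttention:
    """Counts per-token key/value projections to show what a KV cache saves."""

    def __init__(self):
        self.kv_cache = []      # cached (key, value) pairs, one per past token
        self.projections = 0    # how many K/V projections were actually computed

    def _project(self, token):
        # Stand-in for the expensive K/V matrix multiplications.
        self.projections += 1
        return (f"K({token})", f"V({token})")

    def step_without_cache(self, tokens):
        # Naive decoding: re-project K/V for every token on every step.
        return [self._project(t) for t in tokens]

    def step_with_cache(self, new_token):
        # Cached decoding: only the newest token needs a K/V projection;
        # earlier tokens are reused from the cache.
        self.kv_cache.append(self._project(new_token))
        return self.kv_cache


tokens = list("def f():")  # pretend each character is a token (8 tokens)

naive = ToyAttention()
for i in range(1, len(tokens) + 1):
    naive.step_without_cache(tokens[:i])   # O(n^2) projections overall

cached = ToyAttention()
for t in tokens:
    cached.step_with_cache(t)              # O(n) projections overall

print(naive.projections, cached.projections)  # 36 vs 8
```

The quadratic-versus-linear gap is exactly why reusing cached keys and values on every keystroke matters so much for latency.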
Speculative Editing in Cursor
Speculative editing in Cursor draws on the principles of speculative decoding, which traditionally speeds up language model generation by processing multiple tokens at once rather than one by one. Cursor’s speculative editing, however, is optimized for code, since the existing code serves as a natural prior guiding the editing process:
- Chunking: Cursor breaks the original code into chunks and feeds these chunks to the model.
- Predictive Reproduction: The model typically reproduces the input code so that the chunks can be processed in parallel more quickly.
- Deviation Detection: The model continues speculating until it predicts a change to the original code, at which point it generates new tokens that differ from the original.
- Continuous Prediction: After a run of deviating tokens, Cursor returns to predicting against the original code block.
This technique not only speeds up code editing, but also allows the user to preview changes during generation, eliminating loading screens and creating a seamless editing experience.
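The chunk, reproduce, deviate, resync loop above can be sketched under simplifying assumptions: the “model” here is a toy function that returns the next token of a hypothetical edited target, and `verify_chunk` stands in for one parallel forward pass over a draft chunk. None of these names are Cursor’s actual code.

```python
# A sketch of speculative edits: the original code is treated as a draft
# and verified chunk by chunk; generation only slows down at deviations.

def verify_chunk(model, prefix, draft_chunk):
    """Stand-in for one parallel forward pass: report the first position
    where the model would diverge from the draft (None = all match)."""
    for pos, tok in enumerate(draft_chunk):
        predicted = model(prefix + draft_chunk[:pos])
        if predicted != tok:
            return pos, predicted
    return None, None

def speculative_edit(model, original, chunk_size=4):
    output = []
    i = 0
    while i < len(original):
        chunk = original[i:i + chunk_size]
        pos, correction = verify_chunk(model, output, chunk)
        if pos is None:
            output.extend(chunk)          # whole chunk accepted in one pass
            i += chunk_size
        else:
            output.extend(chunk[:pos])    # accept the matching prefix
            output.append(correction)     # insert the model's new token
            i += pos + 1                  # skip the replaced original token
    return output


original = ["def", "f", "(", "x", ")", ":"]
target = ["def", "g", "(", "x", ")", ":"]   # hypothetical desired edit

def toy_model(prefix):
    # Pretends the model wants to rename f -> g and keep everything else.
    return target[len(prefix)] if len(prefix) < len(target) else None

result = speculative_edit(toy_model, original)
print(result)  # ['def', 'g', '(', 'x', ')', ':']
```

Because most positions simply reproduce the draft, whole chunks are accepted in one verification pass, which is where the speedup and the "no loading screen" preview come from.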
- Cache Warming: Cursor reduces latency by “warming up” the cache: pre-populating it with content relevant to the current context, such as the contents of the current file, ensures that the information needed for inference is ready even before the user finishes typing.
Cache warming in Cursor
Cache warming anticipates user demand and reduces AI response latency by loading relevant context into the KV (key-value) cache ahead of time. This is especially important in Transformer-based models, where the KV cache enables faster token processing.
- Context Recognition: As the user types, Cursor predicts the context (e.g., the current file contents) that may be required for the next action.
- KV Cache Preloading: Preloading the KV cache with this context makes the required information immediately available when the user executes a command.
By preloading the cache, Cursor reduces the time to first token (TTFT), making AI-assisted responses feel more immediate.
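A minimal sketch of the warming idea, assuming a `prefill()` that charges one unit of work per token not already covered by the cached prefix. The class and token lists are illustrative, not Cursor’s API:

```python
# Cache warming sketch: prefill the file context in the background so the
# actual request only pays for the newly typed tokens.

class WarmableCache:
    def __init__(self):
        self.cached_prefix = []

    def prefill(self, tokens):
        """Charge work only for tokens not already in the cached prefix."""
        common = 0
        while (common < len(self.cached_prefix) and common < len(tokens)
               and self.cached_prefix[common] == tokens[common]):
            common += 1
        work = len(tokens) - common      # tokens that still need processing
        self.cached_prefix = list(tokens)
        return work


file_tokens = ["import", "os", "\n", "def", "main", "(", ")", ":"]

# Cold: the full file is processed at request time (high TTFT).
cold_work = WarmableCache().prefill(file_tokens + ["print"])

# Warm: the file is prefilled when it is opened; the request adds one token.
cache = WarmableCache()
cache.prefill(file_tokens)                     # done in the background
warm_work = cache.prefill(file_tokens + ["print"])

print(cold_work, warm_work)  # 9 vs 1
```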
- Caching of Embeddings: To optimize code retrieval, Cursor computes and caches embeddings for code snippets, avoiding recomputing embeddings when searching for related snippets and resulting in faster response times.
Embedding Cache in Cursor
The embedding cache in Cursor is designed to enable efficient code retrieval by storing a representation (embedding) of the code base instead of storing the original code directly.
- Block-based embedding: Cursor divides the code base into manageable blocks and computes embeddings for each block.
- Database storage: Only the embeddings are stored, not the code itself, representing the codebase in a lightweight and secure way.
- Fast Retrieval: When a user initiates a code search, Cursor can quickly locate the relevant code snippet by matching the query with the cached embeddings.
This technique is especially beneficial when dealing with large codebases, as it avoids re-parsing the entire codebase for each query, making queries faster and resource usage more efficient.
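The chunk, embed, cache, search pipeline can be sketched as follows. The `embed()` function here is a toy bag-of-words vector standing in for a real embedding model, and the cache is keyed by a hash of each chunk so unchanged chunks are never re-embedded. All names are illustrative.

```python
# Embedding cache sketch: embed each code chunk once, then answer searches
# by cosine similarity against the cached vectors.

import hashlib
import math

EMBED_CALLS = 0

def embed(text):
    """Toy bag-of-words 'embedding'; counts calls to mimic model cost."""
    global EMBED_CALLS
    EMBED_CALLS += 1
    vec = {}
    for word in text.split():
        vec[word] = vec.get(word, 0) + 1.0
    return vec

def cosine(a, b):
    dot = sum(a.get(k, 0.0) * v for k, v in b.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class EmbeddingCache:
    def __init__(self):
        self.store = {}                  # chunk hash -> (chunk, embedding)

    def index(self, chunks):
        for chunk in chunks:
            key = hashlib.sha256(chunk.encode()).hexdigest()
            if key not in self.store:    # only embed unseen chunks
                self.store[key] = (chunk, embed(chunk))

    def search(self, query):
        qvec = embed(query)
        return max(self.store.values(), key=lambda cv: cosine(qvec, cv[1]))[0]


chunks = ["def parse resume from path", "def render pdf to file"]
cache = EmbeddingCache()
cache.index(chunks)
cache.index(chunks)                      # re-indexing is free: all cache hits
best = cache.search("parse a resume file")
print(best, EMBED_CALLS)                 # best chunk found with 3 embed calls
```

Note that re-indexing the same chunks costs nothing, which is the whole point: only the query and genuinely new or changed chunks ever hit the embedding model.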
Optimized Attention Mechanism:
Reduced KV cache size:
- Multi-Query Attention (MQA) and Grouped-Query Attention (GQA): the source mentions these as more efficient attention schemes than traditional multi-head attention. They reduce the number of key-value (KV) heads, effectively shrinking the KV cache with little or no performance impact.
- Multi-head Latent Attention (MLA): this approach compresses the keys and values from all attention heads into a single latent vector. Although more complex, MLA aims to reduce the KV cache size even further.
Reducing the KV cache size allows Cursor to achieve the following effects:
- Greater context capacity: a smaller KV cache frees up memory, allowing the model to handle larger prompts and process more code at once.
- Increased Cache Hit Ratio: The freed memory can be used to store more information, resulting in an increased cache hit ratio and further speedup.
- Enhanced speculation: Since the KV cache takes up less space, Cursor can precompute and cache more aggressively, speculating on more possible next actions and improving the responsiveness of the user experience.
Optimized memory bandwidth:
- Faster token generation: The source points out that at large batch sizes and with long context windows, the speed bottleneck shifts from parallel matrix multiplications to reading keys and values from the cache, i.e., a memory-bandwidth constraint.
- Advantage of reducing KV cache size: By shrinking the KV cache with techniques such as MQA and MLA, Cursor reduces the amount of data read from memory, which speeds up token generation during inference.
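Back-of-the-envelope arithmetic shows why the number of KV heads dominates cache size. The model shape below (32 layers, 32 query heads, head dimension 128, fp16) is an assumed Llama-2-7B-like configuration for illustration, not Cursor’s actual model:

```python
# KV cache size: 2 (keys + values) x layers x sequence length x KV heads
# x head dimension x bytes per element. Only n_kv_heads differs between
# MHA, GQA, and MQA.

def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=32, head_dim=128,
                   dtype_bytes=2):
    return 2 * n_layers * seq_len * n_kv_heads * head_dim * dtype_bytes

seq = 8192
mha = kv_cache_bytes(seq, n_kv_heads=32)  # multi-head: one KV head per query head
gqa = kv_cache_bytes(seq, n_kv_heads=8)   # grouped-query: 4 query heads share a KV head
mqa = kv_cache_bytes(seq, n_kv_heads=1)   # multi-query: a single shared KV head

gb = 1024 ** 3
print(f"MHA {mha / gb:.1f} GiB, GQA {gqa / gb:.2f} GiB, MQA {mqa / gb:.3f} GiB")
# MHA 4.0 GiB, GQA 1.00 GiB, MQA 0.125 GiB
```

At an 8K context this assumed shape needs 4 GiB of KV cache per sequence under plain multi-head attention; GQA with 8 KV heads cuts that to a quarter, and MQA to 1/32, which is exactly the memory-bandwidth saving the bullets above describe.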
Shadow Workspace
- Shadow Workspace is an innovative feature of Cursor that allows AI agents to work in the background, separate from the user’s main workspace. This separation allows AI to iterate on code, get feedback from lint tools, etc., and possibly even execute code without interrupting the user’s work.
How Shadow Workspace works:
- Implementation: Cursor spawns a hidden window that acts as the Shadow Workspace. This window is a separate instance of Cursor, but it uses the same underlying files as the user’s main workspace.
- Agent Activity: In the Shadow Workspace, AI agents are free to modify the code, but do not save these changes to disk, thus protecting the integrity of the user’s main workspace.
- Feedback Integration: The Shadow Workspace gives AI agents a mechanism for obtaining feedback from programming tools such as linters, which can identify potential bugs or style violations and help the AI improve the quality of generated and edited code.
- Possibilities for code execution: The sources indicate plans to eventually let the AI run code in the Shadow Workspace for more comprehensive testing and validation.
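The separation described above can be sketched as an in-memory overlay over the files on disk. `ShadowWorkspace` and its toy linter are illustrative names and logic, not Cursor’s implementation:

```python
# Shadow workspace sketch: the agent edits and lints an in-memory overlay,
# while the user's files on disk are never written.

class ShadowWorkspace:
    def __init__(self, disk_files):
        self.disk = disk_files       # the user's real files (read-only here)
        self.overlay = {}            # the agent's unsaved edits

    def read(self, path):
        # Overlay edits shadow the on-disk version, like copy-on-write.
        return self.overlay.get(path, self.disk[path])

    def agent_edit(self, path, new_text):
        self.overlay[path] = new_text    # never touches self.disk

    def lint(self, path):
        # Stand-in linter: flag lines longer than 30 characters.
        return [i for i, line in enumerate(self.read(path).splitlines())
                if len(line) > 30]


disk = {"main.py": "x = compute_a_fairly_long_expression()\n"}
ws = ShadowWorkspace(disk)

print(ws.lint("main.py"))            # the agent sees a lint warning: [0]
ws.agent_edit("main.py", "x = compute()\n")
print(ws.lint("main.py"))            # the edit fixes it in the overlay: []
print(disk["main.py"])               # the user's file on disk is unchanged
```

The key property is the last line: the agent can iterate against lint feedback as much as it likes without the user’s working copy ever changing.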
Future potential of Shadow Workspace
The Cursor team pointed out the promising future of Shadow Workspace, hinting at possible future features:
- Long-term prediction: AI agents in the background can analyze a user’s code and predict their intent over a longer time horizon, providing smarter code suggestions and assistance.
- Proactive code generation: AI can generate code in the background based on the user’s current task or project, providing a starting point or completing repetitive tasks.
- Agents-based code review: AI agents can enhance the code review process by analyzing code changes in pull requests, identifying potential errors and suggesting improvements.
🚀 Future Directions for Cursor
- Remote sandbox environments: develop remote execution systems that can reproduce the user’s environment for long-duration tasks
- Improving AI debugging capabilities: start with simple error detection and gradually expand to more complex errors
- Exploring homomorphic encryption: for language model inference with enhanced privacy protection
- Optimizing context processing: explore infinite context windows and more efficient caching methods
🤔 Cursor Team’s view on the future of programming
The Cursor team are clearly techno-optimists: AI will not replace “innovators”, but will instead enable more people to become them.
The Cursor Team envisions a future of software development in which human programmers are always in the driver’s seat, working in tandem with increasingly powerful AI systems. This collaborative approach emphasizes programmer speed, initiative, and control.
Core concepts
- Speed: Automating repetitive tasks with the assistance of AI allows programmers to focus on higher-level creative decisions.
- Initiative: Ensures programmers retain power over key design choices and have direct influence over the software development process.
- Control: Allows programmers to switch freely between different levels of abstraction, interacting seamlessly from high-level pseudo-code to detailed formal programming languages.
Perspectives on “Purely Conversational Programming Interfaces”
While some believe that the future belongs to purely conversational programming interfaces, the Cursor team sees limitations in this approach:
- Textual instructions lack the precision and care of direct interaction with code.
- Entirely delegating tasks to AI can lead to sub-optimal results, as programmers relinquish control over key design decisions.
The Ideal Programming Approach
The Cursor team believes that the best approach is to combine AI assistance with direct code manipulation:
- Programmers are able to iterate quickly, experiment freely, and shape the software to their vision.
- Using AI to generate sample code, handle complex migrations, and catch common errors allows them to focus on solving challenging design problems and creative solutions.
On the impact of AI on the future of programming skills
The team acknowledges concerns about how AI advances may affect programming skills. They argue that AI will not replace programmers, but rather make programming more fun and approachable by tackling tedious tasks and lowering the barrier to entry.
- While forms of programming may evolve, core skills such as problem solving, logical thinking, and creative expression will remain vital.
- The future of programming belongs to those who love to build and create, to those who feel challenged and fulfilled when building elegant and useful software.
The Cursor team’s vision is of a world where people and AI work in tandem, where technology not only empowers programmers, but also makes the development process more efficient and creative.
👨🏻💻 First project with Cursor
Background
- I am a product person with zero coding background;
- Before using Cursor I tried to program using the following:
- ChatGPT + VS Code
- VS Code + Copilot, Tongyi Code, Tencent Cloud AI Code Assistant…
Project Information: Resume-Companion-R3 (minimalist resume)
GitHub address (open source, help yourself):
Project Background
- It stems from an extension of the hiring needs at the company I currently work for. I wanted to build a tool that polishes and simplifies a resume down to a single A4 page. The final result is roughly as follows (top: an example before processing, with 3 pages of content; bottom: after polishing):
Function
At present, the skeleton is in place, but the functionality is still fairly basic and needs improvement.
The following features are currently supported:
- 🤖 resume content touch-up and simplification
How to retain the effective information while cutting content down still needs improvement here
- 📝 Intelligent adjustment of content according to the job description (JD)
- 🎨 Automatically beautify the resume
This is still a bare-bones version with only a minimalist layout adjustment; to be improved
- 💾 Support multiple input formats (PDF, DOCX, TXT)
- ⬇️ Single-page A4 output, exported as PDF
Project Size
Frontend Code:
src/client/index.html: 26 lines
src/client/script.js: 67 lines
src/client/style.css: 54 lines
Subtotal: 147 lines
Backend Code:
src/server/server.js: 217 lines
src/server/resumePrompt.js: 75 lines
Subtotal: 292 lines

Configuration Files:
package.json: 35 lines
.gitignore: 1 line
.nvmrc: 1 line
LICENSE: 21 lines
README.md: 143 lines
Subtotal: 201 lines

Total: 640 lines

Code Distribution:
Frontend Code: 23%
Backend Code: 46%
Configuration Files: 31%
This is a fairly streamlined project; the main functionality is concentrated in the back-end resume processing and AI integration. The front end is a simple HTML/CSS/JS implementation without complex frameworks, which keeps the code structure clear and easy to understand. The proportion of configuration files is relatively high.
These statistics were compiled with Cursor’s help. Overall, using AI capabilities saved a great deal of the rule-based code that the document-optimization logic would otherwise require. Although I am not a professional engineer, from the perspective of the rule logic I can imagine how much code a similar project would have needed before LLMs. That said, the polishing and simplification itself may not yet be done very well.
Development time
Since this is a side project, I coded whenever I had time; after deducting the idle periods in between, the overall project took about 10 days (converted to 8-hour days).
Recommended Prompts
Midday is an open community for Cursor prompts. The project prompts above are shared by Cursor users; grab them if you need them.
💡 Epiphanies
- AI forces you to understand engineering more deeply. As a product veteran of many years, with plenty of experience arguing (sorry, “exchanging views”) with developers, I had some understanding of technical architecture and logic. But if you never get in the water, you will never learn to swim, and you cannot know what swimming feels like. Cursor and similar tools generate the code for us, but because they are imperfect, you will run into problems. For example: my environment dependencies are clearly installed, so why does compilation still fail, and why won’t it run? Then you may need to ask GPT, or search, or question whether your own way of asking is wrong and is making Cursor spin in circles (and in my experience so far, that is often indeed the reason). This process also gives you a more intuitive understanding of code and engineering.
- It’s not perfect, but the future looks promising. From many research reports and analyses, we can see that the programming ability of the top few large models is already comparable to that of ordinary coders; for example, Claude 3.5 Sonnet reaches 92% on HumanEval. In actual use, however (Cursor uses Claude 3.5 Sonnet by default), there is still a gap in its understanding of more complex requirements, so the results may not be what you expect. But with each “collision”, you’ll find it gets a little better. As the Cursor team said, they tried to build coding tools on GPT-3.5 with very poor results; then GPT-4 came out, they got early access to the beta and were amazed; now Claude 3.5 Sonnet is even better, and the newly launched o1-preview also performs admirably in this area. So, as this infrastructure keeps improving, the future looks promising.
- Traditional collaboration will change. As the Cursor team said in the Lex Fridman interview, AI-powered development tools will free programmers from tedious, repetitive coding tasks to focus more on product design, experimentation, and creativity. Of course, as programmers, they naturally take a programmer’s perspective. From a product person’s point of view, it is not just programmers who are affected: product managers and designers, who bring more business and design skills (no offense to programmers), gain more autonomy in the current workflow. The boundary between the two roles may also blur, with two nodes in a workflow (e.g., the product manager writing the PRD, the programmer developing from it) merging into a single role, and the medium of communication between the nodes (e.g., the PRD between product and development) weakening. Traditional ways of collaborating may change as a result, and along the way more tools will be born to adapt to the new technical environment and serve the new era’s “innovators”.
- Finally, gratitude to the times we live in.
Appendix
- Cursor official website: https://www.cursor.com/
- Cursor Team: Future of Programming with AI | Lex Fridman Podcast: https://www.youtube.com/watch?v=oFfVt3S51T4
- Cursor Founder’s 10,000-word interview: How did the world’s most popular AI programming app come to be? : https://www.huxiu.com/article/3456683.html
- We Raised $60M: https://www.cursor.com/blog/series-a
🛰️ Explore More
- 👨🏻💻 Product Lab: https://starm.ai/
- 🌳 Knowledge Tree: https://www.thewonderful.info/
- 🗣️ X: https://x.com/elekchen