walthamstow 3 hours ago

Eng leadership at my place are pushing Cursor pretty hard. It's great for banging out small tickets and improving the product incrementally kaizen-style, but it falls down with anything heavy.

I think it's weakening junior engineers' reasoning and coding abilities as they become reliant on it without having lived for long, or at all, in the before times. I think it may be doing the same to me too.

Personally, and quietly, I have a major concern about the conflict of interest of Cursor deciding which files to add to context then charging you for the size of the context.

As with so many products, it's cheap to start with, you become dependent on it, then one day it's not cheap and you're fucked.

  • rco8786 2 hours ago

    I’ve been a paying cursor user for 4-5 months now and feeling the same. A lot more mistakes leaking into my PRs. I feel a lot faster but there’s been a noticeable decrease in the quality of my work.

    Obviously I could just review my own code better, but that's proving easier said than done, to the point where I'm considering going back to vanilla VS Code.

    • ljm an hour ago

      Same result - I tried it for a while out of curiosity but the improvements were a false economy: time saved in one PR is time lost to unplanned work afterwards. And it is hard to spot the mistakes because they can be quite subtle, especially if you've got it generating boilerplate or mocks in your tests.

      Makes you look more efficient but it doesn't make you more effective. At best you're just taking extra time to verify the LLM didn't make shit up, often by... well, looking at the docs or the source... which is what you'd do writing hand-crafted code lol.

      I'm switching back to emacs and looking at other ways I can integrate AI capabilities without losing my mental acuity.

      • geoduck14 an hour ago

        Can you elaborate on the mistakes you see? What languages are you working with?

  • KronisLV an hour ago

    > Personally, and quietly, I have a major concern about the conflict of interest of Cursor deciding which files to add to context then charging you for the size of the context.

    > As with so many products, it's cheap to start with, you become dependent on it, then one day it's not cheap and you're fucked.

    If it gets too expensive, then I guess the alternative becomes using something like Continue.dev or Cline with one of the providers like Scaleway that you can rent GPUs from or that have managed inference… either that, or having a pair of L4 cards in a closet somewhere (or a fancy Mac, or anything else with a decent amount of memory).

    Whereas if there are no well-priced options anywhere (e.g. if the upfront investment for a company to buy its own GPUs to run Ollama or something else is too steep), then that just means that running LLM-based systems is economically infeasible for many right now.

  • theshrike79 2 hours ago

    What do you consider "heavy"? Is it optimising an algorithm or "rewrite this whole codebase in <a different language>"?

    • walthamstow an hour ago

      Refactoring a critical function that is very large and complex. Yeah, maybe it shouldn't be so large and so complex, but it is and that's the codebase we have. I'm sure many other companies do too.

      • kamaal an hour ago

        That's not how apex-productivity folks have used any IDE productivity leap, including this one.

        You don't outsource your thinking to the tool; you do the thinking and let the tool type it for you.

        • walthamstow 7 minutes ago

          You're missing the step where I have to articulate (and type) the prompt in natural language well enough for the tool to understand and execute. Then if I fail, I have to write more natural language.

          You said just the same in another of your many posts:

          > if you can begin to describe the function well

          Like I said, it's great for the small stuff, but it's not great for the big stuff, for now at least.

  • malux85 2 hours ago

    Then only use it for the small tasks? There's one button you have to click to turn it off.

  • onion2k an hour ago

    > I think it's weakening junior engineers' reasoning and coding abilities as they become reliant on it without having lived for long, or at all, in the before times.

    As someone old enough to have built websites in Notepad.exe it's totally reasonable that I ask my teams to turn off syntax highlighting, brace matching, and refactoring tools in VSCode. I didn't have them when I started, so they shouldn't use them today. Modern IDE features are just making them lazy.

    /s

    • moron4hire 25 minutes ago

      Not all change is progress.

      Change comes with pros and cons. The pros need to outweigh the cons (and probably significantly so) for change to be considered progress.

      Syntax highlighting has the pro of making code faster to visually parse for most people at the expense of some CPU cycles and a 10 second setting change for people for whom color variations are problematic. It doesn't take anything away. It's purely additive.

      AI code generation tools provide a dubious boost to short term productivity at the expense of extra work in the medium term and skill atrophy in the long term.

      My junior developers think I don't know they are using AI coding tools. I discovered it about 2 months into them doing it, and I've been tracking their productivity both before and after. In one case, one might be committing to the repository slightly more frequently. But in all cases, they still aren't completing assignments on time. Or at all. Even basic things have to be rewritten because they aren't suitable for purpose. And in our pair programming sessions, I see them frozen up now, where they weren't before they started using the tools. I can even see them habitually attempt to use the AI, but then remember I'm sitting with them, and halt.

      I tried to use AI code generation once to fill in some ASP.NET Core boilerplate for setting up authentication. Should be basic stuff. Should be 3 or 4 lines of code. I've done it before, but I forgot the exact lines and had been told AI was good for this kind of lazy recall of common tasks. It gave me a stub that had a comment inside, "implement authentication here". Tried to coax the AI into doing what I wanted and easily spent 10x more time than it would have taken to look up the documentation. And it still wasn't done. I haven't touched AI code gen since.

      So IDK. I'm very skeptical of the claims that AI is writing significant amounts of working code for people, or that it at all rivals even a moderately smart junior developer (say nothing of actually experienced senior). I think what's really happening is that people are spending a lot of time spinning the roulette wheel, always betting on 00, and then crowing they're a genius when it finally lands.

      • kamaal 14 minutes ago

        >>In one case, one might be committing to the repository slightly more frequently. But in all cases, they still aren't completing assignments on time.

        Most people are using it to finish work sooner, rather than using it to do more work. As a senior engineer your job must not be to stop the use of LLMs, but to create opportunities to build newer and bigger products.

        >>I can even see them habitually attempt to use the AI, but then remember I'm sitting with them, and halt.

        I understand you and I grew up in a different era. But life getting easier for the young isn't exactly something we must resent. Things are only getting easier with time, and have been for a few centuries. None of this is wrong.

        >>Tried to coax the AI into doing what I wanted and easily spent 10x more time than it would have taken to look up the documentation.

        Honestly this largely reads like how my dad would describe technology from the 2000s. It was always that he was better off without it. Whether that was true or false is up for debate, but the world was moving on.

    • walthamstow an hour ago

      Is this supposed to be funny?

      • brookst 39 minutes ago

        Funny / sad. GP is just highlighting the all too common attitude of people who grew up using new tech (graphing calculators, Wikipedia, etc) who reach a certain age and suddenly new tech is ruining the youth of today.

        It’s just human nature, you can decide if it’s funny or sad or whatever

        • walthamstow 29 minutes ago

          Neither of you have comprehended the part of my post where I talk about myself and my own skills.

          Hiding behind the sarcasm tag to take the piss out of people younger than you, I don't think that's very funny. The magnetised needle and a steady hand gag from xkcd, now that is actually funny.

  • kamaal an hour ago

    >>but it falls down with anything heavy.

    If you are using LLMs to write anything more than an if/else or for block at a time, you are doing it wrong.

    >>I think it's weakening junior engineers' reasoning and coding abilities as they become reliant on it without having lived for long, or at all, in the before times.

    When I first started work, my employer didn't provide internet access to employees. Their argument was always: how would you code if there were no internet connection, out there in the real world? As it turns out, they were not only worried about the wrong problem, they got the whole paradigm of this new world wrong.

    In short, it wasn't worth building anything at all for a world where the internet doesn't exist.

    >>then one day it's not cheap ...

    Again you are worried about the wrong thing. Your worry should not be what happens when it's no longer cheap, but what happens when it, as a matter of fact, gets cheaper. Which it will.

    • bluefirebrand 44 minutes ago

      > If you are using LLMs to write anything more than a if(else)/for block at a time you are doing it wrong

      Then what value are they actually adding?

      If this is all they are capable of, surely you could just write this code yourself much faster than trying to describe what you want to the LLM in natural language?

      I cannot imagine any decently competent programmer gaining productivity from these tools if this is how limited they still are

      Why are people so bullish on them?

      • galbar 31 minutes ago

        This is how I feel. I mentioned this to a couple of friends over a beer, and their answer was that there are many not-"decently competent" programmers in the industry currently, and they benefit immensely from this technology, at the expense of the stability and maintainability of the systems they are working on.

      • kamaal 31 minutes ago

        English to Code translation.

        They are fairly context-aware as to what you are asking, though, so they can save a lot of RTFM and code/test cycles. At times they can look at the functions that are already built and write new ones for you, if you can begin to describe the function well.

        But if you want to write a good function, one written to fit tightly to specifications, it's too much English. You need to describe in steps what is to be done, plus exceptions. And at some point you are just doing logic programming (https://en.wikipedia.org/wiki/Logic_programming), in the sense that the whole English text looks like a list of and/or conditions plus exceptions.

        So you have to go one atomic step (a decision statement or a loop) at a time. But that's a big productivity boost too, the reason being you can put lots of text in place without having to manually type it out.

        >>you could just write this code yourself much faster than trying to describe what you want to the LLM in natural language?

        Honestly speaking, most of coding is manually laborious if you don't know touch typing. And even if you do, it's a chore.

        I remember when I started using Copilot with React, it did a lot of the typing work I'd otherwise have had to do.

        >>I cannot imagine any decently competent programmer gaining productivity from these tools if this is how limited they still are

        IMO, my brain at least has seen so many code patterns and debugging situations over the years, and knows what to anticipate and assemble as I go, that having an intelligent typing assistant is a major productivity boost.

        >>Why are people so bullish on them?

        Eventually newer programming languages will come along and people will build larger things.

laborcontract 4 hours ago

Cursor's current business model produces a fundamental conflict between the well-being of the user and the financial well-being of the company. We're starting to see these cracks form as LLM providers are relying on scaling through inference-time compute.

Cursor has been trying to do things to reduce the costs of inference, especially through context pruning. For instance, if you "attach" files to a conversation, Cursor no longer stuffs the code from those files into the prompt. Instead, it'll run function calls to open those files and read bits and pieces of the code until the model feels it has enough information. This seems like a perfectly reasonable strategy until you realize you cannot do the same thing with reasoning models, if you're limiting the reasoning to just the initial prompt.

If you prune context out of the initial prompt, then instead of reasoning over richer context, the LLM reasons only on the prompt itself (with no access to the attached files). After the thinking process, Cursor runs function calls to retrieve more context, which entirely defeats the point of "thinking" and induces the model to create incoherent plans and speculative edits during its thinking process, which would explain Claude's bizarre over-editing behavior. I suspect this is why so many Cursor users are complaining about Claude 3.7.
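
Roughly, the difference in ordering looks like this - a sketch only, with every name hypothetical, since Cursor's internals aren't public:

  type Message = { role: "user" | "assistant"; content: string };
  type Thinking = { plan: string; requestedFiles: string[] };

  // Stand-in for a reasoning-model call; a real version would hit the provider API.
  async function think(messages: Message[]): Promise<Thinking> {
    return { plan: "edit auth.ts", requestedFiles: ["auth.ts"] };
  }

  // Option A: stuff attached files into the prompt, so the thinking sees the code.
  function stuffedPrompt(task: string, files: Map<string, string>): Message[] {
    const ctx = [...files].map(([path, src]) => `// ${path}\n${src}`).join("\n");
    return [{ role: "user", content: `${ctx}\n\nTask: ${task}` }];
  }

  // Option B (what's described above): think on the bare task, then fetch files
  // via tool calls - the plan is already formed before any code has been seen.
  async function prunedFlow(task: string, readFile: (p: string) => Promise<string>) {
    const t = await think([{ role: "user", content: `Task: ${task}` }]);
    for (const path of t.requestedFiles) {
      const src = await readFile(path); // context arrives only after the plan exists
      void src; // ...follow-up non-thinking calls now edit from a speculative plan
    }
  }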

On top of this, Cursor has every incentive to keep the thinking effort for both o3-mini and Claude 3.7 to the very minimum so as to reduce server load.

Cursor is being hailed as one of the greatest SaaS growth stories, but their $20/mo all-you-can-eat business model puts them in such a bad place.

  • NitpickLawyer 2 hours ago

    > This seems like a perfectly reasonable strategy until you realize you cannot do the same thing with reasoning models, if you're limiting the reasoning to just the initial prompt.

    Keep in mind that what we call "reasoning" models today are the first iteration. There's no fundamental reason why you can't do what you stated. It's not done now, but it can be done.

    There's nothing stopping you from running "thinking" in "chunks" of 1-2 paragraphs, doing some search, adding more context (maybe from a pre-reasoned cache), and continuing the reasoning from there.

    There's also work being done on think - summarise - think - summarise - etc. And on various "RAG"-like thinking.
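
    A rough sketch of that interleaved loop, assuming (hypothetically) an API that can emit reasoning in chunks and resume from a partial trace - none of these functions exist today under these names:

      type Chunk = { text: string; wantsSearch?: string };

      // Stubs: real versions would call the model provider / a code-search index.
      async function reasonChunk(context: string, traceSoFar: string): Promise<Chunk> {
        return { text: "..." };
      }
      async function search(query: string): Promise<string> {
        return "";
      }

      async function interleavedThinking(context: string, maxChunks = 10): Promise<string> {
        let trace = "";
        for (let i = 0; i < maxChunks; i++) {
          const chunk = await reasonChunk(context, trace); // think 1-2 paragraphs
          trace += chunk.text;
          if (!chunk.wantsSearch) break;                   // model is done reasoning
          context += "\n" + (await search(chunk.wantsSearch)); // add retrieved context mid-stream
        }
        return trace;
      }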

  • rafaelmn 3 hours ago

    >Cursor has been trying to do things to reduce the costs of inference, especially through context pruning. For instance, if you "attach" files to a conversation, Cursor no longer stuffs the code from those files into the prompt. Instead, it'll run function calls to open those files and read bits and pieces of the code until the model feels it has enough information. While that seems like a perfectly reasonable strategy, it starts to fall apart when integrating reasoning models.

    In general I feel like this was always the reason automatic context detection could not be good in fixed fee subscription models - providers need to constrain the context to stay profitable. I also saw that things like Claude Code happily chew through your codebase, and bank account, since they are charging by token - so they have the opposite incentive.

  • Roritharr 3 hours ago

    This is only surface-level true. Cursor already has quotas on their paid plans and usage-based pricing for the larger models; I hit the quota and fall over to the usage-based model every month.

    Imo most of their incentive for context pruning comes not just from reducing the token count, but from the perception that you only have to find "the right way"™ to build that context window automatically to reach coding panacea. They just aren't there yet.

    • laborcontract 2 hours ago

      If you're going to pay on the margin, why not use those incremental dollars running the same requests on Cline? I'm assuming cost is the deciding factor here because, quality-wise, plugging directly into provider APIs with Cline always does a much better job for me.

  • IanCal 2 hours ago

    > Instead, it'll run function calls to open those files and read bits and pieces of the code until the model feels it has enough information. This seems like a perfectly reasonable strategy until you realize you cannot do the same thing with reasoning models, if you're limiting the reasoning to just the initial prompt.

    There's nothing about this that conflicts with reasoning models, I'm not sure what you mean here.

    • laborcontract 2 hours ago

      What I mean is that their implementation (thinking only on the first response) yields zero benefit, because the thinking doesn't see the code itself. They run multiple function calls to analyze your codebase in increments. If they ran the thinking model on the output of those function calls, performance would be great, but so far this is not what they are doing (yet). It would also dramatically increase the cost of running the same operation.

      • throwaway314155 an hour ago

        This sounds like a Cursor issue, not something that affects reasoning models in general.

        edit: Ah, I see what you mean now.

        • laborcontract an hour ago

          That's my point. By offering unlimited requests (500 fast requests + unlimited slow requests) to people paying a fixed $20/mo, Cursor has put itself into a ruthless marginal-cost optimization game where one of its biggest levers is reducing context sizes and discouraging thinking after every function call.

          Software like Claude Code and Cline do not face those constraints, as the cost burden is on the user.

  • namaria 2 hours ago

    Reflecting on your comment, I realized that using a huge number of GPUs is akin to a Turing machine approaching infinite speed. So I think the promise of LLMs writing code is basically saying: if we add a huge number of reading/writing heads with an unbounded number of rules, we can solve decidability. Because what is the ability to generate arbitrarily complex code if not solving the halting problem? Maybe there's a more elegant or logical way to postulate this, or maybe I'm just confused or plain wrong, but it seems to me that it is impossible to generate a program that is guaranteed to terminate unless you can solve decidability. And throwing GPUs at a huge tape is just saying that the tape approaches infinite size and the Turing machine approaches infinite speed...

    Or put another way, isn't the promise of software capable of generating any software from a natural language description in finite time basically assuming P=NP? Because unless the time can be guaranteed to be finite, throwing GPU farms and memory at this most general problem (isn't the promise of using software to generate arbitrary software the same as the promise that any possible problem can be solved in polynomial time?) is not guaranteed to solve it in finite time.

  • MrBuddyCasino 2 hours ago

    > Cursor has been trying to do things to reduce the costs of inference, especially through context pruning.

    You can also use cline with gemini-2.0-flash, which supports a huge context window. Cline will send it the full context and not prune via RAG, which helps.

    • laborcontract 2 hours ago

      I love cline but i’ve never tried the gemini models with it. I’ll give it a shot tonight, thanks for the tip!

HugoDias 9 minutes ago

I saw this post on the first page a few minutes ago (published 5 hours ago), but it quickly dropped to the 5th page. Given its comments and points, that seems odd. I had to search to find it again. Any idea why?

rhodescolossus an hour ago

I've tried Cursor a couple of times, but my complaint is always the same: why fork VS Code when all this functionality could just be an extension, the same as Copilot does?

Some VSCode extensions don't work, you need to redo all your configuration, add all your workspaces... and the gain vs Copilot is not that high

  • frereubu an hour ago

    > and the gain vs Copilot is not that high

    I think that's (at least part of) your answer. More friction to move back from an entirely separate app rather than disabling an extension.

cyprx 4 hours ago

I had been using Cursor for a month until one day my house had no internet, and I realized I had started forgetting how to write code properly.

  • risyachka 2 hours ago

    I had the exact same experience, pretty sure this happens in most cases, people just don’t realize it

  • ant6n 3 hours ago

    Just get a Mac Studio with 512GB RAM and run a local model when the internet is down.

    • jjude 3 hours ago

      Which local model would you recommend that comes close to Cursor in response quality? I have tried DeepSeek, Mistral, and a few others. None comes close to the quality of Cursor. I keep coming back to it.

      • itsabackupplan 3 hours ago

        It's a backup plan, who cares if the quality matches? If it did, Cursor would not be in question.

        • walthamstow 3 hours ago

          A $10k backup plan? That makes sense. No wonder you used a throwaway.

          • itsabackupplan 3 hours ago

            [flagged]

            • walthamstow 2 hours ago

              The hardware isn't free. Someone asks a question and your answer is who cares about $10k of hardware hanging around as a subpar backup?

              • itsabackupplan 2 hours ago

                See, the thing is, I never wanted to comment on the cost or feasibility of the hardware at all. What I was commenting on was that any backup plan is expected to be subpar by its very nature, and if it weren't, it should be instantly promoted. If you'll notice, that was 100% of what I said. I was adding to the pile of "this plan is stupid". Cursor has an actual value proposition.

                Of course, then you disrespected me with a rude ad hominem and got a rude response back. Ignoring the point and attacking the person is a concession.

                For the record, I and many others use throwaways every single thread. This isn't and shouldn't be Reddit.

                • walthamstow 2 hours ago

                  You're right, I shouldn't have said the throwaway bit, sorry. However, you're ignoring the context of the conversation, which is a $10k piece of hardware. I don't know what you expected to add to the conversation by saying "who cares?" when someone asks for advice, in context or even in isolation.

    • pknerd 2 hours ago

      Can one run Cursor with local LLMs only?

    • _puk 3 hours ago

      Back up your $20 a month subscription with a $2000 Mac Studio for those days your internet is down.

      Peak HN.

      • automatic6131 3 hours ago

        Lol he suggested a $10k Mac Studio

        But you can at least resell that $10k Mac Studio, theoretically.

        • mettamage 3 hours ago

          I'm trying to do that with a 32GB M1 laptop, and it's hard to get even 1000 euros for it in the Netherlands, whereas the refurbished price is double that.

      • yohannesk 3 hours ago

        Even more absurd is that a Mac Studio with 512GB RAM costs around $9.5K

      • rullopat 3 hours ago

        $2000? You wish!

        • _puk an hour ago

          Lol, not sure where I got the 2k from. Brain fart, but I'll let it stand :D

    • eadmund an hour ago

      But then I’d be using a Mac, and that would slow my development down and be generally miserable.

2sk21 an hour ago

I read this point in the article with bafflement:

"Learn when a problem is best solved manually."

Sure, but how? This is like the vacuous advice for investors: buy low and sell high.

  • dkersten 15 minutes ago

    By trying things and seeing what it's good and bad at. For example, I no longer let it make data modelling decisions (both for client-local data and database schemas), because it had a habit of coding itself into holes it had trouble getting back out of, e.g. duplicating data that it then has difficulty keeping in sync, where a better model from the start might have been a more normalised structure.

    But I came to this conclusion by first letting it try to do everything and observing where it fell down.

gregwebs 34 minutes ago

AI blows me away when asked to write greenfield code. It can get a complex task using hundreds of lines of code right on the first try, or perhaps it needs a second try at the prompt and an additional tweak of the output code.

As things move from prototype to production ready the productivity starts to become a wash for me.

AI doesn't do a good job organizing the code and keeping it DRY, and then it's not easy for it to make those refactorings later. AI is good at writing code that isn't inherently defective, but if there is complexity in the code, it will introduce bugs in its changes.

I use Continue for small additions and tab completions and Claude for large changes. The tab completions are a small productivity boost.

Nice to see these tips- I will start experimenting with prompts to produce better code.

jillesvangurp 4 hours ago

The UX of tools like these is largely constrained by how good they are with constructing a complete context of what you are trying to do. Micromanaging context can be frustrating.

I played with aider a few days ago. Pretty frustrating experience. It kept telling me to "add files" that are in the damn directory that I opened it in. "Add them yourself" was my response. Didn't work; it couldn't do it somehow. Probably once you dial that in, it starts working better. But I had a rough time with it creating commits with broken code, not picking up manual file changes, etc. It all felt a bit flaky and brittle. Half the problem seems to be simple cache coherence issues and me having to tell it things that it should be figuring out by itself.

The model quality seems less important than the plumbing to get the full context to the AI. And since large context windows are expensive, a lot of these tools are cutting corners all the time.

I think that's a short-term problem. Not cutting those corners is valuable enough that a logical end state is tools that don't do that and cost a bit more. Just load the whole project. Yes, it will make every question cost $2-3 or something like that. That's expensive now, but if it drops by 20x we won't care.

Basically, large models that support huge context windows of millions or tens of millions of tokens cost something like the price of a small car and use a lot of energy. That's OK. Lots of people own small cars. Because they are kind of useful. AIs that have a complete, detailed context of all your code, requirements, intentions, etc. will be able to do a much better job than one that has to guess all of that from a few lines of text. That would be useful. And valuable to a lot of people.

Nvidia is rich because they have insane margins on their GPUs. They cost a fraction of what they sell them for. That means that price will crash over time. So, I'm optimistic that a lot of this stuff will improve rapidly.

  • myflash13 27 minutes ago

    Try Claude Code. It figures out context by itself. I've been having a lot of success with it for a few days now, whereas I never caught on with Cursor due to the context problem.

  • _heimdall an hour ago

    > Nvidia is rich because they have insane margins on their GPUs. They cost a fraction of what they sell them for. That means that price will crash over time. So, I'm optimistic that a lot of this stuff will improve rapidly.

    That still leaves us with an ungodly amount of resources used both to build the GPUs and to run them for a few years before having to replace them with even more GPUs.

    It's pretty amazing to me how quickly the big tech companies pivoted from making promises to "go green" to buying as many GPUs as possible and burning through entire power plants' worth of electricity.

  • jampekka 3 hours ago

    > Nvidia is rich because they have insane margins on their GPUs. They cost a fraction of what they sell them for. That means that price will crash over time. So, I'm optimistic that a lot of this stuff will improve rapidly.

    OTOH currently the LLM companies are probably taking a financial loss with each token. Wouldn't be surprised if the price doesn't even cover the electricity used in some cases.

    Also e.g. Gemini already runs on Google's custom hardware, skipping the Nvidia tax.

stared an hour ago

Other useful things I've discovered:

- Push for DRY principles ("make code concise," "ensure good design").

- Swap models strategically; sometimes it's beneficial to design with one model and implement with another. For example, use DeepSeek R1 for planning and Claude 3.5 (or 3.7) for execution. GPT-4.5 excels at solving complex problems that other models struggle with, but it's expensive.

- Insist on proper typing; clear, well-typed code improves autocompletion and static analysis.

- Certain models, particularly Claude 3.7, overly favor nested conditionals and defensive programming. They frequently introduce nullable arguments or union types unnecessarily. To mitigate this, keep function signatures as simple and clean as possible, and validate inputs once at the entry point rather than repeatedly in deeper layers (see the sketch after this list).

- Emphasize proper exception handling. Some models (again, notably Claude 3.7) have a habit of wrapping everything in extensive try/catch blocks, resulting in nested and hard-to-debug code reminiscent of legacy JavaScript, where undefined values silently pass through multiple abstraction layers. Allowing code to fail explicitly is a blessing for debugging purposes; masking errors is like replacing a fuse with a nail.
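
A minimal TypeScript sketch of those last two points - validate once at the boundary, let deeper layers fail loudly - with all names invented for illustration:

  // Boundary: parse and validate untrusted input exactly once.
  type Order = { id: string; quantity: number };

  function parseOrder(raw: unknown): Order {
    if (typeof raw !== "object" || raw === null) throw new Error("order must be an object");
    const { id, quantity } = raw as Record<string, unknown>;
    if (typeof id !== "string" || id === "") throw new Error("order.id must be a non-empty string");
    if (typeof quantity !== "number" || quantity <= 0) throw new Error("order.quantity must be > 0");
    return { id, quantity };
  }

  // Deeper layers trust the type: no nullable params, no re-validation,
  // no try/catch to swallow a real bug.
  function totalPrice(order: Order, unitPrice: number): number {
    return order.quantity * unitPrice;
  }

  // The anti-pattern described above: defensive signature plus masked errors.
  function totalPriceDefensive(order?: Order | null, unitPrice?: number): number {
    try {
      return (order?.quantity ?? 0) * (unitPrice ?? 0); // bad input silently becomes 0
    } catch {
      return 0; // the nail in the fuse box
    }
  }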

  • stared an hour ago

    Some additional thoughts on GPT-4.5: it provides a BFG-9000 experience - it eats e̶n̶e̶r̶g̶y̶ ̶c̶e̶l̶l̶s̶ budget ($2 per call!) like there is no tomorrow, but removes bugs with a blast.

    In my experience, the gap between Claude 3.7 and GPT-4.5 is substantial. Claude 3.7 behaves like an overzealous intern on stimulants. It delivers results but often includes unwanted code changes, resulting in spaghetti code with deeply nested conditionals and redundant null checks. Although initial results might appear functional, the resulting technical debt makes subsequent modifications increasingly difficult, often leaving the codebase in disarray. GPT-4.5 behaves more like a mid-level developer, thoughtfully applying good programming patterns.

    Unfortunately, the cost difference is significant. For practical purposes, I typically combine models. GPT-4.5 is generally reserved for planning, complex bug fixes, and code refinement or refactoring.

    In my experience, GPT-4.5 consistently outperforms thinking models like o1. Occasionally, I'll use o3-mini or DeepSeek R1, but GPT-4.5 tends to be noticeably superior (at least, on average). Of course, effectiveness depends heavily on prompts and specific problems. GPT-4.5 often possesses direct knowledge about particular libraries (even without web searching), whereas o3-mini frequently struggles without additional context.

blainm 3 hours ago

I've found tools like Cursor useful for prototyping and MVP development. However, as the codebase grows, they struggle. It's likely due to larger files or an increased number of them filling up the context window, leading to coherence issues. What once gave you a speed boost now starts to work against you. In such cases, manually selecting relevant files or snippets from them yields better results, but at that point it's not much different from using the web interface to something like Claude.

  • Semaphor 3 hours ago

    I had that same experience with Claude Code. I tried to do a 95% "Idle Development RPG" approach to developing a music release organization software. At the beginning, I was really impressed, but with more and more complexity, it becomes increasingly incoherent, forgetting about approaches and patterns used elsewhere and reinventing the wheel, often badly.

  • blitzar 2 hours ago

    Or the context window not being large enough for all the obscure functions and files to fit. I am too basic to have dug deep enough, but a simple (automatic) documentation context for the entire project would certainly improve things for me.

  • turnsout an hour ago

    Agreed. One useful tip is to have Cursor break up large files into smaller files. For some reason, the model doesn't do this naturally. I've had several Cursor experiments grow into 3000+ line files because it just keeps adding.

    Once the codebase is reasonably structured, it's much better at picking which files it needs to read in.

Amekedl 3 hours ago

Echoing the opinions of other commenters, I feel that using Cursor is a bad idea. It's a closed-source SaaS, and with these components involved, service quality can swing wildly on a daily basis, which is not something I'm particularly keen on.

  • rco8786 2 hours ago

    This is true of every single service provider outside of fully OSS solutions, which are a teeny tiny fraction of the world's service providers.

  • turnsout an hour ago

    There's always Aider with local models!

yard2010 3 hours ago

How can I stop Cursor from sending .env files with secrets as plain text? Nothing I tried from the docs works.

  • M4v3R 3 hours ago

    This is a huge issue that was already raised on their forums, and it's very surprising they haven't addressed it yet.

    [0] https://forum.cursor.com/t/environment-secrets-and-code-secu...

    • timrichard 2 hours ago

      I have been adding .env files to .cursorignore so far.

      I can see from that thread that the approach hasn't been perfect, but it seems that the last two releases have tried to address that:

      “0.46.x : .cursorignore now blocks files from being added in chat or sent up for tab completions, in addition to ignoring them from indexing.”
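
      In case it helps anyone: .cursorignore sits at the project root and uses .gitignore-style patterns, so something like this should cover the common secret files (patterns here are just an example):

        # keep secrets out of indexing, chat context and (as of 0.46.x) tab completions
        .env
        .env.*
        *.pem
        secrets/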

flippyhead 31 minutes ago

Note that the latest update (0.47.x) made this useful change:

Rules: Allow nested .cursor/rules directories and improved UX to make it clearer when rules are being applied.

This has made things a lot easier in my monorepos.
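
If you haven't set these up: project rules are .mdc files under .cursor/rules, and with nesting each package can carry its own. A rough example of what one might look like (the path, globs, and rules below are purely illustrative, not from the changelog):

  # packages/frontend/.cursor/rules/react.mdc
  ---
  description: Conventions for the frontend package
  globs: packages/frontend/**/*.tsx
  ---

  - Use function components with hooks; no class components.
  - Prefer nullish coalescing (??) over || for defaults.
  - Never edit generated files under src/gen/.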

mrlowlevel 2 hours ago

Do any of these tools use the rich information from the AST to pull in context? Coupled with semantic search for entry points into the AST, it feels like you could do a lot…

  • zarathustreal an hour ago

    Don’t they all do this? Surely they’re not just doing naive text, n-gram, regex, embeddings, etc, right?

DaveMcMartin 35 minutes ago

For those of you who, like me, use Neovim, you can achieve "cursor at home" by using a plugin like Avante.nvim or CodeCompanion. You can configure it to suit your preferences.

Just sharing this because I think some might find it useful.

torginus 3 hours ago

I have been a religious Cursor + Sonnet user for like the past half a year, and maybe I'm an idiot, but I don't like this agentic workflow at all.

What worked for me is having it generate functions and classes ranging from tens of lines of code to the low hundreds. That way I could quickly iterate on its output and check if it's actually what I wanted.

It created a prompt-check-prompt iterative workflow where I could make progress quite fast and be reasonably certain of getting what I wanted. Sometimes it required fiddling with manually including files in the context, but that was a sacrifice I was willing to make and if I messed up, I could quickly try again.

With these agentic workflows, and thinking models I'm at a loss.

To take advantage of them, you need very long and detailed prompts; they take a long time to generate and drop huge chunks of code on your head. What it generates is usually wrong due to some combination of sloppy or ambiguous requirements by me, model weaknesses, and agent issues. So I need to take a good chunk of time to actually understand what it made, and fix it.

The iteration time is longer, I have less control over what it's doing, which means I spend many minutes of crafting elaborate prompts, reading the convoluted and large output, figuring out what's wrong with it, either fixing it by hand, or modifying my prompt, rinse and repeat.

TLDR: Agents and reasoning models generate 10x as much code, which you then have to spend 10x as much time reviewing and 10x as much time crafting a good prompt for.

In theory it comes out as a wash; in practice it's worse, since the super-productive tight AI iteration cycle is gone.

Overall I haven't found these thinking models to be that good for coding, other than the initial project setup and scaffolding.

  • timrichard 2 hours ago

    I think you’re absolutely right and I’ve come to the same conclusion and workflow.

    I work on one file at a time in Ask mode, not Composer/Agent. Review every change, and insist on revisions for anything that seems off. Stay in control of the process, and write manually whenever it would be quicker. I won’t accept code I don’t understand, so when exploring new domains I’ll go back with as many questions as necessary to get into the details.

    I think Cursor started off this way as a productivity tool for developers, but a lot of Composer/Agent features were added along the way as it became very popular with Vibe Coders. There are inherent risks with non-coders copypasting a load of code they don’t understand, so I see this use case as okay for disposable software, or perhaps UI concept prototypes. But for things that matter and need to be maintained, I think your approach is spot on.

    • _heimdall an hour ago

      Have you found that this still saves you time overall? Or do you spent a similar amount of time acting as a code reviewer rather than coding it yourself?

      • timrichard 37 minutes ago

        Yes, I think so. Often it doesn’t take much more than a glance for simpler edits.

  • theshrike79 2 hours ago

    Do you have any Cursor rules defined? Those tend to curb its habit of going off the rails and solving 42 problems at once instead of just the one.

ookblah an hour ago

parts of the article are spot on. after the magic has worn off i find it's best to literally treat it like another person. would you blindly merge code from someone else or huge swaths of features? no. i have to review every single piece of code, because later on when there's a bug or new feature you have to have that understanding.

another huge thing for me has been to scaffold a complex feature just to see what it would do. just start out with literal garbage and an idea and as long as it works you can start to see if something is going to pan out or not. then tear it down and do it again with those new assumptions you learned. keep doing it until you have a clear direction.

or sometimes my brain just needs to take a break and i'll work on boilerplate stuff that i've been meaning to do or small refactors.

pestkranker an hour ago

Is there an equivalent to cursorrules and copilot-instructions for the JetBrains IDEs (Rider) + GitHub Copilot extension?

divan 3 hours ago

How does the current state of Cursor agentic workflow compare to Windsurf Editor?

I've been using Windsurf since it was released, and back then, it was so far ahead of Cursor it's not even funny. Windsurf feels like it's trained on good programming practices (checking usage of a function in other parts of the project for consistency, double-checking for errors after changes are made, etc.). It's also surprisingly fast (it can "search" a 5k-file codebase in, like, 2 seconds). It even asked me once to copy and paste output from Chrome DevTools because it suspected that my interpretation of the result was not accurate (and it was right).

The only thing I truly wish is to have the same experience with locally running models. Perhaps Mac Studio 512GB will deliver :)

DiabloD3 an hour ago

I'm sorry, but isn't Cursor just an editor? Maybe an editor shouldn't actually have garbage parts to avoid?

Why not just use an editor that is focused on coding, and then just not use an LLM at all? Less fighting the tooling, more getting your job done with less long term landmines.

There are a lot of editors, and many of them even have native or semi-native LLM support now. Pick one.

Edit: Also, side note, why are so many people running their LLMs in the cloud? All the cutting edge models are open weight licensed, and run locally. You don't need to depend on some corporation that will inevitably rug-pull you.

Like, a 7900XTX runs you about $1000. You probably already own a GPU that cost more in your gaming rig.

  • krapht 37 minutes ago

    > Edit: Also, side note, why are so many people running their LLMs in the cloud? All the cutting edge models are open weight licensed, and run locally. You don't need to depend on some corporation that will inevitably rug-pull you.

    ???

    Deepseek R1 doesn't run locally unless you program on a dual socket server with 1 TB of RAM. Or enough cash to have a cabinet of GPUs. The trend for state-of-the-art LLMs is to get bigger over time, not smaller.

    Look, I've played with llava and llama locally too, but the benchmarked performance is nowhere near what you can get from the larger cloud providers, who can serve hundred-billion+ parameter models without quantization.

  • zild3d 24 minutes ago

    > All the cutting edge models are open weight licensed, and run locally.

    No? From https://lmarena.ai/ (coding):

      Rank*  Model                            Score  Org        License
      1      Grok-3-Preview-02-24             1414   xAI        Proprietary
      1      GPT-4.5-Preview                  1413   OpenAI     Proprietary
      3      Gemini-2.0-Pro-Exp-02-05         1378   Google     Proprietary
      3      o3-mini-high                     1369   OpenAI     Proprietary
      3      DeepSeek-R1                      1369   DeepSeek   MIT
      3      ChatGPT-4o-latest (2025-01-29)   1367   OpenAI     Proprietary
      3      Gemini-2.0-Flash-Thinking-Exp    1366   Google     Proprietary
      3      o1-2024-12-17                    1359   OpenAI     Proprietary
      3      o3-mini                          1353   OpenAI     Proprietary
      4      o1-preview                       1355   OpenAI     Proprietary
      4      Gemini-2.0-Flash-001             1354   Google     Proprietary
      4      o1-mini                          1353   OpenAI     Proprietary
      4      Claude 3.7 Sonnet                1350   Anthropic  Proprietary

ThinkBeat an hour ago

What programming languages do you primarily use? I feel that knowing which programming languages an LLM is best at is valuable, but it's often not directly apparent.

dimgl 28 minutes ago

The new Cursor update (0.47) is cursed. They got rid of codebase searching (WTF?) and the agent is noticeably worse, even when using Sonnet 3.5.

I'm really shocked, actually. This might push me to look at competitors.

flipgimble an hour ago

Cursor overwrites the "code" command line shortcut/alias that's normally set by VS Code. It does this on every update, with no setting to disable the behavior. There are a number of forum threads asking about manual workarounds. This seems like a deliberate anti-user feature meant to get their usage numbers up at all costs. This small thing makes me not trust that Cursor's decision-making process won't sell me out as a user.

  • throwaway314155 an hour ago

    This is the primary reason I uninstalled Cursor and subsequently realized that, hey, VS Code has most of these features now.

    What in the hell were they thinking?!

kevingadd 4 hours ago

> Like mine will keep forgetting about nullish coalescing (??) in JS, and even after I fix it up it will revert my change in its future changes. So of course I put that rule in and it won't happen again.

I'm surprised that this sort of pattern - you fix a bug and the AI undoes your fix - is common enough for the author to call it out. I would have assumed the model wouldn't be aggressively editing existing working code like that.
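
For anyone who hasn't hit it, the footgun that rule guards against is real: ?? only falls back on null/undefined, while || also swallows legitimate falsy values, so a model "simplifying" one into the other silently changes behavior:

  const config: { retries?: number | null } = { retries: 0 };

  const a = config.retries ?? 3; // 0 - only null/undefined trigger the fallback
  const b = config.retries || 3; // 3 - || also swallows 0, "" and false

  console.log(a, b); // 0 3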

  • worldsayshi 4 hours ago

    Yeah I have seen this a bunch of times as well. Especially with deprecated function calls. It generates a bunch of code. I get deprecation warnings. I fix them. Copilot fixes them back. I have to explicitly point out that I made the change for it to not reintroduce the deprecations.

    I guess code that compiles is easy to train for, but code without warnings less so?

    There are other examples of changes I've had to tell the AI I made so it wouldn't change them back again, but I can't remember any specifics.

  • Aeolun 4 hours ago

    It's due to a problem with Cursor not updating the state of files that have been manually edited since the last time they were used in the chat, so it'll think the fix is not there and blindly output code that doesn't have it. The 'apply' model is dumb, so it just overwrites the corrected version with the wrong one.

    I think the changelog said they fixed it in 0.46, but that’s clearly not the case.

    • oefrha 4 hours ago

      Yep, I asked about this exact problem the other day: https://news.ycombinator.com/item?id=43308153 Having something like "always read the current version of the file before suggesting edits" in Cursor rules doesn't help; the current file is only sometimes read by the agent. Guess no one has a reliable solution yet.

  • siquick 4 hours ago

    Cursor in agent mode + Sonnet 3.7 love nothing better than rewriting half your codebase to fix one small bug in a component.

    I've stopped using agent mode unless it's for a POC where I just want to test an assumption. Applying each step takes a bit more time, but it means less rogue behaviour and better long-term results IME.

    • krets 4 hours ago

      Sounds like a human colleague of mine

    • worldsayshi 4 hours ago

      > love nothing better than rewriting half your codebase to fix one small bug in a component

      Relatable though.

      • MarcelOlsz 4 hours ago

        Reminds me of my old co-worker who rewrote our code to be 10x faster but 100x more unreadable. AI agent code is often the worst of both of those worlds. I'm going to give [0] this guy's strategy a shot.

        [0] https://www.youtube.com/watch?v=jEhvwYkI-og

    • WA 3 hours ago

      If you stopped using agent mode, why use Cursor at all and not a simple plugin for VSCode? Or is there something else that Cursor can do, but a VSCode plugin can't?

factsaresacred an hour ago

Too bad they removed the ability to use Chat (rebranded as Ask) with your own API keys in version 0.47. Now every feature requires a subscription.

It's natural for Cursor to nudge users towards their paid plans, but why provide the ability to use your own API keys in the first place if you're going to make them useless later?

askonomm 4 hours ago

So, if I liked being a manager more than a developer, I'd use Cursor and lean entirely into AI?

  • lonelyasacloud 3 hours ago

    Yes; it can be used in agentic mode, and along with the joys it also has a few of the frustrations that will be familiar if you have managed human devs.

  • TingPing 2 hours ago

    If you don’t understand what it outputs then it’s just random garbage.

quotz 3 hours ago

Cline is much better

r_singh 3 hours ago

Just use Cline, it beats Cursor hollow — saves me like hours per day