
AI at the wheel, dev navigating

AI Coding with Cursor: Do You Still Need to Be Tech-Savvy?

Reading Time: 19 minutes.

🧐 What’s In It For Me?

Are you a developer concerned that AI might threaten your job? Based on my experience with Cursor — an IDE fully integrated with AI — I’d say the general perception holds true: junior developer roles are the most at risk, at least for now.

Curious how it actually feels to use cutting-edge AI coding tools? In this article, I share what Cursor did well (and where it fell short), what tech stack I used, and how I built and deployed a functional full-stack app — using only free hosting and AI-driven development.

🧭 Context

Cursor AI was recently evaluated at my workplace, Playtika, as a potential alternative to GitHub Copilot.

Although I haven’t been actively coding for the past three years — my role has shifted more toward engineering management — I was curious to see how the developer experience has evolved. I wanted to understand how tools like Cursor are reshaping the coding workflow, especially compared to what I was used to three years ago.

So I took Cursor for a spin, built a real-world app from scratch, and documented my experience — the good, the bad, and everything in between.

🔍 Exploring What’s on the Market

Before diving into the development experience, I wanted to get a broader sense of what tools are currently available. There’s no shortage of AI-assisted coding solutions — but their capabilities, focus areas, and maturity levels vary widely.

I’ve skipped pricing details in this comparison. Some tools are free, but in my experience, the best-performing ones typically aren’t. Most of them cost up to $20/month — including Cursor. 

Below, I’ve listed the most relevant tools, roughly ordered by user adoption (higher first):

| Tool | AI Agent Mode | Customization | AI Models Integration | Editor Base | Target Audience |
| --- | --- | --- | --- | --- | --- |
| GitHub Copilot | ❌ No | Low | OpenAI, Claude, Gemini | VS Code, JetBrains | Devs using GitHub |
| Cursor AI | ✅ Yes | High | OpenAI, Claude, Gemini, Mistral, DeepSeek | VS Code fork | Devs seeking a full AI IDE |
| Amazon CodeWhisperer | ❌ No | Low | Amazon proprietary | VS Code/JetBrains | AWS devs |
| Windsurf | ✅ Yes (Cascade) | Medium | Custom (Cascade & Supercomplete) | VS Code | Advanced VS Code users |
| Void | ✅ Yes | Very high | Custom/local/self-hosted LLMs | VS Code fork | Open-source enthusiasts & privacy-focused devs |

🤖 Does AI Agent Mode Make a Difference?

One of the most notable evolutions in AI-assisted development is the rise of Agent Mode — where AI becomes more than a smart autocomplete.

Instead of suggesting lines of code, an AI agent can:

  • Understand and execute task-level instructions in natural language
  • Modify multiple files in context
  • Run commands or handle configuration steps

This shifts the role of AI from helping you type faster to actually taking work off your plate — like delegating to another developer.

Whether that potential translates into real productivity gains? That’s what I set out to test.

🚫 Why I Skipped Windsurf and Void (for Now)

Windsurf appeared to be the most complete alternative to Cursor, particularly due to its Cascade agent and scripting capabilities. However, its relatively low adoption and limited community support made me hesitant to invest time into testing it at this stage. I plan to revisit it once it gains more traction and maturity.

As for Void, its focus on open-source and self-hosted models is compelling — especially for privacy-conscious teams or those with custom infrastructure needs. That said, I currently expect the performance and reliability of those models to lag behind leading proprietary solutions, so I’ve chosen to hold off on evaluating it for now.

🌐 Exploring What Other People Do With It

To better understand how Cursor is used in practice, I reviewed a variety of sources — LinkedIn posts, blog articles, YouTube videos — and spoke with developers I know to get their feedback.

While this was by no means an exhaustive study, I noticed a few recurring patterns:

🧪 Building PoC apps. Most public examples I found were relatively simple — focused on building small projects, often without deploying them to production or evolving them further. These seemed aimed more at learning or experimentation than long-term development. 

In many cases, the content reminded me of my academic exercises: easy to follow in a controlled setting, but not easily transferable to more complex, real-world scenarios. 

That said, I was surprised by the range of technologies people were enthusiastic to use it for — from machine learning to 3D modeling.

💼 Personal Branding and Selling. Some individuals appeared to be using Cursor primarily as part of personal branding or to promote their services, rather than showcasing concrete development workflows, or to make predictions about its potential to replace developers' work.

⚙️ Copilot-like usage. When it comes to more experienced developers, their use of Cursor seemed more conservative — typically limited to well-scoped, small tasks within a single file, similar to how GitHub Copilot or even ChatGPT is often used.

That said, this is just what I was able to observe from public content. It’s entirely possible that more advanced or large-scale use cases exist, but I may have simply missed them.

Of all the resources I found, one of the most helpful was this concise guide with practical tips for using Cursor more effectively.

🗺️ The Project – A Custom Platform for Tour Booking

To make the most of this experiment, I chose a project I genuinely care about: a custom platform for booking tours, which I plan to use as the foundation for a future startup.

The goal was to stay motivated beyond just a demo — building something useful, not just a throwaway proof of concept.

Project Development Process

📋 Gathering High-Level Requirements 

I started with some brainstorming and drafted the high-level requirements for the MVP. I prioritized features based on relevance and feasibility, and documented everything in a plain text file.

To keep development fast and focused, I chose a monolithic architecture, planning to revisit scalability and maintainability later if the project gains traction.

📦 Choosing the Tech Stack

My thought process on choosing the tech stack was to identify technologies that are:

  • well suited to my high-level requirements
  • popular, with lots of online examples – so that the AI model is well trained
  • well integrated with VS Code, the IDE Cursor is based on
  • easy to deploy — ideally for free

I validated my thinking in a quick back-and-forth with ChatGPT, and ended up with the following stack:

| Layer | Tech |
| --- | --- |
| Architecture | Full-stack serverless monolith |
| Frontend | Next.js + React + Tailwind CSS |
| Backend | Next.js API Routes (serverless functions) |
| Database | Supabase (PostgreSQL) |
| ORM | Prisma |
| Hosting | Vercel (frontend + API) + Supabase DB |
| Versioning | GitHub |
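To make the "serverless monolith" idea concrete, here is roughly what an API route looks like in this stack. It's a minimal sketch with an illustrative Tour model, not code taken from the actual project:

```typescript
// pages/api/tours.ts: a minimal serverless API route (illustrative, not the project's actual code)
import type { NextApiRequest, NextApiResponse } from "next";
import { PrismaClient } from "@prisma/client";

// Prisma talks to the Supabase Postgres instance via DATABASE_URL
const prisma = new PrismaClient();

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
  if (req.method !== "GET") {
    return res.status(405).json({ error: "Method not allowed" });
  }
  // The Tour model is hypothetical; the real schema lives in prisma/schema.prisma
  const tours = await prisma.tour.findMany();
  return res.status(200).json(tours);
}
```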

🚀 Ramping Up the Project

To get started, I created a ReqAndPlan.md file containing:

  • My high-level requirements
  • The chosen tech stack
  • A basic implementation plan

I asked ChatGPT to help me structure this document clearly, then used it as context when setting up my project in Cursor AI.

After installing Cursor (which includes a free 2-week trial), I opened a new project folder, added the markdown file, and set it as the agent’s working context. Then I gave it a simple prompt:

 “Set up the project with the specified tech stack.”

The result was impressive.

Cursor handled:

  • The entire project configuration
  • Boilerplate code for a responsive web app with a decent design, including:
    • A top navigation bar
    • A static homepage featuring sample tours
    • A static contact page
    • Menu items for Tours and Bookings (initially empty)

Where manual intervention was needed:

  • Installing a few external dependencies 
  • Setting up environment variables
  • Adjusting permissions for the agent

Still, all these were minor. Everything worked smoothly, and within less than an hour, I had a functional skeleton app up and running — visually and structurally similar to a static WordPress site.

Here’s the final output from the Cursor AI agent.

🏗️ Project Development

From this point on, I started implementing features one by one. My goal was to stay as high-level as possible — writing natural-language instructions to the AI agent, instead of coding directly.

I didn’t strictly follow the initial development plan. Like in any real-world project, priorities shifted along the way.

Rather than go through every single feature, I’ll focus on the most relevant takeaways: what worked, what didn’t, and where Cursor fits best (IMO) in a developer’s workflow.

By the way, once the free trial expired, I chose to continue with a paid subscription — a small spoiler that I found value in the tool.

You can follow the actual feature evolution in my GitHub repo (more up to date than the original ReqAndPlan.md file).

🚢 Production Deployment

Once I had a basic version of the app working locally, I moved on to production deployment. 

Cursor was helpful here — but not without issues. The deployment setup required a lot more hands-on guidance than the initial app scaffolding.


✅ What Worked
  • Cursor guided me through building the deployment pipeline with GitHub Actions.
  • It generated initial config files and added helpful comments in the code.
  • It responded well when I asked for debugging utilities, like:
    • A script to test the database connection (a sketch follows this list)
    • Log statements for the API responses
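For reference, the connection-test script is the kind of utility that takes one prompt and a quick review. Here's an illustrative reconstruction, not the exact code Cursor generated:

```typescript
// scripts/test-db-connection.ts: sanity-check the database connection (illustrative sketch)
import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

async function main() {
  // A trivial query is enough to prove DATABASE_URL and the network path to Supabase work
  const result = await prisma.$queryRaw`SELECT 1 AS ok`;
  console.log("Database connection OK:", result);
}

main()
  .catch((err) => {
    console.error("Database connection failed:", err);
    process.exit(1);
  })
  .finally(() => prisma.$disconnect());
```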

🛠️ What Needed Manual Intervention
  • Some of the deployment instructions didn't match the actual UIs of Vercel or Supabase — I had to figure out the updated variable names and menu locations myself.
  • The environment variables Cursor suggested weren't entirely correct, and it attempted to store secrets in public config files. When I flagged this, it explained the correct way to manage them — but didn't implement it properly (see the sketch after this list).
  • After a few failed attempts, it began looping through the same suggestions. I had to step in, debug the configuration manually, and point it in the right direction.
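For env variables, a safe baseline in this stack is to keep secrets in a git-ignored .env.local (plus Vercel's project settings) and fail fast when anything is missing. Below is a minimal sketch; the variable names are the common Prisma/Supabase defaults and may differ from the actual project:

```typescript
// lib/env.ts: centralize env access so secrets never end up in committed config files (sketch)
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

export const env = {
  databaseUrl: requireEnv("DATABASE_URL"),
  supabaseUrl: requireEnv("NEXT_PUBLIC_SUPABASE_URL"),
  // Server-only secret: it must not get the NEXT_PUBLIC_ prefix, or Next.js will expose it to the browser
  supabaseServiceKey: requireEnv("SUPABASE_SERVICE_ROLE_KEY"),
};
```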

Despite these issues, I managed to get everything deployed — but it definitely required developer experience and attention to detail. This is not something I’d expect to be done correctly by someone inexperienced.

🎨 UI Adjustments

One area where Cursor really shined was in handling UI changes. When given clear, precise instructions, it was often fully autonomous — making correct changes and updating styles without any direct coding from my side.


✅ What Worked

Cursor responded well to most UI instructions — whether they were precise or just described the intended functionality.

It made coordinated changes across relevant components without needing detailed file references. For example, when asked:

Change the site color theme from purple to green
…it updated styles consistently across the app.
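Under the hood, a change like that mostly means swapping Tailwind utility classes across many components, which is exactly the kind of tedious, mechanical edit the agent handles well. A hypothetical before/after, not the project's actual markup:

```tsx
// Before: className="rounded bg-purple-600 px-4 py-2 text-white hover:bg-purple-700"
// After the "purple to green" instruction, the same components end up with:
export function PrimaryButton({ label }: { label: string }) {
  return (
    <button className="rounded bg-green-600 px-4 py-2 text-white hover:bg-green-700">
      {label}
    </button>
  );
}
```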

When I pointed out visual bugs (like layout shifts or styling inconsistencies), it usually fixed them without further help — much like addressing a QA ticket.


🧪 Example Instructions That Worked Well

Three examples were captured as screenshots, each showing the Cursor instruction, Cursor's output, and the resulting UI.


🔍 My Involvement
  • I always did manual QA to verify the output.
  • When a change wasn’t quite right, I followed up with clarifications like:
    • “The layout looks better, but the top menu buttons are now misaligned.”
    • “Spacing is good. Add padding below each card”
  • Sometimes I used ChatGPT first to polish the instruction, then sent it to Cursor.

This was one of the areas where Cursor felt the most like a reliable junior front-end developer — fast, responsive to feedback, and often able to improve design choices on its own. It’s also where AI Agent Mode felt closest to delivering a real productivity boost: as long as the requirements were concrete, Cursor’s UI changes were fast and effective.

For me personally, this was also the area I was least familiar with, and where I would have otherwise spent the most time and effort if coding everything manually. Letting the AI handle the visual implementation while I focused on functionality made a clear difference.

🔐 Authentication & Straightforward App Behavior

For common, well-documented tasks like Google authentication or basic CRUD operations, Cursor performed surprisingly well — especially when given clearly written, functional requirements.

These types of tasks hit the sweet spot for AI-assisted development: predictable structure, clear goals, and limited cross-file dependencies.


✅ What Worked
  • Google authentication was almost fully autonomous:
    • Cursor generated the code
    • Walked me through the necessary external configuration steps
    • Produced a working login flow with minimal friction
  • CRUD operations (like managing tours or bookings) were mostly smooth:
    • I provided clear instructions — often refined with ChatGPT
    • Described what functionality I needed and which files were involved
    • Cursor created the required API routes and UI components

For example, a request like:

Create a form to add new tours, with fields for title, location, image, and description.

…resulted in fully functional code that only needed minor QA on my end. The only issue was that it initially saved images to a local folder, which didn’t work in production — I had to follow up for a proper fix.
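A proper fix in this stack typically means sending uploads to object storage instead of the local filesystem, since files written inside Vercel's serverless functions don't persist. Here's a sketch of that approach using Supabase Storage; the route path, bucket name, and naming scheme are hypothetical, and the actual fix in the repo may differ:

```typescript
// pages/api/upload-tour-image.ts: store uploads in Supabase Storage instead of a local folder (sketch)
import type { NextApiRequest, NextApiResponse } from "next";
import { createClient } from "@supabase/supabase-js";

// We need the raw bytes, so disable Next.js body parsing for this route
export const config = { api: { bodyParser: false } };

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY! // server-side key, never shipped to the browser
);

// Collect the raw request stream into a Buffer
async function readBody(req: NextApiRequest): Promise<Buffer> {
  const chunks: Buffer[] = [];
  for await (const chunk of req) chunks.push(Buffer.from(chunk));
  return Buffer.concat(chunks);
}

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
  if (req.method !== "POST") return res.status(405).end();

  const file = await readBody(req);
  const fileName = `tours/${Date.now()}.jpg`; // hypothetical naming scheme

  const { data, error } = await supabase.storage
    .from("tour-images") // hypothetical bucket
    .upload(fileName, file, { contentType: "image/jpeg" });

  if (error) return res.status(500).json({ error: error.message });
  return res.status(200).json({ path: data.path });
}
```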


🧪 Example Instruction That Didn’t Work Well

Cursor instruction:

UI result: Instead of implementing an edit page, Cursor duplicated the entire Tours component with a slightly modified title. The logic was incorrect and the structure cluttered.

What Went Wrong

The instruction bundled together multiple functional and UI changes — page routing, form setup, CRUD logic, and confirmation handling — without a clear separation of concerns.

It was simply too much to ask in a single step. In hindsight, I should have broken it down into smaller, focused tasks (e.g., first create the edit page layout, then wire up each input, then handle delete separately).

🔍 What Helped
  • Writing clear, structured instructions, like a short user story or Jira ticket
  • Giving it context hints, such as which files might be involved
  • Following up with QA-level feedback, like error messages or screenshots

🧱 Adding More Complex Behavior

When I moved beyond basic CRUD and started implementing multi-step logic, Cursor began to show its limits. 

My goal was to introduce a review and rating system for tour guides — something that spanned multiple entities, views, and API layers.

My first attempt was to define the entire feature in one instruction.


🧪 1st Attempt: Large User Story

Cursor Instruction:

“Create a public profile page at /guides/[id] that displays information about a tour guide. The guide profile should be visible to all users, and include: profile picture, name, location, languages spoken, years of experience, specialties, and a detailed bio. Also include a list of reviews for this guide. Show their average rating and number of reviews.”

“Allow logged-in users to leave a review for the guide on this page. Each review should include a 1–5 star rating, a comment, and be linked to the user who wrote it. Store reviews in a guide_reviews table.”

Code Result: 

After generating and replacing a large amount of code, Cursor ran into multiple linter errors and began looping through the same fixes repeatedly — about every 4–5 cycles, and not the “3 attempts” it claims in the UI. It even searched Prisma documentation mid-way but still failed to resolve the problem (it was not a Prisma issue).

Worse, the more it tried to fix things, the more it broke previously working code.

Eventually, it asked for my help.

Its root cause analysis was mostly off — it only correctly flagged the issue with the reviewer relation. 

After some debugging, my first instinct was to rewrite the feature manually. In the end, I reverted all the changes, but decided to give AI another shot — just with a different, more structured strategy.

❗ What Went Wrong:
The instruction covered too much — UI routing, form logic, database schema, review rules, and access control — all in one pass. Cursor struggled to manage this scope and maintain context, and the result became hard to debug, review, or trust. It was a clear case of needing to break down complexity into smaller, guided steps.

🔁 2nd Attempt: Breaking It Down Into Steps

After reverting the first attempt, I decided to approach the feature like I would with a junior developer: break the task into isolated, testable chunks, give feedback often, and commit progress incrementally.

Here’s how it went:

Split the Tours page

  • Instruction: “Split the existing Tours page into Current and Past Bookings”
  • Worked smoothly and required no follow-up. Here’s the code change result.

Added needed fields to DB:

  • I analyzed the DB schema, identified the needed extra fields
  • I instructed Cursor to add them with precise details (tables, fields, types and relations)
  • No issues during migration

Create the review submission API:

  • Cursor generated most of the logic correctly, but after reviewing it I identified a business logic issue.
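To give a sense of the shape of this route, here's an illustrative sketch; the model and field names are hypothetical, and the real implementation lives in the repo:

```typescript
// pages/api/guides/[id]/reviews.ts: review submission route (illustrative names)
import type { NextApiRequest, NextApiResponse } from "next";
import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
  if (req.method !== "POST") return res.status(405).end();

  const guideId = req.query.id as string;
  const { userId, rating, comment } = req.body; // in the real app, the user comes from the session

  // Ratings are 1 to 5 stars
  if (typeof rating !== "number" || rating < 1 || rating > 5) {
    return res.status(400).json({ error: "Rating must be between 1 and 5" });
  }

  // The kind of business rule that is easy to miss: only users with a past booking
  // for this guide should be allowed to review them (models are hypothetical)
  const pastBooking = await prisma.booking.findFirst({
    where: { userId, guideId, date: { lt: new Date() } },
  });
  if (!pastBooking) {
    return res.status(403).json({ error: "Only past customers can review this guide" });
  }

  const review = await prisma.guideReview.create({
    data: { guideId, userId, rating, comment },
  });
  return res.status(201).json(review);
}
```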

Build the client-side form:

  • A working form was added to allow submitting reviews for past bookings
  • Linked automatically to the previously generated API route. Here’s the code change result.

Create API to calculate average ratings:

Create API to retrieve guide profile data:

  • It generated the correct API route
  • Bonus: it also created the corresponding .tsx page — not requested, but functional. Here’s the code change result.

Add API to retrieve all reviews for a guide:

  • A minor fix was required for the field mappings

Here’s the code change result.

Finally, build the UI to display a guide’s reviews:

  • Initially, the page looked correct in code — I could see the <Link> element in the JSX
  • However, the link wasn’t showing in the browser
  • After I described my exact actions, the behavior I observed, and noted that the element was present in the code but missing from the rendered DOM (checked via browser dev tools), Cursor correctly identified the missing import and fixed the issue on its own
  • Here’s the code change result

💡 Key Learning
  • Breaking the work into atomic instructions gave Cursor a clear path to follow
  • I committed after each successful step — which made it easy to revert if needed
  • Feedback worked best when framed like a QA report: “This action triggers X, but I expected Y”

✨ Code Quality, Performance & Security 

Once the core features were working, I started exploring whether Cursor could help me improve code quality, performance, and security. While it did generate suggestions, the results were mixed — and at times, risky.

✨ Code Quality

Cursor frequently rewrote larger parts of the codebase — even when I only asked for small changes. This made ongoing code review essential.

Overall, it produced mostly decent code, but:

  • It lacked consistency across files and components
  • It occasionally replaced better, previously reviewed implementations with overly simplified versions
  • I couldn’t trust its changes without manually reviewing and retesting everything

What frustrated me most was its tendency to overwrite already validated, higher-quality code — even without being prompted to touch those areas. I gave up on reviewing and maintaining these changes for my startup project. But for a corporate or production system, that approach simply wouldn’t be acceptable.

While this behavior may improve in future versions, I still find it difficult to guide the AI on when to treat existing code as “gospel” and when it’s safe to modify it. Even among senior developers, best practices are often debated — so expecting an AI to consistently get it right is a tall order.

At best, it might align with the highest standard it finds in the codebase.

Review for Performance

My instruction: “review the current implementation for performance”.

Below its output:

It applied improvements in the files that (I assume) were in its current context — but not consistently across the codebase. Nothing broke, which was a win.

🧱 Review for Security Concerns

My instruction: “review the current implementation for security concerns”.

Below its output:

Cursor didn’t apply any changes automatically — instead, it listed several recommendations and asked whether I wanted to apply them one by one.

When I approved the suggestion related to environment variable handling, things went downhill quickly. It crashed the app, then tried to “fix” the issue by altering unrelated parts of the codebase — including my database schema. At one point, it even attempted to wipe and regenerate parts of the DB configuration.

I had to fully revert the changes.

❗ What Went Wrong:

The suggestions themselves weren’t necessarily bad — they could serve as a good starting point. But applying them blindly (even with approval) caused cascading issues — and once it goes off-track, it tends to snowball. 

For anything involving security, I’d recommend using it strictly as an advisory tool — useful only if you know what you’re doing and can apply the suggestions safely yourself.

🐛 Tests & Bug Fixing

Bug fixing with Cursor was a mixed experience. Sometimes it nailed issues on the first try — other times, it spiraled into unnecessary changes or overly aggressive refactoring.


✅ What Worked

Cursor performed well when I:

  • Provided precise error messages (e.g., linting or runtime errors)
  • Explained the observed behavior across environments
  • Pointed out what I did and what I expected to happen
  • Asked it to generate logs or assist in interpreting them

It even has a fully autonomous mode where it can run commands and inspect the output itself — though I kept that disabled to maintain more control. I found this helped avoid situations where it would change multiple files at once without clear visibility into why, making it difficult to follow and review at the end.

Example:

🧱 What Didn’t Work
  • When asked to add functional tests (e.g., for the review submission API), it went into a loop mocking various parts of the app without producing a usable test. After many failed attempts, I gave up on trying to collaborate with Cursor on this.
    • Possibly, TDD or starting from a smaller codebase would have yielded better results — I’ve seen examples online suggesting it works better in those contexts. A sketch of what that smaller scope might look like follows this list.
  • Field mismatches (e.g., reviewer_id vs. user_id) often caused it to go off-track — and its fixes sometimes made things worse
  • When it couldn’t solve a problem within a few tries, it would loop endlessly or start changing unrelated parts of the project
  • It frequently proposed refactoring entire modules to resolve simple bugs, sometimes introducing new performance or security risks
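To illustrate what a smaller, focused scope could look like, here's the kind of pure-logic test that's far easier to get right, whether written by the AI or by hand (the helper and its rules are hypothetical):

```typescript
// lib/reviews.test.ts: a small, focused test target (Vitest; the helper is hypothetical)
import { describe, it, expect } from "vitest";

// Pure logic extracted from the API route: no mocks, no database, no HTTP
function canSubmitReview(rating: number, hasPastBooking: boolean): boolean {
  return hasPastBooking && Number.isInteger(rating) && rating >= 1 && rating <= 5;
}

describe("canSubmitReview", () => {
  it("accepts a valid rating from a user with a past booking", () => {
    expect(canSubmitReview(4, true)).toBe(true);
  });

  it("rejects ratings outside the 1-5 range", () => {
    expect(canSubmitReview(0, true)).toBe(false);
    expect(canSubmitReview(6, true)).toBe(false);
  });

  it("rejects users without a past booking", () => {
    expect(canSubmitReview(5, false)).toBe(false);
  });
});
```
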
❌ When It Claimed Success But Wasn’t

There were several cases where Cursor said the problem was fixed, but it wasn’t. Notably:

  • I asked it to ensure consistent section widths across pages. It made some layout adjustments and confirmed the issue was fixed — but nothing ever changed visually in the browser. Despite trying everything I could think of to help, Cursor never managed to fix it.
  • In another case, it falsely claimed that reviews and associated bookings would be deleted on cascade — both in the code comments and the UI message — yet it didn’t implement any of the necessary logic or constraints.

💡 Takeaway

Cursor is most helpful when used in a QA-style debugging loop:

  1. Observe the issue
  2. Describe your steps and environment
  3. Share what you expected vs. what happened
  4. Provide logs or visible output when needed

It’s still not reliable enough to fully drive the debugging or testing process on its own — especially for complex flows or codebases that have grown beyond a few tightly scoped files.

What Cursor AI Did Well

These were the areas where Cursor consistently added value and saved me time:

  • 🧱 Project setup — Generated a full working skeleton app from a high-level plan, with good defaults for layout and navigation
  • 🔐 Google authentication — Handled both frontend logic and external config with minimal friction
  • 🎨 UI adjustments — Responded well to functional instructions, adapted layouts and styles across components, and often filled in missing details sensibly
  • 🧑‍💻 CRUD features — Built standard operations quickly and with minimal hand-holding, especially when I provided clear, structured instructions

In short, Cursor worked best on well-scoped, conventional tasks, especially when the context was clear and the technology stack was popular.

🛠️ What Didn’t Work So Well

While Cursor shined in well-scoped tasks, it struggled in areas that required stability, precision, or nuanced decision-making.

  • Context wasn’t preserved after closing the app — I had to reintroduce what we were working on every time.
  • Unwanted changes appeared after reopening the IDE — sometimes even schema edits I never requested. In the picture below, “History restored” marks the reopening of the IDE; I sent no other instruction in between.
  • Vague instructions led to scope creep — it often changed unrelated code (e.g., UI elements, page titles, or styles) unless I explicitly said “don’t touch X.”
  • Complex, multi-layered tasks failed unless broken into small steps — otherwise, it got stuck or made messy changes.
  • Test generation was unreliable, and fixes sometimes caused regressions or addressed the wrong thing.
  • It often said things were fixed when they weren’t — like UI layout or cascade deletion logic, which were never actually implemented.

Overall, it’s helpful — but only if you stay in control, break down tasks clearly, and double-check everything.

🧱 Things I Don’t Expect to Improve Soon

Some limitations felt more fundamental — unlikely to improve much in the short term:

  • Autonomous deployment setup is too risky
    → It introduced misconfigurations, exposed secrets, and tried to “fix” issues by modifying unrelated parts of the app.
  • Too much freedom with vague instructions
    → “Clean up the test connection file” led to Cursor wiping the entire DB setup and reconfiguring it with local Postgres.
    → “Update the admin dashboard by adding a list users menu” caused it to duplicate and mis-nest the entire dashboard structure.
  • Inconsistent code quality in larger projects
    → It rewrote significant portions of the app unnecessarily — sometimes simplifying well-structured code or breaking previously stable flows. For a fast-moving startup prototype, I could work with it. In a complex production-grade system, I think it would complicate my work rather than help.

Conclusion

My Overall Feeling

Working with Cursor AI was like managing a junior developer. To get good results, I had to:

  • Provide detailed specs and design, file references, and expected outputs
  • Point to probable causes when it got stuck — or step in and debug myself
  • Review and revert changes when it misunderstood the scope
  • QA features across environments, including happy paths, and give precise feedback

It reminded me of mentoring an enthusiastic junior: if you explain clearly and review thoroughly, you’ll get decent results. If not, you’ll end up cleaning up the mess.

But in the end, I got a fully functional web app, built with a tech stack I was rusty with, in around 24 hours of total work.

Where AI Shines

I’ll continue to use Cursor in contexts where speed and flexibility matter more than polish:

Rapid prototyping — quickly validate startup ideas using familiar, well-supported tech
Front-end development — especially when functionality matters more than pixel-perfect design
Agent “ask” mode — to explore unfamiliar parts of the codebase with better context than a standalone LLM

Is It a Threat to Engineering Jobs?

I don’t think so.

🙅‍♂️ I wouldn’t use the AI agent for enterprise-grade code, where I’d expect review time to outweigh any gains
🙅‍♂️ I wouldn’t rely on it for maintainable, secure, or performance-critical systems
🙅‍♂️ I wouldn’t trust it with large-scale refactors that span many files and dependencies
🙅‍♂️ And generally, I don’t see the benefit of using it for tasks where I’m already familiar with the technology. In many cases, writing a precise instruction, then reviewing and debugging the result, takes just as much time — if not more — than coding it myself.

After all, programming languages exist to express precise instructions to machines. Natural language is more flexible — but also vague and open to interpretation, which makes misunderstandings more likely.

“Computers are good at following instructions, but not at reading your mind.” — Donald Knuth

That said, for early-stage ideas and unfamiliar areas — like front-end stacks I don’t master — it’s a useful assistant. It helps me move fast without getting blocked.

Tips for Working with Cursor AI

Based on my experience, here’s what helped:

Use a spec file that defines architecture and goals to ramp up the project

Use story-like prompts that reference real files or components

Be specific — “Fix X in file Y. Do not touch Z” works far better than “Clean this up”

Avoid vague verbs like “refactor,” “fix,” or “clean” unless tightly scoped

Always define scope and constraints up front — say what to change, and what not to touch

Break down complex tasks into atomic steps — especially those involving UI + API + DB

Give Cursor feedback the way a QA teammate would, step by step: “I did X, expected Y, but saw Z”

Don’t trust autosuggestions blindly — always review the diff before accepting

Commit often so you can roll back easily if something breaks

Use a separate branch or feature flag when testing larger changes

Use ChatGPT or prompt engineering tools to refine instructions before sending them to Cursor

Avoid relying on test generation unless starting with TDD in a small, focused scope

Final Thoughts

This project helped me understand AI coding assistants for what they really are:
Not geniuses — but tireless juniors. Fast, curious, eager to help… but often missing the bigger picture.

Would I recommend Cursor? Absolutely — for indie developers, startup prototypes, or exploring unfamiliar stacks.

But would I let it commit directly to production?

Not yet.

Tried something similar? Share your setup, what worked for you, or where it fell short.
