
Over the past nine months, I've embarked on an intensive journey through the landscape of AI-assisted development, creating nine working applications using the major vibe coding tools and frameworks available—from the free tiers of Cursor and bolt.new to the premium versions of Cursor, Windsurf and Warp. What started as curiosity about these emerging tools evolved into an exploration of how artificial intelligence is reshaping software development, revealing both extraordinary possibilities and subtle dangers that could fundamentally alter how we approach our craft.
This article emerges from that hands-on experience—the successes that compressed months of work into hours, the failures that taught me why human oversight remains irreplaceable, and the evolving understanding of where AI assistance enhances creativity versus where it can inadvertently stifle it. These aren't theoretical observations about AI coding tools; they're insights from someone who has used these systems across diverse project types, from enhancing complex RAG-based chatbots to building intuitive map interfaces to developing demos for customer vulnerability detection.
As artificial intelligence transforms software development at an unprecedented pace, we find ourselves at a critical juncture where the promise of instant gratification meets the reality of sustainable engineering practices. The emergence of "vibe coding"—a term coined by OpenAI co-founder Andrej Karpathy in early 2025—represents both the incredible potential and hidden dangers of our AI-powered development future[1][2]. While these tools can dramatically accelerate innovation cycles, the central challenge remains: how do we harness their power without losing the essential human element that ensures quality, maintainability, and true understanding of our code?
The current ecosystem of AI-assisted development tools spans a remarkable spectrum, each serving different needs and philosophies. My experience across nine different projects has revealed that the tool's effectiveness depends heavily on the nature of the problem you're trying to solve.
At one end, we have cloud-based builders like Bolt.new and Lovable.ai that promise instant application deployment from simple natural language prompts[3][4]. I discovered this firsthand when attempting to build a "How rich are you?" personal finance application—a project that perfectly illustrated both the strengths and limitations of these platforms. Within minutes, both tools had generated beautiful, responsive user interfaces that would have taken me days to create manually. The UI components were polished, the user experience was intuitive, and the visual design exceeded my expectations.
However, the magic ended when the application needed to perform actual data mining and complex backend operations. The sophisticated financial calculations, data integration from multiple sources, and the nuanced logic required for accurate wealth assessment simply couldn't be generated by these platforms[5][6]. They excel at creating that magical "wow" moment where an idea becomes a working prototype in minutes, but they fall short when it comes to the complex requirements, security considerations, and architectural decisions that production software demands[4][5].
Moving up the sophistication ladder, we encounter serious development environments like Cursor and Windsurf, which have emerged as the current leaders in AI-assisted coding[7][8]. My experience with these tools spans both brownfield enhancement and greenfield development, revealing different strengths in each context.
For brownfield development, I used Cursor to enhance an existing RAG-based chatbot application—a well-architected system I had previously designed using my own patterns and principles. Cursor proved exceptional at understanding the existing code components (when pointed to them) and organically integrating new features like GraphRAG capabilities and AdaptiveRAG concepts that the LLMs had learnt during their pre-training. The AI could maintain consistency with my established patterns while introducing sophisticated enhancements that would have required significant research and implementation time.
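To give a flavour of the kind of enhancement involved, here is an illustrative sketch of an adaptive-RAG query router—the idea of choosing a retrieval strategy per query. The function name, keywords, and heuristics are hypothetical simplifications for this article, not the code Cursor actually produced:

```python
# Illustrative adaptive-RAG routing: pick a retrieval strategy based on
# simple features of the query. Real implementations typically use an
# LLM classifier or learned router; these heuristics are placeholders.

def route_query(query: str) -> str:
    """Choose a retrieval strategy for a user query."""
    q = query.lower()
    # Multi-hop, relationship-style questions suit graph traversal (GraphRAG).
    if any(kw in q for kw in ("related to", "connection between", "who knows")):
        return "graph"
    # Short factual lookups are usually fine with plain vector search.
    if len(q.split()) <= 6:
        return "vector"
    # Everything else: retrieve with both and let a reranker decide.
    return "hybrid"

print(route_query("connection between invoice X and supplier Y"))  # → graph
```

The design choice being illustrated is that retrieval strategy becomes a per-query decision rather than a fixed pipeline—the essence of the AdaptiveRAG concept.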
However, I eventually hit a wall. While Cursor excelled at enhancing existing functionality and adding incremental complexity, it struggled with implementing the more sophisticated algorithms I envisioned. The tool could handle the integration work beautifully, but when it came to translating cutting-edge research concepts into robust, production-ready code, the gap between AI capability and human expertise became apparent.
However, it was a different story for a small, front-end-heavy application. The my_locations project—a map-based UI application—was one of my most successful AI-assisted builds using Cursor. As someone who readily admits to being weak in frontend code, I was amazed at how the tool guided me through creating a polished, functional map interface. Cursor didn't just generate code; it taught me frontend patterns and best practices that I've since applied to other projects[9][10].
When I needed to complete the "How Rich are you?" application with proper backend functionality, Windsurf proved to be the right tool for the job. Its more sophisticated understanding of full-stack architecture and data processing capabilities allowed me to implement the complex financial calculations and data integration that the cloud builders couldn't handle.
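As a toy illustration of the kind of backend logic the cloud builders couldn't generate, consider ranking a user's net worth against distribution brackets. The bracket thresholds below are made-up illustration values, not real wealth data, and the function names are mine, not the app's:

```python
from bisect import bisect_right

# Hypothetical wealth-assessment sketch. Thresholds are invented for
# illustration; a real app would pull distribution data from a source
# like national wealth surveys.
WEALTH_BRACKETS = [  # (net worth upper bound, percentile band)
    (10_000, 25),
    (100_000, 50),
    (500_000, 75),
    (1_000_000, 90),
]

def net_worth(assets: dict[str, float], liabilities: dict[str, float]) -> float:
    """Net worth = total assets minus total liabilities."""
    return sum(assets.values()) - sum(liabilities.values())

def percentile(worth: float) -> int:
    """Map a net worth onto a coarse percentile band via binary search."""
    bounds = [b for b, _ in WEALTH_BRACKETS]
    i = bisect_right(bounds, worth)
    return WEALTH_BRACKETS[i][1] if i < len(WEALTH_BRACKETS) else 99

w = net_worth({"savings": 80_000, "home": 300_000}, {"mortgage": 150_000})
print(percentile(w))  # → 75
```

Even this trivial version needs decisions—bracket sources, currency normalisation, asset valuation dates—that a prompt-to-app builder has no basis to make, which is exactly where the full-stack tools earned their keep.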
Perhaps the most intriguing development in the AI coding space is GitHub's Spec-Kit framework, which introduces a structured approach to AI-assisted development[11][12]. My most comprehensive exploration of this framework came through developing artmind—my pet "second brain" project that I've iterated through multiple versions over the past months.
The initial experience with Spec-Kit combined with Windsurf was genuinely impressive. The framework's systematic approach—moving through Specify, Plan, Tasks, and Implementation phases—created a sense of architectural rigor that ad-hoc prompting lacks[11][13]. For artmind's first iteration, I watched as my high-level vision for a personal knowledge management system expanded into a comprehensive specification covering document processing, semantic search, and Q&A.
What fascinated me most was how the process became educational. As Spec-Kit generated hundreds of detailed tasks, it introduced me to Python tools and practices I hadn't encountered: ruff for linting, justfile for task automation, alembic for database migrations, and sophisticated pydantic models for data validation. The framework wasn't just generating code; it was exposing me to the broader Python ecosystem and modern development practices.
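The validated data models Spec-Kit pushed me towards looked roughly like the following. The real project used pydantic; this is a stdlib dataclass sketch so it stays self-contained, and the field names are hypothetical, not artmind's actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime

# Stdlib approximation of a validated document model. A pydantic
# version would declare the same fields and get validation for free;
# here __post_init__ plays the role of pydantic's validators.

@dataclass
class Document:
    title: str
    source_path: str
    tags: list[str] = field(default_factory=list)
    ingested_at: datetime = field(default_factory=datetime.now)

    def __post_init__(self) -> None:
        # Fail fast on bad input, like a pydantic validator would.
        if not self.title.strip():
            raise ValueError("title must be non-empty")
        if not self.source_path.endswith((".md", ".pdf", ".txt")):
            raise ValueError(f"unsupported source type: {self.source_path}")

doc = Document(title="Weekly notes", source_path="notes/2025-01.md")
print(doc.title)  # → Weekly notes
```

The point the generated tasks kept making: validate at the boundary, so bad data never reaches the processing pipeline.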
This educational aspect continued as I explored various agent development frameworks during the artmind iterations. I experimented with PydanticAI for structured agent interactions, Atomic for multi-agent coordination, and Instructor for reliable LLM outputs. Each framework revealed its own strengths and limitations—PydanticAI's type safety came with complexity overhead, Atomic's coordination capabilities struggled with state management, and Instructor's reliability sometimes came at the cost of flexibility.
However, the fundamental challenge with Spec-Kit became apparent across multiple artmind iterations. The framework's ability to expand a simple specification into hundreds of detailed tasks is genuinely impressive—it incorporates software engineering best practices, accessibility considerations, and robust error handling that individual developers might overlook[11]. But therein lies the problem: it moves so fast that it can easily outpace human comprehension and oversight.
By my second artmind iteration, I was dealing with thousands of lines of generated code across dozens of files. Even working feature by feature rather than attempting to build the entire application at once, the sheer volume of generated code became overwhelming. The disconnect between my original vision and the implemented solution grew with each iteration, creating a system that was technically sophisticated but increasingly alien to my understanding.

This overwhelming experience with rapid code generation led me to Warp terminal for my current artmind4 iteration, and the difference in development philosophy has been profound[14][15]. Rather than maximizing code generation speed, Warp focuses on maintaining developer connection to the development process[16][17].
Working on artmind4 through Warp felt fundamentally different from my previous iterations. I was able to use its features to enhance understanding rather than replace it. It provides better responses during error analysis and debugging, and it feels like it bites off only as much as it can chew in one prompt, doing more or less what I ask it to[15][18]. Instead of generating massive amounts of code that I struggle to comprehend, Warp took me through incremental development where I understood each step before proceeding.
What makes this approach particularly compelling for complex projects like artmind4 is Warp's sequential, terminal-based workflow that naturally enforces a more deliberate pace of development[14]. The tool encourages me to remain engaged with each step of the process, understanding the commands being executed and the changes being made[18]. This creates natural checkpoints where I can evaluate and understand each modification before proceeding—a stark contrast to the overwhelming code dumps from my Spec-Kit experiments.
The platform's integration with team knowledge through Warp Drive adds another layer of contextual awareness, allowing AI assistance to be informed by project-specific workflows and documentation[16]. For artmind4, this means the AI understands not just general Python patterns, but the specific architectural decisions and design principles I've established for this iteration.

Note that different LLMs—Claude Sonnet 4.x, GPT 4.x and 5.x, Gemini/Gemma—each have a distinct personality, and the choice of model has a huge impact on code quality and the SWE workflow. However, that discussion is for another day (or article).
Recent research reveals a troubling paradox in AI-assisted development that my experience confirms. While usage continues to climb—with 84% of developers now using AI tools—trust in these systems is declining[19][20]. Stack Overflow's 2025 Developer Survey shows that trust in AI accuracy has fallen from 43% to just 33% over the past year, even as adoption rates continue to rise[19][21]. This decline in trust isn't arbitrary—it's based on real experience with the limitations of current AI systems. The most common frustration, cited by 45% of developers, is dealing with "AI solutions that are almost right, but not quite"[19]. I've encountered this phenomenon repeatedly across my projects. The code appears functional, passes initial tests, but contains subtle logical errors or architectural decisions that create problems down the line[22][23].
The importance of maintaining true code ownership in an AI-assisted environment cannot be overstressed. Traditional ownership models assumed that whoever committed code had written and understood it. AI assistance breaks this assumption, creating situations where developers may submit code they don't fully comprehend. My approach is to treat AI-generated code with a duality of enthusiasm and scepticism: enthusiasm as I learn new frameworks and clever patterns, quickly tempered by a sceptic's hat that applies more scrutiny than I would to code from trusted colleagues[24]. For each significant piece of AI-generated code, I ensure I can explain every line, debug any issues that arise, justify the design decisions made, and maintain the code long-term[25]. This means regularly stepping back from the rapid pace of AI generation to truly understand what's being created.
During my artmind iterations, I learnt the hard way that accepting AI-generated code without this level of understanding creates technical debt that compounds quickly. I've adopted a rule: if I can't explain why the AI chose a particular approach, I either research it until I understand or request a different implementation that aligns with patterns I comprehend.
The legal implications add another layer of complexity that became relevant as my projects grew in sophistication. Current copyright law requires human authorship for protection, meaning code generated primarily by AI may not be eligible for copyright protection[26][27]. For commercial projects, we need to maintain detailed records of the prompts, modifications, and decision-making processes to document human creative input[27][28].
Through these hands-on examples, I've developed a practical framework for effective AI-assisted development. The human builder should serve as the navigator, making architectural decisions and maintaining strategic oversight, while AI functions as the driver, implementing specific solutions under human guidance[29][30].
For the projects that succeeded, I provided rich contextual information rather than relying on the AI alone to infer requirements and technical decisions. Things worked for a while when I gave the AI a free hand, but I never let it run ahead of me for too long: I would terminate that branch of code and restart from a checkpoint where I explicitly shared relevant codebase patterns, architectural decisions, and project constraints[29]. This contextual grounding helped the AI generate more appropriate solutions and reduced the likelihood of suggestions that appeared functional but violated established patterns[31].
It is therefore necessary to embrace iterative development and refinement rather than expecting immediate results. Starting with rough implementations and gradually refining them allows for better oversight and understanding[29][30]. This approach proved crucial during my RAG chatbot enhancements, where each iteration built upon previous understanding rather than attempting massive changes all at once.

Despite the challenges, the potential for AI tools to accelerate innovation cycles remains one of their most compelling benefits. My vulnerability detection demo exemplifies this perfectly—turning what would have been a month-long development project into a one-hour exercise. The challenge was to create a system that could analyze customer service chat transcripts and identify when agents encountered vulnerable customers—situations requiring special handling protocols. Using Windsurf, I was able to rapidly prototype a solution that incorporated natural language processing and pattern recognition. The speed of development was genuinely remarkable; what would have taken weeks of traditional development was accomplished in a single focused session.
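A toy sketch of the demo's core idea—scanning transcripts for cues that a customer may be vulnerable—might look like the following. The real demo used richer NLP; these cue phrases, categories, and function names are illustrative only:

```python
import re

# Illustrative vulnerability-cue scanner. A production system would use
# classification models and agreed handling protocols; this keyword
# approach only sketches the shape of the problem.
VULNERABILITY_CUES = {
    "financial": [r"can'?t afford", r"lost my job", r"behind on payments"],
    "health": [r"recently diagnosed", r"in hospital"],
    "bereavement": [r"passed away", r"my late (husband|wife|partner)"],
}

def flag_transcript(transcript: str) -> list[str]:
    """Return the vulnerability categories whose cues appear in the text."""
    text = transcript.lower()
    return [
        category
        for category, patterns in VULNERABILITY_CUES.items()
        if any(re.search(p, text) for p in patterns)
    ]

print(flag_transcript("I lost my job last month and can't afford this bill"))
# → ['financial']
```

Even a skeleton like this is enough to demonstrate feasibility to stakeholders, which was precisely the point of the one-hour exercise.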
The key insight from my experience is that this acceleration is most valuable in specific contexts: proof-of-concept development, rapid prototyping, and demonstrating feasibility to stakeholders. In these scenarios, the ability to quickly materialize ideas into working demonstrations significantly reduces the time between concept and validation[3][35]. This is a genuine benefit and can be applied TODAY in all product development scenarios.
However, this same acceleration becomes problematic when applied to complex, long-term projects without appropriate guidance and safeguards. The "move fast and break things" mentality that works for prototypes introduces significant technical debt and maintenance challenges in more sophisticated applications[36][37]. Most AI coding assistants will fail at enterprise scale because they lack understanding of complex, interconnected systems and organization-specific patterns.
During my artmind development, I encountered the challenge of maintaining consistency across multiple AI-generated modules. Generic pattern suggestions from public repository training data sometimes violated the architectural decisions I had established earlier in the project[33][34]. This experience illustrated why enterprises need clear guidelines and standardized approaches to prevent conflicting tools and practices from creating integration challenges. In fact, this suggests a new responsibility for Architects, Designers, and Code Owners within the enterprise: encoding the organization's standards and patterns into the SWE agentic prompts or prompt frameworks themselves.
As I continue developing artmind4 and planning future projects, I've established personal principles that guide my use of AI coding tools. These aren't theoretical guidelines—they're practical rules derived from real successes and failures across diverse development challenges.
Never let the AI run ahead of your understanding. Whether working with cloud builders for rapid prototyping, serious development environments for production code, or terminal-based tools for controlled development, I maintain my role as architect and steward of the systems I create. The most sophisticated AI tools enhance rather than replace human judgment, creativity, and oversight.
The tools will continue to evolve, becoming more powerful and more persuasive in their suggestions. My responsibility as a "builder" is to evolve alongside them, developing the judgment to know when to embrace their assistance and when to assert human control. AI becomes a force multiplier for human creativity when used thoughtfully, but it can quickly become a replacement for human understanding if we're not careful.
In this balance between human insight and machine capability lies the path to truly transformative software development—development that harnesses the power of AI without sacrificing the wisdom that only human experience and creativity can provide, and that keeps human oversight embedded throughout. We can unlock faster, richer, and more meaningful systems, tackling previously insurmountable problems. The need of the hour is to foster a culture of adaptability and to continuously evolve our ways of working with AI tools. By navigating this transition thoughtfully, we can reach a future where technology enhances human capabilities, drives sustainable progress, and amplifies human ingenuity rather than diminishing it.
Stay curious, keep learning, keep creating!
Sources
[1] Vibe coding https://en.wikipedia.org/wiki/Vibe_coding
[2] What is vibe coding? | AI coding https://www.cloudflare.com/learning/ai/ai-vibe-coding/
[3] Top 10 Vibe Coding AI Tools Every Developer Needs in 2025 https://dev.to/devland/top-10-vibe-coding-ai-tools-every-developer-needs-in-2025-29pf
[4] Lovable vs. Bolt vs. Bubble: AI Builders Compared https://bubble.io/blog/lovable-vs-bolt-vs-bubble-comparison/
[5] Lovable vs. Bolt: Which AI coding tool is best? [2025] https://zapier.com/blog/lovable-vs-bolt/
[6] Bolt AI vs Lovable AI: Which AI App Builder Deserves Your ... https://uibakery.io/blog/bolt-ai-vs-lovable-ai
[7] Cursor vs Windsurf: An In-Depth Comparison https://www.appypievibe.ai/blog/cursor-vs-windsurf-ai-code-editor
[8] Cursor vs Windsurf: Full AI Coding Tool Comparison https://www.sevensquaretech.com/windsurf-vs-cursor-features-comparison-coding/
[9] Cursor vs Windsurf: AI Coding Assistant Comparison https://www.tembo.io/blog/cursor-vs-windsurf
[10] Windsurf vs Cursor: Which AI IDE Tool is Better? https://www.qodo.ai/blog/windsurf-vs-cursor/
[11] GitHub Spec Kit: A Guide to Spec-Driven AI Development https://intuitionlabs.ai/articles/spec-driven-development-spec-kit
[12] A look at Spec Kit, GitHub's spec-driven software ... https://ainativedev.io/news/a-look-at-spec-kit-githubs-spec-driven-software-development-toolkit
[13] Diving Into Spec-Driven Development With GitHub Spec Kit https://developer.microsoft.com/blog/spec-driven-development-spec-kit
[14] Warp Wrapped: 2024 in Review https://www.warp.dev/blog/2024-in-review
[15] Warp: a new AI-based terminal you should try out https://daily.dev/blog/warp-a-new-ai-based-terminal-you-should-try-out
[16] All Features https://www.warp.dev/all-features
[17] Warp: AI: Natural‑Language Coding Agents https://www.warp.dev/warp-ai
[18] Warp AI Terminal: A Beginner's Guide to the Future of ... https://dev.to/arjun98k/warp-ai-terminal-a-beginners-guide-to-the-future-of-command-line-interfaces-43k1
[19] Developers remain willing but reluctant to use AI: The 2025 ... https://stackoverflow.blog/2025/07/29/developers-remain-willing-but-reluctant-to-use-ai-the-2025-developer-survey-results-are-here/
[20] Trends #8: Developers use AI more, but they trust it much less https://newsletter.techworld-with-milan.com/p/trends-8-developers-use-ai-more-but
[21] Most developers use AI in their daily workflows - but they ... https://www.zdnet.com/article/most-developers-use-ai-daily-in-their-workflows-but-they-dont-trust-it-study-finds/
[22] The Productivity Paradox of AI Coding Assistants https://www.cerbos.dev/blog/productivity-paradox-of-ai-coding-assistants
[23] Trust in AI coding tools is plummeting https://leaddev.com/technical-direction/trust-in-ai-coding-tools-is-plummeting
[24] How Do I Maintain Code Ownership When Using AI? https://zenvanriel.nl/ai-engineer-blog/how-do-i-maintain-code-ownership-when-using-ai/
[25] Maintaining Code Ownership in the Age of AI Assistance https://zenvanriel.nl/ai-engineer-blog/maintaining-code-ownership-with-ai-assistance/
[26] Navigating the Legal Landscape of AI-Generated Code https://www.mbhb.com/intelligence/snippets/navigating-the-legal-landscape-of-ai-generated-code-ownership-and-liability-challenges/
[27] Think While You Are Using AI Coding https://www.bakerdonelson.com/think-while-you-are-using-ai-coding
[28] AI-Generated Code: Who Owns the Intellectual Property ... https://www.leadrpro.com/blog/who-really-owns-code-when-ai-does-the-writing
[29] Best practices for pair programming with AI assistants https://graphite.dev/guides/ai-pair-programming-best-practices
[30] Pair Programming with AI Coding Agents: Is It Beneficial? https://zencoder.ai/blog/best-practices-for-pair-programming-with-ai-coding-agents
[31] AI-Powered Coding Assistants: Best Practices to Boost ... https://www.monterail.com/blog/ai-powered-coding-assistants-best-practices
[32] AI Coding Assistant for Enterprises: Beyond GitHub Copilot https://fx31labs.com/ai-coding-assistant-enterprise-tools/
[33] AI Coding Assistants for Large Codebases: A Complete ... https://www.augmentcode.com/guides/ai-coding-assistants-for-large-codebases-a-complete-guide
[34] AI Coding Tools Are Not Enough. Here's What Enterprises ... https://www.codespell.ai/blog/ai-coding-tools-are-not-enough-heres-what-enterprises-actually-need
[35] Best Vibe Coding Tools & Why AI Agents Work Better with ... https://codehooks.io/blog/vibe-coding-tools
[36] The Productivity Trap https://matthewreinbold.com/2025/06/19/theproductivitytrap
[37] Leading Developer Productivity Tools in 2025 https://www.codeant.ai/blogs/leading-developer-productivity-tools
[38] Limitations of AI Coding Assistants: What You Need to Know https://zencoder.ai/blog/limitations-of-ai-coding-assistants