Building this site was also an experiment to evaluate AI-augmented coding paradigms. Biased by my long tenure at AWS, I chose Kiro.dev, with its spec-driven development approach, to run this experiment. The fact that I was an early pre-GA adopter with a generous grant of 1,000 free task credits also helped. In this article, I share what worked, what didn’t, and how I made Kiro effective for my needs.

The Kiro Promise

[Image: My artistic impression of the Kiro logo]

Kiro’s product promise is to help developers do their best work by bringing structure to AI coding with spec-driven development, from prototype to production (see https://kiro.dev/). The Kiro IDE is built on top of Code OSS (the open-source foundation of VS Code). Its differentiating experience is the spec-driven development mode, in which you use the Kiro AI assistant to define requirements¹, a design document², and a list of implementation tasks³. By completing these with Kiro, you get functional code with documentation and tests.
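To make this concrete, a spec is just a small set of markdown files in your repository. For my project it looked roughly like this (the feature name below is hypothetical, and the exact layout may have changed since the pre-GA version I used):

```
.kiro/
└── specs/
    └── personal-website/       # hypothetical feature name
        ├── requirements.md     # user stories with acceptance criteria
        ├── design.md           # architecture, components, data models, testing
        └── tasks.md            # ordered implementation tasks with checkboxes
```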

Building this site with AI

Disclaimer: I used Kiro before it was officially GA, and I expect Kiro, just like everything else related to AI, to evolve very rapidly. Some of my observations about Kiro may no longer be up-to-date by the time you read this.

The objective of my experiment was to see how useful AI coding tools would be for building something real: this site. I consider myself an experienced computer scientist with a deep background in building on the AWS cloud, and I have led teams of software development engineers at Amazon. However, I have not been coding professionally myself for several years now.

So, how did it go?

Spoiler alert: I successfully built this site 😉 However, my journey had its hiccups; let’s discuss what went well, where the Kiro approach has limitations, and what you can do about them.

Things Kiro does well

First of all, I believe that Kiro has strong clarity of vision (see my previous article), with its focus on delivering production software through spec-driven development. Executing on that vision still requires fine-tuning (more on that later), but the increase in rigor from documenting requirements, design, and tasks is a positive contrast to vibe coding (an unstructured, conversational approach to AI-augmented software development). Kiro treats AI as a tool with strengths and weaknesses, one that needs engineering discipline to deliver production-quality outcomes.

Second, the developer experience is pleasant. The tasks and design give the development process a reasonable default structure. I primarily guide development through the high-level chat interface, but I retain the ability to dive deep and review or modify the code when necessary.

Lastly, and most importantly, it was fun! Using Kiro felt a bit like riding an e-bike uphill. You can pedal uphill without assistance, but having it lowers the threshold of expected pain that keeps you from getting started in the first place.

Current limitations

A central issue that AI coding assistants grapple with is managing the underlying model’s context. Context windows are limited in size, and the bigger they get, the more likely it is that the model ‘forgets’ important details.

Kiro’s spec-driven development approach partially addresses the context window challenge by executing each task in its own context window. As a result, each session needs to figure out what happened before, which at times feels like working with Dory from the movie Finding Nemo. Worse, Kiro makes consequential architecture and implementation decisions without consulting its human operator, and it does not keep track of them. This behavior manifests in three ways:

  1. Scope creep: Kiro is eager to implement things that are not necessary. It operates like a very knowledgeable junior engineer who knows all the frameworks but lacks the experience and wisdom to prioritize. This leads to unnecessary complexity and an excess of dependencies that make maintenance much harder. For example, in my project it implemented website performance monitoring via a recurring GitHub Action invoking Lighthouse before the website was even up and running. Maybe this will be useful at some point in the future, but it should not have been a priority at such an early stage.
  2. Waterfall implementation: This might be my biggest critique of the spec-driven development approach as Kiro implements it. Treating requirements gathering and design as a one-off, up-front exercise leads to bad customer experiences: I have never encountered an up-front requirements document that is both minimal and complete. Important requirements will be missing, while irrelevant ones are included ‘just in case’ they become relevant later. Kiro then implements all tasks generated from the requirements and design in sequence, without pausing at major milestones for feedback. In my case, instead of pausing for feedback after the initial Hugo setup, it went ahead and implemented full infrastructure automation, functional testing, and monitoring. When I finally had a chance to properly review what it had done, this led to a lot of expensive (in terms of credits) refactoring.
  3. Getting stuck in a loop: When troubleshooting issues or making small tweaks to existing code, Kiro occasionally gets stuck in a loop of making things worse rather than fixing them. In my experience, the root cause is limited in-session awareness (or context) of the broader system design and architecture. For example, when trying to fix CSS style sheet issues on my site, Kiro got confused about how the custom style sheets it had generated interacted with the underlying theme’s style sheets. Failing to fix such issues is frustrating and consumes credits without producing the right outcomes.

Mitigation strategies

My mitigation strategies are based on three insights:

  1. The Kiro AI is not a ‘persistent’ engineering agent; each session spawns a fresh one. It is a bit like onboarding a talented junior engineer for every task that needs to be done.
  2. To make Kiro (and any other AI coding assistant) work, I need to acknowledge and address the context window limitations.
  3. Best practices for developing with a team of human engineers are also useful for working with AI engineering agents.

Invest in setting up your steering documents: I eventually learned that you can configure Kiro’s behavior with steering docs. They let you define persistent context, such as your personal interaction preferences or project/team-wide coding standards. For example, my personal interaction guidelines mandate that the AI acknowledge my prompts with a simple OK rather than ‘You are absolutely right’ 🤪.
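To give a flavor, here is a trimmed-down sketch of what one of my steering files could look like (the file name and wording are illustrative, not Kiro’s prescribed format):

```
<!-- .kiro/steering/interaction.md (illustrative) -->
# Interaction preferences
- Acknowledge my prompts with a simple "OK", never with "You are absolutely right".
- Ask before introducing any new external dependency or framework.
- Prefer the smallest change that solves the problem; do not refactor unrelated code.
```

Because steering docs persist across sessions, they are a natural home for anything you find yourself repeating to the assistant.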

Rubber-ducking: When Kiro repeatedly failed to fix an issue, it dawned on me that it did not have the right context, at the right level of detail, about how the system we were trying to fix worked. After going in circles for a while, I remembered the technique of rubber-ducking, in which you explain how the code works to a rubber duck until you figure out the bug. This technique proved very effective with Kiro (no rubber ducky required): whenever it got stuck, I asked it to explain how the system it was trying to fix worked. It would then analyze the code structure and dependencies, pulling that information into its context. With this, it found the root cause of the issue and was able to fix it.
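In practice, the prompt can be as simple as this (paraphrased from my CSS debugging sessions):

```
Stop proposing fixes for a moment. Explain, step by step, how the custom
style sheets you generated are loaded relative to the theme's style sheets,
and which rules override which. Then tell me where the observed behavior
diverges from that explanation.
```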

Architecture Decision Records (ADRs): I am a huge fan of ADRs, a lightweight approach to documenting consequential technical decisions as they are made. They are a great practice for onboarding new engineers to a project, and a great mechanism for ensuring that trade-offs are made intentionally, with awareness of the consequences. The benefits carry over to AI-augmented coding: enforcing ADR review and approval for key technical decisions keeps a human in the loop where it matters. For my project, I required an approved ADR for any decision related to security, external dependencies, infrastructure platform choice, and technology selection. Even though I introduced this late in the game, documenting and reviewing each such major decision gave me clarity on what Kiro had done without consulting me. Refactoring the code for decisions I reverted was more painful than avoiding them in the first place would have been, but it was definitely worth the effort to reduce complexity.
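If you have not worked with ADRs before, the classic lightweight template (title, status, context, decision, consequences) is enough. Here is a condensed, entirely hypothetical example of the kind of decision I made Kiro document:

```
# ADR-007: Serve the site through a CDN in front of object storage

Status: Accepted

Context: The site is a static Hugo build; it needs HTTPS, caching, and
low latency, with no servers to operate.

Decision: Host the generated files in an object storage bucket and serve
them through a CDN distribution with a managed TLS certificate.

Consequences: No servers to patch, pay-per-use pricing; in exchange, the
cache must be invalidated on every deploy.
```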

Going agile: The main reason I don’t like linear development processes such as Waterfall is that most people never know all the requirements up-front. The same holds for Kiro’s spec-driven development: the initial requirements and design were good enough to get started, but not good enough to implement everything in one go without feedback loops and course corrections. When I lead engineering teams, I break development down into experiences: units with a well-defined scope and an outcome that can be validated through a demo. For this site, one example experience could be reading an article, or navigating the site. The finished product is a collection of such experiences and their iterative improvements. For my next project, I will add such a layer to the spec-driven development process, so that I have clear milestones with feedback and validation before proceeding to the next task (see the sketch below). With this approach, I can catch issues early and refactor or pivot at minimal cost and effort.
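Concretely, the plan is to group tasks in tasks.md by experience and insert explicit human checkpoints, along these lines (a sketch of my intended format, not a built-in Kiro feature):

```
## Experience 1: Reading an article
- [ ] 1.1 Set up Hugo with the chosen theme
- [ ] 1.2 Render a sample article with footnotes and images
- [ ] 1.3 CHECKPOINT: demo to the human operator; revise before continuing

## Experience 2: Navigating the site
- [ ] 2.1 Build the section menu and article list pages
- [ ] 2.2 CHECKPOINT: demo to the human operator; revise before continuing
```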

Key takeaways

My experience building with Kiro was enlightening in several ways. First, while AI-augmented coding has limitations, it is evolving fast, and I am convinced it is here to stay. However, I do not agree with the sentiment that it replaces human developers. What my experiment has shown me is that humans in the loop are critical to delivering the right outcomes; on its own, the AI assistant lacks good judgement about what is important. The fact that adopting software engineering best practices, such as documenting architecture decisions and taking a more iterative, agile approach to delivery, improves on the default experience also highlights that pure vibe coding is not the way to build production-ready software. To deliver the right outcomes, developers using these tools need a broader set of expertise across software engineering, product management, and technical project management. For engineering leaders, this means investing in developing these broader skills within your teams, not just adopting the tools.

I am curious to see how this space evolves. These are interesting times. But most importantly, it was fun!


  1. A requirements.md document articulating requirements in the form of user stories with acceptance criteria. ↩︎

  2. A design.md document covering architecture, components and interfaces, data models, error handling, testing strategy, and security considerations. ↩︎

  3. A tasks.md document with a list of tasks to track progress towards completion. ↩︎