A reality check from a recent project I was involved with: When my team started working with GE Healthcare on their newest patient monitor, we knew it would take 3-5 years to get the product developed and certified. 

At the same time, GitHub was reporting that their AI tools make developers 55% faster. Sounds great, right?

The problem is that, in traditional organizations, actual software development only accounts for 5-10% of total R&D time. So, even if AI tools make your developers twice as fast, you're only improving your organization's overall efficiency by 2.5-5%. 

That's not the game-changing improvement we're looking for.

I’ve spent years implementing AI-driven development across different organizations – from agile startups to heavily regulated enterprises. And I can tell you that, for meaningful acceleration, you need more than just adding AI coding assistants to your toolchain. You need to rethink your entire development approach. 

We'll start with six fundamental areas you need to get right before AI can make a meaningful difference. Then we'll look at practical ways to implement AI-driven development, whether you're shipping web apps multiple times per day or building medical devices with multi-year certification cycles.

The six essential areas you need to get right

Before diving into AI-driven development, let's talk about the fundamentals. I've seen too many organizations jump straight into AI tools without having their basics in place. Here's what you need to get right first.

1. Agile practices – Jira automation

Most organizations use Jira or similar tools, and most developers find the manual information updates tedious. While these tools are great for reporting and management oversight, they can become a bottleneck for development work.

Modern software development needs a more streamlined approach. Instead of manually updating Jira tickets, modern tooling now enables a seamless connection to Git, using branch names as the primary workflow marker. When making a change to the application, you simply create a branch named something like HW-1234/feature-description, and automation picks up the Jira ticket information from there.

With this approach, you both maintain traceability and remove friction from the development process.
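As a sketch of how that linkage can work (the function name and regex here are illustrative, not tied to any particular tool):

```javascript
// Illustrative sketch: pull a Jira issue key (e.g. "HW-1234") out of a
// branch name such as "HW-1234/feature-description". CI automation can
// then use the key to comment on or transition the linked ticket.
function extractIssueKey(branchName) {
  // Jira keys are an uppercase project code, a dash, and a number.
  const match = branchName.match(/^([A-Z][A-Z0-9]*-\d+)/);
  return match ? match[1] : null;
}
```

A CI job would run this against the current branch name and call the Jira API with the result; branches without a key simply skip the automation.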

2. CI/CD and automating quality gates

Continuous Integration and Continuous Deployment might seem basic, but without rigorous automation you will fail in the era of AI-driven development. Your CI/CD pipeline acts as a guardrail, ensuring your AI-generated code meets your quality standards.

Here's what a minimal viable CI/CD setup could look like:

name: Continuous Integration

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    timeout-minutes: 5

    steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
    - run: npm ci
    - run: npm test
    - run: npm run lint

Simple? Yes. But this foundation lets you gradually add more sophisticated checks, including AI-specific validations.

3. Building quality in with continuous quality assurance

Your tests aren't just verification tools. They're a living documentation of your requirements. When you work with AI-generated code, this becomes even more critical. Tests define the boundaries of acceptable behavior, so your AI tools have clear targets to aim for.

In my recent work, we discovered a calculation bug that our AI-generated code introduced. Our comprehensive test suite, also planned and executed using AI, caught the issue, so we could quickly iterate on the solution.

4. Cloud native development – releasing all changes

Cloud native isn't just about running in the cloud – it's about embracing modern deployment patterns. Using services like Vercel, Fly.io, or Heroku, we can now deploy every feature branch automatically, creating disposable environments for testing and validation.

This approach is particularly valuable when you work with AI-generated code. It allows you to quickly verify changes in isolation without having to manage complex environments.

5. Security tooling for automated protection

Modern security tooling can automatically scan your dependencies, check for vulnerabilities and even review pull requests for security issues. See, for example, GitHub Advanced Security or GitLab's security features.

For AI-driven development, this becomes even more important. Your security tools are your safety net. They catch potential issues in AI-generated code before they reach production.

6. Organizing according to Team Topologies

Conway's Law tells us that your software architecture mirrors your organizational structure.

Team Topologies, a framework that was developed by Matthew Skelton and Manuel Pais, helps us organize teams to support modern development practices.

Check out Matthew's introduction to Team Topologies from our DevOps Conference: DevOps Topologies 10 years on: What have we learned about silos, collaboration, and flow?

The key elements are:

  • Stream-aligned teams owning specific parts of customer value
  • Platform teams providing internal services
  • Complicated subsystem teams handling specialized components
  • Enabling teams supporting other teams' capability development

Why is this structure so important when you adopt AI tools? Because it helps you clarify where and how AI assistance can be most effectively applied.

Having these fundamentals in place isn't just about best practices. It's about creating an environment where AI tools can actually deliver value. Without this foundation, you risk building a house of cards that looks impressive but falls apart under real-world pressure.

For more, read our article "Transforming software development with AI and DevOps".

Where AI actually helps

After getting the fundamentals right, let's look at where AI can make a real difference. Skip the marketing hype - here's what actually works based on my hands-on experience.

Generating and completing code 

Let me walk you through a real example. Recently, I needed to implement a week number calculation function. Instead of diving into the intricacies of date manipulation myself, I let AI handle the algorithm. 

Here's how it played out:

function getWeekNumber(date = new Date()) {
  const startDate = new Date(date.getFullYear(), 0, 1);
  const days = Math.floor((date - startDate) / (1000 * 60 * 60 * 24));
  const weekNumber = Math.ceil((days + startDate.getDay() + 1) / 7);
  
  return weekNumber;
}

The code works for many dates, but because it simply counts weeks from the first of January instead of following ISO week numbering, it returns the wrong week for others. This highlights an important point: AI is great at generating initial solutions, but you need proper testing and validation.

The real power isn't in replacing developers. It's in handling the tedious and repetitive parts so you can focus on business logic and edge cases.
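For comparison, here is the standard Thursday-based ISO-8601 algorithm – a sketch of what a corrected version could look like, not the exact code from the project:

```javascript
// ISO-8601 week number: weeks start on Monday, and week 1 is the
// week containing the year's first Thursday.
function getISOWeekNumber(date = new Date()) {
  // Work in UTC to avoid timezone and DST surprises.
  const d = new Date(Date.UTC(date.getFullYear(), date.getMonth(), date.getDate()));
  // Shift to the Thursday of the current ISO week (Sunday = 0 becomes 7).
  const dayNumber = d.getUTCDay() || 7;
  d.setUTCDate(d.getUTCDate() + 4 - dayNumber);
  // Count 7-day periods since the start of that Thursday's year.
  const yearStart = new Date(Date.UTC(d.getUTCFullYear(), 0, 1));
  return Math.ceil(((d - yearStart) / 86400000 + 1) / 7);
}
```

Shifting to the week's Thursday is what makes year boundaries come out right: a January 1st that falls on a Friday correctly lands in the last week of the previous year.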

Understanding APIs and libraries

This is where AI truly shines. Instead of digging through documentation or Stack Overflow, you can ask direct questions about APIs and get contextual answers.

For example, when I started working with a new date manipulation library, instead of reading through pages of docs, I could ask:

  • "How do I handle ISO week numbers in this library?"
  • "What's the difference between local and UTC week calculations?"
  • "Show me examples of handling year boundary cases"

The responses are usually more practical and contextual than traditional documentation. But remember that AI's knowledge cutoff date means you should verify anything about recent API changes.

Generating boilerplate code 

Starting new projects or adding standard patterns used to be a copy-paste exercise from old projects. Now, AI can generate this boilerplate code, often in a cleaner state than your old reference projects.

What's interesting is that AI often produces better-structured boilerplate than human-written code. And it does so because it's synthesizing best practices from thousands of examples. 

Just make sure you’re clear about your requirements. The difference between good and mediocre results often comes down to how well you specify your needs.

Learning new technologies

Let me explain this with a practical example. 

I needed to learn about sound analysis in Python, a domain I wasn't familiar with. Instead of spending hours reading documentation, I could have a dialogue with AI:

  1. "Show me a basic sound analysis pipeline in Python"
  2. "Explain how the FFT parameters affect the analysis"
  3. "Help me optimize this for real-time processing"

The key is using AI as a learning accelerator, not a replacement for understanding. You still need to grasp the concepts, but AI can help you climb the learning curve faster.

A word of caution

While all these capabilities are powerful, they come with important caveats. So, let me wrap this section up with the four most important ones:

  • AI-generated code needs thorough testing
  • You must verify security implications
  • Edge cases often require human insight
  • The latest features or best practices might not be reflected

Always consider this for your AI implementation 

After exploring where AI can help, let's tackle what you actually need to watch out for when bringing AI into your development process. I've hit these challenges repeatedly while implementing AI-driven development, and I want to help you avoid the same pitfalls.

Prompt engineering skills are a must-have 

You need to learn a new skill that none of us saw coming a few years ago: prompt engineering. This isn't just asking AI to write code – it's about getting results you can actually use in production.

Let me show you what I mean. 

When I needed that week number calculation function, asking for "a function to calculate week numbers" gave me garbage. I had to learn to be specific: 

"I need a function that takes ISO dates, uses ISO week numbering, and throws clear errors for invalid inputs." 

Your ability to write clear prompts becomes as important as your coding skills.

Legal and IP carefulness

You've probably heard about the GitHub Copilot lawsuit: In November 2022, a class-action lawsuit was filed against Microsoft, GitHub, and OpenAI, alleging that GitHub Copilot violated the copyrights of open-source developers by reproducing code without proper attribution. Microsoft has pledged to defend users against copyright claims related to AI-generated code, but as the legal framework is still developing, you will need to be smart about how you use AI-generated code.

The tricky part is that the code that AI generates does not automatically get copyright protection. You need to modify it enough to make it your own work. In practice, this means you can't just copy-paste AI outputs. You need to understand them, adapt them and integrate them properly into your codebase.

Security and data protection

Here's something that should scare you: everything you paste into ChatGPT can become training data for future versions. 

Never, ever share:

  • API keys or secrets
  • Internal architecture details
  • Proprietary algorithms
  • Customer data
  • Security-sensitive code

If you need to use AI with sensitive code, create sanitized examples or set up private instances. Yes, it's extra work, but it's better than explaining to your security team why your internal algorithms are showing up in other people's AI completions.
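As a toy illustration of the idea – not a substitute for a real secret scanner – a pre-flight check before sending anything to an external AI service could look like this (the patterns are examples only, far from exhaustive):

```javascript
// Toy sketch: refuse to send a prompt to an external AI service if it
// looks like it contains credentials. Patterns are illustrative only.
const SECRET_PATTERNS = [
  /api[_-]?key\s*[:=]/i,                // "API_KEY=" style assignments
  /-----BEGIN [A-Z ]*PRIVATE KEY-----/, // PEM private key headers
  /AKIA[0-9A-Z]{16}/,                   // AWS access key ID format
];

function looksSensitive(text) {
  return SECRET_PATTERNS.some(pattern => pattern.test(text));
}
```

In practice you would wire a check like this into the same place you already gate outbound traffic, and treat a match as a hard stop, not a warning.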

Testing and validation

Remember that week number bug I mentioned? The AI gave me code that looked perfect but failed on certain dates, depending on the year. This is why you need solid testing - AI will confidently give you wrong code.

This is our testing approach, and it works great:

  1. Write tests before generating code (yes, TDD works with AI)
  2. Test edge cases explicitly
  3. Validate against real-world data
  4. Use CI/CD to automate validation
  5. Monitor production behavior

Here's a practical example from our week number project:

test('should return week 13 for 25th of March 2025', () => {
  expect(getWeekNumber(new Date('2025-03-25'))).toBe(13);
});

Write tests like this before you even ask AI for code. When the AI gives you something that looks good but returns week 12 instead of 13, you'll catch it immediately. Your tests become your safety net.

Implementation strategy

Here's how to get started: Pick something non-critical for your first AI implementation. Maybe a utility function or an internal tool. Set up proper testing, try some prompts, and see what works. Get your team comfortable with the process.

Don't try to replace your developers with AI - that's not the point. You're adding a powerful tool to their toolkit, but they need to learn how to use it safely and effectively.

Measuring success with DevOps metrics

Let's talk about metrics. Not the vanity metrics that look good in presentations, but the ones that actually tell you if AI is helping your development process.

The four metrics that matter

When evaluating how your AI implementation is working, focus on the same four metrics you (hopefully) already track. Don't get distracted by AI-specific metrics like "lines of code generated" – they don't tell you anything useful.

Let me walk you through what to measure and what I've seen in practice:

Lead time for changes

This measures how long it takes to get a change into production. With AI, you'd expect this to improve, right? Well, it's complicated. 

In the GE Healthcare project, even though developers could write code faster with AI, the overall lead time barely changed because of regulatory requirements.

But in web development projects, I've seen teams cut their lead time significantly when they combine AI with good automation. You're not just coding faster – you're getting better at the whole process of turning ideas into working software.
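If you don't already track lead time, a baseline can be as simple as computing the median merge-to-deploy time from your deployment records. A toy sketch (the field names are made up for illustration):

```javascript
// Toy sketch: median lead time in hours, given a list of changes with
// the time each commit was merged and the time it reached production.
function medianLeadTimeHours(changes) {
  const hours = changes
    .map(c => (c.deployedAt - c.mergedAt) / 3600000) // ms → hours
    .sort((a, b) => a - b);
  const mid = Math.floor(hours.length / 2);
  return hours.length % 2
    ? hours[mid]
    : (hours[mid - 1] + hours[mid]) / 2;
}
```

The median matters here: a single change stuck in review for weeks shouldn't hide the fact that most changes ship in hours.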

Recovery time

When something breaks, how fast can you fix it? 

This is crucial with AI-generated code because you'll hit unexpected issues. In that week number calculation example I showed earlier, we found and fixed the bug quickly because we had good tests and deployment processes in place.

Your goal should be to maintain or improve your recovery time even as you introduce AI. If it's getting worse, you're probably moving too fast.

Deployment frequency

Recent research shows 75% of organizations now deploy multiple times per week. If you're deploying once per day, you're in good shape. 

With AI, you might deploy more often, but don't force it – frequency should follow your business needs.

In my projects, I've found that AI helps most with the small, frequent changes:

  • Fixing bugs
  • Adding minor features
  • Updating dependencies

The big architectural decisions still need human thought and planning.

Failure rate

This is the one to watch closely when you start using AI. Your failure rate shouldn't increase just because you're using AI-generated code. If it does, you need to strengthen your validation processes.

I track production failures carefully, especially in the early days of AI adoption. 

What I've found is that AI-related failures usually come from misunderstanding requirements rather than actual coding errors – the AI writes syntactically correct code that solves the wrong problem.

Putting all these metrics together

Here's a practical way to use these metrics: 

Start tracking them before you implement AI. Get a baseline. Then watch how they change as you introduce AI tools. Look for patterns:

  • Are simple changes getting through faster while complex ones take the same time?
  • Are you catching more issues in testing rather than production?
  • Is your team spending less time on boilerplate and more time on architecture?

Remember that huge GitHub claim about 55% faster development? In reality, you might see modest improvements in some areas and none in others. That's fine. Your goal is steady, sustainable improvement, not revolutionary change.

The metrics don't lie. If your lead time is dropping and your failure rate isn't rising, you're doing something right. If both are getting worse, step back and review your processes.

Let’s wrap things up

After walking you through the practical aspects of AI-driven development, let me be straight with you: AI isn't going to revolutionize software development overnight. 

What I've learned from implementing it across different organizations is that it's most effective when you treat it as another tool in your development toolkit – albeit a powerful one.

The real value of AI in software development isn't about replacing developers or even about writing code faster. It's about shifting where we spend our time. Instead of wrestling with boilerplate code or digging through API documentation, you can focus on the challenging parts of software development: 

  • Understanding user needs
  • Designing robust architectures
  • Handling complex business logic

Remember that example I started with – the GE Healthcare project? 

Even with all the AI tools in the world, you're not going to turn a three-year medical device development cycle into a three-month sprint. But you can make those three years more productive by letting AI handle the routine tasks while your team focuses on the complex challenges that require human insight.

If you're thinking about implementing AI in your development process, start with the basics I've outlined:

  • Get your CI/CD pipeline solid
  • Make sure your testing is robust
  • Understand your team topology

Then, gradually introduce AI tools where they make sense for your specific context.

Most importantly, keep measuring those four key metrics. They'll tell you whether AI is actually improving your development process or just adding complexity.

The future of software development isn't about AI replacing developers. It's about developers who know how to effectively use AI tools outperforming those who don't. Make sure you're in the first group.


This blog is based on a talk at the 2023 GOTO conference in Copenhagen.

Published: Sep 6, 2024

Updated: Mar 20, 2025
