Looking for an innovative proof-of-concept (POC) that leverages generative AI and autonomous agents to transform software development? By utilizing large language models (LLMs) with retrieval-augmented generation (RAG), the system streamlines the process from customer requirements to technical design and code generation.
This demo I created showcases a fully AI-driven workflow where agents take on roles like Product Owner, Data Protection Officer, Software Architect, and Developer. These agents analyze user stories, generate specifications, ensure compliance, and produce working code.
This approach highlights AI's potential for automating and optimizing software engineering processes, paving the way for more efficient development cycles. Watch a tangible example of the power of GenAI in action.
(0:00 - 0:18)
Hello all, my name is Kalle Mäkelä. I come from Eficode. I work as a lead AI consultant, and my job here today is to show you what we have been building, or basically experimenting on, with regard to GenAI and software engineering.
(0:19 - 0:42)
What you're about to see is basically a demo built on top of Jira. We are using agents to drive software development from customer requirements to specifications to technical design and in the end writing up some code. This is a PoC demo and this is only the beginning.
(0:43 - 1:07)
But first of all, let's start by explaining a little bit what LLMs, and Retrieval Augmented Generation (RAG) functionality with LLMs, are all about. Foundationally, an LLM is a snapshot taken in the past. So it cannot have the data that you are experiencing now, when you are reading this or seeing this video.
(1:08 - 1:34)
So they are inherently, or foundationally, always trained only on the data that was used at training time. And because of that, you always need to retrain the model if you want to include data that was not in the training set before. So, for example, here you can have a question, “What is your company's marketing strategy?”
(1:34 - 2:01)
So, of course, in the Eficode context, it doesn't know. So what do we do to fix it? We are going to build this kind of functionality, called retrieval-augmented generation, next to the large language model, meaning that when we ask the LLM about some data, we see that, okay, it doesn't know that data.
(2:01 - 2:14)
So we are going to augment the data and the context when we are asking again. So we will retrieve information from your data sources, etcetera. And then we are going to generate an answer with the LLM.
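The retrieve, augment, generate steps described here can be sketched in a few lines of Python. This is only a minimal illustration, not the demo's code: the documents are made-up placeholders, retrieval is a naive word-overlap ranking rather than a real search index, and the final call to a model is left out, so the sketch stops at the augmented prompt.

```python
import re

# Made-up placeholder documents; in the demo the data lives in Jira.
DOCUMENTS = [
    "Marketing strategy: placeholder text describing how the company markets itself.",
    "Office policy: the kitchen is cleaned every Friday.",
    "Tech stack: the API is written in Python on top of a database.",
]

def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Retrieve: rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question: str, documents: list[str]) -> str:
    """Augment: put the retrieved text in front of the question."""
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Generate: this prompt would then be sent to the LLM (not shown here).
prompt = build_prompt("What is your company's marketing strategy?", DOCUMENTS)
print(prompt)
```

A real setup would typically swap the word-overlap ranking for embeddings or a search index, but the three steps stay the same.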
(2:18 - 2:39)
So this is a very simple architectural picture of what we have in the demo. But it is an even simpler context. In the background of the demo we have the large language model, which is the OpenAI model we are using there, and which is quite good at what we are doing in the demo context.
(2:39 - 2:57)
And of course, in your business context, it's not so easy. We are going to have different roles that are run as agents, so that the queries are done automatically with code. And then, in the end, we are going to store the information and the response in Jira.
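As a rough sketch of that idea, roles run in sequence as code and each response is stored back on the ticket. The function names, the ticket dictionary, and `fake_llm` are all illustrative assumptions; the demo's actual agent code and Jira calls are not shown in the video, and there each agent is triggered by hand.

```python
from typing import Callable

# Role order follows the roles named for the demo.
ROLES = ["Product Owner", "Data Protection Officer", "Software Architect", "Developer"]

def run_pipeline(ticket: dict, ask_llm: Callable[[str, str], str]) -> dict:
    """Run each role in turn and append its answer to the ticket description."""
    for role in ROLES:
        answer = ask_llm(role, ticket["description"])
        ticket["description"] += f"\n\n## {role}\n{answer}"
    return ticket

def fake_llm(role: str, context: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return f"[{role} output based on {len(context)} characters of context]"

ticket = run_pipeline(
    {
        "key": "DEMO-1",  # hypothetical issue key
        "description": "As a BookBridge sales manager, I want to see last month's "
                       "customer purchase history.",
    },
    fake_llm,
)
print(ticket["description"])
```

Because every role appends to the same description, each later agent sees what the earlier ones produced, which is the point of keeping everything on the ticket.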
(2:59 - 3:21)
From a RAG point of view, all of the data we have is within Jira. But in the future, in the real world, you have an infinite amount of data that you need to include when you are making decisions based on, or using, an LLM. All right, so let's go to the demo.
(3:21 - 3:34)
So here you can see a very simple Kanban board. We have two tickets here, the first of which is actually that RAG. We don't have any Confluence in this case.
(3:34 - 4:06)
We have everything that we need in the Jira ticket itself to make this demo as simple as possible. So we are actually operating this BookBridge Solutions company, which is a kind of Amazon-like startup from 25 years ago, when they had a database and they had books, which they wanted to sell to the market. So it wasn't like a first of its kind digital bookstore, let's say like that.
(4:07 - 4:25)
So our data is within this ticket, which means that we have all of the background information for the company. We have our mission, which is very important when we make decisions based on requirements and specifications, because it will direct our agents in the right direction.
(4:27 - 5:18)
Also, we have here basically defined the agents themselves. So we have four. We have a product owner agent. We have a data protection officer, which comes from the EU's GDPR, meaning that we will check business-domain and non-functional requirements when we are working on what we are going to implement. That is very important, because all of the companies in the world have specific needs depending on the legislation or, like, the continent they run their business in, or, like, what the business context is. Is it automotive, is it banking, is it telco, is it DevSec or whatever? Then we have a software architect, which I'm going to explain later why it's quite important, and then a developer agent in the end.
(5:19 - 5:50)
So we are going to have the code written in the demo by that agent. So the knowledge base is also very important for describing the process, because generative AI agents, which are autonomous digital extensions of the roles that you have in any of your processes, are context dependent, and the context is always the process. They are not working on an island.
(5:51 - 6:43)
They are working as a part of an end-to-end flow of how you are building products and services, like human beings. If you take a person straight from university and put them to work in your process, and you don't train that person to work within the process, it just cannot fly. It doesn't work.
So, you need to tell agents, like you tell human beings, what the context is. And with the definition of ready and the definition of done, we are kind of guiding the agent on what its starting point is and what its end result is all about, etc. And of course, in the knowledge base, we want to tell the technical people more specifics about what kind of technologies we use to develop our services and products. And in this case, for example, we are pure Python.
(6:44 - 7:10)
We are a Python house here, basically, because we only have, like, the database and then an API developed with Python on top. And then we have a source code repository link here. All right.
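One way to picture how such a knowledge base could reach an agent is to turn its sections into the instructions the agent receives. The sketch below assumes a simple structure with the sections mentioned in the ticket (mission, definition of ready, definition of done, technology, repository link). The field names, the example values, and the prompt wording are all hypothetical and not taken from the demo.

```python
from dataclasses import dataclass

@dataclass
class KnowledgeBase:
    """Sections of the knowledge-base ticket (field names are illustrative)."""
    mission: str
    definition_of_ready: str
    definition_of_done: str
    tech_stack: str
    repository_url: str

def system_prompt(role: str, kb: KnowledgeBase) -> str:
    """Build the context an agent gets before it sees any user story."""
    return "\n".join([
        f"You act as the {role} in our development process.",
        f"Mission: {kb.mission}",
        f"Definition of ready: {kb.definition_of_ready}",
        f"Definition of done: {kb.definition_of_done}",
        f"Technology: {kb.tech_stack}",
        f"Source repository: {kb.repository_url}",
    ])

# Placeholder values, not the real ticket contents.
kb = KnowledgeBase(
    mission="Placeholder mission statement for the bookstore.",
    definition_of_ready="Placeholder: the story has Gherkin scenarios.",
    definition_of_done="Placeholder: code and tests are in a pull request.",
    tech_stack="Python API on top of a database",
    repository_url="https://example.com/bookbridge.git",
)
print(system_prompt("Software Architect", kb))
```

The same knowledge base can be reused for every role, with only the first line changing per agent.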
But this is our knowledge base, which we are going to use when we are invoking the agents. But now I have another ticket here for an example. I have two customer, or end-user, needs here.
(7:11 - 7:39)
And I'm going to remove one, but I'm just explaining here that we have a BookBridge customer user story, and then I have a sales manager. So, for example, the customer would have something like, “I want to see new books that are available for me based on my previous buying history so that I can quickly see a potential book that I want to read that I like.” But there is now this one thing that I want to do because I haven't actually tried it before.
(7:40 - 7:59)
So as a BookBridge sales manager, I want to see last month's customer purchase history so that I can tag books that aren't sold that much to be on sale. A little bit of bad English there, but let's see. LLMs are quite good at understanding the meaning between the lines.
(8:00 - 8:33)
Let's rewrite the title here so that I have books on sale. OK, and it's quite important also, when you are using generative AI, to actually have the motivation of who wants to do what. And this is also, of course, important when you are working with human beings. The motivation is very, very important, because now, when we are creating specifications, we are writing them with Gherkin.
(8:34 - 8:42)
And you can see in the end what I mean by this motivation being important. So let's see how it goes. I'm going to save it.
(8:42 - 9:02)
And by the way, here is this draft. So this draft is just a label that the agent searches for. It will find all of the issues that have this label, read the information in the description field, and then do its job.
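The label-based pickup can be sketched like this. The in-memory issue list and the helper names are assumptions for illustration; the demo's real Jira query is not shown. The `labels = "draft"` string follows JQL syntax, which is how such a filter would usually be expressed against Jira.

```python
DRAFT_LABEL = "draft"  # the label the agent looks for, as described in the demo

def jql_for_label(label: str) -> str:
    """Build a JQL filter that matches issues carrying the given label."""
    return f'labels = "{label}"'

def select_issues(issues: list[dict], label: str) -> list[dict]:
    """Local equivalent of the filter: keep issues that have the label."""
    return [issue for issue in issues if label in issue["labels"]]

# Hypothetical issues standing in for what Jira would return.
issues = [
    {"key": "BB-1", "labels": ["knowledge-base"], "description": "Company background"},
    {"key": "BB-2", "labels": ["draft"], "description": "As a BookBridge sales manager, ..."},
]

for issue in select_issues(issues, DRAFT_LABEL):
    # The description field is what the agent reads as its input.
    print(issue["key"], "->", issue["description"])
```

Using a label as the trigger keeps the human in control: nothing is processed until someone marks the ticket.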
(9:03 - 9:15)
OK, I will close this up and then I'm going to execute the agent. Let's see. Now it starts to work.
(9:16 - 9:27)
So it now reads the information in that knowledge base. And after that, we should have a new ticket on the Kanban board. Let's see if we need to refresh.
(9:28 - 9:42)
No, we didn't need to refresh. So now we have this user story kind of analyzed, and the result has gone into the description field. So, let's see what the AI did for us.
(9:42 - 9:54)
So user story, view last month's customer purchase history. Allow the BookBridge Sales Managers to view last month's customer purchase history to identify books that aren't sold much and tag them for sale. So we have two.
(9:54 - 10:21)
So now you can immediately see that, OK, we have two scenarios: access purchase history and then tag books for sale. So it kind of dissected the user story. It took the who wants to do what, and then the why, as separate scenarios, which is very cool, in my opinion, because now we are able to verify in the functionality that we are actually addressing the motivation of this user story.
(10:21 - 10:49)
That is the number one reason that I am, like, frustrated working with specifications and user stories, because the motivation for any action in regard to any system is very, very key. So here we have access purchase history. Given I'm logged in as the sales manager, when I navigate to the purchase history section and I select the option to view the last month's history, then I should see a list of all customer purchases from the last month.
(10:49 - 11:05)
OK, sounds good. Tag book for sale. Given I am viewing last month's purchase history and identify books that haven't sold so much, and I select the option to tag them for sale, then the selected book should be marked for sale.
(11:05 - 11:12)
All right, so cool. So we have functional specifications there. So let's see what comes later or after this.
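The way the user story was dissected into who, what, and why can be illustrated with a small parser for the common "As a ..., I want ... so that ..." template. This is only a sketch of the idea: in the demo the LLM does this from free text, whereas the regular expression below is an assumption that works only for stories written in that exact shape.

```python
import re
from dataclasses import dataclass

@dataclass
class UserStory:
    role: str        # who
    goal: str        # wants to do what
    motivation: str  # why

STORY_PATTERN = re.compile(
    r"^As an? (?P<role>.+?),? I want (?:to )?(?P<goal>.+?) so that (?P<motivation>.+?)\.?$",
    re.IGNORECASE,
)

def parse_user_story(text: str) -> UserStory:
    """Split a templated user story into role, goal, and motivation."""
    match = STORY_PATTERN.match(text.strip())
    if match is None:
        raise ValueError("Story does not follow the 'As a / I want / so that' template")
    return UserStory(**match.groupdict())

story = parse_user_story(
    "As a BookBridge sales manager, I want to see last month's customer "
    "purchase history so that I can tag books that aren't sold that much to be on sale."
)
# One scenario for the goal and one for the motivation, as in the generated ticket.
scenario_topics = [story.goal, story.motivation]
print(story.role)
print(scenario_topics)
```

Keeping the motivation as its own field is what makes it possible to write a separate scenario that verifies it.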
(11:12 - 11:32)
We have very, very simple non-functional requirements. So, we have dependencies, design mockups, technical feasibility, and performance criteria. And in the agent itself, we are actually saying that, hey, these need to be handled in the process.
(11:33 - 11:46)
All right. But let's see what the next agent does, so you will get more context on that. So I will run the DPO agent, and let's wait for that to finish now.
(11:46 - 11:53)
Yeah. So we got the new ticket or sorry, the same ticket. But let's see what we have.
(11:53 - 12:30)
All right. So we have a DPO, a data protection officer, that took the functional specification and derived non-functional requirements from it. All right.
So this is kind of like a very, very foundational thing: when you are doing your business, it's very domain specific, and you have standards that you need to uphold. You have NIS2 coming into the industry in the EU. You have all of the things that need to be included in this requirement and specification work.
(12:30 - 12:52)
You know, you can automate, I believe, most of it, so that you can just drive, through the agents, this kind of reading the documentation and getting the input, for example, from the customer interface. So we can have the previous week's mailbox here. We can see from the performance criteria there.
(12:53 - 13:16)
OK, we have some performance issues, and those are kind of included automatically here. So we have here data privacy, access control, etcetera, compliance. Do we see anything, like, very interesting here? So: only collect and display the minimum necessary data required for the sales manager to perform the task.
(13:16 - 13:26)
So a very good point. So we don't actually want to tell them, OK, which customers are buying what. Next up, we want to go to more technical stuff or technical details.
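The data-minimization requirement can be made concrete with a small sketch: the sales manager needs sales per book, not who bought what, so the view aggregates purchases and drops the customer fields. The record layout and field names below are assumptions for illustration, not the demo's schema.

```python
from collections import Counter

# Hypothetical raw purchase records, including personal data.
purchases = [
    {"customer_id": "c1", "email": "a@example.com", "book_id": "b1", "quantity": 2},
    {"customer_id": "c2", "email": "b@example.com", "book_id": "b1", "quantity": 1},
    {"customer_id": "c1", "email": "a@example.com", "book_id": "b2", "quantity": 1},
]

def sales_per_book(records: list[dict]) -> dict[str, int]:
    """Aggregate quantities per book; customer fields never leave this function."""
    totals: Counter = Counter()
    for record in records:
        totals[record["book_id"]] += record["quantity"]
    return dict(totals)

view_for_sales_manager = sales_per_book(purchases)
print(view_for_sales_manager)
```

The aggregated view is enough to decide which books to tag for sale, and it contains no customer identifiers.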
(13:26 - 14:02)
So we start talking about designing the solution. What we have been doing for two columns, or phases, now is this: first, we have specifications, functional requirements derived from the customer need, basically. Then we have a place, a column, where we can have any number of business-domain, kind of non-functional, requirement checks. For example, for automotive things, is safety in? Safety is quite critical there, so we can have an analysis there.
(14:02 - 14:18)
But whatever agents we want there, like for the quality assurance loop, we can get any kind of information there. But I just triggered the architect agent in the background. So we have that next up here.
(14:18 - 14:53)
All right. So what the architect agent now does is that it will take the specifications, and basically the whole ticket, and start designing a solution against our repository. And why do we want to do it like this? Why don't we just take the user story and then, you know, write the code? In a startup phase, if you have small teams, you can quite easily and very, very quickly experiment with stuff.
(14:53 - 15:24)
If we are in a very kind of heavy industry, for example, or a healthcare domain, we are working in a domain where we want to audit our steps. We need to have this process observability in check. And also, if we are basically going directly from a customer need to, for example, using Copilot and start developing, we are kind of missing the possibility to keep our larger code base in check.
(15:24 - 15:50)
So maybe we will have in the end a lot of code churn or whatever. We are going to rewrite code a lot based on whoever wants to do whatever. So it's kind of good practice, and it's the reason why you have it in the human way of working also, that you have an architect who is keeping the architecture and the logical architecture in good condition.
(15:50 - 16:00)
And so it's like a good place to have discussions, also. So let's see. So here we have actually a couple of endpoints that we are going to create.
(16:00 - 16:13)
So we have the last month's purchase history endpoint. Then we have books, book ID, tag for sale. The database schema is going to have some changes, because we don't have this information here.
(16:13 - 16:33)
And then we have all of this process related stuff for the developer. But by the way, I forgot to tell you, you can always, of course, go to this ticket and have modifications, you know, let's see if that comes through, for example. So, it is imperative that humans are in the loop.
(16:33 - 16:49)
We are agreeing on when we want to make these changes. But this is a very nice way to get the minimum amount of work done very quickly in the ticket itself, and then you can collaborate.
(16:49 - 17:13)
So you can imagine, in a teamwork scenario, for example, that we can actually use these kinds of flows, these agent-driven flows, to really have cross-functional teams working together. And a personal note here is that 15 years ago, I was working in an extreme programming manner. So I had pair programming, and I actually had mob programming.
(17:14 - 17:31)
We kind of had two days of actual programming workshops in a meeting room where we had one screen where we coded. And then other people were thinking about different kinds of aspects of the solution, what we are bringing in. So this is kind of like, again, the same situation, which is, in my opinion, very fun.
(17:32 - 17:48)
All right, so let's now start coding. And while we do it, we actually go and select this to be implemented. So it's kind of like that we had this definition of ready check.
(17:50 - 18:11)
And now we are going to do the development work. We are using OpenAI here, so we don't have, for example, Claude in use. So we are taking the ticket again and basically prompting that, okay, implement these, these, these files, or what files need to be changed, etcetera.
(18:12 - 18:22)
And every time I have something different. So, okay, I have here, for example, an open pull request. GenAI created implementation of a task: backend implementation.
(18:22 - 18:27)
That's not a nice title. We should fix it. Of course, you can use the Copilot actions here.
(18:28 - 18:37)
For example, to analyze what we are doing. And so, what changes do we have? So, we have database changes coming here.
(18:37 - 18:45)
We have a purchase history DTO. We have a new tag-for-sale response. Then we have the models.py implementation.
(18:47 - 19:02)
So we have purchase history here, which has, like, purchase date and quantity. So, okay, so we are storing purchase history, which is a good thing, so that we can get the least amount of books sold there.
(19:03 - 19:11)
And now I have a new GET route here, with a call here for one month ago. So we are going to get all of the books. Yes.
(19:12 - 19:27)
So that is, you know, the least amount of, I hope it's the least amount of sales. Then I have a tag for sale. So I'm going to basically have a tagging here.
(19:28 - 19:39)
So yeah, the DB book's "for sale" is set to true. So I'm going to get a list of books here and then I'm going to tag them to be on sale. And here we have the test cases.
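The logic walked through in the pull request (store purchases with a date and quantity, look at the last month, find the least-sold books, set a for-sale flag) can be sketched in plain Python. This is not the generated code from the demo. The class and function names are assumptions, "one month ago" is approximated as 30 days, and an in-memory list stands in for the database.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Book:
    id: int
    title: str
    for_sale: bool = False

@dataclass
class Purchase:
    book_id: int
    purchase_date: date
    quantity: int

def last_month_history(purchases: list[Purchase], today: date) -> list[Purchase]:
    """Purchases from the last 30 days (an approximation of 'one month ago')."""
    cutoff = today - timedelta(days=30)
    return [p for p in purchases if cutoff <= p.purchase_date <= today]

def least_sold(books: list[Book], history: list[Purchase], limit: int = 1) -> list[Book]:
    """Books ordered by quantity sold in the given history, lowest first."""
    sold = {book.id: 0 for book in books}
    for purchase in history:
        if purchase.book_id in sold:
            sold[purchase.book_id] += purchase.quantity
    return sorted(books, key=lambda book: sold[book.id])[:limit]

def tag_for_sale(books: list[Book]) -> None:
    """Mark the given books as being on sale."""
    for book in books:
        book.for_sale = True

today = date(2025, 2, 13)
books = [Book(1, "Book A"), Book(2, "Book B"), Book(3, "Book C")]
purchases = [
    Purchase(1, date(2025, 2, 8), 5),
    Purchase(2, date(2025, 2, 3), 1),
    Purchase(3, date(2024, 12, 15), 9),  # outside the 30-day window
]
history = last_month_history(purchases, today)
slow_sellers = least_sold(books, history, limit=2)
tag_for_sale(slow_sellers)
print([(b.title, b.for_sale) for b in books])
```

Note that a book with no purchases in the window counts as zero sales, so it ranks as the slowest seller even if it sold well earlier.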
(19:40 - 19:52)
So book purchase history. I have a couple of test data books here. And then I'm going to have the test case for the functionality.
(19:53 - 20:03)
At a quick glance, it looks to be okay. All right. So, of course, now comes the fun part, which is, of course, validation QA.
(20:04 - 20:26)
So like you saw, so in the end, you have unit tests. But in reality, of course, what you had in the beginning, here on the board, you have the specifications. And many of you may guess already, actually, these Gherkin scenarios are user acceptance testing test cases.
(20:26 - 20:41)
So you can actually do manual testing with these, or you can automate each test case to be your end-to-end test case. And of course, in this case, all of the end-to-end test cases are API test cases, on an API level. But you could have a UI component here.
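To show why a Gherkin scenario doubles as a test case, here is a minimal sketch that breaks a scenario into ordered Given/When/Then steps, which is the first thing any automation has to do before binding steps to API calls. The scenario text is a paraphrase of the one read out earlier, and the parser is a simplified assumption, not a full Gherkin implementation.

```python
KEYWORDS = {"Given", "When", "Then"}
CONTINUATIONS = {"And", "But"}

SCENARIO = """
Scenario: Tag books for sale
  Given I am viewing last month's purchase history
  And I identify books that haven't sold much
  When I select the option to tag them for sale
  Then the selected books should be marked for sale
"""

def parse_steps(scenario: str) -> list[tuple[str, str]]:
    """Return (keyword, text) pairs; 'And'/'But' inherit the previous keyword."""
    steps: list[tuple[str, str]] = []
    current = None
    for line in scenario.splitlines():
        word, _, rest = line.strip().partition(" ")
        if word in KEYWORDS:
            current = word
        elif word not in CONTINUATIONS or current is None:
            continue  # skip the title line, blank lines, and stray continuations
        steps.append((current, rest))
    return steps

for keyword, text in parse_steps(SCENARIO):
    print(f"{keyword:5} | {text}")
```

Each Given step becomes test setup, the When step becomes the API call under test, and the Then step becomes the assertion.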
(20:41 - 21:05)
So you would be, of course, implementing new end-to-end test cases here. And I'm going to show you, in a couple of months, basically a new version where we have the QA agents working together with these developer agents in a proper way, to actually be more robust in a test-driven development way. But thank you for your time, and see you around.
Published: Feb 13, 2025