A 5-stage guide to turbocharging the quality assurance process with generative AI
By Aska Cloke
Learn how to enhance test productivity and quality at each stage of the Software Development Life Cycle (SDLC)
There are many blogs and discussions about utilising generative artificial intelligence (GenAI) for quality assurance (QA) activities, most of which focus primarily on generating test cases and automation scripts. In this blog, I explore how we can enhance test productivity and quality at each stage of the Software Development Life Cycle (SDLC) by leveraging GenAI.
By incorporating AI into this process from a QA point of view, I believe the practice of 'Shift Left' (moving testing activities closer to the beginning of the SDLC) will occur organically. To illustrate this point, I used ChatGPT and Continue (a VSCode extension) to create examples. This article may be useful for anyone involved in software development, but the focus is predominantly on testing activities.
Software Development Life Cycle
There are five main stages in the SDLC:
Analysis
Design
Implementation
Testing
Evolution/maintenance
Source: Software Development Life Cycle
I will use AG Grid’s HR Demo site as a sample application throughout this process. The example task is to add new functionality to this application, as described in the story below:
As an HR system user
I want to add a column for Employee’s nationality next to the Country of Residence
So that I can identify whether an employee requires a work permit
Let’s follow each stage of the SDLC using this story to identify how we can enhance QA activities with GenAI.
1. Analysis
In the analysis stage, information is gathered from stakeholders regarding their requirements and expectations of the system.
User stories are created at this stage.
A ‘Three Amigos’ session (a Business Analyst, a Developer and a Quality Assurance Tester get together to review requirements) may take place after identifying the requirements.
Before or during the ‘Three Amigos’ session, we can use ChatGPT to generate test cases for the user stories to explore how much value it adds.
This is what I asked in ChatGPT:
It generated seven test cases. Rather than bore you by listing them all, I have selected a few below that could prompt discussions to add more detail to the requirements for the above story.
| Test cases created by ChatGPT | Points to expand |
| --- | --- |
| Precondition: Logged in as an authorised HR system user. | Are different levels of HR users required to add/edit this column? |
| Precondition: Employee table has an additional column for ‘Nationality’. | Separate requirements for DB tables. |
| Validate ‘Nationality’ data. | What format would this column be? Free text or a dropdown list? What about employees with multiple nationalities? |
| Work permit identification: Check if the system identifies the need for a work permit based on the combination of ‘Country of Residence’ and ‘Nationality’ for each employee. | This test case may lead to discussions about the scope of work: Should the system alert us when ‘Country of Residence’ and ‘Nationality’ differ? If yes, what sort of alert: pop-up, email or another column? This may require another story. What is the trigger logic? Should EU nations be ignored? |
| Error handling | Is empty data allowed? Are access-level tests required? |
So, by generating test cases, we created a checklist of items to verify for the story. If I were to create the test cases myself, I would need focused time to understand the story, followed by at least ten minutes to write them. In comparison, ChatGPT generated them within five seconds, allowing the 'Three Amigos' participants to start their discussions immediately. By going through this checklist, we can identify additional business and technical requirements, helping to solidify the overall requirements further.
2. Design
After the requirements are approved, they are transformed into blueprints for building the application. This may involve screen designs and prototypes to help visualise the requirements for the overall system and/or individual module designs.
Let’s assume that, after the ‘Three Amigos’ discussions, the following details were added for the ‘Nationality’ column:
‘Nationality’ is selected from a dropdown list.
Only one nationality can be chosen.
This is a mandatory field; a warning message is displayed if nothing is selected.
An alert pops up for a mismatch between ‘Country of Residence’ and ‘Nationality’.
Other logic for the alert (such as ignoring EU nations) is out of scope for this sprint.
This UI design was drawn based on the above, improved requirements.
Sample UI design
Let’s generate the test cases by pasting the UI design screenshot into ChatGPT and see if we can identify issues that need to be addressed.
This is what I asked:
It produced seven test cases, mostly similar to those generated for the story. Since this design was created after a requirements analysis based on the earlier test cases, fewer questions should be raised. I identified one test case that may require further discussion.
| Test case created by ChatGPT | Points to expand |
| --- | --- |
| Accessibility and usability: confirm that the dropdown is accessible via keyboard navigation and can be expanded using the ‘Tab’ key. | This will ensure developers use semantically meaningful HTML. Further discussion may be required for colour contrast. |
The key point here is that since we got well-defined functional requirements during the analysis stage, we can focus on visual and usability requirements during the design review session.
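As a side note, that keyboard-navigation test case can later be automated. Below is a minimal Playwright sketch of my own, assuming the ‘Nationality’ dropdown is a native select element in the sixth table column (the same assumption the generated page object makes later) and using a placeholder URL:
// a11yKeyboardTest.spec.js
// A minimal sketch of the keyboard-navigation check suggested above.
// Assumptions: the 'Nationality' dropdown is a native <select> in the sixth
// column; the URL is a placeholder, not a real application.
const { test, expect } = require('@playwright/test');

test('Nationality dropdown is reachable with the Tab key', async ({ page }) => {
  await page.goto('https://your-application.com/employee-table');
  const dropdown = page.locator('td:nth-child(6) select').first();

  // Press Tab repeatedly until the dropdown receives focus (bounded to 50 presses).
  for (let i = 0; i < 50; i++) {
    if (await dropdown.evaluate((el) => el === document.activeElement)) break;
    await page.keyboard.press('Tab');
  }
  await expect(dropdown).toBeFocused();
});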
3. Implementation
The actual coding and building of software occurs in this stage. AI can now aid in writing both application code and unit tests, but as my focus is on QA activities, I will not delve into that here. Instead, I would like to examine what Test Engineers can do with GenAI during this stage.
Create test tickets: You can ask the GenAI to create test cases against the user story and UI screenshot using the Gherkin language. As it produces test cases for the core functionalities, you can spend more time planning and scripting for edge cases. I’m a fan of using Gherkin to create test tickets, as it forces you to break the functionality down into the smallest testable units, as the sketch below shows.
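To make this concrete, here is an illustrative Gherkin scenario based on the refined requirements (a sketch of my own, not ChatGPT’s actual output):
Feature: Employee nationality column

  Scenario: Warn when no nationality is selected
    Given I am logged in as an authorised HR system user
    And I am viewing the employee table
    When I save an employee record without selecting a 'Nationality'
    Then a warning message is displayed

  Scenario: Alert on a mismatch between 'Country of Residence' and 'Nationality'
    Given an employee whose 'Country of Residence' differs from the selected 'Nationality'
    When I save the employee record
    Then a mismatch alert pops up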
Preparing work for automation tests: I used GenAI to create a page object and test files for Playwright using the UI design screenshot.
As I was adding automation test scripts to the repository, I used Continue, a VSCode extension, instead of ChatGPT this time.
I pasted the design screenshot into the prompt section and asked it to create a page object and test files.
Since this is a VSCode extension, it has a handy tool for adding generated code to your repository, as shown below:
Apply to the current file.
Insert at the cursor.
Copy and paste the code.
You can see below what it generated.
Page object file
// employeePage.js
const { expect } = require('@playwright/test');

class EmployeePage {
  constructor(page) {
    this.page = page;
    this.employeeRows = page.locator('tr.employee-row');
    this.employeeId = page.locator('td:nth-child(2)');
    this.department = page.locator('td:nth-child(3)');
    this.employeeType = page.locator('td:nth-child(4)');
    this.country = page.locator('td:nth-child(5)');
    this.nationalityDropdown = page.locator('td:nth-child(6) select');
  }

  async getEmployeeCount() {
    return await this.employeeRows.count();
  }

  async getEmployeeId(rowIndex) {
    return await this.employeeRows.nth(rowIndex).locator('td:nth-child(2)').textContent();
  }

  async getDepartment(rowIndex) {
    return await this.employeeRows.nth(rowIndex).locator('td:nth-child(3)').textContent();
  }

  async getEmployeeType(rowIndex) {
    return await this.employeeRows.nth(rowIndex).locator('td:nth-child(4)').textContent();
  }

  async getCountry(rowIndex) {
    return await this.employeeRows.nth(rowIndex).locator('td:nth-child(5)').textContent();
  }

  async selectNationality(rowIndex, nationality) {
    await this.employeeRows.nth(rowIndex).locator('td:nth-child(6) select').selectOption(nationality);
  }

  async assertEmployeeDetails(rowIndex, expectedId, expectedDepartment, expectedEmployeeType, expectedCountry) {
    const employeeId = await this.getEmployeeId(rowIndex);
    const department = await this.getDepartment(rowIndex);
    const employeeType = await this.getEmployeeType(rowIndex);
    const country = await this.getCountry(rowIndex);
    expect(employeeId).toBe(expectedId);
    expect(department).toBe(expectedDepartment);
    expect(employeeType).toBe(expectedEmployeeType);
    expect(country).toBe(expectedCountry);
  }
}

module.exports = { EmployeePage };
Test file
// employeeTest.spec.js
const { test, expect } = require('@playwright/test');
const { EmployeePage } = require('./employeePage');

test.describe('Employee Table Tests', () => {
  let employeePage;

  test.beforeEach(async ({ page }) => {
    employeePage = new EmployeePage(page);
    await page.goto('https://your-application.com/employee-table');
  });

  test('should display employee table', async () => {
    const employeeCount = await employeePage.getEmployeeCount();
    expect(employeeCount).toBeGreaterThan(0);
  });

  test('should display correct employee details', async () => {
    await employeePage.assertEmployeeDetails(0, '0001', 'HR', 'Permanent', 'USA');
    await employeePage.assertEmployeeDetails(1, '0002', 'IT', 'Contract', 'Canada');
  });

  test('should select nationality from dropdown', async () => {
    await employeePage.selectNationality(1, 'American');
    // Add assertions to verify the selected nationality
  });
});
It’s usable, but it definitely needs some time to review. Still, I found this saves quite a bit of time compared to manually creating these files; it’s a good base to work from. This also reminds me to ask developers to add test IDs and roles for accessibility (a11y) to each element!
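To show why those test IDs help, here is a hedged sketch of how the page object above could look once developers add them. Every data-testid value here ('employee-row', 'country-cell', 'nationality-select') is hypothetical and would need to be agreed with the team:
// employeePageWithTestIds.js
// A sketch of the same page object using test IDs instead of positional
// nth-child selectors. All data-testid values here are hypothetical.
class EmployeePage {
  constructor(page) {
    this.page = page;
    this.employeeRows = page.getByTestId('employee-row');
  }

  // Read the country cell of a given row via its test ID.
  async getCountry(rowIndex) {
    return await this.employeeRows.nth(rowIndex).getByTestId('country-cell').textContent();
  }

  // Select a nationality from the row's dropdown via its test ID.
  async selectNationality(rowIndex, nationality) {
    await this.employeeRows.nth(rowIndex).getByTestId('nationality-select').selectOption(nationality);
  }
}

module.exports = { EmployeePage };
Unlike nth-child selectors, these locators survive column reordering, which keeps maintenance cheaper later on.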
4. Testing
Different types of tests are executed to ensure that the intended functionalities are implemented. Unintended behaviours are reported during this stage.
All QA preparation work has been aided and enhanced by GenAI at every stage up until this point. It's now time for QA Engineers to use some elbow grease and let their experience and skills shine!
Execute functional testing: Add results and evidence to test tickets. Raise bugs if necessary.
Exploratory testing: As Test Engineers may free up time by utilising GenAI, they can perform experience-based exploratory testing. This may help catch issues that scripted test cases cannot.
Complete automation test scripts: Identify which tests to automate, then update and add code to the base files created by GenAI (see the sketch after this list).
Create a test report: This is something GenAI can do. I created a sample bug report by asking ChatGPT, as shown below.
This prompt generated a decent report, including ‘Description’ and ‘Steps to Reproduce’ sections. Even though I misspelt "width", it correctly inferred the intended word and generated the report. A few tweaks were needed but, again, it saves time, allowing Test Engineers to focus on what truly matters at this stage: performing various types of tests and applying their experience to ensure the quality of newly implemented functionalities.
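Returning to the automation scripts mentioned above, here is a hedged sketch of the kind of edge-case tests that could be added to the generated base files. The Save button and the '.nationality-warning' and '.mismatch-alert' selectors are assumptions about the eventual implementation, not real locators:
// employeeEdgeCases.spec.js
// Illustrative edge-case tests layered on the generated base files. The
// Save button and the warning/alert selectors are assumed, not real.
const { test, expect } = require('@playwright/test');
const { EmployeePage } = require('./employeePage');

test('warns when no nationality is selected', async ({ page }) => {
  await page.goto('https://your-application.com/employee-table');
  // Try to save without choosing a nationality in the mandatory field.
  await page.getByRole('button', { name: 'Save' }).click();
  await expect(page.locator('.nationality-warning')).toBeVisible();
});

test('alerts on a mismatch between country and nationality', async ({ page }) => {
  const employeePage = new EmployeePage(page);
  await page.goto('https://your-application.com/employee-table');
  // Pick a nationality that differs from the row's Country of Residence.
  await employeePage.selectNationality(0, 'French');
  await expect(page.locator('.mismatch-alert')).toBeVisible();
});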
5. Maintenance
The maintenance stage occurs after deployment to production. It involves improving, adapting and evolving the software. New requirements may arise from real user feedback. If production bugs are identified, they will be fixed, tested and released. Whether it involves fixing production bugs or implementing new requirements, the steps of SDLC stages 1-4 should be followed.
During the maintenance stage, automated tests need to be updated and maintained. Since the application is now in production, you can provide the URL to the GenAI to generate additional Playwright tests. This allows the AI to capture real locators.
For instance, let’s say a new business requirement states that 'Check' is no longer an available option for the payment method. In this case, additional tests need to be created. I used ‘Continue’ as shown below:
And it generated the test below.
// paymentMethodColumnTest.spec.js
const { test, expect } = require('@playwright/test');

test.describe('Payment Method Column Test', () => {
  let page;

  test.beforeAll(async ({ browser }) => {
    page = await browser.newPage();
    await page.goto('https://www.ag-grid.com/example-hr/');
  });

  test.afterAll(async () => {
    await page.close();
  });

  test('should not contain "Check" in Payment Method column', async () => {
    // Get all the cells in the Payment Method column
    const paymentMethodCells = await page.$$eval('div[col-id="paymentMethod"] div.ag-cell-value', (cells) =>
      cells.map((cell) => cell.textContent.trim())
    );

    // Check if none of the cells contain the value "Check"
    expect(paymentMethodCells).not.toContain('Check');
  });
});
It looks OK but, unfortunately, it did not correctly verify the column (one likely reason: AG Grid virtualises its rows, so only the cells currently rendered exist in the DOM). Again, you can use it as a base and update it.
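As a hedged sketch of one possible update, the test below scrolls through the grid and collects cell text as rows render. The '.ag-body-viewport' and '.ag-cell' class names are assumptions about the demo's markup:
// paymentMethodScroll.spec.js
// A sketch of one way to update the generated test. AG Grid renders only
// the visible rows, so we scroll through the grid and collect cell text as
// rows appear. The '.ag-body-viewport' and '.ag-cell' class names are
// assumptions about the demo's markup.
const { test, expect } = require('@playwright/test');

test('Payment Method column should never contain "Check"', async ({ page }) => {
  await page.goto('https://www.ag-grid.com/example-hr/');
  const viewport = page.locator('.ag-body-viewport');
  await viewport.waitFor();

  const seen = new Set();
  let previousScrollTop = -1;
  while (true) {
    // Collect the text of the currently rendered Payment Method cells.
    const cells = await page
      .locator('div.ag-cell[col-id="paymentMethod"]')
      .allTextContents();
    cells.forEach((text) => seen.add(text.trim()));

    // Scroll one viewport height down; stop once the position no longer changes.
    const scrollTop = await viewport.evaluate((el) => {
      el.scrollTop += el.clientHeight;
      return el.scrollTop;
    });
    if (scrollTop === previousScrollTop) break;
    previousScrollTop = scrollTop;
    await page.waitForTimeout(200); // give the grid time to render new rows
  }

  expect([...seen]).not.toContain('Check');
});
Whether scrolling, using the grid's pagination, or checking the data source directly is the right approach depends on the application; the point is that the generated test provides a useful skeleton to build on.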
Conclusion
You may already be practising 'Shift Left', but, as demonstrated above, this approach can be further accelerated with GenAI. By scrutinising requirements and designs more thoroughly in the early stages of the SDLC, more time is freed up for Test Engineers to focus on higher-value tasks like experience-based exploratory testing and scripting efficient automated tests.
While GenAI offers significant support, Test Engineers must still fully understand and be capable of manually performing all QA activities. Exercising good judgment is crucial in deciding which tasks can be delegated to AI and which require personal attention.
GenAI can serve as a form of governance throughout each stage of the SDLC, helping to ensure that processes are followed. By generating test cases, GenAI provides a checklist that acts as a safeguard before progressing to the next stage. A good checklist is one of the simplest yet most effective tools for maintaining quality, not only in software but in nearly any field. I recall reading an article about how the use of a surgical checklist cut infection and death rates by half (ref: National Library of Medicine).
Could a similar approach reduce production bugs by half if we implement test cases at every stage of the SDLC? While it's hard to quantify, I believe that whether they are UI/UX, requirements, or functional bugs, this approach will undoubtedly reduce the number of issues that slip through to production.