PUBLISH DATE: Jun 02 2025
UPD: Jun 03 2025

AI in Testing: Is the Future Already Here?

AI is very likely to transform testing in the upcoming years. In this article, our Lead QA Engineer, Yuliya Kozina, who has a Ph.D. in AI research, offers a short review of Anthropic AI and its capabilities in software testing.

Artificial intelligence is transforming the core methods in software development at an ever-increasing pace. For this reason, some people say that the future envisioned in science fiction and futurist works of art is already here. Among the chief areas of AI application is software testing. Traditional testing methods demand significant expenditures of human resources and time, resulting in a quality assessment process that is slow and error-prone. The application of artificial intelligence enables the automation of the whole process, significantly raising its efficiency, accuracy, and speed. In this article, I review the core applications of artificial intelligence in software testing and evaluate Anthropic’s AI as a testing instrument, outlining its advantages, weaknesses, and prospects for further development.

1.1. AI: My Personal Experience

Before proceeding to the discussion of AI in QA, I want to add some personal background. Between 2008 and 2011, I completed a Ph.D. program related to AI (to be more precise, my specialization was in “Systems and tools of artificial intelligence”), wrote my dissertation, and successfully defended it (Kozina, 2011). The entire process was far from simple, as anyone who has studied in a Ph.D. program would confirm. It’s not surprising to me that many software developers and AI researchers prefer to forgo Ph.D. programs altogether: they’re very demanding. Nonetheless, I decided to brave the challenges and can now show the relevance of my earlier research for QA practices.

During my Ph.D. studies, I analyzed methods of using AI for image processing and recognition (Kozina, 2011). I chose photomasks, special plates used in the production of microcircuits, as the specific object of my research. My primary goals included capturing an image of a photomask, recognizing its reference signs (also known as alignment marks), aligning the resulting image with an original reference, and, in the end, finding defects in the photomask. To recognize those reference signs, I created an AI system capable of aligning two photomask images: a production image (made in real conditions and potentially distorted by visual noise) and a reference image.

I used neural networks to recognize reference signs. In my case, the neural network architecture was based on a multilayer perceptron, and its training methodology relied on modified gradient methods applied in the hyperbolic wavelet transform space. This approach, presented in detail in my dissertation, enabled me to create a system that recognizes reference signs and highlights defects in images with high accuracy while requiring significantly less hardware than other models.

After successfully defending my Ph.D. dissertation, I started working as an assistant lecturer at my department, became an associate professor, and later decided to start working as a QA specialist. Based on my work experience, I also began teaching QA at my university as a separate course and, as of today, continue teaching it.

Despite this change in research and work focus, I am still highly interested in AI. Considering the rapid development of those systems in the preceding years, I decided to analyze whether it’s possible to use my previous AI knowledge to enhance testing.

1.2. AI in Testing: What Problems Does It Solve?

I think it’s evident even to non-specialists that software QA as a field currently faces several limitations that can potentially be solved with AI. Here are some relevant examples:

  • Limited testing time. A tester can come up with an almost unlimited number of input variations, even for very small features. It’s impossible to check everything: the number of input combinations grows exponentially with the number of parameters, and related problems, such as choosing a minimal set of tests that still covers all requirements, are NP-hard.
  • Prediction of potential defects in current app releases.
  • Vague tasks and instructions. It’s challenging to test something when you are unsure about the primary goal of testing in the first place.
  • Significant time and effort necessary for manual QA.
  • Establishment of bug severity and priority.

Having reviewed the research, I can say that the academic literature already proposes approaches to all of these problems.

For example, specialists use genetic algorithms and fuzzy logic to decrease time expenditures for regression testing (Nooraei Abadeh, 2021; Kothiyal, 2025; “AI-powered,” 2025).

One more approach is to use Bayesian networks to decrease the number of required tests (especially at the level of unit testing). For example, one can create a program that solves a particular mathematical problem based on different inputs. Here, the aim is to minimize the number of tests while still finding as many bugs and errors as possible. Instead of testing all possible combinations (a task that can be impractical time-wise), the system estimates which test cases are most likely to reveal defects and helps QA specialists run those first.
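To make the idea of probability-driven test selection more concrete, here is a minimal Python sketch. It is not a full Bayesian network: it is a much simpler Bayesian stand-in that estimates each test’s failure probability from its pass/fail history (a Beta-Bernoulli model) and runs the most failure-prone tests first. The class name, the function names, the prior, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunHistory:
    """Pass/fail history of one test (hypothetical data structure)."""
    name: str
    runs: int
    failures: int


def posterior_failure_rate(history, prior_failures=1.0, prior_passes=1.0):
    """Mean of the Beta posterior over the failure probability.

    With a Beta(prior_failures, prior_passes) prior and `failures` failures
    observed in `runs` runs, the posterior mean is
    (prior_failures + failures) / (prior_failures + prior_passes + runs).
    """
    return (prior_failures + history.failures) / (
        prior_failures + prior_passes + history.runs
    )


def prioritize(histories, budget):
    """Return the names of the `budget` tests most likely to fail."""
    ranked = sorted(histories, key=posterior_failure_rate, reverse=True)
    return [history.name for history in ranked[:budget]]


# Illustrative history; a test that has never run gets the prior mean of 0.5.
histories = [
    RunHistory("login", runs=50, failures=10),
    RunHistory("checkout", runs=4, failures=3),
    RunHistory("search", runs=100, failures=0),
    RunHistory("profile", runs=0, failures=0),
]
print(prioritize(histories, budget=2))  # -> ['checkout', 'profile']
```

A real Bayesian network would also model dependencies between code changes, modules, and tests; this sketch only shows how a probability estimate can replace exhaustive execution.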

And here’s an example of using algorithms to decide bug severity: one of the students in my university course has created an app that uses fuzzy logic to perform this task. The users can provide data on particular bugs, and the app then offers assessments of their severity as an output.
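I don’t reproduce the student’s app here, but the general technique can be sketched in a few lines of Python. The example below is only an illustration under my own assumptions: the two inputs (impact and reach, each on a 0 to 10 scale), the membership functions, the four rules, and the label thresholds are all hypothetical. It uses min for the fuzzy AND and a weighted average of the rule outputs as defuzzification.

```python
def rising(x, start, end):
    """Membership that ramps linearly from 0 at `start` to 1 at `end`."""
    if x <= start:
        return 0.0
    if x >= end:
        return 1.0
    return (x - start) / (end - start)


def fuzzify(x):
    """Degrees to which a 0-10 score belongs to the 'low' and 'high' sets."""
    high = rising(x, 2.0, 8.0)
    return {"low": 1.0 - high, "high": high}


# Hypothetical rule base: (impact set, reach set, severity on a 1-10 scale).
RULES = [
    ("high", "high", 10.0),
    ("high", "low", 7.0),
    ("low", "high", 4.0),
    ("low", "low", 1.0),
]


def severity_score(impact, reach):
    """Combine the rules: min as fuzzy AND, weighted average as output."""
    impact_sets, reach_sets = fuzzify(impact), fuzzify(reach)
    weighted, total = 0.0, 0.0
    for impact_set, reach_set, severity in RULES:
        strength = min(impact_sets[impact_set], reach_sets[reach_set])
        weighted += strength * severity
        total += strength
    # total is never zero: low + high = 1 for each input, so some rule fires.
    return weighted / total


def severity_label(score):
    """Map the numeric score to a label (thresholds are illustrative)."""
    if score >= 8.5:
        return "critical"
    if score >= 5.5:
        return "major"
    if score >= 2.5:
        return "minor"
    return "trivial"


score = severity_score(impact=5, reach=5)
print(score, severity_label(score))  # -> 5.5 major
```

The appeal of fuzzy logic for this task is that a bug does not have to be forced into one category at the input stage: a bug can be “somewhat high impact” and “mostly low reach” at the same time.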

1.3. Using Anthropic AI for Testing

In the course of my work, I had an opportunity to interact with Anthropic’s AI software and analyze its capabilities for software testing. Using this software isn’t difficult. You can deploy it in Docker via a terminal command and then add an API key. I can confirm that it functions well on Ubuntu Linux and, more importantly, has access to the Firefox browser, which is the default browser for Anthropic AI.

General Capabilities

Here are the capabilities of this software that are most relevant to testing. Anthropic’s AI can:

  1. Open the Firefox browser and go to the site requested by a QA specialist;
  2. Interact with various site elements via a computer tool;
  3. Take screenshots to assist analysis;
  4. Generate descriptions for feature behavior;
  5. Write test cases based on natural language input;
  6. Create automated tests in any programming language users select.

Automated Testing via AI

To facilitate automated tests, Anthropic AI can install the requisite tools, such as Python, Selenium WebDriver, and Firefox WebDriver (geckodriver), via bash. It can also generate and edit the Python scripts necessary for launching and configuring tests.

And what about mobile app testing? In this regard, Anthropic AI can create automated tests with Appium; the Android app under test is usually supplied as an APK file.

Using this AI, I obtained Python- and Selenium-based tests for this site, keenethics.com, generated from natural-language prompts, along with a project structure built around the Page Object pattern and detailed reports and logs.

The AI proposes its own locators and then launches the tests on its own. If something goes wrong, it can fix those automated tests (however, this process can sometimes be relatively slow).
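For readers unfamiliar with the Page Object pattern mentioned above, here is a minimal sketch of what such a test can look like in Python with Selenium. It is not the code the AI generated for me: the class name, the checks, and the `h1` locator are my own illustrative assumptions. The locator strategy is written as the plain string "tag name", which is the value of Selenium’s `By.TAG_NAME`, and Selenium is imported lazily so the page object can also be exercised with a stand-in driver.

```python
class HomePage:
    """Page Object for the site's landing page (locator is an assumption)."""

    URL = "https://keenethics.com"
    HEADING = ("tag name", "h1")  # same value as selenium's By.TAG_NAME

    def __init__(self, driver):
        self.driver = driver

    def open(self):
        self.driver.get(self.URL)
        return self

    def title(self):
        return self.driver.title

    def heading_texts(self):
        elements = self.driver.find_elements(*self.HEADING)
        return [element.text for element in elements]


def make_firefox_driver():
    """Start headless Firefox; needs selenium and geckodriver installed."""
    from selenium import webdriver  # lazy import, only needed for real runs

    options = webdriver.FirefoxOptions()
    options.add_argument("-headless")
    return webdriver.Firefox(options=options)


def check_home_page(driver):
    """Open the landing page and return a list of problems (empty = pass)."""
    page = HomePage(driver).open()
    problems = []
    if not page.title().strip():
        problems.append("page title is empty")
    if not page.heading_texts():
        problems.append("no <h1> heading found")
    return problems
```

To run it against a real browser, create the driver with `make_firefox_driver()`, pass it to `check_home_page`, and call `driver.quit()` afterwards. Keeping locators inside the page class means that when the site’s markup changes, only `HomePage` has to be updated, not every test.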

Finally, one more practical question remains: how can Anthropic’s users obtain projects created by the AI for their use cases? After all, it is deployed on a virtual machine. 

Everything is standard here: Anthropic AI allows users to copy project files from the virtual machine to local storage.

From the standpoint of mobile developers, the user experience is relatively straightforward, too. Anthropic AI recommends downloading APK files for mobile app testing via cloud services such as Dropbox and Google Drive.

All in all, here are the main use cases of AI in testing automation:

  • Creating project structure;
  • Generating automated test files;
  • Enabling Allure reports;
  • Saving all testing tools in an archive for convenient usage.

CI/CD and Load Testing via AI

My experience indicates that Anthropic AI can assist with CI/CD and load testing. For instance, the AI created a JMeter load test for one of the tasks I gave it. The information collected by it is sufficient to estimate response time, assess capacity, and analyze various bugs in the reviewed web app.
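The JMeter plan itself is not reproduced here. As a rough illustration of what such a load test measures, the following Python sketch uses only the standard library rather than JMeter: it fires a number of concurrent requests and summarizes response times and errors. The function names and default numbers are hypothetical, and `request_fn` stands for whatever single request you want to repeat.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def timed_call(request_fn):
    """Run one request and return (elapsed_seconds, succeeded)."""
    started = time.perf_counter()
    try:
        request_fn()
        succeeded = True
    except Exception:
        succeeded = False
    return time.perf_counter() - started, succeeded


def run_load(request_fn, total_requests=50, concurrency=5):
    """Issue `total_requests` calls using `concurrency` worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(
            pool.map(lambda _: timed_call(request_fn), range(total_requests))
        )
    durations = sorted(duration for duration, _ in results)
    errors = sum(1 for _, succeeded in results if not succeeded)
    # Nearest-rank 95th percentile of the sorted response times.
    p95_index = math.ceil(0.95 * len(durations)) - 1
    return {
        "requests": len(results),
        "errors": errors,
        "mean_s": mean(durations),
        "p95_s": durations[p95_index],
        "max_s": durations[-1],
    }
```

For example, `run_load(lambda: urllib.request.urlopen(url, timeout=10).read())` would exercise a page at a hypothetical `url`. Dedicated tools such as JMeter add ramp-up schedules, assertions, and reporting on top of this basic loop.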

API Testing via AI

Lastly, Anthropic AI completed an API test I asked it to do. I gave it the following prompt: “Could you, please, test Reqres API (https://reqres.in/api/users)? Your goal is to create a user; queries can include variables ‘name’ and ‘job’.”
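A check of this kind can be sketched in Python with the standard library alone. This is my own minimal illustration, not the AI’s output. It assumes that the service answers a successful creation with HTTP 201 and a JSON body that echoes ‘name’ and ‘job’ and adds an ‘id’; the function names are hypothetical. The public service may also require an API key header that the prompt does not mention, which is why the transport function accepts extra headers and can be swapped out.

```python
import json
import urllib.request

REQRES_USERS_URL = "https://reqres.in/api/users"  # URL quoted in the prompt


def http_post_json(url, payload, headers=None):
    """Send a JSON POST request and return (status_code, parsed_body)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", **(headers or {})},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status, json.loads(response.read().decode("utf-8"))


def check_create_user(name, job, post=http_post_json):
    """Create a user and return a list of failed checks (empty = pass)."""
    status, body = post(REQRES_USERS_URL, {"name": name, "job": job})
    problems = []
    if status != 201:  # assumed success code for resource creation
        problems.append(f"expected HTTP 201, got {status}")
    if body.get("name") != name:
        problems.append("response does not echo 'name'")
    if body.get("job") != job:
        problems.append("response does not echo 'job'")
    if "id" not in body:
        problems.append("response has no 'id' field")
    return problems
```

Calling `check_create_user("morpheus", "leader")` sends a real request; passing a custom `post` function lets you add headers or run the checks offline against a canned response.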

Conclusions

Despite downsides like hallucinations and occasional slowdowns, Anthropic AI technology is a promising instrument that significantly raises testing efficiency. I must admit that modern AI’s rapid evolution sometimes surprises even an AI specialist like me. Nonetheless, I don’t think that it will replace testers in the near future. What is realistic, however, is that the entry requirements for newcomers to IT will become lower. At the same time, the technology can remove a lot of routine tasks for experienced specialists. So, returning to the title of this article: yes, the future, at least as envisioned in recent science fiction, is here. AI is already automating thousands of tasks right now.

References

  1. AI-powered regression testing: Driving software QA forward. Amzur Technologies. (2025, April 25). https://amzur.com/blog/ai-role-in-regression-testing-automation/
  2. Kothiyal, P. (2025, March 24). AI-powered regression testing: Improve software quality. Talent500 blog. https://talent500.com/blog/ai-powered-regression-testing/
  3. Nooraei Abadeh, M. (2021). Genetic-based web regression testing: An ontology-based multi-objective evolutionary framework to auto-regression testing of web applications. Service Oriented Computing and Applications, 15(1), 55–74. https://doi.org/10.1007/s11761-020-00312-y 
  4. Kozina, Yu. (2011). Image recognition methods in the wavelet transform space for template quality control [Методи розпізнавання зображень у просторі вейвлет-перетворення при контролі якості шаблонів] (dissertation). Odesa.