ChatGPT went viral in its first week, reaching a million users in five days, a pace few expected. It is a conversational AI that answers questions in natural language and writes poems, screenplays, social media posts, descriptive essays, and much more. As soon as we had access to the platform, our first thought was how we could use it to make the lives of software testers easier.
About ChatGPT
OpenAI’s ChatGPT, the artificial intelligence-powered chatbot that has gone viral, crossed one million users in less than a week after it was made available for public testing on Nov 30, 2022. OpenAI CEO Sam Altman confirmed the milestone in a tweet. His post also attracted questions on whether the company plans to keep ChatGPT free forever, to which he replied that they “will have to monetize it somehow at some point,” and added that the computing costs of running it are “eye-watering.”
Twitter CEO Elon Musk asked OpenAI CEO Altman “what is the average cost per chat for OpenAI.” Altman said this is “probably single-digits cents per chat; trying to figure out more precisely and also how we can optimize it.”
ChatGPT can also detect errors in your code. A recent study found that ChatGPT can identify and correct about 77.5% of the bugs in QuixBugs, a common bug-fixing benchmark. On that benchmark, it performs competitively with deep-learning approaches such as CoCoNuT and Codex.
The study found that ChatGPT’s bug-fixing performance was superior to the outcomes noted for the conventional program repair approaches. The researchers discovered that out of the 40 provided bugs, the model could fix 31 of them. The model could occasionally quickly generate a solution, but other times it required more back-and-forth to comprehend the problem. Despite this variation in bug-fixing, the researchers pointed out that it might be advantageous for end users since making the same request more than once may result in a clear resolution.
What is ChatGPT?
ChatGPT is a variant of OpenAI’s GPT (Generative Pre-trained Transformer) language model. It was pre-trained on a sizable dataset of conversational text and can be fine-tuned for a wide range of natural language processing tasks, including dialogue systems and question-answering.
ChatGPT is based on GPT-3.5, a language model that uses deep learning to produce human-like text. However, while the older GPT-3 model only took text prompts and tried to continue on that with its own generated text, ChatGPT is more engaging. It is much better at generating detailed text and can even write poems. Another unique characteristic is memory. The bot can remember earlier comments in a conversation and recount them to the user.
What happens when I create test scenarios, and how do I do it?
In this experiment, I instruct ChatGPT to generate a test scenario by writing a few sentences as a prompt.
Playground
The Playground is another web page that is part of the chatbot application. Users can open 49 different example tools there.
The name comes from OpenAI’s GPT-3 Playground, introduced in November 2021, where beta testers were given the opportunity to try out the GPT-3 model.
This time, ChatGPT came with plenty of examples.
Text To Command :
- Translate text into programmatic commands.
English To Other Languages :
- Translates English text into French, Spanish, Japanese, etc. in a single click.
Natural Language To OpenAI API :
- Create code to call the OpenAI API using a natural language instruction.
Python Bug Fixer :
- There are several ways of structuring the prompt to check for bugs. Here we add a comment suggesting that the source code is buggy, and then ask Codex to generate fixed code.
Q&A :
- Answer questions based on existing knowledge.
SQL Translate :
- Translate natural language to SQL queries.
Python To Natural Language :
- Explain a piece of Python code in a human-understandable language.
VR Fitness Idea Generator :
- Create ideas for fitness and virtual reality games.
Recipe Creator (eat at your own risk) :
- Create a recipe from a list of ingredients.
Etc.
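The Python Bug Fixer example above describes a prompt that marks the source as buggy and then asks the model for a corrected version. A minimal sketch of how such a prompt could be assembled is shown below. The function name `build_bug_fix_prompt`, the exact comment headers, and the sample `average` function are our own illustrative assumptions, not OpenAI's official template, and no API call is made.

```python
def build_bug_fix_prompt(buggy_source: str) -> str:
    """Wrap source code in comment headers that mark it as buggy and request a fix.

    The header wording is an assumption chosen for illustration.
    """
    return (
        "##### Fix bugs in the below function\n"
        "\n"
        "### Buggy Python\n"
        f"{buggy_source.strip()}\n"
        "\n"
        "### Fixed Python\n"
    )


# Hypothetical buggy function: the "+ 1" makes the average wrong.
buggy = '''
def average(values):
    return sum(values) / len(values) + 1
'''

prompt = build_bug_fix_prompt(buggy)
print(prompt)
```

Ending the prompt with the "Fixed Python" header invites the model to continue with the corrected code. The result still needs to be reviewed and tested by a person.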
Alternatives to ChatGPT
Google Bard
Among the many alternatives to ChatGPT, Google is the most recent to jump on board. Google unveiled Bard, a powerful AI tool, on February 6, 2023. Bard is expected to give ChatGPT tough competition. It is powered by LaMDA (Language Model for Dialogue Applications) and draws on information from the web to give users fresh, up-to-date answers to their queries. Since it is still in testing, the general public cannot use it yet.
Google will, however, “open it up to trusted testers” before making it more widely available to the public in the coming weeks. The differences between Bard and ChatGPT are considerable, and they should become clearer once Bard is widely accessible.
Replika
Replika is another ChatGPT substitute that combines scripted dialogue content with the GPT-3 model. It serves as your personal companion. You can discuss any subject that interests you with Replika, such as love, life, or your hobbies. Besides holding conversations, Replika can imitate users’ texting styles, and it can even make video calls. In other words, you can talk to Replika just as you would talk to a person. One of its best features is that it is available on both the App Store and Google Play.
Future Of Testing with AI: Will ChatGPT replace software testing?
When we see these kinds of developments, our initial concern is often whether they will be able to replace us or be a helpful tool for our job.
As part of our effort to better understand the impact of an AI like ChatGPT, we attempted to use it in some of the activities that testers might do as part of their work: designing test cases, test ideas or test data, automating scripts, reporting errors and assembling SQLs to generate test data or verify results.
Interestingly, when we repeated the same question, ChatGPT seemed to improve its responses. On a second round (a few minutes later), we observed the addition of the missing steps. A few days later, we repeated the request, and it clarified that it had returned general steps because it could not browse the internet.
Designing Test Cases And Test Data
At first sight, we thought that we got what we were looking for. But taking a closer look, it is clear that the test is written at a very high level, missing important information that could mislead any tester.
But for an AI, despite having some missing steps, such as clicking on the right-corner “Shopping Cart” button, it’s a pretty impressive output.
Imagine we have to test a system with specific inputs, and we are running out of ideas. We could brainstorm ideas with the chat! We played around a lot with requests and questions like:
- Give me test data for a login form.
- Can you give me test ideas for a bank transaction?
- Can you help me with test data for a date picker including edge cases?
The answers were fascinating and precise, and in a few cases they even explained why we should try a given case.
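To give a sense of what edge-case test data for a date picker can look like, here is a small sketch. The candidate list and the helper `is_valid_date` are our own illustration, not ChatGPT's actual answer. It relies only on Python's standard `datetime.date`, which raises `ValueError` for impossible dates.

```python
from datetime import date


def is_valid_date(year: int, month: int, day: int) -> bool:
    """Return True if the calendar date exists, False otherwise."""
    try:
        date(year, month, day)
        return True
    except ValueError:
        return False


# Hypothetical edge cases a tester might feed to a date picker.
candidates = [
    (2024, 2, 29),   # leap day in a leap year
    (2023, 2, 29),   # leap day in a non-leap year
    (1900, 2, 29),   # century year not divisible by 400
    (2000, 2, 29),   # century year divisible by 400
    (2023, 4, 31),   # day 31 in a 30-day month
    (2023, 12, 31),  # last day of the year
    (2023, 13, 1),   # month above the valid range
    (2023, 0, 10),   # month below the valid range
    (2023, 6, 0),    # day below the valid range
]

for year, month, day in candidates:
    status = "valid" if is_valid_date(year, month, day) else "invalid"
    print(f"{year:04d}-{month:02d}-{day:02d}: {status}")
```

A list like this works as an oracle: the date picker should accept the valid entries and reject the invalid ones.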
Reporting Errors
Reporting errors is not simple: we need to find good ways to communicate certain aspects without hurting anyone’s feelings. ChatGPT can help us improve our writing, conveying the information with greater clarity and in a friendlier, more effective way, especially if we are reporting in a language other than our native tongue.
Combining Test Data
Pairwise (all-pairs) is an essential technique in testing, but it is challenging to calculate by hand; you need a tool. So we asked ChatGPT to calculate it for us, with its variables and values.
While ChatGPT interpreted the request correctly (which is not an easy task), it got it wrong: it applied the Cartesian product (another data combination technique) instead of all-pairs.
This shows us that we have to be careful with the tool. It can help us with many things, but we have to pay attention, not trust it entirely, and look critically at the result.
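The difference between the two techniques is easy to check with a few lines of code. The sketch below compares the Cartesian product with a simple greedy all-pairs selection. The parameter names and values are hypothetical, and the greedy strategy is just one straightforward way to build a pairwise suite. It is not the algorithm of any particular tool, and it does not guarantee the smallest possible suite.

```python
from itertools import combinations, product


def cartesian(parameters):
    """Every combination of every value: the Cartesian product."""
    names = list(parameters)
    return [dict(zip(names, values)) for values in product(*parameters.values())]


def pairs_in(row):
    """All pairs of (name, value) assignments covered by one test row."""
    return set(combinations(sorted(row.items()), 2))


def all_pairs(parameters):
    """Greedy pairwise suite: repeatedly pick the row covering the most uncovered pairs."""
    candidates = cartesian(parameters)
    uncovered = set()
    for row in candidates:
        uncovered |= pairs_in(row)
    suite = []
    while uncovered:
        best = max(candidates, key=lambda row: len(pairs_in(row) & uncovered))
        suite.append(best)
        uncovered -= pairs_in(best)
    return suite


# Hypothetical variables and values for illustration.
parameters = {
    "browser": ["Chrome", "Firefox", "Safari"],
    "os": ["Windows", "macOS"],
    "language": ["en", "es"],
}

full = cartesian(parameters)
suite = all_pairs(parameters)
print(f"Cartesian product: {len(full)} rows")
print(f"All-pairs suite:   {len(suite)} rows")
for row in suite:
    print(row)
```

The all-pairs suite still exercises every pair of values at least once, but with fewer rows than the Cartesian product. That is why confusing the two matters in practice.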
Assembling SQLs
Here we made a very concrete request for an SQL query to retrieve specific data from some tables that were not completely described. We liked that the system gave a good explanation and provided the code.
Of course, it is necessary to review its answer in detail to determine whether it is optimal. Even so, sometimes it is easier to start from something already built than to start from scratch.
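One practical way to review a generated query is to run it against a tiny in-memory database where we already know the expected result. The sketch below uses Python's built-in `sqlite3` module. The `customers` and `orders` tables, their data, and the query itself are hypothetical stand-ins, not the ones from our experiment.

```python
import sqlite3

# Hypothetical query of the kind an assistant might return:
# total order amount per customer.
generated_sql = """
SELECT c.name, SUM(o.amount) AS total
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
GROUP BY c.id, c.name
ORDER BY c.name
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    amount INTEGER NOT NULL
);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Luis'), (3, 'Marta');
INSERT INTO orders VALUES (1, 1, 50), (2, 1, 25), (3, 2, 10);
""")

rows = conn.execute(generated_sql).fetchall()
print(rows)  # [('Ana', 75), ('Luis', 10)]
conn.close()
```

Marta, who has no orders, does not appear in the result because the query uses an inner join. Whether that is correct depends on the requirement, and it is exactly the kind of detail a tester should check before trusting the generated SQL.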
More On: Will ChatGPT Replace Software Testing?
ChatGPT is a promising tool, but our job as testers requires an analytical and logical mindset and an empathetic view of the user’s reality, which makes it an intellectually challenging activity. Cutting to the chase, ChatGPT shouldn’t be taken as a “replacement” tool, but it can help in different situations when used with care.
We can use it in creative ways to improve our work or develop valuable ideas, for instance:
- Generating test cases or test data, helping with test ideas.
- Assisting with drafting and brainstorming.
- Avoiding blank page syndrome when creating SQL queries and trying to generate test ideas for a particular flow.
- Improving the way we communicate errors or results.
- Refactoring code or generating some base for what you are trying to implement.
So, can ChatGPT test better than us? It cannot, as mentioned by the CEO of OpenAI in this tweet:
“The biggest risk is people believing everything without checking. In my opinion, the human being will always need to supervise and validate what is done, which is a great opportunity for software testers,” said Fabián Baptista, co-founder and CTO of Apptim.
“That is why it is increasingly important for testers to learn to program and understand how things work, as well as to understand these intelligent assistants so that they can help us do a better job,” he continued.
“It is an initial path in which we must pay close attention to the biases that can be generated. ChatGPT is a very powerful tool, and it’s crucial that we can use it with critical thinking,” highlighted Federico Toledo.
“The tool has many blind spots. Human and mature testing will continue to be necessary to consider accessibility, cybersecurity, different risk situations, and much more,” stressed Vera Babat, Chief Culture Officer of Abstracta.
Limitations
With greater adoption, ChatGPT is changing and evolving, but it still has some limitations. Its limitations are briefly listed below:
- Lack of domain expertise: Although ChatGPT can produce text that resembles human writing, it may fall short of a human expert in areas that require specific domain knowledge.
- Factual mistakes: Since ChatGPT was trained on a sizable amount of text data, factual mistakes or misinformation in the training data may have ended up in the model.
- Code errors: Code generated by ChatGPT may not always work, because the model may not fully comprehend the task context or requirements. The code may also be poorly optimized or contain errors. Before using generated code in any project, it is crucial to verify and test it thoroughly.
- Biases: The model may have picked up biases from a large amount of text data gathered from the internet it was trained on. As a result, offensive or discriminatory content can be generated.
- Limited Accountability: Because ChatGPT is a machine-learning model, it can be challenging to understand how it produces a particular output. As a result, holding the model accountable for any errors or inaccuracies can be challenging.
- Limited Context Understanding: ChatGPT is trained to produce text based on input, but it may not fully comprehend the context or intent of that input. As a result, the input may be misinterpreted, and the output might not be entirely accurate.
OpenAI, the organization that created ChatGPT, has been working on solutions to these problems. Always verify the model’s output, especially when using it for essential tasks.
Conclusion
ChatGPT can produce effective, systematic test cases. The generated scenario can act as a good starting point for developing the actual scenarios, despite being fairly simple and covering only the visible elements.
Its output requires a lot of tweaking before it actually integrates with our software.
It is also unable to carry out the primary duty of a software tester, which is to execute the test.