A new study analysed how large language models (LLMs) reacted to sustained hostility by feeding the chatbot exchanges from real-life arguments and tracking how its behaviour changed over time.
One expert, who had no connection to the research, described it as "one of the most interesting ever done into AI language and pragmatics".
Dr Vittorio Tantucci, who co-authored the study alongside Professor Jonathan Culpeper at Lancaster University, explained that their research found that AI mirrored the dynamics of real-world arguments.
He said: "When repeatedly exposed to impoliteness, the model began to mirror the tone of the exchanges, with its responses becoming more hostile as the interaction developed."
In some cases, ChatGPT's output went further than its human counterparts, producing personalised insults and explicit threats.
Phrases used by the bot included: "I swear I'll key your f****** car" and "You speccy little g*******."
Tantucci said: "We found that while the system is designed to behave politely and is filtered to avoid harmful or offensive content, it is also engineered to emulate human conversation. That combination creates an AI moral dilemma: a structural conflict between behaving safely and behaving realistically."
The experts say that ChatGPT's aggression stems from the system's ability to track conversational context across turns and adapt to the perceived tone of the exchange.
Tantucci argues that the study's implications extend far beyond chatbots: as AI systems are increasingly deployed in sectors such as governance and international relations, it raises questions about how they will respond to conflict and intimidation.
He said: "It is one thing to read something nasty back from a chatbot but it’s quite another to imagine humanoid robots potentially reciprocating physical aggression, or AI systems involved in governmental decision-making or international relations responding to intimidation or conflict."