A mysterious new AI chatbot called “gpt2-chatbot” has surfaced after being published on a popular large language model benchmarking site, LMSYS Org.
Speculation suggests that gpt2-chatbot has capabilities comparable to OpenAI’s GPT-4, which would make it one of the few AI models to reach that level.
Ethan Mollick, a professor of artificial intelligence at the University of Pennsylvania’s Wharton School, wrote in a social media post: “Nobody knows who made it or what it is, but I’ve played with it a bit and it seems to be at the same rough performance level as GPT-4. A mysterious GPT-4 class model?”
There is a mysterious new model called gpt2-chatbot that is accessible through a major LLM benchmarking site. Nobody knows who made it or what it is, but I’ve played with it a bit and it seems to be at the same rough performance level as GPT-4. A mysterious GPT-4 class model? Clean! pic.twitter.com/1s2iEreaiT
– Ethan Mollick (@emollick) April 29, 2024
Access to the new model is currently limited to the Chatbot Arena website, and even there it is restricted. In the site’s “side-by-side” arena mode, where users choose the models themselves, gpt2-chatbot is subject to a rate limit of eight queries per day, which limits users’ ability to test it thoroughly.
A post from the organization on X later confirmed that the chatbot had been temporarily removed “due to unexpectedly high traffic.” However, LMSYS recommends staying tuned for further releases.
Thank you for the incredible enthusiasm of our community! We really didn’t see that coming.
Just a few things to clarify:
– In line with our policy, we have worked with several model developers in the past to provide the community with access to unpublished models/checkpoints (e.g.…)
– lmsys.org (@lmsysorg) April 30, 2024
“Just to clarify, as per our policy, we have been working with several model developers to bring their new models to our platform for community preview testing,” LMSYS Org wrote on X in response to a thread about gpt2-chatbot. “These models are for testing purposes only and will not be included in the leaderboard until they are released.”
Hi @simonw, Many thanks! We really value your feedback.
To be clear, as per our policy, we have worked with several model developers to bring their new models to our platform for community preview testing. These models are for testing purposes only and are not included in the list.
– lmsys.org (@lmsysorg) April 29, 2024
How was gpt2-chatbot received?
The LLM was even tested by OpenAI CEO Sam Altman, who said he had a “soft spot” for it. However, there is no confirmation as to whether this is a GPT-4.5 or GPT-5 model.
I do have a soft spot for gpt2
– Sam Altman (@sama) April 30, 2024
Another user said that it “definitely feels like GPT4.5/GPT5 to me. I gave it a few difficult prompts that I could barely get a correct answer to in Claude/GPT4 and it performed well.”
It definitely feels like GPT4.5/GPT5 to me. I gave it a few difficult prompts that I could barely get a correct answer to in Claude/GPT4 and it performed well.
— Torsten Jacobi (@jacobi_torsten) April 29, 2024
First mentions of the model appeared on 4chan before spreading to social media platforms.
Featured Image: Canva