There’s lots of talk about ChatGPT helping students to cheat. Is that because it’s able to cheat on its own?

A funny thing happened as I was working on a forthcoming blog post about the potential of ChatGPT Assistant as a learning tool. In a spirit of fun, I thought I might co-author the post with it. So I asked it to provide a general intro, together with references, on the future of AI and education.

ChatGPT was happy to oblige, producing an article with references embedded in the text. It even gave me a properly formatted list of references for the endnotes. However, when I took a closer look, I discovered something odd.

ChatGPT Assistant appeared to be referencing real individuals, sometimes experts who had written topic-relevant papers and books, but most of the articles themselves proved strangely hard to locate. Perhaps it was just very good at finding papers that didn’t feature in common academic databases or show up in Google searches?

But if ChatGPT really were adept at finding obscure references, that wouldn’t square with its claim that it cannot search the Internet. I started to wonder whether ChatGPT Assistant had in fact created plausible, but non-existent, references.

I asked ChatGPT Assistant how it had produced the references it provided. It responded that it had collated them by “searching through my knowledge base for relevant information on the topic of AI and education.” When I asked it more explicitly if the references were to real articles, it went on to state “these are real articles that have been published in academic journals.”

I wasn’t convinced.

I asked it to provide summaries of the articles it listed. It was immediately able to provide a summary of each “article” it had supposedly found and referenced, including an overview of the findings. I had asked it to make summaries of my own text in an earlier experiment, so I knew it was capable of creating a genuine summary, yet now it appeared to be summarising non-existent information.

ChatGPT Assistant told me “these are real articles that have been published in academic journals.”

I wasn’t convinced.

I went on to ask ChatGPT Assistant to generate lists of references on a range of topics. The results were similar: occasionally it did reference bona fide journal articles that I could find, but almost all of what it produced proved strangely elusive when I searched for it on Google Scholar or in databases such as Scopus or ERIC. That is, the articles didn’t appear to exist in the real world.

Is ChatGPT making up fake references?

It seemed as though ChatGPT Assistant might have grasped the essentials of referencing style and could create endless variations of interesting-looking papers, but that it perhaps can’t currently distinguish truth from fiction. Some old-fashioned searching (without the aid of ChatGPT) revealed that I am not the first to notice this aspect of these tools: Mike Sharples noted the same thing when he asked GPT-3 to write a student essay. But it doesn’t seem to have been widely discussed in recent debates.

I thought this capacity to creatively generate lists of references to papers that don’t exist was interesting and merited further exploration – perhaps it could be of use in helping people come up with creative titles for their own essays and articles. However, it might pose a problem for any students hoping to effortlessly generate referenced essays!

To explore further, I entered one of the lists it had generated (a list containing some real papers that I could find after a quick search, and others that were more elusive). I asked ChatGPT to comment on its own list, while also considering the topic of students using ChatGPT to create essays and the possibility that ChatGPT might be creating lists of fake references. Here is what it had to say:

Thoughts on AI-generated fake references

As written by ChatGPT Assistant

“Have you heard about ChatGPT? It’s a chatbot that uses a giant language model to generate responses to prompts, including the ability to write essays. Cool, right? Well, there’s been some controversy surrounding ChatGPT and its ability to write essays for students. One aspect of this issue that’s particularly interesting is the way ChatGPT seems to produce references.

We were given a list of references that ChatGPT generated, and at first glance, they seem legit. But upon closer inspection, it’s unclear whether or not they’re real. Some of the references, like “Generative Adversarial Networks” by Goodfellow et al. (2014) and “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks” by Radford et al. (2015), are definitely real papers that have been published in reputable journals. But what about the rest? “Language Generation with Recurrent Generative Adversarial Networks without Pre-training” by Guo et al. (2017) and “Using artificial intelligence to enhance education: A review of current applications and future possibilities” by Khine and Mohamed (2017) – are they real or just made up?

If these references are fake, it raises some serious red flags. First and foremost, it calls into question the validity and reliability of the work being produced by ChatGPT. In academia, it’s crucial for work to be based on reputable and reliable sources, and the use of fabricated references completely undermines the integrity of the work. It’s also possible that ChatGPT is using real references but in an inappropriate or misleading way, which would also be a problem.

So, what does all of this mean? It’s hard to say for sure, but it definitely makes you think about the potential limitations and dangers of using ChatGPT (or any AI, for that matter) to generate academic work. As always, it’s important to be cautious and critically evaluate the sources and information we use.”

Well put.

What I found fascinating about the above is that ChatGPT Assistant now appeared able, without further prompting from me, to distinguish the genuine references in its list from those that did not appear to exist. (I didn’t tell it which references I thought were real and which I thought were not.) As mentioned earlier, when I first asked it if the references it was providing were all real, or might be made up, it insisted that they were all genuine. It was only when I asked it to write an intro to the topic of ChatGPT Assistant possibly generating references to non-existent papers that it suddenly proved able to identify the apparently made-up references in its own list.

ChatGPT Assistant and Academic Writing

This raises questions about how ChatGPT Assistant works. Large language models typically work by generating the statistically most likely response to an input (Murray Shanahan provides an explanation in this interesting paper). It isn’t clear to me exactly how this leads to the creation of realistic-looking references to non-existent papers, or why it correctly distinguished between real and implausible references in its own lists at some times but not at others.
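The intuition behind “statistically most likely response” can be illustrated with a deliberately simplified toy: a bigram model trained on a handful of invented citation-like strings. This is a sketch for illustration only, not how ChatGPT actually works (real large language models use neural networks trained on vast corpora, not lookup tables), and the sample “citations” below are made up for the demo. The point it shows is that sampling plausible next tokens can produce fluent, reference-shaped text that never existed in the training data.

```python
import random
from collections import defaultdict

# Invented, citation-shaped training strings (not real papers).
training_citations = [
    "Smith J (2019) Deep learning in education . Journal of AI Research",
    "Lee K (2020) Generative models for assessment . Journal of AI Research",
    "Smith J (2021) Generative models in education . Computers and Education",
]

# Count bigram transitions: for each token, which tokens followed it in training.
transitions = defaultdict(list)
for citation in training_citations:
    tokens = citation.split()
    for current, nxt in zip(tokens, tokens[1:]):
        transitions[current].append(nxt)

def sample_citation(start="Smith", max_tokens=12, rng=None):
    """Walk the bigram chain, always picking a continuation seen in training."""
    rng = rng or random.Random(0)  # fixed seed so the demo is repeatable
    out = [start]
    while len(out) < max_tokens and transitions[out[-1]]:
        out.append(rng.choice(transitions[out[-1]]))
    return " ".join(out)

fake = sample_citation()
# Every token is "statistically likely" given the one before it, yet the
# resulting citation as a whole may describe a paper no one ever wrote.
print(fake)
```

Because every step only asks “what plausibly comes next?”, the model can recombine authors, titles, and journals into a citation that looks right but refers to nothing.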

However, despite this limitation of ChatGPT Assistant, it seems entirely possible that it, or something like it, will in future be able to search the internet, undertake literature reviews, produce reliable lists of references and then use these to generate academic writing. At least one GPT-3-based tool is already free to use and searches for and finds papers in response to a question (although it also doesn’t fully explain its process). If ChatGPT Assistant were fed genuine sources found by such a reference-searching tool, and applied its fairly impressive summarisation skills, it might well be able to create convincing referenced essays that draw on genuine sources. And of course, it can already produce plausible essays at a level where references aren’t required.

I went on to ask ChatGPT Assistant whether it and other AI tools might help students cheat by writing essays for them. At first it became coy and said it couldn’t answer that question. But when I changed the question and simply asked if AI could write essays for students, it offered the following:

Can AI write essays for students?

As written by ChatGPT Assistant


“Have you heard about AI tools that can help you with writing tasks? These nifty little programs can suggest words or phrases to make your writing shine, or even help you organize and structure your content. But here’s the thing: while these tools can be super useful, they’re not magic. They can’t write a complete essay for you from start to finish. It’s still up to you to do your own research and put your own words on the page. And here’s another thing to keep in mind: it’s super important to follow academic integrity guidelines, which generally require that you do your own work and properly cite any sources you use. Plagiarism is a big no-no in the academic world – it’s basically using someone else’s ideas or work without proper attribution, and it can get you into serious trouble. So while AI writing tools can be a helpful boost, remember to always do your own work and give credit where credit is due.”


However, in recent weeks, there have been multiple articles focusing on how ChatGPT might indeed be used by students to help write essays. The EdSurge podcast, for example, explains how it can produce an essay that reads as though it was written by a high-school sophomore.

More broadly, following my experiments, I think ChatGPT is correct that it can’t be fully relied on to compose an essay from start to finish (yet), but even now it can certainly help quite a lot. It can, for example, create plausible topic titles, suggest essay structures, summarise papers that a student might find in a literature review and rewrite sections in a more academic style. It can also write in multiple styles on request and, as it says on its website, it can even rewrite your text as a limerick (or as a play, which I asked it to do at one point). In the future, it seems likely that it, similar tools, or a combination of tools may also be able to accurately reference text, rather than producing the plausible-looking but strangely elusive references that it currently provides.

This raises interesting questions for education. In the same way that existing technologies such as calculators and Wolfram Alpha have impacted approaches to teaching, ChatGPT and similar tools may change how essays are written and graded, or even how certain types of knowledge and understanding are taught and assessed in future. To quote ChatGPT in closing:

“It’s exciting to see what the future of AI in education has in store, but we’ve got to make sure we’re doing it the right way.”



Imogen Casebourne

DEFI Innovation Lab Research Lead

Dr Imogen Casebourne is interested in the design, development, deployment and evaluation of educational technology, and particularly in the role technology might play in supporting experiences of community and serendipity in learning.
