LLMs & Science: Publications Boom, Quality Stagnates
- A new study reveals how large language models (LLMs) are changing the landscape of scientific publishing, offering both benefits and drawbacks to research quality and accessibility.
- What: An analysis of more than 200,000 pre-print manuscripts to assess the impact of AI-assisted writing.
- Where: Data sourced from pre-print servers bioRxiv, SSRN, and arXiv.
AI’s Impact on Scientific Research: A Double-Edged Sword
AI Bridging Language Gaps in Research
AI writing tools, particularly large language models (LLMs), are having a significant effect on researchers, especially those for whom English is a second language. The study found that submission rates to the pre-print servers bioRxiv and SSRN nearly doubled for researchers with Asian names working at institutions in Asia after they began using AI. On arXiv, the increase was more than 40 percent.
This suggests that LLMs are helping researchers overcome a meaningful hurdle: producing high-quality, compelling text in English. The ability to clearly articulate research findings is crucial for dissemination and impact, and AI is providing a valuable tool for those who may not have native-level English proficiency.
Quantity vs. Quality: A Complex Relationship
The researchers noted that papers with clear and complex language are generally perceived as stronger and receive more citations, suggesting that writing quality often serves as a proxy for research quality. Non-AI-assisted papers that used complex language were more likely to be published in peer-reviewed journals.
However, this dynamic shifted for papers written with the assistance of LLMs. While LLM-assisted papers generally exhibited higher linguistic complexity, they were less likely to be published. The researchers observed that “For LLM-assisted manuscripts, the positive correlation between linguistic complexity and scientific merit not only disappears, it inverts.”
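To make the idea of an inverted correlation concrete, here is a minimal sketch in Python. It is not the study's method or data: the records are invented toy values, the complexity score is a hypothetical stand-in for whatever measure the researchers used, and a plain Pearson correlation is assumed purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented toy records: (complexity_score, published as 1 or 0).
# These are NOT data from the study; they only illustrate the sign flip.
human_written = [(0.2, 0), (0.4, 0), (0.5, 1), (0.7, 1), (0.9, 1)]
llm_assisted = [(0.5, 1), (0.6, 1), (0.7, 0), (0.8, 1), (0.9, 0), (1.0, 0)]


def complexity_publication_correlation(records):
    """Correlate the (hypothetical) complexity score with publication outcome."""
    scores = [score for score, _ in records]
    outcomes = [published for _, published in records]
    return pearson(scores, outcomes)


r_human = complexity_publication_correlation(human_written)
r_llm = complexity_publication_correlation(llm_assisted)
print(f"human-written: r = {r_human:+.2f}")
print(f"LLM-assisted:  r = {r_llm:+.2f}")
```

In this toy setup the coefficient is positive for the human-written group and negative for the LLM-assisted group, which is the pattern the quote describes: complexity stops signaling merit and starts pointing the other way.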
Diversifying Research Sources with AI Assistance
Despite concerns about quality, AI assistance isn’t entirely negative. When analyzing the references cited in AI-assisted papers, the researchers discovered that LLMs didn’t simply replicate existing citation patterns. Instead, the papers cited a broader range of sources, including more books and recent publications.
This suggests that AI has the potential to diversify the body of research considered by other scientists, provided researchers diligently check the references provided by AI tools – a practice that is critically important, as evidenced by recent reports of fabricated sources in educational materials.
Interpreting the Results: Caveats and Considerations
The researchers acknowledge several factors that could influence these findings. Many researchers may use AI to generate initial drafts that are then heavily edited, which could make AI use harder to detect and lead to undercounting. The actual prevalence of AI in scientific writing is therefore likely higher than the data suggests.
Another potential bias stems from the publication timeline. Manuscripts take time to be published, and using publication status as a measure of quality might unfairly penalize more recent drafts, which are more likely to involve AI. Despite these potential biases, the magnitude of the observed effects suggests they are unlikely to disappear entirely.
