The rapid proliferation of Large Language Models (LLMs) has led to their widespread adoption across scientific disciplines. However, a growing number of academic publications rely solely on off-the-shelf solutions applied to familiar tasks, without well-elaborated methodological innovation, theoretical grounding, or critical reflection. This trend has given rise to a form of superficial research in which generative LLMs are deployed without sound methodological motivation or risk-based reflection on their use. Based on these observations, this paper presents a critical examination of how LLMs are applied in scientific publications, contrasting performance-driven applications with conceptually rigorous studies that integrate LLMs within structured scientific frameworks. The paper first provides a concise description of the technical inner workings of LLMs and, building on that, of some of their inherent limitations. Next, through a review of recent literature, the analysis identifies epistemological risks, structural incentives, and reproducibility challenges that compromise the integrity of scientific practice. The study concludes by proposing guidelines for the responsible and meaningful use of LLMs in research, emphasizing the need for theoretical alignment, methodological transparency, and the preservation of human epistemic agency.