Reporting AI in education research: a methodological audit of 2025-2026 publications against an adapted TRIPOD-LLM checklist

Iryna S. Mintii; Dmytro V. Verbovetskyi; Ostap Yu. Sirenko

doi:10.55056/cte.1397

Authors

Iryna S. Mintii, Institute for Digitalisation of Education of the NAES of Ukraine, https://orcid.org/0000-0003-3586-4311
Dmytro V. Verbovetskyi, Institute for Digitalisation of Education of the NAES of Ukraine, https://orcid.org/0000-0002-4716-9968
Ostap Yu. Sirenko, Institute for Digitalisation of Education of the NAES of Ukraine, https://orcid.org/0009-0006-4489-2110

Keywords:

artificial intelligence, education research, reporting quality, TRIPOD-LLM, PRISMA, methodological audit, reproducibility, transparency

Abstract

We audit how the use of artificial intelligence is reported in recent education research. From a harvest of 29 848 arXiv preprints and 543 articles in six education and education-technology journals (2025–2026), we coded 220 papers (127 arXiv + 93 journal) against a 19-item checklist adapted from the TRIPOD-LLM reporting guideline, plus descriptive and outcome items. Open-weight large language models (served through Ollama) coded each paper from its title and abstract under a conservative rule (an item not stated is coded 0); we report cross-model agreement (mean Cohen's κ = 0.53, raw agreement 87%) in place of inter-human reliability, and disclose the AI-coded method in full. Overall reporting compliance is low: the median paper reports 32% of the checklist items, and the lowest-compliance items are the cross-cutting accountability signals: funding and conflicts of interest, missing-data handling, calibration/fairness, compute and cost, and the human-in-the-loop protocol (each ≤ 7%). Reporting quantity does not differ between arXiv preprints and journal articles in the unadjusted comparison (equal medians; unadjusted odds ratio ≈ 1); what differs is composition: preprints document the model machinery, journal articles document the study context, and neither documents accountability. A modest journal advantage in quantity emerges only after adjusting for study design. Empirical design is the dominant predictor of how many items a paper reports. A within-paper preprint-vs-published comparison was planned but could not be conducted because no eligible pairs exist. We contribute the TRIPOD-LLM-for-education checklist, to our knowledge the first reporting checklist derived from TRIPOD-LLM and calibrated for general (non-medical) education research, as a citable artefact, and call on education journals to require accountability reporting at submission.
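The agreement and compliance statistics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names and the toy 0/1 codes are hypothetical, chosen only to show how raw agreement, Cohen's κ, and the per-paper share of reported checklist items are computed, and why raw agreement can be high while κ stays moderate when most items are coded 0.

```python
from statistics import median


def raw_agreement(a, b):
    """Share of items on which two coders assign the same code."""
    if len(a) != len(b) or not a:
        raise ValueError("code lists must be non-empty and equally long")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_o = raw_agreement(a, b)
    # Chance agreement from each coder's marginal label frequencies.
    labels = set(a) | set(b)
    p_e = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    if abs(1.0 - p_e) < 1e-12:
        # Both coders used a single identical label; kappa is undefined,
        # and returning 1.0 here is a convention assumed for this sketch.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


def compliance(codes):
    """Share of checklist items coded 1 (reported) for one paper."""
    return sum(codes) / len(codes)


# Hypothetical codes from two models on ten (paper, item) decisions.
model_a = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
model_b = [1, 0, 0, 0, 1, 0, 0, 0, 0, 1]
agreement = raw_agreement(model_a, model_b)
kappa = cohen_kappa(model_a, model_b)

# Hypothetical 19-item code vectors for three papers.
papers = [
    [1] * 6 + [0] * 13,
    [1] * 4 + [0] * 15,
    [1] * 9 + [0] * 10,
]
median_compliance = median(compliance(p) for p in papers)

print(f"raw agreement = {agreement:.2f}, kappa = {kappa:.2f}")
print(f"median compliance = {median_compliance:.2f}")
```

In the toy data the two coders agree on 8 of 10 decisions, yet because both code 0 most of the time, chance agreement is already 0.58 and κ falls to about 0.52. The same pattern is one plausible reading of the gap between the abstract's 87% raw agreement and its mean κ of 0.53 under the "not stated is coded 0" rule.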
