Reporting AI in education research: a methodological audit of 2025-2026 publications against an adapted TRIPOD-LLM checklist
DOI:
https://doi.org/10.55056/cte.1397
Keywords:
artificial intelligence, education research, reporting quality, TRIPOD-LLM, PRISMA, methodological audit, reproducibility, transparency
Abstract
We audit how the use of artificial intelligence is reported in recent education research. From a harvest of 29 848 arXiv preprints and 543 articles in six education and education-technology journals (2025-2026), we coded 220 papers (127 arXiv + 93 journal) against a 19-item checklist adapted from the TRIPOD-LLM reporting guideline, plus descriptive and outcome items. Coding was performed by open-weight large language models (served through Ollama) from titles and abstracts, conservatively (an item not stated is coded 0); we report cross-model agreement (mean Cohen's κ = 0.53, raw agreement 87%) in place of inter-human reliability, and disclose the AI-coded method in full. Overall reporting compliance is low: the median paper reports 32% of the checklist items, and the lowest-compliance items are the cross-cutting accountability signals - funding and conflicts of interest, missing-data handling, calibration/fairness, compute and cost, and the human-in-the-loop protocol (each ≤ 7%). Reporting quantity does not differ between arXiv preprints and journal articles in the unadjusted comparison (equal medians; unadjusted odds ratio ≈ 1); what differs is composition - preprints document the model machinery while journal articles document the study context, and neither documents accountability. A modest journal advantage in quantity emerges only after adjusting for study design. Empirical design is the dominant predictor of how many items a paper reports. A within-paper preprint-vs-published comparison was planned but could not be conducted, as no eligible pairs exist. We contribute the TRIPOD-LLM-for-education checklist - to our knowledge the first reporting checklist derived from TRIPOD-LLM and calibrated for general (non-medical) education research - as a citable artefact, and call on education journals to require accountability reporting at submission.
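The gap between raw agreement and Cohen's κ reported above is expected when most items are coded 0: two coders then agree often by chance alone, and κ discounts that chance agreement. The following is a minimal illustrative sketch, not the authors' pipeline. The two code lists (`model_a`, `model_b`) are hypothetical toy data, and the function names are our own.

```python
import math
from collections import Counter


def raw_agreement(a, b):
    """Share of items on which the two coders assign the same code."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty code lists of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement; p_e is the agreement expected by
    chance, computed from each coder's marginal code frequencies.
    """
    n = len(a)
    p_o = raw_agreement(a, b)
    freq_a, freq_b = Counter(a), Counter(b)
    p_e = sum(freq_a[k] * freq_b[k] for k in set(a) | set(b)) / (n * n)
    if p_e == 1.0:
        return math.nan  # kappa is undefined when chance agreement is total
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy data: 20 binary item codes from two coders, mostly 0
# (an item not stated is coded 0), with two disagreements.
model_a = [1, 1, 1, 1, 0] + [0] * 15
model_b = [1, 1, 1, 0, 1] + [0] * 15

p_o = raw_agreement(model_a, model_b)
kappa = cohens_kappa(model_a, model_b)
print(f"raw agreement = {p_o:.2f}, kappa = {kappa:.2f}")
```

On this toy data the coders agree on 18 of 20 items (raw agreement 0.90), yet chance agreement is already 0.68, so κ is about 0.69. The same mechanism can yield a high raw agreement alongside a moderate κ when checklist items are rarely reported.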
Data Availability Statement
The materials supporting this audit accompany the submission as a GitHub repository at https://github.com/imintii/TRIPOD-LLM-for-education: the raw OAI-PMH arXiv harvest (arxiv_oai_raw.jsonl, 29 848 records) and its in-window education-relevant subset (arxiv_filtered.jsonl, 11 718 records); the Crossref journal harvest (journals_raw.jsonl, 543 records); the combined coded dataset (coded_full_combined.csv, 220 papers) together with the per-model and cross-model coding files; the codebook; the analysis outputs for RQ1-RQ3 and the RQ4 status record; and the full pipeline (scripts/). No human-subjects data were collected; all inputs are public bibliographic metadata and abstracts.
License
Copyright (c) 2026 Iryna S. Mintii, Dmytro V. Verbovetskyi, Ostap Yu. Sirenko

This work is licensed under a Creative Commons Attribution 4.0 International License.
Accepted 2026-03-19
Published 2026-03-21
