ISSN : 1738-0294(Print)
ISSN : 2288-8853(Online)
Journal of Mushrooms Vol.23 No.4 pp.241-255
DOI : http://dx.doi.org/10.14480/JM.2025.23.4.241

Interpretable deep learning framework for predicting cordycepin production in Cordyceps militaris cultivated on Pinus densiflora sawdust

Si Young Ha, Hyeon Cheol Kim, Jae-Kyung Yang*
Department of Environmental Materials Science, Institute of Agriculture & Life Science, Gyeongsang National University, Jinju 52828, Republic of Korea

Abstract

Cordycepin is the principal bioactive compound produced by Cordyceps militaris and exhibits diverse pharmacological properties. However, cordycepin production is highly sensitive to cultivation conditions, which leads to widely variable yields and complicates process optimization. In this study, an interpretable machine learning framework was established to predict the cordycepin produced by C. militaris cultivated on Pinus densiflora sawdust. Three key cultivation parameters (input weight, growth weight, and particle size) were quantified using submerged mycelial culture, and the cordycepin content was measured by high-performance liquid chromatography (HPLC). Four predictive models (random forest, support vector machine, XGBoost, and artificial neural network) were optimized through a randomized hyperparameter search and evaluated using internal validation and Tropsha’s external validation criteria for quantitative structure-activity relationship (QSAR) models. XGBoost achieved the highest validation accuracy (root mean square error, RMSE = 42.67 μg/mL), whereas random forest showed the most reliable external performance (R² = 0.898). Shapley additive explanations (SHAP) revealed that input weight most strongly influenced cordycepin production, followed by growth weight and particle size, with distinct nonlinear and interaction-driven effects among the cultivation variables. Kernel density and dependence analyses confirmed the occurrence of multimodal production regimes associated with substrate loading and particle size characteristics. Finally, the best-performing model was deployed through a Streamlit-based graphical user interface, enabling real-time prediction of cordycepin concentration with a 95% confidence interval. Collectively, the results demonstrate the utility of interpretable AI-driven modeling for unveiling complex biological responses and provide a practical decision-support tool for optimizing cordycepin production in fungal biotechnology.
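
The external validation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes one commonly cited formulation of the Golbraikh-Tropsha criteria (R² > 0.6, a through-origin slope between 0.85 and 1.15, (R² − R0²)/R² < 0.1, and |R0² − R0'²| < 0.3); the exact formulation and thresholds used by the authors are not given here. The function name `tropsha_check` and the sample concentrations are hypothetical and are not taken from the study.

```python
from math import sqrt


def tropsha_check(y_obs, y_pred):
    """Evaluate external predictions against one common form of Tropsha's criteria.

    Hypothetical helper for illustration; thresholds are the commonly cited ones.
    """
    n = len(y_obs)
    mean_obs = sum(y_obs) / n
    mean_pred = sum(y_pred) / n

    # Sums of squares around the means and the cross term
    ss_obs = sum((y - mean_obs) ** 2 for y in y_obs)
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    cov = sum((y - mean_obs) * (p - mean_pred) for y, p in zip(y_obs, y_pred))

    # Squared Pearson correlation between observed and predicted values
    r2 = cov ** 2 / (ss_obs * ss_pred)

    # Slopes of the regression lines forced through the origin
    sum_yp = sum(y * p for y, p in zip(y_obs, y_pred))
    k = sum_yp / sum(p * p for p in y_pred)        # observed vs. predicted
    k_prime = sum_yp / sum(y * y for y in y_obs)   # predicted vs. observed

    # Coefficients of determination for the through-origin fits
    r0_2 = 1 - sum((y - k * p) ** 2 for y, p in zip(y_obs, y_pred)) / ss_obs
    r0p_2 = 1 - sum((p - k_prime * y) ** 2 for y, p in zip(y_obs, y_pred)) / ss_pred

    rmse = sqrt(sum((y - p) ** 2 for y, p in zip(y_obs, y_pred)) / n)

    cond_a = (r2 - r0_2) / r2 < 0.1 and 0.85 <= k <= 1.15
    cond_b = (r2 - r0p_2) / r2 < 0.1 and 0.85 <= k_prime <= 1.15
    passed = r2 > 0.6 and (cond_a or cond_b) and abs(r0_2 - r0p_2) < 0.3

    return {"r2": r2, "k": k, "k_prime": k_prime,
            "r0_2": r0_2, "r0p_2": r0p_2, "rmse": rmse, "passed": passed}


# Hypothetical cordycepin concentrations (μg/mL), for illustration only
observed = [100.0, 200.0, 300.0, 400.0, 500.0]
predicted = [110.0, 190.0, 310.0, 390.0, 510.0]
result = tropsha_check(observed, predicted)

# A systematically biased model (all predictions doubled) correlates perfectly
# with the observations but fails the slope condition.
biased = tropsha_check(observed, [2 * y for y in observed])
```

The second call shows why the slope conditions matter: a high R² alone does not guarantee that predictions lie close to the observed values.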
