%0 Journal Article
%@ 1438-8871
%I JMIR Publications
%V 27
%N
%P e59002
%T Text-Based Depression Prediction on Social Media Using Machine Learning: Systematic Review and Meta-Analysis
%A Phiri,Doreen
%A Makowa,Frank
%A Amelia,Vivi Leona
%A Phiri,Yohane Vincent Abero
%A Dlamini,Lindelwa Portia
%A Chung,Min-Huey
%+ School of Nursing, College of Nursing, Taipei Medical University, 250 Wu-Xing Street, Taipei, 110, Taiwan, 886 227361661 ext 6317, minhuey300@tmu.edu.tw
%K depression
%K social media
%K machine learning
%K meta-analysis
%K text-based
%K depression prediction
%D 2025
%7 11.4.2025
%9 Review
%J J Med Internet Res
%G English
%X Background: Depression affects more than 350 million people globally. Traditional diagnostic methods have limitations. Analyzing textual data from social media provides new insights into predicting depression using machine learning. However, there is a lack of comprehensive reviews in this area, which necessitates further research. Objective: This review aims to assess the effectiveness of user-generated social media texts in predicting depression and to evaluate the influence of demographic, language, social media activity, and temporal features on predicting depression from social media texts through machine learning. Methods: We searched 11 databases (CINAHL [through EBSCOhost], PubMed, Scopus, Ovid MEDLINE, Embase, PubPsych, Cochrane Library, Web of Science, ProQuest, IEEE Xplore, and ACM Digital Library) for studies published from January 2008 to August 2023. We included studies that used social media texts and machine learning to predict depression and that reported area under the curve, Pearson r, and specificity and sensitivity (or data used for their calculation). Protocol papers and studies not written in English were excluded. We extracted study characteristics, population characteristics, outcome measures, and prediction factors from each study. A random effects model was used to estimate the pooled effect sizes with 95% CIs.
Study heterogeneity was evaluated using forest plots and P values in the Cochran Q test. Moderator analysis was performed to identify the sources of heterogeneity. Results: A total of 36 studies were included. We observed a significant overall correlation between social media texts and depression, with a large effect size (r=0.630, 95% CI 0.565-0.686). We also observed significant correlations with large effect sizes for demographic (largest effect size; r=0.642, 95% CI 0.489-0.757), social media activity (r=0.552, 95% CI 0.418-0.663), language (r=0.545, 95% CI 0.441-0.649), and temporal features (r=0.531, 95% CI 0.320-0.693). The social media platform type (public or private; P<.001), machine learning approach (shallow or deep; P=.048), and use of outcome measures (yes or no; P<.001) were significant moderators. Sensitivity analysis revealed no change in the results, indicating result stability. Neither the Begg-Mazumdar rank correlation (Kendall τb=0.22063; P=.058) nor the Egger test (2-tailed t(34)=1.28696; P=.207) indicated significant publication bias. Conclusions: Social media textual content can be a useful tool for predicting depression. Demographics, language, social media activity, and temporal features should be considered to maximize the accuracy of depression prediction models. Additionally, the effects of social media platform type, machine learning approach, and use of outcome measures in depression prediction models need attention. Analyzing social media texts for depression prediction is challenging, and findings may not apply to a broader population. Nevertheless, our findings offer valuable insights for future research. Trial Registration: PROSPERO CRD42023427707; https://www.crd.york.ac.uk/PROSPERO/view/CRD42023427707
%R 10.2196/59002
%U https://www.jmir.org/2025/1/e59002
%U https://doi.org/10.2196/59002
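
The abstract describes pooling Pearson r values under a random effects model and assessing heterogeneity with the Cochran Q test. As an illustration only, the following minimal sketch shows one common way such pooling can be done. It assumes a Fisher z transformation and the DerSimonian-Laird estimator of between-study variance, neither of which the record confirms as the authors' choice. The function name `pool_correlations` and the example correlations and sample sizes are hypothetical and are not data from the review.

```python
import math


def pool_correlations(rs, ns):
    """Pool Pearson r values with a random effects model.

    Assumptions (not taken from the cited review): Fisher z transform,
    DerSimonian-Laird estimate of the between-study variance tau^2,
    and a normal-approximation 95% CI.
    """
    zs = [math.atanh(r) for r in rs]      # Fisher z per study
    vs = [1.0 / (n - 3) for n in ns]      # within-study variance of z
    w = [1.0 / v for v in vs]             # fixed-effect weights

    # Fixed-effect mean and Cochran Q statistic
    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))
    df = len(rs) - 1

    # DerSimonian-Laird between-study variance, truncated at zero
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights, pooled z, and its standard error
    w_re = [1.0 / (v + tau2) for v in vs]
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    # Back-transform the estimate and CI bounds to the r scale
    lower = math.tanh(z_re - 1.96 * se)
    upper = math.tanh(z_re + 1.96 * se)
    return {"r": math.tanh(z_re), "ci": (lower, upper),
            "q": q, "df": df, "tau2": tau2}


if __name__ == "__main__":
    # Hypothetical study-level correlations and sample sizes
    result = pool_correlations([0.45, 0.60, 0.72], [120, 300, 85])
    print(result)
```

Because the interval is computed on the z scale and then back-transformed, the resulting 95% CI for r is generally not symmetric around the pooled estimate. A P value for Q would come from a chi-square distribution with `df` degrees of freedom, which this stdlib-only sketch leaves out.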