Comparison of LDA, NMF and BERTopic Topic Modeling Techniques on Amazon Product Review Dataset: A Case Study

  • Conference paper
Computing, Internet of Things and Data Analytics (ICCIDA 2023)

Abstract

As technology advances, the e-commerce market continues to grow. As of 2022, an estimated 19.7% of worldwide sales are made over the internet. However, online sales have drawbacks that distinguish them from in-person sales. Communication between seller and customer is more difficult on online platforms, and problems such as product quality or shipping are frequently reported in product reviews. Sellers must therefore continuously monitor customer feedback and take the necessary action. Topic modeling algorithms make it possible to group user complaints so that they can be read by topic. This study compares the LDA (Latent Dirichlet Allocation), NMF (Non-Negative Matrix Factorization) and BERTopic algorithms on an Amazon product review dataset. According to the results, all three algorithms are successful and useful. BERTopic produced more meaningful results than the other two algorithms according to the coherence metric.
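The metric used to rank the three algorithms (called a consistency calculation in the original wording, usually known as topic coherence) can be illustrated with a small sketch. The abstract does not say which coherence measure was used, so the example below assumes a UMass-style score based on how often a topic's top words occur in the same review. The function name `umass_coherence` and the toy reviews are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def umass_coherence(topic_words, documents):
    """UMass-style coherence for one topic (illustrative assumption).

    For every ordered pair of top words (earlier, later) it adds
    log((D(earlier, later) + 1) / D(earlier)), where D counts the
    documents that contain all of the given words.
    """
    # Represent each document as a set of lowercase whitespace tokens.
    doc_sets = [set(doc.lower().split()) for doc in documents]

    def doc_freq(*words):
        return sum(1 for tokens in doc_sets if all(w in tokens for w in words))

    score = 0.0
    for earlier, later in combinations(topic_words, 2):
        df = doc_freq(earlier)
        if df == 0:
            continue  # a word absent from the corpus contributes nothing
        score += math.log((doc_freq(earlier, later) + 1) / df)
    return score


# Hypothetical toy reviews, not data from the study.
reviews = [
    "battery died after one week",
    "battery life is poor",
    "package arrived late and damaged",
    "late delivery and damaged box",
    "poor battery and late delivery",
]

# Words that appear together in reviews score higher (closer to zero)
# than words that never share a review.
print(umass_coherence(["battery", "poor"], reviews))
print(umass_coherence(["battery", "damaged"], reviews))
```

Scores closer to zero indicate that a topic's top words tend to appear in the same reviews. In a comparison such as the one described above, a score of this kind would be computed for the topics produced by each algorithm and then averaged per algorithm.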

Acknowledgment

This study is supported by Artiwise Software Technologies.

Author information

Corresponding author

Correspondence to Salih Can Turan.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Turan, S.C., Yildiz, K., Büyüktanir, B. (2024). Comparison of LDA, NMF and BERTopic Topic Modeling Techniques on Amazon Product Review Dataset: A Case Study. In: García Márquez, F.P., Jamil, A., Ramirez, I.S., Eken, S., Hameed, A.A. (eds) Computing, Internet of Things and Data Analytics. ICCIDA 2023. Studies in Computational Intelligence, vol 1145. Springer, Cham. https://doi.org/10.1007/978-3-031-53717-2_3

  • DOI: https://doi.org/10.1007/978-3-031-53717-2_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-53716-5

  • Online ISBN: 978-3-031-53717-2

  • eBook Packages: Engineering, Engineering (R0)
