Document Similarity Detection Using Cosine Similarity Based on Count Vectorizer and TF-IDF Text Representations
DOI: http://dx.doi.org/10.21927/ijubi.v7i2.5170
Copyright (c) 2025 Indonesian Journal of Business Intelligence (IJUBI)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Indonesian Journal of Business Intelligence (IJUBI)
Department of Information System
Alma Ata University
Email: ijubi@almaata.ac.id