Optimalisasi Stemming Kata Berimbuhan Tidak Baku Pada Bahasa Indonesia Dengan Levenshtein Distance

Main Authors: Setya Putra, Rahardyan Bisma; Utami, Ema (Universitas Amikom Yogyakarta); Raharjo, Suwanto (Institut Sains & Teknologi AKPRIND Yogjakarta)
Format: Article info application/pdf Journal
Language: eng
Published: Politeknik Harapan Bersama, 2018
Online Access: http://ejournal.poltektegal.ac.id/index.php/informatika/article/view/877
Table of Contents:
  • The Nazief & Adriani stemming algorithm has been developed further in terms of both speed and accuracy. One such development is the Non-formal Affix Algorithm, which improves accuracy for words with non-formal affixes. The Indonesian language is used in two registers: formal and non-formal. Non-formal language is common in casual situations such as conversations and social media posts (Facebook, Twitter, Instagram, etc.). To obtain the root word from a casual conversation or a social media post, a stemming algorithm that can process non-formal affixed words has already been proposed. However, that earlier algorithm is unable to stem a non-formal word whose root word has been slightly altered. Therefore, this study modifies the Non-formal Affix Algorithm to increase stemming accuracy on non-formal words. The modification adds Levenshtein Distance. The results show that the algorithm developed in this research achieves 96.6% accuracy, while the Non-formal Affix Algorithm achieves 73.3% accuracy, in processing 60 non-formal affixed words. Based on these results, the Levenshtein Distance approach can increase accuracy in stemming non-formal affixed words.
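
The abstract does not spell out how Levenshtein Distance is wired into the stemmer, so the following is only a minimal sketch of the general idea: once affixes have been stripped, a leftover stem that is not in the root-word dictionary is matched to the nearest dictionary entry by edit distance. The function names (`levenshtein`, `closest_root`), the `max_distance` threshold, and the tiny root list are all hypothetical and are not taken from the article. Affix stripping itself is not shown.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character insertions, deletions, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def closest_root(stem, roots, max_distance=1):
    """Return the dictionary root nearest to `stem`, or None if none is close enough.

    `max_distance` is an assumed cut-off, not a value from the article.
    Ties are broken alphabetically so the result is deterministic.
    """
    roots = list(roots)
    if not roots:
        return None
    best = min(roots, key=lambda r: (levenshtein(stem, r), r))
    return best if levenshtein(stem, best) <= max_distance else None


# Hypothetical mini-dictionary of formal root words.
ROOTS = ["teman", "makan", "pakai"]

# "temen" is the stem left after stripping non-formal affixes from a word
# such as "ditemenin"; the formal root differs by one letter.
print(closest_root("temen", ROOTS))
```

A small threshold keeps the lookup conservative: a stem with no near neighbour in the dictionary is left unresolved rather than forced onto an unrelated root.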