Optimalisasi Stemming Kata Berimbuhan Tidak Baku Pada Bahasa Indonesia Dengan Levenshtein Distance
Main Authors: | Setya Putra, Rahardyan Bisma, Utami, Ema; Universitas Amikom Yogyakarta, Raharjo, Suwanto; Institut Sains & Teknologi AKPRIND Yogjakarta |
---|---|
Format: | Article info application/pdf Journal |
Language: | eng |
Published: | Politeknik Harapan Bersama, 2018 |
Online Access: |
http://ejournal.poltektegal.ac.id/index.php/informatika/article/view/877 http://ejournal.poltektegal.ac.id/index.php/informatika/article/view/877/696 |
Contents:
- The Nazief & Andriani stemming algorithm has been developed further in terms of both speed and accuracy. One such development is the Non-formal Affix Algorithm, which improves accuracy on non-formal affixed words. Indonesian is used in two registers: formal and non-formal. Non-formal language is common in casual settings such as conversations and social media posts (Facebook, Twitter, Instagram, etc.). Stemming algorithms that can process non-formal affixed words have already been proposed to extract root words from such text. However, the previous algorithm cannot stem a non-formal word whose root form has been slightly altered. This study therefore modifies the Non-formal Affix Algorithm by adding Levenshtein Distance to increase stemming accuracy on non-formal words. The results show that the modified algorithm achieves 96.6% accuracy, compared with 73.3% for the Non-formal Affix Algorithm, when processing 60 non-formal affixed words. Based on these results, the Levenshtein Distance approach can increase the accuracy of stemming non-formal affixed words.
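The abstract does not say how Levenshtein Distance is wired into the Non-formal Affix Algorithm, so the following is only a minimal sketch of the general idea: after affixes are stripped, a candidate that is not in the root-word dictionary is matched to the nearest dictionary entry by edit distance. The function names `levenshtein` and `closest_root`, the `max_distance` threshold, and the three-word dictionary are all assumptions made for illustration and are not taken from the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance between a and b, counting single-character
    insertions, deletions, and substitutions."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca from a
                current[j - 1] + 1,      # insert cb into a
                previous[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        previous = current
    return previous[-1]


def closest_root(candidate, roots, max_distance=1):
    """Return the dictionary root nearest to candidate, or None when no root
    lies within max_distance edits. The threshold is an assumed parameter."""
    best = None
    best_distance = max_distance + 1
    for root in roots:
        distance = levenshtein(candidate, root)
        if distance < best_distance:
            best, best_distance = root, distance
    return best


# Hypothetical root-word dictionary, for illustration only.
ROOTS = ["lihat", "tahu", "pakai"]

if __name__ == "__main__":
    # "liat" is a common non-formal spelling of "lihat" (one inserted letter).
    print(closest_root("liat", ROOTS))
```

A small threshold keeps the lookup conservative: a candidate with no dictionary entry within `max_distance` edits is left unresolved rather than mapped to an unrelated root.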