Named entity recognition and dependency parsing for better concept extraction in summary obfuscation detection
Authors
Dr. Umar Taufiq, S.Kom., M.Cs. (1); Prof. Dr.-Ing. Mhd. Reza M. I. Pulungan, S.Si., M.Sc. (2); Dr. Yohanes Suyanto, M.I.Kom. (3)
Date
2023
Abstract
Summary obfuscation is a type of idea plagiarism in which a summary of one text document is inserted into another, making it more difficult to detect with ordinary plagiarism detection methods. Various methods have been developed to address this problem, one of which is based on genetic algorithms. This paper proposes a new approach to summary obfuscation detection based on named entity recognition and dependency parsing, which is more straightforward and easier to analyze than genetic algorithm-based methods while remaining accurate. The proposed method detects summary obfuscation at the document level more accurately than existing genetic algorithm-based methods. At the sentence level, it reaches an accuracy of more than 84% for specific benchmark and threshold cases. We have also tested the proposed method on other types of plagiarism, where it likewise achieves high accuracy.
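The abstract does not specify how extracted concepts are compared, so the following is only an illustrative sketch, not the authors' method. It assumes that each sentence has already been reduced to a set of concept strings (for example, named entities and the heads of dependency relations produced by an NLP toolkit), and that two sentences are matched when the Jaccard overlap of their concept sets reaches a threshold. The function names, the Jaccard measure, and the threshold value of 0.5 are all hypothetical choices made for illustration.

```python
from typing import List, Set, Tuple


def concept_overlap(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap between two concept sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def detect_matches(
    suspicious: List[Set[str]],
    source: List[Set[str]],
    threshold: float = 0.5,  # hypothetical value, not taken from the paper
) -> List[Tuple[int, int, float]]:
    """Return (suspicious_index, source_index, score) for every sentence
    pair whose concept overlap is at or above the threshold."""
    matches = []
    for i, s_concepts in enumerate(suspicious):
        for j, t_concepts in enumerate(source):
            score = concept_overlap(s_concepts, t_concepts)
            if score >= threshold:
                matches.append((i, j, score))
    return matches


# Hand-written concept sets standing in for NER / dependency-parse output.
suspicious_doc = [
    {"einstein", "relativity", "publish"},
    {"weather", "rain"},
]
source_doc = [
    {"einstein", "relativity", "publish", "1905"},
    {"newton", "gravity"},
]

sentence_matches = detect_matches(suspicious_doc, source_doc)
# Document-level decision in this sketch: flag if any sentence pair matched.
document_flagged = bool(sentence_matches)
```

In this sketch the sentence-level result is the list of matched pairs and the document-level result is derived from it, mirroring the two levels of evaluation mentioned in the abstract. A real pipeline would replace the hand-written sets with the output of an actual named entity recognizer and dependency parser.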