Abstract
Word vectors, numerical representations of text data, are an important component of machine learning. One family of methods for converting text into numerical form is word embeddings, and the embedding algorithms researchers most often use are Continuous Bag of Words (CBOW), Skip-Gram, and FastText. This paper discusses the transformation of textual data from documents in the Islamic knowledge domain into numerical form using these three algorithms, and then evaluates the resulting word vectors with intrinsic and extrinsic evaluation techniques. For the intrinsic evaluation, we select a set of probe words and check whether synonyms, antonyms, related words, and derived words appear among each probe word's nearest neighbours in vector space; we also use the word vectors to solve word analogy problems. In the extrinsic evaluation, the best word vectors are those produced by the CBOW algorithm combined with Binary Relevance and a Multilayer Perceptron, achieving an accuracy of 77.56% and a Hamming loss of 8.14%.
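To make the described pipeline concrete, the sketch below shows one way it could be implemented. This is not the authors' code: it assumes gensim for the CBOW, Skip-Gram, and FastText models and scikit-multilearn with scikit-learn for the Binary Relevance + Multilayer Perceptron classifier, and it substitutes a toy tokenized corpus and placeholder multi-label targets for the Islamic-domain documents and their actual categories.

```python
# Minimal sketch of the embedding + evaluation pipeline described in the
# abstract. Libraries and the toy data are assumptions, not the paper's setup.
import numpy as np
from gensim.models import Word2Vec, FastText
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, hamming_loss
from skmultilearn.problem_transform import BinaryRelevance

# Toy tokenized corpus standing in for the Islamic knowledge domain documents.
corpus = [["zakat", "is", "obligatory", "charity"],
          ["prayer", "salat", "is", "performed", "daily"],
          ["fasting", "sawm", "during", "ramadan"]] * 20

# Train the three embedding models; in gensim, sg=0 gives CBOW, sg=1 Skip-Gram.
cbow = Word2Vec(corpus, vector_size=100, window=5, min_count=1, sg=0)
skipgram = Word2Vec(corpus, vector_size=100, window=5, min_count=1, sg=1)
fasttext = FastText(corpus, vector_size=100, window=5, min_count=1)

# Intrinsic evaluation: inspect a probe word's nearest neighbours for
# synonyms, antonyms, related words, and derived words ...
print(cbow.wv.most_similar("zakat", topn=5))
# ... and solve word-analogy problems of the form a : b :: c : ?
print(cbow.wv.most_similar(positive=["prayer", "ramadan"],
                           negative=["fasting"], topn=1))

# Extrinsic evaluation: average word vectors into document vectors, then
# train Binary Relevance with an MLP base classifier on multi-label data.
def doc_vector(model, tokens):
    vecs = [model.wv[t] for t in tokens if t in model.wv]
    return np.mean(vecs, axis=0)

X = np.vstack([doc_vector(cbow, doc) for doc in corpus])
y = np.tile([[1, 0], [0, 1], [1, 1]], (20, 1))  # placeholder label matrix

clf = BinaryRelevance(classifier=MLPClassifier(max_iter=500),
                      require_dense=[True, True])
clf.fit(X, y)
pred = clf.predict(X)
print("accuracy:", accuracy_score(y, pred),
      "hamming loss:", hamming_loss(y, pred))
```

On real data, the same document-vector and classification steps would be repeated for the Skip-Gram and FastText models so that the three algorithms can be compared on accuracy and Hamming loss, as the abstract reports for CBOW.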