Abstract |
Over the past decade, eye tracking has become a crucial
approach to object selection in digital assistive technology as well
as touchless digital signage. Accurate object selection depends on
the performance of eye movement classification. Many deep learning
techniques have been proposed for this task.
Despite these numerous models, previous approaches have
yet to achieve high classification accuracy, particularly when
dealing with smooth pursuit eye movements. To bridge this
scientific gap and improve the effectiveness of eye movement
classification, we propose a hybrid CNN-Transformer model. We
also incorporated Hyperband hyperparameter tuning to obtain
the best hyperparameter values for the model. We evaluated our
approach on the GazeCom dataset. This dataset was enhanced
with customized annotations designed to accommodate different
types of eye movements. Our method yielded F1 scores of 0.9572,
0.9273, and 0.8358 for fixation, saccade, and smooth pursuit eye
movements, respectively. The proposed method outperformed the
state-of-the-art Temporal Convolutional Network (TCN) by a margin
of 1% to 12.36% in F1 score. A significant
improvement was observed in the classification of smooth pursuit
eye movements. The experimental results suggest that the proposed
method can serve as a guide for implementing Transformer
models for eye movement classification. |