Abstract:
Web accessibility remains a major challenge for users with physical and motor disabilities, as
most websites rely on mouse- and keyboard-based interactions that are unsuitable for hands-free
navigation. Existing voice-based solutions mainly depend on speech-to-text systems, which
require large datasets, high computational resources, and language-specific transcription,
limiting their effectiveness for lightweight, real-time accessibility support. This research proposes
a prediction-based voice accessibility system that enables web navigation without full speech
transcription. A dataset of approximately 3000 short audio samples was collected from 30
participants, covering ten common accessibility commands such as scrolling, zooming, and
navigation control. Mel-Frequency Cepstral Coefficients (MFCCs) were extracted as compact audio
features, and multiple machine learning classifiers were evaluated. Model performance was
assessed using stratified train–test splits, cross-validation, precision, recall, F1-score, and
confusion matrices. A tuned XGBoost classifier achieved an overall accuracy of approximately 72%,
outperforming logistic regression, support vector machines, and random forests while
maintaining low latency suitable for real-time use. The model was deployed as a browser extension,
enabling language-independent, real-time voice-controlled web navigation and improving digital
accessibility for users with disabilities.
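
The evaluation protocol named in the abstract (a stratified train–test split, a confusion matrix, and per-class precision, recall, and F1-score) can be illustrated with a minimal plain-Python sketch. This is an illustration only and not the authors' code: the function names are hypothetical, the balanced layout of 10 commands with 300 samples each is an assumption made to match the stated total of roughly 3000 samples, and no MFCC extraction or classifier is included.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices so each class keeps the same share in train and test."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for y in sorted(by_class):
        idx = by_class[y][:]
        rng.shuffle(idx)
        # Hold out at least one sample per class.
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_prf(cm):
    """Return a (precision, recall, f1) tuple for each class of a square matrix."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[r][k] for r in range(n)) - tp  # predicted k, truly another class
        fn = sum(cm[k]) - tp                       # truly k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores.append((precision, recall, f1))
    return scores


# Assumed layout: 10 command classes with 300 samples each (3000 in total).
labels = [c for c in range(10) for _ in range(300)]
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=42)
```

With these assumed numbers, the split holds out 60 samples per command. In a real pipeline the held-out predictions would come from the trained classifier, and the per-class scores would show which commands are most often confused with one another.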