Abstract:
Agriculture is one of the most important economic sectors in any country. As the
world’s population grows, governments aim to increase crop production every year. In
this regard, a wide variety of factors that affect crop yield must be considered,
including soil, rainfall, light, water, and temperature. Soil is one of the most significant
factors for better production; simultaneously, the use of suitable soil fertilizer is a top
priority for improving agricultural productivity. Many factors influence soil fertility,
including climate, water, soil acidity, and soil nutrition. Traditional methods used by
farmers are not sufficient to determine the soil characteristics needed to predict soil fertility.
Agricultural crop productivity analytics is an emerging area of study in which the capabilities
of data mining are utilized. In this study, K-Nearest Neighbour (KNN), Artificial
Neural Networks (ANN), Logistic Regression, Naive Bayes, and Support Vector Machines
(SVM) were evaluated against specific metrics to identify the algorithm with the highest
classification accuracy for predicting soil fertility. For this purpose, 600 records
of data consisting of five selected attributes were analyzed. A portion of the data was
obtained from the Kaggle Machine Learning Repository, while the rest of the data
was acquired from the Agricultural Office, Batticaloa, Sri Lanka. Since this is a binary
classification problem, the target variable consists of two classes: fields that are suitable
and fields that are unsuitable in terms of fertility. Based on the results, ANN showed a higher
accuracy than the other four algorithms. The ANN was configured with one input layer,
one hidden layer, and one output layer. It achieved 95% accuracy in predicting soil
fertility, corresponding to an error rate of 5%. Accordingly, the final
prediction model was developed using ANN.
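
The evaluation procedure described in the abstract (train a classifier on labelled records, then report accuracy and error rate on held-out data) can be sketched as follows. This is a minimal illustration and not the authors' implementation: it covers only KNN, one of the five algorithms compared, and the data are synthetic. The record count (600) and the number of attributes (five) follow the abstract; the attribute values, the 80/20 split, k = 5, and all function names are assumptions made for this example.

```python
import math
import random
from collections import Counter


def make_synthetic_records(n, seed=0):
    """Generate n hypothetical records: five numeric attributes plus a 0/1 label.

    The values are synthetic stand-ins, not the study's soil data.
    """
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = suitable, 0 = unsuitable (assumed encoding)
        centre = 5.0 if label == 1 else 0.0
        features = [rng.gauss(centre, 1.0) for _ in range(5)]
        records.append((features, label))
    return records


def knn_predict(train, query, k=5):
    """Return the majority label among the k training records nearest to query."""
    nearest = sorted(train, key=lambda rec: math.dist(rec[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def evaluate(train, test, k=5):
    """Return (accuracy, error_rate) of KNN on the held-out test records."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    accuracy = correct / len(test)
    return accuracy, 1.0 - accuracy


records = make_synthetic_records(600, seed=42)
split = int(0.8 * len(records))  # assumed 80/20 hold-out split
train, test = records[:split], records[split:]

accuracy, error_rate = evaluate(train, test)
print(f"accuracy={accuracy:.3f}, error_rate={error_rate:.3f}")
```

Because the error rate is defined here as one minus the accuracy, a reported accuracy of 95% corresponds directly to an error rate of 5%. Comparing several algorithms would amount to repeating `evaluate` with a different prediction function on the same split.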