An automated approach for selecting the most suitable AWS EC2 instance for software projects prior to deployment

Kumarage, K.S.D.; Lakshan, W.D.D.; Hewaratna, A.I.

ComURS2026 Computing Undergraduate Research Symposium : Abstracts

dc.contributor.author	Kumarage, K.S.D.
dc.contributor.author	Lakshan, W.D.D.
dc.contributor.author	Hewaratna, A.I.
dc.date.accessioned	2026-05-15T09:28:12Z
dc.date.available	2026-05-15T09:28:12Z
dc.date.issued	2026-01-28
dc.identifier.isbn	978-624-5727-44-5
dc.identifier.uri	http://repo.lib.sab.ac.lk:8080/xmlui/handle/susl/5288
dc.description.abstract	Cloud computing is now a staple of modern software development, providing scalable, on-demand resources. Amazon Web Services (AWS), the leading cloud services provider, offers a wide variety of Elastic Compute Cloud (EC2) instance types, which makes choosing an adequate instance before deployment a difficult task. A poor choice can lead to reduced performance, unnecessary costs, and slow deployments. Existing tools such as AWS Compute Optimizer and Instance Type Finder rely on CloudWatch telemetry, so an application must already be deployed before they can make a recommendation, which adds cost and delay. This study proposes a pre-deployment EC2 instance recommendation framework that is independent of cloud-generated telemetry. The proposed system analyzes local workload behavior using a hybrid prediction approach that combines machine learning with rule-based reasoning. System-level and application-level profiling tools are used to collect performance metrics from representative workloads, including CPU-intensive, memory-intensive, I/O-intensive, and mixed workloads. The collected metrics, such as CPU usage, memory consumption, disk throughput, and network activity, are preprocessed and transformed into structured feature vectors. In parallel, an EC2 instance specification dataset is constructed from official AWS documentation. A supervised XGBoost classifier is then applied to map workloads to the most suitable EC2 instance family, with initial labels generated through rule-based feature matching. After the instance family is identified, a secondary rule-based decision layer selects the specific instance type based on vCPU requirements, memory demand, network performance, and EBS usage patterns. To improve transparency and user understanding, a Retrieval-Augmented Generation (RAG) module retrieves relevant AWS documentation to support each recommendation.	en_US
dc.language.iso	en	en_US
dc.publisher	Faculty of Computing. Sabaragamuwa University of Sri Lanka.	en_US
dc.subject	Cloud computing	en_US
dc.subject	AWS EC2	en_US
dc.subject	Instance selection	en_US
dc.subject	Workload profiling	en_US
dc.subject	Automation	en_US
dc.title	An automated approach for selecting the most suitable AWS EC2 instance for software projects prior to deployment	en_US
dc.type	Article	en_US
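
The two rule-based stages described in the abstract (initial family labelling and the secondary instance-type selection) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the field names, the thresholds, and the function names `label_family` and `select_type` are all hypothetical, the catalog is a small excerpt of published vCPU and memory figures, and the XGBoost classifier, the I/O, network, and EBS rules, and the RAG module are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WorkloadProfile:
    """Summary of locally measured metrics (field names are hypothetical)."""
    avg_cpu_percent: float
    peak_memory_gib: float
    required_vcpus: int  # assumed to be at least 1


@dataclass(frozen=True)
class InstanceType:
    name: str
    family: str
    vcpus: int
    memory_gib: float


# Small excerpt of an instance specification dataset.
CATALOG = [
    InstanceType("c5.large", "c5", 2, 4.0),
    InstanceType("c5.xlarge", "c5", 4, 8.0),
    InstanceType("m5.large", "m5", 2, 8.0),
    InstanceType("m5.xlarge", "m5", 4, 16.0),
    InstanceType("r5.large", "r5", 2, 16.0),
    InstanceType("r5.xlarge", "r5", 4, 32.0),
]


def label_family(profile: WorkloadProfile) -> str:
    """Rule-based feature matching that yields an initial family label.

    The thresholds are assumptions chosen for illustration; in the described
    system such labels would serve as training targets for the classifier.
    """
    gib_per_vcpu = profile.peak_memory_gib / profile.required_vcpus
    if gib_per_vcpu > 4:
        return "r5"  # memory-heavy relative to compute
    if profile.avg_cpu_percent >= 70 and gib_per_vcpu <= 2:
        return "c5"  # compute-heavy with modest memory
    return "m5"      # balanced default


def select_type(family: str, profile: WorkloadProfile) -> Optional[InstanceType]:
    """Pick the smallest instance in the family that meets vCPU and memory needs."""
    candidates = [
        inst for inst in CATALOG
        if inst.family == family
        and inst.vcpus >= profile.required_vcpus
        and inst.memory_gib >= profile.peak_memory_gib
    ]
    # Smallest by vCPU count, then by memory; None when nothing fits.
    return min(candidates, key=lambda inst: (inst.vcpus, inst.memory_gib), default=None)


def recommend(profile: WorkloadProfile) -> Optional[str]:
    """Chain both stages and return an instance type name, if any fits."""
    chosen = select_type(label_family(profile), profile)
    return chosen.name if chosen else None
```

In this sketch the rule-derived label stands in for the classifier's prediction; a trained model would replace the call to `label_family` inside `recommend` while the selection stage stays unchanged.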