Theses
Permanent URI for this community: https://rda.sliit.lk/handle/123456789/2429
Postgraduate students are required to submit a thesis as part of fulfilling the requirements of their respective postgraduate degree programmes. This community features merit-based graduate theses submitted by SLIIT postgraduate students. Abstracts are available for public viewing, while the full texts can be accessed on-site within the library.
Theses and Dissertations of the Sri Lanka Institute of Information Technology (SLIIT) are licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
13 results
Search Results
Publication Open Access
A Machine Learning Approach to Identify the Key Factors Affecting Correct Stream Selection and To Predict Suitable Subject Streams for Advanced Level Students in Sri Lanka (Sri Lanka Institute of Information Technology, 2025-12) Abeywardhana, K.G.H.
Education plays a vital role in shaping the economic growth and sustainable development of a nation. It is not only a measure of a country’s intellectual wealth but also a determining factor in its future progress. In Sri Lanka, education is provided free of charge by the government from primary school through university, ensuring equal access for all students. Within this framework, the General Certificate of Education (Ordinary Level) – G.C.E. (O/L) and the General Certificate of Education (Advanced Level) – G.C.E. (A/L) examinations represent two critical milestones in the academic journey. The G.C.E. (A/L) examination, in particular, serves as the gateway to higher education and university admission, marking a pivotal stage in shaping students’ academic and professional futures. At the end of the O/L stage, students are required to select a subject stream such as Science, Arts, Commerce, or Technology to pursue during their A/L studies. This choice has a lasting impact, as it directly determines the student’s educational direction and career opportunities. However, many students make this crucial decision based on external influences, such as parental pressure, peer comparison, or limited guidance, rather than through a clear understanding of their academic strengths, personal interests, or long-term career aspirations. Consequently, this often leads to dissatisfaction, stream switching, or even discontinuation of studies. To address this issue, it is essential to adopt a data-driven approach that considers multiple factors, including students’ O/L examination performance, inborn talents, extracurricular activities, and preferred professional fields.
This research introduces a machine learning-based model, the Subject Stream Prediction System, designed to recommend the most suitable A/L subject stream for students. The proposed system not only predicts the optimal subject stream but also provides additional guidance by suggesting potential career paths, relevant educational qualifications, and technical skills aligned with the student’s profile. Four supervised machine learning algorithms, K-Nearest Neighbors (KNN), Decision Tree, Random Forest, and Support Vector Machine (SVM), were trained and evaluated to develop the predictive model, ensuring the highest possible accuracy and reliability.

Publication Open Access
Predicting Cryptocurrency Trade Count with Machine Learning Models (Sri Lanka Institute of Information Technology, 2025) Doluweera, C.Y.P
This study examines the prediction of cryptocurrency trade counts using machine learning. Minute-level market data were cleaned, merged, and enriched with time features to form a reliable dataset. Two ensemble regressors, Random Forest and Gradient Boosting, were implemented alongside baselines, with performance judged mainly by Mean Absolute Error and supported by RMSE. Model tuning and cross-validation were used to improve robustness, and results were visualized to compare errors, track actual versus predicted counts, and explain feature influence. Across the tested assets, Random Forest delivered the most consistent accuracy and generalizable results. Feature importance analysis showed trading volume in USD as the dominant driver of predictions, with additional value from simple temporal cues such as hour and day. Deep learning approaches were explored for their ability to capture non-linear and temporal patterns, but they required further stabilization to match the ensembles on this dataset. The work highlights both the promise and the limits of machine learning in a market that trades constantly and moves quickly.
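The two error metrics named in the abstract above, Mean Absolute Error and RMSE, can be illustrated with a minimal sketch. The trade counts below are made-up values, not data from the thesis, and the function names are hypothetical.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical per-minute trade counts (illustrative only, not thesis data).
# The fourth minute contains a spike that the prediction underestimates.
actual_counts = [120, 135, 128, 400, 131]
predicted_counts = [118, 130, 131, 250, 129]

mae = mean_absolute_error(actual_counts, predicted_counts)
rmse = root_mean_squared_error(actual_counts, predicted_counts)
print(f"MAE={mae:.2f} RMSE={rmse:.2f}")
```

Because RMSE squares each error before averaging, the single large miss in this toy series raises RMSE well above MAE, which is one reason to report the two metrics together.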
Models captured broad trends yet struggled with sharp spikes typical of high-volatility periods. The thesis proposes practical next steps, including periodic retraining, integration of sentiment and external signals, and the use of explainable methods to improve transparency. These contributions offer a clear framework for real-time trade count forecasting and for building adaptive tools to support decision making in digital asset markets.

Publication Open Access
Customer Risk Profiling System (Sri Lanka Institute of Information Technology, 2025-12) GUNASEKARA, G.A.R.
This thesis addresses the critical need for advanced, ethical risk quantification in the motor insurance sector, currently hampered by fragmented data and limited cross-company fraud visibility. The primary objective was to design and validate a Customer Risk Profiling System (CRPS) that integrates heterogeneous data sources and utilizes Machine Learning (ML) for dynamic risk scoring. The methodology involved aggregating data streams, including claims, premiums, policy history, and external PEP/AML compliance scores, and employing Gradient Boosted Trees (GBTs) to achieve a classification accuracy of 0.7561 and a ROC-AUC of 0.8982. Empirical findings confirmed the predictive power of behavioral features over linear demographic metrics, validating the choice of non-linear ensemble models. The CRPS successfully segments customers into Low, Medium, and High-Risk tiers, enabling targeted intervention. Crucially, the system embeds Explainable AI (XAI) using SHAP values and a continuous Feedback Loop to maintain accuracy against concept drift, ensuring auditability and ethical governance against potential bias. The study concludes by proposing the Insurance National Grid (NIG), a centralized platform designed to connect all insurers to the regulator.
The NIG would enforce data standardization and enable cross-company fraud detection, magnifying the CRPS's impact from a firm-specific tool to a national strategic asset, thereby promoting market efficiency, compliance, and sustained sector resilience.

Publication Open Access
The Role of Machine Learning Algorithms in Shaping Teenage Social Identity through Curated Digital Experiences. (Sri Lanka Institute of Information Technology, 2025-12) Sajipratha, R.
Social media has become one of the most powerful instruments of the modern digital environment, affecting how young individuals perceive each other and themselves. Identity formation is a significant developmental stage for teenagers, and it is increasingly shaped by the individualized algorithms that determine what they view, like, and interact with online. The paper examines the influence of Machine Learning-based recommendation systems on the social identity of teenagers within the framework of algorithmic curation, and the connection between algorithmic exposure, diversity of content, and identity pressure. By studying the psychological implications of algorithmic personalization, the study offers deeper insight into how Artificial Intelligence affects social comparison, body image, and self-perception among adolescents. The research used a quantitative design to analyze data gathered from 150 students aged 18 to 19 at an international school in Kandy, Sri Lanka. It employed an indexed questionnaire that measured the following: Algorithmic Exposure Index (AEI), Stereotypical Content Reinforcement (SCR), Number of Topics (NTOP), and Body/Identity Pressure (BIP). The data were analyzed using descriptive, correlation, and multinomial logistic regression techniques to identify how these variables interact and predict emotional and identity-related outcomes.
Findings showed that more than 60 percent of the subjects, especially females, reported experiencing considerable social comparison and body pressure following exposure to the algorithms. TikTok and Instagram users had much higher odds of being subjected to appearance and behavioral pressure than YouTube users, confirming that appearance-focused and engagement-oriented platforms heighten conformity and self-assessment. Moreover, exposure to stereotypical material (high SCR) was strongly related to negative self-perception, while greater topic diversity (high NTOP) was a protective factor, decreasing identity stress and resulting in a more balanced sense of self. The results support both Social Identity Theory and Algorithmic Bias Theory, showing that Machine Learning systems not only suggest content but also contribute to the formation of users' identity by reinforcing specific social norms and values. Young people in algorithmic echo chambers are less exposed to different or anti-stereotypical stories, which results in narrower ideas of attractiveness, popularity, and success. This paper therefore emphasizes the importance of algorithmic responsibility, ethical design, and media literacy interventions. The research addresses these issues by proposing the Responsible Curation Framework, a multi-part intervention encompassing algorithmic diversity prompts, user-controlled content filters, and digital literacy education. Collectively, these measures are intended to restore balanced exposure, build self-awareness, and encourage psychological well-being in young users.
On the whole, this analysis forms part of the expanding discourse on ethical AI and digital well-being and can serve as a starting point for changing how algorithmic recommendation systems are managed, so that they become instruments of diversity, empowerment, and positive identity formation instead of instruments of conformity.

Publication Open Access
Publisher-Centric Machine Learning-Based Solution for Click Fraud (SLIIT, 2024-12) Pathirage, G.S
Invalid traffic and click fraud present significant challenges in online advertising, impacting advertising metrics and causing substantial financial losses across the digital advertising ecosystem. While advertisers have access to various protective solutions and receive protection from advertising networks, publishers face limited options for detecting and preventing fraudulent activities on their websites. This gap in publisher-side protection creates a critical area for investigation and development of practical solutions. This research presents an effective publisher-side solution: the Ad Click Fraud Protector (ACFP), an open-source WordPress plugin that detects and prevents click fraud and invalid traffic. The research methodology involved studying browser fingerprinting approaches by collecting browser fingerprints from legitimate users and bots, distinguished through firewall rules and honeypots. Experimental analysis identified six key browser fingerprinting attributes that effectively distinguish between legitimate and fraudulent traffic. These findings informed the development of the ACFP plugin, which incorporates additional security measures for enhanced protection. Testing of the plugin on two AdSense publisher accounts demonstrated its effectiveness in reducing invalid clicks, minimizing invalid traffic, and decreasing revenue deductions due to invalid clicks.
The results show that publishers can effectively protect their ad accounts from penalties and deductions through browser fingerprint-based traffic filtering. This research provides publishers with an accessible, open-source solution for combating click fraud while contributing to the theoretical understanding of browser fingerprinting effectiveness in fraud detection. Additionally, it establishes a framework for future development in publisher-side protection systems.

Publication Embargo
Leveraging Word Embedding for Automated Candidate Ranking in Talent Acquisition Processes (SLIIT, 2024-12) Rasanayagam, J
Ranking the applicants for a given position in a company is mostly done manually. To ease this process, this system creates a ranking by scoring each applicant with a word embedding model trained on past datasets. Job advertisements related to information technology fields or to certain positions are collected and used to train a model through the word embedding process. The system compares the applicant's resume with this model, allocates a specific score to each applicant, and orders the applicants in ascending order. Data crawling and scraping, text preprocessing, and model training are the main components of this research. The goal of this research is to collect data on job openings in the information technology industry, collect job seekers' information through web scraping and crawling, and train a model to rank the applicants. The crawled data is used to prepare the corpus. Python Scrapy is used to write the crawler script for this crawling mechanism. The crawled data then undergoes preprocessing. Finally, the preprocessed corpus undergoes word embedding. Word2Vec and Gensim are used here to train a model.
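The scoring idea described above can be sketched as follows. This is not the thesis's implementation: the small hand-written vectors stand in for a trained Word2Vec model, averaging word vectors and comparing them by cosine similarity are assumed choices, and all names are hypothetical. The sketch lists the highest score first, which is an assumption about how the ranking is presented.

```python
import math

# Hypothetical 3-dimensional word vectors standing in for a trained
# embedding model; a real model would be learned from the crawled corpus.
TOY_VECTORS = {
    "python": [0.9, 0.1, 0.0],
    "java": [0.8, 0.2, 0.1],
    "sql": [0.6, 0.4, 0.0],
    "cooking": [0.0, 0.1, 0.9],
    "sales": [0.1, 0.8, 0.3],
}


def average_vector(tokens):
    """Average the vectors of known tokens; unknown tokens are skipped."""
    known = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not known:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(known) for dim in zip(*known)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_resumes(job_tokens, resumes):
    """Score each resume against the job text and sort, best match first."""
    job_vec = average_vector(job_tokens)
    scored = [
        (name, cosine_similarity(job_vec, average_vector(tokens)))
        for name, tokens in resumes.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Illustrative, already-tokenized inputs (not data from the thesis).
job_ad = ["python", "sql"]
resumes = {
    "applicant_a": ["python", "java"],
    "applicant_b": ["cooking", "sales"],
}
ranked = rank_resumes(job_ad, resumes)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

In this toy example the applicant whose resume shares vocabulary with the job advertisement receives the higher score.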
This model is used to compare the resumes of the applicants and obtain a value for each; the system then outputs a total score for each resume and finally ranks the applicants in ascending order based on their scores.

Publication Open Access
Security Threat Detection In Telecommunication Network In Compromised IoT Devices By Using Trustworthy Machine Learning (SLIIT, 2022-10) Aperame, V.
Currently, Information Communication Technology (ICT) plays a significant role in the world. Within IT, cyber security holds a major position. The Internet of Things (IoT) refers to the vast number of physical objects connected to the internet that gather and exchange information with other devices and systems over the internet. In this work, a Machine Learning technique is used to detect security threats over the telecommunication network in compromised IoT devices. The Driver Anomaly Detection (DAD) Dataset is used for anomaly detection in IoT networks. Message Queue Telemetry Transport protocol (MQTT) is a messaging protocol based on the Transmission Control Protocol (TCP) and used to establish communication between multiple devices. It is necessary to identify and distinguish the threats present in a telecommunication network. This thesis gives an understanding of the detection of different security threats in telecommunication networks using Machine Learning techniques and explains the security constraints and issues involved. Implementing a Security Threat Detection System in an institution helps provide analytical output concerning imminent threats. It also helps protect the reputation of an organization by building trust among its workers. These are the benefits an institution obtains from having a Threat Detection System. Although Threat Detection Systems already exist on the market, they are lacking in some respects, such as real-time capability.
To resolve these problems, this research produced a cost-effective, easy-to-use, comprehensive Threat Detection System for telecommunication networks with compromised IoT devices, using trustworthy machine learning.

Publication Open Access
Subject Stream Prediction: A Machine learning Approach to Select the Suitable Subject Stream for Senior Secondary Students in Sri Lanka (2022-09) Abeywardhane, K.G.Kaushalya
Education is an important factor that measures a nation's wealth and directly affects the country's future development. The Sri Lankan government provides free education to students at all levels up to university. The General Certificate of Education (Ordinary Level) – G.C.E.(O/L) and the General Certificate of Education (Advanced Level) – G.C.E.(A/L) are essential exams in senior secondary education. The G.C.E.(A/L) is the examination that qualifies students to enter a university for higher education. Under the Sri Lankan education scheme, students must select one subject stream, and subjects relevant to that stream, to continue into senior secondary education key stage 2. That selection affects students’ whole lives because they sit the G.C.E.(A/L) in that subject stream. Most students make this decision under pressure from someone else or by comparing themselves with others. I think this may lead students to drop out of senior secondary education key stage 2 or to change subject stream partway through. Such outcomes keep students away from their target careers. From my point of view, when selecting a subject stream for their senior secondary education, students should consider their O/L results, their inborn talents and skills, and the field in which they hope to work. I have developed a machine learning model to suggest the best subject stream based on the above features.
The implemented model, called the SubjectStreamPredict system, predicts the best subject stream for students. The model also suggests ten further suitable options, including an appropriate career path, according to the user’s input values. To implement the model, I trained and tested four machine learning algorithms on the same data set: K-Nearest Neighbors, Decision Tree, Random Forest, and Support Vector Machine. The Random Forest algorithm outperformed the other algorithms, with an accuracy of 0.70. Based on the analysis results, I implemented my model using the Random Forest Classifier algorithm and improved the output generated by Random Forest by predicting more than one feature.

Publication Open Access
Information extraction for business process enhancement using Natural Language Processing and Machine Learning (2022-10) Wicrama Arachchi, W.A.D.A
In the digital era, data has become more important than ever. Workplace activities in particular rely increasingly on text-based information. Over the past years, research on data has become a rapidly growing area with continuously innovative techniques. As a result, Business Intelligence, Big Data Analysis, NoSQL analysis, and other data science tools now process huge amounts of data to reveal patterns and trends in business fields. However, unstructured data such as text-based data, PDF documents, videos, and images has not been properly exploited to provide more insightful information. Text-based data within a company can be an extremely rich source of information. Therefore, it is very important to extract insights from this unstructured data. Extracting information from unstructured documents like product catalogs can be a difficult task due to their unorganized nature. Currently, there is no system in real-world use for easily gaining insight from manuals or product catalogs.
As part of the job activities of electrical engineers, referring to product manuals and catalogs is a recurrent task. They have to spend considerable time on this task. This directly impacts the efficiency and productivity of an employee. Especially in the electrical industry, engineers have to go through a lot of product catalogs to find more information on a single item. Over the past ten years, Natural Language Processing and Machine Learning have had a major impact on business processes. NLP and ML are becoming the leading enterprise-level technologies, making it possible to perform business tasks that were previously out of reach. Many technologies have been introduced to fill the gaps and meet the requirements. However, NLP and ML are becoming more popular than the other technologies in the industry. In this research, I propose a concept for gaining more insightful information from unstructured data such as product catalogs. The aim of the research is to develop a digital assistant, using NLP and ML, to which electrical engineers can submit their queries and easily get information about their products. This will increase the efficiency and productivity of the electrical engineer, as it will provide a way to avoid time-consuming activities such as reading product catalogs.

Publication Open Access
IoT Based Smart Waste Segregation Using Machine Learning for Home Environment (2022-11) Rijah, U.L.M
