MSc in Information Technology
Permanent URI for this collection: https://rda.sliit.lk/handle/123456789/2484
Students enrolled in the MSc in Information Technology programme are required to submit a thesis as a compulsory component of their degree requirements. This collection features merit-based theses submitted by postgraduate students specialising in Information Technology. Abstracts are available for public viewing, while the full texts can be accessed on-site within the library.
Theses and Dissertations of the Sri Lanka Institute of Information Technology (SLIIT) are licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
8 results
Search Results
Publication Open Access
A Machine Learning Approach to Identify the Key Factors Affecting Correct Stream Selection and To Predict Suitable Subject Streams for Advanced Level Students in Sri Lanka
(Sri Lanka Institute of Information Technology, 2025-12) Abeywardhana, K.G.H.
Education plays a vital role in shaping the economic growth and sustainable development of a nation. It is not only a measure of a country’s intellectual wealth but also a determining factor in its future progress. In Sri Lanka, education is provided free of charge by the government from primary school through university, ensuring equal access for all students. Within this framework, the General Certificate of Education (Ordinary Level) – G.C.E. (O/L) and the General Certificate of Education (Advanced Level) – G.C.E. (A/L) examinations represent two critical milestones in the academic journey. The G.C.E. (A/L) examination, in particular, serves as the gateway to higher education and university admission, marking a pivotal stage in shaping students’ academic and professional futures. At the end of the O/L stage, students are required to select a subject stream such as Science, Arts, Commerce, or Technology to pursue during their A/L studies. This choice has a lasting impact, as it directly determines the student’s educational direction and career opportunities. However, many students make this crucial decision based on external influences, such as parental pressure, peer comparison, or limited guidance, rather than through a clear understanding of their academic strengths, personal interests, or long-term career aspirations. Consequently, this often leads to dissatisfaction, stream switching, or even discontinuation of studies. To address this issue, it is essential to adopt a data-driven approach that considers multiple factors, including students’ O/L examination performance, inborn talents, extracurricular activities, and preferred professional fields.
This research introduces a machine learning-based model, the Subject Stream Prediction System, designed to recommend the most suitable A/L subject stream for students. The proposed system not only predicts the optimal subject stream but also provides additional guidance by suggesting potential career paths, relevant educational qualifications, and technical skills aligned with the student’s profile. Four supervised machine learning algorithms, K-Nearest Neighbors (KNN), Decision Tree, Random Forest, and Support Vector Machine (SVM), were trained and evaluated to develop the predictive model with the highest possible accuracy and reliability.

Publication Open Access
Publisher-Centric Machine Learning-Based Solution for Click Fraud
(SLIIT, 2024-12) Pathirage, G.S
Invalid traffic and click fraud present significant challenges in online advertising, impacting advertising metrics and causing substantial financial losses across the digital advertising ecosystem. While advertisers have access to various protective solutions and receive protection from advertising networks, publishers face limited options for detecting and preventing fraudulent activities on their websites. This gap in publisher-side protection creates a critical area for investigation and development of practical solutions. This research presents an effective publisher-side solution: the Ad Click Fraud Protector (ACFP), an open-source WordPress plugin that detects and prevents click fraud and invalid traffic. The research methodology involved studying browser fingerprinting approaches by collecting browser fingerprints from legitimate users and bots, distinguished through firewall rules and honeypots. Experimental analysis identified six key browser fingerprinting attributes that effectively distinguish between legitimate and fraudulent traffic. These findings informed the development of the ACFP plugin, which incorporates additional security measures for enhanced protection.
Testing of the plugin on two AdSense publisher accounts demonstrated its effectiveness in reducing invalid clicks, minimizing invalid traffic, and decreasing revenue deductions due to invalid clicks. The results show that publishers can effectively protect their ad accounts from penalties and deductions through browser fingerprint-based traffic filtering. This research provides publishers with an accessible, open-source solution for combating click fraud while contributing to the theoretical understanding of browser fingerprinting effectiveness in fraud detection. Additionally, it establishes a framework for future development in publisher-side protection systems.

Publication Embargo
Leveraging Word Embedding for Automated Candidate Ranking in Talent Acquisition Processes
(SLIIT, 2024-12) Rasanayagam, J
Ranking the applicants for a position in a company is mostly done manually. To ease this process, the proposed system scores each applicant using a word embedding model trained on past datasets. Job advertisements related to information technology fields or to specific positions are collected and used to train the model through word embedding. The system compares each applicant's resume with this model, assigns the applicant a score, and orders the applicants in ascending order of score. Data crawling and scraping, text preprocessing, and model training are the main components of this research. The goal is to collect data on job openings in the information technology industry, collect job seekers' information through web scraping and crawling, and train a model to rank the applicants. The crawled data is used to prepare the corpus. A crawler script written with Python Scrapy performs the crawling. The crawled data is then preprocessed. Finally, the preprocessed corpus is used for word embedding.
Word2Vec and Gensim are among the tools used to train the model. Each applicant's resume is compared against the model to obtain values that are combined into a total score for the resume, and the system then ranks the applicants in ascending order of their scores.

Publication Open Access
Subject Stream Prediction: A Machine learning Approach to Select the Suitable Subject Stream for Senior Secondary Students in Sri Lanka
(2022-09) Abeywardhane, K.G.Kaushalya
Education is an important factor that measures a nation's wealth and directly affects the country's future development. In Sri Lanka, the government provides free education to students at all levels up to university. The General Certificate of Education (Ordinary Level) – G.C.E.(O/L) and the General Certificate of Education (Advanced Level) – G.C.E.(A/L) are the essential examinations of senior secondary education. The G.C.E.(A/L) is the examination that qualifies a student to enter a university for higher education. Under the Sri Lankan education scheme, students must select one subject stream, and the subjects relevant to that stream, to continue senior secondary education key stage 2. That selection affects students' whole lives because they sit the G.C.E.(A/L) in that subject stream. Most students make this decision under pressure from someone else or by comparison with their peers. I think this may cause students to drop out of senior secondary education key stage 2 partway through or to change subject stream partway through. Such outcomes keep students away from their target careers. From my point of view, when selecting a subject stream for senior secondary education, students should consider their O/L results, their inborn talents and skills, and the field in which they hope to work.
I have developed a machine learning model to suggest the best subject stream based on the above features. The implemented model, called the SubjectStreamPredict system, predicts the best subject stream for a student. It also suggests ten further suitable options, including an appropriate career path, according to the user’s input values. To implement the model, I trained and tested four machine learning algorithms on the same data set: K-Nearest Neighbors, Decision Tree, Random Forest, and Support Vector Machine. The Random Forest algorithm outperformed the other algorithms, with an accuracy of 0.70. Based on these results, I implemented my model using the Random Forest classifier and improved its output by predicting more than one feature.

Publication Open Access
Information extraction for business process enhancement using Natural Language Processing and Machine Learning
(2022-10) Wicrama Arachchi, W.A.D.A
In the digital era, data has become more important than ever, and workplace activities increasingly rely on text-based information. Over the past years, research on data has become a rapidly growing area with continuously innovative techniques. As a result, Business Intelligence, Big Data analysis, NoSQL analysis, and other data science tools now process huge amounts of data to reveal patterns and trends in business fields. However, unstructured data such as text-based data, PDF documents, videos, and images has not been captured properly to provide more insightful information. Text-based data within a company can be an extremely rich source of information. Therefore, it is very important to extract insights from this unstructured data. Extracting information from unstructured documents like product catalogs can be a difficult task due to their unorganized nature.
Currently, there is no such system in practical use for easily gaining insight from manuals or product catalogs. Referring to product manuals and catalogs is a recurrent task in the work of electrical engineers, and they have to spend considerable time on it. This directly affects an employee's efficiency and productivity. In the electrical industry especially, engineers have to go through many product catalogs to find information on a single item. Over the past ten years, Natural Language Processing (NLP) and Machine Learning (ML) have had a major impact on business processes. NLP and ML are becoming leading enterprise-level technologies that make it possible to perform business tasks that were previously out of reach. Many technologies have been introduced to fill the gaps and meet the requirements; however, NLP and ML are becoming more popular than the other technologies in the industry. In this research, I present a concept for gaining more insightful information from unstructured data such as product catalogs. The aim of the research is to develop a digital assistant, using NLP and ML, to which electrical engineers can submit queries and easily get information about their products. This will increase the efficiency and productivity of electrical engineers by letting them avoid time-consuming activities such as reading product catalogs.

Publication Open Access
IoT Based Smart Waste Segregation Using Machine Learning for Home Environment
(2022-11) Rijah, U.L.M

Publication Embargo
Decision Support System for Overcoming the Challenges in Vocational Education in Sri Lanka
(2021) Lakshani, J. K. A. M.
Vocational education is undergoing continuous change. In the past, unfamiliarity with vocational education has contributed to high youth unemployment. Researchers and policy makers are paying attention to vocational education because of its underappreciated importance.
In Sri Lanka, vocational education is provided through the 13-year mandatory education system. This project identifies the challenges of vocational education and proposes solutions to enhance its effectiveness, using the sample scenario of professional entry. There are several issues in the vocational education system. The major challenge is that fewer students complete their studies successfully than commence them. The main objective of this research is to develop a data-driven decision support system that mitigates student dropout from vocational education, using a deep learning model with higher accuracy than previous systems. Accurate data collection helps to maintain the integrity of research in any field. The project collected a real data set from students and teachers in selected government schools in Sri Lanka. Data was collected in three main categories: demographic factors, academic performance, and candidate interest. The collected data was analyzed using data analysis techniques. The decision support system uses a machine learning model to predict suitable vocational education pathways for students. The model is a deep neural network (DNN) built with the PyTorch library. After training, the model achieved an accuracy of 96.06%.

Publication Embargo
Unsupervised Sinhala Cyberbullying Categorization
(2021) Chandrasena, B.G.M
The objective of the unsupervised machine learning approach is to categorize social media comments into a given number of pre-learned categories. Earlier studies in this domain have used many datasets for supervised learning and introduced a large number of techniques and methodologies. A major challenge in those studies was obtaining training labels. Although comments for training are easy to find, separating them into categories manually is not an easy task. Through this research, we hope to find a solution to this using unsupervised machine learning techniques.
The proposed technique splits the comments into words, removes special characters, emojis, and links, and categorizes each comment using a keyword list for each category and similarity scores. The result is then used to categorize comments for training. In comparison with supervised machine learning techniques for cyberbullying, the implemented method shows the same performance. Therefore, this mechanism can be used wherever low-cost cyberbullying identification is needed. It can also be used to create training comments.
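The keyword-based categorization described in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the category names, the English placeholder keywords, the function names, and the use of Jaccard overlap as the similarity measure are all assumptions made for illustration, since the abstract does not specify them.

```python
import re
import unicodedata

# Matches links so they can be removed before tokenizing.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")

# Hypothetical keyword lists; the actual categories and keywords
# used in the thesis are not given in the abstract.
CATEGORY_KEYWORDS = {
    "insult": {"stupid", "idiot", "loser"},
    "threat": {"hurt", "kill", "destroy"},
}


def clean_and_tokenize(comment):
    """Remove links, replace punctuation and symbols (including most
    emoji) with spaces, and split the comment into words."""
    text = URL_PATTERN.sub(" ", comment.lower())
    kept = [
        " " if unicodedata.category(ch)[0] in ("P", "S") else ch
        for ch in text
    ]
    return "".join(kept).split()


def categorize(comment, category_keywords=CATEGORY_KEYWORDS):
    """Return the category whose keyword list overlaps most with the
    comment (Jaccard similarity, an assumed measure), or None when no
    keyword matches."""
    tokens = set(clean_and_tokenize(comment))
    if not tokens:
        return None
    best_label, best_score = None, 0.0
    for label, keywords in category_keywords.items():
        score = len(tokens & keywords) / len(tokens | keywords)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    print(categorize("You are such an idiot!!! 😡 http://example.com"))
    print(categorize("nice photo"))
```

Comments labeled this way could then serve as automatically generated training examples, which is the use the abstract mentions. Splitting on whitespace rather than on a word-character regex is a deliberate choice here, since it avoids breaking words in scripts that rely on combining vowel signs.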
