Browsing by Author "Amaratunga, D"

Open Access
Comparing Methods for Detecting Anomalous Values in Automated Laboratory Processes
(Faculty of Humanities and Sciences, SLIIT, 2024-05-30) Madhushanka, H. M. S; Amaratunga, D
Outlier detection is used in many domains. In automated laboratory processes, detecting anomalous values is critical for ensuring the reliability of experimental results. This study compares several outlier detection methods, including traditional statistical approaches such as Mahalanobis distance and the median with the median absolute deviation (MAD), as well as modern machine learning techniques such as Isolation Forest, Angle-Based Outlier Detection (ABOD), and Local Outlier Factor (LOF). The methods were evaluated on simulated multivariate data with different types of outliers and levels of contamination. Comparisons are made using sensitivity, precision, and, primarily, the F2 score, a weighted metric of sensitivity and precision that gives more weight to sensitivity. The results show that in univariate settings, the Median MAD method works consistently well. For multivariate scenarios, Mahalanobis methods with Minimum Covariance Determinant and Minimum Volume Ellipsoid estimates work well even at high contamination percentages. This study highlights the importance of selecting an outlier detection method appropriate to the situation.
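As a rough illustration of the univariate case, the sketch below flags values with a median/MAD rule and scores the result with an F-beta metric. It is not the study's code: the function names, the 3.5 cutoff, the 1.4826 consistency constant, and the sample data are all assumptions chosen for illustration. With beta = 2, the F-beta score weights sensitivity (recall) more heavily than precision.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Flag values whose robust z-score exceeds `threshold`.

    The robust z-score is |x - median| / (1.4826 * MAD), where MAD is the
    median absolute deviation. The cutoff 3.5 is an assumed default.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Degenerate case: more than half the values are identical.
        return [False] * len(values)
    scale = 1.4826 * mad  # makes MAD comparable to the SD for normal data
    return [abs(v - med) / scale > threshold for v in values]

def f_beta(truth, flagged, beta=2.0):
    """F-beta score; beta > 1 weights sensitivity more than precision."""
    tp = sum(t and f for t, f in zip(truth, flagged))
    fp = sum((not t) and f for t, f in zip(truth, flagged))
    fn = sum(t and (not f) for t, f in zip(truth, flagged))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * sensitivity / (b2 * precision + sensitivity)

# Hypothetical measurements with one planted anomaly (25.0).
values = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 25.0, 9.7]
truth = [False, False, False, False, False, False, True, False]
flagged = mad_outliers(values)
score = f_beta(truth, flagged)
```

Because the median and MAD are barely moved by the single extreme value, the rule isolates it cleanly here; a mean/SD rule on the same data would be computed from statistics that the outlier itself has inflated.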
Embargo
Comparing Trends in Data (with Applications to COVID and Image Data)
(Faculty of Humanities and Sciences, SLIIT, 2021-09-25) Amaratunga, D; Cabrera, J
Many applications involve looking at and comparing trends in data. We will discuss some statistics that can be used to assess the similarity or dissimilarity between pairs of cumulative trends. These statistics can then be used to study sets of trends, for example to cluster them or to compare them across different groups. We will describe one possible approach and illustrate its use in two case studies. In the first case study, we studied the trend over time of COVID-19 in New Jersey in the USA. Areas close to New York City had significantly different (more rapidly increasing) cumulative trends than areas further from New York City during the early days of the pandemic, but this difference dissipated as the pandemic progressed and spread within New Jersey itself. In the second case study, we compared two sets of CT scan images of lungs and detected a significant difference between COPD-diseased lungs and normal lungs. Overall, the method performed well and revealed meaningful differences.
Open Access
Data Smoothing and Other Methods for Generating Forecasts for COVID-19 Cases in Sri Lanka
(Faculty of Humanities and Sciences, SLIIT, 2023-11-01) Siriwardena, G; Dharmaratne, G; Amaratunga, D
The COVID-19 pandemic has significantly impacted global society, including Sri Lanka, creating a need for reliable forecasting methods. This study compares ten distinct models for predicting the number of confirmed COVID-19 cases in Sri Lanka, aiming to assess the performance of statistical models on limited and volatile real-world data characterized by trends, random peaks, and autocorrelations. In addition to the classical ARIMA model, various smoothing and filtering techniques were explored to capture the unique characteristics of the data. The consistency of the models in multiple-day predictions was demonstrated, and both robust and non-robust evaluation criteria were used to strengthen the evaluation. The results highlight the effectiveness of traditional smoothing strategies such as Simple Exponential Smoothing, Holt’s Exponential Smoothing, and the Smoothing Splines technique coupled with the ARIMA model. Notably, applying the ARIMA model directly to the original data, without smoothing or filtering, yielded inadequate forecasts, underscoring its limitations in volatile data settings.
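To make one of the named techniques concrete, here is a minimal sketch of Simple Exponential Smoothing on its own, without the ARIMA coupling the study describes. The function name, the smoothing weight `alpha = 0.5`, and the daily counts are hypothetical and are not taken from the paper.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the final smoothed level, which serves as the flat forecast
    for every future step under Simple Exponential Smoothing.

    level_t = alpha * y_t + (1 - alpha) * level_{t-1}, starting at y_0.
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

# Hypothetical daily case counts, for illustration only.
daily_cases = [100, 120, 90, 150, 130]
forecast = simple_exponential_smoothing(daily_cases, alpha=0.5)
```

A smaller `alpha` damps random peaks more strongly at the cost of reacting more slowly to a genuine change in level, which is the trade-off that matters for volatile series like the one the abstract describes.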
Embargo
Identifying Proteins Associated with Disease Severity
(Faculty of Humanities and Sciences, SLIIT, 2022-09-15) Samarawickrama, O; Jayatillake, R; Amaratunga, D
Proteomic studies, or studies of protein expression levels, are growing swiftly with steady improvements in technology and in the understanding of various anomalies affecting humans. Because differentially expressed proteins influence overall cell functionality, they can improve discrimination between healthy and diseased states. Identifying key proteins offers prospective insights for developing optimized and targeted treatment methods. This research analyzes data from an early-stage study whose main purpose was to identify differentially expressed proteins. The presence of 3 progressively serious states of disease (healthy to mild to severe) increases the importance of this study, because little of the research literature considers ordinal outcomes in studies of this nature. The analysis has 2 stages: univariate and multi-protein. In the univariate stage, a continuation ratio model was fitted to one protein at a time to pick those that exhibit potential ordinality. In the next stage, a penalized continuation ratio model with lasso regularization, combined with bootstrapping of proteins, was used to identify protein combinations that perform well together. The combined results of the univariate and multi-protein analyses identified the 20 most dominant proteins, which discriminate satisfactorily between the disease states in an ordinal manner.
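For orientation, one common (forward) formulation of a continuation ratio model and its lasso-penalized fit is written out below. The abstract does not say which formulation or parameterization was used, so the symbols here are assumptions: Y is the disease state coded 1 (healthy), 2 (mild), 3 (severe), x is the vector of protein expression levels, and lambda is the penalty weight.

```latex
% Forward continuation ratio model for an ordinal outcome with 3 states:
% the odds of being in state j, given that the state is j or higher.
\operatorname{logit} P(Y = j \mid Y \ge j,\, x) = \alpha_j + \beta^{\top} x,
\qquad j = 1, 2

% Lasso-penalized estimation: maximize the log-likelihood \ell minus an
% L1 penalty, which shrinks some protein coefficients exactly to zero.
(\hat{\alpha}, \hat{\beta}) = \arg\max_{\alpha,\, \beta}
\left\{ \ell(\alpha, \beta) - \lambda \sum_{k} \lvert \beta_k \rvert \right\}
```

In the univariate stage x would hold a single protein, so a clearly non-zero slope is what signals potential ordinality for that protein.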
Embargo
Reference Ranges and Control Limits that are Resistant to Baseline Outliers
(Faculty of Humanities and Sciences, SLIIT, 2022-09-15) Amaratunga, D
Reference ranges and control limits are used in many settings – for example, to assess a person’s health or to monitor the stability of a manufacturing process. Such ranges are established from a baseline sample of what is considered normal data, but it is not always possible to keep a few outliers out of even this sample. If, as is common, the range is calculated using statistics that can be influenced by outliers, such as the mean and standard deviation, then the use of such a range could adversely affect the decisions made. This can be avoided by constructing the reference range using statistics that are resistant to outliers. In this paper, we demonstrate the superior performance of such an approach.
Open Access
Reference Ranges and Control Limits that are Resistant to Baseline Outliers
(Faculty of Humanities and Sciences, SLIIT, 2022-12-24) Amaratunga, D
Reference ranges and control limits are used in many settings – for example, to assess a person’s health or to monitor the stability of a manufacturing process. Such ranges are established from a baseline sample of what is considered normal data, but it is not always possible to keep a few outliers out of even this sample. If, as is common, the range is calculated using statistics that can be influenced by outliers, such as the mean and standard deviation, then the use of such a range could adversely affect the decisions made. This can be avoided by constructing the reference range using statistics that are resistant to outliers. In this paper, we studied possible approaches and found two methods that had superior performance overall: one based on MM-estimation and one based on a form of Winsorization.
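The effect of a baseline outlier can be sketched with a generic Winsorization step. This is not the paper's procedure: the specific form of Winsorization it evaluates is not described here, and MM-estimation is not attempted. The function names, the clipping width `clip = 3.0`, the multiplier `k = 2.0`, and the baseline values are assumptions for illustration.

```python
from statistics import mean, median, stdev

def classical_range(baseline, k=2.0):
    """Reference range as mean +/- k standard deviations."""
    m, s = mean(baseline), stdev(baseline)
    return (m - k * s, m + k * s)

def winsorized_range(baseline, k=2.0, clip=3.0):
    """Reference range from Winsorized data (a generic, assumed variant).

    Values further than `clip` robust SDs from the median are pulled back
    to that boundary before the mean and SD are computed. No correction
    is applied for the SD shrinkage that clipping causes.
    """
    med = median(baseline)
    scale = 1.4826 * median(abs(v - med) for v in baseline)
    lo, hi = med - clip * scale, med + clip * scale
    clipped = [min(max(v, lo), hi) for v in baseline]
    m, s = mean(clipped), stdev(clipped)
    return (m - k * s, m + k * s)

# Hypothetical baseline sample contaminated by one outlier (25.0).
baseline = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 25.0, 9.7]
c_lo, c_hi = classical_range(baseline)
w_lo, w_hi = winsorized_range(baseline)
```

On this sample the single outlier stretches the classical range to cover values far below and above anything seen in the clean data, while the Winsorized range stays close to the bulk of the baseline.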