Search results for: missing data
24846 Data Mining Techniques for Anti-Money Laundering
Authors: M. Sai Veerendra
Abstract:
Today, money laundering (ML) poses a serious threat not only to financial institutions but also to the nation. This criminal activity is becoming more and more sophisticated and seems to have moved from the cliché of drug trafficking to financing terrorism and surely not forgetting personal gain. Most financial institutions internationally have been implementing anti-money laundering (AML) solutions to fight investment fraud activities. However, traditional investigative techniques consume numerous man-hours. Recently, data mining approaches have been developed and are considered well-suited techniques for detecting ML activities. Within the scope of a collaboration project on developing a new data mining solution for AML Units in an international investment bank in Ireland, we survey recent data mining approaches for AML. In this paper, we present not only these approaches but also give an overview of the important factors in building data mining solutions for AML activities.
Keywords: data mining, clustering, money laundering, anti-money laundering solutions
Procedia PDF Downloads 539
24845 Development of New Technology Evaluation Model by Using Patent Information and Customers' Review Data
Authors: Kisik Song, Kyuwoong Kim, Sungjoo Lee
Abstract:
Many global firms and corporations derive new technology and opportunity by identifying vacant technology from patent analysis. However, previous studies failed to focus on technologies that promised continuous growth in industrial fields. Most studies that derive new technology opportunities do not test practical effectiveness. Since previous studies depended on expert judgment, it became costly and time-consuming to evaluate new technologies based on patent analysis. Therefore, this research suggests a quantitative and systematic approach to technology evaluation indicators that uses patent data and data from customer communities. The first step involves collecting the two types of data. The data is then used to construct evaluation indicators, and these indicators are applied to the evaluation of new technologies. This type of data mining allows a new method of technology evaluation and a better predictor of how new technologies are adopted.
Keywords: data mining, evaluating new technology, technology opportunity, patent analysis
Procedia PDF Downloads 378
24844 Anomaly Detection Based on System Log Data
Authors: M. Kamel, A. Hoayek, M. Batton-Hubert
Abstract:
With the increase of network virtualization and the disparity of vendors, the continuous monitoring and detection of anomalies cannot rely on static rules. An advanced analytical methodology is needed to discriminate between ordinary events and unusual anomalies. In this paper, we focus on log data (textual data), which is a crucial source of information for network performance. Then, we introduce an algorithm used as a pipeline to help with the pretreatment of such data, group it into patterns, and dynamically label each pattern as an anomaly or not. Such tools will provide users and experts with continuous real-time log monitoring capability to detect anomalies and failures in the underlying system that can affect performance. An application to real-world data illustrates the algorithm.
Keywords: logs, anomaly detection, ML, scoring, NLP
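The pipeline outlined in this abstract (pretreat log lines, group them into patterns, label each pattern) can be illustrated with a very small sketch. This is not the authors' algorithm: the masking rules, the function names `to_pattern` and `label_patterns`, and the `rare_fraction` threshold are all assumptions chosen for illustration, and a simple frequency cut-off stands in for whatever scoring the paper actually uses.

```python
import re
from collections import Counter


def to_pattern(line: str) -> str:
    """Mask variable fields so that similar log lines share one pattern.

    Hypothetical pretreatment: hex ids become <HEX>, numbers (including
    dotted ones such as IP addresses) become <NUM>.
    """
    line = re.sub(r"0x[0-9a-fA-F]+", "<HEX>", line)
    line = re.sub(r"\d+(?:\.\d+)*", "<NUM>", line)
    return line.strip()


def label_patterns(lines, rare_fraction=0.2):
    """Group lines by pattern and flag patterns that are rare.

    A pattern is labelled "anomaly" when its share of all lines is below
    rare_fraction (an assumed threshold), otherwise "normal".
    """
    counts = Counter(to_pattern(line) for line in lines)
    total = sum(counts.values())
    return {
        pattern: ("anomaly" if count / total < rare_fraction else "normal")
        for pattern, count in counts.items()
    }


# Made-up log lines, used only to exercise the functions.
logs = [
    "link up port 1",
    "link up port 2",
    "link up port 3",
    "link up port 4",
    "link up port 5",
    "kernel panic at 0x1f",
]
for pattern, label in label_patterns(logs).items():
    print(label, "|", pattern)
```

A fixed frequency threshold is the simplest possible labelling rule; the abstract describes dynamic labelling, which would require re-scoring patterns as new logs arrive.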
Procedia PDF Downloads 95
24843 EnumTree: An Enumerative Biclustering Algorithm for DNA Microarray Data
Authors: Haifa Ben Saber, Mourad Elloumi
Abstract:
In a number of domains, such as DNA microarray data analysis, we need to cluster simultaneously rows (genes) and columns (conditions) of a data matrix to identify groups of constant rows with a group of columns. This kind of clustering is called biclustering. Biclustering algorithms are extensively used in DNA microarray data analysis. More effective biclustering algorithms are highly desirable and needed. We introduce a new algorithm called Enumerative tree (EnumTree) for biclustering of binary microarray data. EnumTree is an algorithm adopting the approach of enumerating biclusters. This algorithm extracts all biclusters of consistently good quality. The main idea of EnumTree is the construction of a new tree structure to represent adequately different biclusters discovered during the process of enumeration. This algorithm adopts the strategy of discovering all biclusters at a time. The performance of the proposed algorithm is assessed using both synthetic and real DNA microarray data; our algorithm outperforms other biclustering algorithms for binary microarray data. Biclusters with different numbers of rows. Moreover, we test the biological significance using a gene annotation web tool to show that our proposed method is able to produce biologically relevant biclusters.
Keywords: DNA microarray, biclustering, gene expression data, tree, data mining
Procedia PDF Downloads 372
24842 The Impact of Financial Reporting on Sustainability
Authors: Lynn Ruggieri
Abstract:
The worldwide pandemic has only increased sustainability awareness. The public is demanding that businesses be held accountable for their impact on the environment. While financial data enjoys uniformity in reporting requirements, there are no uniform reporting requirements for non-financial data. Europe is leading the way with some standards being implemented for reporting non-financial sustainability data; however, there is no uniformity globally. And without uniformity, there is not a clear understanding of what information to include and how to disclose it. Sustainability reporting will provide important information to stakeholders and will enable businesses to understand their impact on the environment. Therefore, there is a crucial need for this data. This paper looks at the history of sustainability reporting in the countries of the European Union and throughout the world and makes a case for worldwide reporting requirements for sustainability.
Keywords: financial reporting, non-financial data, sustainability, global financial reporting
Procedia PDF Downloads 179
24841 Methods and Algorithms of Ensuring Data Privacy in AI-Based Healthcare Systems and Technologies
Authors: Omar Farshad Jeelani, Makaire Njie, Viktoriia M. Korzhuk
Abstract:
Recently, the application of AI-powered algorithms in healthcare continues to flourish. Particularly, access to healthcare information, including patient health history, diagnostic data, and PII (Personally Identifiable Information) is paramount in the delivery of efficient patient outcomes. However, as the exchange of healthcare information between patients and healthcare providers through AI-powered solutions increases, protecting a person’s information and their privacy has become even more important. Arguably, the increased adoption of healthcare AI has resulted in a significant concentration on the security risks and protection measures to the security and privacy of healthcare data, leading to escalated analyses and enforcement. Since these challenges are brought by the use of AI-based healthcare solutions to manage healthcare data, AI-based data protection measures are used to resolve the underlying problems. Consequently, this project proposes AI-powered safeguards and policies/laws to protect the privacy of healthcare data. The project presents the best-in-school techniques used to preserve the data privacy of AI-powered healthcare applications. Popular privacy-protecting methods like Federated learning, cryptographic techniques, differential privacy methods, and hybrid methods are discussed together with potential cyber threats, data security concerns, and prospects. Also, the project discusses some of the relevant data security acts/laws that govern the collection, storage, and processing of healthcare data to guarantee owners’ privacy is preserved. This inquiry discusses various gaps and uncertainties associated with healthcare AI data collection procedures and identifies potential correction/mitigation measures.
Keywords: data privacy, artificial intelligence (AI), healthcare AI, data sharing, healthcare organizations (HCOs)
Procedia PDF Downloads 96
24840 Mapping Tunnelling Parameters for Global Optimization in Big Data via Dye Laser Simulation
Authors: Sahil Imtiyaz
Abstract:
One of the biggest challenges has emerged from the ever-expanding, dynamic, and instantaneously changing space of Big Data, and finding a data point and inheriting wisdom in this space is a hard task. In this paper, we reduce the space of big data in a Hamiltonian formalism that is in concordance with the Ising model. For this formulation, we simulate the system using a dye laser in FORTRAN and analyse the dynamics of the data point in the energy well of a rhodium atom. After mapping the photon intensity and pulse width to energy and potential, we conclude that as we increase the energy, the probability of tunnelling also increases up to some point, then starts decreasing, and then shows a randomizing behaviour. This is due to decoherence with the environment, and hence there is a loss of ‘quantumness’. This interprets the efficiency parameter and the extent of quantum evolution. The results are strongly encouraging in favour of the use of ‘Topological Property’ as a source of information instead of the qubit.
Keywords: big data, optimization, quantum evolution, Hamiltonian, dye laser, fermionic computations
Procedia PDF Downloads 195
24839 Applying Different Stenography Techniques in Cloud Computing Technology to Improve Cloud Data Privacy and Security Issues
Authors: Muhammad Muhammad Suleiman
Abstract:
Cloud Computing is a versatile concept that refers to a service that allows users to outsource their data without having to worry about local storage issues. However, the most pressing issue to be addressed is maintaining a secure and reliable data repository rather than relying on untrustworthy service providers. In this study, we look at how steganography approaches, in combination with Digital Watermarking, can greatly improve the system's effectiveness and data security when used for Cloud Computing. The main requirement of such frameworks, where data is transferred or exchanged between servers and users, is safe data management in cloud environments. Steganography in the cloud is among the most effective methods for safe communication. Steganography is a method of writing coded messages in such a way that only the sender and recipient can safely interpret and display the information hidden in the communication channel. This study presents a new text steganography method for hiding a loaded hidden English text file in a cover English text file to ensure data protection in cloud computing. Data protection, data hiding capability, and time were all improved using the proposed technique.
Keywords: cloud computing, steganography, information hiding, cloud storage, security
Procedia PDF Downloads 192
24838 Investigation on Performance of Change Point Algorithm in Time Series Dynamical Regimes and Effect of Data Characteristics
Authors: Farhad Asadi, Mohammad Javad Mollakazemi
Abstract:
In this paper, Bayesian online inference in models of data series is constructed with a change-point algorithm, which separates the observed time series into independent series and studies the change and variation of the regime of the data with related statistical characteristics. Variation of the statistical characteristics of time series data often represents separate phenomena in a dynamical system, such as a change in the state of brain dynamics reflected in EEG signal measurements or a change in an important regime of data in many dynamical systems. In this paper, a prediction algorithm for studying change-point locations in time series data is simulated. It is verified that the pattern of the proposed distribution of the data is an important factor for simpler and smoother fluctuation of the hazard rate parameter and for better identification of change-point locations. Finally, the conditions under which the time series distribution affects factors in this approach are explained and validated with different time series databases for some dynamical systems.
Keywords: time series, fluctuation in statistical characteristics, optimal learning, change-point algorithm
Procedia PDF Downloads 427
24837 Determination of the Risks of Heart Attack at the First Stage as Well as Their Control and Resource Planning with the Method of Data Mining
Authors: İbrahi̇m Kara, Seher Arslankaya
Abstract:
Frequently preferred in the field of engineering in particular, data mining has now begun to be used in the field of health as well since the data in the health sector have reached great dimensions. With data mining, it is aimed to reveal models from the great amounts of raw data in agreement with the purpose and to search for the rules and relationships which will enable one to make predictions about the future from the large amount of data set. It helps the decision-maker to find the relationships among the data which form at the stage of decision-making. In this study, it is aimed to determine the risk of heart attack at the first stage, to control it, and to make its resource planning with the method of data mining. Through the early and correct diagnosis of heart attacks, it is aimed to reveal the factors which affect the diseases, to protect health and choose the right treatment methods, to reduce the costs in health expenditures, and to shorten the durations of patients’ stay at hospitals. In this way, the diagnosis and treatment costs of a heart attack will be scrutinized, which will be useful to determine the risk of the disease at the first stage, to control it, and to make its resource planning.
Keywords: data mining, decision support systems, heart attack, health sector
Procedia PDF Downloads 358
24836 Improvement in Oral Health-Related Quality of Life of Adult Patients After Rehabilitation With Partial Dentures: A Systematic Review and Meta-Analysis
Authors: Adama NS Bah
Abstract:
Background: Loss of teeth has a negative influence on essential oral functions such as phonetics, mastication, and aesthetics. Dentists treat people with prosthodontic rehabilitation to recover essential oral functions. The oral health quality of life inventory reflects the success of prosthodontic rehabilitation. In many countries, the current conventional care delivered to replace missing teeth for adult patients involves the provision of removable partial dentures. Aim: The aim of this systematic review and meta-analysis is to gather the best available evidence to determine patients’ oral health-related quality of life improvement after treatment with partial dentures. Methods: We searched electronic databases from January 2010 to September 2019, including PubMed, ProQuest, Science Direct, Scopus and Google Scholar. Studies were included only if the average participant age was 30 years or above and they were published in English. Two reviewers independently screened and selected all the references based on inclusion criteria using the PRISMA guideline, and assessed the quality of the included references using the Joanna Briggs Institute quality assessment tools. Data extracted were analyzed in RevMan 5.0 software. The heterogeneity between the studies was assessed using forest plots, the I² statistic, and the chi-square test, with a P value less than 0.05 indicating statistical significance. Random-effects models were used in case of moderate or high heterogeneity. Four studies were included in the systematic review and three studies were pooled for meta-analysis.
Results: Four studies were included in the systematic review and three in the meta-analysis, with a total of 285 patients, comparing oral health-related quality of life before and after rehabilitation with partial dentures. The pooled results showed an improvement in oral health-related quality of life after treatment with partial dentures (mean difference 5.25; 95% CI [3.81, 6.68], p < 0.00001), favoring the wearing of partial dentures. To ascertain the reliability of the studies included in the meta-analysis, risk of bias was assessed using the Cochrane Collaboration tool for risk of bias assessment and was found to be low in all of them. Conclusion: There is high evidence that rehabilitation with partial dentures can improve the patient’s oral health-related quality of life measured with Oral Health Impact Profile 14. This review has clinical evidence value for dentists treating the expanding vulnerable adult population.
Keywords: meta-analysis, oral health impact profile, partial dentures, systematic review
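The heterogeneity assessment and random-effects pooling mentioned in the Methods can be sketched as follows. The abstract does not report per-study effects or the exact RevMan settings, so this is only an illustration under assumptions: it uses the DerSimonian-Laird estimator of between-study variance, a normal-approximation 95% interval, and made-up input numbers. The function name `random_effects_pool` is hypothetical.

```python
import math


def random_effects_pool(effects, variances):
    """Pool per-study mean differences with a DerSimonian-Laird random-effects model.

    effects   -- per-study mean differences
    variances -- per-study sampling variances (squared standard errors)
    Returns the pooled effect, a 95% interval, tau^2 and I^2 (in percent).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                      # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0       # between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0  # I^2 heterogeneity
    w_star = [1.0 / (v + tau2) for v in variances]        # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {
        "pooled": pooled,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "i2_percent": i2,
    }


# Hypothetical inputs, not the data of the reviewed studies.
result = random_effects_pool([4.0, 5.5, 6.0], [0.8, 1.1, 0.9])
print(result)
```

When the studies agree closely, tau² collapses to zero and the result coincides with the fixed-effect estimate, which is why a random-effects model matters mainly under moderate or high heterogeneity.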
Procedia PDF Downloads 107
24835 A Preliminary End-Point Approach for Calculating Odorous Emissions in Life Cycle Assessment
Authors: G. M. Cappucci, C. Losi, P. Neri, M. Pini, A. M. Ferrari
Abstract:
Waste treatment and many production processes cause significant emissions of odors, thus typically leading to intense debate. The introduction of odorimetric units and their units of measurement, i.e., U.O. / m3, with the European regulation UE 13725 of 2003 designates the dynamic olfactometry as the official method for odorimetric analysis. Italy has filled the pre-existing legislative gap on the regulation of odorous emissions only recently, by introducing the Legislative Decree n°183 in 2017. The concentration of the odor to which a perceptive response occurs to 50% of the panel corresponds to the odorimetric unit of the sample under examination (1 U.O. / m3) and is equal to the threshold of perceptibility of the substance (O.T.). In particular, the treatment of Municipal Solid Waste (MSW) by Mechanical-Biological Treatment (MBT) plants produces odorous emissions, typically generated by aerobic procedures, potentially leading to significant environmental burdens. The quantification of odorous emissions represents a challenge within a LCA study since primary data are often missing. The aim of this study is to present the preliminary findings of an ongoing study whose aim is to identify and quantify odor emissions from the Tre Monti MBT plant, located in Imola (Bologna, Italy). Particularly, the issues faced with odor emissions in the present work are: i) the identification of the components of the gaseous mixture, whose total quantification in terms of odorimetric units is known, ii) the distribution of the total odorimetric units among the single substances identified and iii) the quantification of the mass emitted for each substance. The environmental analysis was carried out on the basis of the amount of emitted substance. The calculation method IMPact Assessment of Chemical Toxics (IMPACT) 2002+ has been modified since the original one does not take into account indoor emissions. 
Characterization factors were obtained by adopting a preliminary method in order to calculate indoor human effects. The impact and damage assessments were performed without the identification of new categories, thus in accordance with the categories of the selected calculation method. The results show that the damage associated with odorous emissions is 0.24% of the total damage, and the most affected damage category is Human Health, mainly as a consequence of ammonia emission (86.06%). In conclusion, this preliminary approach allowed identifying and quantifying the substances responsible for the odour impact, in order to attribute to them the relative damage to human health as well as ecosystem quality.
Keywords: life cycle assessment, municipal solid waste, odorous emissions, waste treatment
Procedia PDF Downloads 174
24834 Bayesian Borrowing Methods for Count Data: Analysis of Incontinence Episodes in Patients with Overactive Bladder
Authors: Akalu Banbeta, Emmanuel Lesaffre, Reynaldo Martina, Joost Van Rosmalen
Abstract:
Including data from previous studies (historical data) in the analysis of the current study may reduce the sample size requirement and/or increase the power of analysis. The most common example is incorporating historical control data in the analysis of a current clinical trial. However, this only applies when the historical control data are similar enough to the current control data. Recently, several Bayesian approaches for incorporating historical data have been proposed, such as the meta-analytic-predictive (MAP) prior and the modified power prior (MPP) both for single control as well as for multiple historical control arms. Here, we examine the performance of the MAP and the MPP approaches for the analysis of (over-dispersed) count data. To this end, we propose a computational method for the MPP approach for the Poisson and the negative binomial models. We conducted an extensive simulation study to assess the performance of Bayesian approaches. Additionally, we illustrate our approaches on an overactive bladder data set. For similar data across the control arms, the MPP approach outperformed the MAP approach with respect to the statistical power. When the means across the control arms are different, the MPP yielded a slightly inflated type I error (TIE) rate, whereas the MAP did not. In contrast, when the dispersion parameters are different, the MAP gave an inflated TIE rate, whereas the MPP did not. We conclude that the MPP approach is more promising than the MAP approach for incorporating historical count data.
Keywords: count data, meta-analytic prior, negative binomial, poisson
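The idea of borrowing from historical controls can be illustrated with the simplest member of the power prior family. The sketch below is not the modified power prior studied in the paper, which treats the borrowing weight as a random quantity and also covers the negative binomial model. It assumes a Poisson likelihood, a conjugate Gamma(shape, rate) initial prior, and a fixed weight `a0` in [0, 1]; the function name and all numbers are hypothetical.

```python
def power_prior_posterior(historical, current, a0, shape=1.0, rate=1.0):
    """Gamma posterior for a Poisson rate under a fixed-weight power prior.

    The historical likelihood is raised to the power a0, so a0 = 0 ignores
    the historical counts and a0 = 1 pools them fully with the current data.
    Returns the posterior (shape, rate) of the Gamma distribution.
    """
    if not 0.0 <= a0 <= 1.0:
        raise ValueError("a0 must lie in [0, 1]")
    post_shape = shape + a0 * sum(historical) + sum(current)
    post_rate = rate + a0 * len(historical) + len(current)
    return post_shape, post_rate


# Hypothetical episode counts, not the overactive bladder data set.
historical_counts = [2, 4]
current_counts = [3, 5]
for weight in (0.0, 0.5, 1.0):
    s, r = power_prior_posterior(historical_counts, current_counts, weight)
    print(f"a0={weight}: posterior mean rate = {s / r:.3f}")
```

With a fixed weight the amount of borrowing is chosen in advance; approaches such as the MPP and the MAP prior instead let the data inform how much historical information is used.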
Procedia PDF Downloads 118
24833 Strategic Citizen Participation in Applied Planning Investigations: How Planners Use Etic and Emic Community Input Perspectives to Fill-in the Gaps in Their Analysis
Authors: John Gaber
Abstract:
Planners regularly use citizen input as empirical data to help them better understand community issues they know very little about. This type of community data is based on the lived experiences of local residents and is known as "emic" data. What is becoming more common practice for planners is their use of data from local experts and stakeholders (known as "etic" data or the outsider perspective) to help them fill in the gaps in their analysis of applied planning research projects. Utilizing international Health Impact Assessment (HIA) data, I look at who planners invite to their citizen input investigations. Research presented in this paper shows that planners access a wide range of emic and etic community perspectives in their search for the “community’s view.” The paper concludes with how planners can chart out a new empirical path in their execution of emic/etic citizen participation strategies in their applied planning research projects.
Keywords: citizen participation, emic data, etic data, Health Impact Assessment (HIA)
Procedia PDF Downloads 484
24832 A Conceptual Model of Social Entrepreneurial Intention Based on the Social Cognitive Career Theory
Authors: Anh T. P. Tran, Harald Von Korflesch
Abstract:
Entrepreneurial intention plays a major role in entrepreneurship academia and practice. The spectrum ranges from the first model of the so-called Entrepreneurial Event, then the Theory of Planned Behavior, the Theory of Planned Behavior Entrepreneurial Model, and the Social Cognitive Career Theory to some typical empirical studies with more or less diverse results. However, little is known so far about the intentions of entrepreneurs in the social areas of venture creation. This is surprising, since social entrepreneurship is an emerging field with growing importance. Currently, all around the world, there is a big challenge with many urgent and growing social and environmental problems affecting groups such as poor households, people with disabilities, HIV/AIDS infected people, the lonely elderly, or neglected children, some of them current even in Western countries. In addition, the already existing literature on entrepreneurial intentions demonstrates a high level of theoretical diversity in general, and in particular a missing link to the social dimension of entrepreneurship. Seeking to fill the mentioned gaps in the social entrepreneurial intentions literature, this paper proposes a conceptual model of social entrepreneurial intentions based on the Social Cognitive Career Theory, with two main factors influencing entrepreneurial intentions, namely self-efficacy and outcome expectation. Moreover, motives, goals and plans do not arise from nothing but are shaped by interacting with the environment. Hence, personalities (i.e., agreeableness, conscientiousness, extraversion, neuroticism, openness) as well as contextual factors (e.g., role models, education, and perceived support) are also considered as the antecedents of social entrepreneurship intentions.
Keywords: entrepreneurial intention, social cognitive career theory, social entrepreneurial intention, social entrepreneurship
Procedia PDF Downloads 478
24831 Multilevel of Factors Affected Optimal Adherence to Antiretroviral Therapy and Viral Suppression amongst HIV-Infected Prisoners in South Ethiopia: A Prospective Cohort Study
Authors: Terefe Fuge, George Tsourtos, Emma Miller
Abstract:
Objectives: Maintaining optimal adherence and viral suppression in people living with HIV (PLWHA) is essential to ensure both preventative and therapeutic benefits of antiretroviral therapy (ART). Prisoners bear a particularly high burden of HIV infection and are highly likely to transmit to others during and after incarceration. However, the level of adherence and viral suppression, as well as its associated factors in incarcerated populations in low-income countries is unknown. This study aimed to determine the prevalence of non-adherence and viral failure, and contributing factors to this amongst prisoners in South Ethiopia. Methods: A prospective cohort study was conducted between June 1, 2019 and July 31, 2020 to compare the level of adherence and viral suppression between incarcerated and non-incarcerated PLWHA. The study involved 74 inmates living with HIV (ILWHA) and 296 non-incarcerated PLWHA. Background information including sociodemographic, socioeconomic, psychosocial, behavioural, and incarceration-related characteristics was collected using a structured questionnaire. Adherence was determined based on participants’ self-report and pharmacy refill records, and plasma viral load measurements which were undertaken within the study period were prospectively extracted to determine viral suppression. Various univariate and multivariate regression models were used to analyse data. Results: Self-reported dose adherence was approximately similar between ILWHA and non-incarcerated PLWHA (81% and 83% respectively), but ILWHA had a significantly higher medication possession ratio (MPR) (89% vs 75%). The prevalence of viral failure (VF) was slightly higher (6%) in ILWHA compared to non-incarcerated PLWHA (4.4%). The overall dose non-adherence (NA) was significantly associated with missing ART appointments, level of satisfaction with ART services, patient’s ability to comply with a specified medication schedule and types of methods used to monitor the schedule. 
In ILWHA specifically, accessing ART services from a hospital compared to a health centre, an inability to always attend clinic appointments, experience of depression and a lack of social support predicted NA. VF was significantly higher in males, people of age 31-35 years and in those who experienced social stigma, regardless of their incarceration status. Conclusions: This study revealed that HIV-infected prisoners in South Ethiopia were more likely to be non-adherent to doses and so to develop viral failure compared to their non-incarcerated counterparts. A multitude of factors was found to be responsible for this requiring multilevel intervention strategies focusing on the specific needs of prisoners.
Keywords: Adherence, Antiretroviral therapy, Incarceration, South Ethiopia, Viral suppression
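The abstract reports a medication possession ratio (MPR) derived from pharmacy refill records but does not state how it was computed. A commonly used definition is the total days of medication supplied divided by the number of days in the observation period, often capped at 1. The sketch below follows that common definition as an assumption; the function name, the capping choice, and the refill numbers are hypothetical and not taken from the study.

```python
def medication_possession_ratio(days_supplied, period_days, cap=True):
    """Share of an observation period covered by dispensed medication.

    days_supplied -- days of supply for each pharmacy refill in the period
    period_days   -- length of the observation period in days
    cap           -- if True, limit the ratio to 1.0 (early refills can
                     otherwise push it above 100%)
    """
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    ratio = sum(days_supplied) / period_days
    return min(ratio, 1.0) if cap else ratio


# Hypothetical refill history: three 30-day refills over a 120-day period.
print(medication_possession_ratio([30, 30, 30], 120))  # 0.75
```

Under this definition, an MPR of 75% would correspond to roughly three 30-day refills collected over four months.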
Procedia PDF Downloads 135
24830 Data Augmentation for Automatic Graphical User Interface Generation Based on Generative Adversarial Network
Authors: Xulu Yao, Moi Hoon Yap, Yanlong Zhang
Abstract:
As a branch of artificial neural networks, deep learning is widely used in the field of image recognition, but a lack of data leads to imperfect model learning. By analysing the data scale requirements of deep learning with a view to its application in GUI generation, it is found that collecting a GUI dataset is a time-consuming and labor-intensive project, which makes it difficult to meet the needs of current deep learning networks. To solve this problem, this paper proposes a semi-supervised deep learning model that relies on the original small-scale datasets to produce a large number of reliable datasets. By combining a recurrent neural network with a generative adversarial network, the recurrent neural network can learn the sequence relationships and characteristics of the data so that the generative adversarial network generates reasonable data, which is then used to expand the Rico dataset. Relying on the network structure, the characteristics of the collected data can be well analysed, and a large amount of reasonable data can be generated according to these characteristics. After data processing, a reliable dataset for model training can be formed, which alleviates the problem of dataset shortage in deep learning.
Keywords: GUI, deep learning, GAN, data augmentation
Procedia PDF Downloads 185
24829 Modelling Rainfall-Induced Shallow Landslides in the Northern New South Wales
Authors: S. Ravindran, Y. Liu, I. Gratchev, D. Jeng
Abstract:
Rainfall-induced shallow landslides are common in northern New South Wales (NSW), Australia. From 2009 to 2017, around 105 rainfall-induced landslides occurred along the road corridors and caused temporary road closures in northern NSW. Rainfall causing shallow landslides follows different distributions, varying from uniform and normal to decreasing and increasing rainfall intensity. The duration of rainfall varied from one day to 18 days according to historical data. The objective of this research is to analyse the slope instability of some of the sites in northern NSW by varying cumulative rainfall using SLOPE/W and SEEP/W and to compare the results with field data of rainfall causing shallow landslides. The rainfall data and topographical data from public authorities and soil data obtained from laboratory tests will be used for this modelling. According to field data, shallow landslides are likely if the cumulative rainfall is between 100 mm and 400 mm.
Keywords: landslides, modelling, rainfall, suction
Procedia PDF Downloads 184
24828 Machine Learning-Enabled Classification of Climbing Using Small Data
Authors: Nicholas Milburn, Yu Liang, Dalei Wu
Abstract:
Athlete performance scoring within the climbing domain presents interesting challenges as the sport does not have an objective way to assign skill. Assessing skill levels within any sport is valuable as it can be used to mark progress while training, and it can help an athlete choose appropriate climbs to attempt. Machine learning-based methods are popular for complex problems like this. The dataset available was composed of dynamic force data recorded during climbing; however, this dataset came with challenges such as data scarcity, imbalance, and temporal heterogeneity. Investigated solutions to these challenges include data augmentation, temporal normalization, conversion of time series to the spectral domain, and cross-validation strategies. The investigated solutions to the classification problem included the lightweight classifiers KNN and SVM as well as deep learning with a CNN. The best-performing model had an 80% accuracy. In conclusion, there seems to be enough information within climbing force data to accurately categorize climbers by skill.
Keywords: classification, climbing, data imbalance, data scarcity, machine learning, time sequence
Procedia PDF Downloads 143
24827 Analysis of Expression Data Using Unsupervised Techniques
Authors: M. A. I Perera, C. R. Wijesinghe, A. R. Weerasinghe
Abstract:
This study was conducted to review and identify the unsupervised techniques that can be employed to analyze gene expression data in order to better identify subtypes of tumors. Identifying subtypes of cancer helps improve the efficacy and reduce the toxicity of treatments by providing clues for finding targeted therapeutics. The process of gene expression data analysis is described in three steps: preprocessing, clustering, and cluster validation. Feature selection is important since genomic data are high-dimensional, with a large number of features compared to samples. Hierarchical clustering and K-means are often used in the analysis of gene expression data. Several cluster validation techniques are used to validate the clusters. Heatmaps are an effective external validation method that allows the identified classes to be compared with clinical variables and analyzed visually.
Keywords: cancer subtypes, gene expression data analysis, clustering, cluster validation
Procedia PDF Downloads 149
24826 The Political Haunting of “Martyrdom” in the Palestinian Context
Authors: Mai Awad
Abstract:
This paper focuses on the phenomenon of martyrdom, particularly its performative aspect, and on how social and popular cultural representations address the multiple meanings of the loaded image of a Palestinian martyr. This focus will help us to explore the possible reasons that might push Palestinians to consider pursuing “martyrdom” or suicide operations. Tracing what happened in the past and what is currently happening (that is, haunting) will aid in theorizing how the act/practice of “martyrdom” is produced. It is believed that there are social and political forces, particularly in a colonial society like Palestine, that influence the subject and its experience. What is unique about this paper is its attempt to disclose the invisible, hidden narratives and complexities of Palestinian life that we do not see. By giving “martyrs” a chance to speak and express their own narratives (“martyrs” usually leave written letters for their families, which are published after their death), this study seeks to broaden the whole picture and discuss what is missing. For the methodology, the paper uses discourse analysis to trace the emergence, circulation, and productivity of the martyrdom discourse across a range of social practices in Palestinians’ everyday life after the Nakba. The paper analyzes the letters that “martyrs” left to their families, relatives, and the Palestinian community. By letting “martyrs” speak for themselves and hearing their unique discourses, the research suggests that more explanation is needed to describe the “martyr” identity. Hence, it is not possible to study the “martyr” identity in Palestine without understanding the colonial context that governs it and shapes their subjective experience.
Keywords: martyrdom, palestine, haunting, nakba 1948
Procedia PDF Downloads 69
24825 Learning Analytics in a HiFlex Learning Environment
Authors: Matthew Montebello
Abstract:
Student engagement within a virtual learning environment generates masses of data points that can significantly contribute to the learning analytics that lead to decision support. Ideally, similar data is collected during student interaction with a physical learning space, and as a consequence, data is present at a large scale, even in relatively small classes. In this paper, we report on such an occurrence during classes held in a HiFlex modality as we investigate the advantages of adopting such a methodology. We plan to take full advantage of the learner-generated data in an attempt to further enhance the effectiveness of the adopted learning environment. This could shed crucial light on operating modalities that higher education institutions around the world will switch to in a post-COVID era.
Keywords: HiFlex, big data in higher education, learning analytics, virtual learning environment
Procedia PDF Downloads 201
24824 Li-Fi Technology: Data Transmission through Visible Light
Authors: Shahzad Hassan, Kamran Saeed
Abstract:
People are always in search of Wi-Fi hotspots because Internet access is in great demand nowadays. But like all other technologies, Wi-Fi still has room for improvement with regard to the speed and quality of connectivity. To address these aspects, Harald Haas, a professor at the University of Edinburgh, proposed what we know as Li-Fi (Light Fidelity). Li-Fi is a new technology in the field of wireless communication that provides connectivity within a network environment. It is a two-way mode of wireless communication using light. The data is transmitted through Light Emitting Diodes (LEDs), which can vary the intensity of light very fast, even faster than the blink of an eye. From the research and experiments conducted so far, it can be said that Li-Fi can increase the speed and reliability of data transfer. This paper pays particular attention to assessing the performance of this technology. In other words, it is a 5G technology which uses LEDs as the medium of data transfer. Wi-Fi is good for coverage within buildings, but Li-Fi can be considered favorable in situations where large amounts of data are to be transferred in areas with electromagnetic interference. It brings many data-related qualities, such as efficiency, security, and large throughput, to wireless communication. All in all, it can be said that Li-Fi is going to be a future phenomenon in which the presence of light will mean access to the Internet as well as speedy data transfer.
Keywords: communication, LED, Li-Fi, Wi-Fi
Procedia PDF Downloads 347
24823 An Analysis of Humanitarian Data Management of Polish Non-Governmental Organizations in Ukraine Since February 2022 and Its Relevance for Ukrainian Humanitarian Data Ecosystem
Authors: Renata Kurpiewska-Korbut
Abstract:
Assuming that the use and sharing of data generated in humanitarian action constitute a core function of humanitarian organizations, the paper analyzes the position of the largest Polish humanitarian non-governmental organizations in the humanitarian data ecosystem in Ukraine and their approach to non-personal and personal data management since February 2022. Expert interviews and document analysis of non-profit organizations providing a direct response in the Ukrainian crisis context (Polish Humanitarian Action, Caritas, Polish Medical Mission, Polish Red Cross, and the Polish Center for International Aid), together with the theoretical perspective of contingency theory, whose central point is that the context or specific set of conditions determines the way of behaving and the choice of methods of action, help to examine the significance of data complexity and of an adaptive approach to data management by relief organizations in the humanitarian supply chain network. The purpose of this study is to determine how the well-established and accurate internal procedures and good practices of using and sharing data (including safeguards for sensitive data) of the surveyed organizations, which have comparable human and technological capabilities, are implemented and adjusted to Ukrainian humanitarian settings and data infrastructure. The study also poses the fundamental question of whether this crisis experience will have a determining effect on their future performance.
The findings indicate that Polish humanitarian organizations in Ukraine, which have their own unique code of conduct and effective managerial data practices determined by contingencies, have limited influence on improving the situational awareness of other assistance providers in the data ecosystem, despite their attempts to undertake interagency work in the area of data sharing.
Keywords: humanitarian data ecosystem, humanitarian data management, polish NGOs, Ukraine
Procedia PDF Downloads 93
24822 An Approach for Estimation in Hierarchical Clustered Data Applicable to Rare Diseases
Authors: Daniel C. Bonzo
Abstract:
Practical considerations lead to the use of units of analysis within subjects, e.g., bleeding episodes or treatment-related adverse events, in rare disease settings. This is coupled with data augmentation techniques such as extrapolation to enlarge the subject base. In general, one can think of extrapolation of data as extending information and conclusions from one estimand to another. This approach induces hierarchical clustered data with varying cluster sizes. Extrapolation of clinical trial data is increasingly accepted by regulatory agencies as a means of generating data in diverse situations during the drug development process. Under certain circumstances, data can be extrapolated to a different population, a different but related indication, or a different but similar product. We consider here the problem of estimation (point and interval) using a mixed-models approach under extrapolation. It is proposed that estimators (point and interval) be constructed using weighting schemes for the clusters, e.g., equal weights or weights proportional to cluster size. Simulated data generated under varying scenarios are then used to evaluate the performance of this approach. The evaluation results showed that the approach is a useful means of improving statistical inference in rare disease settings and thus aids not only signal detection but risk-benefit evaluation as well.
Keywords: clustered data, estimand, extrapolation, mixed model
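The two weighting schemes mentioned in this abstract can be illustrated with plain weighted averages of cluster means. This is only a sketch of the point-estimate idea: it is not the mixed-model estimator of the paper, it omits interval estimation, and the function names and toy data are assumptions introduced for the example.

```python
from statistics import mean


def cluster_means(clusters):
    """Mean of each cluster (one cluster = the observations of one subject)."""
    return [mean(values) for values in clusters]


def equally_weighted_estimate(clusters):
    """Average the cluster means, giving every cluster the same weight."""
    return mean(cluster_means(clusters))


def size_weighted_estimate(clusters):
    """Weight each cluster mean by its size; this equals the pooled mean."""
    total = sum(len(values) for values in clusters)
    return sum(len(values) * mean(values) for values in clusters) / total


# Hypothetical data: three subjects with 1, 2, and 4 observations each.
clusters = [[2.0], [4.0, 6.0], [1.0, 1.0, 1.0, 1.0]]
print(equally_weighted_estimate(clusters))
print(size_weighted_estimate(clusters))
```

With varying cluster sizes the two schemes can differ noticeably: equal weighting treats each subject as one unit, while size-proportional weighting lets subjects with many episodes dominate.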
Procedia PDF Downloads 137
24821 Authorization of Commercial Communication Satellite Grounds for Promoting Turkish Data Relay System
Authors: Celal Dudak, Aslı Utku, Burak Yağlioğlu
Abstract:
Uninterrupted and continuous satellite communication throughout the whole orbit is becoming more indispensable every day. Data relay systems, such as TDRSS of the USA and EDRSS of Europe, are developed and built for various high and low data rate information exchanges. These missions rely on a couple of task-dedicated communication satellites. In this regard, an attempt is made to define a data relay system for Turkey that exchanges low data rate information (i.e., TTC) for Earth-observing LEO satellites by using commercial GEO communication satellites all over the world. First, the justification for this attempt is given by demonstrating the enhancement in link duration. The preference for RF communication over laser communication is also discussed. Then, the preferred communication GEOs, including TURKSAT4A, which already belongs to Turkey, are given, together with the coverage enhancements obtained through STK simulations and the corresponding link budget. A block diagram of the communication system on the LEO satellite is also given.
Keywords: communication, GEO satellite, data relay system, coverage
Procedia PDF Downloads 442
24820 The Development of Encrypted Near Field Communication Data Exchange Format Transmission in an NFC Passive Tag for Checking the Genuine Product
Authors: Tanawat Hongthai, Dusit Thanapatay
Abstract:
This paper presents the development of encrypted near field communication (NFC) data exchange format transmission in an NFC passive tag to assess the feasibility of implementing genuine product authentication. We organize the research on encryption and genuine product checking into four major categories: concept, infrastructure, development, and applications. The results show that a passive NFC Forum Type 2 tag can be configured to be compatible with the NFC data exchange format (NDEF), whose data can be automatically and partially updated when an NFC field is present.
Keywords: near field communication, NFC data exchange format, checking the genuine product, encrypted NFC
Procedia PDF Downloads 281
24819 Data Hiding by Vector Quantization in Color Image
Authors: Yung Gi Wu
Abstract:
With the growth of computers and networks, digital data can be spread anywhere in the world quickly. Digital data can also be copied or tampered with easily, so security has become an important topic in the protection of digital data. A digital watermark is a method to protect the ownership of digital data. Embedding the watermark inevitably affects image quality. In this paper, Vector Quantization (VQ) is used to embed the watermark into the image to fulfill the goal of data hiding. This kind of watermarking is invisible, which means that users will not be conscious of the embedded watermark even though the embedded image differs slightly from the original image. Because VQ carries a heavy computational burden, we adopt a fast VQ encoding scheme based on partial distortion searching (PDS) and a mean approximation scheme to speed up the data hiding process. The watermarks we hide in the image can be gray-level, bi-level, or color images. Text can also be embedded as a watermark. To test the robustness of the system, we use Photoshop to apply sharpening, cropping, and alteration, and check whether the extracted watermark is still recognizable. Experimental results demonstrate that the proposed system can resist these three kinds of tampering in general cases.
Keywords: data hiding, vector quantization, watermark, color image
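Partial distortion searching speeds up VQ encoding by abandoning a candidate codeword as soon as its accumulated squared error already matches or exceeds the best distance found so far. The sketch below shows only this generic nearest-codeword search, with a full search for comparison. The function names, codebook, and input vector are hypothetical, and the paper's watermark embedding and mean approximation steps are not shown because the abstract does not describe them.

```python
def nearest_codeword_pds(vector, codebook):
    """Return (index, squared_distance) of the closest codeword using PDS."""
    best_index, best_dist = -1, float("inf")
    for index, codeword in enumerate(codebook):
        partial = 0.0
        for x, c in zip(vector, codeword):
            partial += (x - c) ** 2
            if partial >= best_dist:
                break  # early exit: this codeword cannot beat the current best
        else:
            # The loop finished without a break, so this codeword is closer.
            best_index, best_dist = index, partial
    return best_index, best_dist


def nearest_codeword_full(vector, codebook):
    """Reference full search: compute every distance completely."""
    dists = [sum((x - c) ** 2 for x, c in zip(vector, cw)) for cw in codebook]
    best = min(range(len(codebook)), key=dists.__getitem__)
    return best, dists[best]


# Hypothetical 4-dimensional codebook (e.g. flattened 2x2 pixel blocks).
codebook = [(0, 0, 0, 0), (10, 10, 10, 10), (200, 200, 200, 200)]
block = (9, 11, 10, 12)
print(nearest_codeword_pds(block, codebook))
print(nearest_codeword_full(block, codebook))
```

PDS returns the same codeword as the full search; the saving comes from skipping the remaining dimensions of poor candidates, which grows with vector dimension and codebook size.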
Procedia PDF Downloads 364
24818 Knowledge Based Behaviour Modelling and Execution in Service Robotics
Authors: Suraj Nair, Aravindkumar Vijayalingam, Alexander Perzylo, Alois Knoll
Abstract:
In the last decade, robotics research and development activities have grown rapidly, especially in the domain of service robotics. Integrating service robots into human-occupied spaces such as homes, offices, and hospitals has received increasing attention. The primary motive is to ease the daily lives of humans by taking over some household and office chores. However, several challenges remain in systematically integrating such systems into work-spaces shared with humans. In addition to sensing and indoor-navigation challenges, the programmability of such systems is a major hurdle, because the potential user cannot be expected to have knowledge of robotics or similar mechatronic systems. In this paper, we propose a cognitive system for service robotics which allows non-expert users to easily model system behaviour in an underspecified manner through abstract tasks and the objects associated with them. The system uses domain knowledge expressed in the form of an ontology, along with logical reasoning mechanisms, to infer all the missing pieces of information required for executing the tasks. Furthermore, the system is capable of recovering from failed tasks caused by on-line disturbances by using the knowledge base to infer alternate methods of executing the same tasks. The system is demonstrated through a coffee-fetching scenario in an office environment using a mobile robot equipped with sensors and software capabilities for autonomous navigation and human interaction through natural language.
Keywords: cognitive robotics, reasoning, service robotics, task based systems
Procedia PDF Downloads 244
24817 Anomaly Detection in a Data Center with a Reconstruction Method Using a Multi-Autoencoders Model
Authors: Victor Breux, Jérôme Boutet, Alain Goret, Viviane Cattin
Abstract:
Early detection of anomalies in data centers is important to reduce downtimes and the costs of periodic maintenance. However, there is little research on this topic and even less on the fusion of sensor data for the detection of abnormal events. The goal of this paper is to propose a method for anomaly detection in data centers by combining sensor data (temperature, humidity, power) and deep learning models. The model described in the paper uses one autoencoder per sensor to reconstruct the inputs. The autoencoders contain Long Short-Term Memory (LSTM) layers and are trained using the normal samples of the relevant sensors selected by correlation analysis. The difference signal between the input and its reconstruction is then used to classify the samples using feature extraction and a random forest classifier. The data measured by the sensors of a data center between January 2019 and May 2020 are used to train the model, while the data between June 2020 and May 2021 are used to assess it. The performance of the model is assessed a posteriori through the F1-score by comparing detected anomalies with the data center’s history. The proposed model outperforms the state-of-the-art reconstruction method, which uses only one autoencoder taking multivariate sequences and detects an anomaly with a threshold on the reconstruction error, with an F1-score of 83.60% compared to 24.16%.
Keywords: anomaly detection, autoencoder, data centers, deep learning
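The step from a difference signal to classifier input can be sketched as follows. The abstract does not list the extracted features, so the three statistics used here (mean absolute error, maximum absolute error, RMS), the function names, and the toy readings are assumptions; the reconstructions are supplied by hand in place of trained LSTM autoencoders, and the random forest itself is omitted.

```python
import math


def residual_features(window, reconstruction):
    """Summarize the difference signal between one sensor window and its reconstruction."""
    residual = [x - r for x, r in zip(window, reconstruction)]
    abs_res = [abs(v) for v in residual]
    return {
        "mean_abs": sum(abs_res) / len(abs_res),
        "max_abs": max(abs_res),
        "rms": math.sqrt(sum(v * v for v in residual) / len(residual)),
    }


def feature_vector(windows, reconstructions):
    """Concatenate per-sensor residual features, in sorted sensor-name order."""
    vector = []
    for sensor in sorted(windows):
        feats = residual_features(windows[sensor], reconstructions[sensor])
        vector.extend([feats["mean_abs"], feats["max_abs"], feats["rms"]])
    return vector


# Hypothetical readings; a real system would obtain each reconstruction from
# the autoencoder trained on that sensor's normal data.
windows = {"temperature": [21.0, 21.5, 25.0], "power": [3.0, 3.0, 3.0]}
reconstructions = {"temperature": [21.0, 21.5, 22.0], "power": [3.0, 3.0, 3.0]}
print(feature_vector(windows, reconstructions))
```

A vector of this kind, one per time window, is what a downstream classifier would consume instead of a single thresholded reconstruction error.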
Procedia PDF Downloads 194