Search results for: enterprise data warehouse
23302 Re-identification Risk and Mitigation in Federated Learning: Human Activity Recognition Use Case
Authors: Besma Khalfoun
Abstract:
In many current Human Activity Recognition (HAR) applications, users' data is frequently shared and centrally stored by third parties, posing a significant privacy risk. This practice makes these entities attractive targets for extracting sensitive information about users, including their identity, health status, and location, thereby directly violating users' privacy. To tackle the issue of centralized data storage, a relatively recent paradigm known as federated learning has emerged. In this approach, users' raw data remains on their smartphones, where they train the HAR model locally. However, users still share updates of their local models originating from raw data. These updates are vulnerable to several attacks designed to extract sensitive information, such as determining whether a data sample is used in the training process, recovering the training data with inversion attacks, or inferring a specific attribute or property from the training data. In this paper, we first introduce PUR-Attack, a parameter-based user re-identification attack developed for HAR applications within a federated learning setting. It involves associating anonymous model updates (i.e., local models' weights or parameters) with the originating user's identity using background knowledge. PUR-Attack relies on a simple yet effective machine learning classifier and produces promising results. Specifically, we have found that by considering the weights of a given layer in a HAR model, we can uniquely re-identify users with an attack success rate of almost 100%. This result holds when considering a small attack training set and various data splitting strategies in the HAR model training. Thus, it is crucial to investigate protection methods to mitigate this privacy threat. Along this path, we propose SAFER, a privacy-preserving mechanism based on adaptive local differential privacy. 
Before sharing the model updates with the FL server, SAFER adds the optimal noise based on the re-identification risk assessment. Our approach can achieve a promising tradeoff between privacy, in terms of reducing re-identification risk, and utility, in terms of maintaining acceptable accuracy for the HAR model.
Keywords: federated learning, privacy risk assessment, re-identification risk, privacy preserving mechanisms, local differential privacy, human activity recognition
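Illustrative sketch (not the authors' SAFER mechanism): the abstract describes adding noise to a local model update before it is sent to the FL server. One textbook way to do this under local differential privacy is the Laplace mechanism applied to an L1-clipped update. The function names, the clipping bound, and the fixed epsilon below are assumptions made for illustration; SAFER is described as choosing the noise adaptively from a re-identification risk assessment, which this sketch does not attempt.

```python
import random


def clip_l1(update, bound):
    """Scale the update so that its L1 norm is at most `bound`."""
    norm = sum(abs(w) for w in update)
    if norm <= bound or norm == 0.0:
        return list(update)
    return [w * bound / norm for w in update]


def privatize_update(update, epsilon, bound=1.0, rng=None):
    """Laplace mechanism on a clipped update (hypothetical helper).

    Any two clipped updates differ by at most 2 * bound in L1 norm,
    so per-coordinate Laplace noise with scale 2 * bound / epsilon
    gives epsilon-local differential privacy.
    """
    rng = rng or random.Random()
    clipped = clip_l1(update, bound)
    scale = 2.0 * bound / epsilon
    # The difference of two Exp(1) draws is a standard Laplace sample.
    return [
        w + scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
        for w in clipped
    ]
```

A smaller epsilon means more noise and stronger privacy, at the cost of model accuracy, which is the privacy and utility tradeoff the abstract refers to.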
Procedia PDF Downloads 11
23301 Blockchain for Transport: Performance Simulations of Blockchain Network for Emission Monitoring Scenario
Authors: Dermot O'Brien, Vasileios Christaras, Georgios Fontaras, Igor Nai Fovino, Ioannis Kounelis
Abstract:
With the rise of the Internet of Things (IoT), 5G, and blockchain (BC) technologies, vehicles are becoming increasingly connected and are already transmitting substantial amounts of data to the servers of original equipment manufacturers (OEMs). This data could be used to help detect mileage fraud and enable more accurate vehicle emissions monitoring. This would not only help regulators but could also enable applications such as permitting efficient drivers to pay less tax, geofencing for air quality improvement, and pollution tolling and trading platforms for transport-related businesses and EU citizens. Other applications could include traffic management and shared mobility systems. BC enables the transmission of data with additional security and removes single points of failure while maintaining data provenance, identity ownership, and the possibility to retain varying levels of privacy depending on the requirements of the applied use case. This research performs simulations of vehicles interacting with European member state authorities and European Commission BC nodes running Hyperledger Fabric and explores whether the technology is currently feasible for transport applications such as the emission monitoring use case.
Keywords: future transportation systems, technological innovations, policy approaches for transportation future, economic and regulatory trends, blockchain
Procedia PDF Downloads 176
23300 DURAFILE: A Collaborative Tool for Preserving Digital Media Files
Authors: Santiago Macho, Miquel Montaner, Raivo Ruusalepp, Ferran Candela, Xavier Tarres, Rando Rostok
Abstract:
During our lives, we generate a lot of personal information, such as photos, music, text documents, and videos, that links us with our past. Data that used to be tangible is now digital information stored on our computers, which makes it dependent on software to remain accessible in the future. Technology, however, constantly evolves and goes through regular shifts, quickly rendering various file formats obsolete. The need to access data in the future affects not only personal users but also organizations. In a digital environment, a reliable preservation plan and the ability to adapt to fast-changing technology are essential for maintaining data collections in the long term. In this paper, we present the European FP7 project DURAFILE, which provides the technology to preserve media files for personal users and organizations while maintaining their quality.
Keywords: artificial intelligence, digital preservation, social search, digital preservation plans
Procedia PDF Downloads 445
23299 Constructing a Semi-Supervised Model for Network Intrusion Detection
Authors: Tigabu Dagne Akal
Abstract:
While advances in computer and communications technology have made the network ubiquitous, they have also rendered networked systems vulnerable to malicious attacks devised from a distance. These attacks or intrusions start with attackers infiltrating a network through a vulnerable host and then launching further attacks on the local network or intranet. Nowadays, system administrators and network professionals can attempt to prevent such attacks by developing intrusion detection tools and systems using data mining technology. In this study, the experiments were conducted following the Knowledge Discovery in Databases (KDD) process model, which starts with the selection of datasets. The dataset used in this study was taken from the Massachusetts Institute of Technology Lincoln Laboratory. The data was then pre-processed. The major pre-processing activities included filling in missing values, removing outliers, resolving inconsistencies, integrating data that contains both labelled and unlabelled records, dimensionality reduction, size reduction, and data transformation such as discretization. A total of 21,533 intrusion records were used for training the models. To validate the performance of the selected model, a separate set of 3,397 records was used for testing. To build a predictive model for intrusion detection, the J48 decision tree and Naïve Bayes algorithms were tested as classifiers, both with and without feature selection. The model created using 10-fold cross-validation with the J48 decision tree algorithm and default parameter values showed the best classification accuracy. The model has a prediction accuracy of 96.11% on the training dataset and 93.2% on the test dataset when classifying new instances into the normal, DoS, U2R, R2L, and probe classes.
The findings of this study show that data mining methods generate interesting rules that are crucial for intrusion detection and prevention in the networking industry. Future research directions are proposed toward building an applicable system in the area of study.
Keywords: intrusion detection, data mining, computer science
Procedia PDF Downloads 296
23298 Academic Leadership Succession Planning Practice in Nigeria Higher Education Institutions: A Case Study of Colleges of Education
Authors: Adie, Julius Undiukeye
Abstract:
This research investigated the practice of academic leadership succession planning in Nigerian higher education institutions, drawing on the lived experiences of the academic staff of the case study institutions. It is multi-case study research that adopts a qualitative research method. Ten participants (mainly academic staff) were used as the study sample. The study was guided by four research questions. Semi-structured interviews and archival information from official documents formed the sources of data. The data collected was analyzed using the Constant Comparative Technique (CCT) to generate empirical insights and facts on the subject of this paper. The following findings emerged from the data analysis: firstly, there was no formalized leadership succession plan in place in the institutions sampled for this study; secondly, despite the absence of a formal succession plan, the data indicates that academics believe that succession planning is very significant for institutional survival; thirdly, existing practices of succession planning in the sampled institutions take the form of job seniority ranking, political process and executive fiat, ad hoc arrangement, and external hiring; and finally, the data revealed that there are some barriers to the practice of succession planning, such as traditional higher education institutions’ characteristics (e.g. external talent search, shared governance, diversity, and equality in leadership appointment) and the lack of interest in leadership positions. Based on the research findings, some far-reaching recommendations were made, including the urgent need for the ‘formalization’ of leadership succession planning by the higher education institutions concerned, through the design of an official policy framework.
Keywords: academic leadership, succession, planning, higher education
Procedia PDF Downloads 143
23297 Native Language Identification with Cross-Corpus Evaluation Using Social Media Data: ‘Reddit’
Authors: Yasmeen Bassas, Sandra Kuebler, Allen Riddell
Abstract:
Native language identification is one of the growing subfields in natural language processing (NLP). The task of native language identification (NLI) is mainly concerned with predicting the native language of an author from their writing in a second language. In this paper, we investigate the performance of two types of features, content-based features vs. content-independent features, when they are evaluated on a different corpus (using social media data from Reddit). In this NLI task, the predefined models are trained on one corpus (TOEFL), and then the trained models are evaluated on different data using an external corpus (Reddit). Three classifiers are used in this task: the baseline, linear SVM, and logistic regression. Results show that content-based features are more accurate and robust than content-independent ones when tested both within the corpus and across corpora.
Keywords: NLI, NLP, content-based features, content independent features, social media corpus, ML
Procedia PDF Downloads 137
23296 Integration of Internet-Accessible Resources in the Field of Mobile Robots
Authors: B. Madhevan, R. Sakkaravarthi, R. Diya
Abstract:
The number and variety of mobile robot applications are increasing day by day, both in industry and in our daily lives. First developed as a tool, mobile robots can nowadays be integrated as an entity in Internet-accessible resources (IaR). The present work is organized around four potential resources: cloud computing, the Internet of Things, big data analysis, and co-simulation. Further, the work focuses on analyzing and discussing the need for integrating Internet-accessible resources, the challenges deriving from such integration, and how these issues have been tackled. Hence, the research work investigates the concepts of Internet-accessible resources from the perspective of autonomous mobile robots, with an overview of the performance of currently available database systems. IaR, a worldwide network of interconnected objects, can be considered an evolutionary process in mobile robots. IaR constitutes an integral part of the future Internet with data analysis, consisting of both physical and virtual things.
Keywords: internet-accessible resources, cloud computing, big data analysis, internet of things, mobile robot
Procedia PDF Downloads 389
23295 The Application of Lean-Kaizen in Course Plan and Delivery in Malaysian Higher Education Sector
Authors: Nur Aishah Binti Awi, Zulfiqar Khan
Abstract:
Lean-kaizen has been applied in the manufacturing sector for many years. What about the education sector? This paper discusses how lean-kaizen can also be applied in the education sector, specifically in the academic area of Malaysia’s higher education sector. The purpose of this paper is to describe the application of lean-kaizen in course planning and delivery. Lean-kaizen techniques were used to identify waste in course planning and delivery. A field study was conducted to obtain the data. The study used both quantitative and qualitative data. The researcher interviewed selected lecturers about the problems they encountered in course planning and delivery. Secondary data from students’ end-of-semester feedback was also used to improve course planning and delivery. The results show empirically that lean-kaizen helps to improve course planning and delivery by reducing waste. Thus, this study demonstrates that lean-kaizen can help the education sector improve its services, as it has in the manufacturing sector.
Keywords: course delivery, education, Kaizen, lean
Procedia PDF Downloads 368
23294 An ANN Approach for Detection and Localization of Fatigue Damage in Aircraft Structures
Authors: Reza Rezaeipour Honarmandzad
Abstract:
In this paper, we propose an ANN for the detection and localization of fatigue damage in aircraft structures. We used a network of piezoelectric transducers for Lamb-wave measurements in order to calculate damage indices. Data gathered by the sensors was fed to a neural network classifier. A set of neural network electors with different architectures cooperate to achieve consensus concerning the state of each monitored path. Sensed signal variations in the ROI, detected by the networks at each path, were used to assess the state of the structure as well as to localize detected damage and to filter out ambient changes. The classifier has been extensively tested on large data sets acquired in tests of specimens with artificially introduced notches as well as on the results of numerous fatigue experiments. The effect of the classifier structure and of the test data used for training on the results was evaluated.
Keywords: ANN, fatigue damage, aircraft structures, piezoelectric transducers, lamb-wave measurements
Procedia PDF Downloads 417
23293 Public Libraries as Social Spaces for Vulnerable Populations
Authors: Natalie Malone
Abstract:
This study explores the role of a public library in the creation of social spaces for vulnerable populations. The data stems from a longitudinal ethnographic study of the Anderson Library community, which included field notes, artifacts, and interview data. Thematic analysis revealed multiple meanings and thematic relationships within and among the data sources (interviews, field notes, and artifacts). Initial analysis suggests the Anderson Library serves as a space for vulnerable populations, with the sub-themes of fostering interpersonal communication to create a social space for children and fostering interpersonal communication to create a social space for parents and adults. These findings are important as they illustrate the potential of public libraries to serve as community-empowering institutions.
Keywords: capital, immigrant families, public libraries, space, vulnerable
Procedia PDF Downloads 151
23292 A Decadal Flood Assessment Using Time-Series Satellite Data in Cambodia
Authors: Nguyen-Thanh Son
Abstract:
Floods are among the most frequent and costliest natural hazards. Flood disasters especially affect poor people in rural areas, who are heavily dependent on agriculture and have lower incomes. Cambodia is identified as one of the most climate-vulnerable countries in the world, ranked 13th out of 181 countries most affected by the impacts of climate change. Flood monitoring is thus a strategic priority at national and regional levels because policymakers need reliable spatial and temporal information on flood-prone areas to form successful monitoring programs that reduce possible impacts on the country’s economy and people’s livelihoods. This study aims to develop methods for flood mapping and assessment from MODIS data in Cambodia. We processed the data for the period from 2000 to 2017, following three main steps: (1) data pre-processing to construct smooth time-series vegetation and water surface indices, (2) delineation of flood-prone areas, and (3) accuracy assessment. The results of flood mapping were verified against ground reference data, indicating an overall accuracy of 88.7% and a Kappa coefficient of 0.77. These results were reaffirmed by close agreement between the flood-mapping area and ground reference data, with a coefficient of determination (R²) of 0.94. The seasonally flooded areas observed for 2010, 2015, and 2016 were remarkably smaller than in other years, which is mainly attributed to the El Niño weather phenomenon exacerbated by the impacts of climate change. Finally, although several sources potentially lowered the mapping accuracy of flood-prone areas, including image cloud contamination, mixed-pixel issues, and low-resolution bias between the mapping results and ground reference data, our methods gave satisfactory results for delineating the spatiotemporal evolution of floods.
The results, in the form of quantitative information on spatiotemporal flood distributions, could be beneficial to policymakers in evaluating their management strategies for mitigating the negative effects of floods on agriculture and people’s livelihoods in the country.
Keywords: MODIS, flood, mapping, Cambodia
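Illustrative sketch: the accuracy assessment above reports an overall accuracy and a Kappa coefficient. Assuming these are the standard confusion-matrix statistics (the abstract does not give the formulas), they could be computed as follows. The function name and the example matrix in the usage note are hypothetical and are not the study's data.

```python
def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] is the number of reference samples of class i
    that were mapped to class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Chance agreement: product of row and column marginals per class.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa
```

For a made-up two-class matrix such as [[45, 5], [5, 45]] (flooded vs. non-flooded), the observed agreement is 0.9 and the chance agreement is 0.5, so kappa is (0.9 - 0.5) / (1 - 0.5) = 0.8.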
Procedia PDF Downloads 126
23291 Data Mining of Students' Performance Using Artificial Neural Network: Turkish Students as a Case Study
Authors: Samuel Nii Tackie, Oyebade K. Oyedotun, Ebenezer O. Olaniyi, Adnan Khashman
Abstract:
Artificial neural networks have been used in different fields of artificial intelligence, and more specifically in machine learning. Although other machine learning options are feasible in most situations, the ease with which neural networks lend themselves to different problems, including pattern recognition, image compression, classification, computer vision, and regression, has earned them a remarkable place in the machine learning field. This research exploits neural networks as a data mining tool for predicting the number of times a student repeats a course, considering some attributes relating to the course itself, the teacher, and the particular student. Neural networks were used in this work to map the relationship between some attributes related to students’ course assessment and the number of times a student is likely to repeat a course before passing. It is hoped that the ability to predict students’ performance from such complex relationships can help facilitate the fine-tuning of academic systems and policies implemented in learning environments. To validate the power of neural networks in data mining, a Turkish students’ performance database was used; feedforward and radial basis function networks were trained for this task, and the performance of these networks was evaluated in terms of achieved recognition rates and training time.
Keywords: artificial neural network, data mining, classification, students’ evaluation
Procedia PDF Downloads 613
23290 Introducing Information and Communication Technologies in Prison: A Proposal in Favor of Social Reintegration
Authors: Carmen Rocio Fernandez Diaz
Abstract:
This paper focuses on the relevance of information and communication technologies (hereinafter referred to as ‘ICTs’) as an essential part of day-to-day life in all societies nowadays, as they offer the setting where an immense number of behaviors are performed that previously took place in the physical world. In this context, areas of reality that have remained outside the so-called ‘information society’ are hardly imaginable. Nevertheless, it is possible to identify one area that continues to lag behind this reality: the penitentiary system as regards inmates’ rights, since security aspects in prison have already been improved by new technologies. Introducing ICTs in prisons still meets strong resistance. The study of comparative penitentiary systems worldwide shows that most of them use ICTs only for educational aspects of life in prison and that communications with the outside world are generally based on traditional means. These are only two examples of the huge range of activities where ICTs can bring positive results within the prison. Those positive results have to do with the social reintegration of persons serving a prison sentence. Deprivation of liberty entails contact with the prison subculture and its harmful effects, causing, in cases of long-term sentences, the phenomenon known as ‘prisonization’. This negative effect of imprisonment could be reduced if ICTs were used inside prisons in the different areas where they can have an impact, which are addressed in this research: (1) access to information and culture, (2) basic and advanced training, (3) employment, (4) communication with the outside world, (5) treatment, and (6) leisure and entertainment. The content of all of these areas could be improved if ICTs were introduced in prison, as shown by the experience of some prisons in Belgium, the United Kingdom, and the United States.
However, resistance to introducing ICTs in prisons stems from the fact that doing so could also carry risks concerning security and the commission of new offences. Considering these risks, the aim of this paper is to offer a realistic proposal for introducing ICTs in prison while trying to avoid those risks. This would be done to take advantage of the possibilities that ICTs offer to all inmates, but mainly to those who are close to release, to start building a life outside that is far from delinquency. The author of this paper considers reforming prisons in this sense an opportunity to offer inmates a progressive resettlement into life in freedom, with a higher likelihood of obeying the law and avoiding recidivism. The value that new technologies would add to education, employment, communications, or treatment for a person deprived of liberty constitutes a way of humanizing prisons in the 21st century.
Keywords: deprivation of freedom, information and communication technologies, imprisonment, social reintegration
Procedia PDF Downloads 165
23289 Evaluation of Routing Protocols in Mobile Adhoc Networks
Authors: Anu Malhotra
Abstract:
An ad hoc network is an autonomous, self-configuring network made up of mobile nodes connected via wireless links. Ad hoc networks often consist of nodes, mobile hosts (MH) or mobile stations (MS, also serving as routers) connected by wireless links. Different routing protocols are used for data transmission between the nodes in an ad hoc network. In this paper, two protocols (OLSR and AODV) are analyzed on the basis of two parameters, time delay and throughput, at different data rates. On the basis of this analysis, we observed that, at the same data rate, AODV has a higher time delay than OLSR, whereas OLSR has lower throughput than AODV.
Keywords: routing adhoc, mobile hosts, mobile stations, OLSR protocol, AODV protocol
Procedia PDF Downloads 506
23288 Experimental Investigation of Natural Frequency and Forced Vibration of Euler-Bernoulli Beam under Displacement of Concentrated Mass and Load
Authors: Aref Aasi, Sadegh Mehdi Aghaei, Balaji Panchapakesan
Abstract:
This work aims to evaluate the free and forced vibration of a beam with two end joints subjected to a concentrated moving mass and a load using the Euler-Bernoulli method. The natural frequency is calculated for different locations of the concentrated mass and load on the beam. The analytical results are verified against experimental data. The variation of natural frequency as a function of the location of the mass, the effect of the forcing frequency on the vibration amplitude, and the displacement amplitude versus time are investigated. It is found that as the concentrated mass moves toward the center of the beam, the natural frequency of the beam and the relative error between experimental and analytical data decrease. There is close agreement between analytical data and experimental observations.
Keywords: Euler-Bernoulli beam, natural frequency, forced vibration, experimental setup
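Illustrative sketch (not the authors' analytical model): the trend reported above, a lower natural frequency as the mass approaches mid-span, can be reproduced with a single-mode Rayleigh approximation. This assumes a uniform, simply supported (pinned-pinned) Euler-Bernoulli beam, which is one reading of "two end joints", and the first mode shape sin(pi x / L). The function name, parameter names, and numerical values are assumptions made for illustration.

```python
import math


def first_natural_frequency_hz(E, I, rho, A, L, mass=0.0, position=0.0):
    """First natural frequency (Hz) of a pinned-pinned beam with a point mass.

    Single-mode Rayleigh estimate using the mode shape sin(pi * x / L):
      modal stiffness = E * I * pi**4 / (2 * L**3)
      modal mass      = rho * A * L / 2 + mass * sin(pi * position / L)**2
    With mass = 0 this reduces to the exact bare-beam result.
    """
    modal_stiffness = E * I * math.pi ** 4 / (2.0 * L ** 3)
    modal_mass = (
        rho * A * L / 2.0
        + mass * math.sin(math.pi * position / L) ** 2
    )
    return math.sqrt(modal_stiffness / modal_mass) / (2.0 * math.pi)


# Hypothetical steel beam: 1 m long, 1e-4 m^2 cross-section.
beam = dict(E=2.1e11, I=1e-8, rho=7850.0, A=1e-4, L=1.0)
f_bare = first_natural_frequency_hz(**beam)
f_quarter = first_natural_frequency_hz(**beam, mass=0.5, position=0.25)
f_mid = first_natural_frequency_hz(**beam, mass=0.5, position=0.5)
```

Because sin(pi * position / L) peaks at mid-span, the added modal mass is largest there, so f_mid is lower than f_quarter, which is lower than f_bare.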
Procedia PDF Downloads 274
23287 The Phonemic Inventory of Tenyidie Affricates: An Acoustic Study
Authors: NeisaKuonuo Tungoe
Abstract:
Tenyidie, also known as Angami, is spoken by the Angami tribe of Nagaland, North-East India, bordering Myanmar (Burma). It belongs to the Tibeto-Burman language group, falling under the Kuki-Chin-Naga sub-family. Studies of Tenyidie have made only sporadic attempts at explaining its phonemic inventory. Different scholars have variously emphasized the grammar or the history of Tenyidie. Many of their claims have been stimulating, but they were often based on a small amount of merely suggestive data or on auditory perception only. The principal objective of this paper is to analyse the affricate segments of Tenyidie acoustically. There are seven categories in the inventory of Tenyidie: Plosives, Nasals, Affricates, Laterals, Rhotics, Fricatives, Semi-vowels and Vowels. In all, there are sixty phonemes in the inventory. The existing accounts of Tenyidie, and of its affricates in particular, rest on auditory perception only; this study therefore aims to lay out the affricate segments based only on acoustic evidence. There are seven affricates found in Tenyidie: 1) Voiceless Labiodental Affricate / pf /, 2) Voiceless Aspirated Labiodental Affricate / pfh /, 3) Voiceless Alveolar Affricate / ts /, 4) Voiceless Aspirated Alveolar Affricate / tsh /, 5) Voiced Alveolar Affricate / dz /, 6) Voiceless Post-Alveolar Affricate / tʃ / and 7) Voiced Post-Alveolar Affricate / dʒ /. Since the study is based on the acoustic features of affricates, five informants were asked to record themselves producing Tenyidie phonemes and English phonemes. Throughout the study of the recorded data, PRAAT, a scientific software program that has made itself indispensable for the analysis of speech in phonetics, has been used as the main software. This data was then used for a comparative study of Tenyidie and English affricates.
Comparisons have also been drawn between this study and the work of another author who has stated that there are only six affricates in Tenyidie. The study is quite detailed regarding the specifics of the data. Detailed accounts of the duration and acoustic cues have been noted. The data will be presented in the form of spectrograms. Since no other acoustic studies have been carried out on Tenyidie, this study will be the first in what should become a long line of acoustic research on the language.
Keywords: tenyidie, affricates, praat, phonemic inventory
Procedia PDF Downloads 416
23286 Exploring Students' Understanding about Bullying in Private Colleges in Rawalpindi, Pakistan
Authors: Alveena Khan
Abstract:
The objective of this research is to explore students’ understanding of bullying and of different bullying types. Nowadays, bullying is considered an important social issue around the world because it has long-lasting effects on students’ lives. Bullying sometimes leads students to commit suicide, lose confidence, or become isolated. This research used a qualitative research approach. Triangulation was used to verify the generated data and ensure its reliability. Semi-structured interviews, non-participant observation, and case studies were conducted. The research focused on five major private colleges in Rawalpindi, Pakistan, with 20 students (both female and male) participating. The data generated included approximately 45 hours of interviews in total. Thematic analysis was used for data analysis, following grounded theory to generate themes. The findings of the research highlight that bullying does prevail in the private colleges studied, mostly in the form of verbal and physical bullying. No specific gender difference was found in experiencing verbal and physical bullying. Furthermore, from the students’ point of view, college administrators are responsible for dealing with bullying. The researcher suggests that there must be a proper check and balance system and that anti-bullying programs should be held in colleges to create a protective and healthy environment in which students do not face bullying.
Keywords: bullying, college student, physical and verbal bullying, qualitative research
Procedia PDF Downloads 159
23285 Consumer Values in the Perspective of Javanese Mataraman Society: Identification, Meaning, and Application
Authors: Anna Triwijayati, Etsa Astridya Setiyati, Titik Desi Harsoyo
Abstract:
Culture is an important determinant of human behavior and desire. Culture influences consumers through the norms and values established by the society in which they live and which they reflect. The cultural values of Javanese society are certainly embedded in Javanese consumption behavior. This research is expected to make a substantial theoretical contribution by identifying cultural values in consumption in Javanese society. These findings can be an incentive to identify local cultural values among the many ethnic groups in Indonesia, so that, in time, local cultural values regarding consumption can become a fundamental part of consumer education and consumption practice in Indonesia. The approach used in this research is non-positivist, also known as the qualitative approach. The research method is ethnomethodology. Data collection was carried out in the Central Java region. The research subjects or informants were selected by purposive sampling according to criteria determined by the researcher. The data was collected through in-depth interviews and observation. Before the data analysis, the researcher completed the data storage stage and implemented data validity procedures. The data was then analyzed using thematic and interactive analysis techniques. Javanese Mataraman society holds consumption values such as having enough, being careful, being economical, submitting to the creator of life, following the flow of life, and dealing with present problems in the present. In financial management for consumption, the consumer should live by the principles of a simple life: having enough, being able to eat, exercising self-restraint, being well-managed, diligent, accurate, and careful, managing openly and transparently, making an effort to struggle, being willing to sacrifice, and thinking about the future. The meaning of consumption values in the family is centered on submission to and full trust in God.
These consumption values are applied in consumer behavior concerning the self, the family, investment, and credit needs, from both short-term and long-term perspectives.
Keywords: values, consumer, consumption, Javanese Mataraman, ethnomethodology
Procedia PDF Downloads 392
23284 Parallel Fuzzy Rough Support Vector Machine for Data Classification in Cloud Environment
Authors: Arindam Chaudhuri
Abstract:
Data classification has long been used as an effective and efficient means of conveying knowledge and information to users. The primary focus has always been on techniques for extracting useful knowledge from data such that returns are maximized. With the emergence of huge datasets, existing classification techniques often fail to produce desirable results. The challenge lies in analyzing and understanding the characteristics of massive data sets by retrieving useful geometric and statistical patterns. We propose a supervised parallel fuzzy rough support vector machine (PFRSVM) for data classification in a cloud environment. The classification is performed by PFRSVM using a hyperbolic tangent kernel. The fuzzy rough set model accounts for sensitivity to noisy samples and handles imprecision in training samples, bringing robustness to the results. The membership function is a function of the center and radius of each class in feature space and is represented with a kernel. It plays an important role in sampling the decision surface. The success of PFRSVM is governed by choosing appropriate parameter values. The training samples are either linearly or nonlinearly separable. The different input points make unique contributions to the decision surface. The algorithm is parallelized with a view to reducing training times. The system is built on a support vector machine library using the Hadoop implementation of MapReduce. The algorithm is tested on large data sets to check its feasibility and convergence. The performance of the classifier is also assessed in terms of the number of support vectors. The challenges encountered in implementing big data classification in machine learning frameworks are also discussed. The experiments are done on the cloud environment available at University of Technology and Management, India. The results are illustrated for Gaussian RBF and Bayesian kernels. The effect of variability in prediction and generalization of PFRSVM is examined with respect to values of the parameter C. It effectively resolves the effects of outliers, class imbalance and overlapping class problems, generalizes to unseen data and relaxes the dependency between features and labels. The average classification accuracy for PFRSVM is better than that of other classifiers for both Gaussian RBF and Bayesian kernels. The experimental results on both synthetic and real data sets clearly demonstrate the superiority of the proposed technique.
Keywords: FRSVM, Hadoop, MapReduce, PFRSVM
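As a rough illustration of two ingredients named in the abstract above, the sketch below computes a hyperbolic tangent kernel and a fuzzy membership derived from a class center and radius. The abstract gives no formulas, so the membership form (one minus the distance to the class center divided by the class radius plus a small `delta`), the use of plain input-space distance rather than a kernelized one, and all function names and default values are assumptions made for illustration. It is not the authors' PFRSVM.

```python
import math


def tanh_kernel(x, y, scale=1.0, offset=0.0):
    """Hyperbolic tangent kernel: tanh(scale * <x, y> + offset)."""
    dot = sum(a * b for a, b in zip(x, y))
    return math.tanh(scale * dot + offset)


def class_memberships(points, delta=1e-6):
    """Assumed membership for one class: 1 - dist(point, center) / (radius + delta).

    The center is the class mean and the radius is the largest distance
    from any class point to that center.
    """
    dim = len(points[0])
    center = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    dists = [math.dist(p, center) for p in points]
    radius = max(dists)
    return [1.0 - d / (radius + delta) for d in dists]
```

Under this assumed form, points near the class center receive memberships close to 1 and points near the class boundary receive memberships close to 0, which is one way to reduce the influence of outlying samples.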
Procedia PDF Downloads 490
23283 Design and Development of a Computerized Medical Record System for Hospitals in Remote Areas
Authors: Grace Omowunmi Soyebi
Abstract:
A computerized medical record system is a collection of medical information about a person that is stored on a computer. One principal problem of most hospitals in rural areas is their use of a file management system for keeping records. A lot of time is wasted when a patient visits the hospital, possibly in an emergency, and the nurse or attendant has to search through voluminous files before the patient's file can be retrieved; this delay may lead to an unexpected outcome for the patient. This Data Mining application is to be designed using a Structured System Analysis and Design method, which will help in a well-articulated analysis of the existing file management system, a feasibility study, and proper documentation of the design and implementation of a computerized medical record system. This computerized system will replace the file management system and help to quickly retrieve a patient's record with increased data security, access clinical records for decision-making, and reduce the time it takes for a patient to be attended to.
Keywords: programming, computing, data, innovation
Procedia PDF Downloads 119
23282 Modified CUSUM Algorithm for Gradual Change Detection in a Time Series Data
Authors: Victoria Siriaki Jorry, I. S. Mbalawata, Hayong Shin
Abstract:
The main objective in a change detection problem is to develop algorithms for efficient detection of gradual and/or abrupt changes in the parameter distribution of a process or time series. In this paper, we present a modified cumulative sum (MCUSUM) algorithm to detect the start and end of a time-varying linear drift in the mean value of time series data, based on a likelihood ratio test procedure. The design, implementation and performance of the proposed algorithm for linear drift detection are evaluated and compared to the existing CUSUM algorithm using different performance measures. An approach to accurately approximate the threshold of the MCUSUM is also provided. The performance of the MCUSUM for gradual change-point detection is compared to that of the standard cumulative sum (CUSUM) control chart designed for abrupt shift detection using Monte Carlo simulations. In terms of the expected time for detection, the MCUSUM procedure is found to perform better than a standard CUSUM chart for detection of a gradual change in mean. The algorithm is then applied to randomly generated time series data with a gradual linear trend in mean to demonstrate its usefulness.
Keywords: average run length, CUSUM control chart, gradual change detection, likelihood ratio test
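For orientation, the standard one-sided CUSUM chart that the abstract uses as its baseline can be sketched as follows. This is the textbook recursion for detecting an upward shift in mean, not the authors' MCUSUM; the function name and the parameters `target_mean`, `slack` and `threshold` are hypothetical names chosen for this sketch.

```python
def cusum_upper(samples, target_mean, slack, threshold):
    """Return the index of the first alarm of a one-sided upper CUSUM, or None.

    The statistic accumulates deviations above target_mean + slack and is
    reset to zero whenever it would become negative.
    """
    stat = 0.0
    for index, value in enumerate(samples):
        stat = max(0.0, stat + (value - target_mean - slack))
        if stat > threshold:
            return index
    return None


# Ten in-control samples followed by an abrupt upward shift of one unit.
series = [0.0] * 10 + [1.0] * 10
alarm_index = cusum_upper(series, target_mean=0.0, slack=0.5, threshold=2.0)
```

With these values the statistic grows by 0.5 per sample after the shift at index 10 and first exceeds the threshold at index 14. A gradual drift produces smaller early deviations than an abrupt shift, which is the situation the abstract's modified procedure is aimed at.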
Procedia PDF Downloads 299
23281 Contextual Toxicity Detection with Data Augmentation
Authors: Julia Ive, Lucia Specia
Abstract:
Understanding and detecting toxicity is an important problem to support safer human interactions online. Our work focuses on the important problem of contextual toxicity detection, where automated classifiers are tasked with determining whether a short textual segment (usually a sentence) is toxic within its conversational context. We use “toxicity” as an umbrella term to denote a number of variants commonly named in the literature, including hate, abuse, and offence, among others. Detecting toxicity in context is a non-trivial problem and has been addressed by very few previous studies. These previous studies have analysed the influence of conversational context on human perception of toxicity in controlled experiments and concluded that humans rarely change their judgements in the presence of context. They have also evaluated contextual detection models based on state-of-the-art Deep Learning and Natural Language Processing (NLP) techniques. Counterintuitively, they reached the general conclusion that computational models tend to suffer performance degradation in the presence of context. We challenge these empirical observations by devising better contextual predictive models that also rely on NLP data augmentation techniques to create larger and better data. In our study, we start by further analysing the human perception of toxicity in conversational data (i.e., tweets), in the absence versus presence of context, in this case, previous tweets in the same conversational thread. We observed that the conclusions of previous work on human perception are mainly due to data issues: the contextual data available does not provide sufficient evidence that context is indeed important (even for humans). The data problem is common in current toxicity datasets: cases labelled as toxic are either obviously toxic (i.e., overt toxicity with swear words, racist words, etc.), and thus context is not needed for a decision, or are ambiguous, vague or unclear even in the presence of context; in addition, the data contains labelling inconsistencies. To address this problem, we propose to automatically generate contextual samples where toxicity is not obvious (i.e., covert cases) without context or where different contexts can lead to different toxicity judgements for the same tweet. We generate toxic and non-toxic utterances conditioned on the context or on target tweets using a range of techniques for controlled text generation (e.g., Generative Adversarial Networks and steering techniques). Regarding the contextual detection models, we posit that their poor performance is due to limitations of both the data they are trained on (the same problems stated above) and the architectures they use, which are not able to leverage context in effective ways. To improve on that, we propose text classification architectures that take the hierarchy of conversational utterances into account. In experiments benchmarking our models against previous models on existing and automatically generated data, we show that both data and architectural choices are very important. Our model achieves substantial performance improvements compared to baselines that are non-contextual or contextual but agnostic of the conversation structure.
Keywords: contextual toxicity detection, data augmentation, hierarchical text classification models, natural language processing
Procedia PDF Downloads 170
23280 Osteoarthritis (OA): A Total Knee Replacement Surgery
Authors: Loveneet Kaur
Abstract:
Introduction: Osteoarthritis (OA) is one of the leading causes of disability, and the knee is the most commonly affected joint in the body. The last resort for treatment of knee OA is Total Knee Replacement (TKR) surgery. Despite numerous advances in prosthetic design, patients do not reach normal function after surgery. Current surgical decisions are made on the basis of 2D radiographs and patient interviews. Aims: The aim of this study was to compare knee kinematics pre- and post-TKR surgery using computer-animated images of patient-specific models under everyday conditions. Methods: 7 subjects were recruited for the study. Subjects underwent 3D gait analysis during 4 everyday activities and medical imaging of the knee joint pre- and one-month post-surgery. A 3D model was created from each of the scans, and the kinematic gait analysis data was used to animate the images. Results: Improvements were seen in range of motion in all 4 activities 1-year post-surgery. The preoperative 3D images provide detailed information on the anatomy of the osteoarthritic knee. The postoperative images demonstrate potential future problems associated with the implant. Although not accurate enough to be of clinical use, the animated data can provide valuable insight into what conditions cause damage to both the osteoarthritic and prosthetic knee joints. As the animated data does not require specialist training to view, the images can be utilized by health professionals and manufacturers in the assessment and treatment of patients pre- and post-knee replacement surgery. Future improvements in the collection and processing of data may yield clinically useful data. Conclusion: Although not yet of clinical use, the potential application of 3D animations of the knee joint pre- and post-surgery is widespread.
Keywords: osteoporosis, osteoarthritis, knee replacement, TKR
Procedia PDF Downloads 47
23279 Time of Week Intensity Estimation from Interval Censored Data with Application to Police Patrol Planning
Authors: Jiahao Tian, Michael D. Porter
Abstract:
Law enforcement agencies are tasked with crime prevention and crime reduction under limited resources. Having an accurate temporal estimate of the crime rate would be valuable in achieving such a goal. However, estimation is usually complicated by the interval-censored nature of crime data. We cast the problem of intensity estimation as a Poisson regression using an EM algorithm to estimate the parameters. Two special penalties are added that provide smoothness over the time of day and day of the week. The approach presented here provides accurate intensity estimates and can also uncover day-of-week clusters that share the same intensity patterns. Anticipating where and when crimes might occur is a key element of successful policing strategies. However, this task is complicated by the presence of interval-censored data. Censored data are data for which the event time is only known to lie within an interval instead of being observed exactly. This type of data is prevalent in the field of criminology because of the absence of victims for certain types of crime. Despite its importance, research on the temporal analysis of crime has lagged behind the spatial component. Inspired by the success of solving crime-related problems with a statistical approach, we propose a statistical model for the temporal intensity estimation of crime with censored data. The model is built on Poisson regression and has special penalty terms added to the likelihood. An EM algorithm was derived to obtain maximum likelihood estimates, and the resulting model shows superior performance to the competing model. Our research is in line with the smart policing initiative (SPI) proposed by the Bureau of Justice Assistance (BJA) as an effort to support law enforcement agencies in building evidence-based, data-driven law enforcement tactics. The goal is to identify strategic approaches that are effective in crime prevention and reduction. In our case, we allow agencies to deploy their resources for a relatively short period of time to achieve the maximum level of crime reduction. By analyzing a particular area within cities where data are available, our proposed approach could provide not only an accurate estimate of intensities for the time unit considered but also a time-varying crime incidence pattern. Both will be helpful in the allocation of limited resources, either by improving the existing patrol plan using the discovered day-of-week clusters or by guiding the use of any extra resources available.
Keywords: cluster detection, EM algorithm, interval censoring, intensity estimation
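The core EM idea for interval-censored event times can be sketched in a much-simplified form. The version below assumes time is discretized into bins (for example, hours of the week), that each event is only known to fall within an inclusive range of bins, and that censoring is unrelated to the true event time. It omits the smoothness penalties and the regression structure mentioned in the abstract, and the function and parameter names are hypothetical.

```python
def em_interval_censored(intervals, n_bins, n_iter=200):
    """Estimate expected event counts per time bin from interval-censored events.

    intervals: list of (start_bin, end_bin) pairs, inclusive, one per event.
    Returns a list of n_bins expected counts that sum to the number of events.
    """
    # Start from a uniform intensity across all bins.
    intensity = [len(intervals) / n_bins] * n_bins
    for _ in range(n_iter):
        expected = [0.0] * n_bins
        for start, end in intervals:
            # E-step: spread each event over its candidate bins in
            # proportion to the current intensity estimate.
            total = sum(intensity[start:end + 1])
            for k in range(start, end + 1):
                expected[k] += intensity[k] / total
        # M-step: without penalties, the Poisson maximum likelihood
        # estimate for each bin is its expected count.
        intensity = expected
    return intensity


# Two events known to be in bin 0, one in bin 1, one somewhere in bins 0-1.
estimate = em_interval_censored([(0, 0), (0, 0), (0, 1), (1, 1)], n_bins=3)
```

In this toy example the ambiguous event is shared between bins 0 and 1 in proportion to their estimated intensities, and bin 2, which no event can fall in, receives an estimate of zero.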
Procedia PDF Downloads 66
23278 Diversifying from Petroleum Products to Arable Farming as Source of Revenue Generation in Nigeria: A Case Study of Ondo West Local Government
Authors: A. S. Akinbani
Abstract:
Overdependence on petroleum is causing a setback in the Nigerian economy. A field survey was carried out to assess the profitability and production of selected arable crops in six selected towns and villages of Ondo in southwestern Nigeria. Data were collected from 240 arable crop farmers, drawing on both primary and secondary sources, using oral interviews and structured questionnaires. The data collected were analyzed using both descriptive and inferential statistics. Forty farmers were randomly selected from each location, giving a total of 240 respondents. 84 farmers interviewed had no formal education, 72 had primary education, 50 farmers attained secondary education while 38 attained beyond secondary education. The majority of the farmers hold less than 10 acres of land. The data collected from the field showed that 192 farmers practiced mixed cropping, which includes mixtures of yam, cowpea, cocoyam, vegetable, cassava and maize, while only 48 farmers practiced monocropping. Among the sampled farmers, 93% agreed that arable production is profitable while 7% disagreed. The findings show that managerial practices that conserve soil fertility and reduce labor cost, such as planting leguminous crops and applying herbicide instead of weeding with a hand-held hoe, should be encouraged. All the respondents agreed that yam, cowpea, cocoyam, sweet potato, rice, maize and vegetable production will solve the problem of hunger and increase the standard of living compared with the petroleum products that Nigeria has relied on as a means of livelihood.
Keywords: farmers, arable crop, cocoyam, respondents, maize
Procedia PDF Downloads 251
23277 Participation of Students and Lecturers in Social Networking for Teaching and Learning in Public Universities in Rivers State, Nigeria
Authors: Nkeiruka Queendarline Nwaizugbu
Abstract:
The use of social media and mobile devices has become acceptable in virtually all areas of today’s world. Hence, this study is a survey that was carried out to find out whether students and lecturers in public universities in Rivers State use social networking for educational purposes. The sample of the study comprised 240 students and 99 lecturers from the University of Port Harcourt and the Rivers State University of Science and Technology. The study had five research questions and two hypotheses, and the instrument for data collection was a 4-point Likert-type rating scale questionnaire. The data was analysed using mean, standard deviation and z-test. The findings from the analysed data show that students participate in social networking using different types of web applications but hardly use them for educational purposes. Some recommendations were also made.
Keywords: internet access, mobile learning, participation, social media, social networking, technology
Procedia PDF Downloads 423
23276 Handling Missing Data by Using Expectation-Maximization and Expectation-Maximization with Bootstrapping for Linear Functional Relationship Model
Authors: Adilah Abdul Ghapor, Yong Zulina Zubairi, A. H. M. R. Imon
Abstract:
The missing value problem is common in statistics and has been of interest for years. This article considers two modern techniques for handling missing data in the linear functional relationship model (LFRM), namely the Expectation-Maximization (EM) algorithm and the Expectation-Maximization with Bootstrapping (EMB) algorithm, using three performance indicators: the mean absolute error (MAE), root mean square error (RMSE) and estimated bias (EB). In this study, we applied the methods of imputing missing values in two types of LFRM, namely the full model of LFRM and the LFRM when the slope is estimated using a nonparametric method. Results of the simulation study suggest that the EMB algorithm performs much better than the EM algorithm in both models. We also illustrate the applicability of the approach on a real data set.
Keywords: expectation-maximization, expectation-maximization with bootstrapping, linear functional relationship model, performance indicators
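Two of the performance indicators named in the abstract, MAE and RMSE, have standard definitions and can be computed as in the sketch below when the true values behind the imputed entries are known, as in a simulation study. The function names and example values are illustrative assumptions. The estimated bias (EB) is left out because the abstract does not define it.

```python
import math


def mean_absolute_error(true_values, imputed_values):
    """Average absolute difference between true and imputed values."""
    errors = [abs(t - p) for t, p in zip(true_values, imputed_values)]
    return sum(errors) / len(errors)


def root_mean_square_error(true_values, imputed_values):
    """Square root of the average squared difference."""
    squared = [(t - p) ** 2 for t, p in zip(true_values, imputed_values)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical true values and the values an imputation method produced.
truth = [1.0, 2.0, 3.0]
imputed = [1.0, 2.0, 6.0]
mae = mean_absolute_error(truth, imputed)
rmse = root_mean_square_error(truth, imputed)
```

RMSE penalizes the single large error more heavily than MAE does, so reporting both gives a fuller picture of imputation quality.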
Procedia PDF Downloads 455
23275 A Comparative Study of Environment Risk Assessment Guidelines of Developing and Developed Countries Including Bangladesh
Authors: Syeda Fahria Hoque Mimmi, Aparna Islam
Abstract:
Genetically engineered (GE) plants are needed to meet the increasing demand for food. A complete set of regulations needs to be followed from the development of a GE plant to its release into the environment. The whole regulation system is divided into separate stages to maintain proper biosafety. Environmental risk assessment (ERA) is one of the crucial stages in the whole process. ERA identifies potential risks and their impacts through science-based evaluation carried out on a case-by-case basis. All the countries that deal with GE plants follow specific guidelines to conduct a successful ERA. In this study, ERA guidelines of 4 developing and 4 developed countries, including Bangladesh, were compared. ERA guidelines of countries such as India, Canada, Australia, the European Union, Argentina, Brazil, and the US were considered as models for the comparison with Bangladesh. Initially, ten parameters were identified to compare the required data and information among all the guidelines. Surprisingly, a considerable number of data and information requirements (e.g., whether the intended modification or new trait of interest has been achieved, the growth habit of GE plants, the consequences of any potential gene flow to sexually compatible plant species upon cultivation of GE plants, potential adverse effects on human health, etc.) matched between all the countries. However, a few differences in data requirements (e.g., agronomic conventions of non-transformed plants, the requirement that applicants clearly describe the experimental procedures followed, etc.) were also observed in the study. Moreover, it was found that only a few countries provide instructions on the quality of the data used for ERA. If these similarities are recognized in a more structured manner, then the approval pathway of GE plants can be shared.
Keywords: GE plants, ERA, harmonization, ERA guidelines, Information and data requirements
Procedia PDF Downloads 187
23274 In-service High School Teachers’ Experiences On Blended Teaching Approach Of Mathematics
Authors: Lukholo Raxangana
Abstract:
Fourth Industrial Revolution (4IR)-era teaching offers in-service mathematics teachers opportunities to use blended approaches to engage learners while teaching mathematics. This study explores in-service high school teachers' experiences with a blended teaching approach to mathematics. This qualitative case study involved eight pre-service teachers from four selected schools in the Sedibeng West District of the Gauteng Province. The study used the community of inquiry model as its analytical framework for data analysis. Data were collected through semi-structured interviews and focus-group discussions to explore in-service teachers' experiences with the influence of blended teaching (BT) on learning mathematics. The results fall under four themes: the impact of load-shedding, the benefits of BT, the perceptions of in-service teachers, and hindrances to BT. Based on these findings, the study recommends that further research focus on developing data-free BT tools to assist during load-shedding, regardless of location.
Keywords: blended teaching, teachers, in-service, and mathematics
Procedia PDF Downloads 58
23273 Auditory Brainstem Response in Wave VI for the Detection of Learning Disabilities
Authors: Maria Isabel Garcia-Planas, Maria Victoria Garcia-Camba
Abstract:
The use of brainstem auditory evoked potentials (BAEP) is a common way to study auditory function and, by studying the behaviour of wave VI, a way to learn about the functionality of some of the neuronal groups in the brain that intervene in the learning process. The latest advances in neuroscience have revealed the existence of different brain activity in the learning process that can be highlighted through the use of innocuous, low-cost, and easily accessible techniques such as, among others, the BAEP, which can help us detect possible neurodevelopmental difficulties early for their subsequent assessment and treatment. To date, and to the authors' best knowledge, only the latency data obtained by observing waves I to V, mainly in the left ear, have been taken into account. This work shows that it is essential to take both ears into account; with these additional data, it has been possible to diagnose more precisely some cases that had previously been classified as 'normal' despite showing signs of some alteration that had motivated a new consultation with the specialist.
Keywords: ear, neurodevelopment, auditory evoked potentials, intervals of normality, learning disabilities
Procedia PDF Downloads 165