Search results for: crowdsourced data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25263

Search results for: crowdsourced data

23643 Improving Student Programming Skills in Introductory Computer and Data Science Courses Using Generative AI

Authors: Genady Grabarnik, Serge Yaskolko

Abstract:

Generative Artificial Intelligence (AI) has significantly expanded its applicability with the incorporation of Large Language Models (LLMs) and become a technology with promise to automate some areas that were very difficult to automate before. The paper describes the introduction of generative Artificial Intelligence into Introductory Computer and Data Science courses and analysis of effect of such introduction. The generative Artificial Intelligence is incorporated in the educational process two-fold: For the instructors, we create templates of prompts for generation of tasks, and grading of the students work, including feedback on the submitted assignments. For the students, we introduce them to basic prompt engineering, which in turn will be used for generation of test cases based on description of the problems, generating code snippets for the single block complexity programming, and partitioning into such blocks of an average size complexity programming. The above-mentioned classes are run using Large Language Models, and feedback from instructors and students and courses’ outcomes are collected. The analysis shows statistically significant positive effect and preference of both stakeholders.

Keywords: introductory computer and data science education, generative AI, large language models, application of LLMS to computer and data science education

Procedia PDF Downloads 58
23642 Study of a Few Additional Posterior Projection Data to 180° Acquisition for Myocardial SPECT

Authors: Yasuyuki Takahashi, Hirotaka Shimada, Takao Kanzaki

Abstract:

A Dual-detector SPECT system is widely by use of myocardial SPECT studies. With 180-degree (180°) acquisition, reconstructed images are distorted in the posterior wall of myocardium due to the lack of sufficient data of posterior projection. We hypothesized that quality of myocardial SPECT images can be improved by the addition of data acquisition of only a few posterior projections to ordinary 180° acquisition. The proposed acquisition method (180° plus acquisition methods) uses the dual-detector SPECT system with a pair of detector arranged in 90° perpendicular. Sampling angle was 5°, and the acquisition range was 180° from 45° right anterior oblique to 45° left posterior oblique. After the acquisition of 180°, the detector moved to additional acquisition position of reverse side once for 2 projections, twice for 4 projections, or 3 times for 6 projections. Since these acquisition methods cannot be done in the present system, actual data acquisition was done by 360° with a sampling angle of 5°, and projection data corresponding to above acquisition position were extracted for reconstruction. We underwent the phantom studies and a clinical study. SPECT images were compared by profile curve analysis and also quantitatively by contrast ratio. The distortion was improved by 180° plus method. Profile curve analysis showed increased of cardiac cavity. Analysis with contrast ratio revealed that SPECT images of the phantoms and the clinical study were improved from 180° acquisition by the present methods. The difference in the contrast was not clearly recognized between 180° plus 2 projections, 180° plus 4 projections, and 180° plus 6 projections. 180° plus 2 projections method may be feasible for myocardial SPECT because distortion of the image and the contrast were improved.

Keywords: 180° plus acquisition method, a few posterior projections, dual-detector SPECT system, myocardial SPECT

Procedia PDF Downloads 296
23641 Re-identification Risk and Mitigation in Federated Learning: Human Activity Recognition Use Case

Authors: Besma Khalfoun

Abstract:

In many current Human Activity Recognition (HAR) applications, users' data is frequently shared and centrally stored by third parties, posing a significant privacy risk. This practice makes these entities attractive targets for extracting sensitive information about users, including their identity, health status, and location, thereby directly violating users' privacy. To tackle the issue of centralized data storage, a relatively recent paradigm known as federated learning has emerged. In this approach, users' raw data remains on their smartphones, where they train the HAR model locally. However, users still share updates of their local models originating from raw data. These updates are vulnerable to several attacks designed to extract sensitive information, such as determining whether a data sample is used in the training process, recovering the training data with inversion attacks, or inferring a specific attribute or property from the training data. In this paper, we first introduce PUR-Attack, a parameter-based user re-identification attack developed for HAR applications within a federated learning setting. It involves associating anonymous model updates (i.e., local models' weights or parameters) with the originating user's identity using background knowledge. PUR-Attack relies on a simple yet effective machine learning classifier and produces promising results. Specifically, we have found that by considering the weights of a given layer in a HAR model, we can uniquely re-identify users with an attack success rate of almost 100%. This result holds when considering a small attack training set and various data splitting strategies in the HAR model training. Thus, it is crucial to investigate protection methods to mitigate this privacy threat. Along this path, we propose SAFER, a privacy-preserving mechanism based on adaptive local differential privacy. Before sharing the model updates with the FL server, SAFER adds the optimal noise based on the re-identification risk assessment. Our approach can achieve a promising tradeoff between privacy, in terms of reducing re-identification risk, and utility, in terms of maintaining acceptable accuracy for the HAR model.

Keywords: federated learning, privacy risk assessment, re-identification risk, privacy preserving mechanisms, local differential privacy, human activity recognition

Procedia PDF Downloads 14
23640 Blockchain for Transport: Performance Simulations of Blockchain Network for Emission Monitoring Scenario

Authors: Dermot O'Brien, Vasileios Christaras, Georgios Fontaras, Igor Nai Fovino, Ioannis Kounelis

Abstract:

With the rise of the Internet of Things (IoT), 5G, and blockchain (BC) technologies, vehicles are becoming ever increasingly connected and are already transmitting substantial amounts of data to the original equipment manufacturers (OEMs) servers. This data could be used to help detect mileage fraud and enable more accurate vehicle emissions monitoring. This would not only help regulators but could enable applications such as permitting efficient drivers to pay less tax, geofencing for air quality improvement, as well as pollution tolling and trading platforms for transport-related businesses and EU citizens. Other applications could include traffic management and shared mobility systems. BC enables the transmission of data with additional security and removes single points of failure while maintaining data provenance, identity ownership, and the possibility to retain varying levels of privacy depending on the requirements of the applied use case. This research performs simulations of vehicles interacting with European member state authorities and European Commission BC nodes that are running hyperleger fabric and explores whether the technology is currently feasible for transport applications such as the emission monitoring use-case.

Keywords: future transportation systems, technological innovations, policy approaches for transportation future, economic and regulatory trends, blockchain

Procedia PDF Downloads 177
23639 DURAFILE: A Collaborative Tool for Preserving Digital Media Files

Authors: Santiago Macho, Miquel Montaner, Raivo Ruusalepp, Ferran Candela, Xavier Tarres, Rando Rostok

Abstract:

During our lives, we generate a lot of personal information such as photos, music, text documents and videos that link us with our past. This data that used to be tangible is now digital information stored in our computers, which implies a software dependence to make them accessible in the future. Technology, however, constantly evolves and goes through regular shifts, quickly rendering various file formats obsolete. The need for accessing data in the future affects not only personal users but also organizations. In a digital environment, a reliable preservation plan and the ability to adapt to fast changing technology are essential for maintaining data collections in the long term. We present in this paper the European FP7 project called DURAFILE that provides the technology to preserve media files for personal users and organizations while maintaining their quality.

Keywords: artificial intelligence, digital preservation, social search, digital preservation plans

Procedia PDF Downloads 445
23638 Constructing a Semi-Supervised Model for Network Intrusion Detection

Authors: Tigabu Dagne Akal

Abstract:

While advances in computer and communications technology have made the network ubiquitous, they have also rendered networked systems vulnerable to malicious attacks devised from a distance. These attacks or intrusions start with attackers infiltrating a network through a vulnerable host and then launching further attacks on the local network or Intranet. Nowadays, system administrators and network professionals can attempt to prevent such attacks by developing intrusion detection tools and systems using data mining technology. In this study, the experiments were conducted following the Knowledge Discovery in Database Process Model. The Knowledge Discovery in Database Process Model starts from selection of the datasets. The dataset used in this study has been taken from Massachusetts Institute of Technology Lincoln Laboratory. After taking the data, it has been pre-processed. The major pre-processing activities include fill in missed values, remove outliers; resolve inconsistencies, integration of data that contains both labelled and unlabelled datasets, dimensionality reduction, size reduction and data transformation activity like discretization tasks were done for this study. A total of 21,533 intrusion records are used for training the models. For validating the performance of the selected model a separate 3,397 records are used as a testing set. For building a predictive model for intrusion detection J48 decision tree and the Naïve Bayes algorithms have been tested as a classification approach for both with and without feature selection approaches. The model that was created using 10-fold cross validation using the J48 decision tree algorithm with the default parameter values showed the best classification accuracy. The model has a prediction accuracy of 96.11% on the training datasets and 93.2% on the test dataset to classify the new instances as normal, DOS, U2R, R2L and probe classes. The findings of this study have shown that the data mining methods generates interesting rules that are crucial for intrusion detection and prevention in the networking industry. Future research directions are forwarded to come up an applicable system in the area of the study.

Keywords: intrusion detection, data mining, computer science, data mining

Procedia PDF Downloads 297
23637 Academic Leadership Succession Planning Practice in Nigeria Higher Education Institutions: A Case Study of Colleges of Education

Authors: Adie, Julius Undiukeye

Abstract:

This research investigated the practice of academic leadership succession planning in Nigerian higher education institutions, drawing on the lived experiences of the academic staff of the case study institutions. It is multi-case study research that adopts a qualitative research method. Ten participants (mainly academic staff) were used as the study sample. The study was guided by four research questions. Semi-structured interviews and archival information from official documents formed the sources of data. The data collected was analyzed using the Constant Comparative Technique (CCT) to generate empirical insights and facts on the subject of this paper. The following findings emerged from the data analysis: firstly, there was no formalized leadership succession plan in place in the institutions that were sampled for this study; secondly, despite the absence of a formal succession plan, the data indicates that academics believe that succession planning is very significant for institutional survival; thirdly, existing practices of succession planning in the sampled institutions, takes the forms of job seniority ranking, political process and executive fiat, ad-hoc arrangement, and external hiring; and finally, data revealed that there are some barriers to the practice of succession planning, such as traditional higher education institutions’ characteristics (e.g. external talent search, shared governance, diversity, and equality in leadership appointment) and the lack of interest in leadership positions. Based on the research findings, some far-reaching recommendations were made, including the urgent need for the ‘formalization’ of leadership succession planning by the higher education institutions concerned, through the design of an official policy framework.

Keywords: academic leadership, succession, planning, higher education

Procedia PDF Downloads 147
23636 Native Language Identification with Cross-Corpus Evaluation Using Social Media Data: ’Reddit’

Authors: Yasmeen Bassas, Sandra Kuebler, Allen Riddell

Abstract:

Native language identification is one of the growing subfields in natural language processing (NLP). The task of native language identification (NLI) is mainly concerned with predicting the native language of an author’s writing in a second language. In this paper, we investigate the performance of two types of features; content-based features vs. content independent features, when they are evaluated on a different corpus (using social media data “Reddit”). In this NLI task, the predefined models are trained on one corpus (TOEFL), and then the trained models are evaluated on different data using an external corpus (Reddit). Three classifiers are used in this task; the baseline, linear SVM, and logistic regression. Results show that content-based features are more accurate and robust than content independent ones when tested within the corpus and across corpus.

Keywords: NLI, NLP, content-based features, content independent features, social media corpus, ML

Procedia PDF Downloads 139
23635 Integration of Internet-Accessible Resources in the Field of Mobile Robots

Authors: B. Madhevan, R. Sakkaravarthi, R. Diya

Abstract:

The number and variety of mobile robot applications are increasing day by day, both in an industry and in our daily lives. First developed as a tool, nowadays mobile robots can be integrated as an entity in Internet-accessible resources. The present work is organized around four potential resources such as cloud computing, Internet of things, Big data analysis and Co-simulation. Further, the focus relies on integrating, analyzing and discussing the need for integrating Internet-accessible resources and the challenges deriving from such integration, and how these issues have been tackled. Hence, the research work investigates the concepts of the Internet-accessible resources from the aspect of the autonomous mobile robots with an overview of the performances of the currently available database systems. IaR is a world-wide network of interconnected objects, can be considered an evolutionary process in mobile robots. IaR constitutes an integral part of future Internet with data analysis, consisting of both physical and virtual things.

Keywords: internet-accessible resources, cloud computing, big data analysis, internet of things, mobile robot

Procedia PDF Downloads 390
23634 The Application of Lean-Kaizen in Course Plan and Delivery in Malaysian Higher Education Sector

Authors: Nur Aishah Binti Awi, Zulfiqar Khan

Abstract:

Lean-kaizen has always been applied in manufacturing sector since many years ago. What about education sector? This paper discuss on how lean-kaizen can also be applied in education sector, specifically in academic area of Malaysian’s higher education sector. The purpose of this paper is to describe the application of lean kaizen in course plan and delivery. Lean-kaizen techniques have been used to identify waste in the course plan and delivery. A field study has been conducted to obtain the data. This study used both quantitative and qualitative data. The researcher had interviewed the chosen lecturers regarding to the problems of course plan and delivery that they encountered. Secondary data of students’ feedback at the end of semester also has been used to improve course plan and delivery. The result empirically shows that lean-kaizen helps to improve the course plan and delivery by reducing the wastes. Thus, this study demonstrates that lean-kaizen can also help education sector to improve their services as achieved by manufacturing sector.

Keywords: course delivery, education, Kaizen, lean

Procedia PDF Downloads 369
23633 An ANN Approach for Detection and Localization of Fatigue Damage in Aircraft Structures

Authors: Reza Rezaeipour Honarmandzad

Abstract:

In this paper we propose an ANN for detection and localization of fatigue damage in aircraft structures. We used network of piezoelectric transducers for Lamb-wave measurements in order to calculate damage indices. Data gathered by the sensors was given to neural network classifier. A set of neural network electors of different architecture cooperates to achieve consensus concerning the state of each monitored path. Sensed signal variations in the ROI, detected by the networks at each path, were used to assess the state of the structure as well as to localize detected damage and to filter out ambient changes. The classifier has been extensively tested on large data sets acquired in the tests of specimens with artificially introduced notches as well as the results of numerous fatigue experiments. Effect of the classifier structure and test data used for training on the results was evaluated.

Keywords: ANN, fatigue damage, aircraft structures, piezoelectric transducers, lamb-wave measurements

Procedia PDF Downloads 420
23632 Public Libraries as Social Spaces for Vulnerable Populations

Authors: Natalie Malone

Abstract:

This study explores the role of a public library in the creation of social spaces for vulnerable populations. The data stems from a longitudinal ethnographic study of the Anderson Library community, which included field notes, artifacts, and interview data. Thematic analysis revealed multiple meanings and thematic relationships within and among the data sources -interviews, field notes, and artifacts. Initial analysis suggests the Anderson Library serves as a space for vulnerable populations, with the sub-themes of fostering interpersonal communication to create a social space for children and fostering interpersonal communication to create a social space for parents and adults. These findings are important as they illustrate the potential of public libraries to serve as community empowering institutions.

Keywords: capital, immigrant families, public libraries, space, vulnerable

Procedia PDF Downloads 154
23631 A Decadal Flood Assessment Using Time-Series Satellite Data in Cambodia

Authors: Nguyen-Thanh Son

Abstract:

Flood is among the most frequent and costliest natural hazards. The flood disasters especially affect the poor people in rural areas, who are heavily dependent on agriculture and have lower incomes. Cambodia is identified as one of the most climate-vulnerable countries in the world, ranked 13th out of 181 countries most affected by the impacts of climate change. Flood monitoring is thus a strategic priority at national and regional levels because policymakers need reliable spatial and temporal information on flood-prone areas to form successful monitoring programs to reduce possible impacts on the country’s economy and people’s likelihood. This study aims to develop methods for flood mapping and assessment from MODIS data in Cambodia. We processed the data for the period from 2000 to 2017, following three main steps: (1) data pre-processing to construct smooth time-series vegetation and water surface indices, (2) delineation of flood-prone areas, and (3) accuracy assessment. The results of flood mapping were verified with the ground reference data, indicating the overall accuracy of 88.7% and a Kappa coefficient of 0.77, respectively. These results were reaffirmed by close agreement between the flood-mapping area and ground reference data, with the correlation coefficient of determination (R²) of 0.94. The seasonally flooded areas observed for 2010, 2015, and 2016 were remarkably smaller than other years, mainly attributed to the El Niño weather phenomenon exacerbated by impacts of climate change. Eventually, although several sources potentially lowered the mapping accuracy of flood-prone areas, including image cloud contamination, mixed-pixel issues, and low-resolution bias between the mapping results and ground reference data, our methods indicated the satisfactory results for delineating spatiotemporal evolutions of floods. The results in the form of quantitative information on spatiotemporal flood distributions could be beneficial to policymakers in evaluating their management strategies for mitigating the negative effects of floods on agriculture and people’s likelihood in the country.

Keywords: MODIS, flood, mapping, Cambodia

Procedia PDF Downloads 128
23630 Data Mining of Students' Performance Using Artificial Neural Network: Turkish Students as a Case Study

Authors: Samuel Nii Tackie, Oyebade K. Oyedotun, Ebenezer O. Olaniyi, Adnan Khashman

Abstract:

Artificial neural networks have been used in different fields of artificial intelligence, and more specifically in machine learning. Although, other machine learning options are feasible in most situations, but the ease with which neural networks lend themselves to different problems which include pattern recognition, image compression, classification, computer vision, regression etc. has earned it a remarkable place in the machine learning field. This research exploits neural networks as a data mining tool in predicting the number of times a student repeats a course, considering some attributes relating to the course itself, the teacher, and the particular student. Neural networks were used in this work to map the relationship between some attributes related to students’ course assessment and the number of times a student will possibly repeat a course before he passes. It is the hope that the possibility to predict students’ performance from such complex relationships can help facilitate the fine-tuning of academic systems and policies implemented in learning environments. To validate the power of neural networks in data mining, Turkish students’ performance database has been used; feedforward and radial basis function networks were trained for this task; and the performances obtained from these networks evaluated in consideration of achieved recognition rates and training time.

Keywords: artificial neural network, data mining, classification, students’ evaluation

Procedia PDF Downloads 615
23629 Evaluation of Routing Protocols in Mobile Adhoc Networks

Authors: Anu Malhotra

Abstract:

An Ad-hoc network is one that is an autonomous, self configuring network made up of mobile nodes connected via wireless links. Ad-hoc networks often consist of nodes, mobile hosts (MH) or mobile stations (MS, also serving as routers) connected by wireless links. Different routing protocols are used for data transmission in between the nodes in an adhoc network. In this paper two protocols (OLSR and AODV) are analyzed on the basis of two parameters i.e. time delay and throughput with different data rates. On the basis of these analysis, we observed that with same data rate, AODV protocol is having more time delay than the OLSR protocol whereas throughput for the OLSR protocol is less compared to the AODV protocol.

Keywords: routing adhoc, mobile hosts, mobile stations, OLSR protocol, AODV protocol

Procedia PDF Downloads 508
23628 Experimental Investigation of Natural Frequency and Forced Vibration of Euler-Bernoulli Beam under Displacement of Concentrated Mass and Load

Authors: Aref Aasi, Sadegh Mehdi Aghaei, Balaji Panchapakesan

Abstract:

This work aims to evaluate the free and forced vibration of a beam with two end joints subjected to a concentrated moving mass and a load using the Euler-Bernoulli method. The natural frequency is calculated for different locations of the concentrated mass and load on the beam. The analytical results are verified by the experimental data. The variations of natural frequency as a function of the location of the mass, the effect of the forced frequency on the vibrational amplitude, and the displacement amplitude versus time are investigated. It is discovered that as the concentrated mass moves toward the center of the beam, the natural frequency of the beam and the relative error between experimental and analytical data decreases. There is a close resemblance between analytical data and experimental observations.

Keywords: Euler-Bernoulli beam, natural frequency, forced vibration, experimental setup

Procedia PDF Downloads 275
23627 The Phonemic Inventory of Tenyidie Affricates: An Acoustic Study

Authors: NeisaKuonuo Tungoe

Abstract:

Tenyidie, also known as Angami, is spoken by the Angami tribe of Nagaland, North-East India, bordering Myanmar (Burma). It belongs to the Tibeto-Burman language group, falling under the Kuki-Chin-Naga sub-family. Tenyidie studies have seen random attempts at explaining the phonemic inventory of Tenyidie. Different scholars have variously emphasized the grammar or the history of Tenyidie. Many of these claims have been stimulating, but they were often based on a small amount of merely suggestive data or on auditory perception only. The principal objective of this paper is to analyse the affricate segments of Tenyidie as an acoustic study. There are seven categories to the inventory of Tenyidie; Plosives, Nasals, Affricates, Laterals, Rhotics, Fricatives, Semi vowels and Vowels. In all, there are sixty phonemes in the inventory. As mentioned above, the only prominent readings on Tenyidie or affricates in particular are only reflected through auditory perception. As noted above, this study aims to lay out the affricate segments based only on acoustic conclusions. There are seven affricates found in Tenyidie. They are: 1) Voiceless Labiodental Affricate - / pf /, 2) Voiceless Aspirated Labiodental Affricate- / pfh /, 3) Voiceless Alveolar Affricate - / ts /, 4) Voiceless Aspirated Alveolar Affricate - / tsh /, 5) Voiced Alveolar Affricate - / dz /, 6) Voiceless Post-Alveolar Affricate / tʃ / and 7) Voiced Post- Alveolar Affricate- / dʒ /. Since the study is based on acoustic features of affricates, five informants were asked to record their voice with Tenyidie phonemes and English phonemes. Throughout the study of the recorded data, PRAAT, a scientific software program that has made itself indispensible for the analyses of speech in phonetics, have been used as the main software. This data was then used as a comparative study between Tenyidie and English affricates. Comparisons have also been drawn between this study and the work of another author who has stated that there are only six affricates in Tenyidie. The study has been quite detailed regarding the specifics of the data. Detailed accounts of the duration and acoustic cues have been noted. The data will be presented in the form of spectrograms. Since there aren’t any other acoustic related data done on Tenyidie, this study will be the first in the long line of acoustic researches on Tenyidie.

Keywords: tenyidie, affricates, praat, phonemic inventory

Procedia PDF Downloads 419
23626 Exploring Students' Understanding about Bullying in Private Colleges in Rawalpindi, Pakistan

Authors: Alveena Khan

Abstract:

The objective of this research is to explore students’ understanding about bullying and different bullying types. Nowadays bullying is considered as an important social issue around the world because it has long lasting effects on students’ lives. Sometimes due to bullying students commit suicide, they lose confidence and become isolated. This research used qualitative research approach. In order to generate data, triangulation was considered for the verification and reliability of the generated data. Semi-structured interview, non-participant observation, and case studies were conducted. This research focused on five major private colleges and 20 students (both female and male) participated in Rawalpindi, Pakistan. The data generated included approximately 45 hours of total interviews. Thematic analysis was used for data analysis and followed grounded theory to generate themes. The findings of the research highlights that bullying does prevail in studied private colleges, mostly in the form of verbal and physical bullying. No specific gender difference was found in experiencing verbal and physical bullying. Furthermore, from students’ point of view, college administrators are responsible to deal with bullying. The researcher suggests that there must be a proper check and balance system and anti-bullying programs should be held in colleges to create a protective and healthy environment in which students do not face bullying.

Keywords: bullying, college student, physical and verbal bullying, qualitative research

Procedia PDF Downloads 160
23625 Consumer Values in the Perspective of Javanese Mataraman Society: Identification, Meaning, and Application

Authors: Anna Triwijayati, Etsa Astridya Setiyati, Titik Desi Harsoyo

Abstract:

Culture is the important determinant of human behavior and desire. Culture influences the consumer through the norms and values established by the society in which they live and reflect it. The cultural values of Javanese society certainly have united in the Javanese society behavior in consumption. This research is expected to give big enough theoretical benefits in the findings of cultural value in consumption in Javanese society. These can be an incentive in finding the local cultural value in many tribes in Indonesia, so one time, the local cultural value in Indonesia about consumption can be fundamental part in education and consumption practice in Indonesia. The approach used in this research is non positivist research or is known as qualitative approach. The method or type of research used in this research is ethnomethodology. The collection data is done in Central Java region. The research subject or informant is determined by the purposive technique by certain criteria determined by the researcher. The data is collected by deep interview and observation. Before the data analysis, the researcher does the storing method data stage and implements the data validity procedures. Then, the data is analyzed by the theme and interactive analysis technique. The Javanese Mataraman society has such consumption values such as has to be sufficient, be careful, economical, submit to the one who creates the life, the way life flow, and the present problem is thought in the present also. In the financial management for consumption, the consumer should have the simple life principles, has to be sufficient, has to be able to eat, has to be able to self-press, well-managed/diligent/accurate/careful, the open or transparent management, has the struggle effort, like to self-sacrifice and think about the future. The meaning of consumption value in family is centered to the submission and full-trust to God. These consumption values are applied in consumer behavior in self, family, investment and credit need in short term and long term perspective.

Keywords: values, consumer, consumption, Javanese Mataraman, ethnomethodology

Procedia PDF Downloads 393
23624 Parallel Fuzzy Rough Support Vector Machine for Data Classification in Cloud Environment

Authors: Arindam Chaudhuri

Abstract:

Classification of data has been actively used for most effective and efficient means of conveying knowledge and information to users. The prima face has always been upon techniques for extracting useful knowledge from data such that returns are maximized. With emergence of huge datasets the existing classification techniques often fail to produce desirable results. The challenge lies in analyzing and understanding characteristics of massive data sets by retrieving useful geometric and statistical patterns. We propose a supervised parallel fuzzy rough support vector machine (PFRSVM) for data classification in cloud environment. The classification is performed by PFRSVM using hyperbolic tangent kernel. The fuzzy rough set model takes care of sensitiveness of noisy samples and handles impreciseness in training samples bringing robustness to results. The membership function is function of center and radius of each class in feature space and is represented with kernel. It plays an important role towards sampling the decision surface. The success of PFRSVM is governed by choosing appropriate parameter values. The training samples are either linear or nonlinear separable. The different input points make unique contributions to decision surface. The algorithm is parallelized with a view to reduce training times. The system is built on support vector machine library using Hadoop implementation of MapReduce. The algorithm is tested on large data sets to check its feasibility and convergence. The performance of classifier is also assessed in terms of number of support vectors. The challenges encountered towards implementing big data classification in machine learning frameworks are also discussed. The experiments are done on the cloud environment available at University of Technology and Management, India. The results are illustrated for Gaussian RBF and Bayesian kernels. The effect of variability in prediction and generalization of PFRSVM is examined with respect to values of parameter C. It effectively resolves outliers’ effects, imbalance and overlapping class problems, normalizes to unseen data and relaxes dependency between features and labels. The average classification accuracy for PFRSVM is better than other classifiers for both Gaussian RBF and Bayesian kernels. The experimental results on both synthetic and real data sets clearly demonstrate the superiority of the proposed technique.

Keywords: FRSVM, Hadoop, MapReduce, PFRSVM

Procedia PDF Downloads 491
23623 Design and Development of a Computerized Medical Record System for Hospitals in Remote Areas

Authors: Grace Omowunmi Soyebi

Abstract:

A computerized medical record system is a collection of medical information about a person that is stored on a computer. One principal problem of most hospitals in rural areas is using the file management system for keeping records. A lot of time is wasted when a patient visits the hospital, probably in an emergency, and the nurse or attendant has to search through voluminous files before the patient's file can be retrieved, this may cause an unexpected to happen to the patient. This Data Mining application is to be designed using a Structured System Analysis and design method which will help in a well-articulated analysis of the existing file management system, feasibility study, and proper documentation of the Design and Implementation of a Computerized medical record system. This Computerized system will replace the file management system and help to quickly retrieve a patient's record with increased data security, access clinical records for decision-making, and reduce the time range at which a patient gets attended to.

Keywords: programming, computing, data, innovation

Procedia PDF Downloads 120
23622 Modified CUSUM Algorithm for Gradual Change Detection in a Time Series Data

Authors: Victoria Siriaki Jorry, I. S. Mbalawata, Hayong Shin

Abstract:

The main objective in a change detection problem is to develop algorithms for efficient detection of gradual and/or abrupt changes in the parameter distribution of a process or time series data. In this paper, we present a modified cumulative (MCUSUM) algorithm to detect the start and end of a time-varying linear drift in mean value of a time series data based on likelihood ratio test procedure. The design, implementation and performance of the proposed algorithm for a linear drift detection is evaluated and compared to the existing CUSUM algorithm using different performance measures. An approach to accurately approximate the threshold of the MCUSUM is also provided. Performance of the MCUSUM for gradual change-point detection is compared to that of standard cumulative sum (CUSUM) control chart designed for abrupt shift detection using Monte Carlo Simulations. In terms of the expected time for detection, the MCUSUM procedure is found to have a better performance than a standard CUSUM chart for detection of the gradual change in mean. The algorithm is then applied and tested to a randomly generated time series data with a gradual linear trend in mean to demonstrate its usefulness.

Keywords: average run length, CUSUM control chart, gradual change detection, likelihood ratio test

Procedia PDF Downloads 299
23621 Contextual Toxicity Detection with Data Augmentation

Authors: Julia Ive, Lucia Specia

Abstract:

Understanding and detecting toxicity is an important problem to support safer human interactions online. Our work focuses on the important problem of contextual toxicity detection, where automated classifiers are tasked with determining whether a short textual segment (usually a sentence) is toxic within its conversational context. We use “toxicity” as an umbrella term to denote a number of variants commonly named in the literature, including hate, abuse, offence, among others. Detecting toxicity in context is a non-trivial problem and has been addressed by very few previous studies. These previous studies have analysed the influence of conversational context in human perception of toxicity in controlled experiments and concluded that humans rarely change their judgements in the presence of context. They have also evaluated contextual detection models based on state-of-the-art Deep Learning and Natural Language Processing (NLP) techniques. Counterintuitively, they reached the general conclusion that computational models tend to suffer performance degradation in the presence of context. We challenge these empirical observations by devising better contextual predictive models that also rely on NLP data augmentation techniques to create larger and better data. In our study, we start by further analysing the human perception of toxicity in conversational data (i.e., tweets), in the absence versus presence of context, in this case, previous tweets in the same conversational thread. We observed that the conclusions of previous work on human perception are mainly due to data issues: The contextual data available does not provide sufficient evidence that context is indeed important (even for humans). The data problem is common in current toxicity datasets: cases labelled as toxic are either obviously toxic (i.e., overt toxicity with swear, racist, etc. words), and thus context does is not needed for a decision, or are ambiguous, vague or unclear even in the presence of context; in addition, the data contains labeling inconsistencies. To address this problem, we propose to automatically generate contextual samples where toxicity is not obvious (i.e., covert cases) without context or where different contexts can lead to different toxicity judgements for the same tweet. We generate toxic and non-toxic utterances conditioned on the context or on target tweets using a range of techniques for controlled text generation(e.g., Generative Adversarial Networks and steering techniques). On the contextual detection models, we posit that their poor performance is due to limitations on both of the data they are trained on (same problems stated above) and the architectures they use, which are not able to leverage context in effective ways. To improve on that, we propose text classification architectures that take the hierarchy of conversational utterances into account. In experiments benchmarking ours against previous models on existing and automatically generated data, we show that both data and architectural choices are very important. Our model achieves substantial performance improvements as compared to the baselines that are non-contextual or contextual but agnostic of the conversation structure.

Keywords: contextual toxicity detection, data augmentation, hierarchical text classification models, natural language processing

Procedia PDF Downloads 171
23620 Osteoarthritis (OA): A Total Knee Replacement Surgery

Authors: Loveneet Kaur

Abstract:

Introduction: Osteoarthritis (OA) is one of the leading causes of disability, and the knee is the most commonly affected joint in the body. The last resort for treatment of knee OA is Total Knee Replacement (TKR) surgery. Despite numerous advances in prosthetic design, patients do not reach normal function after surgery. Current surgical decisions are made on 2D radiographs and patient interviews. Aims: The aim of this study was to compare knee kinematics pre and post-TKR surgery using computer-animated images of patient-specific models under everyday conditions. Methods: 7 subjects were recruited for the study. Subjects underwent 3D gait analysis during 4 everyday activities and medical imaging of the knee joint pre- and one-month post-surgery. A 3D model was created from each of the scans, and the kinematic gait analysis data was used to animate the images. Results: Improvements were seen in a range of motion in all 4 activities 1-year post-surgery. The preoperative 3D images provide detailed information on the anatomy of the osteoarthritic knee. The postoperative images demonstrate potential future problems associated with the implant. Although not accurate enough to be of clinical use, the animated data can provide valuable insight into what conditions cause damage to both the osteoarthritic and prosthetic knee joints. As the animated data does not require specialist training to view, the images can be utilized across the fields of health professionals and manufacturing in the assessment and treatment of patients pre and post-knee replacement surgery. Future improvements in the collection and processing of data may yield clinically useful data. Conclusion: Although not yet of clinical use, the potential application of 3D animations of the knee joint pre and post-surgery is widespread.

Keywords: Orthoporosis, Ortharthritis, knee replacement, TKR

Procedia PDF Downloads 52
23619 Time of Week Intensity Estimation from Interval Censored Data with Application to Police Patrol Planning

Authors: Jiahao Tian, Michael D. Porter

Abstract:

Law enforcement agencies are tasked with crime prevention and crime reduction under limited resources. Having an accurate temporal estimate of the crime rate would be valuable to achieve such a goal. However, estimation is usually complicated by the interval-censored nature of crime data. We cast the problem of intensity estimation as a Poisson regression using an EM algorithm to estimate the parameters. Two special penalties are added that provide smoothness over the time of day and day of the week. This approach presented here provides accurate intensity estimates and can also uncover day-of-week clusters that share the same intensity patterns. Anticipating where and when crimes might occur is a key element to successful policing strategies. However, this task is complicated by the presence of interval-censored data. The censored data refers to the type of data that the event time is only known to lie within an interval instead of being observed exactly. This type of data is prevailing in the field of criminology because of the absence of victims for certain types of crime. Despite its importance, the research in temporal analysis of crime has lagged behind the spatial component. Inspired by the success of solving crime-related problems with a statistical approach, we propose a statistical model for the temporal intensity estimation of crime with censored data. The model is built on Poisson regression and has special penalty terms added to the likelihood. An EM algorithm was derived to obtain maximum likelihood estimates, and the resulting model shows superior performance to the competing model. Our research is in line with the smart policing initiative (SPI) proposed by the Bureau Justice of Assistance (BJA) as an effort to support law enforcement agencies in building evidence-based, data-driven law enforcement tactics. The goal is to identify strategic approaches that are effective in crime prevention and reduction. In our case, we allow agencies to deploy their resources for a relatively short period of time to achieve the maximum level of crime reduction. By analyzing a particular area within cities where data are available, our proposed approach could not only provide an accurate estimate of intensities for the time unit considered but a time-variation crime incidence pattern. Both will be helpful in the allocation of limited resources by either improving the existing patrol plan with the understanding of the discovery of the day of week cluster or supporting extra resources available.

Keywords: cluster detection, EM algorithm, interval censoring, intensity estimation

Procedia PDF Downloads 66
23618 Diversifying from Petroleum Products to Arable Farming as Source of Revenue Generation in Nigeria: A Case Study of Ondo West Local Government

Authors: A. S. Akinbani

Abstract:

Overdependence on petroleum is causing set back in Nigeria economy. Field survey was carried out to assess the profitability and production of selected arable crops in six selected towns and villages of Ondo southwestern. Data were collected from 240 arable crop farmers with the aid of both primary and secondary data. Data were collected with the use of oral interview and structured questionnaires. Data collected were analyzed using both descriptive and inferential statistics. Forty farmers were randomly selected to give a total number of 240 respondents. 84 farmers interviewed had no formal education, 72 had primary education, 50 farmers attained secondary education while 38 attained beyond secondary education. The majority of the farmers hold less than 10 acres of land. The data collected from the field showed that 192 farmers practiced mixed cropping which includes mixtures of yam, cowpea, cocoyam, vegetable, cassava and maize while only 48 farmers practiced monocropping. Among the sampled farmers, 93% agreed that arable production is profitable while 7% disagreed. The findings show that managerial practices that conserve the soil fertility and reduce labor cost such as planting of leguminous crops and herbicide application instead of using hand held hoe for weeding should be encouraged. All the respondents agreed that yam, cowpea, cocoyam, sweet potato, rice, maize and vegetable production will solve the problem of hunger and increase standard of living compared with petroleum product that Nigeria relied on as means of livelihood.

Keywords: farmers, arable crop, cocoyam, respondents, maize

Procedia PDF Downloads 252
23617 Participation of Students and Lecturers in Social Networking for Teaching and Learning in Public Universities in Rivers State, Nigeria

Authors: Nkeiruka Queendarline Nwaizugbu

Abstract:

The use of social media and mobile devices has become acceptable in virtually all areas of today’s world. Hence, this study is a survey that was carried out to find out if students and lecturers in public universities in Rivers State use social networking for educational purposes. The sample of the study comprised of 240 students and 99 lecturers from the University of Port Harcourt and the Rivers State University of science and Technology. The study had five research questions, two hypotheses and the instrument for data collection was a 4-point Likert-type rating scale questionnaire. The data was analysed using mean, standard deviation and z-test. The findings gotten from the analysed data shows that students participate in social networking using different types of web applications but they hardly use them for educational purposes. Some recommendations were also made.

Keywords: internet access, mobile learning, participation, social media, social networking, technology

Procedia PDF Downloads 424
23616 Handling Missing Data by Using Expectation-Maximization and Expectation-Maximization with Bootstrapping for Linear Functional Relationship Model

Authors: Adilah Abdul Ghapor, Yong Zulina Zubairi, A. H. M. R. Imon

Abstract:

Missing value problem is common in statistics and has been of interest for years. This article considers two modern techniques in handling missing data for linear functional relationship model (LFRM) namely the Expectation-Maximization (EM) algorithm and Expectation-Maximization with Bootstrapping (EMB) algorithm using three performance indicators; namely the mean absolute error (MAE), root mean square error (RMSE) and estimated biased (EB). In this study, we applied the methods of imputing missing values in two types of LFRM namely the full model of LFRM and in LFRM when the slope is estimated using a nonparametric method. Results of the simulation study suggest that EMB algorithm performs much better than EM algorithm in both models. We also illustrate the applicability of the approach in a real data set.

Keywords: expectation-maximization, expectation-maximization with bootstrapping, linear functional relationship model, performance indicators

Procedia PDF Downloads 455
23615 A Comparative Study of Environment Risk Assessment Guidelines of Developing and Developed Countries Including Bangladesh

Authors: Syeda Fahria Hoque Mimmi, Aparna Islam

Abstract:

Genetically engineered (GE) plants are the need of time for increased demand for food. A complete set of regulations need to be followed from the development of a GE plant to its release into the environment. The whole regulation system is categorized into separate stages for maintaining the proper biosafety. Environmental risk assessment (ERA) is one of such crucial stages in the whole process. ERA identifies potential risks and their impacts through science-based evaluation where it is done in a case-by-case study. All the countries which deal with GE plants follow specific guidelines to conduct a successful ERA. In this study, ERA guidelines of 4 developing and 4 developed countries, including Bangladesh, were compared. ERA guidelines of countries such as India, Canada, Australia, the European Union, Argentina, Brazil, and the US were considered as a model to conduct the comparison study with Bangladesh. Initially, ten parameters were detected to compare the required data and information among all the guidelines. Surprisingly, an adequate amount of data and information requirements (e.g., if the intended modification/new traits of interest has been achieved or not, the growth habit of GE plants, consequences of any potential gene flow upon the cultivation of GE plants to sexually compatible plant species, potential adverse effects on the human health, etc.) matched between all the countries. However, a few differences in data requirement (e.g., agronomic conventions of non-transformed plants, applicants should clearly describe experimental procedures followed, etc.) were also observed in the study. Moreover, it was found that only a few countries provide instructions on the quality of the data used for ERA. If these similarities are recognized in a more framed manner, then the approval pathway of GE plants can be shared.

Keywords: GE plants, ERA, harmonization, ERA guidelines, Information and data requirements

Procedia PDF Downloads 187
23614 Creation of a Realistic Railway Simulator Developed on a 3D Graphic Game Engine Using a Numerical Computing Programming Environment

Authors: Kshitij Ansingkar, Yohei Hoshino, Liangliang Yang

Abstract:

Advances in algorithms related to autonomous systems have made it possible to research on improving the accuracy of a train’s location. This has the capability of increasing the throughput of a railway network without the need for the creation of additional infrastructure. To develop such a system, the railway industry requires data to test sensor fusion theories or implement simultaneous localization and mapping (SLAM) algorithms. Though such simulation data and ground truth datasets are available for testing automation algorithms of vehicles, however, due to regulations and economic considerations, there is a dearth of such datasets in the railway industry. Thus, there is a need for the creation of a simulation environment that can generate realistic synthetic datasets. This paper proposes (1) to leverage the capabilities of open-source 3D graphic rendering software to create a visualization of the environment. (2) to utilize open-source 3D geospatial data for accurate visualization and (3) to integrate the graphic rendering software with a programming language and numerical computing platform. To develop such an integrated platform, this paper utilizes the computing platform’s advanced sensor models like LIDAR, camera, IMU or GPS and merges it with the 3D rendering of the game engine to generate high-quality synthetic data. Further, these datasets can be used to train Railway models and improve the accuracy of a train’s location.

Keywords: 3D game engine, 3D geospatial data, dataset generation, railway simulator, sensor fusion, SLAM

Procedia PDF Downloads 12