Search results for: manual data inquiry
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25279

Search results for: manual data inquiry

23419 An Investigation of E-Government by Using GIS and Establishing E-Government in Developing Countries Case Study: Iraq

Authors: Ahmed M. Jamel

Abstract:

Electronic government initiatives and public participation to them are among the indicators of today's development criteria of the countries. After consequent two wars, Iraq's current position in, for example, UN's e-government ranking is quite concerning and did not improve in recent years, either. In the preparation of this work, we are motivated with the fact that handling geographic data of the public facilities and resources are needed in most of the e-government projects. Geographical information systems (GIS) provide most common tools not only to manage spatial data but also to integrate such type of data with nonspatial attributes of the features. With this background, this paper proposes that establishing a working GIS in the health sector of Iraq would improve e-government applications. As the case study, investigating hospital locations in Erbil is chosen.

Keywords: e-government, GIS, Iraq, Erbil

Procedia PDF Downloads 381
23418 Evaluation of Classification Algorithms for Diagnosis of Asthma in Iranian Patients

Authors: Taha SamadSoltani, Peyman Rezaei Hachesu, Marjan GhaziSaeedi, Maryam Zolnoori

Abstract:

Introduction: Data mining defined as a process to find patterns and relationships along data in the database to build predictive models. Application of data mining extended in vast sectors such as the healthcare services. Medical data mining aims to solve real-world problems in the diagnosis and treatment of diseases. This method applies various techniques and algorithms which have different accuracy and precision. The purpose of this study was to apply knowledge discovery and data mining techniques for the diagnosis of asthma based on patient symptoms and history. Method: Data mining includes several steps and decisions should be made by the user which starts by creation of an understanding of the scope and application of previous knowledge in this area and identifying KD process from the point of view of the stakeholders and finished by acting on discovered knowledge using knowledge conducting, integrating knowledge with other systems and knowledge documenting and reporting.in this study a stepwise methodology followed to achieve a logical outcome. Results: Sensitivity, Specifity and Accuracy of KNN, SVM, Naïve bayes, NN, Classification tree and CN2 algorithms and related similar studies was evaluated and ROC curves were plotted to show the performance of the system. Conclusion: The results show that we can accurately diagnose asthma, approximately ninety percent, based on the demographical and clinical data. The study also showed that the methods based on pattern discovery and data mining have a higher sensitivity compared to expert and knowledge-based systems. On the other hand, medical guidelines and evidence-based medicine should be base of diagnostics methods, therefore recommended to machine learning algorithms used in combination with knowledge-based algorithms.

Keywords: asthma, datamining, classification, machine learning

Procedia PDF Downloads 440
23417 Application of GPRS in Water Quality Monitoring System

Authors: V. Ayishwarya Bharathi, S. M. Hasker, J. Indhu, M. Mohamed Azarudeen, G. Gowthami, R. Vinoth Rajan, N. Vijayarangan

Abstract:

Identification of water quality conditions in a river system based on limited observations is an essential task for meeting the goals of environmental management. The traditional method of water quality testing is to collect samples manually and then send to laboratory for analysis. However, it has been unable to meet the demands of water quality monitoring today. So a set of automatic measurement and reporting system of water quality has been developed. In this project specifies Water quality parameters collected by multi-parameter water quality probe are transmitted to data processing and monitoring center through GPRS wireless communication network of mobile. The multi parameter sensor is directly placed above the water level. The monitoring center consists of GPRS and micro-controller which monitor the data. The collected data can be monitor at any instant of time. In the pollution control board they will monitor the water quality sensor data in computer using Visual Basic Software. The system collects, transmits and processes water quality parameters automatically, so production efficiency and economy benefit are improved greatly. GPRS technology can achieve well within the complex environment of poor water quality non-monitored, and more specifically applicable to the collection point, data transmission automatically generate the field of water analysis equipment data transmission and monitoring.

Keywords: multiparameter sensor, GPRS, visual basic software, RS232

Procedia PDF Downloads 399
23416 Decision Support System in Air Pollution Using Data Mining

Authors: E. Fathallahi Aghdam, V. Hosseini

Abstract:

Environmental pollution is not limited to a specific region or country; that is why sustainable development, as a necessary process for improvement, pays attention to issues such as destruction of natural resources, degradation of biological system, global pollution, and climate change in the world, especially in the developing countries. According to the World Health Organization, as a developing city, Tehran (capital of Iran) is one of the most polluted cities in the world in terms of air pollution. In this study, three pollutants including particulate matter less than 10 microns, nitrogen oxides, and sulfur dioxide were evaluated in Tehran using data mining techniques and through Crisp approach. The data from 21 air pollution measuring stations in different areas of Tehran were collected from 1999 to 2013. Commercial softwares Clementine was selected for this study. Tehran was divided into distinct clusters in terms of the mentioned pollutants using the software. As a data mining technique, clustering is usually used as a prologue for other analyses, therefore, the similarity of clusters was evaluated in this study through analyzing local conditions, traffic behavior, and industrial activities. In fact, the results of this research can support decision-making system, help managers improve the performance and decision making, and assist in urban studies.

Keywords: data mining, clustering, air pollution, crisp approach

Procedia PDF Downloads 423
23415 Test Suite Optimization Using an Effective Meta-Heuristic BAT Algorithm

Authors: Anuradha Chug, Sunali Gandhi

Abstract:

Regression Testing is a very expensive and time-consuming process carried out to ensure the validity of modified software. Due to the availability of insufficient resources to re-execute all the test cases in time constrained environment, efforts are going on to generate test data automatically without human efforts. Many search based techniques have been proposed to generate efficient, effective as well as optimized test data, so that the overall cost of the software testing can be minimized. The generated test data should be able to uncover all potential lapses that exist in the software or product. Inspired from the natural behavior of bat for searching her food sources, current study employed a meta-heuristic, search-based bat algorithm for optimizing the test data on the basis certain parameters without compromising their effectiveness. Mathematical functions are also applied that can effectively filter out the redundant test data. As many as 50 Java programs are used to check the effectiveness of proposed test data generation and it has been found that 86% saving in testing efforts can be achieved using bat algorithm while covering 100% of the software code for testing. Bat algorithm was found to be more efficient in terms of simplicity and flexibility when the results were compared with another nature inspired algorithms such as Firefly Algorithm (FA), Hill Climbing Algorithm (HC) and Ant Colony Optimization (ACO). The output of this study would be useful to testers as they can achieve 100% path coverage for testing with minimum number of test cases.

Keywords: regression testing, test case selection, test case prioritization, genetic algorithm, bat algorithm

Procedia PDF Downloads 366
23414 Low Cost Webcam Camera and GNSS Integration for Updating Home Data Using AI Principles

Authors: Mohkammad Nur Cahyadi, Hepi Hapsari Handayani, Agus Budi Raharjo, Ronny Mardianto, Daud Wahyu Imani, Arizal Bawazir, Luki Adi Triawan

Abstract:

PDAM (local water company) determines customer charges by considering the customer's building or house. Charges determination significantly affects PDAM income and customer costs because the PDAM applies a subsidy policy for customers classified as small households. Periodic updates are needed so that pricing is in line with the target. A thorough customer survey in Surabaya is needed to update customer building data. However, the survey that has been carried out so far has been by deploying officers to conduct one-by-one surveys for each PDAM customer. Surveys with this method require a lot of effort and cost. For this reason, this research offers a technology called moblie mapping, a mapping method that is more efficient in terms of time and cost. The use of this tool is also quite simple, where the device will be installed in the car so that it can record the surrounding buildings while the car is running. Mobile mapping technology generally uses lidar sensors equipped with GNSS, but this technology requires high costs. In overcoming this problem, this research develops low-cost mobile mapping technology using a webcam camera sensor added to the GNSS and IMU sensors. The camera used has specifications of 3MP with a resolution of 720 and a diagonal field of view of 78⁰. The principle of this invention is to integrate four camera sensors, a GNSS webcam, and GPS to acquire photo data, which is equipped with location data (latitude, longitude) and IMU (roll, pitch, yaw). This device is also equipped with a tripod and a vacuum cleaner to attach to the car's roof so it doesn't fall off while running. The output data from this technology will be analyzed with artificial intelligence to reduce similar data (Cosine Similarity) and then classify building types. Data reduction is used to eliminate similar data and maintain the image that displays the complete house so that it can be processed for later classification of buildings. The AI method used is transfer learning by utilizing a trained model named VGG-16. From the analysis of similarity data, it was found that the data reduction reached 50%. Then georeferencing is done using the Google Maps API to get address information according to the coordinates in the data. After that, geographic join is done to link survey data with customer data already owned by PDAM Surya Sembada Surabaya.

Keywords: mobile mapping, GNSS, IMU, similarity, classification

Procedia PDF Downloads 74
23413 An Investigation into the Views of Distant Science Education Students Regarding Teaching Laboratory Work Online

Authors: Abraham Motlhabane

Abstract:

This research analysed the written views of science education students regarding the teaching of laboratory work using the online mode. The research adopted the qualitative methodology. The qualitative research was aimed at investigating small and distinct groups normally regarded as a single-site study. Qualitative research was used to describe and analyze the phenomena from the student’s perspective. This means the research began with assumptions of the world view that use theoretical lenses of research problems inquiring into the meaning of individual students. The research was conducted with three groups of students studying for Postgraduate Certificate in Education, Bachelor of Education and honors Bachelor of Education respectively. In each of the study programmes, the science education module is compulsory. Five science education students from each study programme were purposively selected to participate in this research. Therefore, 15 students participated in the research. In order to analysis the data, the data were first printed and hard copies were used in the analysis. The data was read several times and key concepts and ideas were highlighted. Themes and patterns were identified to describe the data. Coding as a process of organising and sorting data was used. The findings of the study are very diverse; some students are in favour of online laboratory whereas other students argue that science can only be learnt through hands-on experimentation.

Keywords: online learning, laboratory work, views, perceptions

Procedia PDF Downloads 135
23412 A Profile of an Exercise Addict: The Relationship between Exercise Addiction and Personality

Authors: Klary Geisler, Dalit Lev-Arey, Yael Hacohen

Abstract:

It is a well-known fact that exercise has favorable effects on people's physical health, as well as mental well-being. However, as for as excessive exercise, it may likely elevate negative consequences (e.g., physical injuries, negligence of everyday responsibilities such as work, family life). Lately, there is a growing interest in exercise addiction, sometimes referred to as exercise dependence, which is defined as a craving for physical activity that results in extreme work-out sessions and generates negative physiological and psychological symptoms (e.g., withdrawal symptoms, tolerance, social conflict). Exercise addiction is considered a behavioral addiction, yet it was not included in the latest editions of the diagnostic and statistical manual of mental disorders (DSM-IV), due to lack of significant research. Specifically, there is scarce research on the relationship between exercise addiction and personality dimensions. The purpose of the current research was to examine the relationship between primary exercise addiction symptoms and the big five dimensions, perfectionism (high performance expectations and self-critical performance evaluations) and subjective affect. participants were 152 trainees on a variety of aerobic sports activities (running, cycling, swimming) that were recruited through sports groups and trainers. 88% of participants trained for at least 5 hours per week, 24% of the participants trained above 10 hours per week. To test the predictive ability of the IVs a hierarchical linear regression with forced block entry was performed. It was found that Neuroticism significantly predicted exercise addiction symptoms (20% of the variance, p<0.001), while consciousness was negatively correlated with exercise addiction symptoms (14% of variance p<0.05); both had a unique contribution. Other dimensions of the big five (agreeableness, openness and extraversion) did not have any contribution to the dependent. Moreover, maladaptive perfectionism (self-critical performance evaluations) significantly predicted exercise addiction symptoms as well (10% of the variance P < 0.05). The overall regression model explained 54% of variance.

Keywords: big five, consciousness, excessive exercise, exercise addiction, neuroticism, perfectionism, personality

Procedia PDF Downloads 215
23411 The Communication Library DIALOG for iFDAQ of the COMPASS Experiment

Authors: Y. Bai, M. Bodlak, V. Frolov, S. Huber, V. Jary, I. Konorov, D. Levit, J. Novy, D. Steffen, O. Subrt, M. Virius

Abstract:

Modern experiments in high energy physics impose great demands on the reliability, the efficiency, and the data rate of Data Acquisition Systems (DAQ). This contribution focuses on the development and deployment of the new communication library DIALOG for the intelligent, FPGA-based Data Acquisition System (iFDAQ) of the COMPASS experiment at CERN. The iFDAQ utilizing a hardware event builder is designed to be able to readout data at the maximum rate of the experiment. The DIALOG library is a communication system both for distributed and mixed environments, it provides a network transparent inter-process communication layer. Using the high-performance and modern C++ framework Qt and its Qt Network API, the DIALOG library presents an alternative to the previously used DIM library. The DIALOG library was fully incorporated to all processes in the iFDAQ during the run 2016. From the software point of view, it might be considered as a significant improvement of iFDAQ in comparison with the previous run. To extend the possibilities of debugging, the online monitoring of communication among processes via DIALOG GUI is a desirable feature. In the paper, we present the DIALOG library from several insights and discuss it in a detailed way. Moreover, the efficiency measurement and comparison with the DIM library with respect to the iFDAQ requirements is provided.

Keywords: data acquisition system, DIALOG library, DIM library, FPGA, Qt framework, TCP/IP

Procedia PDF Downloads 313
23410 Mining Scientific Literature to Discover Potential Research Data Sources: An Exploratory Study in the Field of Haemato-Oncology

Authors: A. Anastasiou, K. S. Tingay

Abstract:

Background: Discovering suitable datasets is an important part of health research, particularly for projects working with clinical data from patients organized in cohorts (cohort data), but with the proliferation of so many national and international initiatives, it is becoming increasingly difficult for research teams to locate real world datasets that are most relevant to their project objectives. We present a method for identifying healthcare institutes in the European Union (EU) which may hold haemato-oncology (HO) data. A key enabler of this research was the bibInsight platform, a scientometric data management and analysis system developed by the authors at Swansea University. Method: A PubMed search was conducted using HO clinical terms taken from previous work. The resulting XML file was processed using the bibInsight platform, linking affiliations to the Global Research Identifier Database (GRID). GRID is an international, standardized list of institutions, including the city and country in which the institution exists, as well as a category of the main business type, e.g., Academic, Healthcare, Government, Company. Countries were limited to the 28 current EU members, and institute type to 'Healthcare'. An article was considered valid if at least one author was affiliated with an EU-based healthcare institute. Results: The PubMed search produced 21,310 articles, consisting of 9,885 distinct affiliations with correspondence in GRID. Of these articles, 760 were from EU countries, and 390 of these were healthcare institutes. One affiliation was excluded as being a veterinary hospital. Two EU countries did not have any publications in our analysis dataset. The results were analysed by country and by individual healthcare institute. Networks both within the EU and internationally show institutional collaborations, which may suggest a willingness to share data for research purposes. Geographical mapping can ensure that data has broad population coverage. Collaborations with industry or government may exclude healthcare institutes that may have embargos or additional costs associated with data access. Conclusions: Data reuse is becoming increasingly important both for ensuring the validity of results, and economy of available resources. The ability to identify potential, specific data sources from over twenty thousand articles in less than an hour could assist in improving knowledge of, and access to, data sources. As our method has not yet specified if these healthcare institutes are holding data, or merely publishing on that topic, future work will involve text mining of data-specific concordant terms to identify numbers of participants, demographics, study methodologies, and sub-topics of interest.

Keywords: data reuse, data discovery, data linkage, journal articles, text mining

Procedia PDF Downloads 110
23409 Using Data Mining Technique for Scholarship Disbursement

Authors: J. K. Alhassan, S. A. Lawal

Abstract:

This work is on decision tree-based classification for the disbursement of scholarship. Tree-based data mining classification technique is used in other to determine the generic rule to be used to disburse the scholarship. The system based on the defined rules from the tree is able to determine the class (status) to which an applicant shall belong whether Granted or Not Granted. The applicants that fall to the class of granted denote a successful acquirement of scholarship while those in not granted class are unsuccessful in the scheme. An algorithm that can be used to classify the applicants based on the rules from tree-based classification was also developed. The tree-based classification is adopted because of its efficiency, effectiveness, and easy to comprehend features. The system was tested with the data of National Information Technology Development Agency (NITDA) Abuja, a Parastatal of Federal Ministry of Communication Technology that is mandated to develop and regulate information technology in Nigeria. The system was found working according to the specification. It is therefore recommended for all scholarship disbursement organizations.

Keywords: classification, data mining, decision tree, scholarship

Procedia PDF Downloads 361
23408 [Keynote Speech]: Feature Selection and Predictive Modeling of Housing Data Using Random Forest

Authors: Bharatendra Rai

Abstract:

Predictive data analysis and modeling involving machine learning techniques become challenging in presence of too many explanatory variables or features. Presence of too many features in machine learning is known to not only cause algorithms to slow down, but they can also lead to decrease in model prediction accuracy. This study involves housing dataset with 79 quantitative and qualitative features that describe various aspects people consider while buying a new house. Boruta algorithm that supports feature selection using a wrapper approach build around random forest is used in this study. This feature selection process leads to 49 confirmed features which are then used for developing predictive random forest models. The study also explores five different data partitioning ratios and their impact on model accuracy are captured using coefficient of determination (r-square) and root mean square error (rsme).

Keywords: housing data, feature selection, random forest, Boruta algorithm, root mean square error

Procedia PDF Downloads 313
23407 Image-Based (RBG) Technique for Estimating Phosphorus Levels of Different Crops

Authors: M. M. Ali, Ahmed Al- Ani, Derek Eamus, Daniel K. Y. Tan

Abstract:

In this glasshouse study, we developed the new image-based non-destructive technique for detecting leaf P status of different crops such as cotton, tomato and lettuce. Plants were allowed to grow on nutrient media containing different P concentrations, i.e. 0%, 50% and 100% of recommended P concentration (P0 = no P, L; P1 = 2.5 mL 10 L-1 of P and P2 = 5 mL 10 L-1 of P as NaH2PO4). After 10 weeks of growth, plants were harvested and data on leaf P contents were collected using the standard destructive laboratory method and at the same time leaf images were collected by a handheld crop image sensor. We calculated leaf area, leaf perimeter and RGB (red, green and blue) values of these images. This data was further used in the linear discriminant analysis (LDA) to estimate leaf P contents, which successfully classified these plants on the basis of leaf P contents. The data indicated that P deficiency in crop plants can be predicted using the image and morphological data. Our proposed non-destructive imaging method is precise in estimating P requirements of different crop species.

Keywords: image-based techniques, leaf area, leaf P contents, linear discriminant analysis

Procedia PDF Downloads 371
23406 Early Modern Controversies of Mobility within the Spanish Empire: Francisco De Vitoria and the Peaceful Right to Travel

Authors: Beatriz Salamanca

Abstract:

In his public lecture ‘On the American Indians’ given at the University of Salamanca in 1538-39, Francisco de Vitoria presented an unsettling defense of freedom of movement, arguing that the Spanish had the right to travel and dwell in the New World, since it was considered part of the law of nations [ius gentium] that men enjoyed free mutual intercourse anywhere they went. The principle of freedom of movement brought hopeful expectations, promising to bring mankind together and strengthen the ties of fraternity. However, it led to polemical situations when those whose mobility was in question represented a harmful threat or was for some reason undesired. In this context, Vitoria’s argument has been seen on multiple occasions as a justification of the expansion of the Spanish empire. In order to examine the meaning of Vitoria’s defense of free mobility, a more detailed look at Vitoria’s text is required, together with the study of some of his earliest works, among them, his commentaries on Thomas Aquinas’s Summa Theologiae, where he presented relevant insights on the idea of the law of nations. In addition, it is necessary to place Vitoria’s work in the context of the intellectual tradition he belonged to and the responses he obtained from some of his contemporaries who were concerned with similar issues. The claim of this research is that the Spanish right to travel advocated by Vitoria was not intended to be interpreted in absolute terms, for it had to serve the purpose of bringing peace and unity among men, and could not contradict natural law. In addition, Vitoria explicitly observed that the right to travel was only valid if the Spaniards caused no harm, a condition that has been underestimated by his critics. Therefore, Vitoria’s legacy is of enormous value as it initiated a long lasting discussion regarding the question of the grounds under which human mobility could be restricted. Again, under Vitoria’s argument it was clear that this freedom was not absolute, but the controversial nature of his defense of Spanish mobility demonstrates how difficult it was and still is to address the issue of the circulation of peoples across frontiers, and shows the significance of this discussion in today’s globalized world, where the rights and wrongs of notions like immigration, international trade or foreign intervention still lack sufficient consensus. This inquiry about Vitoria’s defense of the principle of freedom of movement is being placed here against the background of the history of political thought, political theory, international law, and international relations, following the methodological framework of contextual history of the ‘Cambridge School’.

Keywords: Francisco de Vitoria, freedom of movement, law of nations, ius gentium, Spanish empire

Procedia PDF Downloads 359
23405 Semi-Automatic Segmentation of Mitochondria on Transmission Electron Microscopy Images Using Live-Wire and Surface Dragging Methods

Authors: Mahdieh Farzin Asanjan, Erkan Unal Mumcuoglu

Abstract:

Mitochondria are cytoplasmic organelles of the cell, which have a significant role in the variety of cellular metabolic functions. Mitochondria act as the power plants of the cell and are surrounded by two membranes. Significant morphological alterations are often due to changes in mitochondrial functions. A powerful technique in order to study the three-dimensional (3D) structure of mitochondria and its alterations in disease states is Electron microscope tomography. Detection of mitochondria in electron microscopy images due to the presence of various subcellular structures and imaging artifacts is a challenging problem. Another challenge is that each image typically contains more than one mitochondrion. Hand segmentation of mitochondria is tedious and time-consuming and also special knowledge about the mitochondria is needed. Fully automatic segmentation methods lead to over-segmentation and mitochondria are not segmented properly. Therefore, semi-automatic segmentation methods with minimum manual effort are required to edit the results of fully automatic segmentation methods. Here two editing tools were implemented by applying spline surface dragging and interactive live-wire segmentation tools. These editing tools were applied separately to the results of fully automatic segmentation. 3D extension of these tools was also studied and tested. Dice coefficients of 2D and 3D for surface dragging using splines were 0.93 and 0.92. This metric for 2D and 3D for live-wire method were 0.94 and 0.91 respectively. The root mean square symmetric surface distance values of 2D and 3D for surface dragging was measured as 0.69, 0.93. The same metrics for live-wire tool were 0.60 and 2.11. Comparing the results of these editing tools with the results of automatic segmentation method, it shows that these editing tools, led to better results and these results were more similar to ground truth image but the required time was higher than hand-segmentation time

Keywords: medical image segmentation, semi-automatic methods, transmission electron microscopy, surface dragging using splines, live-wire

Procedia PDF Downloads 163
23404 Design of Visual Repository, Constraint and Process Modeling Tool Based on Eclipse Plug-Ins

Authors: Rushiraj Heshi, Smriti Bhandari

Abstract:

Master Data Management requires creation of Central repository, applying constraints on Repository and designing processes to manage data. Designing of Repository, constraints on repository and business processes is very tedious and time consuming task for large Enterprise. Hence Visual Repository, constraints and Process (Workflow) modeling is the most critical step in Master Data Management.In this paper, we realize a Visual Modeling tool for implementing Repositories, Constraints and Processes based on Eclipse Plugin using GMF/EMF which follows principles of Model Driven Engineering (MDE).

Keywords: EMF, GMF, GEF, repository, constraint, process

Procedia PDF Downloads 487
23403 The Classification Performance in Parametric and Nonparametric Discriminant Analysis for a Class- Unbalanced Data of Diabetes Risk Groups

Authors: Lily Ingsrisawang, Tasanee Nacharoen

Abstract:

Introduction: The problems of unbalanced data sets generally appear in real world applications. Due to unequal class distribution, many research papers found that the performance of existing classifier tends to be biased towards the majority class. The k -nearest neighbors’ nonparametric discriminant analysis is one method that was proposed for classifying unbalanced classes with good performance. Hence, the methods of discriminant analysis are of interest to us in investigating misclassification error rates for class-imbalanced data of three diabetes risk groups. Objective: The purpose of this study was to compare the classification performance between parametric discriminant analysis and nonparametric discriminant analysis in a three-class classification application of class-imbalanced data of diabetes risk groups. Methods: Data from a healthy project for 599 staffs in a government hospital in Bangkok were obtained for the classification problem. The staffs were diagnosed into one of three diabetes risk groups: non-risk (90%), risk (5%), and diabetic (5%). The original data along with the variables; diabetes risk group, age, gender, cholesterol, and BMI was analyzed and bootstrapped up to 50 and 100 samples, 599 observations per sample, for additional estimation of misclassification error rate. Each data set was explored for the departure of multivariate normality and the equality of covariance matrices of the three risk groups. Both the original data and the bootstrap samples show non-normality and unequal covariance matrices. The parametric linear discriminant function, quadratic discriminant function, and the nonparametric k-nearest neighbors’ discriminant function were performed over 50 and 100 bootstrap samples and applied to the original data. In finding the optimal classification rule, the choices of prior probabilities were set up for both equal proportions (0.33: 0.33: 0.33) and unequal proportions with three choices of (0.90:0.05:0.05), (0.80: 0.10: 0.10) or (0.70, 0.15, 0.15). Results: The results from 50 and 100 bootstrap samples indicated that the k-nearest neighbors approach when k = 3 or k = 4 and the prior probabilities of {non-risk:risk:diabetic} as {0.90:0.05:0.05} or {0.80:0.10:0.10} gave the smallest error rate of misclassification. Conclusion: The k-nearest neighbors approach would be suggested for classifying a three-class-imbalanced data of diabetes risk groups.

Keywords: error rate, bootstrap, diabetes risk groups, k-nearest neighbors

Procedia PDF Downloads 429
23402 BFDD-S: Big Data Framework to Detect and Mitigate DDoS Attack in SDN Network

Authors: Amirreza Fazely Hamedani, Muzzamil Aziz, Philipp Wieder, Ramin Yahyapour

Abstract:

Software-defined networking in recent years came into the sight of so many network designers as a successor to the traditional networking. Unlike traditional networks where control and data planes engage together within a single device in the network infrastructure such as switches and routers, the two planes are kept separated in software-defined networks (SDNs). All critical decisions about packet routing are made on the network controller, and the data level devices forward the packets based on these decisions. This type of network is vulnerable to DDoS attacks, degrading the overall functioning and performance of the network by continuously injecting the fake flows into it. This increases substantial burden on the controller side, and the result ultimately leads to the inaccessibility of the controller and the lack of network service to the legitimate users. Thus, the protection of this novel network architecture against denial of service attacks is essential. In the world of cybersecurity, attacks and new threats emerge every day. It is essential to have tools capable of managing and analyzing all this new information to detect possible attacks in real-time. These tools should provide a comprehensive solution to automatically detect, predict and prevent abnormalities in the network. Big data encompasses a wide range of studies, but it mainly refers to the massive amounts of structured and unstructured data that organizations deal with on a regular basis. On the other hand, it regards not only the volume of the data; but also that how data-driven information can be used to enhance decision-making processes, security, and the overall efficiency of a business. This paper presents an intelligent big data framework as a solution to handle illegitimate traffic burden on the SDN network created by the numerous DDoS attacks. The framework entails an efficient defence and monitoring mechanism against DDoS attacks by employing the state of the art machine learning techniques.

Keywords: apache spark, apache kafka, big data, DDoS attack, machine learning, SDN network

Procedia PDF Downloads 164
23401 Welding Process Selection for Storage Tank by Integrated Data Envelopment Analysis and Fuzzy Credibility Constrained Programming Approach

Authors: Rahmad Wisnu Wardana, Eakachai Warinsiriruk, Sutep Joy-A-Ka

Abstract:

Selecting the most suitable welding process usually depends on experiences or common application in similar companies. However, this approach generally ignores many criteria that can be affecting the suitable welding process selection. Therefore, knowledge automation through knowledge-based systems will significantly improve the decision-making process. The aims of this research propose integrated data envelopment analysis (DEA) and fuzzy credibility constrained programming approach for identifying the best welding process for stainless steel storage tank in the food and beverage industry. The proposed approach uses fuzzy concept and credibility measure to deal with uncertain data from experts' judgment. Furthermore, 12 parameters are used to determine the most appropriate welding processes among six competitive welding processes.

Keywords: welding process selection, data envelopment analysis, fuzzy credibility constrained programming, storage tank

Procedia PDF Downloads 160
23400 Myosin-Driven Movement of Nanoparticles – An Approach to High-Speed Tracking

Authors: Sneha Kumari, Ravi Krishnan Elangovan

Abstract:

This abstract describes the development of a high-speed tracking method by modification in motor components for nanoparticle attachment. Myosin motors are nano-sized protein machines powering movement that defines life. These miniature molecular devices serve as engines utilizing chemical energy stored in ATP to produce useful mechanical energy in the form of a few nanometre displacement events leading to force generation that is required for cargo transport, cell division, cell locomotion, translated to macroscopic movements like running etc. With the advent of in vitro motility assay (IVMA), detailed functional studies of the actomyosin system could be performed. The major challenge with the currently available IVMA for tracking actin filaments is a resolution limitation of ± 50nm. To overcome this, we are trying to develop Single Molecule IVMA in which nanoparticle (GNP/QD) will be attached along or on the barbed end of actin filaments using CapZ protein and visualization by a compact TIRF module called ‘cTIRF’. The waveguide-based illumination by cTIRF offers a unique separation of excitation and collection optics, enabling imaging by scattering without emission filters. So, this technology is well equipped to perform tracking with high precision in temporal resolution of 2ms with significantly improved SNR by 100-fold as compared to conventional TIRF. Also, the nanoparticles (QD/GNP) attached to actin filament act as a point source of light coffering ease in filament tracking compared to conventional manual tracking. Moreover, the attachment of cargo (QD/GNP) to the thin filament paves the way for various nano-technological applications through their transportation to different predetermined locations on the chip

Keywords: actin, cargo, IVMA, myosin motors and single-molecule system

Procedia PDF Downloads 79
23399 On the Estimation of Crime Rate in the Southwest of Nigeria: Principal Component Analysis Approach

Authors: Kayode Balogun, Femi Ayoola

Abstract:

Crime is at alarming rate in this part of world and there are many factors that are contributing to this antisocietal behaviour both among the youths and old. In this work, principal component analysis (PCA) was used as a tool to reduce the dimensionality and to really know those variables that were crime prone in the study region. Data were collected on twenty-eight crime variables from National Bureau of Statistics (NBS) databank for a period of fifteen years, while retaining as much of the information as possible. We use PCA in this study to know the number of major variables and contributors to the crime in the Southwest Nigeria. The results of our analysis revealed that there were eight principal variables have been retained using the Scree plot and Loading plot which implies an eight-equation solution will be appropriate for the data. The eight components explained 93.81% of the total variation in the data set. We also found that the highest and commonly committed crimes in the Southwestern Nigeria were: Assault, Grievous Harm and Wounding, theft/stealing, burglary, house breaking, false pretence, unlawful arms possession and breach of public peace.

Keywords: crime rates, data, Southwest Nigeria, principal component analysis, variables

Procedia PDF Downloads 435
23398 On-Line Data-Driven Multivariate Statistical Prediction Approach to Production Monitoring

Authors: Hyun-Woo Cho

Abstract:

Detection of incipient abnormal events in production processes is important to improve safety and reliability of manufacturing operations and reduce losses caused by failures. The construction of calibration models for predicting faulty conditions is quite essential in making decisions on when to perform preventive maintenance. This paper presents a multivariate calibration monitoring approach based on the statistical analysis of process measurement data. The calibration model is used to predict faulty conditions from historical reference data. This approach utilizes variable selection techniques, and the predictive performance of several prediction methods are evaluated using real data. The results shows that the calibration model based on supervised probabilistic model yielded best performance in this work. By adopting a proper variable selection scheme in calibration models, the prediction performance can be improved by excluding non-informative variables from their model building steps.

Keywords: calibration model, monitoring, quality improvement, feature selection

Procedia PDF Downloads 347
23397 Multilevel Gray Scale Image Encryption through 2D Cellular Automata

Authors: Rupali Bhardwaj

Abstract:

Cryptography is the science of using mathematics to encrypt and decrypt data; the data are converted into some other gibberish form, and then the encrypted data are transmitted. The primary purpose of this paper is to provide two levels of security through a two-step process, rather than transmitted the message bits directly, first encrypted it using 2D cellular automata and then scrambled with Arnold Cat Map transformation; it provides an additional layer of protection and reduces the chance of the transmitted message being detected. A comparative analysis on effectiveness of scrambling technique is provided by scrambling degree measurement parameters i.e. Gray Difference Degree (GDD) and Correlation Coefficient.

Keywords: scrambling, cellular automata, Arnold cat map, game of life, gray difference degree, correlation coefficient

Procedia PDF Downloads 369
23396 Lean Comic GAN (LC-GAN): a Light-Weight GAN Architecture Leveraging Factorized Convolution and Teacher Forcing Distillation Style Loss Aimed to Capture Two Dimensional Animated Filtered Still Shots Using Mobile Phone Camera and Edge Devices

Authors: Kaustav Mukherjee

Abstract:

In this paper we propose a Neural Style Transfer solution whereby we have created a Lightweight Separable Convolution Kernel Based GAN Architecture (SC-GAN) which will very useful for designing filter for Mobile Phone Cameras and also Edge Devices which will convert any image to its 2D ANIMATED COMIC STYLE Movies like HEMAN, SUPERMAN, JUNGLE-BOOK. This will help the 2D animation artist by relieving to create new characters from real life person's images without having to go for endless hours of manual labour drawing each and every pose of a cartoon. It can even be used to create scenes from real life images.This will reduce a huge amount of turn around time to make 2D animated movies and decrease cost in terms of manpower and time. In addition to that being extreme light-weight it can be used as camera filters capable of taking Comic Style Shots using mobile phone camera or edge device cameras like Raspberry Pi 4,NVIDIA Jetson NANO etc. Existing Methods like CartoonGAN with the model size close to 170 MB is too heavy weight for mobile phones and edge devices due to their scarcity in resources. Compared to the current state of the art our proposed method which has a total model size of 31 MB which clearly makes it ideal and ultra-efficient for designing of camera filters on low resource devices like mobile phones, tablets and edge devices running OS or RTOS. .Owing to use of high resolution input and usage of bigger convolution kernel size it produces richer resolution Comic-Style Pictures implementation with 6 times lesser number of parameters and with just 25 extra epoch trained on a dataset of less than 1000 which breaks the myth that all GAN need mammoth amount of data. Our network reduces the density of the Gan architecture by using Depthwise Separable Convolution which does the convolution operation on each of the RGB channels separately then we use a Point-Wise Convolution to bring back the network into required channel number using 1 by 1 kernel.This reduces the number of parameters substantially and makes it extreme light-weight and suitable for mobile phones and edge devices. The architecture mentioned in the present paper make use of Parameterised Batch Normalization Goodfellow etc al. (Deep Learning OPTIMIZATION FOR TRAINING DEEP MODELS page 320) which makes the network to use the advantage of Batch Norm for easier training while maintaining the non-linear feature capture by inducing the learnable parameters

Keywords: comic stylisation from camera image using GAN, creating 2D animated movie style custom stickers from images, depth-wise separable convolutional neural network for light-weight GAN architecture for EDGE devices, GAN architecture for 2D animated cartoonizing neural style, neural style transfer for edge, model distilation, perceptual loss

Procedia PDF Downloads 126
23395 Survey Based Data Security Evaluation in Pakistan Financial Institutions against Malicious Attacks

Authors: Naveed Ghani, Samreen Javed

Abstract:

In today’s heterogeneous network environment, there is a growing demand for distrust clients to jointly execute secure network to prevent from malicious attacks as the defining task of propagating malicious code is to locate new targets to attack. Residual risk is always there no matter what solutions are implemented or whet so ever security methodology or standards being adapted. Security is the first and crucial phase in the field of Computer Science. The main aim of the Computer Security is gathering of information with secure network. No one need wonder what all that malware is trying to do: It's trying to steal money through data theft, bank transfers, stolen passwords, or swiped identities. From there, with the help of our survey we learn about the importance of white listing, antimalware programs, security patches, log files, honey pots, and more used in banks for financial data protection but there’s also a need of implementing the IPV6 tunneling with Crypto data transformation according to the requirements of new technology to prevent the organization from new Malware attacks and crafting of its own messages and sending them to the target. In this paper the writer has given the idea of implementing IPV6 Tunneling Secessions on private data transmission from financial organizations whose secrecy needed to be safeguarded.

Keywords: network worms, malware infection propagating malicious code, virus, security, VPN

Procedia PDF Downloads 351
23394 Interactive IoT-Blockchain System for Big Data Processing

Authors: Abdallah Al-ZoubI, Mamoun Dmour

Abstract:

The spectrum of IoT devices is becoming widely diversified, entering almost all possible fields and finding applications in industry, health, finance, logistics, education, to name a few. The IoT active endpoint sensors and devices exceeded the 12 billion mark in 2021 and are expected to reach 27 billion in 2025, with over $34 billion in total market value. This sheer rise in numbers and use of IoT devices bring with it considerable concerns regarding data storage, analysis, manipulation and protection. IoT Blockchain-based systems have recently been proposed as a decentralized solution for large-scale data storage and protection. COVID-19 has actually accelerated the desire to utilize IoT devices as it impacted both demand and supply and significantly affected several regions due to logistic reasons such as supply chain interruptions, shortage of shipping containers and port congestion. An IoT-blockchain system is proposed to handle big data generated by a distributed network of sensors and controllers in an interactive manner. The system is designed using the Ethereum platform, which utilizes smart contracts, programmed in solidity to execute and manage data generated by IoT sensors and devices. such as Raspberry Pi 4, Rasbpian, and add-on hardware security modules. The proposed system will run a number of applications hosted by a local machine used to validate transactions. It then sends data to the rest of the network through InterPlanetary File System (IPFS) and Ethereum Swarm, forming a closed IoT ecosystem run by blockchain where a number of distributed IoT devices can communicate and interact, thus forming a closed, controlled environment. A prototype has been deployed with three IoT handling units distributed over a wide geographical space in order to examine its feasibility, performance and costs. Initial results indicated that big IoT data retrieval and storage is feasible and interactivity is possible, provided that certain conditions of cost, speed and thorough put are met.

Keywords: IoT devices, blockchain, Ethereum, big data

Procedia PDF Downloads 143
23393 Keynote Talk: The Role of Internet of Things in the Smart Cities Power System

Authors: Abdul-Rahman Al-Ali

Abstract:

As the number of mobile devices is growing exponentially, it is estimated to connect about 50 million devices to the Internet by the year 2020. At the end of this decade, it is expected that an average of eight connected devices per person worldwide. The 50 billion devices are not mobile phones and data browsing gadgets only, but machine-to-machine and man-to-machine devices. With such growing numbers of devices the Internet of Things (I.o.T) concept is one of the emerging technologies as of recently. Within the smart grid technologies, smart home appliances, Intelligent Electronic Devices (IED) and Distributed Energy Resources (DER) are major I.o.T objects that can be addressable using the IPV6. These objects are called the smart grid internet of things (SG-I.o.T). The SG-I.o.T generates big data that requires high-speed computing infrastructure, widespread computer networks, big data storage, software, and platforms services. A company’s utility control and data centers cannot handle such a large number of devices, high-speed processing, and massive data storage. Building large data center’s infrastructure takes a long time, it also requires widespread communication networks and huge capital investment. To maintain and upgrade control and data centers’ infrastructure and communication networks as well as updating and renewing software licenses which collectively, requires additional cost. This can be overcome by utilizing the emerging computing paradigms such as cloud computing. This can be used as a smart grid enabler to replace the legacy of utilities data centers. The talk will highlight the role of I.o.T, cloud computing services and their development models within the smart grid technologies.

Keywords: intelligent electronic devices (IED), distributed energy resources (DER), internet, smart home appliances

Procedia PDF Downloads 315
23392 Statistical Analysis of Interferon-γ for the Effectiveness of an Anti-Tuberculous Treatment

Authors: Shishen Xie, Yingda L. Xie

Abstract:

Tuberculosis (TB) is a potentially serious infectious disease that remains a health concern. The Interferon Gamma Release Assay (IGRA) is a blood test to find out if an individual is tuberculous positive or negative. This study applies statistical analysis to the clinical data of interferon-gamma levels of seventy-three subjects who diagnosed pulmonary TB in an anti-tuberculous treatment. Data analysis is performed to determine if there is a significant decline in interferon-gamma levels for the subjects during a period of six months, and to infer if the anti-tuberculous treatment is effective.

Keywords: data analysis, interferon gamma release assay, statistical methods, tuberculosis infection

Procedia PDF Downloads 298
23391 Short Text Classification Using Part of Speech Feature to Analyze Students' Feedback of Assessment Components

Authors: Zainab Mutlaq Ibrahim, Mohamed Bader-El-Den, Mihaela Cocea

Abstract:

Students' textual feedback can hold unique patterns and useful information about learning process, it can hold information about advantages and disadvantages of teaching methods, assessment components, facilities, and other aspects of teaching. The results of analysing such a feedback can form a key point for institutions’ decision makers to advance and update their systems accordingly. This paper proposes a data mining framework for analysing end of unit general textual feedback using part of speech feature (PoS) with four machine learning algorithms: support vector machines, decision tree, random forest, and naive bays. The proposed framework has two tasks: first, to use the above algorithms to build an optimal model that automatically classifies the whole data set into two subsets, one subset is tailored to assessment practices (assessment related), and the other one is the non-assessment related data. Second task to use the same algorithms to build an optimal model for whole data set, and the new data subsets to automatically detect their sentiment. The significance of this paper is to compare the performance of the above four algorithms using part of speech feature to the performance of the same algorithms using n-grams feature. The paper follows Knowledge Discovery and Data Mining (KDDM) framework to construct the classification and sentiment analysis models, which is understanding the assessment domain, cleaning and pre-processing the data set, selecting and running the data mining algorithm, interpreting mined patterns, and consolidating the discovered knowledge. The results of this paper experiments show that both models which used both features performed very well regarding first task. But regarding the second task, models that used part of speech feature has underperformed in comparison with models that used unigrams and bigrams.

Keywords: assessment, part of speech, sentiment analysis, student feedback

Procedia PDF Downloads 135
23390 Fast Fourier Transform-Based Steganalysis of Covert Communications over Streaming Media

Authors: Jinghui Peng, Shanyu Tang, Jia Li

Abstract:

Steganalysis seeks to detect the presence of secret data embedded in cover objects, and there is an imminent demand to detect hidden messages in streaming media. This paper shows how a steganalysis algorithm based on Fast Fourier Transform (FFT) can be used to detect the existence of secret data embedded in streaming media. The proposed algorithm uses machine parameter characteristics and a network sniffer to determine whether the Internet traffic contains streaming channels. The detected streaming data is then transferred from the time domain to the frequency domain through FFT. The distributions of power spectra in the frequency domain between original VoIP streams and stego VoIP streams are compared in turn using t-test, achieving the p-value of 7.5686E-176 which is below the threshold. The results indicate that the proposed FFT-based steganalysis algorithm is effective in detecting the secret data embedded in VoIP streaming media.

Keywords: steganalysis, security, Fast Fourier Transform, streaming media

Procedia PDF Downloads 138