Search results for: data mapping
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25757

23867 Automatic Tagging and Accuracy in Assamese Text Data

Authors: Chayanika Hazarika Bordoloi

Abstract:

This paper is an attempt to work on Assamese, a highly inflectional language. It is also one of the national languages of India, and very little has been achieved for it in terms of computational research. Building a language processing tool for a natural language is not straightforward, as the standards and language representation change at various levels. This paper presents the inflectional suffixes of Assamese verbs and shows how statistical tools, along with linguistic features, can improve tagging accuracy. A conditional random fields (CRF) tool was used to automatically tag and train the text data; however, accuracy improved after linguistic features were fed into the training data. Because Assamese is highly inflectional, it is challenging to standardize its morphology. Inflectional suffixes are used as a feature of the text data. In order to analyze the inflections of Assamese word forms, a list was prepared comprising all possible suffixes that the various categories can take. Assamese words can be classified into inflected classes (noun, pronoun, adjective and verb) and uninflected classes (adverb and particle). The corpus used for this morphological analysis contains a very large number of tokens. It is a mixed corpus, and it has given satisfactory accuracy. The accuracy rate of the tagger has gradually improved with the modified training data.

Keywords: CRF, morphology, tagging, tagset

Procedia PDF Downloads 194
23866 A Human Activity Recognition System Based on Sensory Data Related to Object Usage

Authors: M. Abdullah, Al-Wadud

Abstract:

Sensor-based activity recognition systems usually account for which sensors have been activated to perform an activity. The system then combines the conditional probabilities of those sensors to represent different activities and makes its decision based on that. However, information about the sensors that are not activated may also be of great help in deciding which activity has been performed. This paper proposes an approach in which the sensory data related to both usage and non-usage of objects are utilized to classify activities. Experimental results also show the promising performance of the proposed method.

Keywords: Naïve Bayesian-based classification, activity recognition, sensor data, object-usage model

Procedia PDF Downloads 322
23865 Application of Post-Stack and Pre-Stack Seismic Inversion for Prediction of Hydrocarbon Reservoirs in a Persian Gulf Gas Field

Authors: Nastaran Moosavi, Mohammad Mokhtari

Abstract:

Seismic inversion is a technique that has been in use for years; its main goal is to estimate and model the physical characteristics of rocks and fluids. Generally, it combines seismic and well-log data. Seismic inversion can be carried out through different methods; we have conducted and compared post-stack and pre-stack seismic inversion methods on real data from one of the fields in the Persian Gulf. Pre-stack seismic inversion can transform seismic data into rock physics properties such as P-impedance, S-impedance and density, while post-stack seismic inversion can estimate only P-impedance. These parameters can then be used in reservoir identification. Based on the results of inverting seismic data, a gas reservoir was detected in one of the hydrocarbon fields in the south of Iran (Persian Gulf). Comparing post-stack and pre-stack seismic inversion, it can be concluded that pre-stack seismic inversion provides more reliable and detailed information for the identification and prediction of hydrocarbon reservoirs.

Keywords: density, p-impedance, s-impedance, post-stack seismic inversion, pre-stack seismic inversion

Procedia PDF Downloads 324
23864 A Data-Driven Monitoring Technique Using Combined Anomaly Detectors

Authors: Fouzi Harrou, Ying Sun, Sofiane Khadraoui

Abstract:

Anomaly detection based on Principal Component Analysis (PCA) has been studied intensively and widely applied to multivariate processes with highly cross-correlated process variables. Monitoring metrics such as Hotelling's T2 and the Q statistic are usually used in PCA-based monitoring to elucidate the pattern variations in the principal and residual subspaces, respectively. However, these metrics are ill-suited to detecting small faults. In this paper, Exponentially Weighted Moving Average (EWMA) schemes based on the T2 and Q statistics, T2-EWMA and Q-EWMA, were developed for detecting faults in the process mean. The performance of the proposed methods was compared with that of the conventional PCA-based fault detection method using synthetic data. The results clearly show the benefit and effectiveness of the proposed methods over the conventional PCA method, especially for detecting small faults in highly correlated multivariate data.
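
To illustrate why EWMA smoothing helps with small faults, the following is a minimal Python sketch, not the authors' implementation. It assumes the monitoring statistic (for example Q) has already been computed for each sample; the PCA step is omitted. The function names, the smoothing weight `lam = 0.2`, the limit width of 3, and the toy numbers are all hypothetical, and the asymptotic control limit treats the statistic as roughly normal, which is an approximation for Q.

```python
from statistics import mean, stdev

def ewma(values, lam, start):
    """Return the EWMA sequence z_t = lam * x_t + (1 - lam) * z_{t-1}."""
    z = start
    smoothed = []
    for x in values:
        z = lam * x + (1.0 - lam) * z
        smoothed.append(z)
    return smoothed

def ewma_alarms(train, test, lam=0.2, width=3.0):
    """Flag test samples whose EWMA exceeds the asymptotic upper limit.

    The in-control mean and standard deviation are estimated from `train`
    (fault-free data). Returns (alarm indices, upper control limit).
    """
    mu0 = mean(train)
    sigma0 = stdev(train)
    ucl = mu0 + width * sigma0 * (lam / (2.0 - lam)) ** 0.5
    smoothed = ewma(test, lam, start=mu0)
    return [i for i, z in enumerate(smoothed) if z > ucl], ucl

# Toy statistic values (invented): fault-free training data, then a test
# run whose mean shifts upward by about two standard deviations at index 5.
train = [1.0, 1.2, 0.8, 1.1, 0.9, 1.0, 1.2, 0.8, 1.1, 0.9]
test = [1.0, 1.1, 0.9, 1.0, 1.1, 1.3, 1.4, 1.2, 1.3, 1.4, 1.3, 1.3]

alarms, ucl = ewma_alarms(train, test)
raw_limit = mean(train) + 3.0 * stdev(train)  # limit on the unsmoothed values
print("EWMA alarms at indices:", alarms)
print("raw values above 3-sigma limit:", [i for i, x in enumerate(test) if x > raw_limit])
```

In this toy run no single raw value crosses the three-sigma limit, while the smoothed sequence accumulates the small shift and crosses its tighter limit a few samples after the shift begins.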

Keywords: data-driven method, process control, anomaly detection, dimensionality reduction

Procedia PDF Downloads 299
23863 Leveraging Power BI for Advanced Geotechnical Data Analysis and Visualization in Mining Projects

Authors: Elaheh Talebi, Fariba Yavari, Lucy Philip, Lesley Town

Abstract:

The mining industry generates vast amounts of data, necessitating robust data management systems and advanced analytics tools to achieve better decision-making processes in the development of mining production and maintaining safety. This paper highlights the advantages of Power BI, a powerful intelligence tool, over traditional Excel-based approaches for effectively managing and harnessing mining data. Power BI enables professionals to connect and integrate multiple data sources, ensuring real-time access to up-to-date information. Its interactive visualizations and dashboards offer an intuitive interface for exploring and analyzing geotechnical data. Advanced analytics is a collection of data analysis techniques to improve decision-making. Leveraging some of the most complex techniques in data science, advanced analytics is used to do everything from detecting data errors and ensuring data accuracy to directing the development of future project phases. However, while Power BI is a robust tool, specific visualizations required by geotechnical engineers may have limitations. This paper studies the capability to use Python or R programming within the Power BI dashboard to enable advanced analytics, additional functionalities, and customized visualizations. This dashboard provides comprehensive tools for analyzing and visualizing key geotechnical data metrics, including spatial representation on maps, field and lab test results, and subsurface rock and soil characteristics. Advanced visualizations like borehole logs and Stereonet were implemented using Python programming within the Power BI dashboard, enhancing the understanding and communication of geotechnical information. Moreover, the dashboard's flexibility allows for the incorporation of additional data and visualizations based on the project scope and available data, such as pit design, rock fall analyses, rock mass characterization, and drone data. 
This further enhances the dashboard's usefulness in future projects, including operation, development, closure, and rehabilitation phases. Additionally, this helps in minimizing the necessity of utilizing multiple software programs in projects. This geotechnical dashboard in Power BI serves as a user-friendly solution for analyzing, visualizing, and communicating both new and historical geotechnical data, aiding in informed decision-making and efficient project management throughout various project stages. Its ability to generate dynamic reports and share them with clients in a collaborative manner further enhances decision-making processes and facilitates effective communication within geotechnical projects in the mining industry.

Keywords: geotechnical data analysis, Power BI, visualization, decision-making, mining industry

Procedia PDF Downloads 92
23862 An Investigation of E-Government by Using GIS and Establishing E-Government in Developing Countries Case Study: Iraq

Authors: Ahmed M. Jamel

Abstract:

Electronic government initiatives and public participation in them are among today's indicators of a country's development. After two consecutive wars, Iraq's current position in, for example, the UN's e-government ranking is quite concerning and has not improved in recent years. In preparing this work, we were motivated by the fact that handling geographic data on public facilities and resources is needed in most e-government projects. Geographical information systems (GIS) provide the most common tools not only to manage spatial data but also to integrate such data with the nonspatial attributes of features. With this background, this paper proposes that establishing a working GIS in the health sector of Iraq would improve e-government applications. As the case study, the investigation of hospital locations in Erbil is chosen.

Keywords: e-government, GIS, Iraq, Erbil

Procedia PDF Downloads 389
23861 Evaluation of Classification Algorithms for Diagnosis of Asthma in Iranian Patients

Authors: Taha SamadSoltani, Peyman Rezaei Hachesu, Marjan GhaziSaeedi, Maryam Zolnoori

Abstract:

Introduction: Data mining is defined as a process of finding patterns and relationships among data in a database in order to build predictive models. The application of data mining has extended to many sectors, such as healthcare services. Medical data mining aims to solve real-world problems in the diagnosis and treatment of diseases. It applies various techniques and algorithms that differ in accuracy and precision. The purpose of this study was to apply knowledge discovery and data mining techniques to the diagnosis of asthma based on patient symptoms and history. Method: Data mining includes several steps, with decisions to be made by the user. It starts with developing an understanding of the scope and of the application of previous knowledge in this area and identifying the knowledge discovery (KD) process from the stakeholders' point of view, and it finishes with acting on the discovered knowledge by applying it, integrating it with other systems, and documenting and reporting it. In this study, a stepwise methodology was followed to achieve a logical outcome. Results: The sensitivity, specificity, and accuracy of the KNN, SVM, Naïve Bayes, NN, classification tree, and CN2 algorithms and of similar related studies were evaluated, and ROC curves were plotted to show the performance of the system. Conclusion: The results show that asthma can be diagnosed with approximately ninety percent accuracy based on demographic and clinical data. The study also showed that methods based on pattern discovery and data mining have a higher sensitivity than expert and knowledge-based systems. On the other hand, medical guidelines and evidence-based medicine should be the basis of diagnostic methods; it is therefore recommended that machine learning algorithms be used in combination with knowledge-based algorithms.
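
The three evaluation measures named in the Results can be computed from the counts of a binary confusion matrix. The following Python sketch shows the standard definitions; it is not the authors' code, and the function name `diagnostic_metrics` and the toy labels are hypothetical.

```python
def diagnostic_metrics(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, accuracy) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Sensitivity: share of true cases detected; specificity: share of
    # non-cases correctly ruled out; accuracy: share of all correct calls.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / len(pairs) if pairs else float("nan")
    return sensitivity, specificity, accuracy

# Toy labels (invented for illustration): 1 = asthma, 0 = no asthma.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sens, spec, acc = diagnostic_metrics(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Reporting sensitivity and specificity alongside accuracy matters in diagnosis because a classifier can reach high accuracy on imbalanced data while still missing many true cases.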

Keywords: asthma, data mining, classification, machine learning

Procedia PDF Downloads 447
23860 Application of GPRS in Water Quality Monitoring System

Authors: V. Ayishwarya Bharathi, S. M. Hasker, J. Indhu, M. Mohamed Azarudeen, G. Gowthami, R. Vinoth Rajan, N. Vijayarangan

Abstract:

Identification of water quality conditions in a river system based on limited observations is an essential task for meeting the goals of environmental management. The traditional method of water quality testing is to collect samples manually and then send them to a laboratory for analysis. However, this method can no longer meet the demands of water quality monitoring. An automatic water quality measurement and reporting system has therefore been developed. In this project, water quality parameters collected by a multi-parameter water quality probe are transmitted to a data processing and monitoring center through the GPRS wireless mobile communication network. The multi-parameter sensor is placed directly above the water level. The monitoring center consists of a GPRS unit and a micro-controller, which monitor the data. The collected data can be monitored at any instant. At the pollution control board, the water quality sensor data are monitored on a computer using Visual Basic software. The system collects, transmits and processes water quality parameters automatically, so production efficiency and economic benefit are greatly improved. GPRS technology performs well in complex environments where water quality would otherwise go unmonitored, and it is particularly applicable to automatic data transmission from collection points and to the transmission and monitoring of data from field water analysis equipment.

Keywords: multiparameter sensor, GPRS, visual basic software, RS232

Procedia PDF Downloads 412
23859 Decision Support System in Air Pollution Using Data Mining

Authors: E. Fathallahi Aghdam, V. Hosseini

Abstract:

Environmental pollution is not limited to a specific region or country; that is why sustainable development, as a necessary process for improvement, pays attention to issues such as the destruction of natural resources, the degradation of biological systems, global pollution, and climate change in the world, especially in developing countries. According to the World Health Organization, Tehran (the capital of Iran), as a developing city, is one of the most polluted cities in the world in terms of air pollution. In this study, three pollutants, namely particulate matter smaller than 10 microns, nitrogen oxides, and sulfur dioxide, were evaluated in Tehran using data mining techniques and the CRISP approach. The data were collected from 21 air pollution measuring stations in different areas of Tehran from 1999 to 2013. The commercial software Clementine was selected for this study. Using the software, Tehran was divided into distinct clusters in terms of the mentioned pollutants. As a data mining technique, clustering is usually used as a prologue to other analyses; therefore, the similarity of clusters was evaluated in this study by analyzing local conditions, traffic behavior, and industrial activities. The results of this research can support decision-making systems, help managers improve performance and decision making, and assist in urban studies.

Keywords: data mining, clustering, air pollution, crisp approach

Procedia PDF Downloads 428
23858 Test Suite Optimization Using an Effective Meta-Heuristic BAT Algorithm

Authors: Anuradha Chug, Sunali Gandhi

Abstract:

Regression testing is a very expensive and time-consuming process carried out to ensure the validity of modified software. Because resources are insufficient to re-execute all the test cases in a time-constrained environment, efforts are under way to generate test data automatically, without human effort. Many search-based techniques have been proposed to generate efficient, effective and optimized test data, so that the overall cost of software testing can be minimized. The generated test data should be able to uncover all potential lapses that exist in the software or product. Inspired by the natural behavior of bats searching for food, the current study employed a meta-heuristic, search-based bat algorithm to optimize the test data on the basis of certain parameters without compromising their effectiveness. Mathematical functions that can effectively filter out redundant test data are also applied. As many as 50 Java programs were used to check the effectiveness of the proposed test data generation, and it was found that an 86% saving in testing effort can be achieved using the bat algorithm while covering 100% of the software code. The bat algorithm was found to be more efficient in terms of simplicity and flexibility when the results were compared with other nature-inspired algorithms such as the Firefly Algorithm (FA), the Hill Climbing Algorithm (HC) and Ant Colony Optimization (ACO). The output of this study would be useful to testers, as they can achieve 100% path coverage with a minimum number of test cases.

Keywords: regression testing, test case selection, test case prioritization, genetic algorithm, bat algorithm

Procedia PDF Downloads 381
23857 Memory Consolidation: Application of Retrieval Strategies in the Classroom

Authors: Eric Tardif, Nicolas Meylan

Abstract:

Recent studies suggest that the consolidation of episodic memory is better achieved through repeated retrieval than with the use of concept mapping or repeated study. Although such laboratory results highly appeal to educationalists, it remains to be shown whether they can be directly used in a classroom setting. Forty-five college students (42 girls; mean age 16.1 y/o) were asked to remember pairs of biology-related words (e.g. mitochondria-energy) in two configurations. The first configuration consisted of a three-minute study of pairs of words followed by a final one-minute test in which the first word of a pair was shown and the subject asked to write down the second associated word. This procedure was repeated three times. The second configuration consisted of a one-minute study of a list of pairs of words, which was immediately followed by a one-minute test. This procedure was repeated 6 times. Subjects filled out a small questionnaire assessing their general mood, level of fatigue, stress and motivation to do the exercise. One week later, subjects were given a final test using the same words. A total of 8 lists of words were studied and tested during the semester. Results showed that subjects recalled more correct words when using the second configuration, both within the study period and one week later, confirming laboratory findings. However, the general performance (mean items recalled) as well as the motivation to do the exercise gradually decreased during the semester. Motivation was positively correlated with performance (r=0.77, p<0.05). The results suggest that laboratory findings may provide some applications in education but other variables inherent to the classroom setting must also be considered.

Keywords: long-term, episodic memory, consolidation, retrieval, school setting

Procedia PDF Downloads 339
23856 Modified InVEST for Whatsapp Messages Forensic Triage and Search through Visualization

Authors: Agria Rhamdhan

Abstract:

WhatsApp, as the most popular mobile messaging app, has been used as evidence in many criminal cases. As the use of mobile messages generates large amounts of data, forensic investigation faces the challenge of large data problems. Finding important evidence is difficult because current practice relies on tools and techniques that require manual analysis of all messages. Analyzing large sets of mobile messaging data in this way takes a lot of time and effort. Our work offers methodologies based on forensic triage that reduce large data to manageable sets, which are easier to review in detail, and then presents the results through interactive visualization that shows important terms, entities and relationships through intelligent ranking using Term Frequency-Inverse Document Frequency (TF-IDF) and a Latent Dirichlet Allocation (LDA) model. By implementing this methodology, investigators can improve investigation processing time and the accuracy of results.
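
As an illustration of TF-IDF ranking, the following Python sketch scores the terms of each message so that words common to every message drop out and distinctive words rise to the top. It is a sketch under stated assumptions, not the tool described above: it covers only the TF-IDF part (not LDA), uses one common weighting variant (tf = count / message length, idf = ln(N / df)), and the function names and sample messages are hypothetical.

```python
import math
from collections import Counter

def tfidf_scores(documents):
    """Return one {term: score} dict per tokenized document."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    scores = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

def top_terms(score_dict, k=3):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    return sorted(score_dict, key=lambda t: (-score_dict[t], t))[:k]

# Invented sample messages, already lower-cased and tokenized.
messages = [
    ["meet", "at", "the", "dock"],
    ["the", "payment", "is", "ready"],
    ["see", "you", "at", "the", "dock"],
]
scores = tfidf_scores(messages)
for i, message_scores in enumerate(scores):
    print(i, top_terms(message_scores))
```

A word such as "the", which appears in every message, receives a score of zero, which is the property that makes this ranking useful for triage.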

Keywords: forensics, triage, visualization, WhatsApp

Procedia PDF Downloads 168
23855 An Investigation into the Views of Distant Science Education Students Regarding Teaching Laboratory Work Online

Authors: Abraham Motlhabane

Abstract:

This research analysed the written views of science education students regarding the teaching of laboratory work using the online mode. The research adopted a qualitative methodology, aimed at investigating small and distinct groups normally regarded as a single-site study. Qualitative research was used to describe and analyse the phenomena from the students' perspective. This means the research began with assumptions of a world view and used theoretical lenses to inquire into the meaning that individual students ascribe to the research problem. The research was conducted with three groups of students studying for the Postgraduate Certificate in Education, the Bachelor of Education and the honours Bachelor of Education, respectively. In each of the study programmes, the science education module is compulsory. Five science education students from each study programme were purposively selected to participate in this research; therefore, 15 students participated. In order to analyse the data, the data were first printed, and hard copies were used in the analysis. The data were read several times, and key concepts and ideas were highlighted. Themes and patterns were identified to describe the data. Coding was used as a process of organising and sorting the data. The findings of the study are very diverse; some students are in favour of online laboratory work, whereas other students argue that science can only be learnt through hands-on experimentation.

Keywords: online learning, laboratory work, views, perceptions

Procedia PDF Downloads 145
23854 The Communication Library DIALOG for iFDAQ of the COMPASS Experiment

Authors: Y. Bai, M. Bodlak, V. Frolov, S. Huber, V. Jary, I. Konorov, D. Levit, J. Novy, D. Steffen, O. Subrt, M. Virius

Abstract:

Modern experiments in high energy physics impose great demands on the reliability, the efficiency, and the data rate of Data Acquisition Systems (DAQ). This contribution focuses on the development and deployment of the new communication library DIALOG for the intelligent, FPGA-based Data Acquisition System (iFDAQ) of the COMPASS experiment at CERN. The iFDAQ, which utilizes a hardware event builder, is designed to be able to read out data at the maximum rate of the experiment. The DIALOG library is a communication system for both distributed and mixed environments; it provides a network-transparent inter-process communication layer. Using the modern, high-performance C++ framework Qt and its Qt Network API, the DIALOG library presents an alternative to the previously used DIM library. The DIALOG library was fully incorporated into all processes in the iFDAQ during the 2016 run. From the software point of view, it may be considered a significant improvement of the iFDAQ in comparison with the previous run. To extend the possibilities of debugging, online monitoring of the communication among processes via the DIALOG GUI is a desirable feature. In the paper, we present the DIALOG library from several perspectives and discuss it in detail. Moreover, an efficiency measurement and a comparison with the DIM library with respect to the iFDAQ requirements are provided.

Keywords: data acquisition system, DIALOG library, DIM library, FPGA, Qt framework, TCP/IP

Procedia PDF Downloads 316
23853 Using Data Mining Technique for Scholarship Disbursement

Authors: J. K. Alhassan, S. A. Lawal

Abstract:

This work is on decision tree-based classification for the disbursement of scholarships. A tree-based data mining classification technique is used in order to determine the generic rule to be used to disburse the scholarship. Based on the rules defined from the tree, the system is able to determine the class (status) to which an applicant belongs, whether Granted or Not Granted. Applicants that fall into the Granted class have successfully acquired the scholarship, while those in the Not Granted class are unsuccessful in the scheme. An algorithm that can be used to classify the applicants based on the rules from the tree-based classification was also developed. Tree-based classification is adopted because of its efficiency, effectiveness, and easy-to-comprehend features. The system was tested with the data of the National Information Technology Development Agency (NITDA), Abuja, a parastatal of the Federal Ministry of Communication Technology that is mandated to develop and regulate information technology in Nigeria. The system was found to work according to the specification. It is therefore recommended for all scholarship disbursement organizations.

Keywords: classification, data mining, decision tree, scholarship

Procedia PDF Downloads 376
23852 Implementation of a Quality Management Approach in the Laboratory of Quality Control and the Repression of Fraud (CACQE) of the Wilaya of Bechar

Authors: Khadidja Mebarki, Naceur Boussouar, Nabila Ihaddadene, M. Akermi

Abstract:

Food products are particularly sensitive, since they concern the health of the consumer. Whether from the health or the commercial point of view, this kind of product must be subjected to rigorous controls in order to prevent any fraud. Quality and safety are essential for food security, public health and economic development. Strengthening food safety is essential to increase food security, which is considered reached when all individuals can at any time access the safe and nutritious food they need to lead healthy and active lives. The objective of this project is to initiate a quality approach in the laboratories of quality control and the repression of fraud. It will be directed towards the application of good laboratory practices, traceability, management of quality documents (quality, procedures and specification manual) and quality audits, and towards preparing the ground for a possible accreditation of the Bechar laboratory under the ISO 17025 standard. The project will take place in four main stages: 1- Preparation of an audit grid; 2- Realization of a quality audit according to the 5 M method, completed by a section on quality documentation; 3- Drafting of an audit report and proposal of recommendations; 4- Implementation of corrective actions on the ground. This last step consisted of formalizing the cleaning and disinfection plan, working on good hygiene practices, establishing a mapping of processes and flow charts of the different processes of the laboratory, classifying quality documents, and formalizing the process of document management. During the period of the study within the laboratory, almost all facets of the work were appreciated, as we participated in the expert assessments performed within it.

Keywords: quality, management, ISO 17025 accreditation, GLP

Procedia PDF Downloads 518
23851 [Keynote Speech]: Feature Selection and Predictive Modeling of Housing Data Using Random Forest

Authors: Bharatendra Rai

Abstract:

Predictive data analysis and modeling involving machine learning techniques become challenging in the presence of too many explanatory variables or features. Too many features are known not only to slow algorithms down but also to decrease model prediction accuracy. This study involves a housing dataset with 79 quantitative and qualitative features that describe various aspects people consider when buying a new house. The Boruta algorithm, which supports feature selection using a wrapper approach built around random forest, is used in this study. This feature selection process leads to 49 confirmed features, which are then used for developing predictive random forest models. The study also explores five different data partitioning ratios, and their impact on model accuracy is captured using the coefficient of determination (R-squared) and root mean square error (RMSE).
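
The two accuracy measures named above have simple closed forms: RMSE is the square root of the mean squared residual, and the coefficient of determination is one minus the ratio of the residual sum of squares to the total sum of squares. The following Python sketch computes both; it is illustrative only, and the function names and the toy prices are hypothetical, not taken from the study's dataset.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Toy values (invented): observed and predicted prices in arbitrary units.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(f"RMSE = {rmse(observed, predicted):.4f}")
print(f"R-squared = {r_squared(observed, predicted):.4f}")
```

When comparing data partitioning ratios, both measures would be computed on the held-out portion, so that the comparison reflects predictive rather than in-sample fit.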

Keywords: housing data, feature selection, random forest, Boruta algorithm, root mean square error

Procedia PDF Downloads 323
23850 Cars in a Neighborhood: A Case of Sustainable Living in Sector 22 Chandigarh

Authors: Maninder Singh

Abstract:

The city of Chandigarh is under strain from the exponential growth of car density across its neighborhoods. The consumerist nature of today's society is to blame for this menace, because everyone wants to own and ride in a car. Car manufacturers are busy selling two or more cars per household. The Regional Transport Offices are busy issuing as many licenses to new vehicles as they can in order to generate revenue in the form of Road Tax. Car traffic in the neighborhoods of Chandigarh has reached a tipping point. There needs to be a more empirical and sustainable model of cars per household, based on specific parameters of livable neighborhoods. Sector 22 was one of the first residential sectors to be established in the city. There is scope to think, reflect, and work out a method to determine how many cars we need to sell our citizens before we lose the argument to traffic problems, parking problems, and road rage. This is where the true challenge for a planner or designer of the city lies. Currently, in Chandigarh, there are no clearly visible answers to this problem. The way forward is to look at spatial mapping, planning, and design of car parking units to address the problem, rather than suggesting extreme measures such as banning cars (short-term) or promoting plans for citywide transport (very long-term). This is a chance to resolve the problem with a pragmatic approach from a citizen's perspective, instead of an orthodox development planner's methodology. Since citizens are at the center of how the problem is to be addressed, acceptable solutions are more likely to emerge when the car and traffic problem is defined by the citizens. Thus, the idea and its implementation would be interesting in comparison to the known academic methodologies. The novel and innovative process would lead to a more acceptable and sustainable approach to the issue of the number of car parks in the neighborhoods of Chandigarh.

Keywords: cars, Chandigarh, neighborhood, sustainable living, walkability

Procedia PDF Downloads 148
23849 Mapping the Technological Interventions to the National Action Plan for Marine Litter Management 2018-2025: Addressing the Marine Plastic Litter at the Marine Tourism Destinations in Indonesia

Authors: Kaisar Akhir, Azhar Slamet

Abstract:

This study aims to provide recommendations for addressing marine plastic litter at the ocean tourism destinations in Indonesia sustainably through technological interventions in the framework of the National Action Plan for Marine Litter Management 2018-2025. In Indonesia, marine tourism is a rapidly growing economic sector. However, marine tourism destinations are facing a global challenge called marine plastic litter. Marine plastic litter is a threat to those destinations since it has potential impacts on the reduction of marine environmental sustainability, the health of tourists and local communities as well as tourism business income. Since 2018, the Indonesian government has passed and promulgated the National Plan of Action on Marine Litter Management 2018-2025. This national action plan consists of three important key aspects of interventions (i.e., societal effort, technological application, and institutional coordination) and five strategies for addressing marine litter in Indonesia, in particular, to address 70% of marine plastic litter by 2025. The strategies include 1) National movement for raising awareness of stakeholders, 2) Land-based litter management, 3) Litter management at the sea and coasts, 4) Funding mechanism, institutional strengthening, monitoring, and law enforcement, and 5) Research and development. In this study, technological interventions around the world and in Indonesia are reviewed and analyzed on their relevance to the national action plan based on five criteria. As a result, there are twelve kinds of technological interventions recommended to be implemented for addressing marine plastic litter in the marine tourism destinations in Indonesia.

Keywords: marine litter management, marine plastic litter, national action plan, ocean sustainability, ocean tourism destination, technological interventions

Procedia PDF Downloads 169
23848 Image-Based (RGB) Technique for Estimating Phosphorus Levels of Different Crops

Authors: M. M. Ali, Ahmed Al-Ani, Derek Eamus, Daniel K. Y. Tan

Abstract:

In this glasshouse study, we developed a new image-based, non-destructive technique for detecting the leaf P status of different crops such as cotton, tomato and lettuce. Plants were grown on nutrient media containing different P concentrations, i.e. 0%, 50% and 100% of the recommended P concentration (P0 = no P, L; P1 = 2.5 mL 10 L⁻¹ of P and P2 = 5 mL 10 L⁻¹ of P as NaH2PO4). After 10 weeks of growth, plants were harvested and data on leaf P contents were collected using the standard destructive laboratory method; at the same time, leaf images were collected with a handheld crop image sensor. We calculated the leaf area, leaf perimeter and RGB (red, green and blue) values of these images. These data were then used in a linear discriminant analysis (LDA) to estimate leaf P contents, which successfully classified the plants on the basis of leaf P contents. The data indicated that P deficiency in crop plants can be predicted using image and morphological data. Our proposed non-destructive imaging method is precise in estimating the P requirements of different crop species.
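The classification step described above can be illustrated with a minimal sketch of linear discriminant analysis. This is not the authors' implementation: it assumes a single image feature (a hypothetical mean green-channel value per leaf) rather than the full set of leaf area, perimeter and RGB values, and the numbers are made-up toy values used only to show the mechanics.

```python
import math
from collections import defaultdict


def fit_lda_1d(values, labels):
    """Fit univariate LDA: per-class means, one pooled variance, class priors."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    n = len(values)
    means = {label: sum(vs) / len(vs) for label, vs in groups.items()}
    # LDA assumes all classes share one variance, so pool the within-class scatter.
    scatter = sum((v - means[label]) ** 2 for label, vs in groups.items() for v in vs)
    pooled_var = scatter / (n - len(groups))
    priors = {label: len(vs) / n for label, vs in groups.items()}
    return means, pooled_var, priors


def predict_lda_1d(x, means, pooled_var, priors):
    """Return the class with the largest linear discriminant score for x."""
    def score(label):
        mu = means[label]
        return x * mu / pooled_var - mu * mu / (2 * pooled_var) + math.log(priors[label])
    return max(means, key=score)


# Hypothetical toy data: mean green-channel value per leaf and its P treatment.
green = [0.30, 0.32, 0.31, 0.40, 0.42, 0.41, 0.50, 0.52, 0.51]
treatment = ["P0", "P0", "P0", "P1", "P1", "P1", "P2", "P2", "P2"]

model = fit_lda_1d(green, treatment)
print(predict_lda_1d(0.44, *model))
```

With several features, the same idea applies, but the pooled variance becomes a pooled covariance matrix that has to be inverted.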

Keywords: image-based techniques, leaf area, leaf P contents, linear discriminant analysis

Procedia PDF Downloads 382
23847 Design of Visual Repository, Constraint and Process Modeling Tool Based on Eclipse Plug-Ins

Authors: Rushiraj Heshi, Smriti Bhandari

Abstract:

Master Data Management requires the creation of a central repository, the application of constraints on that repository, and the design of processes to manage data. Designing the repository, its constraints, and the business processes is a very tedious and time-consuming task for a large enterprise. Hence, visual repository, constraint and process (workflow) modeling is the most critical step in Master Data Management. In this paper, we realize a visual modeling tool for implementing repositories, constraints and processes, based on an Eclipse plug-in using GMF/EMF, which follows the principles of Model Driven Engineering (MDE).

Keywords: EMF, GMF, GEF, repository, constraint, process

Procedia PDF Downloads 497
23846 The Classification Performance in Parametric and Nonparametric Discriminant Analysis for a Class-Unbalanced Data of Diabetes Risk Groups

Authors: Lily Ingsrisawang, Tasanee Nacharoen

Abstract:

Introduction: Unbalanced data sets commonly appear in real-world applications. Many research papers have found that, because of the unequal class distribution, the performance of existing classifiers tends to be biased towards the majority class. The k-nearest neighbors nonparametric discriminant analysis is one method that has been proposed for classifying unbalanced classes with good performance. Hence, the methods of discriminant analysis are of interest to us in investigating misclassification error rates for class-imbalanced data of three diabetes risk groups. Objective: The purpose of this study was to compare the classification performance of parametric and nonparametric discriminant analysis in a three-class classification application of class-imbalanced data of diabetes risk groups. Methods: Data from a health project for 599 staff members in a government hospital in Bangkok were obtained for the classification problem. The staff members were diagnosed into one of three diabetes risk groups: non-risk (90%), risk (5%), and diabetic (5%). The original data, with the variables diabetes risk group, age, gender, cholesterol, and BMI, were analyzed and bootstrapped into 50 and 100 samples, with 599 observations per sample, for additional estimation of the misclassification error rate. Each data set was examined for departure from multivariate normality and for equality of the covariance matrices of the three risk groups. Both the original data and the bootstrap samples showed non-normality and unequal covariance matrices. The parametric linear discriminant function, the quadratic discriminant function, and the nonparametric k-nearest neighbors discriminant function were applied to the 50 and 100 bootstrap samples and to the original data. In finding the optimal classification rule, the prior probabilities were set to both equal proportions (0.33:0.33:0.33) and unequal proportions, with three choices of (0.90:0.05:0.05), (0.80:0.10:0.10) or (0.70:0.15:0.15). Results: The results from the 50 and 100 bootstrap samples indicated that the k-nearest neighbors approach with k = 3 or k = 4 and prior probabilities of {non-risk:risk:diabetic} of {0.90:0.05:0.05} or {0.80:0.10:0.10} gave the smallest misclassification error rate. Conclusion: The k-nearest neighbors approach is suggested for classifying three-class imbalanced data of diabetes risk groups.
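To make the role of the prior probabilities concrete, here is a minimal sketch of a k-nearest neighbors discriminant rule. It assumes one common formulation, in which a class is scored as prior × (neighbors from that class among the k nearest) / (training size of that class); the abstract does not state the exact rule used. The two features (age, BMI) and all data points are hypothetical.

```python
import math
from collections import Counter


def knn_with_priors(train, query, k, priors):
    """Classify `query` by maximizing prior_c * k_c / n_c over classes c.

    train  : list of (feature_tuple, label) pairs
    k_c    : how many of the k nearest training points belong to class c
    n_c    : total number of training points in class c
    """
    class_sizes = Counter(label for _, label in train)
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    scores = {c: priors[c] * votes.get(c, 0) / class_sizes[c] for c in class_sizes}
    return max(scores, key=scores.get)


# Hypothetical, deliberately imbalanced toy data: (age, BMI) -> risk group.
train = [
    ((30, 22), "non-risk"), ((35, 23), "non-risk"), ((40, 24), "non-risk"),
    ((32, 21), "non-risk"), ((45, 25), "non-risk"), ((38, 22), "non-risk"),
    ((50, 29), "risk"), ((52, 30), "risk"),
    ((60, 33), "diabetic"), ((62, 34), "diabetic"),
]

equal = {"non-risk": 1 / 3, "risk": 1 / 3, "diabetic": 1 / 3}
unequal = {"non-risk": 0.90, "risk": 0.05, "diabetic": 0.05}

print(knn_with_priors(train, (51, 29), k=3, priors=equal))
print(knn_with_priors(train, (51, 29), k=3, priors=unequal))
```

On this toy set the same query is assigned to different groups under the two prior settings, which is why the choice of priors matters when error rates are compared.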

Keywords: error rate, bootstrap, diabetes risk groups, k-nearest neighbors

Procedia PDF Downloads 435
23845 BFDD-S: Big Data Framework to Detect and Mitigate DDoS Attack in SDN Network

Authors: Amirreza Fazely Hamedani, Muzzamil Aziz, Philipp Wieder, Ramin Yahyapour

Abstract:

Software-defined networking has in recent years attracted the attention of many network designers as a successor to traditional networking. Unlike traditional networks, where the control and data planes are coupled within a single device in the network infrastructure, such as a switch or router, the two planes are kept separate in software-defined networks (SDNs). All critical decisions about packet routing are made on the network controller, and the data-plane devices forward packets based on these decisions. This type of network is vulnerable to DDoS attacks, which degrade the overall functioning and performance of the network by continuously injecting fake flows into it. This places a substantial burden on the controller and ultimately leads to the inaccessibility of the controller and the loss of network service for legitimate users. Thus, protecting this novel network architecture against denial-of-service attacks is essential. In the world of cybersecurity, attacks and new threats emerge every day. It is essential to have tools capable of managing and analyzing all this new information to detect possible attacks in real time. These tools should provide a comprehensive solution to automatically detect, predict and prevent abnormalities in the network. Big data encompasses a wide range of studies, but it mainly refers to the massive amounts of structured and unstructured data that organizations deal with on a regular basis. It concerns not only the volume of the data but also how data-driven information can be used to enhance decision-making processes, security, and the overall efficiency of a business. This paper presents an intelligent big data framework as a solution to handle the illegitimate traffic burden on the SDN network created by numerous DDoS attacks. The framework entails an efficient defence and monitoring mechanism against DDoS attacks by employing state-of-the-art machine learning techniques.

Keywords: apache spark, apache kafka, big data, DDoS attack, machine learning, SDN network

Procedia PDF Downloads 169
23844 Welding Process Selection for Storage Tank by Integrated Data Envelopment Analysis and Fuzzy Credibility Constrained Programming Approach

Authors: Rahmad Wisnu Wardana, Eakachai Warinsiriruk, Sutep Joy-A-Ka

Abstract:

Selecting the most suitable welding process usually depends on experience or on common practice in similar companies. However, this approach generally ignores many criteria that can affect the selection of a suitable welding process. Therefore, knowledge automation through knowledge-based systems can significantly improve the decision-making process. This research proposes an integrated data envelopment analysis (DEA) and fuzzy credibility constrained programming approach for identifying the best welding process for stainless steel storage tanks in the food and beverage industry. The proposed approach uses fuzzy concepts and a credibility measure to deal with uncertain data from experts' judgment. Furthermore, 12 parameters are used to determine the most appropriate welding process among six competing welding processes.

Keywords: welding process selection, data envelopment analysis, fuzzy credibility constrained programming, storage tank

Procedia PDF Downloads 167
23843 On the Estimation of Crime Rate in the Southwest of Nigeria: Principal Component Analysis Approach

Authors: Kayode Balogun, Femi Ayoola

Abstract:

Crime is at an alarming rate in this part of the world, and many factors contribute to this antisocial behaviour among both the young and the old. In this work, principal component analysis (PCA) was used as a tool to reduce the dimensionality of the data, while retaining as much of the information as possible, and to identify the variables that were most crime prone in the study region. Data were collected on twenty-eight crime variables from the National Bureau of Statistics (NBS) databank for a period of fifteen years. We use PCA in this study to determine the number of major variables and contributors to crime in Southwest Nigeria. The results of our analysis revealed that eight principal components were retained using the scree plot and loading plot, which implies that an eight-equation solution is appropriate for the data. The eight components explained 93.81% of the total variation in the data set. We also found that the most commonly committed crimes in Southwestern Nigeria were: assault, grievous harm and wounding, theft/stealing, burglary, house breaking, false pretence, unlawful arms possession and breach of public peace.
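The statement that a set of components "explained 93.81% of the total variation" refers to the cumulative share of the eigenvalues of the covariance (or correlation) matrix. The sketch below shows only that arithmetic. The eigenvalues are hypothetical placeholders, not values from the study, and the threshold rule is an illustrative alternative to the scree plot the authors report using.

```python
def cumulative_explained_variance(eigenvalues):
    """Return the cumulative fraction of total variance, largest eigenvalue first."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running, cumulative = 0.0, []
    for value in ordered:
        running += value
        cumulative.append(running / total)
    return cumulative


def components_needed(eigenvalues, threshold):
    """Smallest number of components whose cumulative share reaches `threshold`."""
    for count, share in enumerate(cumulative_explained_variance(eigenvalues), start=1):
        if share >= threshold:
            return count
    return len(eigenvalues)


# Hypothetical eigenvalues of a 5-variable correlation matrix (not from the paper).
eigenvalues = [4.0, 2.0, 1.0, 0.5, 0.5]

print(cumulative_explained_variance(eigenvalues))
print(components_needed(eigenvalues, threshold=0.90))
```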

Keywords: crime rates, data, Southwest Nigeria, principal component analysis, variables

Procedia PDF Downloads 444
23842 On-Line Data-Driven Multivariate Statistical Prediction Approach to Production Monitoring

Authors: Hyun-Woo Cho

Abstract:

Detection of incipient abnormal events in production processes is important for improving the safety and reliability of manufacturing operations and reducing losses caused by failures. The construction of calibration models for predicting faulty conditions is essential in deciding when to perform preventive maintenance. This paper presents a multivariate calibration monitoring approach based on the statistical analysis of process measurement data. The calibration model is used to predict faulty conditions from historical reference data. This approach utilizes variable selection techniques, and the predictive performance of several prediction methods is evaluated using real data. The results show that the calibration model based on a supervised probabilistic model yielded the best performance in this work. By adopting a proper variable selection scheme in calibration models, the prediction performance can be improved by excluding non-informative variables from the model building steps.

Keywords: calibration model, monitoring, quality improvement, feature selection

Procedia PDF Downloads 356
23841 Spatial and Geostatistical Analysis of Surficial Soils of the Contiguous United States

Authors: Rachel Hetherington, Chad Deering, Ann Maclean, Snehamoy Chatterjee

Abstract:

The U.S. Geological Survey conducted a soil survey and subsequent mineralogical and geochemical analyses of over 4800 samples taken across the contiguous United States between 2007 and 2013. At each location, samples were taken from the top 5 cm, the A-horizon, and the C-horizon. Many studies have looked at the correlation between the mineralogical and geochemical content of soils and influencing factors such as parent lithology, climate, soil type, and age, but it seems little has been done to quantify and assess the correlation between elements in the soil on a national scale. GIS was used for the mapping and multivariate interpolation of over 40 major and trace elements for surficial soils (0-5 cm depth). Qualitative analysis of the spatial distribution across the U.S. shows distinct patterns amongst elements both within the same periodic groups and within different periodic groups, and therefore with different behavioural characteristics. Results show the emergence of 4 main patterns of high-concentration areas: vertically along the west coast, a C-shape formed through the states around Utah and northern Arizona, a V-shape through the Midwest and connecting to the Appalachians, and along the Appalachians. The Band Collection Statistics tool in GIS was used to quantitatively analyse the geochemical raster datasets and calculate a correlation matrix. Patterns emerged that were not identified in the qualitative analysis, many of them amongst elements with very different characteristics. Preliminary results show 41 element pairings with a strong positive correlation (≥ 0.75). Both qualitative and quantitative analyses on this scale could increase knowledge of the relationships between element distribution and behaviour in surficial soils of the U.S.
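The correlation-matrix step can be sketched outside a GIS as well. The example below assumes each element's raster has been flattened into a list of cell values, computes pairwise Pearson correlations, and lists pairs at or above the 0.75 cut-off mentioned above. The element names and concentrations are made-up placeholders, and this is not the Band Collection Statistics tool itself.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def strong_positive_pairs(layers, threshold=0.75):
    """Return name pairs whose correlation is at least `threshold`."""
    return [
        (a, b)
        for a, b in combinations(layers, 2)
        if pearson(layers[a], layers[b]) >= threshold
    ]


# Hypothetical flattened raster values for three placeholder elements.
layers = {
    "elem_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "elem_b": [2.0, 4.0, 6.0, 8.0, 10.5],
    "elem_c": [5.0, 3.0, 4.0, 1.0, 2.0],
}

print(strong_positive_pairs(layers))
```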

Keywords: correlation matrix, geochemical analyses, spatial distribution of elements, surficial soils

Procedia PDF Downloads 126
23840 Continuity of Place-Identity: Identifying Regional Components of Kerala Architecture through 1805-1950

Authors: Manoj K. Kumar, Deepthi Bathala

Abstract:

Man has the need to know and feel that he is part of the historical continuum, and it is this continuum that reinforces his identity. Architecture and the built environment contribute to this identity, as established by the various identity theories exploring the relationship between the two. Organic architecture was successful in maintaining a continuum of identity until the advent of globalization, when the world saw a drastic shift to an architecture of ‘placelessness’. The answer to the perfect synthesis of ‘universalization’ and ‘regionalism’ is an ongoing quest. However, history shows a smooth transition from vernacular to colonial to modern, unlike the architecture of today. Traditional Kerala architecture evolved from the tropical climate, geography, local needs, materials, skills and foreign influences. It is unique in contrast to the architecture of the neighboring states as a result of geographical barriers, yet it was influenced by the architecture of the Orient due to trade relations. From 1805 to 1950, the European influence on the architecture of Kerala resulted in the emergence of the colonial style, which managed to establish a continuum with the traditional architecture. The paper focuses on identifying the components of architecture that established the continuity of place-identity in the architecture of Kerala and examines the transition from traditional Kerala architecture to colonial architecture during the colonial period. The research tools used are visual surveys based on the principles of urban design, cognitive mapping, and typology analysis, followed by a strong understanding of the morphological and built environment, along with the matrix method. An understanding of these components of continuity can be useful in creating buildings that people can relate to in the present day. South Asia shares the history of colonialism, and an understanding of these components can pave the way for further research on how to establish a regional identity in the era of globalization.

Keywords: colonial, identity, place, regional

Procedia PDF Downloads 408
23839 Multilevel Gray Scale Image Encryption through 2D Cellular Automata

Authors: Rupali Bhardwaj

Abstract:

Cryptography is the science of using mathematics to encrypt and decrypt data; the data are converted into an unintelligible form, and the encrypted data are then transmitted. The primary purpose of this paper is to provide two levels of security through a two-step process: rather than transmitting the message bits directly, the message is first encrypted using 2D cellular automata and then scrambled with the Arnold Cat Map transformation. This provides an additional layer of protection and reduces the chance of the transmitted message being detected. A comparative analysis of the effectiveness of the scrambling technique is provided using scrambling degree measurement parameters, i.e. Gray Difference Degree (GDD) and Correlation Coefficient.
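The scrambling stage can be illustrated with a minimal sketch of the standard Arnold Cat Map on a square image, where the pixel at (x, y) moves to ((x + y) mod N, (x + 2y) mod N). The sketch covers only the scrambling step, not the cellular automata encryption, and the number of iterations and the tiny test image are assumptions made for illustration.

```python
def arnold_cat_map(image, iterations=1):
    """Scramble a square image (list of rows) with the Arnold Cat Map."""
    n = len(image)
    if any(len(row) != n for row in image):
        raise ValueError("Arnold Cat Map requires a square image")
    current = image
    for _ in range(iterations):
        scrambled = [[None] * n for _ in range(n)]
        for y in range(n):
            for x in range(n):
                # The map has determinant 1, so it is a permutation of pixel positions.
                scrambled[(x + 2 * y) % n][(x + y) % n] = current[y][x]
        current = scrambled
    return current


def inverse_arnold_cat_map(image, iterations=1):
    """Undo `iterations` rounds of arnold_cat_map."""
    n = len(image)
    current = image
    for _ in range(iterations):
        restored = [[None] * n for _ in range(n)]
        for y in range(n):
            for x in range(n):
                # Read each pixel back from the position the forward map sent it to.
                restored[y][x] = current[(x + 2 * y) % n][(x + y) % n]
        current = restored
    return current


# Hypothetical 4x4 grayscale image with distinct pixel values.
image = [[4 * row + col for col in range(4)] for row in range(4)]

scrambled = arnold_cat_map(image, iterations=2)
print(scrambled)
print(inverse_arnold_cat_map(scrambled, iterations=2) == image)
```

Because the map is periodic, enough iterations return the original image, so the iteration count acts as part of the shared secret rather than as a source of irreversible mixing.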

Keywords: scrambling, cellular automata, Arnold cat map, game of life, gray difference degree, correlation coefficient

Procedia PDF Downloads 378
23838 Survey Based Data Security Evaluation in Pakistan Financial Institutions against Malicious Attacks

Authors: Naveed Ghani, Samreen Javed

Abstract:

In today’s heterogeneous network environment, there is a growing demand for mutually distrusting clients to jointly run a secure network that guards against malicious attacks, since the defining task of propagating malicious code is to locate new targets to attack. Residual risk always remains, no matter what solutions are implemented or what security methodologies or standards are adopted. Security is the first and most crucial phase in the field of computer science. The main aim of computer security is the gathering of information over a secure network. No one need wonder what all that malware is trying to do: it is trying to steal money through data theft, bank transfers, stolen passwords, or swiped identities. Through our survey, we learn about the importance of whitelisting, antimalware programs, security patches, log files, honeypots, and other measures used in banks for financial data protection. However, there is also a need to implement IPv6 tunneling with crypto data transformation, in line with the requirements of new technology, to protect the organization from new malware attacks and from malware crafting its own messages and sending them to the target. In this paper, the authors propose implementing IPv6 tunneling sessions for private data transmission from financial organizations whose secrecy needs to be safeguarded.

Keywords: network worms, malware infection propagating malicious code, virus, security, VPN

Procedia PDF Downloads 358