Search results for: data mapping
24986 Fuzzy Optimization Multi-Objective Clustering Ensemble Model for Multi-Source Data Analysis
Authors: C. B. Le, V. N. Pham
Abstract:
In modern data analysis, multi-source data appears more and more in real applications. Multi-source data clustering has emerged as an important issue in the data mining and machine learning community. Different data sources provide information about different aspects of the data; therefore, linking multi-source data is essential to improve clustering performance. However, in practice, multi-source data is often heterogeneous, uncertain, and large, which is considered a major challenge of multi-source data. Ensemble learning is a versatile machine learning model in which learning techniques can work in parallel on big data. Clustering ensembles have been shown to outperform any standard clustering algorithm in terms of accuracy and robustness. However, most traditional clustering ensemble approaches are based on a single-objective function and single-source data. This paper proposes a new clustering ensemble method for multi-source data analysis. The proposed fuzzy optimized multi-objective clustering ensemble method is called FOMOCE. Firstly, a clustering ensemble mathematical model based on the structure of the multi-objective clustering function, multi-source data, and dark knowledge is introduced. Then, rules for extracting dark knowledge from the input data, clustering algorithms, and base clusterings are designed and applied. Finally, a clustering ensemble algorithm is proposed for multi-source data analysis. The experiments were performed on standard sample data sets. The experimental results demonstrate the superior performance of the FOMOCE method compared to existing clustering ensemble methods and multi-source clustering methods.
Keywords: clustering ensemble, multi-source, multi-objective, fuzzy clustering
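The abstract gives only the outline of FOMOCE. As a rough illustration of the general clustering-ensemble idea it builds on, the following Python sketch combines several k-means base clusterings from multiple data sources through a co-association matrix; the use of plain k-means and the helper names are assumptions for illustration, not the paper's fuzzy multi-objective formulation.

```python
# Minimal co-association clustering ensemble over several data sources.
import numpy as np
from sklearn.cluster import KMeans
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def ensemble_cluster(views, k, n_base=10, seed=0):
    """Combine base clusterings from several sources ('views') of the same
    samples into one consensus partition via a co-association matrix."""
    n = views[0].shape[0]
    co = np.zeros((n, n))
    rng = np.random.RandomState(seed)
    for X in views:                        # one source at a time
        for _ in range(n_base):            # several base clusterings per source
            labels = KMeans(n_clusters=k, n_init=5,
                            random_state=rng.randint(1 << 30)).fit_predict(X)
            co += labels[:, None] == labels[None, :]
    co /= n_base * len(views)              # co-association in [0, 1]
    dist = squareform(1.0 - co, checks=False)   # distance = disagreement
    return fcluster(linkage(dist, method="average"), k, criterion="maxclust")
```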
Procedia PDF Downloads 189
24985 NANCY: Combining Adversarial Networks with Cycle-Consistency for Robust Multi-Modal Image Registration
Authors: Mirjana Ruppel, Rajendra Persad, Amit Bahl, Sanja Dogramadzi, Chris Melhuish, Lyndon Smith
Abstract:
Multimodal image registration is a profoundly complex task, which is why deep learning has been widely used to address it in recent years. However, two main challenges remain: firstly, the lack of ground truth data calls for an unsupervised learning approach, which leads to the second challenge of defining a feasible loss function that can compare two images of different modalities to judge their level of alignment. To avoid this issue altogether, we implement a generative adversarial network consisting of two registration networks GAB, GBA and two discrimination networks DA, DB connected by spatial transformation layers. GAB learns to generate a deformation field which registers an image of modality B to an image of modality A. To do that, it uses the feedback of the discriminator DB, which learns to judge the quality of alignment of the registered image B. GBA and DA learn a mapping from modality A to modality B. Additionally, a cycle-consistency loss is implemented. For this, both registration networks are employed twice, resulting in images Â, B̂, which were registered to B̃, Ã, which were in turn registered to the initial image pair A, B. Thus the resulting and initial images of the same modality can be easily compared. A dataset of liver CT and MRI was used to evaluate the quality of our approach and to compare it against learning- and non-learning-based registration algorithms. Our approach leads to Dice scores of up to 0.80 ± 0.01 and is therefore comparable to, and slightly more successful than, algorithms like SimpleElastix and VoxelMorph.
Keywords: cycle consistency, deformable multimodal image registration, deep learning, GAN
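A minimal sketch of the cycle-consistency term described above, assuming G_AB / G_BA predict deformation fields and warp() is a spatial transformation layer; these names are placeholders, not the paper's actual networks.

```python
import torch
import torch.nn.functional as F

def cycle_consistency_loss(A, B, G_AB, G_BA, warp):
    phi_ba = G_AB(A, B)                    # field registering B toward A
    phi_ab = G_BA(B, A)                    # field registering A toward B
    A_cyc = warp(warp(A, phi_ab), phi_ba)  # A -> B-space -> back to A-space
    B_cyc = warp(warp(B, phi_ba), phi_ab)  # B -> A-space -> back to B-space
    # Images of the same modality can be compared directly, e.g. with L1.
    return F.l1_loss(A_cyc, A) + F.l1_loss(B_cyc, B)
```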
Procedia PDF Downloads 131
24984 Process Modeling and Problem Solving: Connecting Two Worlds by BPMN
Authors: Gionata Carmignani, Mario G. C. A. Cimino, Franco Failli
Abstract:
Business Processes (BPs) are the key instrument to understand how companies operate at an organizational level, taking an as-is view of the workflow and identifying a to-be model to address their issues. In recent years, the Business Process Model and Notation (BPMN) has become a de-facto standard for modeling processes. However, this standard does not explicitly incorporate Problem-Solving (PS) knowledge in the Process Modeling (PM) results, so such knowledge cannot be shared or reused. Narrowing this gap is today a challenging research area. In this paper, we present a framework able to capture PS knowledge and to improve a workflow. This framework extends the BPMN specification by incorporating new general-purpose elements. A pilot scenario is also presented and discussed.
Keywords: business process management, BPMN, problem solving, process mapping
Procedia PDF Downloads 413
24983 A Lexicographic Approach to Obstacles Identified in the Ontological Representation of the Tree of Life
Authors: Sandra Young
Abstract:
The biodiversity literature is vast and heterogeneous. In today’s data age, a number of data integration and standardisation initiatives aim to facilitate simultaneous access to all the literature across biodiversity domains for research and forecasting purposes. Ontologies are being used increasingly to organise this information, but the rationalisation intrinsic to ontologies can hit obstacles when faced with the intrinsic fluidity and inconsistency found in the domains comprising biodiversity. Essentially the problem is a conceptual one: biological taxonomies are formed on the basis of specific, physical specimens, yet nomenclatural rules are used to provide labels to describe these physical objects, and these labels are ambiguous representations of the physical specimen. An example of this is the genus Melpomene, which is the scientific nomenclatural representation of a genus of ferns but also of a genus of spiders. The physical specimens for each of these are vastly different, but they have been assigned the same nomenclatural reference. While there is much research into the conceptual stability of the taxonomic concept versus the nomenclature used, to the best of our knowledge no research has yet looked empirically at the literature to see the conceptual plurality or singularity of the use of these species’ names, the linguistic representation of a physical entity. Language itself uses words as symbols to represent real-world concepts, whether physical entities or otherwise, and as such lexicography has a well-founded history in the conceptual mapping of words in context for dictionary making. This makes it an ideal candidate to explore this problem. The lexicographic approach uses corpus-based analysis to look at word use in context, with a specific focus on collocated word frequencies (the frequencies of words used in specific grammatical and collocational contexts). It allows for inconsistencies and contradictions in the source data and in fact includes these in the word characterisation, so that 100% of the available evidence is counted. Corpus analysis is indeed suggested as one of the ways to identify concepts for ontology building, because of its ability to look empirically at data and show patterns in language usage, which can indicate conceptual ideas that go beyond the words themselves. In this sense it could potentially be used to identify whether the hierarchical structures present within the empirical body of literature match those which have been identified in ontologies created to represent them. The first stages of this research have revealed a hierarchical structure that becomes apparent in the biodiversity literature when annotating scientific species’ names, common names, and more general names as classes, which will be the focus of this paper. The next step in the research focuses on a larger corpus in which specific words can be analysed and then compared with existing ontological structures looking at the same material, to evaluate the methods by means of an alternative perspective. This research aims to provide evidence as to the validity of current methods in knowledge representation for biological entities and also to shed light on the way that scientific nomenclature is used within the literature.
Keywords: ontology, biodiversity, lexicography, knowledge representation, corpus linguistics
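A toy version of the corpus step described above: counting the words that collocate with a target name inside a small window. The real study works on much larger corpora; the example sentence is contrived to echo the Melpomene ambiguity.

```python
from collections import Counter

def collocates(tokens, target, window=3):
    """Count words occurring within `window` tokens of each `target` hit."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok.lower() == target:
            lo, hi = max(0, i - window), i + window + 1
            counts.update(t.lower() for t in tokens[lo:i] + tokens[i + 1:hi])
    return counts

corpus = "the fern genus Melpomene differs from the spider genus Melpomene".split()
print(collocates(corpus, "melpomene").most_common(3))   # 'genus' dominates
```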
Procedia PDF Downloads 137
24982 Modeling Activity Pattern Using XGBoost for Mining Smart Card Data
Authors: Eui-Jin Kim, Hasik Lee, Su-Jin Park, Dong-Kyu Kim
Abstract:
Smart-card data are expected to provide information on activity patterns as an alternative to conventional person trip surveys. The focus of this study is to propose a method that trains on person trip surveys to supplement the smart-card data, which do not contain the purpose of each trip. We selected only the features available from smart-card data, such as spatiotemporal information on the trip and geographic information system (GIS) data near the stations, to train on the survey data. XGBoost, a state-of-the-art tree-based ensemble classifier, was used to train on data from multiple sources. This classifier uses a more regularized model formalization to control over-fitting and shows very fast execution times with good performance. The validation results showed that the proposed method efficiently estimated the trip purpose. GIS data of the station and duration of stay at the destination were significant features in modeling trip purpose.
Keywords: activity pattern, data fusion, smart-card, XGboost
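An illustrative sketch of the training step, assuming survey records labeled with trip purpose; the feature and file names here are hypothetical, not from the paper.

```python
import pandas as pd
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

survey = pd.read_csv("survey_trips.csv")           # labeled person-trip survey
features = ["boarding_hour", "trip_length_km",     # spatiotemporal features
            "duration_at_dest", "land_use_ratio"]  # GIS feature near station
y = LabelEncoder().fit_transform(survey["trip_purpose"])
X_tr, X_va, y_tr, y_va = train_test_split(
    survey[features], y, test_size=0.2, random_state=42)

# Regularized tree boosting, as described above, to control over-fitting.
clf = XGBClassifier(n_estimators=300, max_depth=6,
                    learning_rate=0.1, reg_lambda=1.0)
clf.fit(X_tr, y_tr)
print("validation accuracy:", clf.score(X_va, y_va))
```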
Procedia PDF Downloads 246
24981 Portable and Parallel Accelerated Development Method for Field-Programmable Gate Array (FPGA)-Central Processing Unit (CPU)-Graphics Processing Unit (GPU) Heterogeneous Computing
Authors: Nan Hu, Chao Wang, Xi Li, Xuehai Zhou
Abstract:
The field-programmable gate array (FPGA) has been widely adopted in the high-performance computing domain. In recent years, the embedded system-on-a-chip (SoC) contains a coarse-granularity multi-core CPU (central processing unit) and a mobile GPU (graphics processing unit) that can be used as general-purpose accelerators. The motivation is that algorithms with various parallel characteristics can be efficiently mapped to a heterogeneous architecture coupling these three processors. The CPU and GPU offload some computationally intensive tasks from the FPGA to reduce resource consumption and lower the overall cost of the system. However, in present common scenarios, applications utilize only one type of accelerator, because the development approaches supporting collaboration among the heterogeneous processors face challenges. Therefore, a systematic approach is needed that takes advantage of write-once-run-anywhere portability and the high execution performance of modules mapped to various architectures, and that facilitates the exploration of the design space. In this paper, a servant-execution-flow model is proposed for abstracting the cooperation of the heterogeneous processors; it supports task partition, communication, and synchronization. At its first run, the intermediate language, represented by a data flow diagram, can generate the executable code of the target processor or can be converted into high-level programming languages. Instantiation parameters efficiently control the relationship between the modules and computational units, including the mapping of two hierarchical processing units and the adjustment of data-level parallelism. An embedded system for a three-dimensional waveform oscilloscope is selected as a case study. The performance of algorithms such as contrast stretching is analyzed with implementations on various combinations of these processors. The experimental results show that the heterogeneous computing system achieves, with less than 35% of the resources, performance similar to the pure FPGA implementation and approximately the same energy efficiency.
Keywords: FPGA-CPU-GPU collaboration, design space exploration, heterogeneous computing, intermediate language, parameterized instantiation
Procedia PDF Downloads 118
24980 Applying Concept Mapping to Explore Temperature Abuse Factors in the Processes of Cold Chain Logistics Centers
Authors: Marco F. Benaglia, Mei H. Chen, Kune M. Tsai, Chia H. Hung
Abstract:
As societal and family structures, consumer dietary habits, and awareness about food safety and quality continue to evolve in most developed countries, the demand for refrigerated and frozen foods has been growing, and the issues related to their preservation have gained increasing attention. A well-established cold chain logistics system is essential to avoid any temperature abuse; therefore, assessing potential disruptions in the operational processes of cold chain logistics centers becomes pivotal. This study preliminarily employs HACCP to find disruption factors in cold chain logistics centers that may cause temperature abuse. Then, concept mapping is applied: selected experts engage in brainstorming sessions to identify any further factors. The panel consists of ten experts, including four from logistics and home delivery, two from retail distribution, one from the food industry, two from low-temperature logistics centers, and one from the freight industry. Disruptions include equipment-related aspects, human factors, management aspects, and process-related considerations. The areas of observation encompass freezer rooms, refrigerated storage areas, loading docks, sorting areas, and vehicle parking zones. The experts also categorize the disruption factors based on perceived similarities and build a similarity matrix. Each factor is evaluated for its impact, frequency, and investment importance. Next, multidimensional scaling, cluster analysis, and other methods are used to analyze these factors. Simultaneously, key disruption factors are identified based on their impact and frequency, and, subsequently, the factors that companies prioritize and are willing to invest in are determined by assessing investors’ risk-aversion behavior. Finally, Cumulative Prospect Theory (CPT) is applied to verify the risk patterns. In total, 66 disruption factors are found and categorized into six clusters: (1) "Inappropriate Use and Maintenance of Hardware and Software Facilities", (2) "Inadequate Management and Operational Negligence", (3) "Product Characteristics Affecting Quality and Inappropriate Packaging", (4) "Poor Control of Operation Timing and Missing Distribution Processing", (5) "Inadequate Planning for Peak Periods and Poor Process Planning", and (6) "Insufficient Cold Chain Awareness and Inadequate Training of Personnel". This study also identifies five critical factors in the operational processes of cold chain logistics centers: "Lack of Personnel’s Awareness Regarding Cold Chain Quality", "Personnel Not Following Standard Operating Procedures", "Personnel’s Operational Negligence", "Management’s Inadequacy", and "Lack of Personnel’s Knowledge About Cold Chain". The findings show that cold chain operators prioritize prevention and improvement efforts in the "Inappropriate Use and Maintenance of Hardware and Software Facilities" cluster, particularly focusing on the factors of "Temperature Setting Errors" and "Management’s Inadequacy". However, through the application of CPT, this study reveals that companies are not usually willing to invest in the improvement of factors related to the "Inappropriate Use and Maintenance of Hardware and Software Facilities" cluster due to its low likelihood of occurrence, although they acknowledge the severity of the consequences if it does occur. Hence, the main implication is that the key disruption factors in the processes of cold chain logistics centers are associated with personnel issues; therefore, comprehensive training, periodic audits, and the establishment of reasonable incentives and penalties for both new employees and managers may significantly reduce disruption issues.
Keywords: concept mapping, cold chain, HACCP, cumulative prospect theory
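A worked sketch of the standard Cumulative Prospect Theory pieces used to model the risk attitudes described above; the Tversky-Kahneman parameter values shown are the conventional ones, not the paper's own estimates.

```python
import numpy as np

def value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Concave for gains, convex and steeper for losses (loss aversion)."""
    x = np.asarray(x, dtype=float)
    gains = np.clip(x, 0, None) ** alpha
    losses = -lam * np.clip(-x, 0, None) ** beta
    return np.where(x >= 0, gains, losses)

def weight(p, gamma=0.61):
    """Inverse-S probability weighting: small probabilities are overweighted."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

# A rare but severe disruption: low probability, large loss. The weighted
# value shows why severity is acknowledged even when investment is withheld.
p, loss = 0.05, -100.0
print(weight(p) * value(loss))
```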
Procedia PDF Downloads 68
24979 A Mutually Exclusive Task Generation Method Based on Data Augmentation
Authors: Haojie Wang, Xun Li, Rui Yin
Abstract:
In order to solve memorization overfitting in the model-agnostic meta-learning (MAML) algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by corresponding one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data would produce a large amount of invalid data and, in the worst case, lead to an exponential growth of computation, this paper also proposes a key data extraction method that extracts only part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively solve memorization overfitting in the meta-learning MAML algorithm.
Keywords: mutex task generation, data augmentation, meta-learning, text classification
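A rough sketch of the mutual-exclusivity idea as commonly realized via label permutation: the same inputs receive a different (permuted) label assignment in each generated task, so no single fixed input-to-label mapping can fit all tasks. The details here are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def make_mutex_task(X, y, n_classes, seed):
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_classes)   # per-task label permutation
    return X, perm[y]                   # one feature set, many label mappings

X = np.random.randn(20, 8)
y = np.random.randint(0, 4, size=20)
task_a = make_mutex_task(X, y, 4, seed=1)
task_b = make_mutex_task(X, y, 4, seed=2)   # same X, conflicting labels
```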
Procedia PDF Downloads 143
24978 Reconstruction Spectral Reflectance Cube Based on Artificial Neural Network for Multispectral Imaging System
Authors: Iwan Cony Setiadi, Aulia M. T. Nasution
Abstract:
The multispectral imaging (MSI) technique has been used for skin analysis, especially for distant mapping of in-vivo skin chromophores by analyzing spectral data at each reflected image pixel. For ergonomic purposes, our multispectral imaging system is decomposed into two parts: a light source compartment based on LEDs with 11 different wavelengths and a monochromatic 8-bit CCD camera with a C-mount objective lens. Software based on a MATLAB GUI to control the system was also developed. Our system provides 11 monoband images and is coupled with software reconstructing hyperspectral cubes from these multispectral images. In this paper, we propose a new method to build a hyperspectral reflectance cube based on an artificial neural network algorithm. After preliminary corrections, a neural network is trained using the 32 natural colors from the X-Rite Color Checker Passport. The learning procedure involves acquisition of reference spectra by a spectrophotometer. This neural network is then used to retrieve a megapixel multispectral cube between 380 and 880 nm with a 5 nm resolution from a low-spectral-resolution multispectral acquisition. As hyperspectral cubes contain spectra for each pixel, comparison should be done between the theoretical values from the spectrophotometer and the reconstructed spectrum. To evaluate the performance of the reconstruction, we used the Goodness of Fit Coefficient (GFC) and the Root Mean Squared Error (RMSE). To validate the reconstruction, a set of 8 colour patches reconstructed by our MSI system and the one recorded by the spectrophotometer were compared. The average GFC was 0.9990 (standard deviation = 0.0010) and the average RMSE was 0.2167 (standard deviation = 0.064).
Keywords: multispectral imaging, reflectance cube, spectral reconstruction, artificial neural network
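A sketch of the two evaluation metrics named above, comparing a reconstructed spectrum against a spectrophotometer reference; the stand-in random spectra are illustrative only.

```python
import numpy as np

def gfc(rec, ref):
    """Goodness of Fit Coefficient: cosine similarity of two spectra;
    1.0 means identical spectral shape."""
    return np.abs(np.dot(rec, ref)) / (np.linalg.norm(rec) * np.linalg.norm(ref))

def rmse(rec, ref):
    return np.sqrt(np.mean((rec - ref) ** 2))

wavelengths = np.arange(380, 885, 5)       # 101 bands, 380-880 nm at 5 nm
ref = np.random.rand(wavelengths.size)     # stand-in reference spectrum
rec = ref + 0.01 * np.random.randn(wavelengths.size)
print(f"GFC={gfc(rec, ref):.4f}  RMSE={rmse(rec, ref):.4f}")
```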
Procedia PDF Downloads 322
24977 House Extension Strategy in High-Density Informal Settlement: A Case Study in Kampung Cikini, Jakarta, Indonesia
Authors: Meidesta Pitria, Akiko Okabe
Abstract:
In high-density informal settlements, extension of the area outside the houses commonly happens as a spatial modification response. House extension in a high-density informal settlement not only becomes a physical spatial modification that blurs the zone between private and public but also supports the growth and existence of the informal economy and other daily activities of both individuals and communities. This research took as a case study an informal settlement named Kampung Cikini, a densely populated area in Central Jakarta. The aim of this study is to identify and clarify house extension as a strategy for dealing with urbanization in an informal settlement. Using the perspective and information of housewives, the analysis is based on the assumption that land ownership transformation and the activities in the house extension area influence the different kinds of spatial modification through house extension and the local planning policy related to the implementation of the house extension strategy. The data collection was done at four sites: two sites located along an outer, wide alley and two sites located along an inner, narrow alley. In this research, data on 104 housewives in 86 houses were collected through representatives of housewives and the local leader of each site. The research started with a participatory mapping process and a deep interview with the local leader, and initiated collaboration with the housewives’ community by holding a celebration as a communal event to cultivate the issue together. This study shows that land ownership, activities, and the alley are indispensable in decisions on extension space making. The more permanent the land ownership status, the more permanent and varied the extensions that could be implemented. However, in some blocks, the presence of the original house or first landowner also has a significant role in coordination and agreement on using and modifying the extension space. In the outer, wide alley, the greater variety of activities in the front area of the houses is significantly related to the opportunity afforded by a wider alley, particularly for informal income-generating activities. In the inner, narrow alley, the limited space in front of the houses leads to more negotiation in the community for having more shared spaces, even inside their private space.
Keywords: house extension, housewives, informal settlement, kampung, high density
Procedia PDF Downloads 206
24976 Revolutionizing Traditional Farming Using Big Data/Cloud Computing: A Review on Vertical Farming
Authors: Milind Chaudhari, Suhail Balasinor
Abstract:
Due to massive deforestation and an ever-increasing population, the organic content of the soil is depleting at a much faster rate. Because of this, there is a strong chance that the entire food production in the world will drop by 40% in the next two decades. Vertical farming can help aid food production by leveraging big data and cloud computing to ensure plants are grown naturally, providing the optimum nutrients and sunlight by analyzing millions of data points. This paper outlines the most important parameters in vertical farming and how a combination of big data and AI helps in calculating and analyzing these millions of data points. Finally, the paper outlines how different organizations are controlling the indoor environment by leveraging big data to enhance food quantity and quality.
Keywords: big data, IoT, vertical farming, indoor farming
Procedia PDF Downloads 175
24975 Soil Salinity from Wastewater Irrigation in Urban Greenery
Authors: H. Nouri, S. Chavoshi Borujeni, S. Anderson, S. Beecham, P. Sutton
Abstract:
The potential risk of salt leaching through wastewater irrigation is of concern for most local governments and city councils. Despite the necessity of salinity monitoring and management in urban greenery, most attention has been on agricultural fields. This study was designed to investigate the capability and feasibility of monitoring and predicting soil salinity using near-sensing and remote sensing approaches, using EM38 surveys and a high-resolution multispectral image from WorldView-3. Veale Gardens within the Adelaide Parklands was selected as the experimental site. The results of the near-sensing investigation were validated by testing soil salinity samples in the laboratory. Over 30 band combinations forming salinity indices were tested using image processing techniques. The outcomes of the remote sensing and near-sensing approaches were compared to examine whether remotely sensed salinity indicators could map and predict the spatial variation of soil salinity through a potential statistical model. Statistical analysis was undertaken using the Stata 13 statistical package on over 52,000 points. Several regression models were fitted to the data, and mixed-effect modelling was selected as the most appropriate one, as it takes into account the systematic observation-specific unobserved heterogeneity. Results showed that SAVI (Soil Adjusted Vegetation Index) was the only salinity index that could be considered a predictor for soil salinity, but further investigation is needed. However, near sensing was found to be a rapid, practical, and realistically accurate approach for salinity mapping of heterogeneous urban vegetation.
Keywords: WorldView3, remote sensing, EM38, near sensing, urban green spaces, green smart cities
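A worked example of the SAVI index identified above as the candidate salinity predictor; this is the standard formula, with L = 0.5 as the usual soil adjustment factor, and the reflectance values are made up.

```python
import numpy as np

def savi(nir, red, L=0.5):
    """Soil Adjusted Vegetation Index from NIR and red reflectance."""
    return (1 + L) * (nir - red) / (nir + red + L)

nir = np.array([0.45, 0.30])   # example near-infrared reflectances
red = np.array([0.10, 0.20])
print(savi(nir, red))          # higher values indicate denser vegetation
```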
Procedia PDF Downloads 162
24974 Data Challenges Facing Implementation of Road Safety Management Systems in Egypt
Authors: A. Anis, W. Bekheet, A. El Hakim
Abstract:
Implementing a Road Safety Management System (SMS) in a crowded developing country such as Egypt is a necessity. Establishing a sustainable SMS requires a comprehensive, reliable data system for all information pertinent to road crashes. In this paper, the available data in Egypt are surveyed and validated for use in an SMS. The research provides some missing data and identifies the data that are unavailable in Egypt, looking forward to the contribution of the scientific society, the authorities, and the public in solving the problem of missing or unreliable crash data. The data required for implementing an SMS in Egypt are divided into three categories: the first is available data, such as fatality and injury rates, which this research shows may be inconsistent and unreliable; the second category is data that are not available but may be estimated, an example being the estimate of vehicle cost provided in this research; the third is data that are not available but can be measured case by case, such as the functional and geometric properties of a facility. Some inquiries are provided in this research for the scientific society, such as how to improve the links among stakeholders of road safety in order to obtain a consistent, non-biased, and reliable data system.
Keywords: road safety management system, road crash, road fatality, road injury
Procedia PDF Downloads 146
24973 Big Data-Driven Smart Policing: Big Data-Based Patrol Car Dispatching in Abu Dhabi, UAE
Authors: Oualid Walid Ben Ali
Abstract:
Big Data has become one of the buzzwords today. The recent explosion of digital data has led organizations, whether private or public, to a new era of more efficient decision making. At some point, businesses decided to use that concept to learn what makes their clients tick, with phrases like ‘sales funnel’ analysis, ‘actionable insights’, and ‘positive business impact’. So, it stands to reason that Big Data was viewed through green (read: money) colored lenses. Somewhere along the line, however, someone realized that collecting and processing data doesn’t have to be for business purposes only but can also be used to assist law enforcement, to improve policing, or to improve road safety. This paper presents briefly how Big Data has been used in the field of policing in order to improve the decision-making process in the daily operation of the police. As an example, we present a big-data-driven system which is used to accurately dispatch patrol cars in a geographic environment. The system is also used to allocate, in real time, the nearest patrol car to the location of an incident. This system has been implemented and applied in the Emirate of Abu Dhabi in the UAE.
Keywords: big data, big data analytics, patrol car allocation, dispatching, GIS, intelligent, Abu Dhabi, police, UAE
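A minimal sketch of nearest-car selection by great-circle distance; the deployed Abu Dhabi system is GIS-based and far richer than this, so the snippet only illustrates the allocation idea, and the car IDs and coordinates are made up.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_car(incident, cars):
    """cars maps car_id -> (lat, lon); returns the id of the closest car."""
    return min(cars, key=lambda c: haversine_km(*incident, *cars[c]))

cars = {"P-01": (24.466, 54.366), "P-02": (24.490, 54.380)}  # example positions
print(nearest_car((24.470, 54.370), cars))                   # -> "P-01"
```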
Procedia PDF Downloads 490
24972 Mining Multicity Urban Data for Sustainable Population Relocation
Authors: Xu Du, Aparna S. Varde
Abstract:
In this research, we propose to conduct diagnostic and predictive analysis of the key factors and consequences of urban population relocation. To achieve this goal, urban simulation models extract urban development trends as land use change patterns from a variety of data sources. The results are treated as part of urban big data, together with other information such as population change and economic conditions. Multiple data mining methods are deployed on this data to analyze nonlinear relationships between parameters. The result determines the driving forces of population relocation with respect to urban sprawl and urban sustainability and their related parameters. Experiments so far reveal that data mining methods discover useful knowledge from the multicity urban data. This work sets the stage for developing a comprehensive urban simulation model for catering to specific questions by targeted users. It contributes towards achieving sustainability as a whole.
Keywords: data mining, environmental modeling, sustainability, urban planning
Procedia PDF Downloads 308
24971 Model Order Reduction for Frequency Response and Effect of Order of Method for Matching Condition
Authors: Aref Ghafouri, Mohammad javad Mollakazemi, Farhad Asadi
Abstract:
In this paper, a model order reduction method is used for the approximation of linear and nonlinear behavior in some experimental data. The method can be used to obtain an offline reduced model that approximates the experimental data, reproduces and follows the data and the order of the system, and matches the experimental data at some frequency ratios. In this study, the method is compared on different experimental data, and the influence of the choice of model reduction order on obtaining the best and most sufficient matching condition for following the data is investigated in terms of the imaginary and real parts of the frequency response curve. Finally, the effect of the number of the order reduction, an important parameter, on nonlinear experimental data is explained further.
Keywords: frequency response, order of model reduction, frequency matching condition, nonlinear experimental data
Procedia PDF Downloads 402
24970 An Empirical Study of the Impacts of Big Data on Firm Performance
Authors: Thuan Nguyen
Abstract:
In the present time, data is to a data-driven, knowledge-based economy what oil was to the industrial age hundreds of years ago. Data is everywhere in vast volumes! Big data analytics is expected to help firms not only efficiently improve performance but also completely transform how they should run their business. However, employing the emergent technology successfully is not easy, and assessing the role of big data in improving firm performance is even harder. There has been a lack of studies examining the impacts of big data analytics on organizational performance. This study aimed to fill that gap. The present study suggests using firms’ intellectual capital as a proxy for big data in evaluating its impact on organizational performance. The present study employed the Value Added Intellectual Coefficient method to measure firm intellectual capital via its three main components: human capital efficiency, structural capital efficiency, and capital employed efficiency, and then used the structural equation modeling technique to model the data and test the models. The financial fundamental and market data of 100 randomly selected publicly listed firms were collected. The results of the tests showed that only human capital efficiency had a significant positive impact on firm profitability, which highlights the prominent human role in the impact of big data technology.
Keywords: big data, big data analytics, intellectual capital, organizational performance, value added intellectual coefficient
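A sketch of the standard VAIC components described above (Pulic's formulation); the example figures are made up, not from the study's sample.

```python
def vaic(value_added, human_capital, capital_employed):
    hce = value_added / human_capital                   # human capital efficiency
    sce = (value_added - human_capital) / value_added   # structural capital efficiency
    cee = value_added / capital_employed                # capital employed efficiency
    return hce, sce, cee, hce + sce + cee

hce, sce, cee, total = vaic(value_added=500.0,
                            human_capital=200.0,
                            capital_employed=1000.0)
print(f"HCE={hce:.2f} SCE={sce:.2f} CEE={cee:.2f} VAIC={total:.2f}")
```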
Procedia PDF Downloads 245
24969 Automated Test Data Generation For some types of Algorithm
Authors: Hitesh Tahbildar
Abstract:
The cost of test data generation for a program is computationally very high. In the general case, no algorithm to generate test data for all types of algorithms has been found, and the cost of generating test data differs between types of algorithm. To date, work has emphasized generating test data for different types of programming constructs rather than different types of algorithms. Here, test data generation methods have been implemented to find heuristics for different types of algorithms. Some types of algorithm, including divide and conquer, backtracking, the greedy approach, and dynamic programming, have been tested to find the minimum cost of test data generation. Our experimental results suggest that some of these types of algorithm can be used as a necessary condition for selecting heuristics, while programming constructs are a sufficient condition for selecting our heuristics. Finally, we recommend the different heuristics for test data generation to be selected for different types of algorithms.
Keywords: longest path, saturation point, lmax, kL, kS
Procedia PDF Downloads 405
24968 Automated Natural Hazard Zonation System with Internet-SMS Warning: Distributed GIS for Sustainable Societies Creating Schema and Interface for Mapping and Communication
Authors: Devanjan Bhattacharya, Jitka Komarkova
Abstract:
The research describes the implementation of a novel, stand-alone system for dynamic hazard warning. The system uses existing infrastructure already in place, such as mobile networks and a laptop/PC, plus a small software installation. The geospatial datasets are maps of a region, which are likewise inexpensive. Hence there is little need for investment, and the system reaches everyone with a mobile phone. A novel architecture for hazard assessment and warning is introduced in which major ICT technologies are interfaced to give a unique WebGIS-based dynamic, real-time geohazard warning communication system. A new architecture is introduced for integrating WebGIS with telecommunication technology. Existing technologies are interfaced in a novel architectural design to address a neglected domain in a way not done before: through dynamically updatable WebGIS-based warning communication. The work presents a new architecture and a novel way of addressing hazard warning techniques in a sustainable and user-friendly manner. The coupling of hazard zonation and hazard warning procedures into a single system is shown, and a generalized architecture for deciphering a range of geohazards has been developed. Hence the developmental work presented here can be summarized as the development of an internet-SMS-based automated geohazard warning communication system; integrating a warning communication system with a hazard evaluation system; interfacing different open-source technologies towards the design and development of a warning system; modularizing different technologies towards the development of a warning communication system; and automating data creation, transformation, and dissemination over different interfaces. The architecture of the developed warning system has been functionally automated and generalized enough that it can be used for any hazard, and the setup requirement has been kept to a minimum.
Keywords: geospatial, web-based GIS, geohazard, warning system
Procedia PDF Downloads 408
24967 The Perspective on Data Collection Instruments for Younger Learners
Authors: Hatice Kübra Koç
Abstract:
For academia, collecting reliable and valid data is one of the most significant issues for researchers. However, the procedure is not the same for all target groups: while collecting data from teenagers, young adults, or adults, researchers can use common data collection tools such as questionnaires, interviews, and semi-structured interviews; yet for young learners, and especially very young ones, such reliable and valid data collection tools cannot be easily designed or applied by researchers. In this study, firstly, common data collection tools are examined for the ‘very young’ and ‘young learners’ participant groups, since the quality and efficiency of an academic study is mainly based on a valid and correct data collection and data analysis procedure. Secondly, two different data collection instruments for very young and young learners are presented, and their efficacy is discussed. Finally, a suggested data collection tool, a performance-based questionnaire, which is specifically developed for the ‘very young’ and ‘young learners’ participant groups in the field of teaching English to young learners as a foreign language, is presented in the current study. The design procedure and suggested items/factors for the suggested data collection tool are revealed at the end of the study to help researchers who study young and very young learners.
Keywords: data collection instruments, performance-based questionnaire, young learners, very young learners
Procedia PDF Downloads 92
24966 Generating Swarm Satellite Data Using Long Short-Term Memory and Generative Adversarial Networks for the Detection of Seismic Precursors
Authors: Yaxin Bi
Abstract:
Accurate prediction and understanding of the evolution mechanisms of earthquakes remain challenging in the fields of geology, geophysics, and seismology. This study leverages Long Short-Term Memory (LSTM) networks and Generative Adversarial Networks (GANs), generative models tailored to time-series data, for generating synthetic time series based on Swarm satellite data, which will be used for detecting seismic anomalies. The LSTMs demonstrated commendable predictive performance in generating synthetic data across multiple countries. In contrast, the GAN models struggled to generate synthetic data, often producing non-informative values, although they were able to capture the distribution of the time series. These findings highlight both the promise and the challenges associated with applying deep learning techniques to generate synthetic data, underscoring the potential of deep learning for generating synthetic electromagnetic satellite data.
Keywords: LSTM, GAN, earthquake, synthetic data, generative AI, seismic precursors
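A hedged sketch of the LSTM route: predict the next sample of a series, then run the model autoregressively to synthesize a continuation. The window size, layer sizes, and the stand-in random series are illustrative assumptions, not the study's configuration.

```python
import numpy as np
import tensorflow as tf

window = 48
series = np.random.randn(5000).astype("float32")   # stand-in for Swarm data
X = np.stack([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, input_shape=(window, 1)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X[..., None], y, epochs=5, batch_size=64, verbose=0)

context = series[-window:].copy()
synthetic = []
for _ in range(100):                               # generate 100 new samples
    nxt = float(model.predict(context[None, :, None], verbose=0)[0, 0])
    synthetic.append(nxt)
    context = np.append(context[1:], nxt)          # slide the window forward
```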
Procedia PDF Downloads 32
24965 Interpretable Deep Learning Models for Medical Condition Identification
Authors: Dongping Fang, Lian Duan, Xiaojing Yuan, Mike Xu, Allyn Klunder, Kevin Tan, Suiting Cao, Yeqing Ji
Abstract:
Accurate prediction of a medical condition with direct clinical evidence is a long-sought topic in the medical management and health insurance field. Although great progress has been made with machine learning algorithms, the medical community is still, to a certain degree, suspicious about model accuracy and interpretability. This paper presents an innovative hierarchical attention deep learning model that achieves good prediction and clear interpretability that can be easily understood by medical professionals. This deep learning model uses a hierarchical attention structure that matches naturally with the medical history data structure and reflects the member’s encounter (date of service) sequence. The model attention structure consists of 3 levels: (1) attention on the medical code types (diagnosis codes, procedure codes, lab test results, and prescription drugs), (2) attention on the sequential medical encounters within a type, and (3) attention on the medical codes within an encounter and type. This model is applied to predict the occurrence of stage 3 chronic kidney disease (CKD3), using three years’ medical history of Medicare Advantage (MA) members from a top health insurance company. The model takes members’ medical events, both claims and electronic medical record (EMR) data, as input, makes a prediction of CKD3, and calculates the contribution of individual events to the predicted outcome. The model outcome can be easily explained with the clinical evidence identified by the model algorithm. Here are examples. Member A had 36 medical encounters in the past three years: multiple office visits, lab tests, and medications. The model predicts member A has a high risk of CKD3, with the following well-contributing clinical events: multiple high ‘Creatinine in Serum or Plasma’ tests and multiple low kidney-function ‘Glomerular filtration rate’ tests. Among the abnormal lab tests, more recent results contributed more to the prediction. The model also indicates that regular office visits, no abnormal findings in medical examinations, and taking proper medications decreased the CKD3 risk. Member B had 104 medical encounters in the past three years and was predicted to have a low risk of CKD3, because the model didn’t identify diagnoses, procedures, or medications related to kidney disease, and many lab test results, including ‘Glomerular filtration rate’, were within the normal range. The model accurately predicts members A and B and provides interpretable clinical evidence that is validated by clinicians. Without extra effort, the interpretation is generated directly from the model and presented together with the occurrence date. Our model uses the medical data in its most raw format without any further data aggregation, transformation, or mapping. This greatly simplifies the data preparation process, mitigates the chance for error, and eliminates the post-modeling work needed for traditional model explanation. To our knowledge, this is the first paper on an interpretable deep-learning model using a 3-level attention structure, sourcing both EMR and claims data, including all 4 types of medical data, on the entire Medicare population of a big insurance company, and, more importantly, directly generating model interpretation to support user decisions. In the future, we plan to enrich the model input by adding patients’ demographics and information from free-text physician notes.
Keywords: deep learning, interpretability, attention, big data, medical conditions
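A minimal sketch of one attention level (attention over encounters within a code type); the model described above stacks three such levels, which is not reproduced here, and the layer dimensions are arbitrary.

```python
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    """Scores each encounter embedding, softmax-normalizes the scores, and
    returns the weighted sum plus the weights; the weights serve as the
    per-encounter contributions used for interpretation."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, encounters):                 # (batch, n_encounters, dim)
        w = torch.softmax(self.score(encounters).squeeze(-1), dim=-1)
        pooled = (w.unsqueeze(-1) * encounters).sum(dim=1)
        return pooled, w

pooled, weights = AttentionPool(32)(torch.randn(4, 10, 32))
print(weights[0])   # attention weights over member 0's ten encounters
```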
Procedia PDF Downloads 91
24964 Classification Using Worldview-2 Imagery of Giant Panda Habitat in Wolong, Sichuan Province, China
Authors: Yunwei Tang, Linhai Jing, Hui Li, Qingjie Liu, Xiuxia Li, Qi Yan, Haifeng Ding
Abstract:
The giant panda (Ailuropoda melanoleuca) is an endangered species that lives mainly in central China, where bamboos act as the main food source of wild giant pandas. Knowledge of the spatial distribution of bamboos therefore becomes important for identifying the habitat of giant pandas. There have been ongoing studies on mapping bamboos and other tree species using remote sensing. WorldView-2 (WV-2) is the first high-resolution commercial satellite with eight Multi-Spectral (MS) bands. Recent studies demonstrated that WV-2 imagery has high potential in the classification of tree species. Advanced classification techniques are important for utilizing high spatial resolution imagery. It is generally agreed that object-based image analysis is a more desirable method than pixel-based analysis in processing high spatial resolution remotely sensed data. Classifiers that use spatial information combined with spectral information are known as contextual classifiers, and it is suggested that contextual classifiers can achieve greater accuracy than non-contextual classifiers. Thus, spatial correlation can be incorporated into classifiers to improve classification results. The study area is located at the Wuyipeng area in Wolong, Sichuan Province. The complex environment makes information extraction difficult, since bamboos are sparsely distributed, mixed with brushes, and covered by other trees. Extensive fieldwork in Wuyipeng was carried out twice. The first campaign was on 11th June 2014, aiming at sampling feature locations for geometric correction and collecting training samples for classification. The second fieldwork was on 11th September 2014, for the purpose of testing the classification results. In this study, spectral separability analysis was first performed to select appropriate MS bands for classification. The reflectance analysis also provided information for expanding sample points when only a few were known. Then, a spatially weighted object-based k-nearest neighbour (k-NN) classifier was applied to the selected MS bands to identify seven land cover types (bamboo, conifer, broadleaf, mixed forest, brush, bare land, and shadow), accounting for spatial correlation within classes using geostatistical modelling. The spatially weighted k-NN method was compared with three alternatives: the traditional k-NN classifier, the Support Vector Machine (SVM) method, and the Classification and Regression Tree (CART). Through field validation, it was shown that the classification result obtained using the spatially weighted k-NN method has the highest overall classification accuracy (77.61%) and Kappa coefficient (0.729); the producer’s accuracy and user’s accuracy reach 81.25% and 95.12% for the bamboo class, respectively, also higher than for the other methods. Photos of tree crowns were taken at sample locations using a fisheye camera, so the canopy density could be estimated. It is found that it is difficult to identify bamboo in areas with a large canopy density (over 0.70); it is possible to extract bamboos in areas with a medium canopy density (from 0.2 to 0.7) and in sparse forest (canopy density less than 0.2). In summary, this study explores the ability of WV-2 imagery for bamboo extraction in a mountainous region in Sichuan. The study successfully identified the bamboo distribution, providing supporting knowledge for assessing the habitats of giant pandas.
Keywords: bamboo mapping, classification, geostatistics, k-NN, worldview-2
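An illustrative spatially weighted k-NN: neighbor votes decay with spatial distance. This is a simplified stand-in for the paper's geostatistical weighting, with made-up parameter names and an exponential decay chosen for simplicity.

```python
import numpy as np

def spatial_knn(X_train, xy_train, y_train, x, xy, k=5, range_m=100.0):
    spec = np.linalg.norm(X_train - x, axis=1)       # spectral distances
    idx = np.argsort(spec)[:k]                       # k spectral neighbors
    d = np.linalg.norm(xy_train[idx] - xy, axis=1)   # spatial distances (m)
    w = np.exp(-d / range_m)                         # exponential decay weight
    classes = np.unique(y_train[idx])
    votes = [w[y_train[idx] == c].sum() for c in classes]
    return classes[int(np.argmax(votes))]            # spatially weighted vote
```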
Procedia PDF Downloads 313
24963 Optimization of Fourth Order Discrete-Approximation Inclusions
Authors: Elimhan N. Mahmudov
Abstract:
The paper concerns the necessary and sufficient conditions of optimality for the Cauchy problem of fourth order discrete (PD) and discrete-approximate (PDA) inclusions. The main problem is the formulation of the fourth order adjoint discrete and discrete-approximate inclusions and transversality conditions, which are peculiar to problems involving fourth order derivatives and approximate derivatives. Thus, the necessary and sufficient conditions of optimality are obtained, incorporating the Euler-Lagrange and Hamiltonian forms of inclusions. The derivation of the optimality conditions is based on the apparatus of the locally adjoint mapping (LAM). Moreover, in applying these results, we consider fourth order linear discrete and discrete-approximate inclusions.
Keywords: difference, optimization, fourth, approximation, transversality
Procedia PDF Downloads 374
24962 Generation of Quasi-Measurement Data for On-Line Process Data Analysis
Authors: Hyun-Woo Cho
Abstract:
To ensure the safety of a manufacturing process, one should quickly identify an assignable cause of a fault on an on-line basis. To this end, many statistical techniques, including linear and nonlinear methods, have been frequently utilized. However, such methods suffer from a major problem of small sample size, which is mostly attributed to the characteristics of the empirical models used as reference models. This work presents a new method to overcome the insufficiency of measurement data in monitoring and diagnosis tasks. Some quasi-measurement data are generated from existing data based on two indices of similarity and importance. The performance of the method is demonstrated using a real data set. The results show that the presented method is able to handle the insufficiency problem successfully. In addition, it is shown to be quite efficient in terms of computational speed and memory usage, and thus on-line implementation of the method is straightforward for monitoring and diagnosis purposes.
Keywords: data analysis, diagnosis, monitoring, process data, quality control
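A hedged sketch of the generation idea: new samples interpolated between each point and its most similar neighbor. The paper's similarity and importance indices are more elaborate than this; the snippet only conveys the flavor under that assumption.

```python
import numpy as np

def quasi_measurements(X, n_new=50, seed=0):
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X))
        d = np.linalg.norm(X - X[i], axis=1)   # similarity via distance
        d[i] = np.inf                          # exclude the point itself
        j = int(np.argmin(d))                  # most similar existing sample
        t = rng.uniform(0.2, 0.8)
        out.append((1 - t) * X[i] + t * X[j])  # interpolated quasi-measurement
    return np.vstack(out)
```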
Procedia PDF Downloads 481
24961 Detecting of Crime Hot Spots for Crime Mapping
Authors: Somayeh Nezami
Abstract:
The management of the financial and human resources of police in metropolitan areas requires extensive information and exact plans to reduce the crime rate and increase the safety of society. Geographical Information Systems have an important role in providing crime maps and their analysis. By using them to identify crime hot spots and to present the results spatially, it is possible to allocate optimum resources while presenting effective methods for decision making and preventive solutions. In this paper, we explain and compare some of the methods of hot spot analysis, such as Mode, Fuzzy Mode, and Nearest Neighbour Hierarchical spatial clustering (NNH). Then the spots with the highest crime rates of drug smuggling are obtained for one province of Iran that borders Afghanistan. We show that among these three methods, NNH leads to the best result.
Keywords: GIS, hot spots, nearest neighbor hierarchical spatial clustering, NNH, spatial analysis of crime
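A rough stand-in for NNH clustering: single-linkage grouping of incident coordinates, keeping only groups that reach a minimum size as hot spots. The distance threshold and minimum size are illustrative, not the paper's settings.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

def hot_spots(points, threshold_m=500.0, min_points=10):
    labels = fcluster(linkage(pdist(points), method="single"),
                      t=threshold_m, criterion="distance")
    sizes = np.bincount(labels)
    return [np.where(labels == c)[0]                # member indices per hot spot
            for c in np.unique(labels) if sizes[c] >= min_points]
```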
Procedia PDF Downloads 329
24960 Litho-Structural Variations and Gold Mineralization around Wonaka Schist Belt, North West Nigeria
Authors: Umar Sambo Umar, Ahmad Isah Haruna, Abubakar Sadik Maigari, Muhammad Bello Abubakar
Abstract:
Schist belts in Nigeria occur prominently west of longitude 8° E and sporadically to the east. They are upper Proterozoic, low- to medium-grade deformed metasediments and metavolcanics that were intruded by Pan-African granitoids. The Wonaka schist belt, though reportedly distinctive in composition and metamorphism, is the least understood: the hosts for primary gold have not been defined, and structures which may control primary enrichment have not been delineated. The aim of this work is to determine the relationship between litho-structures and the gold around the Wonaka schist belt through geological field mapping, petrographic studies, and structural data analysis via ArcGIS 10.2, Surfer 11.0, and Stereopro 2.0. The results show that the major rock types are mica schist and migmatites; muscovites detected during microstructural analysis suggest low-grade metamorphism in the metapelites. The shear zones identified trend North Northeast-South Southwest (NNE-SSW), and fractures trend mostly Northeast-Southwest (NE-SW), perpendicular to planes of gneissic foliation; these conform to the late Pan-African deformational episode. Pegmatite lodes, networks of cross-cutting quartz veins, and the quartz stringers hosted by both migmatites and schist are delineated as targets for primary gold mineralization, while the major confluences of the streams serve as zones for secondary (placer) gold targets, since the streams are dendritic and intermittent.
Keywords: gold mineralization, Nigeria, migmatites, Wonaka schist belt
Procedia PDF Downloads 196
24959 Emerging Technology for Business Intelligence Applications
Authors: Hsien-Tsen Wang
Abstract:
Business Intelligence (BI) has long helped organizations make informed decisions based on data-driven insights and gain competitive advantages in the marketplace. In the past two decades, businesses witnessed not only the dramatically increasing volume and heterogeneity of business data but also the emergence of new technologies, such as Artificial Intelligence (AI), the Semantic Web (SW), Cloud Computing, and Big Data. It is plausible that the convergence of these technologies would bring more value out of business data by establishing linked data frameworks and connecting data in ways that enable advanced analytics and improved data utilization. In this paper, we first review and summarize current BI applications and methodology. Emerging technologies that can be integrated into BI applications are then discussed. Finally, we conclude with a proposed synergy framework that aims at achieving a more flexible, scalable, and intelligent BI solution.
Keywords: business intelligence, artificial intelligence, semantic web, big data, cloud computing
Procedia PDF Downloads 94
24958 Using Equipment Telemetry Data for Condition-Based Maintenance Decisions
Authors: John Q. Todd
Abstract:
Given that modern equipment can provide comprehensive health, status, and error condition data via built-in sensors, maintenance organizations have a new and valuable source of insight to take advantage of. This presentation will show what these data payloads might look like and how they can be filtered, visualized, calculated into metrics, used for machine learning, and turned into alerts for further action.
Keywords: condition based maintenance, equipment data, metrics, alerts
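A sketch of a simple filter-and-alert pass over telemetry payloads; the field names and threshold limits are hypothetical, not a particular vendor's schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Reading:
    asset_id: str
    bearing_temp_c: float
    vibration_mm_s: float
    error_code: Optional[int]

LIMITS = {"bearing_temp_c": 85.0, "vibration_mm_s": 7.1}

def alerts(readings):
    for r in readings:
        if r.error_code is not None:                 # built-in error conditions
            yield r.asset_id, f"error code {r.error_code}"
        for field, limit in LIMITS.items():          # simple metric thresholds
            if getattr(r, field) > limit:
                yield r.asset_id, f"{field} above {limit}"

for asset, msg in alerts([Reading("EX-204", 91.0, 3.2, None)]):
    print(asset, msg)
```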
Procedia PDF Downloads 188
24957 Improving the Efficiency of Repacking Process with Lean Technique: The Study of Read With Me Group Company Limited
Authors: Jirayut Phetchuen, Jongkol Srithorn
Abstract:
The study examines the unloading and repacking process of Read With Me Group Company Limited. The research aims to improve the old work process and build a new, efficient process with the Lean technique and new machines, enabling faster delivery without increasing the number of employees. Currently, two employees work on a schedule of five days on and off. However, workplace injuries have delayed the delivery time, especially delivery to neighboring countries. After the process improvement, the working space increased by 25%, the process lead time decreased by 40%, the work efficiency increased by 175.82%, and the work injury rate was reduced to zero.
Keywords: lean technique, plant layout design, U-shaped disassembly line, value stream mapping
Procedia PDF Downloads 104