Search results for: ArcGIS data analysis
41825 From Data Processing to Experimental Design and Back Again: A Parameter Identification Problem Based on FRAP Images
Authors: Stepan Papacek, Jiri Jablonsky, Radek Kana, Ctirad Matonoha, Stefan Kindermann
Abstract:
FRAP (Fluorescence Recovery After Photobleaching) is a widely used measurement technique to determine the mobility of fluorescent molecules within living cells. While the experimental setup and protocol for FRAP experiments are usually fixed, the data processing part is still under development. In this paper, we formulate and solve the problem of data selection, which enhances the processing of FRAP images. We introduce the concept of the irrelevant data set, i.e., the data that barely reduce the confidence interval of the estimated parameters and thus could be neglected. Based on sensitivity analysis, we solve the problem of optimal data space selection and find specific conditions for optimizing an important experimental design factor, e.g., the radius of the bleach spot. Finally, we prove a theorem stating that the integrated data approach is less precise than the full data case; i.e., the data set represented by the FRAP recovery curve leads to a larger confidence interval than the spatio-temporal (full) data.
Keywords: FRAP, inverse problem, parameter identification, sensitivity analysis, optimal experimental design
Procedia PDF Downloads 276
41824 Opening up Government Datasets for Big Data Analysis to Support Policy Decisions
Authors: K. Hardy, A. Maurushat
Abstract:
Policy makers are increasingly looking to make evidence-based decisions. Evidence-based decisions have historically used rigorous methodologies of empirical studies by research institutes, as well as less reliable immediate surveys/polls, often with limited sample sizes. As we move into the era of Big Data analytics, policy makers are looking to different methodologies to deliver reliable empirics in real time. The question is not why these people did this for the last 10 years, but why these people are doing this now and, if this is undesirable, how we can have an impact to promote change immediately. Big data analytics rely heavily on government data that has been released into the public domain. The open data movement promises greater productivity and more efficient delivery of services; however, Australian government agencies remain reluctant to release their data to the general public. This paper considers the barriers to releasing government data as open data, and how these barriers might be overcome.
Keywords: big data, open data, productivity, data governance
Procedia PDF Downloads 370
41823 Qualitative Data Analysis for Health Care Services
Authors: Taner Ersoz, Filiz Ersoz
Abstract:
This study was designed to enable the application of a multivariate technique in the interpretation of categorical data for measuring health care services satisfaction in Turkey. The data were collected from a total of 17726 respondents. The establishment of the sample group and collection of the data were carried out by a joint team from the Ministry of Health and the Turkish Statistical Institute (Turk Stat) of Turkey. Multiple correspondence analysis (MCA) was used on the data of 2882 respondents who answered the questionnaire in full. The multiple correspondence analysis indicated that, in the evaluation of health services, females, public employees, and younger and more highly educated individuals were more concerned and more likely to complain than males, private sector employees, and older and less educated individuals. Overall, 53% of the respondents were pleased with the improvements in health care services in the past three years. This study demonstrates the public consciousness of health services and health care satisfaction in Turkey. Awareness of health service quality increases with education level. Older individuals and males appear to have lower expectations of health services.
Keywords: multiple correspondence analysis, multivariate categorical data, health care services, health satisfaction survey
Procedia PDF Downloads 242
41822 Road Safety in the Great Britain: An Exploratory Data Analysis
Authors: Jatin Kumar Choudhary, Naren Rayala, Abbas Eslami Kiasari, Fahimeh Jafari
Abstract:
Great Britain has one of the safest road networks in the world. However, the consequences of any death or serious injury are devastating for loved ones, as well as for those who help the severely injured. This paper aims to analyse Great Britain's road safety situation and show the response measures for areas where the total damage caused by accidents can be significantly and quickly reduced. In this paper, we carry out an exploratory data analysis using STATS19 data. For the past 30 years, the UK has had a good record in reducing fatalities. The UK ranked third based on the number of road deaths per million inhabitants. There were around 165,000 accidents reported in Great Britain in 2009, and the number decreased every year until 2019, when it was under 120,000. The government continues to scale back road deaths, empowering responsible road users by identifying and prosecuting the parameters that make the roads less safe.
Keywords: road safety, data analysis, openstreetmap, feature expanding
Procedia PDF Downloads 139
41821 Data Driven Infrastructure Planning for Offshore Wind farms
Authors: Isha Saxena, Behzad Kazemtabrizi, Matthias C. M. Troffaes, Christopher Crabtree
Abstract:
The calculations done at the beginning of the life of a wind farm are rarely reliable, which makes it important to conduct research and study the failure and repair rates of the wind turbines under various conditions. This miscalculation happens because the current models make the simplifying assumption that the failure/repair rate remains constant over time, which means that the reliability function is exponential in nature. This research aims to create a more accurate model using sensory data and a data-driven approach. The data cleaning and data processing are done by comparing the Power Curve data of the wind turbines with SCADA data. This is then converted to times-to-repair and times-to-failure time series data. Several different mathematical functions are fitted to the times-to-failure and times-to-repair data of the wind turbine components using Maximum Likelihood Estimation and the posterior expectation method for Bayesian parameter estimation. Initial results indicate that the two-parameter Weibull function and the exponential function produce almost identical results. Further analysis is being done using complex system analysis, considering the failures of each electrical and mechanical component of the wind turbine. The aim of this project is to perform a more accurate reliability analysis that can help engineers schedule maintenance and repairs to decrease the downtime of the turbine.
Keywords: reliability, bayesian parameter inference, maximum likelihood estimation, weibull function, SCADA data
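As an illustration of the Maximum Likelihood Estimation step mentioned in this abstract, the sketch below fits a two-parameter Weibull distribution to times-to-failure data using only the Python standard library. It is not the authors' implementation: the function name `fit_weibull_mle`, the bisection bracket, and the synthetic data are assumptions made for the example. A fitted shape close to 1 corresponds to the constant failure rate of the exponential model.

```python
import math
import random


def fit_weibull_mle(times):
    """Return (shape, scale) maximizing the two-parameter Weibull likelihood."""
    xs = [t for t in times if t > 0]
    top = max(xs)
    zs = [x / top for x in xs]  # rescale to (0, 1] so powers cannot overflow
    mean_log = sum(math.log(z) for z in zs) / len(zs)

    def score(k):
        # Likelihood equation for the shape k; its root is the MLE.
        powers = [z ** k for z in zs]
        weighted = sum(p * math.log(z) for p, z in zip(powers, zs))
        return weighted / sum(powers) - 1.0 / k - mean_log

    lo, hi = 1e-3, 1e3  # assumed bracket; score() is increasing in k
    for _ in range(60):  # plain bisection
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    shape = 0.5 * (lo + hi)
    # The scale follows in closed form once the shape is known.
    scale = top * (sum(z ** shape for z in zs) / len(zs)) ** (1.0 / shape)
    return shape, scale


# Synthetic times to failure (hypothetical data, in hours).
rng = random.Random(0)
sample = [rng.weibullvariate(100.0, 1.5) for _ in range(4000)]
shape, scale = fit_weibull_mle(sample)
print(f"shape={shape:.3f} scale={scale:.1f}")
```

The shape is solved on data rescaled by the largest observation because the shape estimate does not depend on the unit of time; the scale is converted back to the original unit afterwards.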
Procedia PDF Downloads 85
41820 Performance of the CMIP5 Models in Simulation of the Present and Future Precipitation over the Lake Victoria Basin
Authors: M. A. Wanzala, L. A. Ogallo, F. J. Opijah, J. N. Mutemi
Abstract:
The usefulness and limitations of climate information are due to uncertainty inherent in the climate system. For any given region to have sustainable development, it is important to apply climate information in its socio-economic strategic plans. The overall objective of the study was to assess the performance of the Coupled Model Inter-comparison Project Phase 5 (CMIP5) models over the Lake Victoria Basin. The datasets used included the observed point station data, gridded rainfall data from the Climate Research Unit (CRU) and hindcast data from eight CMIP5 models. The methodology included trend analysis, spatial analysis, correlation analysis, Principal Component Analysis (PCA), regression analysis, and categorical statistical skill scores. Analysis of the trends in the observed rainfall records indicated an increase in rainfall variability both in space and time for all the seasons. The spatial patterns of the output from the MPI, MIROC, EC-EARTH and CNRM models were closest to the observed rainfall patterns.
Keywords: categorical statistics, coupled model inter-comparison project, principal component analysis, statistical downscaling
Procedia PDF Downloads 368
41819 A Highly Accurate Computer-Aided Diagnosis: CAD System for the Diagnosis of Breast Cancer by Using Thermographic Analysis
Authors: Mahdi Bazarganigilani
Abstract:
Computer-aided diagnosis (CAD) systems can play crucial roles in diagnosing serious diseases such as breast cancer at the earliest stage. In this paper, a CAD system for the diagnosis of breast cancer is introduced and evaluated. This CAD system was developed by applying spatio-temporal analysis, based on the wavelet transformation, to a set of consecutive thermographic images. Using this analysis, a very accurate machine learning model based on random forest was obtained. The final results showed a promising F1 score of 91% on sample data from 200 patients. The CAD system was further extended to obtain a detailed analysis of the effect of smaller sub-areas of each breast on the occurrence of cancer.
Keywords: computer-aided diagnosis systems, thermographic analysis, spatio-temporal analysis, image processing, machine learning
Procedia PDF Downloads 209
41818 FRATSAN: A New Software for Fractal Analysis of Signals
Authors: Hamidreza Namazi
Abstract:
Fractal analysis assesses the fractal characteristics of data. It consists of several methods to assign fractal characteristics to a dataset, which may be a theoretical dataset or a pattern or signal extracted from phenomena including natural geometric objects, sound, market fluctuations, heart rates, digital images, molecular motion, networks, etc. Fractal analysis is now widely used in all areas of science. An important limitation of fractal analysis is that arriving at an empirically determined fractal dimension does not necessarily prove that a pattern is fractal; rather, other essential characteristics have to be considered. For this purpose, a Visual C++ based software called FRATSAN (FRActal Time Series ANalyser) was developed, which extracts information from signals through three measures: Fractal Dimension, Jeffrey’s Measure and Hurst Exponent. After computing these measures, the software plots the graphs for each measure. Besides computing the three measures, the software can classify whether the signal is fractal or not. The software uses a dynamic method of analysis for all the measures. A sliding window is selected with a length equal to 10% of the total number of data entries. This sliding window is moved one data entry at a time to obtain all the measures. This makes the computation very sensitive to slight changes in data, thereby giving the user a fine-grained analysis of the data. In order to test the performance of this software, a set of EEG signals was given as input and the results were computed and plotted. This software is useful not only for fundamental fractal analysis of signals but also for other purposes. For instance, by analyzing the Hurst exponent plot of a given EEG signal in patients with epilepsy, the onset of a seizure can be predicted by noticing sudden changes in the plot.
Keywords: EEG signals, fractal analysis, fractal dimension, hurst exponent, Jeffrey’s measure
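The sliding-window scheme described in this abstract (a window of 10% of the data entries, moved one entry at a time) can be sketched as follows. This is only an illustration in Python, not FRATSAN's Visual C++ code: the names `sliding_measure` and `crude_hurst` are hypothetical, and the single-scale rescaled-range estimate used here is a deliberately crude stand-in for whatever Hurst estimator the software actually implements.

```python
import math
import random


def crude_hurst(window):
    """Single-scale rescaled-range estimate: log(R/S) / log(n). Crude by design."""
    n = len(window)
    mean = sum(window) / n
    cumulative = 0.0
    lowest = highest = 0.0
    for value in window:
        cumulative += value - mean          # running sum of deviations
        lowest = min(lowest, cumulative)
        highest = max(highest, cumulative)
    spread = math.sqrt(sum((v - mean) ** 2 for v in window) / n)
    if spread == 0 or highest == lowest:
        return float("nan")                 # flat window: estimate undefined
    return math.log((highest - lowest) / spread) / math.log(n)


def sliding_measure(signal, measure, fraction=0.10):
    """Apply `measure` to every window of length fraction * len(signal), step 1."""
    width = max(2, round(len(signal) * fraction))
    return [measure(signal[i:i + width])
            for i in range(len(signal) - width + 1)]


# Hypothetical input: 200 samples of Gaussian noise standing in for a signal.
rng = random.Random(1)
signal = [rng.gauss(0.0, 1.0) for _ in range(200)]
curve = sliding_measure(signal, crude_hurst)
print(len(curve))  # one value per window position
```

Because the window advances by a single entry, a signal of N entries with window length w yields N - w + 1 values, which is what makes the resulting curve responsive to small local changes.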
Procedia PDF Downloads 466
41817 Finding Data Envelopment Analysis Targets Using Multi-Objective Programming in DEA-R with Stochastic Data
Authors: R. Shamsi, F. Sharifi
Abstract:
In this paper, we obtain the projection of inefficient units in data envelopment analysis (DEA) in the case of stochastic inputs and outputs using the multi-objective programming (MOP) structure. In some problems, the inputs might be stochastic while the outputs are deterministic, and vice versa. In such cases, we propose a multi-objective DEA-R model because in some cases (e.g., when unnecessary and irrational weights by the BCC model reduce the efficiency score), an efficient decision-making unit (DMU) is introduced as inefficient by the BCC model, whereas the DMU is considered efficient by the DEA-R model. In some other cases, only the ratio of stochastic data may be available (e.g., the ratio of stochastic inputs to stochastic outputs). Thus, we provide a multi-objective DEA model without explicit outputs and prove that the input-oriented MOP DEA-R model in the invariable return to scale case can be replaced by the MOP-DEA model without explicit outputs in the variable return to scale and vice versa. Using the interactive methods for solving the proposed model yields a projection corresponding to the viewpoint of the DM and the analyst, which is nearer to reality and more practical. Finally, an application is provided.
Keywords: DEA-R, multi-objective programming, stochastic data, data envelopment analysis
Procedia PDF Downloads 104
41816 Investigation of Maritime Accidents with Exploratory Data Analysis in the Strait of Çanakkale (Dardanelles)
Authors: Gizem Kodak
Abstract:
The Strait of Çanakkale, together with the Strait of Istanbul and the Sea of Marmara, forms the Turkish Straits System. In other words, the Strait of Çanakkale is the southern gate of the system that connects the Black Sea countries with the other countries of the world. Due to the heavy maritime traffic, it is important to scientifically examine the accident characteristics in the region. In particular, the results indicated by the descriptive statistics are of critical importance in order to strengthen the safety of navigation. At this point, exploratory data analysis offers strategic outputs in terms of defining the problem and knowing the strengths and weaknesses against possible accident risk. The study aims to determine the accident characteristics in the Strait of Çanakkale with temporal and spatial analysis of historical data, using Exploratory Data Analysis (EDA) as the research method. The study's results will reveal the general characteristics of maritime accidents in the region and form the infrastructure for future studies. Therefore, the text provides a clear description of the research goals and methodology, and the study's contributions are well-defined.
Keywords: maritime accidents, EDA, Strait of Çanakkale, navigational safety
Procedia PDF Downloads 95
41815 Classification of Poverty Level Data in Indonesia Using the Naïve Bayes Method
Authors: Anung Style Bukhori, Ani Dijah Rahajoe
Abstract:
Poverty poses a significant challenge in Indonesia, requiring an effective analytical approach to understand and address this issue. In this research, we applied the Naïve Bayes classification method to examine and classify poverty data in Indonesia. The main focus is on classifying data using RapidMiner, a powerful data analysis platform. The analysis process involves data splitting to train and test the classification model. First, we collected and prepared a poverty dataset that includes various factors such as education, employment, and health. The experimental results indicate that the Naïve Bayes classification model can provide accurate predictions regarding the risk of poverty. The use of RapidMiner in the analysis process offers flexibility and efficiency in evaluating the model's performance. The classification produces several values to serve as the standard for classifying poverty data in Indonesia using Naïve Bayes. The accuracy obtained is 40.26%, with a recall of 35.94% for the moderate class, 63.16% for the high class, and 38.03% for the low class. The precision is 58.97% for the moderate class, 17.39% for the high class, and 58.70% for the low class. These results can be seen from the graph below.
Keywords: poverty, classification, naïve bayes, Indonesia
Procedia PDF Downloads 53
41814 Methodology for the Multi-Objective Analysis of Data Sets in Freight Delivery
Authors: Dale Dzemydiene, Aurelija Burinskiene, Arunas Miliauskas, Kristina Ciziuniene
Abstract:
Data flow and the purpose of reporting the data are different and dependent on business needs. Different parameters are reported and transferred regularly during freight delivery. These business practices form the dataset constructed for each time point, which contains all the information required for freight moving decisions. As a significant amount of these data is used for various purposes, an integrating methodological approach must be developed to respond to the indicated problem. The proposed methodology contains several steps: (1) collecting context data sets and data validation; (2) multi-objective analysis for optimizing freight transfer services. For data validation, the study uses Grubbs outlier analysis, particularly for data cleaning and the identification of the statistical significance of data reporting event cases. The Grubbs test is often used because it tests one extreme value at a time that exceeds the bounds expected under a normal distribution. In the study area, the test has not been widely applied by authors, except when the Grubbs test for outlier detection was used to identify outliers in fuel consumption data. In the study, the authors applied the method with a confidence level of 99%. For the multi-objective analysis, the authors select the forms of construction of genetic algorithms that have more possibilities to extract the best solution. For freight delivery management, the schemas of the genetic algorithms' structure are used as a more effective technique. Accordingly, the adaptable genetic algorithm is applied to describe the process of choosing an effective transportation corridor. In this study, multi-objective genetic algorithm methods are used to optimize the data evaluation and select the appropriate transport corridor.
The authors suggest a methodology for the multi-objective analysis, which evaluates collected context data sets and uses this evaluation to determine a delivery corridor for freight transfer service in the multi-modal transportation network. In the multi-objective analysis, the authors include safety components, the number of accidents per year, and freight delivery time in the multi-modal transportation network. The proposed methodology has practical value in the management of multi-modal transportation processes.
Keywords: multi-objective, analysis, data flow, freight delivery, methodology
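To make the Grubbs outlier step in this abstract concrete, here is a minimal Python sketch that computes the Grubbs statistic for the single most extreme value. It is an illustration only: the function name `grubbs_statistic` and the fuel-consumption readings are hypothetical, and the comparison against a critical value is left to the reader because it needs a Student t quantile, which the Python standard library does not provide.

```python
import math


def grubbs_statistic(values):
    """Return (G, index) for the value farthest from the sample mean."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (n - 1 in the denominator).
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    index = max(range(n), key=lambda i: abs(values[i] - mean))
    return abs(values[index] - mean) / sd, index


# Hypothetical fuel-consumption readings with one suspicious entry.
readings = [10.0, 10.1, 9.9, 10.2, 9.8, 15.0]
g, suspect = grubbs_statistic(readings)
print(f"G={g:.3f} at index {suspect}")

# The value is flagged as an outlier when G exceeds the two-sided critical value
#   G_crit = (n - 1) / sqrt(n) * sqrt(t**2 / (n - 2 + t**2)),
# where t is the upper alpha / (2 n) critical value of the t distribution with
# n - 2 degrees of freedom (alpha = 0.01 for the 99% level quoted above).
```

Since the test examines one value at a time, it is usually repeated after removing a flagged value until no further outlier is detected.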
Procedia PDF Downloads 179
41813 Preliminary Design of Maritime Energy Management System: Naval Architectural Approach to Resolve Recent Limitations
Authors: Seyong Jeong, Jinmo Park, Jinhyoun Park, Boram Kim, Kyoungsoo Ahn
Abstract:
Energy management in the maritime industry is being required by economics and by new legislative actions taken by the International Maritime Organization (IMO) and the European Union (EU). In response, various performance monitoring methodologies and data collection practices have been examined by different stakeholders. While many assorted advancements in operation and technology are applicable, their adoption in the shipping industry remains low. This slow uptake can be attributed to many different barriers, such as data analysis problems, misreported data, and feedback problems. This study presents a conceptual design of an energy management system (EMS) and proposes a methodology to resolve the limitations (e.g., data normalization using naval architectural evaluation, management of misrepresented data, and feedback from shore to ship through management of performance analysis history). We expect this system to enable even short-term charterers to assess ship performance properly and implement sustainable fleet control.
Keywords: data normalization, energy management system, naval architectural evaluation, ship performance analysis
Procedia PDF Downloads 448
41812 Big Brain: A Single Database System for a Federated Data Warehouse Architecture
Authors: X. Gumara Rigol, I. Martínez de Apellaniz Anzuola, A. Garcia Serrano, A. Franzi Cros, O. Vidal Calbet, A. Al Maruf
Abstract:
Traditional federated architectures for data warehousing work well when corporations have existing regional data warehouses and there is a need to aggregate data at a global level. Schibsted Media Group has been maturing from a decentralised organisation into a more globalised one and needed to build some of the regional data warehouses for some brands at the same time as the global one. In this paper, we present the architectural alternatives studied and why a custom federated approach was the recommendation taken forward to implementation. Although the data warehouses are logically federated, the implementation uses a single database system, which presented many advantages, such as cost reduction and improved data access for global users, allowing consumers of the data to have a common data model for detailed analysis across different geographies and a flexible layer for local specific needs in the same place.
Keywords: data integration, data warehousing, federated architecture, Online Analytical Processing (OLAP)
Procedia PDF Downloads 234
41811 Measured versus Default Interstate Traffic Data in New Mexico, USA
Authors: M. A. Hasan, M. R. Islam, R. A. Tarefder
Abstract:
This study investigates how site-specific traffic data differ from the Mechanistic Empirical Pavement Design Software default values. Two Weigh-in-Motion (WIM) stations were installed on Interstate-40 (I-40) and Interstate-25 (I-25) to develop site-specific data. A computer program named WIM Data Analysis Software (WIMDAS) was developed using Microsoft C-Sharp (.Net) for quality checking and processing of raw WIM data. A complete year of data, from November 2013 to October 2014, was analyzed using the developed WIM Data Analysis Program. After that, the vehicle class distribution, directional distribution, lane distribution, monthly adjustment factor, hourly distribution, axle load spectra, average number of axles per vehicle, axle spacing, lateral wander distribution, and wheelbase distribution were calculated. Then a comparative study was done between the measured data and the AASHTOWare default values. It was found that the measured general traffic inputs for I-40 and I-25 differ significantly from the default values.
Keywords: AASHTOWare, traffic, weigh-in-motion, axle load distribution
Procedia PDF Downloads 340
41810 Additive Weibull Model Using Warranty Claim and Finite Element Analysis Fatigue Analysis
Authors: Kanchan Mondal, Dasharath Koulage, Dattatray Manerikar, Asmita Ghate
Abstract:
This paper presents an additive reliability model using warranty data and Finite Element Analysis (FEA) data. Warranty data for any product give insight into its underlying issues. They are often used by reliability engineers to build prediction models that forecast the failure rate of parts. But there is one major limitation in using warranty data for prediction. Warranty periods constitute only a small fraction of the total lifetime of a product; most of the time they cover only the infant mortality and useful life zones of the bathtub curve. Predicting with warranty data alone in these cases does not generally provide results with the desired accuracy. The failure rate of a mechanical part is driven by random issues initially and by wear-out or usage related issues at later stages of the lifetime. For better predictability of the failure rate, one needs to explore the failure rate behavior in the wear-out zone of the bathtub curve. Due to cost and time constraints, it is not always possible to test samples until failure, but FEA fatigue analysis can provide the failure rate behavior of a part well beyond the warranty period in less time and at lower cost. In this work, the authors propose an Additive Weibull Model, which makes use of both warranty and FEA fatigue analysis data for predicting failure rates. It involves modeling two data sets of a part, one with existing warranty claims and the other with fatigue life data. Hazard rate based Weibull estimation has been used for modeling the warranty data, whereas S-N curve based Weibull parameter estimation is used for the FEA data. The parameters of the two separate Weibull models are estimated and combined to form the proposed Additive Weibull Model for prediction.
Keywords: bathtub curve, fatigue, FEA, reliability, warranty, Weibull
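One common way to combine two Weibull models, as this abstract describes, is to add their hazard rates. The Python sketch below assumes that form; the abstract does not spell out the authors' exact formulation, and the function names and parameter values here are hypothetical. Each component is given as a (shape, scale) pair: a shape below 1 gives a decreasing, early-life hazard, and a shape above 1 gives an increasing, wear-out hazard.

```python
import math


def weibull_hazard(t, shape, scale):
    """Hazard rate of a two-parameter Weibull distribution at time t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def additive_hazard(t, early, wearout):
    """Sum of two Weibull hazards; each argument is a (shape, scale) pair."""
    return weibull_hazard(t, *early) + weibull_hazard(t, *wearout)


def additive_reliability(t, early, wearout):
    """Survival probability implied by the summed hazards."""
    exponent = sum((t / scale) ** shape for shape, scale in (early, wearout))
    return math.exp(-exponent)


# Hypothetical parameters: one component standing in for the warranty-claim
# fit, the other for the fatigue-life fit (time in operating hours).
warranty_fit = (0.8, 5000.0)
fatigue_fit = (3.0, 20000.0)
for hours in (1000.0, 10000.0, 30000.0):
    print(hours,
          additive_hazard(hours, warranty_fit, fatigue_fit),
          additive_reliability(hours, warranty_fit, fatigue_fit))
```

With these illustrative values the first term dominates at small t and the second takes over at large t, which is how a summed hazard can reproduce the falling and rising ends of a bathtub curve.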
Procedia PDF Downloads 72
41809 Assessing Spatial Associations of Mortality Patterns in Municipalities of the Czech Republic
Authors: Jitka Rychtarikova
Abstract:
Regional differences in mortality in the Czech Republic (CR) may be moderate from a broader European perspective, but important discrepancies in life expectancy can be found between smaller territorial units. In this study, territorial units are based on Administrative Districts of Municipalities with Extended Powers (MEP). This definition came into force on January 1, 2003. There are 205 units and the city of Prague. MEP represents the smallest unit for which mortality patterns based on life tables can be investigated, and the Czech Statistical Office has been calculating such life tables (every five years) since 2004. MEP life tables from 2009-2013 for males and females allowed the investigation of three main life cycles with the use of temporary life expectancies between the exact ages of 0 and 35 and of 35 and 65, and the life expectancy at exact age 65. The results showed regional survival inequalities primarily in adult and older ages. Consequently, only mortality indicators for the adult and elderly population were related to unlinked 2011 census data for the same age groups. The most relevant socio-economic factors taken from the census are having a partner, educational level and unemployment rate. The unemployment rate was measured for adults aged 35-64 completed years. Exploratory spatial data analysis methods were used to detect regional patterns in spatially contiguous units of MEP. The presence of spatial non-stationarity (spatial autocorrelation) of mortality levels for male and female adults (35-64), and elderly males and females (65+) was tested using global Moran’s I. Spatial autocorrelation of mortality patterns was mapped using local Moran’s I with the intention of depicting clusters of low or high mortality and spatial outliers for two age groups (35-64 and 65+). The highest Moran’s I was observed for male temporary life expectancy between exact ages 35 and 65 (0.52) and the lowest was among women for life expectancy at age 65 (0.26).
Generally, men showed stronger spatial autocorrelation compared to women. The relationship between mortality indicators such as life expectancies and socio-economic factors like the percentage of males/females having a partner, the percentage of males/females with at least higher secondary education, and the percentage of unemployed males/females from the economically active population aged 35-64 years was evaluated using multiple regression (OLS). The results were then compared to outputs from geographically weighted regression (GWR). In the Czech Republic, there are two broader territories, North-West Bohemia (NWB) and North Moravia (NM), in which excess mortality is well established. Results of the t-test of spatial regression showed that for males aged 30-64 the association between mortality and unemployment (when adjusted for education and partnership) was stronger in NM compared to NWB, while educational level impacted the length of survival more in NWB. Geographic variation and relationships in mortality of the CR MEP will also be tested using the spatial Durbin approach. The calculations were conducted by means of ArcGIS 10.6 and SAS 9.4.
Keywords: Czech Republic, mortality, municipality, socio-economic factors, spatial analysis
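For readers unfamiliar with the global Moran's I statistic used in this abstract, the following Python sketch computes it from a list of values and a spatial weights matrix. The study itself used ArcGIS and SAS; the function name `morans_i`, the four-unit chain of neighbours, and the values below are assumptions made purely for illustration.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` given a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Cross-products of deviations for every pair of units, scaled by weight.
    cross = sum(weights[i][j] * deviations[i] * deviations[j]
                for i in range(n) for j in range(n))
    return (n / total_weight) * cross / sum(d * d for d in deviations)


# Four hypothetical districts in a row; 1 marks districts sharing a border.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [1.0, 1.0, 5.0, 5.0]     # similar values sit next to each other
alternating = [1.0, 5.0, 1.0, 5.0]   # neighbours always differ
print(morans_i(clustered, chain))
print(morans_i(alternating, chain))
```

Positive values indicate that neighbouring units tend to have similar levels (clusters), while negative values indicate that neighbours tend to differ; the reported values of 0.52 and 0.26 are both on the positive side.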
Procedia PDF Downloads 117
41808 Flood Disaster Prevention and Mitigation in Nigeria Using Geographic Information System
Authors: Dinebari Akpee, Friday Aabe Gaage, Florence Fred Nwaigwu
Abstract:
Natural disasters like floods affect many parts of the world, including developing countries like Nigeria. As a result, many human lives are lost, properties are damaged and much money is lost in infrastructure damage. These hazards and losses can be mitigated and reduced by providing reliable spatial information about flood risks to the general public through flood inundation maps. Flood inundation maps are crucial for emergency action plans, urban planning, ecological studies and insurance rates. Nigeria experienced the worst flood in her entire history this year. Many cities were submerged due to torrential rainfall. Poor city planning and lack of effective development control, among others, contribute to the problem too. A geographic information system (GIS) can be used to visualize the extent of flooding and to analyze flood maps to produce flood damage estimation maps and flood risk maps. In this research, the following steps were taken in the preparation of flood risk maps for the study area: (1) digitization of topographic data and preparation of a digital elevation model using ArcGIS; (2) flood simulation using a hydraulic model; and (3) integration of the first two steps to produce flood risk maps. The results show that GIS can play a crucial role in flood disaster control and mitigation.
Keywords: flood disaster, risk maps, geographic information system, hazards
Procedia PDF Downloads 225
41807 Analysis of Users’ Behavior on Book Loan Log Based on Association Rule Mining
Authors: Kanyarat Bussaban, Kunyanuth Kularbphettong
Abstract:
This research aims to create a model for analyzing student behavior in using library resources, based on a data mining technique, in the case of Suan Sunandha Rajabhat University. The model was created using association rules and the apriori algorithm. The analysis found 14 rules, which were tested with a testing data set; the ability to classify data was 79.24 percent and the MSE was 22.91. The results showed that the user behavior model built with the association rule technique can be used to manage library resources.
Keywords: behavior, data mining technique, a priori algorithm, knowledge discovery
Procedia PDF Downloads 403
41806 Analyzing Medical Workflows Using Market Basket Analysis
Authors: Mohit Kumar, Mayur Betharia
Abstract:
With the emergence of the Electronic Medical Record (EMR), the healthcare domain collects a lot of data, which has been attracting the interest of data mining experts. In the past, doctors have relied on their intuition while making critical clinical decisions. This paper presents a means of analyzing medical workflows to get business insights out of huge dumped medical databases. Market Basket Analysis (MBA), a special data mining technique, has been widely used in the marketing and e-commerce fields to discover associations between products bought together by customers. It helps businesses increase their sales by analyzing the purchasing behavior of customers and pitching the right product to the right customer. This paper is an attempt to demonstrate Market Basket Analysis applications in healthcare. In particular, it discusses applications of the Market Basket Analysis algorithm ‘Apriori’ within healthcare in major areas such as analyzing the workflow of diagnostic procedures, up-selling and cross-selling of healthcare systems, and designing more user-friendly healthcare systems. In the paper, we demonstrate the MBA applications using angiography systems, but they can be extrapolated to other modalities as well.
Keywords: data mining, market basket analysis, healthcare applications, knowledge discovery in healthcare databases, customer relationship management, healthcare systems
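The Apriori algorithm named in this abstract can be sketched in a few lines of Python. The version below is a minimal, unoptimized illustration and not the authors' code: the function name `apriori`, the procedure labels, and the support threshold are hypothetical. It finds itemsets whose support meets a minimum and then derives the confidence of one association rule.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    baskets = [frozenset(t) for t in transactions]
    total = len(baskets)

    def support(itemset):
        return sum(1 for basket in baskets if itemset <= basket) / total

    candidates = {frozenset([item]) for basket in baskets for item in basket}
    frequent = {}
    size = 1
    while candidates:
        level = {}
        for candidate in candidates:
            s = support(candidate)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)
        size += 1
        # Join step: merge frequent itemsets into candidates one item larger.
        joined = {a | b for a in level for b in level if len(a | b) == size}
        # Prune step: every (size - 1)-subset of a candidate must be frequent.
        candidates = {c for c in joined
                      if all(frozenset(sub) in level
                             for sub in combinations(c, size - 1))}
    return frequent


# Hypothetical procedure logs, one set of steps per examination.
logs = [
    {"ecg", "angiography", "stent"},
    {"ecg", "angiography"},
    {"ecg", "blood_test"},
    {"angiography", "stent"},
    {"ecg", "angiography", "stent"},
]
itemsets = apriori(logs, min_support=0.5)
pair = frozenset({"angiography", "stent"})
confidence = itemsets[pair] / itemsets[frozenset({"stent"})]
print(len(itemsets), itemsets[pair], confidence)
```

The prune step is what distinguishes Apriori from brute-force counting: a larger itemset is only counted if all of its smaller subsets were already found to be frequent.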
Procedia PDF Downloads 171
41805 A Method for Identifying Unusual Transactions in E-commerce Through Extended Data Flow Conformance Checking
Authors: Handie Pramana Putra, Ani Dijah Rahajoe
Abstract:
The proliferation of smart devices and advancements in mobile communication technologies have permeated various facets of life with the widespread influence of e-commerce. Detecting abnormal transactions holds paramount significance in this realm due to the potential for substantial financial losses. Moreover, the fusion of data flow and control flow assumes a critical role in the exploration of process modeling and data analysis, contributing significantly to the accuracy and security of business processes. This paper introduces an alternative approach to identify abnormal transactions through a model that integrates both data and control flows. Referred to as the Extended Data Petri net (DPNE), our model encapsulates the entire process, encompassing user login to the e-commerce platform and concluding with the payment stage, including the mobile transaction process. We scrutinize the model's structure, formulate an algorithm for detecting anomalies in pertinent data, and elucidate the rationale and efficacy of the comprehensive system model. A case study validates the responsive performance of each system component, demonstrating the system's adeptness in evaluating every activity within mobile transactions. Ultimately, the results of anomaly detection are derived through a thorough and comprehensive analysis.
Keywords: database, data analysis, DPNE, extended data flow, e-commerce
Procedia PDF Downloads 52
41804 Improving the Analytical Power of Dynamic DEA Models, by the Consideration of the Shape of the Distribution of Inputs/Outputs Data: A Linear Piecewise Decomposition Approach
Authors: Elias K. Maragos, Petros E. Maravelakis
Abstract:
In Dynamic Data Envelopment Analysis (DDEA), which is a subfield of Data Envelopment Analysis (DEA), the productivity of Decision Making Units (DMUs) is considered in relation to time. In this case, as most researchers accept, there are outputs that are produced by a DMU to be used as inputs at a future time. Those outputs are known as intermediates. The common DDEA models do not take into account the shape of the distribution of those input, output, or intermediate data, assuming that the distribution of their virtual value does not deviate from linearity. This weakness limits the accuracy and analytical power of the traditional DDEA models. In this paper, the authors, using the concept of piecewise linear inputs and outputs, propose an extended DDEA model. The proposed model increases the flexibility of the traditional DDEA models and improves the measurement of the dynamic performance of DMUs.
Keywords: Dynamic Data Envelopment Analysis, DDEA, piecewise linear inputs, piecewise linear outputs
Procedia PDF Downloads 159
41803 An Analysis of Sequential Pattern Mining on Databases Using Approximate Sequential Patterns
Authors: J. Suneetha, Vijayalaxmi
Abstract:
Sequential pattern mining involves applying data mining methods to large data repositories to extract usage patterns. Sequential pattern mining methodologies are used to analyze the data and identify patterns. The patterns have been used to implement efficient systems that can make recommendations based on previously observed patterns, make predictions, improve the usability of systems, detect events, and in general help in making strategic product decisions. This paper evaluates the performance of approximate sequential pattern mining, defined as identifying patterns approximately shared by many sequences. Approximate sequential patterns can effectively summarize and represent databases by identifying the underlying trends in the data. An extensive and systematic performance study was conducted over synthetic and real data. The results demonstrate that ApproxMAP is effective and scalable in mining large sequence databases with long patterns.
Keywords: multiple data, performance analysis, sequential pattern, sequence database scalability
Procedia PDF Downloads 339
41802 An Exhaustive All-Subsets Examination of Trade Theory on WTO Data
Authors: Masoud Charkhabi
Abstract:
We examine trade theory with this motivation. The full set of World Trade Organization data is organized into country-year pairs, each treated as a different entity. Topological Data Analysis reveals that among the 16 regions and 240 region-year pairs there exists, in fact, a distinguishable group of region-period pairs. The generally accepted periods of shifts from dissimilar-dissimilar to similar-similar trade in goods among regions are examined from this new perspective. The period breaks are treated as cumulative and are flexible. This type of all-subsets analysis is motivated by computer science and is made possible with Lossy Compression and Graph Theory. The results question many patterns in similar-similar to dissimilar-dissimilar trade. They also show indications of economic shifts that only later become evident in other economic metrics.
Keywords: econometrics, globalization, network science, topological data, analysis, trade theory, visualization, world trade
Procedia PDF Downloads 370
41801 Carbon Sequestration in Spatio-Temporal Vegetation Dynamics
Authors: Nothando Gwazani, K. R. Marembo
Abstract:
An increase in the atmospheric concentration of carbon dioxide (CO₂) from fossil fuels and land use change necessitates the identification of strategies for mitigating threats associated with global warming. Oceans are insufficient to offset the accelerating rate of carbon emission. However, the limitations of oceans as a means of reducing the carbon footprint can be effectively overcome by storing carbon in terrestrial carbon sinks. The gases with special optical properties that are responsible for climate warming include carbon dioxide (CO₂), water vapor, methane (CH₄), nitrous oxide (N₂O), nitrogen oxides (NOₓ), stratospheric ozone (O₃), carbon monoxide (CO), and chlorofluorocarbons (CFCs). Amongst these, CO₂ plays a crucial role, as it contributes 50% of the total greenhouse effect and has been linked to climate change. Because plants act as carbon sinks, interest in terrestrial carbon sequestration has increased in an effort to explore opportunities for climate change mitigation. Removal of carbon from the atmosphere is a topical issue that addresses one important aspect of an overall strategy for carbon management, namely helping to mitigate the increasing emissions of CO₂. Thus, terrestrial ecosystems have gained importance for their potential to sequester carbon and reduce the carbon sink in oceans, which has a substantial impact on ocean species. Field data and electromagnetic spectrum bands were analyzed using ArcGIS 10.2, QGIS 2.8, and ERDAS IMAGINE 2015 to examine the vegetation distribution. Satellite remote sensing data coupled with the Normalized Difference Vegetation Index (NDVI) were employed to assess future potential changes in vegetation distributions in the Eastern Cape Province of South Africa. The 5-year interval analysis examines the amount of carbon absorbed using vegetation distribution.
In 2015, the numerical results showed low vegetation distribution, which therefore increased the acidity of the oceans and gravely affected fish species and corals. The outcomes suggest that the study area could be effectively utilized for carbon sequestration so as to mitigate ocean acidification. The vegetation changes measured through this investigation suggest an environmental shift and a reduced vegetation carbon sink, which threaten biodiversity and the ecosystem. In order to sustain the amount of carbon in terrestrial ecosystems, the identified ecological factors should be enhanced through the application of good land and forest management practices. This will increase the carbon stock of terrestrial ecosystems, thereby reducing direct loss to the atmosphere.
Keywords: remote sensing, vegetation dynamics, carbon sequestration, terrestrial carbon sink
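For readers unfamiliar with NDVI, the index is computed per pixel from near-infrared (NIR) and red reflectance as (NIR − Red) / (NIR + Red). The sketch below shows that calculation on small in-memory grids; the function names, the zero-signal convention, and any band values passed in are assumptions for illustration and do not come from the study, which used ArcGIS, QGIS, and ERDAS IMAGINE.

```python
def ndvi(nir, red):
    """NDVI for one pixel from near-infrared and red reflectance."""
    total = nir + red
    if total == 0:
        # No signal in either band; returning 0.0 is a convention assumed for this sketch.
        return 0.0
    return (nir - red) / total


def mean_ndvi(nir_band, red_band):
    """Average NDVI over two equally sized 2-D grids (lists of rows)."""
    values = [
        ndvi(n, r)
        for nir_row, red_row in zip(nir_band, red_band)
        for n, r in zip(nir_row, red_row)
    ]
    return sum(values) / len(values)
```

Comparing `mean_ndvi` for the same area at two dates gives a crude indication of whether vegetation cover has increased or declined between them.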
Procedia PDF Downloads 150
41800 Analysing Techniques for Fusing Multimodal Data in Predictive Scenarios Using Convolutional Neural Networks
Authors: Philipp Ruf, Massiwa Chabbi, Christoph Reich, Djaffar Ould-Abdeslam
Abstract:
In recent years, convolutional neural networks (CNNs) have demonstrated high performance in image analysis, but oftentimes only structured data is available for a specific problem. By interpreting structured data as images, CNNs can effectively learn and extract valuable insights from tabular data, leading to improved predictive accuracy and uncovering hidden patterns that may not be apparent in traditional structured data analysis. By applying a single neural network to multimodal data, e.g., both structured and unstructured information, significant advantages in terms of time complexity and energy efficiency can be achieved. Converting structured data into images and merging them with existing visual material offers a promising solution for applying CNNs to multimodal datasets, as they often occur in a medical context. By employing suitable preprocessing techniques, structured data is transformed into image representations, where the respective features are expressed as different formations of colors and shapes. In an additional step, these representations are fused with existing images to incorporate both types of information. The resulting image is then analyzed using a CNN.
Keywords: CNN, image processing, tabular data, mixed dataset, data transformation, multimodal fusion
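The abstract does not say how features are encoded as images, so the following is only one simple possibility, sketched under stated assumptions: each feature is min-max scaled to a grayscale intensity, the values are laid out on a square grid, and the grid is stacked with an existing same-sized image as an extra channel. The names `row_to_image` and `fuse`, and the per-feature `mins` and `maxs` ranges, are hypothetical.

```python
import math


def row_to_image(row, mins, maxs):
    """Map one tabular row to a square grayscale grid with values in 0..255."""
    pixels = []
    for value, lo, hi in zip(row, mins, maxs):
        # Min-max scale to [0, 1]; a constant feature maps to 0.
        scaled = 0.0 if hi == lo else (value - lo) / (hi - lo)
        scaled = min(max(scaled, 0.0), 1.0)  # clamp out-of-range values
        pixels.append(round(255 * scaled))
    side = math.ceil(math.sqrt(len(pixels)))
    pixels += [0] * (side * side - len(pixels))  # pad unused cells with black
    return [pixels[i * side:(i + 1) * side] for i in range(side)]


def fuse(image, feature_image):
    """Stack two same-sized grayscale grids into one 2-channel image."""
    same_shape = len(image) == len(feature_image) and all(
        len(a) == len(b) for a, b in zip(image, feature_image)
    )
    if not same_shape:
        raise ValueError("both grids must have the same shape")
    return [
        [(p, q) for p, q in zip(img_row, feat_row)]
        for img_row, feat_row in zip(image, feature_image)
    ]
```

Stacking as a channel is just one fusion choice; placing the feature grid next to the original image would be an equally plausible reading of the abstract.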
Procedia PDF Downloads 122
41799 A Study on Big Data Analytics, Applications and Challenges
Authors: Chhavi Rana
Abstract:
The aim of the paper is to highlight existing developments in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, healthcare, and business intelligence contain voluminous and incremental data, which is hard to organise and analyse and can be dealt with using the frameworks and models in this field of study. An organization's decision-making strategy can be enhanced by using big data analytics and applying different machine learning techniques and statistical tools to such complex data sets, which in turn can benefit society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates on various frameworks in the process of analysis using different machine learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.
Keywords: big data, big data analytics, machine learning, review
Procedia PDF Downloads 81
41798 A Study on Big Data Analytics, Applications, and Challenges
Authors: Chhavi Rana
Abstract:
The aim of the paper is to highlight existing developments in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, healthcare, and business intelligence contain voluminous and incremental data, which is hard to organise and analyse and can be dealt with using the frameworks and models in this field of study. An organisation's decision-making strategy can be enhanced by using big data analytics and applying different machine learning techniques and statistical tools to such complex data sets, which in turn can benefit society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates on various frameworks in the process of analysis using different machine learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.
Keywords: big data, big data analytics, machine learning, review
Procedia PDF Downloads 93
41797 Cultivation of Stenocereus Spp. as an Option to Reduce Crop Loss Problems in High Marginalization States in Mexico
Authors: Abraham Castro-Alvarez, Luisaldo Sandate-Flores, Roberto Parra-Saldivar
Abstract:
Crop loss during the production process is a problem affecting farmers all over the world as climate change alters weather patterns. Stenocereus spp. are tropical, exotic, and endemic columnar cacti that produce a colorful and expensive fruit known as “pitaya”. Given the quality and value of the fruit, these species represent an attractive option for economic development in arid and semi-arid regions. The fruit is produced in Mexico, mainly in four regions: Mixteca Oaxaca-Puebla, Michoacan, Sinaloa-Sonora, and Jalisco-Zacatecas. Pitaya can be an option for mixed cropping in these states because of its resistance to harsh weather conditions and because of the marginalization problems that exist in these townships. As defined by the National Population Council, marginalization consists of the absence of development opportunities and the lack of capacity to obtain them. According to an analysis done in Esri ArcGIS 10.1, the potential area in the country is almost half of the territory: the total area of Mexico is 1,965,249 km² and the area with potential to produce pitaya is 960,527 km². This area covers part of the most affected townships, which also have few options of maize varieties, making maize production even harder and exposing farmers to crop loss if conditions are not good enough. This makes pitaya a good option for these farmers as an economic backup for their production.
Keywords: maize, pitaya, rain fed, Stenocereus
Procedia PDF Downloads 318
41796 Series Network-Structured Inverse Models of Data Envelopment Analysis: Pitfalls and Solutions
Authors: Zohreh Moghaddas, Morteza Yazdani, Farhad Hosseinzadeh
Abstract:
Nowadays, data envelopment analysis (DEA) models featuring network structures have gained widespread usage for evaluating the performance of production systems and activities (Decision-Making Units (DMUs)) across diverse fields. By examining the relationships between the internal stages of the network, these models offer valuable insights to managers and decision-makers regarding the performance of each stage and its impact on the overall network. To further empower system decision-makers, the inverse data envelopment analysis (IDEA) model has been introduced. This model allows key parameters to be estimated while keeping the efficiency score unchanged or improved, enabling analysis of the sensitivity of system inputs or outputs according to managers' preferences. This empowers managers to apply their preferences and policies on resources, such as inputs and outputs, and to analyze various aspects like production, resource allocation processes, and resource efficiency enhancement within the system. The results obtained can be instrumental in making informed decisions in the future. The main result of this study is an analysis of the infeasibility and incorrect estimation that may arise in the theory and application of the inverse model of data envelopment analysis with network structures. To address these pitfalls, novel protocols are proposed that circumvent these shortcomings effectively. Subsequently, several theoretical and applied problems are examined and resolved through insightful case studies.
Keywords: inverse models of data envelopment analysis, series network, estimation of inputs and outputs, efficiency, resource allocation, sensitivity analysis, infeasibility
Procedia PDF Downloads 51