Search results for: ERA-5 analysis data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 41038

Search results for: ERA-5 analysis data

40018 Banks Profitability Indicators in CEE Countries

Authors: I. Erins, J. Erina

Abstract:

The aim of the present article is to determine the impact of the external and internal factors of bank performance on the profitability indicators of the CEE countries banks in the period from 2006 to 2012. On the basis of research conducted abroad on bank and macroeconomic profitability indicators, in order to obtain research results, the authors evaluated return on average assets (ROAA) and return on average equity (ROAE) indicators of the CEE countries banks. The authors analyzed profitability indicators of banks using descriptive methods, SPSS data analysis methods as well as data correlation and linear regression analysis. The authors concluded that most internal and external indicators of bank performance have no direct effect on the profitability of the banks in the CEE countries. The only exceptions are credit risk and bank size which affect one of the measures of bank profitability–return on average equity.

Keywords: banks, CEE countries, profitability ROAA, ROAE

Procedia PDF Downloads 353
40017 A Two-Stage Bayesian Variable Selection Method with the Extension of Lasso for Geo-Referenced Data

Authors: Georgiana Onicescu, Yuqian Shen

Abstract:

Due to the complex nature of geo-referenced data, multicollinearity of the risk factors in public health spatial studies is a commonly encountered issue, which leads to low parameter estimation accuracy because it inflates the variance in the regression analysis. To address this issue, we proposed a two-stage variable selection method by extending the least absolute shrinkage and selection operator (Lasso) to the Bayesian spatial setting, investigating the impact of risk factors to health outcomes. Specifically, in stage I, we performed the variable selection using Bayesian Lasso and several other variable selection approaches. Then, in stage II, we performed the model selection with only the selected variables from stage I and compared again the methods. To evaluate the performance of the two-stage variable selection methods, we conducted a simulation study with different distributions for the risk factors, using geo-referenced count data as the outcome and Michigan as the research region. We considered the cases when all candidate risk factors are independently normally distributed, or follow a multivariate normal distribution with different correlation levels. Two other Bayesian variable selection methods, Binary indicator, and the combination of Binary indicator and Lasso were considered and compared as alternative methods. The simulation results indicated that the proposed two-stage Bayesian Lasso variable selection method has the best performance for both independent and dependent cases considered. When compared with the one-stage approach, and the other two alternative methods, the two-stage Bayesian Lasso approach provides the highest estimation accuracy in all scenarios considered.

Keywords: Lasso, Bayesian analysis, spatial analysis, variable selection

Procedia PDF Downloads 123
40016 Use of the Gas Chromatography Method for Hydrocarbons' Quality Evaluation in the Offshore Fields of the Baltic Sea

Authors: Pavel Shcherban, Vlad Golovanov

Abstract:

Currently, there is an active geological exploration and development of the subsoil shelf of the Kaliningrad region. To carry out a comprehensive and accurate assessment of the volumes and degree of extraction of hydrocarbons from open deposits, it is necessary to establish not only a number of geological and lithological characteristics of the structures under study, but also to determine the oil quality, its viscosity, density, fractional composition as accurately as possible. In terms of considered works, gas chromatography is one of the most capacious methods that allow the rapid formation of a significant amount of initial data. The aspects of the application of the gas chromatography method for determining the chemical characteristics of the hydrocarbons of the Kaliningrad shelf fields are observed in the article, as well as the correlation-regression analysis of these parameters in comparison with the previously obtained chemical characteristics of hydrocarbon deposits located on the land of the region. In the process of research, a number of methods of mathematical statistics and computer processing of large data sets have been applied, which makes it possible to evaluate the identity of the deposits, to specify the amount of reserves and to make a number of assumptions about the genesis of the hydrocarbons under analysis.

Keywords: computer processing of large databases, correlation-regression analysis, hydrocarbon deposits, method of gas chromatography

Procedia PDF Downloads 140
40015 Applications of Greenhouse Data in Guatemala in the Analysis of Sustainability Indicators

Authors: Maria A. Castillo H., Andres R. Leandro, Jose F. Bienvenido B.

Abstract:

In 2015, Guatemala officially adopted the Sustainable Development Goals (SDG) according to the 2030 Agenda agreed by the United Nations Organization. In 2016, these objectives and goals were reviewed, and the National Priorities were established within the K'atún 2032 National Development Plan. In 2019 and 2021, progress was evaluated with 120 defined indicators, and the need to improve quality and availability of statistical data necessary for the analysis of sustainability indicators was detected, so the values to be reached in 2024 and 2032 were adjusted. The need for greater agricultural technology is one of the priorities established within SDG 2 "Zero Hunger". Within this area, protected agricultural production provides greater productivity throughout the year, reduces the use of chemical products to control pests and diseases, reduces the negative impact of climate and improves product quality. During the crisis caused by Covid-19, there was an increase in exports of fruits and vegetables produced in greenhouses from Guatemala. However, this information has not been considered in the 2021 revision of the Plan. The objective of this study is to evaluate the information available on Greenhouse Agricultural Production and its integration into the Sustainability Indicators for Guatemala. This study was carried out in four phases: 1. Analysis of the Goals established for SDG 2 and the indicators included in the K'atún Plan. 2. Analysis of Environmental, Social and Economic Indicator Models. 3. Definition of territorial levels in 2 geographic scales: Departments and Municipalities. 4. Diagnosis of the available data on technological agricultural production with emphasis on Greenhouses at the 2 geographical scales. A summary of the results is presented for each phase and finally some recommendations for future research are added. The main contribution of this work is to improve the available data that allow the incorporation of some agricultural technology indicators in the established goals, to evaluate their impact on Food Security and Nutrition, Employment and Investment, Poverty, the use of Water and Natural Resources, and to provide a methodology applicable to other production models and other geographical areas.

Keywords: greenhouses, protected agriculture, sustainable indicators, Guatemala, sustainability, SDG

Procedia PDF Downloads 68
40014 Mediation of the Middle Eastern Crises and Economic Growth: An Application of Times Series Analysis

Authors: Gokhan Erkal, Gulsen Aydin, Muge Yuce, Lokman Sahin

Abstract:

This study aims to analyze the impacts of involving in mediation of conflicts in the Middle East from the perspective of the economic growth of the mediators. The Middle East is a highly volatile region of the world with rampant crises whose affects spill beyond its borders. Therefore, management and resolution of the conflicts in the region are of great significance. Mediation is an instrument used for abating violence and settling dispute. The recourse to mediation has grown to an important degree in recent years. However, for mediators, it is a daunting task to involve in the mediation of the deadlocks in the Middle East. This study tries to shed light on the positive correlation between economic growth of the mediator and the successful outcome of the mediation process to provide motivation for mediators. To this end, first, it briefly introduces the conflicts ongoing in the region and their negative impacts. Second, the methodology, time series analysis, and the data to be used, International Crisis Behavior Project Data, are presented. Third, the empirical test is carried out and the findings are evaluated. The conclusion highlights the benefits of successful mediation for the economic growth of the mediators of Middle Eastern crises.

Keywords: international crises, mediation, Middle East, times series analysis

Procedia PDF Downloads 167
40013 Multivariate Control Chart to Determine Efficiency Measurements in Industrial Processes

Authors: J. J. Vargas, N. Prieto, L. A. Toro

Abstract:

Control charts are commonly used to monitor processes involving either variable or attribute of quality characteristics and determining the control limits as a critical task for quality engineers to improve the processes. Nonetheless, in some applications it is necessary to include an estimation of efficiency. In this paper, the ability to define the efficiency of an industrial process was added to a control chart by means of incorporating a data envelopment analysis (DEA) approach. In depth, a Bayesian estimation was performed to calculate the posterior probability distribution of parameters as means and variance and covariance matrix. This technique allows to analyse the data set without the need of using the hypothetical large sample implied in the problem and to be treated as an approximation to the finite sample distribution. A rejection simulation method was carried out to generate random variables from the parameter functions. Each resulting vector was used by stochastic DEA model during several cycles for establishing the distribution of each efficiency measures for each DMU (decision making units). A control limit was calculated with model obtained and if a condition of a low level efficiency of DMU is presented, system efficiency is out of control. In the efficiency calculated a global optimum was reached, which ensures model reliability.

Keywords: data envelopment analysis, DEA, Multivariate control chart, rejection simulation method

Procedia PDF Downloads 365
40012 Urbanization and Income Inequality in Thailand

Authors: Acumsiri Tantikarnpanit

Abstract:

This paper aims to examine the relationship between urbanization and income inequality in Thailand during the period 2002–2020. Using a panel of data for 76 provinces collected from Thailand’s National Statistical Office (Labor Force Survey: LFS), as well as geospatial data from the U.S. Air Force Defense Meteorological Satellite Program (DMSP) and the Visible Infrared Imaging Radiometer Suite Day/Night band (VIIRS-DNB) satellite for nineteen selected years. This paper employs two different definitions to identify urban areas: 1) Urban areas defined by Thailand's National Statistical Office (Labor Force Survey: LFS), and 2) Urban areas estimated using nighttime light data from the DMSP and VIIRS-DNB satellite. The second method includes two sub-categories: 2.1) Determining urban areas by calculating nighttime light density with a population density of 300 people per square kilometer, and 2.2) Calculating urban areas based on nighttime light density corresponding to a population density of 1,500 people per square kilometer. The empirical analysis based on Ordinary Least Squares (OLS), fixed effects, and random effects models reveals a consistent U-shaped relationship between income inequality and urbanization. The findings from the econometric analysis demonstrate that urbanization or population density has a significant and negative impact on income inequality. Moreover, the square of urbanization shows a statistically significant positive impact on income inequality. Additionally, there is a negative association between logarithmically transformed income and income inequality. This paper also proposes the inclusion of satellite imagery, geospatial data, and spatial econometric techniques in future studies to conduct quantitative analysis of spatial relationships.

Keywords: income inequality, nighttime light, population density, Thailand, urbanization

Procedia PDF Downloads 60
40011 Analysis of Factors Affecting the Number of Infant and Maternal Mortality in East Java with Geographically Weighted Bivariate Generalized Poisson Regression Method

Authors: Luh Eka Suryani, Purhadi

Abstract:

Poisson regression is a non-linear regression model with response variable in the form of count data that follows Poisson distribution. Modeling for a pair of count data that show high correlation can be analyzed by Poisson Bivariate Regression. Data, the number of infant mortality and maternal mortality, are count data that can be analyzed by Poisson Bivariate Regression. The Poisson regression assumption is an equidispersion where the mean and variance values are equal. However, the actual count data has a variance value which can be greater or less than the mean value (overdispersion and underdispersion). Violations of this assumption can be overcome by applying Generalized Poisson Regression. Characteristics of each regency can affect the number of cases occurred. This issue can be overcome by spatial analysis called geographically weighted regression. This study analyzes the number of infant mortality and maternal mortality based on conditions in East Java in 2016 using Geographically Weighted Bivariate Generalized Poisson Regression (GWBGPR) method. Modeling is done with adaptive bisquare Kernel weighting which produces 3 regency groups based on infant mortality rate and 5 regency groups based on maternal mortality rate. Variables that significantly influence the number of infant and maternal mortality are the percentages of pregnant women visit health workers at least 4 times during pregnancy, pregnant women get Fe3 tablets, obstetric complication handled, clean household and healthy behavior, and married women with the first marriage age under 18 years.

Keywords: adaptive bisquare kernel, GWBGPR, infant mortality, maternal mortality, overdispersion

Procedia PDF Downloads 145
40010 Towards End-To-End Disease Prediction from Raw Metagenomic Data

Authors: Maxence Queyrel, Edi Prifti, Alexandre Templier, Jean-Daniel Zucker

Abstract:

Analysis of the human microbiome using metagenomic sequencing data has demonstrated high ability in discriminating various human diseases. Raw metagenomic sequencing data require multiple complex and computationally heavy bioinformatics steps prior to data analysis. Such data contain millions of short sequences read from the fragmented DNA sequences and stored as fastq files. Conventional processing pipelines consist in multiple steps including quality control, filtering, alignment of sequences against genomic catalogs (genes, species, taxonomic levels, functional pathways, etc.). These pipelines are complex to use, time consuming and rely on a large number of parameters that often provide variability and impact the estimation of the microbiome elements. Training Deep Neural Networks directly from raw sequencing data is a promising approach to bypass some of the challenges associated with mainstream bioinformatics pipelines. Most of these methods use the concept of word and sentence embeddings that create a meaningful and numerical representation of DNA sequences, while extracting features and reducing the dimensionality of the data. In this paper we present an end-to-end approach that classifies patients into disease groups directly from raw metagenomic reads: metagenome2vec. This approach is composed of four steps (i) generating a vocabulary of k-mers and learning their numerical embeddings; (ii) learning DNA sequence (read) embeddings; (iii) identifying the genome from which the sequence is most likely to come and (iv) training a multiple instance learning classifier which predicts the phenotype based on the vector representation of the raw data. An attention mechanism is applied in the network so that the model can be interpreted, assigning a weight to the influence of the prediction for each genome. Using two public real-life data-sets as well a simulated one, we demonstrated that this original approach reaches high performance, comparable with the state-of-the-art methods applied directly on processed data though mainstream bioinformatics workflows. These results are encouraging for this proof of concept work. We believe that with further dedication, the DNN models have the potential to surpass mainstream bioinformatics workflows in disease classification tasks.

Keywords: deep learning, disease prediction, end-to-end machine learning, metagenomics, multiple instance learning, precision medicine

Procedia PDF Downloads 106
40009 Consumer Values in the Perspective of Javanese Mataraman Society: Identification, Meaning, and Application

Authors: Anna Triwijayati, Etsa Astridya Setiyati, Titik Desi Harsoyo

Abstract:

Culture is the important determinant of human behavior and desire. Culture influences the consumer through the norms and values established by the society in which they live and reflect it. The cultural values of Javanese society certainly have united in the Javanese society behavior in consumption. This research is expected to give big enough theoretical benefits in the findings of cultural value in consumption in Javanese society. These can be an incentive in finding the local cultural value in many tribes in Indonesia, so one time, the local cultural value in Indonesia about consumption can be fundamental part in education and consumption practice in Indonesia. The approach used in this research is non positivist research or is known as qualitative approach. The method or type of research used in this research is ethnomethodology. The collection data is done in Central Java region. The research subject or informant is determined by the purposive technique by certain criteria determined by the researcher. The data is collected by deep interview and observation. Before the data analysis, the researcher does the storing method data stage and implements the data validity procedures. Then, the data is analyzed by the theme and interactive analysis technique. The Javanese Mataraman society has such consumption values such as has to be sufficient, be careful, economical, submit to the one who creates the life, the way life flow, and the present problem is thought in the present also. In the financial management for consumption, the consumer should have the simple life principles, has to be sufficient, has to be able to eat, has to be able to self-press, well-managed/diligent/accurate/careful, the open or transparent management, has the struggle effort, like to self-sacrifice and think about the future. The meaning of consumption value in family is centered to the submission and full-trust to God. These consumption values are applied in consumer behavior in self, family, investment and credit need in short term and long term perspective.

Keywords: values, consumer, consumption, Javanese Mataraman, ethnomethodology

Procedia PDF Downloads 377
40008 Human Resource Management Challenges in Age of Artificial Intelligence: Methodology of Case Analysis

Authors: Olga Leontjeva

Abstract:

In the age of Artificial Intelligence (AI), some organization management approaches need to be adapted or changed. Human Resource Management (HRM) is a part of organization management that is under the managers' focus nowadays, because AI integration into organization activities brings some HRM-connected challenges. The topic became more significant during the crises of many organizations in the world caused by the coronavirus pandemic (COVID-19). The paper presents an approach, which will be used for the study that is going to be focused on the various case analysis. The author of the future study will analyze the cases of the organizations from Latvia and Spain that are grouped by the size, type of activity and area of business. The information for the cases will be collected through structured interviews and online surveys. The main result presented is the questionnaire developed that will be used for the study as well as the definition and description of sampling. The first round of the survey will be based on convenience sampling that is the main limitation of the study. To conclude, the approach developed will help to collect valid data if the organizations participating in the survey are ready to share their cases in depth, so the researchers could draw the right conclusions and generalize compared organizations’ cases. The questionnaire developed for the survey is applicable for both written online data collection as well as for the interviews. The case analysis will help to identify some HRM challenges that are connected to AI integration into organization activities such as management of different generation employees and their training peculiarities.

Keywords: age of artificial intelligence, case analysis, generation Y and Z employees, human resource management

Procedia PDF Downloads 157
40007 An Analysis of the Temporal Aspects of Visual Attention Processing Using Rapid Series Visual Processing (RSVP) Data

Authors: Shreya Borthakur, Aastha Vartak

Abstract:

This Electroencephalogram (EEG) project on Rapid Visual Serial Processing (RSVP) paradigm explores the temporal dynamics of visual attention processing in response to rapidly presented visual stimuli. The study builds upon previous research that used real-world images in RSVP tasks to understand the emergence of object representations in the human brain. The objectives of the research include investigating the differences in accuracy and reaction times between 5 Hz and 20 Hz presentation rates, as well as examining the prominent brain waves, particularly alpha and beta waves, associated with the attention task. The pre-processing and data analysis involves filtering EEG data, creating epochs for target stimuli, and conducting statistical tests using MATLAB, EEGLAB, Chronux toolboxes, and R. The results support the hypotheses, revealing higher accuracy at a slower presentation rate, faster reaction times for less complex targets, and the involvement of alpha and beta waves in attention and cognitive processing. This research sheds light on how short-term memory and cognitive control affect visual processing and could have practical implications in fields like education.

Keywords: RSVP, attention, visual processing, attentional blink, EEG

Procedia PDF Downloads 52
40006 Analysis of Train Passenger Seat Using Ergonomic Function Deployment Method

Authors: Robertoes K. K. Wibowo, Siswoyo Soekarno, Irma Puspitasari

Abstract:

Indonesian people use trains for their transportation, especially they use economy class train transportation because it is cheaper and has a more precise schedule than any other ground transportation. Nevertheless, the economy class passenger seat raises some inconvenience issues for passengers. This is due to the design of the chair on the economic class of trains that did not adjusted to the shape of anthropometry of Indonesian people. Thus, research needs to be conducted on the design of the seats in the economic class of trains. The purpose of this research is to make the design of economy class passenger seats ergonomic. This research method uses questionnaires and anthropometry measurements. The data obtained is processed using House of Quality of Ergonomic Function Development. From the results of analysis and data processing were obtained important changes from the original design. Ergonomic chair design according to the analysis is a stainless steel frame, seat height 390 mm, with a seat width for each passenger of 400 mm and a depth of 400 mm. Design of the backrest has a height of 840 mm, width of 430 mm and length of 300 mm that can move at the angle of 105-115 degrees. The width of the footrest is 42 mm and 400 mm length. The thickness of the seat cushion is 100 mm.

Keywords: chair, ergonomics, function development, train passenger

Procedia PDF Downloads 276
40005 Determination of the Effective Economic and/or Demographic Indicators in Classification of European Union Member and Candidate Countries Using Partial Least Squares Discriminant Analysis

Authors: Esra Polat

Abstract:

Partial Least Squares Discriminant Analysis (PLSDA) is a statistical method for classification and consists a classical Partial Least Squares Regression (PLSR) in which the dependent variable is a categorical one expressing the class membership of each observation. PLSDA can be applied in many cases when classical discriminant analysis cannot be applied. For example, when the number of observations is low and when the number of independent variables is high. When there are missing values, PLSDA can be applied on the data that is available. Finally, it is adapted when multicollinearity between independent variables is high. The aim of this study is to determine the economic and/or demographic indicators, which are effective in grouping the 28 European Union (EU) member countries and 7 candidate countries (including potential candidates Bosnia and Herzegovina (BiH) and Kosova) by using the data set obtained from database of the World Bank for 2014. Leaving the political issues aside, the analysis is only concerned with the economic and demographic variables that have the potential influence on country’s eligibility for EU entrance. Hence, in this study, both the performance of PLSDA method in classifying the countries correctly to their pre-defined groups (candidate or member) and the differences between the EU countries and candidate countries in terms of these indicators are analyzed. As a result of the PLSDA, the value of percentage correctness of 100 % indicates that overall of the 35 countries is classified correctly. Moreover, the most important variables that determine the statuses of member and candidate countries in terms of economic indicators are identified as 'external balance on goods and services (% GDP)', 'gross domestic savings (% GDP)' and 'gross national expenditure (% GDP)' that means for the 2014 economical structure of countries is the most important determinant of EU membership. Subsequently, the model validated to prove the predictive ability by using the data set for 2015. For prediction sample, %97,14 of the countries are correctly classified. An interesting result is obtained for only BiH, which is still a potential candidate for EU, predicted as a member of EU by using the indicators data set for 2015 as a prediction sample. Although BiH has made a significant transformation from a war-torn country to a semi-functional state, ethnic tensions, nationalistic rhetoric and political disagreements are still evident, which inhibit Bosnian progress towards the EU.

Keywords: classification, demographic indicators, economic indicators, European Union, partial least squares discriminant analysis

Procedia PDF Downloads 266
40004 Graph-Based Semantical Extractive Text Analysis

Authors: Mina Samizadeh

Abstract:

In the past few decades, there has been an explosion in the amount of available data produced from various sources with different topics. The availability of this enormous data necessitates us to adopt effective computational tools to explore the data. This leads to an intense growing interest in the research community to develop computational methods focused on processing this text data. A line of study focused on condensing the text so that we are able to get a higher level of understanding in a shorter time. The two important tasks to do this are keyword extraction and text summarization. In keyword extraction, we are interested in finding the key important words from a text. This makes us familiar with the general topic of a text. In text summarization, we are interested in producing a short-length text which includes important information about the document. The TextRank algorithm, an unsupervised learning method that is an extension of the PageRank (algorithm which is the base algorithm of Google search engine for searching pages and ranking them), has shown its efficacy in large-scale text mining, especially for text summarization and keyword extraction. This algorithm can automatically extract the important parts of a text (keywords or sentences) and declare them as a result. However, this algorithm neglects the semantic similarity between the different parts. In this work, we improved the results of the TextRank algorithm by incorporating the semantic similarity between parts of the text. Aside from keyword extraction and text summarization, we develop a topic clustering algorithm based on our framework, which can be used individually or as a part of generating the summary to overcome coverage problems.

Keywords: keyword extraction, n-gram extraction, text summarization, topic clustering, semantic analysis

Procedia PDF Downloads 52
40003 Data Mining Approach for Commercial Data Classification and Migration in Hybrid Storage Systems

Authors: Mais Haj Qasem, Maen M. Al Assaf, Ali Rodan

Abstract:

Parallel hybrid storage systems consist of a hierarchy of different storage devices that vary in terms of data reading speed performance. As we ascend in the hierarchy, data reading speed becomes faster. Thus, migrating the application’ important data that will be accessed in the near future to the uppermost level will reduce the application I/O waiting time; hence, reducing its execution elapsed time. In this research, we implement trace-driven two-levels parallel hybrid storage system prototype that consists of HDDs and SSDs. The prototype uses data mining techniques to classify application’ data in order to determine its near future data accesses in parallel with the its on-demand request. The important data (i.e. the data that the application will access in the near future) are continuously migrated to the uppermost level of the hierarchy. Our simulation results show that our data migration approach integrated with data mining techniques reduces the application execution elapsed time when using variety of traces in at least to 22%.

Keywords: hybrid storage system, data mining, recurrent neural network, support vector machine

Procedia PDF Downloads 291
40002 Design and Implementation of Security Middleware for Data Warehouse Signature, Framework

Authors: Mayada Al Meghari

Abstract:

Recently, grid middlewares have provided large integrated use of network resources as the shared data and the CPU to become a virtual supercomputer. In this work, we present the design and implementation of the middleware for Data Warehouse Signature, DWS Framework. The aim of using the middleware in our DWS framework is to achieve the high performance by the parallel computing. This middleware is developed on Alchemi.Net framework to increase the security among the network nodes through the authentication and group-key distribution model. This model achieves the key security and prevents any intermediate attacks in the middleware. This paper presents the flow process structures of the middleware design. In addition, the paper ensures the implementation of security for DWS middleware enhancement with the authentication and group-key distribution model. Finally, from the analysis of other middleware approaches, the developed middleware of DWS framework is the optimal solution of a complete covering of security issues.

Keywords: middleware, parallel computing, data warehouse, security, group-key, high performance

Procedia PDF Downloads 98
40001 Determination of the Bank's Customer Risk Profile: Data Mining Applications

Authors: Taner Ersoz, Filiz Ersoz, Seyma Ozbilge

Abstract:

In this study, the clients who applied to a bank branch for loan were analyzed through data mining. The study was composed of the information such as amounts of loans received by personal and SME clients working with the bank branch, installment numbers, number of delays in loan installments, payments available in other banks and number of banks to which they are in debt between 2010 and 2013. The client risk profile was examined through Classification and Regression Tree (CART) analysis, one of the decision tree classification methods. At the end of the study, 5 different types of customers have been determined on the decision tree. The classification of these types of customers has been created with the rating of those posing a risk for the bank branch and the customers have been classified according to the risk ratings.

Keywords: client classification, loan suitability, risk rating, CART analysis

Procedia PDF Downloads 324
40000 Comparison Of Data Mining Models To Predict Future Bridge Conditions

Authors: Pablo Martinez, Emad Mohamed, Osama Mohsen, Yasser Mohamed

Abstract:

Highway and bridge agencies, such as the Ministry of Transportation in Ontario, use the Bridge Condition Index (BCI) which is defined as the weighted condition of all bridge elements to determine the rehabilitation priorities for its bridges. Therefore, accurate forecasting of BCI is essential for bridge rehabilitation budgeting planning. The large amount of data available in regard to bridge conditions for several years dictate utilizing traditional mathematical models as infeasible analysis methods. This research study focuses on investigating different classification models that are developed to predict the bridge condition index in the province of Ontario, Canada based on the publicly available data for 2800 bridges over a period of more than 10 years. The data preparation is a key factor to develop acceptable classification models even with the simplest one, the k-NN model. All the models were tested, compared and statistically validated via cross validation and t-test. A simple k-NN model showed reasonable results (within 0.5% relative error) when predicting the bridge condition in an incoming year.

Keywords: asset management, bridge condition index, data mining, forecasting, infrastructure, knowledge discovery in databases, maintenance, predictive models

Procedia PDF Downloads 175
39999 Discussion on Big Data and One of Its Early Training Application

Authors: Fulya Gokalp Yavuz, Mark Daniel Ward

Abstract:

This study focuses on a contemporary and inevitable topic of Data Science and its exemplary application for early career building: Big Data and Leaving Learning Community (LLC). ‘Academia’ and ‘Industry’ have a common sense on the importance of Big Data. However, both of them are in a threat of missing the training on this interdisciplinary area. Some traditional teaching doctrines are far away being effective on Data Science. Practitioners needs some intuition and real-life examples how to apply new methods to data in size of terabytes. We simply explain the scope of Data Science training and exemplified its early stage application with LLC, which is a National Science Foundation (NSF) founded project under the supervision of Prof. Ward since 2014. Essentially, we aim to give some intuition for professors, researchers and practitioners to combine data science tools for comprehensive real-life examples with the guides of mentees’ feedback. As a result of discussing mentoring methods and computational challenges of Big Data, we intend to underline its potential with some more realization.

Keywords: Big Data, computation, mentoring, training

Procedia PDF Downloads 343
39998 Frontier Dynamic Tracking in the Field of Urban Plant and Habitat Research: Data Visualization and Analysis Based on Journal Literature

Authors: Shao Qi

Abstract:

The article uses the CiteSpace knowledge graph analysis tool to sort and visualize the journal literature on urban plants and habitats in the Web of Science and China National Knowledge Infrastructure databases. Based on a comprehensive interpretation of the visualization results of various data sources and the description of the intrinsic relationship between high-frequency keywords using knowledge mapping, the research hotspots, processes and evolution trends in this field are analyzed. Relevant case studies are also conducted for the hotspot contents to explore the means of landscape intervention and synthesize the understanding of research theories. The results show that (1) from 1999 to 2022, the research direction of urban plants and habitats gradually changed from focusing on plant and animal extinction and biological invasion to the field of human urban habitat creation, ecological restoration, and ecosystem services. (2) The results of keyword emergence and keyword growth trend analysis show that habitat creation research has shown a rapid and stable growth trend since 2017, and ecological restoration has gained long-term sustained attention since 2004. The hotspots of future research on urban plants and habitats in China may focus on habitat creation and ecological restoration.

Keywords: research trends, visual analysis, habitat creation, ecological restoration

Procedia PDF Downloads 49
39997 Differences in Production of Knowledge between Internationally Mobile versus Nationally Mobile and Non-Mobile Scientists

Authors: Valeria Aman

Abstract:

The presented study examines the impact of international mobility on knowledge production among mobile scientists and within the sending and receiving research groups. Scientists are relevant to the dynamics of knowledge production because scientific knowledge is mainly characterized by embeddedness and tacitness. International mobility enables the dissemination of scientific knowledge to other places and encourages new combinations of knowledge. It can also increase the interdisciplinarity of research by forming synergetic combinations of knowledge. Particularly innovative ideas can have their roots in related research domains and are sometimes transferred only through the physical mobility of scientists. Diversity among scientists with respect to their knowledge base can act as an engine for the creation of knowledge. It is therefore relevant to study how knowledge acquired through international mobility affects the knowledge production process. In certain research domains, international mobility may be essential to contextualize knowledge and to gain access to knowledge located at distant places. The knowledge production process contingent on the type of international mobility and the epistemic culture of a research field is examined. The production of scientific knowledge is a multi-faceted process, the output of which is mainly published in scholarly journals. Therefore, the study builds upon publication and citation data covered in Elsevier’s Scopus database for the period of 1996 to 2015. To analyse these data, bibliometric and social network analysis techniques are used. A basic analysis of scientific output using publication data, citation data and data on co-authored publications is combined with a content map analysis. Abstracts of publications indicate whether a research stay abroad makes an original contribution methodologically, theoretically or empirically. Moreover, co-citations are analysed to map linkages among scientists and emerging research domains. Finally, acknowledgements are studied that can function as channels of formal and informal communication between the actors involved in the process of knowledge production. The results provide better understanding of how the international mobility of scientists contributes to the production of knowledge, by contrasting the knowledge production dynamics of internationally mobile scientists with those being nationally mobile or immobile. Findings also allow indicating whether international mobility accelerates the production of knowledge and the emergence of new research fields.

Keywords: bibliometrics, diversity, interdisciplinarity, international mobility, knowledge production

Procedia PDF Downloads 282
39996 Empowering a New Frontier in Heart Disease Detection: Unleashing Quantum Machine Learning

Authors: Sadia Nasrin Tisha, Mushfika Sharmin Rahman, Javier Orduz

Abstract:

Machine learning is applied in a variety of fields throughout the world. The healthcare sector has benefited enormously from it. One of the most effective approaches for predicting human heart diseases is to use machine learning applications to classify data and predict the outcome as a classification. However, with the rapid advancement of quantum technology, quantum computing has emerged as a potential game-changer for many applications. Quantum algorithms have the potential to execute substantially faster than their classical equivalents, which can lead to significant improvements in computational performance and efficiency. In this study, we applied quantum machine learning concepts to predict coronary heart diseases from text data. We experimented thrice with three different features; and three feature sets. The data set consisted of 100 data points. We pursue to do a comparative analysis of the two approaches, highlighting the potential benefits of quantum machine learning for predicting heart diseases.

Keywords: quantum machine learning, SVM, QSVM, matrix product state

Procedia PDF Downloads 75
39995 Verification & Validation of Map Reduce Program Model for Parallel K-Mediod Algorithm on Hadoop Cluster

Authors: Trapti Sharma, Devesh Kumar Srivastava

Abstract:

This paper is basically a analysis study of above MapReduce implementation and also to verify and validate the MapReduce solution model for Parallel K-Mediod algorithm on Hadoop Cluster. MapReduce is a programming model which authorize the managing of huge amounts of data in parallel, on a large number of devices. It is specially well suited to constant or moderate changing set of data since the implementation point of a position is usually high. MapReduce has slowly become the framework of choice for “big data”. The MapReduce model authorizes for systematic and instant organizing of large scale data with a cluster of evaluate nodes. One of the primary affect in Hadoop is how to minimize the completion length (i.e. makespan) of a set of MapReduce duty. In this paper, we have verified and validated various MapReduce applications like wordcount, grep, terasort and parallel K-Mediod clustering algorithm. We have found that as the amount of nodes increases the completion time decreases.

Keywords: hadoop, mapreduce, k-mediod, validation, verification

Procedia PDF Downloads 352
39994 An Analysis of the Influence of Employee Readiness for Change on TQM Implementation

Authors: Mohamed Haffar, Khalil Al-Hyari, Mohammed Khair Abu Zaid, Ramadane Djbarni, Mohammed Hamdan

Abstract:

While employee readiness for change (ERFC) is recognised as critical for total quality management (TQM) implementation, there is a lack of systematic and empirical studies regarding the relationship between ERFC dimensions and TQM. Therefore, this study proposes to fill this gap by providing empirical evidence leading to advancement in the understanding of the influences of ERFC components on TQM implementation. The empirical data for this study was drawn from a survey of 400 middle and senior managers of Jordanian firms. The analysis of the collected data, which was conducted using Structural Equation Modeling technique, revealed that three of the ERFC components, namely personally beneficial, change self-efficacy and management support are the most supportive ERFC dimensions for TQM implementation. Therefore, this paper makes a novel contribution by providing a refined and deeper comprehension of the relationships between ERFCs and TQM implementation.

Keywords: total quality management, employee readiness for change, manufacturing organisations, Jordan

Procedia PDF Downloads 542
39993 Governance vs Diaspora Remittances for Sustainable Development: A Case of Rwanda and Kenya

Authors: Albert Maake, Ifunanya Isama

Abstract:

International remittances to developing countries reached US$ 485 billion in 2018. By 2015, the East African region had surpassed US$3.5 mark. Considering this, there is no argument as to the contribution of Diaspora remittances as an alternative source of funds in the development process of the developing countries. Nevertheless, this paper seeks to argue that good governance in areas such as policy design, implementation and monitoring play a critical role in the sustainable development process of a nation as opposed to Diaspora remittances in general. Therefore this study intends at analyzing the contribution of Governance as opposed to that of Diaspora remittances for nation development. Employing documentary analysis technique, the secondary data with respect to the countries under study on Diaspora remittances will be collected. Selected indicators for Governance-HDI, Debt-to-GDP Ratio and Corruption Index, will be sourced from the World Bank Data for the purpose of consistency and where applicable the Central Statistical Agencies of the Nations under study. By means of descriptive statistics and content analysis the data will be comparatively analyzed to highlight the unique experiences in Rwanda and Kenya. The findings and interpretations from the study will affirm and promote capacity building for best practices in good governance for the countries under study.

Keywords: diaspora remittance, governance, Kenya, Rwanda, sustainable development

Procedia PDF Downloads 117
39992 Diversity and Equality in Four Finnish and Italian Energy Companies' Open Access Material

Authors: Elisa Bertagna

Abstract:

A frame analysis of the work done by various energy multinational companies concerning diversity issues and gender equality is presented. Documents of four multinational companies - two from Finland and two from Italy - have been studied. The array of companies’ documents includes data from their websites, policies and so on. The Finnish and Italian contexts have been chosen as a sample of North and South Europe, of 'advanced' and 'less advanced'. The aim of the analysis is to understand if and how human resource and diversity management in Finnish and Italian multinational energy companies communicate their activity towards the employees. Attention is given on how employees are reacting in their role and on the consequences of its social positioning. The findings of this essay are crucially important. They show how the companies in object tend to focus on the HR and DM positive actions towards female employees’ struggles since the industry is characterized by multinationals with male-dominated employees. In this way, other categories, which are also depicted as sensitive such as young and elderly people or foreigners, do not receive the same amount of attention. Consequently, power hierarchies can be found: 'women' as a social category are given more importance and space in the companies’ data than others. Consequently, the present work analysis reflects on possible struggles that such companies might be facing concerning gender biases and further diverse issues.

Keywords: energy, diversity, gender, multinationals, power hierarchies

Procedia PDF Downloads 128
39991 Towards a Secure Storage in Cloud Computing

Authors: Mohamed Elkholy, Ahmed Elfatatry

Abstract:

Cloud computing has emerged as a flexible computing paradigm that reshaped the Information Technology map. However, cloud computing brought about a number of security challenges as a result of the physical distribution of computational resources and the limited control that users have over the physical storage. This situation raises many security challenges for data integrity and confidentiality as well as authentication and access control. This work proposes a security mechanism for data integrity that allows a data owner to be aware of any modification that takes place to his data. The data integrity mechanism is integrated with an extended Kerberos authentication that ensures authorized access control. The proposed mechanism protects data confidentiality even if data are stored on an untrusted storage. The proposed mechanism has been evaluated against different types of attacks and proved its efficiency to protect cloud data storage from different malicious attacks.

Keywords: access control, data integrity, data confidentiality, Kerberos authentication, cloud security

Procedia PDF Downloads 316
39990 Analysis and Identification of Different Factors Affecting Students’ Performance Using a Correlation-Based Network Approach

Authors: Jeff Chak-Fu Wong, Tony Chun Yin Yip

Abstract:

The transition from secondary school to university seems exciting for many first-year students but can be more challenging than expected. Enabling instructors to know students’ learning habits and styles enhances their understanding of the students’ learning backgrounds, allows teachers to provide better support for their students, and has therefore high potential to improve teaching quality and learning, especially in any mathematics-related courses. The aim of this research is to collect students’ data using online surveys, to analyze students’ factors using learning analytics and educational data mining and to discover the characteristics of the students at risk of falling behind in their studies based on students’ previous academic backgrounds and collected data. In this paper, we use correlation-based distance methods and mutual information for measuring student factor relationships. We then develop a factor network using the Minimum Spanning Tree method and consider further study for analyzing the topological properties of these networks using social network analysis tools. Under the framework of mutual information, two graph-based feature filtering methods, i.e., unsupervised and supervised infinite feature selection algorithms, are used to analyze the results for students’ data to rank and select the appropriate subsets of features and yield effective results in identifying the factors affecting students at risk of failing. This discovered knowledge may help students as well as instructors enhance educational quality by finding out possible under-performers at the beginning of the first semester and applying more special attention to them in order to help in their learning process and improve their learning outcomes.

Keywords: students' academic performance, correlation-based distance method, social network analysis, feature selection, graph-based feature filtering method

Procedia PDF Downloads 116
39989 Assessing Flood Risk and Mapping Inundation Zones in the Kelantan River Basin: A Hydrodynamic Modeling Approach

Authors: Fatemehsadat Mortazavizadeh, Amin Dehghani, Majid Mirzaei, Nurulhuda Binti Mohammad Ramli, Adnan Dehghani

Abstract:

Flood is Malaysia's most common and serious natural disaster. Kelantan River Basin is a tropical basin that experiences a rainy season during North-East Monsoon from November to March. It is also one of the hardest hit areas in Peninsular Malaysia during the heavy monsoon rainfall. Considering the consequences of the flood events, it is essential to develop the flood inundation map as part of the mitigation approach. In this study, the delineation of flood inundation zone in the area of Kelantan River basin using a hydrodynamic model is done by HEC-RAS, QGIS and ArcMap. The streamflow data has been generated with the weather generator based on the observation data. Then, the data is statistically analyzed with the Extreme Value (EV1) method for 2-, 5-, 25-, 50- and 100-year return periods. The minimum depth, maximum depth, mean depth, and the standard deviation of all the scenarios, including the OBS, are observed and analyzed. Based on the results, generally, the value of the data increases with the return period for all the scenarios. However, there are certain scenarios that have different results, which not all the data obtained are increasing with the return period. Besides, OBS data resulted in the middle range within Scenario 1 to Scenario 40.

Keywords: flood inundation, kelantan river basin, hydrodynamic model, extreme value analysis

Procedia PDF Downloads 54