Search results for: data bank
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24889

Search results for: data bank

23689 The Analysis of Emergency Shutdown Valves Torque Data in Terms of Its Use as a Health Indicator for System Prognostics

Authors: Ewa M. Laskowska, Jorn Vatn

Abstract:

Industry 4.0 focuses on digital optimization of industrial processes. The idea is to use extracted data in order to build a decision support model enabling use of those data for real time decision making. In terms of predictive maintenance, the desired decision support tool would be a model enabling prognostics of system's health based on the current condition of considered equipment. Within area of system prognostics and health management, a commonly used health indicator is Remaining Useful Lifetime (RUL) of a system. Because the RUL is a random variable, it has to be estimated based on available health indicators. Health indicators can be of different types and come from different sources. They can be process variables, equipment performance variables, data related to number of experienced failures, etc. The aim of this study is the analysis of performance variables of emergency shutdown valves (ESV) used in oil and gas industry. ESV is inspected periodically, and at each inspection torque and time of valve operation are registered. The data will be analyzed by means of machine learning or statistical analysis. The purpose is to investigate whether the available data could be used as a health indicator for a prognostic purpose. The second objective is to examine what is the most efficient way to incorporate the data into predictive model. The idea is to check whether the data can be applied in form of explanatory variables in Markov process or whether other stochastic processes would be a more convenient to build an RUL model based on the information coming from registered data.

Keywords: emergency shutdown valves, health indicator, prognostics, remaining useful lifetime, RUL

Procedia PDF Downloads 74
23688 Block Mining: Block Chain Enabled Process Mining Database

Authors: James Newman

Abstract:

Process mining is an emerging technology that looks to serialize enterprise data in time series data. It has been used by many companies and has been the subject of a variety of research papers. However, the majority of current efforts have looked at how to best create process mining from standard relational databases. This paper is the first pass at outlining a database custom-built for the minimal viable product of process mining. We present Block Miner, a blockchain protocol to store process mining data across a distributed network. We demonstrate the feasibility of storing process mining data on the blockchain. We present a proof of concept and show how the intersection of these two technologies helps to solve a variety of issues, including but not limited to ransomware attacks, tax documentation, and conflict resolution.

Keywords: blockchain, process mining, memory optimization, protocol

Procedia PDF Downloads 78
23687 Vulnerability of Groundwater to Pollution in Akwa Ibom State, Southern Nigeria, using the DRASTIC Model and Geographic Information System (GIS)

Authors: Aniedi A. Udo, Magnus U. Igboekwe, Rasaaq Bello, Francis D. Eyenaka, Michael C. Ohakwere-Eze

Abstract:

Groundwater vulnerability to pollution was assessed in Akwa Ibom State, Southern Nigeria, with the aim of locating areas with high potentials for resource contamination, especially due to anthropogenic influence. The electrical resistivity method was utilized in the collection of the initial field data. Additional data input, which included depth to static water level, drilled well log data, aquifer recharge data, percentage slope, as well as soil information, were sourced from secondary sources. The initial field data were interpreted both manually and with computer modeling to provide information on the geoelectric properties of the subsurface. Interpreted results together with the secondary data were used to develop the DRASTIC thematic maps. A vulnerability assessment was performed using the DRASTIC model in a GIS environment and areas with high vulnerability which needed immediate attention was clearly mapped out and presented using an aquifer vulnerability map. The model was subjected to validation and the rate of validity was 73% within the area of study.

Keywords: groundwater, vulnerability, DRASTIC model, pollution

Procedia PDF Downloads 192
23686 A Review Paper on Data Security in Precision Agriculture Using Internet of Things

Authors: Tonderai Muchenje, Xolani Mkhwanazi

Abstract:

Precision agriculture uses a number of technologies, devices, protocols, and computing paradigms to optimize agricultural processes. Big data, artificial intelligence, cloud computing, and edge computing are all used to handle the huge amounts of data generated by precision agriculture. However, precision agriculture is still emerging and has a low level of security features. Furthermore, future solutions will demand data availability and accuracy as key points to help farmers, and security is important to build robust and efficient systems. Since precision agriculture comprises a wide variety and quantity of resources, security addresses issues such as compatibility, constrained resources, and massive data. Moreover, conventional protection schemes used in the traditional internet may not be useful for agricultural systems, creating extra demands and opportunities. Therefore, this paper aims at reviewing state of the art of precision agriculture security, particularly in open field agriculture, discussing its architecture, describing security issues, and presenting the major challenges and future directions.

Keywords: precision agriculture, security, IoT, EIDE

Procedia PDF Downloads 75
23685 Did Nature of Job Matters - Impact of Perceived Job Autonomy on Turnover Intention in Sales and Marketing Managers: Moderating Effect of Procedural and Distributive Justice

Authors: Muhammad Babar Shahzad

Abstract:

The purpose of our study is to investigate the relationship between perceived job autonomy and turnover intention in sales & marketing staff. Perceived job autonomy is considered one of most studied dimension of Job Characteristic Model. But still there is a confusion in scholars about predictive role of perceived job autonomy in turnover intention. In line of more complex research on this relation, we investigated the relationship between perceived job autonomy and turnover intention. Did nature of job have any impact on this relationship. On the call of different authors we take interactive effect of perceived job autonomy and procedural justice on turnover intention. Predictive role of distributive justice to employee outcomes is not deniable. But predictive role of distributive justice will be prone in different contextual influences. Interactive role of distributive justice and perceived job autonomy is also not tested before. We collected date from 279 marketing and sales managers working in financial institution, FMCG industries, Pharamesutical Industry & Bank. Strong and direct negative relation was found in perceived job autonomy, distributive justice & procedural justice on turnover intention. Distributive and procedural justice is also amplifying the negative relationship of perceived job autonomy and turnover intention. Limitation and future direction for research is also discussed.

Keywords: perceived job autonomy, turnover intention, procedural justice, distributive job

Procedia PDF Downloads 495
23684 Commercial Automobile Insurance: A Practical Approach of the Generalized Additive Model

Authors: Nicolas Plamondon, Stuart Atkinson, Shuzi Zhou

Abstract:

The insurance industry is usually not the first topic one has in mind when thinking about applications of data science. However, the use of data science in the finance and insurance industry is growing quickly for several reasons, including an abundance of reliable customer data, ferocious competition requiring more accurate pricing, etc. Among the top use cases of data science, we find pricing optimization, customer segmentation, customer risk assessment, fraud detection, marketing, and triage analytics. The objective of this paper is to present an application of the generalized additive model (GAM) on a commercial automobile insurance product: an individually rated commercial automobile. These are vehicles used for commercial purposes, but for which there is not enough volume to apply pricing to several vehicles at the same time. The GAM model was selected as an improvement over GLM for its ease of use and its wide range of applications. The model was trained using the largest split of the data to determine model parameters. The remaining part of the data was used as testing data to verify the quality of the modeling activity. We used the Gini coefficient to evaluate the performance of the model. For long-term monitoring, commonly used metrics such as RMSE and MAE will be used. Another topic of interest in the insurance industry is to process of producing the model. We will discuss at a high level the interactions between the different teams with an insurance company that needs to work together to produce a model and then monitor the performance of the model over time. Moreover, we will discuss the regulations in place in the insurance industry. Finally, we will discuss the maintenance of the model and the fact that new data does not come constantly and that some metrics can take a long time to become meaningful.

Keywords: insurance, data science, modeling, monitoring, regulation, processes

Procedia PDF Downloads 67
23683 Modeling Pan Evaporation Using Intelligent Methods of ANN, LSSVM and Tree Model M5 (Case Study: Shahroud and Mayamey Stations)

Authors: Hamidreza Ghazvinian, Khosro Ghazvinian, Touba Khodaiean

Abstract:

The importance of evaporation estimation in water resources and agricultural studies is undeniable. Pan evaporation are used as an indicator to determine the evaporation of lakes and reservoirs around the world due to the ease of interpreting its data. In this research, intelligent models were investigated in estimating pan evaporation on a daily basis. Shahroud and Mayamey were considered as the studied cities. These two cities are located in Semnan province in Iran. The mentioned cities have dry weather conditions that are susceptible to high evaporation potential. Meteorological data of 11 years of synoptic stations of Shahrood and Mayamey cities were used. The intelligent models used in this study are Artificial Neural Network (ANN), Least Squares Support Vector Machine (LSSVM), and M5 tree models. Meteorological parameters of minimum and maximum air temperature (Tmax, Tmin), wind speed (WS), sunshine hours (SH), air pressure (PA), relative humidity (RH) as selected input data and evaporation data from pan (EP) to The output data was considered. 70% of data is used at the education level, and 30 % of the data is used at the test level. Models used with explanation coefficient evaluation (R2) Root of Mean Squares Error (RMSE) and Mean Absolute Error (MAE). The results for the two Shahroud and Mayamey stations showed that the above three models' operations are rather appropriate.

Keywords: pan evaporation, intelligent methods, shahroud, mayamey

Procedia PDF Downloads 61
23682 Generating Insights from Data Using a Hybrid Approach

Authors: Allmin Susaiyah, Aki Härmä, Milan Petković

Abstract:

Automatic generation of insights from data using insight mining systems (IMS) is useful in many applications, such as personal health tracking, patient monitoring, and business process management. Existing IMS face challenges in controlling insight extraction, scaling to large databases, and generalising to unseen domains. In this work, we propose a hybrid approach consisting of rule-based and neural components for generating insights from data while overcoming the aforementioned challenges. Firstly, a rule-based data 2CNL component is used to extract statistically significant insights from data and represent them in a controlled natural language (CNL). Secondly, a BERTSum-based CNL2NL component is used to convert these CNLs into natural language texts. We improve the model using task-specific and domain-specific fine-tuning. Our approach has been evaluated using statistical techniques and standard evaluation metrics. We overcame the aforementioned challenges and observed significant improvement with domain-specific fine-tuning.

Keywords: data mining, insight mining, natural language generation, pre-trained language models

Procedia PDF Downloads 93
23681 Review of K0-Factors and Related Nuclear Data of the Selected Radionuclides for Use in K0-NAA

Authors: Manh-Dung Ho, Van-Giap Pham, Van-Doanh Ho, Quang-Thien Tran, Tuan-Anh Tran

Abstract:

The k0-factors and related nuclear data, i.e. the Q0-factors and effective resonance energies (Ēr) of the selected radionuclides which are used in the k0-based neutron activation analysis (k0-NAA), were critically reviewed to be integrated in the “k0-DALAT” software. The k0- and Q0-factors of some short-lived radionuclides: 46mSc, 110Ag, 116m2In, 165mDy, and 183mW, were experimentally determined at the Dalat research reactor. The other radionuclides selected are: 20F, 36S, 49Ca, 60mCo, 60Co, 75Se, 77mSe, 86mRb, 115Cd, 115mIn, 131Ba, 134mCs, 134Cs, 153Gd, 153Sm, 159Gd, 170Tm, 177mYb, 192Ir, 197mHg, 239U and 239Np. The reviewed data as compared with the literature data were biased within 5.6-7.3% in which the experimental re-determined factors were within 6.1 and 7.3%. The NIST standard reference materials: Oyster Tissue (1566b), Montana II Soil (2711a) and Coal Fly Ash (1633b) were used to validate the new reviewed data showing that the new data gave an improved k0-NAA using the “k0-DALAT” software with a factor of 4.5-6.8% for the investigated radionuclides.

Keywords: neutron activation analysis, k0-based method, k0 factor, Q0 factor, effective resonance energy

Procedia PDF Downloads 110
23680 Optimizing Electric Vehicle Charging with Charging Data Analytics

Authors: Tayyibah Khanam, Mohammad Saad Alam, Sanchari Deb, Yasser Rafat

Abstract:

Electric vehicles are considered as viable replacements to gasoline cars since they help in reducing harmful emissions and stimulate power generation through renewable energy sources, hence contributing to sustainability. However, one of the significant obstacles in the mass deployment of electric vehicles is the charging time anxiety among users and, thus, the subsequent large waiting times for available chargers at charging stations. Data analytics, on the other hand, has revolutionized the decision-making tasks of management and operating systems since its arrival. In this paper, we attempt to optimize the choice of EV charging stations for users in their vicinity by minimizing the time taken to reach the charging stations and the waiting times for available chargers. Time taken to travel to the charging station is calculated by the Google Maps API and the waiting times are predicted by polynomial regression of the historical data stored. The proposed framework utilizes real-time data and historical data from all operating charging stations in the city and assists the user in finding the best suitable charging station for their current situation and can be implemented in a mobile phone application. The algorithm successfully predicts the most optimal choice of a charging station and the minimum required time for various sample data sets.

Keywords: charging data, electric vehicles, machine learning, waiting times

Procedia PDF Downloads 172
23679 Finding Data Envelopment Analysis Targets Using Multi-Objective Programming in DEA-R with Stochastic Data

Authors: R. Shamsi, F. Sharifi

Abstract:

In this paper, we obtain the projection of inefficient units in data envelopment analysis (DEA) in the case of stochastic inputs and outputs using the multi-objective programming (MOP) structure. In some problems, the inputs might be stochastic while the outputs are deterministic, and vice versa. In such cases, we propose a multi-objective DEA-R model because in some cases (e.g., when unnecessary and irrational weights by the BCC model reduce the efficiency score), an efficient decision-making unit (DMU) is introduced as inefficient by the BCC model, whereas the DMU is considered efficient by the DEA-R model. In some other cases, only the ratio of stochastic data may be available (e.g., the ratio of stochastic inputs to stochastic outputs). Thus, we provide a multi-objective DEA model without explicit outputs and prove that the input-oriented MOP DEA-R model in the invariable return to scale case can be replaced by the MOP-DEA model without explicit outputs in the variable return to scale and vice versa. Using the interactive methods for solving the proposed model yields a projection corresponding to the viewpoint of the DM and the analyst, which is nearer to reality and more practical. Finally, an application is provided.

Keywords: DEA-R, multi-objective programming, stochastic data, data envelopment analysis

Procedia PDF Downloads 96
23678 Isotope Effects on Inhibitors Binding to HIV Reverse Transcriptase

Authors: Agnieszka Krzemińska, Katarzyna Świderek, Vicente Molinier, Piotr Paneth

Abstract:

In order to understand in details the interactions between ligands and the enzyme isotope effects were studied between clinically used drugs that bind in the active site of Human Immunodeficiency Virus Reverse Transcriptase, HIV-1 RT, as well as triazole-based inhibitor that binds in the allosteric pocket of this enzyme. The magnitudes and origins of the resulting binding isotope effects were analyzed. Subsequently, binding isotope effect of the same triazole-based inhibitor bound in the active site were analyzed and compared. Together, these results show differences in binding origins in two sites of the enzyme and allow to analyze binding mode and place of newly synthesized inhibitors. Typical protocol is described below on the example of triazole ligand in the allosteric pocket. Triazole was docked into allosteric cavity of HIV-1 RT with Glide using extra-precision mode as implemented in Schroedinger software. The structure of HIV-1 RT was obtained from Protein Data Bank as structure of PDB ID 2RKI. The pKa for titratable amino acids was calculated using PROPKA software, and in order to neutralize the system 15 Cl- were added using tLEaP package implemented in AMBERTools ver.1.5. Also N-terminals and C-terminals were build using tLEaP. The system was placed in 144x160x144Å3 orthorhombic box of water molecules using NAMD program. Missing parameters for triazole were obtained at the AM1 level using Antechamber software implemented in AMBERTools. The energy minimizations were carried out by means of a conjugate gradient algorithm using NAMD. Then system was heated from 0 to 300 K with temperature increment 0.001 K. Subsequently 2 ns Langevin−Verlet (NVT) MM MD simulation with AMBER force field implemented in NAMD was carried out. Periodic Boundary Conditions and cut-offs for the nonbonding interactions, range radius from 14.5 to 16 Å, are used. After 2 ns relaxation 200 ps of QM/MM MD at 300 K were simulated. The triazole was treated quantum mechanically at the AM1 level, protein was described using AMBER and water molecules were described using TIP3P, as implemented in fDynamo library. Molecules 20 Å apart from the triazole were kept frozen, with cut-offs established on range radius from 14.5 to 16 Å. In order to describe interactions between triazole and RT free energy of binding using Free Energy Perturbation method was done. The change in frequencies from ligand in solution to ligand bounded in enzyme was used to calculate binding isotope effects.

Keywords: binding isotope effects, molecular dynamics, HIV, reverse transcriptase

Procedia PDF Downloads 417
23677 Integrated Model for Enhancing Data Security Processing Time in Cloud Computing

Authors: Amani A. Saad, Ahmed A. El-Farag, El-Sayed A. Helali

Abstract:

Cloud computing is an important and promising field in the recent decade. Cloud computing allows sharing resources, services and information among the people of the whole world. Although the advantages of using clouds are great, but there are many risks in a cloud. The data security is the most important and critical problem of cloud computing. In this research a new security model for cloud computing is proposed for ensuring secure communication system, hiding information from other users and saving the user's times. In this proposed model Blowfish encryption algorithm is used for exchanging information or data, and SHA-2 cryptographic hash algorithm is used for data integrity. For user authentication process a simple user-name and password is used, the password uses SHA-2 for one way encryption. The proposed system shows an improvement of the processing time of uploading and downloading files on the cloud in secure form.

Keywords: cloud computing, data security, SAAS, PAAS, IAAS, Blowfish

Procedia PDF Downloads 339
23676 Comparison of Statistical Methods for Estimating Missing Precipitation Data in the River Subbasin Lenguazaque, Colombia

Authors: Miguel Cañon, Darwin Mena, Ivan Cabeza

Abstract:

In this work was compared and evaluated the applicability of statistical methods for the estimation of missing precipitations data in the basin of the river Lenguazaque located in the departments of Cundinamarca and Boyacá, Colombia. The methods used were the method of simple linear regression, distance rate, local averages, mean rates, correlation with nearly stations and multiple regression method. The analysis used to determine the effectiveness of the methods is performed by using three statistical tools, the correlation coefficient (r2), standard error of estimation and the test of agreement of Bland and Altmant. The analysis was performed using real rainfall values removed randomly in each of the seasons and then estimated using the methodologies mentioned to complete the missing data values. So it was determined that the methods with the highest performance and accuracy in the estimation of data according to conditions that were counted are the method of multiple regressions with three nearby stations and a random application scheme supported in the precipitation behavior of related data sets.

Keywords: statistical comparison, precipitation data, river subbasin, Bland and Altmant

Procedia PDF Downloads 454
23675 Effects of Soil Erosion on Vegetation Development

Authors: Josephine Wanja Nyatia

Abstract:

The relationship between vegetation and soil erosion deserves attention due to its scientific importance and practical applications. A great deal of information is available about the mechanisms and benefits of vegetation in the control of soil erosion, but the effects of soil erosion on vegetation development and succession is poorly documented. Research shows that soil erosion is the most important driving force for the degradation of upland and mountain ecosystems. Soil erosion interferes with the process of plant community development and vegetation succession, commencing with seed formation and impacting throughout the whole growth phase and affecting seed availability, dispersal, germination and establishment, plant community structure and spatial distribution. There have been almost no studies on the effects of soil erosion on seed development and availability, of surface flows on seed movement and redistribution, and their influences on soil seed bank and on vegetation establishment and distribution. However, these effects may be the main cause of low vegetation cover in regions of high soil erosion activity, and these issues need to be investigated. Moreover, soil erosion is not only a negative influence on vegetation succession and restoration but also a driving force of plant adaptation and evolution. Consequently, we need to study the effects of soil erosion on ecological processes and on development and regulation of vegetation succession from the points of view of pedology and vegetation, plant and seed ecology, and to establish an integrated theory and technology for deriving practical solutions to soil erosion problems

Keywords: soil erosion, vegetation, development, seed availability

Procedia PDF Downloads 58
23674 Hyperspectral Data Classification Algorithm Based on the Deep Belief and Self-Organizing Neural Network

Authors: Li Qingjian, Li Ke, He Chun, Huang Yong

Abstract:

In this paper, the method of combining the Pohl Seidman's deep belief network with the self-organizing neural network is proposed to classify the target. This method is mainly aimed at the high nonlinearity of the hyperspectral image, the high sample dimension and the difficulty in designing the classifier. The main feature of original data is extracted by deep belief network. In the process of extracting features, adding known labels samples to fine tune the network, enriching the main characteristics. Then, the extracted feature vectors are classified into the self-organizing neural network. This method can effectively reduce the dimensions of data in the spectrum dimension in the preservation of large amounts of raw data information, to solve the traditional clustering and the long training time when labeled samples less deep learning algorithm for training problems, improve the classification accuracy and robustness. Through the data simulation, the results show that the proposed network structure can get a higher classification precision in the case of a small number of known label samples.

Keywords: DBN, SOM, pattern classification, hyperspectral, data compression

Procedia PDF Downloads 325
23673 Assessing Performance of Data Augmentation Techniques for a Convolutional Network Trained for Recognizing Humans in Drone Images

Authors: Masood Varshosaz, Kamyar Hasanpour

Abstract:

In recent years, we have seen growing interest in recognizing humans in drone images for post-disaster search and rescue operations. Deep learning algorithms have shown great promise in this area, but they often require large amounts of labeled data to train the models. To keep the data acquisition cost low, augmentation techniques can be used to create additional data from existing images. There are many techniques of such that can help generate variations of an original image to improve the performance of deep learning algorithms. While data augmentation is potentially assumed to improve the accuracy and robustness of the models, it is important to ensure that the performance gains are not outweighed by the additional computational cost or complexity of implementing the techniques. To this end, it is important to evaluate the impact of data augmentation on the performance of the deep learning models. In this paper, we evaluated the most currently available 2D data augmentation techniques on a standard convolutional network which was trained for recognizing humans in drone images. The techniques include rotation, scaling, random cropping, flipping, shifting, and their combination. The results showed that the augmented models perform 1-3% better compared to a base network. However, as the augmented images only contain the human parts already visible in the original images, a new data augmentation approach is needed to include the invisible parts of the human body. Thus, we suggest a new method that employs simulated 3D human models to generate new data for training the network.

Keywords: human recognition, deep learning, drones, disaster mitigation

Procedia PDF Downloads 76
23672 Emotional Artificial Intelligence and the Right to Privacy

Authors: Emine Akar

Abstract:

The majority of privacy-related regulation has traditionally focused on concepts that are perceived to be well-understood or easily describable, such as certain categories of data and personal information or images. In the past century, such regulation appeared reasonably suitable for its purposes. However, technologies such as AI, combined with ever-increasing capabilities to collect, process, and store “big data”, not only require calibration of these traditional understandings but may require re-thinking of entire categories of privacy law. In the presentation, it will be explained, against the background of various emerging technologies under the umbrella term “emotional artificial intelligence”, why modern privacy law will need to embrace human emotions as potentially private subject matter. This argument can be made on a jurisprudential level, given that human emotions can plausibly be accommodated within the various concepts that are traditionally regarded as the underlying foundation of privacy protection, such as, for example, dignity, autonomy, and liberal values. However, the practical reasons for regarding human emotions as potentially private subject matter are perhaps more important (and very likely more convincing from the perspective of regulators). In that respect, it should be regarded as alarming that, according to most projections, the usefulness of emotional data to governments and, particularly, private companies will not only lead to radically increased processing and analysing of such data but, concerningly, to an exponential growth in the collection of such data. In light of this, it is also necessity to discuss options for how regulators could address this emerging threat.

Keywords: AI, privacy law, data protection, big data

Procedia PDF Downloads 75
23671 Develop a Conceptual Data Model of Geotechnical Risk Assessment in Underground Coal Mining Using a Cloud-Based Machine Learning Platform

Authors: Reza Mohammadzadeh

Abstract:

The major challenges in geotechnical engineering in underground spaces arise from uncertainties and different probabilities. The collection, collation, and collaboration of existing data to incorporate them in analysis and design for given prospect evaluation would be a reliable, practical problem solving method under uncertainty. Machine learning (ML) is a subfield of artificial intelligence in statistical science which applies different techniques (e.g., Regression, neural networks, support vector machines, decision trees, random forests, genetic programming, etc.) on data to automatically learn and improve from them without being explicitly programmed and make decisions and predictions. In this paper, a conceptual database schema of geotechnical risks in underground coal mining based on a cloud system architecture has been designed. A new approach of risk assessment using a three-dimensional risk matrix supported by the level of knowledge (LoK) has been proposed in this model. Subsequently, the model workflow methodology stages have been described. In order to train data and LoK models deployment, an ML platform has been implemented. IBM Watson Studio, as a leading data science tool and data-driven cloud integration ML platform, is employed in this study. As a Use case, a data set of geotechnical hazards and risk assessment in underground coal mining were prepared to demonstrate the performance of the model, and accordingly, the results have been outlined.

Keywords: data model, geotechnical risks, machine learning, underground coal mining

Procedia PDF Downloads 254
23670 Classification of Poverty Level Data in Indonesia Using the Naïve Bayes Method

Authors: Anung Style Bukhori, Ani Dijah Rahajoe

Abstract:

Poverty poses a significant challenge in Indonesia, requiring an effective analytical approach to understand and address this issue. In this research, we applied the Naïve Bayes classification method to examine and classify poverty data in Indonesia. The main focus is on classifying data using RapidMiner, a powerful data analysis platform. The analysis process involves data splitting to train and test the classification model. First, we collected and prepared a poverty dataset that includes various factors such as education, employment, and health..The experimental results indicate that the Naïve Bayes classification model can provide accurate predictions regarding the risk of poverty. The use of RapidMiner in the analysis process offers flexibility and efficiency in evaluating the model's performance. The classification produces several values to serve as the standard for classifying poverty data in Indonesia using Naive Bayes. The accuracy result obtained is 40.26%, with a moderate recall result of 35.94%, a high recall result of 63.16%, and a low recall result of 38.03%. The precision for the moderate class is 58.97%, for the high class is 17.39%, and for the low class is 58.70%. These results can be seen from the graph below.

Keywords: poverty, classification, naïve bayes, Indonesia

Procedia PDF Downloads 38
23669 The Sense of Recognition of Muslim Women in Western Academia

Authors: Naima Mohammadi

Abstract:

The present paper critically reports on the emergency of Iranian international students in a large public university in Italy. Although the most sizeable diaspora of Iranians dates back to the 1979 revolution, a huge wave of Iranian female students travelled abroad after the Iranian Green Movement (2009) due to the intensification of gender discrimination and Islamization. To explore the experience of Iranian female students at an Italian public university, two complementary methods were adopted: a focus group and individual interviews. Focus groups yield detailed collective conversations and provide researchers with an opportunity to observe the interaction between participants, rather than between participant and researcher, which generates data. Semi-structured interviews allow participants to share their stories in their own words and speak about personal experiences and opinions. Research participants were invited to participate through a public call in a Telegram group of Iranian students. Theoretical and purposive sampling was applied to select participants. All participants were assured that full anonymity would be ensured and they consented to take part in the research. A two-hour focus group was held in English with participants in the presence and some online. They were asked to share their motivations for studying in Italy and talk about their experiences both within and outside the university context. Each of these interviews lasted from 45 to 60 minutes and was mostly carried out online and in Farsi. The focus group consisted of 8 Iranian female post-graduate students. In analyzing the data a blended approach was adopted, with a combination of deductive and inductive coding. According to research findings, although 9/11 was the beginning of the West’s challenges against Muslims, the nuclear threats of Islamic regimes promoted the toughest international sanctions against Iranians as a nation across the world. Accordingly, carrying an Iranian identity contributes to social, political, and economic exclusion. Research findings show that geopolitical factors such as international sanctions and Islamophobia, and a lack of reciprocity in terms of recognition, have created a sense of stigmatization for veiled and unveiled Iranian female students who are the largest groups of ‘non-European Muslim international students’ enrolled in Italian universities. Participants addressed how their nationality has devalued their public image and negatively impacted their self-confidence and self-realization in academia. They highlighted the experience of an unwelcoming atmosphere by different groups of people and institutes, such as receiving marked students’ badges, rejected bank account requests, failed visa processes, secondary security screening selection, and hyper-visibility of veiled students. This study corroborates the need for institutions to pay attention to geopolitical factors and religious diversity in student recruitment and provide support mechanisms and access to basic rights. Accordingly, it is suggested that Higher Education Institutions (HEIs) have a social and moral responsibility towards the discrimination and both social and academic exclusion of Iranian students.

Keywords: Iranian diaspora, female students, recognition theory, inclusive university

Procedia PDF Downloads 51
23668 Web Search Engine Based Naming Procedure for Independent Topic

Authors: Takahiro Nishigaki, Takashi Onoda

Abstract:

In recent years, the number of document data has been increasing since the spread of the Internet. Many methods have been studied for extracting topics from large document data. We proposed Independent Topic Analysis (ITA) to extract topics independent of each other from large document data such as newspaper data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis. The topic represented by ITA is represented by a set of words. However, the set of words is quite different from the topics the user imagines. For example, the top five words with high independence of a topic are as follows. Topic1 = {"scor", "game", "lead", "quarter", "rebound"}. This Topic 1 is considered to represent the topic of "SPORTS". This topic name "SPORTS" has to be attached by the user. ITA cannot name topics. Therefore, in this research, we propose a method to obtain topics easy for people to understand by using the web search engine, topics given by the set of words given by independent topic analysis. In particular, we search a set of topical words, and the title of the homepage of the search result is taken as the topic name. And we also use the proposed method for some data and verify its effectiveness.

Keywords: independent topic analysis, topic extraction, topic naming, web search engine

Procedia PDF Downloads 106
23667 Extracting Terrain Points from Airborne Laser Scanning Data in Densely Forested Areas

Authors: Ziad Abdeldayem, Jakub Markiewicz, Kunal Kansara, Laura Edwards

Abstract:

Airborne Laser Scanning (ALS) is one of the main technologies for generating high-resolution digital terrain models (DTMs). DTMs are crucial to several applications, such as topographic mapping, flood zone delineation, geographic information systems (GIS), hydrological modelling, spatial analysis, etc. Laser scanning system generates irregularly spaced three-dimensional cloud of points. Raw ALS data are mainly ground points (that represent the bare earth) and non-ground points (that represent buildings, trees, cars, etc.). Removing all the non-ground points from the raw data is referred to as filtering. Filtering heavily forested areas is considered a difficult and challenging task as the canopy stops laser pulses from reaching the terrain surface. This research presents an approach for removing non-ground points from raw ALS data in densely forested areas. Smoothing splines are exploited to interpolate and fit the noisy ALS data. The presented filter utilizes a weight function to allocate weights for each point of the data. Furthermore, unlike most of the methods, the presented filtering algorithm is designed to be automatic. Three different forested areas in the United Kingdom are used to assess the performance of the algorithm. The results show that the generated DTMs from the filtered data are accurate (when compared against reference terrain data) and the performance of the method is stable for all the heavily forested data samples. The average root mean square error (RMSE) value is 0.35 m.

Keywords: airborne laser scanning, digital terrain models, filtering, forested areas

Procedia PDF Downloads 127
23666 Estimating the Life-Distribution Parameters of Weibull-Life PV Systems Utilizing Non-Parametric Analysis

Authors: Saleem Z. Ramadan

Abstract:

In this paper, a model is proposed to determine the life distribution parameters of the useful life region for the PV system utilizing a combination of non-parametric and linear regression analysis for the failure data of these systems. Results showed that this method is dependable for analyzing failure time data for such reliable systems when the data is scarce.

Keywords: masking, bathtub model, reliability, non-parametric analysis, useful life

Procedia PDF Downloads 546
23665 Preliminary Design of Maritime Energy Management System: Naval Architectural Approach to Resolve Recent Limitations

Authors: Seyong Jeong, Jinmo Park, Jinhyoun Park, Boram Kim, Kyoungsoo Ahn

Abstract:

Energy management in the maritime industry is being required by economics and in conformity with new legislative actions taken by the International Maritime Organization (IMO) and the European Union (EU). In response, the various performance monitoring methodologies and data collection practices have been examined by different stakeholders. While many assorted advancements in operation and technology are applicable, their adoption in the shipping industry stays small. This slow uptake can be considered due to many different barriers such as data analysis problems, misreported data, and feedback problems, etc. This study presents a conceptual design of an energy management system (EMS) and proposes the methodology to resolve the limitations (e.g., data normalization using naval architectural evaluation, management of misrepresented data, and feedback from shore to ship through management of performance analysis history). We expect this system to make even short-term charterers assess the ship performance properly and implement sustainable fleet control.

Keywords: data normalization, energy management system, naval architectural evaluation, ship performance analysis

Procedia PDF Downloads 434
23664 Geospatial Data Complexity in Electronic Airport Layout Plan

Authors: Shyam Parhi

Abstract:

Airports GIS program collects Airports data, validate and verify it, and stores it in specific database. Airports GIS allows authorized users to submit changes to airport data. The verified data is used to develop several engineering applications. One of these applications is electronic Airport Layout Plan (eALP) whose primary aim is to move from paper to digital form of ALP. The first phase of development of eALP was completed recently and it was tested for a few pilot program airports across different regions. We conducted gap analysis and noticed that a lot of development work is needed to fine tune at least six mandatory sheets of eALP. It is important to note that significant amount of programming is needed to move from out-of-box ArcGIS to a much customized ArcGIS which will be discussed. The ArcGIS viewer capability to display essential features like runway or taxiway or the perpendicular distance between them will be discussed. An enterprise level workflow which incorporates coordination process among different lines of business will be highlighted.

Keywords: geospatial data, geology, geographic information systems, aviation

Procedia PDF Downloads 400
23663 Anisotropic Total Fractional Order Variation Model in Seismic Data Denoising

Authors: Jianwei Ma, Diriba Gemechu

Abstract:

In seismic data processing, attenuation of random noise is the basic step to improve quality of data for further application of seismic data in exploration and development in different gas and oil industries. The signal-to-noise ratio of the data also highly determines quality of seismic data. This factor affects the reliability as well as the accuracy of seismic signal during interpretation for different purposes in different companies. To use seismic data for further application and interpretation, we need to improve the signal-to-noise ration while attenuating random noise effectively. To improve the signal-to-noise ration and attenuating seismic random noise by preserving important features and information about seismic signals, we introduce the concept of anisotropic total fractional order denoising algorithm. The anisotropic total fractional order variation model defined in fractional order bounded variation is proposed as a regularization in seismic denoising. The split Bregman algorithm is employed to solve the minimization problem of the anisotropic total fractional order variation model and the corresponding denoising algorithm for the proposed method is derived. We test the effectiveness of theproposed method for synthetic and real seismic data sets and the denoised result is compared with F-X deconvolution and non-local means denoising algorithm.

Keywords: anisotropic total fractional order variation, fractional order bounded variation, seismic random noise attenuation, split Bregman algorithm

Procedia PDF Downloads 197
23662 NSBS: Design of a Network Storage Backup System

Authors: Xinyan Zhang, Zhipeng Tan, Shan Fan

Abstract:

The first layer of defense against data loss is the backup data. This paper implements an agent-based network backup system used the backup, server-storage and server-backup agent these tripartite construction, and we realize the snapshot and hierarchical index in the NSBS. It realizes the control command and data flow separation, balances the system load, thereby improving the efficiency of the system backup and recovery. The test results show the agent-based network backup system can effectively improve the task-based concurrency, reasonably allocate network bandwidth, the system backup performance loss costs smaller and improves data recovery efficiency by 20%.

Keywords: agent, network backup system, three architecture model, NSBS

Procedia PDF Downloads 441
23661 A t-SNE and UMAP Based Neural Network Image Classification Algorithm

Authors: Shelby Simpson, William Stanley, Namir Naba, Xiaodi Wang

Abstract:

Both t-SNE and UMAP are brand new state of art tools to predominantly preserve the local structure that is to group neighboring data points together, which indeed provides a very informative visualization of heterogeneity in our data. In this research, we develop a t-SNE and UMAP base neural network image classification algorithm to embed the original dataset to a corresponding low dimensional dataset as a preprocessing step, then use this embedded database as input to our specially designed neural network classifier for image classification. We use the fashion MNIST data set, which is a labeled data set of images of clothing objects in our experiments. t-SNE and UMAP are used for dimensionality reduction of the data set and thus produce low dimensional embeddings. Furthermore, we use the embeddings from t-SNE and UMAP to feed into two neural networks. The accuracy of the models from the two neural networks is then compared to a dense neural network that does not use embedding as an input to show which model can classify the images of clothing objects more accurately.

Keywords: t-SNE, UMAP, fashion MNIST, neural networks

Procedia PDF Downloads 179
23660 Study of Village Scale Community Based Water Supply and Sanitation Program (Pamsimas) in Indonesia

Authors: Reza Eka Putra

Abstract:

Pamsimas is a community based drinking water supply and sanitation program which is contributed by local community, local government, central government, and World Bank with the aim of achieving Water Supply and Sanitation - the Millennium Development Goals (WSS-MDGs) target. This program is supported by the Ministry of Public Works as the executing agency with the cooperation of Ministry of Interior and the Ministry of Health. Field observations were conducted in two rural samples of 2009 beneficiaries Pamsimas West Java, which is in Ponggang Village, Subang District. The study was evaluated through several parameters, including technical, health, and empowerment aspect. Evaluation was done by comparing the parameters of success that has been set by Pamsimas through Pamsimas book manuals with the parameters from Sanitation & Infrastructure course regarding the appropriate application of technology in society. The result of the study is that the potency of the community before the program is implemented in the village is the determining factor. Stronger cooperation pattern in Ponggang Vilage results in a successful program. Both villages showed a pattern of behavior changes from indiscriminate defecation to sanitary latrine use. Besides, there is a decline in the number of cases of diarrheal disease since the year of Pamsimas implementation.

Keywords: millenium development goals, community develpoment, water supply, sanitation

Procedia PDF Downloads 236