Search results for: data analyze
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 26516

26456 Modeling and Statistical Analysis of a Soap Production Mix in Bejoy Manufacturing Industry, Anambra State, Nigeria

Authors: Okolie Chukwulozie Paul, Iwenofu Chinwe Onyedika, Sinebe Jude Ebieladoh, M. C. Nwosu

Abstract:

This research is based on the statistical analysis of processing data. The aim is to analyze the data statistically and to generate a design model for the production mix of soap products at the Bejoy manufacturing company, Nkpologwu, Aguata Local Government Area, Anambra State, Nigeria. The analysis describes the data and the correlations among its variables. A t-test, partial correlation, and bivariate correlation were used to understand what the data portray. The design model was used to model the production yield, and the correlation of the variables gives an R² of 98.7%. The results confirm that the data are fit for further analysis and modeling, as shown by the correlation and the R-squared value.
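
As a rough illustration of the two statistics this abstract reports, the sketch below computes a Pearson correlation coefficient and, for a fit with a single predictor, R² as its square. The function names are hypothetical and nothing here is taken from the paper, whose General Linear Model may involve several predictors.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def r_squared_simple(xs, ys):
    """R² of a least-squares line with one predictor, which equals r squared."""
    return pearson_r(xs, ys) ** 2
```

The identity R² = r² holds only for a straight-line fit with one predictor; with several predictors R² has to be computed from the residuals of the fitted model.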

Keywords: General Linear Model, correlation, variables, pearson, significance, T-test, soap, production mix and statistic

Procedia PDF Downloads 410
26455 Temporally Coherent 3D Animation Reconstruction from RGB-D Video Data

Authors: Salam Khalifa, Naveed Ahmed

Abstract:

We present a new method to reconstruct a temporally coherent 3D animation from single or multi-view RGB-D video data using unbiased feature point sampling. Given RGB-D video data in the form of a 3D point cloud sequence, our method first extracts feature points using both color and depth information. In the subsequent steps, these feature points are used to match two 3D point clouds in consecutive frames, independent of their resolution. Our new dynamic alignment method, based on motion vectors, then fully reconstructs a spatio-temporally coherent 3D animation. We perform extensive quantitative validation using novel error functions to analyze the results. We show that, despite the temporal and spatial noise associated with RGB-D data, it is possible to extract temporal coherence and faithfully reconstruct a temporally coherent 3D animation from RGB-D video data.

Keywords: 3D video, 3D animation, RGB-D video, temporally coherent 3D animation

Procedia PDF Downloads 340
26454 The Importance of Generating Electricity through Wind Farms in the Brazilian Electricity Matrix, from 2013 to 2020

Authors: Alex Sidarta Guglielmoni

Abstract:

Since the 1970s, sustainable development has become increasingly present on the international agenda. The general objective of this work is to analyze, discuss, and answer the following question: what is the importance of electricity generation by wind power plants in the Brazilian electricity matrix between 2013 and 2020? To answer this question, we analyzed the generation of renewable energy from wind farms and the consumption of electricity in Brazil from January 2013 to December 2020. The specific objectives of this research are to analyze the public data, to identify the total wind generation, to identify the total wind generation capacity, and to identify the percentage share of wind generation and wind generation capacity in the Brazilian electricity matrix. The research required a bibliographic search, the collection of secondary data, the tabulation of generation and capacity data, and a comparative analysis between wind power and the Brazilian electricity matrix. As a result, it was possible to observe how important Brazil is for global sustainable development and how much the country can grow in this area, given its capacity and potential for generating wind power, since this percentage has grown in the past few years.

Keywords: wind power, Brazilian market, electricity matrix, generation capacity

Procedia PDF Downloads 89
26453 Developing the P1-P7 Management and Analysis Software for Thai Child Evaluation (TCE) of Food and Nutrition Status

Authors: S. Damapong, C. Kingkeow, W. Kongnoo, P. Pattapokin, S. Pruenglamphu

Abstract:

In response to the double burden of malnutrition among Thai children, we conducted a project to promote holistic, age-appropriate nutrition for Thai children. The researchers developed the P1-P7 computer software for managing and analyzing the diverse types of data collected. The study objectives were: i) to use the software to manage and analyze the collected data, and ii) to evaluate the children’s nutritional status and their caretakers’ nutrition practices in order to create regulations for improving nutrition. Data were collected by means of questionnaires, called P1-P7. P1, P2 and P5 were for children and caretakers, and the others were for institutions. The children’s nutritional status (height-for-age, weight-for-age, and weight-for-height) was calculated using Thai child z-score references. Institution evaluations consisted of various standard regulations, including the use of our software. The results showed that the software was used in 44 out of 118 communities (37.3%), 57 out of 240 child development centers and nurseries (23.8%), and 105 out of 152 schools (69.1%). No major problems have been reported with the software, although user efficiency can be increased further through additional training. As a result, the P1-P7 software was used to manage and analyze nutritional status, nutrition behavior, and environmental conditions in order to conduct the Thai Child Evaluation (TCE). The software was most widely used in schools. Some aspects of P1-P7’s questionnaires could be modified to increase ease of use and efficiency.

Keywords: P1-P7 software, Thai child evaluation, nutritional status, malnutrition

Procedia PDF Downloads 332
26452 An Analysis of Sequential Pattern Mining on Databases Using Approximate Sequential Patterns

Authors: J. Suneetha, Vijayalaxmi

Abstract:

Sequential pattern mining applies data mining methods to large data repositories to extract usage patterns. Sequential pattern mining methodologies are used to analyze the data and identify patterns. The patterns have been used to implement efficient systems that make recommendations based on previously observed patterns, make predictions, improve the usability of systems, detect events, and in general help in making strategic product decisions. In this paper, we examine the performance of approximate sequential pattern mining, defined as identifying patterns approximately shared by many sequences. Approximate sequential patterns can effectively summarize and represent the databases by identifying the underlying trends in the data. We conduct an extensive and systematic performance study over synthetic and real data. The results demonstrate that ApproxMAP is effective and scalable in mining large sequence databases with long patterns.

Keywords: multiple data, performance analysis, sequential pattern, sequence database scalability

Procedia PDF Downloads 304
26451 Quantifying Mobility of Urban Inhabitant Based on Social Media Data

Authors: Yuyun, Fritz Akhmad Nuzir, Bart Julien Dewancker

Abstract:

Check-in locations on social media provide information about an individual’s location. The millions of data points generated from these sites provide knowledge about human activity. In this research, we used a geolocation service and users’ texts posted on Twitter to analyze human mobility. Our research answers two questions: what are the movement patterns of a citizen, and how far do people travel in the city? We explore the trajectories of 22,318 users across 201,118 check-ins over a period of one month in Makassar city, Indonesia. To capture individual mobility, we analyze only the users with more than 30 check-ins. We used a systematic sampling approach to select the research sample. The study found that individual movement shows a high degree of regularity and intensity in certain places. It also found that the average distance an urban inhabitant travels per day is as far as 9.6 km.
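
One way a daily travel distance could be derived from a sequence of check-in coordinates is sketched below. The abstract does not say how distance was measured, so the haversine great-circle formula, the mean Earth radius of 6371 km, and the function names are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def path_length_km(points):
    """Total distance along an ordered list of (lat, lon) check-ins."""
    return sum(
        haversine_km(p[0], p[1], q[0], q[1])
        for p, q in zip(points, points[1:])
    )
```

Summing the legs between consecutive check-ins gives a lower bound on the distance actually travelled, since the route between two check-ins is rarely a straight line.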

Keywords: mobility, check-in, distance, Twitter

Procedia PDF Downloads 143
26450 FCNN-MR: A Parallel Instance Selection Method Based on Fast Condensed Nearest Neighbor Rule

Authors: Lu Si, Jie Yu, Shasha Li, Jun Ma, Lei Luo, Qingbo Wu, Yongqi Ma, Zhengji Liu

Abstract:

Instance selection (IS) techniques are used to reduce data size in order to improve the performance of data mining methods. Recently, to process very large data sets, several proposed methods divide the training set into disjoint subsets and apply IS algorithms independently to each subset. In this paper, we analyze the limitations of these methods and give our viewpoint on how to divide and conquer in the IS procedure. Then, based on the fast condensed nearest neighbor (FCNN) rule, we propose an instance selection method for large data sets using the MapReduce framework. Besides ensuring prediction accuracy and reduction rate, it has two desirable properties: first, it reduces the workload on the aggregation node; second, and most important, it produces the same result as the sequential version, which other parallel methods cannot achieve. We evaluate the performance of FCNN-MR on one small data set and two large data sets. The experimental results show that it is effective and practical.

Keywords: instance selection, data reduction, MapReduce, kNN

Procedia PDF Downloads 231
26449 A Study of the Travel Motivations of International Tourists in Visiting Thailand: A Case Study of Phuket

Authors: Suphaporn Rattanaphinanchai

Abstract:

The purpose of this study is 1) to describe and analyze the travel motivations of tourists visiting the Phi Phi Islands after the 2004 tsunami and 2) to better understand whether there are significant differences in tourists’ motivations for visiting the Phi Phi Islands after the tsunami across tourists with different demographic profiles. This study used the Phi Phi Islands, which were damaged by the 2004 Indian Ocean tsunami, as a case study. The instrument used in the present study is a self-administered questionnaire. A total of 200 questionnaires were collected from May to December 2015. Descriptive statistics, independent-samples t-tests, and analysis of variance were used to analyze the data. The results showed that the beauty of nature, good climate, and relaxing atmosphere motivated tourists to visit the Phi Phi Islands after the tsunami.

Keywords: motivation, Thailand, Thai tourism, Thai beaches

Procedia PDF Downloads 216
26448 Empirical Research on Rate of Return, Interest Rate and Mudarabah Deposit

Authors: Inten Meutia, Emylia Yuniarti

Abstract:

The objective of this study is to analyze the effects of the interest rate and the rate of return of Islamic banks on the amount of mudarabah deposits in Islamic banks. In analyzing the effect of the rate of return in Islamic banks and interest rate risk in conventional banks, the 1-month Islamic deposit rate of return and the 1-month fixed deposit interest rate of a total Islamic deposit are considered. Using data covering the period from January 2010 to September 2013, the study applies regression analysis to analyze the effect between variables and an independent t-test to analyze the mean difference between the rate of return and the rate of interest. Regression analysis shows that the rate of return has a significantly negative influence on mudarabah deposits, while the interest rate has a negative but not significant influence. The result of the independent t-test shows that the interest rate is not different from the rate of return in Islamic banks. This supports the hypothesis that the rate of return in Islamic banking mimics the rate of interest in conventional banks. The results of the study have important implications for the risk management practices of Islamic banks in Indonesia.

Keywords: conventional bank, interest rate, Islamic bank, rate of return

Procedia PDF Downloads 483
26447 Energy Consumption, Emission Absorption and Carbon Emission Reduction on Semarang State University Campus

Authors: Dewi Liesnoor Setyowati, Puji Hardati, Tri Marhaeni Puji Astuti, Muhammad Amin

Abstract:

Universitas Negeri Semarang (UNNES) is a university with a vision of conservation. The impact of UNNES conservation is a positive response from the community to the effort of greening the campus and the instilling of conservation values in the academic community. In reality, however, energy consumption on the UNNES campus tends to increase. The objectives of the study were to analyze energy consumption in the campus area, the absorption of emissions by trees, and the awareness of UNNES citizens in reducing emissions. The research focuses on energy consumption, carbon emissions, and the awareness of citizens in reducing emissions. The research subjects are UNNES citizens (lecturers, students, and employees). The research area covers 6 faculties and one administrative center building. Data were collected by observation, interview, and documentation. The research used a quantitative descriptive method to analyze the data. The number of trees at UNNES is 10,264. Total emissions on the UNNES campus are 7,862,281.56 kg/year, and tree absorption is 6,289,250.38 kg/year. In the UNNES campus area there are still 1,575,031.18 kg/year of emissions not yet absorbed by trees. There are only two faculty areas whose trees are capable of absorbing emissions. The awareness of UNNES citizens in reducing energy consumption is seen in changed habits: using energy-saving equipment (65%), reducing energy consumption per unit (68%), and promoting energy literacy for UNNES citizens (74%). UNNES leaders always motivate the citizens of UNNES to reduce and change their patterns of energy consumption.

Keywords: energy consumption, carbon emission absorption, emission reduction, energy literation

Procedia PDF Downloads 220
26446 Decay Analysis of 118Xe* Nucleus Formed in 28Si Induced Reaction

Authors: Manoj K. Sharma, Neha Grover

Abstract:

Dynamical cluster decay model (DCM) is applied to study the decay mechanism of 118Xe* nucleus in reference to recent data on 28Si + 90Zr → 118Xe* reaction, as an extension of our previous work on the dynamics of 112Xe* nucleus. It is relevant to mention here that DCM is based on collective clusterization approach, where emission probability of different decay paths such as evaporation residue (ER), intermediate mass fragments (IMF) and fission etc. is worked out on parallel scale. Calculations have been done over a wide range of center of mass energies with Ec.m. = 65 - 92 MeV. The evaporation residue (ER) cross-sections of 118Xe* compound nucleus are fitted in reference to available data, using spherical and quadrupole (β2) deformed choice of decaying fragments within the optimum orientations approach. It may be noted that our calculated cross-sections find decent agreement with experimental data and hence provide an opportunity to analyze the exclusive role of deformations in view of fragmentation behavior of 118Xe* nucleus. The possible contribution of IMF fragments is worked out and an extensive effort is being made to analyze the role of excitation energy, angular momentum, diffuseness parameter and level density parameter to have better understanding of the decay patterns governed in the dynamics of 28Si + 90Zr → 118Xe* reaction.

Keywords: cross-sections, deformations, fragmentation, angular momentum

Procedia PDF Downloads 281
26445 Data Science-Based Key Factor Analysis and Risk Prediction of Diabetic

Authors: Fei Gao, Rodolfo C. Raga Jr.

Abstract:

This research proposal aims to ascertain the major risk factors for diabetes and to design a predictive model for risk assessment. The project aims to improve early detection and management of diabetes by utilizing data science techniques, which may improve patient outcomes and healthcare efficiency. Using the Diabetes Health Indicators Dataset from Kaggle as the research data, the phase relation values of each attribute were used to analyze and choose the attributes that might influence the examiner's survival probability. We compare and evaluate eight machine learning algorithms. Our investigation begins with comprehensive data preprocessing, including feature engineering and dimensionality reduction, aimed at enhancing data quality. The dataset, comprising health indicators and medical data, serves as the foundation for training and testing these algorithms. A rigorous cross-validation process is applied, and we assess performance using five key metrics: accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC). After analyzing the data characteristics, we investigate their impact on the likelihood of diabetes and develop corresponding risk indicators.
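
Four of the five metrics named in this abstract follow directly from the counts of a binary confusion matrix, as the sketch below shows. The function name and the convention that label 1 means "diabetic" are assumptions made for illustration; AUC-ROC is left out because it needs predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive, 0 = negative)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty label lists of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Report 0.0 when a ratio has an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

On an imbalanced health dataset, accuracy alone can look high while recall for the positive class stays low, which is one reason to report the metrics together.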

Keywords: diabetes, risk factors, predictive model, risk assessment, data science techniques, early detection, data analysis, Kaggle

Procedia PDF Downloads 42
26444 Non-Linear Causality Inference Using BAMLSS and Bi-CAM in Finance

Authors: Flora Babongo, Valerie Chavez

Abstract:

Inferring causality from observational data is a fundamental subject, especially in quantitative finance. So far, most papers analyze additive noise models with either linearity, nonlinearity, or Gaussian noise. We fill the gap by providing a nonlinear and non-Gaussian causal multiplicative noise model that aims to distinguish the cause from the effect using a two-step method based on Bayesian additive models for location, scale and shape (BAMLSS) and on causal additive models (CAM). We tested our method on simulated and real data and reached an accuracy of 0.86 on average. As real data, we considered the causality between financial indices such as the S&P 500, Nasdaq, CAC 40 and Nikkei, and companies' log-returns. Our results can be useful for inferring causality when the data are heteroskedastic or non-injective.

Keywords: causal inference, DAGs, BAMLSS, financial index

Procedia PDF Downloads 125
26443 MapReduce Algorithm for Geometric and Topological Information Extraction from 3D CAD Models

Authors: Ahmed Fradi

Abstract:

In a digital world in perpetual evolution and acceleration, with data ever more voluminous, rich, and varied, the new software solutions that emerged with the Big Data phenomenon offer companies new opportunities, enabling them not only to optimize their business and evolve their production model, but also to reorganize themselves to increase competitiveness and identify new strategic axes. Design and manufacturing companies, like others, face these challenges: data represent a major asset, provided that the companies know how to capture, refine, combine, and analyze them. The objective of our paper is to propose a solution for extracting geometric and topological information from databases of 3D CAD models (specifically STEP files), with a specific algorithm based on the MapReduce programming paradigm. Our proposal is the first step of our future approach to 3D CAD object retrieval.

Keywords: Big Data, MapReduce, 3D object retrieval, CAD, STEP format

Procedia PDF Downloads 515
26442 Analysis of Urban Population Using Twitter Distribution Data: Case Study of Makassar City, Indonesia

Authors: Yuyun Wabula, B. J. Dewancker

Abstract:

In the past decade, social networking apps have grown very rapidly. Geolocation data is one of the important features of social media, which can attach the user's real-world location coordinates. This paper proposes the use of geolocation data from the Twitter social media application to gain knowledge about urban dynamics, especially human mobility behavior. It aims to explore the relation between Twitter geolocation and the presence of people in the urban area. First, the study analyzes the spread of people in particular areas within the city using Twitter data. Second, we match and categorize existing places based on the same individuals visiting them. Then, we combine the Twitter data from the tracking result with the questionnaire data to capture the Twitter user profile. To do that, we used frequency distribution analysis to learn the visitors’ percentages. To validate the hypothesis, we compare the results with the local population statistics and land use mapping released by the city planning department of the Makassar local government. The results show that there is a correlation between Twitter geolocation and the questionnaire data. Thus, integrating Twitter data and survey data can reveal the profile of social media users.

Keywords: geolocation, Twitter, distribution analysis, human mobility

Procedia PDF Downloads 287
26441 A Relational Data Base for Radiation Therapy

Authors: Raffaele Danilo Esposito, Domingo Planes Meseguer, Maria Del Pilar Dorado Rodriguez

Abstract:

As far as we know, there is still no commercial solution that allows the huge amount of data generated in a modern Radiation Oncology Department to be managed openly and configured to user needs. Currently available information management systems focus mainly on Record & Verify and clinical data, and only to a small extent on physical data. This results in a partial and limited use of the information actually available. In the present work, we describe the implementation at our department of a centralized information management system based on a web server. Our system manages both the information generated during patient planning and treatment and information of general interest to the whole department (e.g., treatment protocols and quality assurance protocols). Our objective is to be able to analyze all the available data in a simple and efficient way and thus obtain quantitative evaluations of our treatments. This would allow us to improve our workflow and protocols. To this end, we have implemented a relational database that allows us to use all the available information in a practical and efficient way. As always, we use only license-free software.

Keywords: information management system, radiation oncology, medical physics, free software

Procedia PDF Downloads 211
26440 Design of Knowledge Management System with Geographic Information System

Authors: Angga Hidayah Ramadhan, Luciana Andrawina, M. Azani Hasibuan

Abstract:

Data can become the core of a decision if it is processed well, that is, if data is turned into information and information into knowledge that leads to wisdom or a decision. Today, many organizations have not realized this, including the XYZ University Admission Directorate, the executor of the national admission program called Seleksi Masuk Bersama (SMB), where until now staff have relied only on intuition to make decisions. If data were processed in this way, the organization could analyze it to make the right decisions and sell as many PINs as possible to the student candidates or registrants who take part in SMB. Therefore, a Knowledge Management System (KMS) with a Geographic Information System (GIS) using 5C4C is needed to make the organization's data more useful and to support decision making. The information system processes data into information based on PIN sales data with the 5Cs (Contextualized, Categorize, Calculation, Correction, Condensed) and converts information into knowledge with the 4Cs (Comparing, Consequence, Connection, Conversation). After these steps, the data can help staff make decisions, resolve problems, and communicate, and can help inexperienced employees learn more quickly. GIS functionality adds visualization based on spatial data, which can be used to show events in each province with the indicators provided by the system. The system can also store tacit knowledge and convert it into explicit knowledge in an expert system, based on the problems found from the consequences of the information. With the system, each team can make decisions in the same structured way and, most importantly, based on actual events and data.

Keywords: 5C4C, data, information, knowledge

Procedia PDF Downloads 427
26439 Analysis of Business Intelligence Tools in Healthcare

Authors: Avishkar Gawade, Omkar Bansode, Ketan Bhambure, Bhargav Deore

Abstract:

In recent years, a wide range of business intelligence (BI) technologies have been applied to different areas in order to support the decision-making process. BI enables the extraction of knowledge from data stores. BI tools are usually used in the public health field for financial and administrative purposes. BI uses a dashboard in the presentation stage to deliver information to end users. In this paper, we analyze some open source BI tools on the market and their applicability in the clinical sphere, taking into consideration the general characteristics of the clinical environment. A pervasive BI platform was developed using a real case in order to prove the tool's viability. The analysis of various BI tools is done with the help of several parameters, such as data security, data integration, data quality, reporting and analytics, performance, scalability, and cost effectiveness.

Keywords: CDSS, EHR, business intelligence, tools

Procedia PDF Downloads 109
26438 Development of Risk Management System for Urban Railroad Underground Structures and Surrounding Ground

Authors: Y. K. Park, B. K. Kim, J. W. Lee, S. J. Lee

Abstract:

To assess the risk of underground structures and the surrounding ground, we collect basic data through engineering methods of measurement, exploration, and survey, and derive the risk through proper analysis and assessment of urban railroad underground structures and the surrounding ground, including station inflow. Basic data are obtained from fiber-optic sensors, MEMS sensors, water quantity/quality sensors, a tunnel scanner, ground penetrating radar, and a light weight deflectometer, and are evaluated as to whether or not they exceed the proper value. Based on these data, we analyze the risk level of urban railroad underground structures and the surrounding ground. We also develop a risk management system to manage these data efficiently and to provide a convenient interface for data input and output.

Keywords: urban railroad, underground structures, ground subsidence, station inflow, risk

Procedia PDF Downloads 309
26437 Dissimilarity Measure for General Histogram Data and Its Application to Hierarchical Clustering

Authors: K. Umbleja, M. Ichino

Abstract:

Symbolic data mining has been developed to analyze data in very large datasets. It is also useful in cases when entry-specific details should remain hidden. Symbolic data mining is quickly gaining popularity as the datasets in need of analysis become ever larger. One type of symbolic data is the histogram, which makes it possible to store huge amounts of information in a single variable with a high level of granularity. Other types of symbolic data can also be described as histograms, which makes the histogram a very important and general symbolic data type: a method developed for histograms can also be applied to other types of symbolic data. Due to their complex structure, analyzing histograms is complicated. This paper proposes a method that makes it possible to compare two histogram-valued variables and thereby find a dissimilarity between two histograms. The proposed method uses the Ichino-Yaguchi dissimilarity measure for mixed feature-type data analysis as a base and develops a dissimilarity measure specifically for histogram data, which makes it possible to compare histograms with different numbers of bins and bin widths (so-called general histograms). The proposed dissimilarity measure is then used as a measure for clustering. Furthermore, a linkage method based on weighted averages is proposed, with the concept of cluster compactness used to measure the quality of clustering. The method is then validated by application to real datasets. As a result, the proposed dissimilarity measure is found to produce adequate and comparable results with general histograms without loss of detail or the need to transform the data.

Keywords: dissimilarity measure, hierarchical clustering, histograms, symbolic data analysis

Procedia PDF Downloads 131
26436 Existential Concerns and Related Manifestations of Higher Learning Institution Students in Ethiopia: A Case Study of Aksum University

Authors: Ezgiamn Abraha Hagos

Abstract:

The primary objective of this study was to assess the existential concerns and related manifestations of higher learning students by investigating their perception of a meaningful life and evaluating their purpose in life. In addition, the study aimed to assess the manifestations of existential pain among the students. Data were gathered using the Purpose in Life test (PIL), the Well-being Manifestation Measure Scale (WBMMS), and focus group discussion. The total number of participants was 478, of whom 299 were male and the remaining 179 female. They were selected using a simple random sampling technique. Data were analyzed in two ways: SPSS version 20 was used to analyze the quantitative part, and narrative modes were used to analyze the qualitative data. The findings revealed that students are involved in risk-taking behaviors such as alcohol ingestion, drug use, Khat (chat) chewing, and unsafe sex. In line with this, life on campus was perceived as temporary, and as a result a sense of hedonism at any cost was prevalent. The most important thing for the majority of the students was to know the purpose of life. Regarding the WBMMS, there was no statistically significant difference between males and females, and, with the exception of the happiness sub-scale, the mean was low in all sub-scales. Finally, assisting adolescents to develop holistically in terms of body, mind, and spirit is recommended.

Keywords: existential concerns, higher learning institutions, Ethiopia, Aksum University

Procedia PDF Downloads 395
26435 New Two-Way Map-Reduce Join Algorithm: Hash Semi Join

Authors: Marwa Hussein Mohamed, Mohamed Helmy Khafagy, Samah Ahmed Senbel

Abstract:

MapReduce is a programming model used to handle and support massive data sets. The rapid growth in data size makes the analysis of big data one of the most important issues today. MapReduce is used to analyze data and extract useful information through two simple functions, map and reduce, which are the only parts written by the programmer; the framework provides load balancing, fault tolerance, and high scalability. Join is the most important operation in data analysis, but MapReduce does not directly support it. This paper explains two-way MapReduce join algorithms, semi-join and per-split semi-join, and proposes a new algorithm, hash semi-join, which uses a hash table to increase performance by eliminating unused records as early as possible and applies the join using the hash table, rather than using the map function, to match the join key with the other data table in the second phase. Using hash tables does not affect memory size, because only the matched records from the second table are saved. Our experimental results show that hash semi-join with a hash table has higher performance than the other two algorithms as the data size increases from 10 million to 500 million records, and that running time increases according to the size of the joined records between the two tables.
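
The general idea described in this abstract, discarding non-matching records early and joining through a hash table, can be sketched in a single process as follows. This is not the authors' MapReduce implementation: the function name, the three-phase split, and the list-of-dicts record format are assumptions made for illustration.

```python
def hash_semi_join(left, right, key):
    """Join two lists of dict records on `key`, filtering `right` before joining."""
    # Phase 1: collect the distinct join keys that occur in the left table.
    left_keys = {row[key] for row in left}

    # Phase 2: keep only right-table records whose key occurs on the left,
    # stored in a hash table (dict) keyed by the join key.
    matched = {}
    for row in right:
        if row[key] in left_keys:
            matched.setdefault(row[key], []).append(row)

    # Phase 3: probe the hash table once per left record and emit joined rows.
    joined = []
    for left_row in left:
        for right_row in matched.get(left_row[key], []):
            joined.append({**right_row, **left_row})
    return joined
```

Because the hash table holds only right-table records that have a partner on the left, its size tracks the number of joined records rather than the full size of the second table, which matches the memory argument made in the abstract.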

Keywords: map reduce, hadoop, semi join, two way join

Procedia PDF Downloads 486
26434 Publishing Formats of Scientific Journals in the XXI Century: the Case of Small Publishing Market

Authors: Arūnas Gudinavičius, Andrius Šuminas

Abstract:

The analysis of scholarly journal formats is fragmented and needs to be studied from the point of view of scientific communication. While PDF is, to the best of the authors’ knowledge, probably the most popular digital format of the XXI century, more formats are available: HTML, EPUB, etc. Our aim is to analyze how important these formats are to readers and what they contribute to scientific communication. We want to investigate how popular printed journals still are among scholars and whether different formats are preferred in different fields of science. In most cases, the publishing of scientific journals is examined from the narrow perspective of a particular university's science affairs administrators or a research funding institution. We believe that more data on the formats used in scholarly periodicals currently published in Lithuania, as well as in Eastern Europe, are needed. Science communication is often analyzed as a directed chain of information in the author-publisher-reader cycle. This paper focuses on the publishing part of this chain. A distinction is made between formal and informal forms of scientific communication, which is relevant in today's context, when both forms of communication intertwine and complement each other. In our research, we analyze formal documentary communication (the formats in which scientific articles are published), that is, scientific information recorded in a certain medium and formatted in a certain format (printed, PDF, HTML, EPUB, etc.). We analyze the stage of publication of research results in scientific journals and their dissemination through specific publication formats. The paper aims to systematize and analyze the various types of formats of scientific journals published in the XXI century in Lithuania (a small publishing market). The research analyzes the case of a small European country and presents the characteristics of the publishing formats of scientific periodicals.

Keywords: scientific communication, scientific journals, publishing formats, reading

Procedia PDF Downloads 58
26433 Comparison of Different Reanalysis Products for Predicting Extreme Precipitation in the Southern Coast of the Caspian Sea

Authors: Parvin Ghafarian, Mohammadreza Mohammadpur Panchah, Mehri Fallahi

Abstract:

Synoptic patterns from the surface up to the tropopause are very important for forecasting the weather and atmospheric conditions. There are many tools to prepare and analyze these maps. Reanalysis data, the outputs of numerical weather prediction models, satellite images, meteorological radar, and weather station data are used in forecasting centers around the world to predict the weather. Forecasting extreme precipitation on the southern coast of the Caspian Sea (CS) is a major challenge due to the complex topography. Also, there are different types of climate in these areas. In this research, we used two reanalysis datasets, the ECMWF Reanalysis 5th Generation (ERA5) and the National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) reanalysis, for verification of the numerical model. ERA5 is the latest ECMWF reanalysis. The temporal resolution of ERA5 is hourly, and that of NCEP/NCAR is six-hourly. Atmospheric parameters such as mean sea level pressure, geopotential height, relative humidity, wind speed and direction, and sea surface temperature were selected and analyzed. Different types of precipitation (rain and snow) were selected. The results showed that NCEP/NCAR is better able to represent the intensity of the atmospheric system. ERA5 is suitable for extracting parameter values at a specific point. ERA5 is also appropriate for analyzing snowfall events over the CS (snow cover and snow depth). Sea surface temperature plays the main role in generating instability over the CS, especially when cold air passes over the CS. The sea surface temperature of the NCEP/NCAR product has low resolution near the coast. Nevertheless, both datasets were able to detect the synoptic patterns that led to heavy rainfall over the CS. Due to the time lag, however, they are not suitable for forecast centers; these two datasets are best applied to research and verification of meteorological models. Finally, ERA5 has a better resolution than the NCEP/NCAR reanalysis data, but NCEP/NCAR data are available from 1948 and are appropriate for long-term research.

Keywords: synoptic patterns, heavy precipitation, reanalysis data, snow

Procedia PDF Downloads 90
26432 Handling, Exporting and Archiving Automated Mineralogy Data Using TESCAN TIMA

Authors: Marek Dosbaba

Abstract:

Within the mining sector, SEM-based Automated Mineralogy (AM) has been the standard application for quickly and efficiently handling mineral processing tasks. Over the last decade, the trend has been to analyze larger numbers of samples, often with a higher level of detail. This has necessitated a shift from interactive sample analysis performed by an operator using an SEM to an increased reliance on offline processing to analyze and report the data. In response to this trend, the TESCAN TIMA Mineral Analyzer is designed to quickly create a virtual copy of the studied samples, thereby preserving all the necessary information. Depending on the selected data acquisition mode, TESCAN TIMA can perform hyperspectral mapping and save an X-ray spectrum for each pixel or segment, respectively. This approach allows the user to browse through elemental distribution maps of all elements detectable by means of energy dispersive spectroscopy. Re-evaluation of the existing data for the presence of previously unconsidered elements is possible without the need to repeat the analysis. Additional tiers of data, such as secondary electron or cathodoluminescence images, can also be recorded. To take full advantage of these information-rich datasets, TIMA utilizes a new archiving tool introduced by TESCAN. The dataset size can be reduced for long-term storage, and all information can be recovered on demand in case of renewed interest. TESCAN TIMA is optimized for network storage of its datasets because of the larger data storage capacity of servers compared to local drives, which also allows multiple users to access the data remotely. This goes hand in hand with the support of remote control for the entire data acquisition process. TESCAN also brings a newly extended open-source data format that allows other applications to extract, process, and report AM data. This offers the ability to link TIMA data to large databases feeding plant performance dashboards or geometallurgical models. The traditional tabular particle-by-particle or grain-by-grain export process is preserved and can be customized with scripts to include user-defined particle/grain properties.

Keywords: Tescan, electron microscopy, mineralogy, SEM, automated mineralogy, database, TESCAN TIMA, open format, archiving, big data

Procedia PDF Downloads 83
26431 Model for Introducing Products to New Customers through Decision Tree Using Algorithm C4.5 (J-48)

Authors: Komol Phaisarn, Anuphan Suttimarn, Vitchanan Keawtong, Kittisak Thongyoun, Chaiyos Jamsawang

Abstract:

This article analyzes insurance information that records customers' decisions when purchasing a life insurance pay package. The data were analyzed in order to present the Life Insurance Perfect Pay package to new customers in a way that meets their needs as closely as possible. The basic data on insurance pay packages were collected for data mining, thus reducing the scattering of information. The data were then classified to obtain a decision model, or decision tree, using the C4.5 (J-48) algorithm. In the classification, WEKA tools were used to build the model, and testing datasets were used to test the accuracy of the decision tree. Validation of the model showed that 68.43% of predictions were accurate while 31.25% were errors. The same dataset was then tested with other models, i.e. Naive Bayes and ZeroR. The results showed that the J-48 method predicted more accurately. The researcher therefore applied the decision tree in writing a program that introduces the product to new customers, to support their decision making in purchasing the insurance package that meets their needs as closely as possible.
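As an illustration of how a C4.5-style tree chooses its splits, the following minimal Python sketch computes the gain ratio (information gain divided by split information) for categorical attributes. It is not the authors' WEKA workflow: the records, the attribute names `age_band` and `gender`, and the target `buys` are hypothetical and serve only to show the criterion.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())


def gain_ratio(rows, attr, target):
    """Gain ratio of splitting `rows` on the categorical attribute `attr`."""
    n = len(rows)
    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    # Information gain: entropy before the split minus weighted entropy after.
    before = entropy([row[target] for row in rows])
    after = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = before - after
    # Split information penalizes attributes with many distinct values.
    split_info = entropy([row[attr] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records; not data from the study.
rows = [
    {"age_band": "young", "gender": "f", "buys": "yes"},
    {"age_band": "young", "gender": "m", "buys": "yes"},
    {"age_band": "old", "gender": "f", "buys": "no"},
    {"age_band": "old", "gender": "m", "buys": "no"},
]

# The attribute with the highest gain ratio would be chosen for the split.
best = max(["age_band", "gender"], key=lambda a: gain_ratio(rows, a, "buys"))
```

In this toy set, `age_band` separates the two classes completely while `gender` carries no information, so `age_band` is selected as the split attribute.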

Keywords: decision tree, data mining, customers, life insurance pay package

Procedia PDF Downloads 402
26430 Attribute Analysis of Quick Response Code Payment Users Using Discriminant Non-negative Matrix Factorization

Authors: Hironori Karachi, Haruka Yamashita

Abstract:

Recently, quick response (QR) code systems have been gaining popularity. Many companies are introducing new QR code payment services, and the services compete with each other to increase their number of users. To increase the number of users, one should grasp the differences between services in users' demographic information, usage information, and value. In this study, we analyze real-world data provided by Nomura Research Institute, including demographic data on users and information on users' usage of two services: LINE Pay and PayPay. Nonnegative Matrix Factorization (NMF) is widely used to analyze such data and interpret their features; however, the target data have a missing-data problem. EM-algorithm NMF (EMNMF) completes the unknown values so that the features of data given in matrix form can be understood. Moreover, to compare the results of NMF analyses of two matrices, Discriminant NMF (DNMF) shows the differences in user features between the two matrices. In this study, we combine EMNMF and DNMF and analyze the target data. As an interpretation, we show the differences in user features between LINE Pay and PayPay.
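The idea of completing missing values with an EM-style NMF can be sketched in a few lines of Python. This is a generic illustration, not the authors' EMNMF or DNMF implementation: it assumes a squared-error objective with standard multiplicative updates, marks missing entries with `None`, and uses a hypothetical toy matrix. The function names `em_nmf` and `matmul` are introduced here for illustration only.

```python
import random


def matmul(A, B):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def em_nmf(X, rank, iters=200, seed=0, eps=1e-9):
    """Factor X (missing entries given as None) into nonnegative W and H."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # E-step: fill each missing entry with the current reconstruction W*H.
        WH = matmul(W, H)
        Y = [[WH[i][j] if X[i][j] is None else X[i][j] for j in range(m)]
             for i in range(n)]
        # M-step: one round of multiplicative updates on the completed matrix.
        Wt = transpose(W)
        num, den = matmul(Wt, Y), matmul(matmul(Wt, W), H)
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        Ht = transpose(H)
        num, den = matmul(Y, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return W, H


# Hypothetical toy matrix (rows could be users, columns usage features);
# the observed entries follow a rank-1 pattern and one entry is missing.
X = [[1.0, 2.0, 4.0],
     [2.0, None, 8.0],
     [3.0, 6.0, 12.0]]
W, H = em_nmf(X, rank=1)
filled = matmul(W, H)  # filled[1][1] is the estimate for the missing entry
```

Because the factors stay nonnegative, the rows of `H` can be read as usage patterns and the rows of `W` as each user's weight on those patterns, which is the kind of interpretation the abstract refers to.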

Keywords: data science, non-negative matrix factorization, missing data, quality of services

Procedia PDF Downloads 98
26429 Developing an Information Model of Manufacturing Process for Sustainability

Authors: Jae Hyun Lee

Abstract:

Manufacturing companies use life-cycle inventory databases to analyze the sustainability of their manufacturing processes. Life-cycle inventory data provide reference data, which may not be accurate for a specific company. Collecting accurate manufacturing process data for a specific company requires enormous time and effort. An information model of typical manufacturing processes can reduce the time and effort needed to obtain appropriate reference data for a specific company. This paper presents an attempt to build an abstract information model that can be used to develop information models for specific manufacturing processes.

Keywords: process information model, sustainability, OWL, manufacturing

Procedia PDF Downloads 397
26428 Evaluation of Social Media Customer Engagement: A Content Analysis of Automobile Brand Pages

Authors: Adithya Jaikumar, Sudarsan Jayasingh

Abstract:

The dramatic technology-led changes that continue to take place in the marketplace have led to the emergence and implications of online brand pages on social media networks. The Facebook brand page has become extremely popular among different brands. The primary aim of this study was to identify the impact of post format and content type on customer engagement on Facebook brand pages. The methodology was to analyze and categorize 9037 content messages posted by 20 automobile brands in India from April 2014 to March 2015, together with the customer activity they generated in return. The data were obtained from Fanpage Karma, an online tool used for social media analytics. The statistical technique used to analyze the count data was negative binomial regression. The study indicates that there is a statistically significant relationship between the type of post and customer engagement. It shows that photos are the most frequently posted format, while the highest engagement is associated with videos. The findings also reveal that social-event and entertainment-related content increases engagement with the message.

Keywords: content analysis, customer engagement, digital engagement, facebook brand pages, social media

Procedia PDF Downloads 291
26427 Cybersecurity Assessment of Decentralized Autonomous Organizations in Smart Cities

Authors: Claire Biasco, Thaier Hayajneh

Abstract:

A smart city is the integration of digital technologies in urban environments to enhance the quality of life. Smart cities capture real-time information from devices, sensors, and network data to analyze and improve city functions such as traffic analysis, public safety, and environmental impacts. Current smart cities face controversy due to their reliance on real-time data tracking and surveillance. Internet of Things (IoT) devices and blockchain technology are converging to reshape smart city infrastructure away from its centralized model. Connecting IoT data to blockchain applications would create a peer-to-peer, decentralized model. Furthermore, blockchain technology powers the ability for IoT device data to shift from the ownership and control of centralized entities to individuals or communities with Decentralized Autonomous Organizations (DAOs). In the context of smart cities, DAOs can govern cyber-physical systems to have a greater influence over how urban services are being provided. This paper will explore how the core components of a smart city now apply to DAOs. We will also analyze different definitions of DAOs to determine their most important aspects in relation to smart cities. Both categorizations will provide a solid foundation to conduct a cybersecurity assessment of DAOs in smart cities. It will identify the benefits and risks of adopting DAOs as they currently operate. The paper will then provide several mitigation methods to combat cybersecurity risks of DAO integrations. Finally, we will give several insights into what challenges will be faced by DAO and blockchain spaces in the coming years before achieving a higher level of maturity.

Keywords: blockchain, IoT, smart city, DAO

Procedia PDF Downloads 68