Search results for: big data analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 40541

40241 Confirmatory Factor Analysis of Smartphone Addiction Inventory (SPAI) in the Yemeni Environment

Authors: Mohammed Al-Khadher

Abstract:

Currently, we are witnessing rapid advancements in the field of information and communications technology, which compel us, as psychologists, to address the psychological and social effects of such developments. They also drive us to continually develop and prepare measurement tools compatible with the changes brought about by the digital revolution. In this context, the current study aimed to examine the factor structure of the Smartphone Addiction Inventory (SPAI) in the Republic of Yemen. The sample consisted of 1920 university students (1136 males and 784 females) who answered the inventory, and the data were analyzed using the statistical software AMOS V25. The factor analysis results showed a good fit of the data to the five-factor model with excellent indicators: RMSEA = .052, CFI = .910, GFI = .931, AGFI = .915, TLI = .897, NFI = .895, RFI = .880, and RMR = .032, all within the ideal range, supporting the fit of the scale’s factor model. The confirmatory factor analysis results showed factor loadings of 4 items on Time Spent, 4 items on Compulsivity, 8 items on Daily Life Interference, 5 items on Craving, and 3 items on Sleep Interference; all standardized factor loadings were statistically significant (p < .001).

Keywords: smartphone addiction inventory (SPAI), confirmatory factor analysis (CFA), Yemeni students, people at risk of smartphone addiction

Procedia PDF Downloads 57
40240 A Novel Heuristic for Analysis of Large Datasets by Selecting Wrapper-Based Features

Authors: Bushra Zafar, Usman Qamar

Abstract:

Large sample sizes and high dimensionality undermine the effectiveness of conventional data mining methodologies. Data mining techniques are important tools for collecting knowledge from a variety of databases; they provide supervised learning in the form of classification to design models that describe vital data classes, where the structure of the classifier is based on the class attribute. Classification efficiency and accuracy are often influenced to a great extent by noisy and undesirable features in real application data sets. The inherent nature of a data set greatly masks its quality analysis and leaves us with quite few practical approaches to use. To our knowledge, for the first time, we present a new approach for investigating the structure and quality of datasets by providing a targeted analysis of the localization of noisy and irrelevant features. Machine learning relies heavily on feature selection as a pre-processing step, which selects a small subset of features from the full set by reducing the feature space according to a certain evaluation criterion. The primary objective of this study is to trim down the scope of the given data sample by searching for a small set of important features that may result in good classification performance. For this purpose, a heuristic for wrapper-based feature selection using a genetic algorithm is used, with an external classifier for discriminative feature selection. A feature is selected based on its number of occurrences in the chosen chromosomes. A sample dataset has been used to demonstrate the proposed idea effectively. The proposed method improved the average accuracy across different datasets to about 95%. Experimental results illustrate that the proposed algorithm increases the accuracy of prediction of different diseases.
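The occurrence-based selection step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each chosen chromosome is a binary mask over the features; the function name `select_by_occurrence`, the `min_count` threshold, and the example masks are hypothetical and not taken from the paper, and the genetic algorithm and external classifier that produce the chromosomes are not shown.

```python
from collections import Counter


def select_by_occurrence(chromosomes, min_count):
    """Return indices of features switched on in at least min_count chromosomes.

    Each chromosome is assumed to be a sequence of 0/1 flags, one per feature.
    """
    counts = Counter()
    for chrom in chromosomes:
        for idx, bit in enumerate(chrom):
            if bit:
                counts[idx] += 1
    return sorted(idx for idx, c in counts.items() if c >= min_count)


# Hypothetical example: four chosen chromosomes over five features.
chosen = [
    [1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 1, 0, 1],
]
selected = select_by_occurrence(chosen, min_count=3)
```

The threshold is a design choice: a higher `min_count` keeps only features that the search repeatedly favoured, at the cost of possibly discarding useful but less frequent ones.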

Keywords: data mining, genetic algorithm, KNN algorithms, wrapper based feature selection

Procedia PDF Downloads 293
40239 Hybrid Approach for Country’s Performance Evaluation

Authors: C. Slim

Abstract:

This paper presents an integrated model that hybridizes data envelopment analysis (DEA) and support vector machine (SVM) to classify countries according to their efficiency and performance. This model takes into account aspects of multi-dimensional indicators, decision-making hierarchy and relativity of measurement. Starting from a set of performance indicators that is as exhaustive as possible, a process of successive aggregations has been developed to attain an overall evaluation of a country’s competitiveness.

Keywords: Artificial Neural Networks (ANN), Support vector machine (SVM), Data Envelopment Analysis (DEA), Aggregations, indicators of performance

Procedia PDF Downloads 305
40238 A Safety Analysis Method for Multi-Agent Systems

Authors: Ching Louis Liu, Edmund Kazmierczak, Tim Miller

Abstract:

Safety analysis for multi-agent systems is complicated by the potentially nonlinear interactions between agents. This paper proposes a method for analyzing the safety of multi-agent systems by explicitly focusing on interactions and on the accident data of systems that are similar in structure and function to the system being analyzed. The method creates a Bayesian network using the accident data from similar systems. A feature of our method is that the events in accident data are labeled with HAZOP guide words. Our method uses an ontology to abstract away from the details of a multi-agent implementation. Using the ontology, our method then constructs an “Interaction Map,” a graphical representation of the patterns of interactions between agents and other artifacts. Interaction maps, combined with statistical data from accidents and the HAZOP classifications of events, can be converted into a Bayesian network. Bayesian networks allow designers to explore “what if” scenarios and make design trade-offs that maintain safety. We show how to use the Bayesian networks and the interaction maps to improve multi-agent system designs.

Keywords: multi-agent system, safety analysis, safety model, interaction map

Procedia PDF Downloads 392
40237 The Development of the Website Learning the Local Wisdom in Phra Nakhon Si Ayutthaya Province

Authors: Bunthida Chunngam, Thanyanan Worasesthaphong

Abstract:

The objective of this research was to develop a website for learning the local wisdom of Phra Nakhon Si Ayutthaya province and to study the satisfaction of system users. The sample was a multistage sample of 100 questionnaires, and reliability was calculated with Cronbach’s alpha coefficient (α = 0.82). The system had three functions: system use, system feature evaluation, and system accuracy evaluation. Descriptive statistics (frequency, percentage, mean, and standard deviation) were used to describe the sample. The data analysis found a good level of satisfaction with system use performance quality (mean 4.44), with the system feature function (mean 4.11), and with system accuracy (mean 3.74).
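The reliability figure quoted above is a Cronbach's alpha coefficient. As a minimal sketch of how such a value is commonly computed, the function below uses the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores), with sample variances. The function name and the toy responses are hypothetical; the study's own questionnaire data are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: three respondents, three perfectly consistent items.
toy_responses = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
toy_alpha = cronbach_alpha(toy_responses)
```

With perfectly consistent items, as in the toy data, alpha reaches its maximum of 1; real questionnaires yield lower values.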

Keywords: website, learning, local wisdom, Phra Nakhon Si Ayutthaya province

Procedia PDF Downloads 92
40236 Exploring the Role of Data Mining in Crime Classification: A Systematic Literature Review

Authors: Faisal Muhibuddin, Ani Dijah Rahajoe

Abstract:

This in-depth exploration, through a systematic literature review, scrutinizes the nuanced role of data mining in the classification of criminal activities. The research focuses on investigating various methodological aspects and recent developments in leveraging data mining techniques to enhance the effectiveness and precision of crime categorization. Commencing with an exposition of the foundational concepts of crime classification and its evolutionary dynamics, this study details the paradigm shift from conventional methods towards approaches supported by data mining, addressing the challenges and complexities inherent in the modern crime landscape. Specifically, the research delves into various data mining techniques, including K-means clustering, Naïve Bayes, K-nearest neighbour, and clustering methods. A comprehensive review of the strengths and limitations of each technique provides insights into their respective contributions to improving crime classification models. The integration of diverse data sources takes centre stage in this research. A detailed analysis explores how the amalgamation of structured data (such as criminal records) and unstructured data (such as social media) can offer a holistic understanding of crime, enriching classification models with more profound insights. Furthermore, the study explores the temporal implications in crime classification, emphasizing the significance of considering temporal factors to comprehend long-term trends and seasonality. The availability of real-time data is also elucidated as a crucial element in enhancing responsiveness and accuracy in crime classification.

Keywords: data mining, classification algorithm, naïve bayes, k-means clustering, k-nearest neighbor, crime, data analysis, systematic literature review

Procedia PDF Downloads 30
40235 The Right to Data Portability and Its Influence on the Development of Digital Services

Authors: Roman Bieda

Abstract:

The General Data Protection Regulation (GDPR), which will come into force on 25 May 2018, will create a new legal framework for the protection of personal data in the European Union. Article 20 of GDPR introduces a right to data portability. This right allows data subjects to receive the personal data which they have provided to a data controller, in a structured, commonly used and machine-readable format, and to transmit this data to another data controller. The right to data portability, by facilitating the transfer of personal data between IT environments (e.g. applications), will also facilitate changing the provider of services (e.g. changing a bank or a cloud computing service provider). Therefore, it will contribute to the development of competition and the digital market. The aim of this paper is to discuss the right to data portability and its influence on the development of new digital services.

Keywords: data portability, digital market, GDPR, personal data

Procedia PDF Downloads 442
40234 Autonomic Threat Avoidance and Self-Healing in Database Management System

Authors: Wajahat Munir, Muhammad Haseeb, Adeel Anjum, Basit Raza, Ahmad Kamran Malik

Abstract:

Databases are key components of software systems. Given the exponential growth of data, it is a concern that the data remain accurate and available. The data in databases is vulnerable to internal and external threats, especially when it is sensitive, as in medical or military applications. Whenever data is changed with malicious intent, the results of data analysis may lead to disastrous decisions. Autonomic self-healing for computer systems is inspired by the autonomic system of the human body. In order to guarantee the accuracy and availability of data, we propose a technique which, on a priority basis, tries to prevent any malicious transaction from executing and, in case a malicious transaction affects the system, heals the system in an isolated mode in such a way that the availability of the system is not compromised. Using this autonomic system, the management cost and time of DBAs can be minimized. In the end, we test our model and present the findings.

Keywords: autonomic computing, self-healing, threat avoidance, security

Procedia PDF Downloads 475
40233 Seismic Performance Evaluation of Existing Building Using Structural Information Modeling

Authors: Byungmin Cho, Dongchul Lee, Taejin Kim, Minhee Lee

Abstract:

The procedure for the seismic retrofit of existing buildings includes the seismic evaluation. In the evaluation step, it is assessed whether the buildings have satisfactory performance against seismic load. Based on the results, the buildings are upgraded. Evaluating the seismic performance of a building usually requires transforming the model from elastic analysis to inelastic analysis. However, when the data is not delivered through interworking between the models, engineers must input the data manually. Because this manual process leads to inaccuracy and loss of information, the results of the analysis become less accurate. Therefore, in this study, a process for the seismic evaluation of existing buildings using structural information modeling is suggested. This structural information modeling makes the work economical and accurate. To this end, the process for seismic evaluation based on ASCE 41 is investigated to determine which parts of it can be computerized. The structural information modeling process is developed for seismic evaluation using the Perform 3D program, which is commonly used for nonlinear response history analysis. To validate this process, the seismic performance of an existing building is investigated.

Keywords: existing building, nonlinear analysis, seismic performance, structural information modeling

Procedia PDF Downloads 351
40232 Implementation and Performance Analysis of Data Encryption Standard and RSA Algorithm with Image Steganography and Audio Steganography

Authors: S. C. Sharma, Ankit Gambhir, Rajeev Arya

Abstract:

In today’s era, data security is an important concern and one of the most demanding issues, because it is essential for people using online banking, e-shopping, reservations, etc. The two major techniques used for secure communication are cryptography and steganography. Cryptographic algorithms scramble the data so that an intruder will not be able to retrieve it, whereas steganography hides the data in a cover file so that the presence of communication is concealed. This paper presents the implementation of the Ron Rivest, Adi Shamir, and Leonard Adleman (RSA) algorithm with image and audio steganography and of the Data Encryption Standard (DES) algorithm with image and audio steganography. Both algorithms were coded in MATLAB, and it was observed that the combined techniques performed better than the individual techniques. The risk of unauthorized access is alleviated to a certain extent by using these techniques. They could be used in banks, RAW agencies, etc., where highly confidential data is transferred. Finally, a comparison of the two techniques is given in tabular form.

Keywords: audio steganography, data security, DES, image steganography, intruder, RSA, steganography

Procedia PDF Downloads 261
40231 Customers’ Acceptability of Islamic Banking: Employees’ Perspective in Peshawar

Authors: Tahira Imtiaz, Karim Ullah

Abstract:

This paper aims to incorporate bank employees’ perspective on the acceptability of Islamic banking to customers in Peshawar. A qualitative approach is adopted, for which six in-depth interviews with employees of Islamic banks were conducted. The employees were asked to share their experience of customers’ attitudes towards the acceptability of Islamic banking. The collected data was analyzed through thematic analysis and synthesized with the current literature. Through data analysis, a theoretical framework is developed that highlights the factors driving customers towards Islamic banking, as witnessed by the employees. The practical implication of the analyzed data is that a new model could be developed on the basis of four determinants of human preference, namely inner satisfaction, time, faith and market forces.

Keywords: customers’ attraction, employees’ perspective, Islamic banking, Riba

Procedia PDF Downloads 302
40230 Probability Sampling in Matched Case-Control Study in Drug Abuse

Authors: Surya R. Niraula, Devendra B Chhetry, Girish K. Singh, S. Nagesh, Frederick A. Connell

Abstract:

Background: Although random sampling is generally considered to be the gold standard for population-based research, the majority of drug abuse research is based on non-random sampling despite the well-known limitations of this kind of sampling. Method: We compared the statistical properties of two surveys of drug abuse in the same community: one using snowball sampling of drug users who then identified “friend controls” and the other using a random sample of non-drug users (controls) who then identified “friend cases.” Models to predict drug abuse based on risk factors were developed for each data set using conditional logistic regression. We compared the precision of each model using the bootstrap method and the predictive properties of each model using receiver operating characteristic (ROC) curves. Results: Analysis of 100 random bootstrap samples drawn from the snowball-sample data set showed a wide variation in the standard errors of the beta coefficients of the predictive model, none of which achieved statistical significance. On the other hand, bootstrap analysis of the random-sample data set showed less variation, and did not change the significance of the predictors at the 5% level when compared to the non-bootstrap analysis. Comparison of the area under the ROC curves using the model derived from the random-sample data set was similar when fitted to either data set (0.93 for random-sample data vs. 0.91 for snowball-sample data, p=0.35); however, when the model derived from the snowball-sample data set was fitted to each of the data sets, the areas under the curve were significantly different (0.98 vs. 0.83, p < .001). Conclusion: The proposed method of random sampling of controls appears to be superior from a statistical perspective to snowball sampling and may represent a viable alternative to snowball sampling.
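The bootstrap idea used in the abstract, resampling the data with replacement and examining the spread of the re-estimated quantity, can be sketched generically. The example below is an illustration only: it bootstraps a simple statistic (the mean) rather than the beta coefficients of a conditional logistic regression, and the function name, the default of 100 resamples, the seed, and the toy data are assumptions made for this sketch.

```python
import random
from statistics import mean, stdev


def bootstrap_se(data, statistic, n_boot=100, seed=0):
    """Estimate the standard error of `statistic` from n_boot bootstrap resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        # Draw n observations with replacement from the original sample.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    # The spread of the bootstrap estimates approximates the standard error.
    return stdev(estimates)


# Hypothetical data: a small sample of measurements.
toy_sample = [1.0, 2.0, 3.0, 4.0]
toy_se = bootstrap_se(toy_sample, mean)
```

A wide spread of bootstrap estimates, as reported for the snowball sample, indicates that the fitted quantity is sensitive to which observations happen to be included.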

Keywords: drug abuse, matched case-control study, non-probability sampling, probability sampling

Procedia PDF Downloads 467
40229 TAXAPRO, A Streamlined Pipeline to Analyze Shotgun Metagenomes

Authors: Sofia Sehli, Zainab El Ouafi, Casey Eddington, Soumaya Jbara, Kasambula Arthur Shem, Islam El Jaddaoui, Ayorinde Afolayan, Olaitan I. Awe, Allissa Dillman, Hassan Ghazal

Abstract:

The ability to promptly sequence whole genomes at a relatively low cost has revolutionized the way we study the microbiome. Microbiologists are no longer limited to studying what can be grown in a laboratory and instead are given the opportunity to rapidly identify the makeup of microbial communities in a wide variety of environments. Analyzing whole genome sequencing (WGS) data is a complex process that involves multiple moving parts and might be rather unintuitive for scientists that don’t typically work with this type of data. Thus, to help lower the barrier for less-computationally inclined individuals, TAXAPRO was developed at the first Omics Codeathon held virtually by the African Society for Bioinformatics and Computational Biology (ASBCB) in June 2021. TAXAPRO is an advanced metagenomics pipeline that accurately assembles organelle genomes from whole-genome sequencing data. TAXAPRO seamlessly combines WGS analysis tools to create a pipeline that automatically processes raw WGS data and presents organism abundance information in both a tabular and graphical format. TAXAPRO was evaluated using COVID-19 patient gut microbiome data. Analysis performed by TAXAPRO demonstrated a high abundance of Clostridia and Bacteroidia genera and a low abundance of Proteobacteria genera relative to others in the gut microbiome of patients hospitalized with COVID-19, consistent with the original findings derived using a different analysis methodology. This provides crucial evidence that the TAXAPRO workflow dispenses reliable organism abundance information overnight without the hassle of performing the analysis manually.

Keywords: metagenomics, shotgun metagenomic sequence analysis, COVID-19, pipeline, bioinformatics

Procedia PDF Downloads 174
40228 An Architectural Model for APT Detection

Authors: Nam-Uk Kim, Sung-Hwan Kim, Tai-Myoung Chung

Abstract:

Typical security management systems are not suitable for detecting APT attacks, because they cannot draw the big picture from the trivial events reported by security solutions. Although SIEM solutions have security analysis engines for that purpose, their security analysis mechanisms need to be verified in the academic field. Although this paper proposes merely an architectural model for APT detection, we will continue to study correlation analysis mechanisms in the future.

Keywords: advanced persistent threat, anomaly detection, data mining

Procedia PDF Downloads 495
40227 Comparative Sustainability Performance Analysis of Australian Companies Using Composite Measures

Authors: Ramona Zharfpeykan, Paul Rouse

Abstract:

Organizational sustainability is important to both organizations themselves and their stakeholders. Despite its increasing popularity and increasing numbers of organizations reporting sustainability, research on evaluating and comparing the sustainability performance of companies is limited. The aim of this study was to develop models to measure sustainability performance for both cross-sectional and longitudinal comparisons across companies in the same or different industries. A secondary aim was to see if sustainability reports can be used to evaluate sustainability performance. The study used both a content analysis of Australian sustainability reports in mining and metals and financial services for 2011-2014 and a survey of Australian and New Zealand organizations. Two methods ranging from a composite index using uniform weights to data envelopment analysis (DEA) were employed to analyze the data and develop the models. The results show strong statistically significant relationships between the developed models, which suggests that each model provides a consistent, systematic and reasonably robust analysis. The results of the models show that for both industries, companies that had sustainability scores above or below the industry average stayed almost the same during the study period. These indices and models can be used by companies to evaluate their sustainability performance and compare it with previous years, or with other companies in the same or different industries. These methods can also be used by various stakeholders and sustainability ranking companies such as the Global Reporting Initiative (GRI).

Keywords: data envelopment analysis, sustainability, sustainability performance measurement system, sustainability performance index, global reporting initiative

Procedia PDF Downloads 143
40226 Geographic Information Systems and Remotely Sensed Data for the Hydrological Modelling of Mazowe Dam

Authors: Ellen Nhedzi Gozo

Abstract:

Unavailability of adequate hydro-meteorological data has always limited the analysis and understanding of the hydrological behaviour of several dam catchments, including Mazowe Dam in Zimbabwe. The problem of insufficient data for Mazowe Dam catchment analysis was solved by extracting catchment characteristics and aerial hydro-meteorological data from ASTER, LANDSAT, and Shuttle Radar Topography Mission (SRTM) remote sensing (RS) images using ILWIS, ArcGIS and ERDAS Imagine geographic information systems (GIS) software. Available observed hydrological as well as meteorological data complemented the use of the remotely sensed information. Ground truth land cover was mapped using a Garmin Etrex global positioning system (GPS) unit. This information was then used to validate land cover classification detail that was obtained from remote sensing images. A bathymetry survey was conducted using a SONAR system connected to GPS. Hydrological modelling using the HBV model was then performed to simulate the hydrological process of the catchment in an effort to verify the reliability of the derived parameters. The model output shows a high Nash-Sutcliffe coefficient, close to 1, indicating that the parameters derived from remote sensing and GIS can be applied with confidence in the analysis of Mazowe Dam catchment.
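For readers unfamiliar with the Nash-Sutcliffe coefficient cited above, a minimal sketch of its standard definition follows: one minus the ratio of the squared simulation errors to the squared deviations of the observations from their mean. The function name and the toy flow series are hypothetical; no values from the Mazowe Dam study are used.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 means a perfect match; 0 means the model
    is no better than predicting the mean of the observations."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated series must have equal length")
    mean_obs = sum(observed) / len(observed)
    # Sum of squared errors between observation and simulation.
    numerator = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Sum of squared deviations of the observations from their mean.
    denominator = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - numerator / denominator


# Hypothetical series: observed flows and two candidate simulations.
toy_observed = [1.0, 2.0, 3.0]
nse_perfect = nash_sutcliffe(toy_observed, [1.0, 2.0, 3.0])
nse_mean_only = nash_sutcliffe(toy_observed, [2.0, 2.0, 2.0])
```

Values close to 1, as reported in the abstract, mean the simulated series tracks the observations closely.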

Keywords: geographic information systems, hydrological modelling, remote sensing, water resources management

Procedia PDF Downloads 296
40225 The Data Quality Model for the IoT based Real-time Water Quality Monitoring Sensors

Authors: Rabbia Idrees, Ananda Maiti, Saurabh Garg, Muhammad Bilal Amin

Abstract:

IoT devices are the basic building blocks of an IoT network; they generate an enormous volume of real-time, high-speed data that helps organizations and companies make intelligent decisions. Integrating this data from multiple sources and transferring it to the appropriate client is fundamental to IoT development. Handling this huge quantity of devices along with the huge volume of data is very challenging. IoT devices are battery-powered and resource-constrained; to provide energy-efficient communication, they go to sleep or wake up periodically and aperiodically, depending on traffic loads, to reduce energy consumption. Sometimes these devices get disconnected due to battery depletion. If a node is not available in the network, the IoT network provides incomplete, missing, and inaccurate data. Moreover, many IoT applications, like vehicle tracking and patient tracking, require the IoT devices to be mobile. Due to this mobility, if the distance of a device from the sink node becomes greater than required, the connection is lost. Following such disconnections, other devices join the network to replace the devices that have broken down or left. This makes IoT devices dynamic in nature, which brings uncertainty and unreliability to the IoT network and hence produces poor-quality data. Because of this dynamic nature, we do not know the actual reason for abnormal data. If data are of poor quality, decisions are likely to be unsound. It is highly important to process data and estimate data quality before using it in IoT applications. In the past, many researchers tried to estimate data quality and provided several machine learning (ML), stochastic, and statistical methods to analyze stored data in the data processing layer, without focusing on the challenges and issues arising from the dynamic nature of IoT devices and how it impacts data quality.
This research presents a comprehensive review of the impact of the dynamic nature of IoT devices on data quality and a data quality model that can deal with this challenge and produce good-quality data. The data quality model is developed for sensors monitoring water quality, using DBSCAN clustering and weather sensors. An extensive study has been done on the relationship between the data of weather sensors and of sensors monitoring the water quality of lakes and beaches. A detailed theoretical analysis is presented, describing the correlation between the independent data streams of the two sets of sensors. With the help of this analysis and DBSCAN, a data quality model is prepared. This model encompasses five dimensions of data quality: outlier detection and removal, completeness, patterns of missing values, accuracy (checked with the help of cluster positions), and consistency. Finally, statistical analysis is performed on the clusters formed by DBSCAN, and consistency is evaluated through the Coefficient of Variation (CoV).
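The consistency check mentioned at the end of the abstract relies on the Coefficient of Variation. A minimal sketch follows, assuming readings have already been grouped by cluster label (for example by a DBSCAN run, which is not shown here). The function name, the use of the sample standard deviation, the cluster labels, and the readings are all hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CoV = sample standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CoV is undefined when the mean is zero")
    return stdev(values) / m


# Hypothetical readings already grouped by cluster label.
toy_clusters = {
    0: [2.0, 4.0, 6.0],
    1: [10.0, 10.0, 10.0],
}
cov_by_cluster = {
    label: coefficient_of_variation(vals) for label, vals in toy_clusters.items()
}
```

A lower CoV within a cluster suggests more consistent readings relative to their mean.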

Keywords: clustering, data quality, DBSCAN, Internet of things (IoT)

Procedia PDF Downloads 107
40224 Positive Affect, Negative Affect, Organizational and Motivational Factor on the Acceptance of Big Data Technologies

Authors: Sook Ching Yee, Angela Siew Hoong Lee

Abstract:

Big data technologies have become a trend to exploit business opportunities and provide valuable business insights through the analysis of big data. However, there are still many organizations that have yet to adopt big data technologies especially small and medium organizations (SME). This study uses the technology acceptance model (TAM) to look into several constructs in the TAM and other additional constructs which are positive affect, negative affect, organizational factor and motivational factor. The conceptual model proposed in the study will be tested on the relationship and influence of positive affect, negative affect, organizational factor and motivational factor towards the intention to use big data technologies to produce an outcome. Empirical research is used in this study by conducting a survey to collect data.

Keywords: big data technologies, motivational factor, negative affect, organizational factor, positive affect, technology acceptance model (TAM)

Procedia PDF Downloads 327
40223 An Analysis System for Integrating High-Throughput Transcript Abundance Data with Metabolic Pathways in Green Algae

Authors: Han-Qin Zheng, Yi-Fan Chiang-Hsieh, Chia-Hung Chien, Wen-Chi Chang

Abstract:

As the most important non-vascular plants, algae are studied for many reasons, including their high species diversity and their use as biofuel sources, adsorbents of heavy metals and, following processing, health supplements. With the increasing availability of next-generation sequencing (NGS) data for algae genomes and transcriptomes, an integrated resource for retrieving gene expression data and metabolic pathways is essential for functional analysis and systems biology in algae. However, gene expression profiles and biological pathways are displayed separately in current resources, making it impossible to search current databases directly to identify the cellular response mechanisms. Therefore, this work develops a novel AlgaePath database to retrieve gene expression profiles efficiently under various conditions in numerous metabolic pathways. AlgaePath, a web-based database, integrates gene information, biological pathways, and next-generation sequencing (NGS) datasets in Chlamydomonas reinhardtii and Neodesmus sp. UTEX 2219-4. Users can identify gene expression profiles and pathway information by using five query pages (i.e. Gene Search, Pathway Search, Differentially Expressed Genes (DEGs) Search, Gene Group Analysis, and Co-Expression Analysis). The gene expression data of 45 and 4 samples can be obtained directly on pathway maps in C. reinhardtii and Neodesmus sp. UTEX 2219-4, respectively. Genes that are differentially expressed between two conditions can be identified in Folds Search. Furthermore, the Gene Group Analysis of AlgaePath includes pathway enrichment analysis, and can easily compare the gene expression profiles of functionally related genes in a map. Finally, Co-Expression Analysis provides co-expressed transcripts of a target gene. The analysis results provide a valuable reference for designing further experiments and elucidating critical mechanisms from high-throughput data.
More than an effective interface to clarify the transcript response mechanisms in different metabolic pathways under various conditions, AlgaePath is also a data mining system to identify critical mechanisms based on high-throughput sequencing.

Keywords: next-generation sequencing (NGS), algae, transcriptome, metabolic pathway, co-expression

Procedia PDF Downloads 377
40222 Chemometric-Based Voltammetric Method for Analysis of Vitamins and Heavy Metals in Honey Samples

Authors: Marwa A. A. Ragab, Amira F. El-Yazbi, Amr El-Hawiet

Abstract:

The analysis of heavy metals in honey samples is crucial. When found in honey, they denote environmental pollution. Some of these heavy metals, such as lead, are considered toxic whether present at low or high concentrations. Other heavy metals, for example copper and zinc, are considered safe and even vital minerals if present at low concentrations. On the contrary, if they are present at high concentrations, they are toxic. Their voltammetric determination in honey represents a challenge due to the presence of other electro-active components such as vitamins, whose peaks may overlap with those of the metals, hindering their accurate and precise determination. The simultaneous analysis of some vitamins, nicotinic acid (B3) and riboflavin (B2), and heavy metals, lead, cadmium, and zinc, in honey samples was addressed. The analysis was done in 0.1 M potassium chloride (KCl) using a hanging mercury drop electrode (HMDE), followed by chemometric manipulation of the voltammetric data using the derivative method. Then the derivative data were convoluted using discrete Fourier functions. The proposed method allowed the simultaneous analysis of vitamins and metals despite their varied responses and sensitivities. Although their peaks overlapped, the proposed chemometric method allowed their accurate and precise analysis. After the chemometric treatment of the data, metals were successfully quantified at low levels in the presence of vitamins (1:2000). The heavy metals' limit of detection (LOD) values after the chemometric treatment of the data were more than 60% lower than those obtained from the direct voltammetric method. The method's applicability was tested by analyzing the selected metals and vitamins in real honey samples obtained from different botanical origins.

Keywords: chemometrics, overlapped voltammetric peaks, derivative and convoluted derivative methods, metals and vitamins

Procedia PDF Downloads 120
40221 Sentiment Classification of Documents

Authors: Swarnadip Ghosh

Abstract:

Sentiment analysis is the process of detecting the contextual polarity of text. In other words, it determines whether a piece of writing is positive, negative, or neutral. Sentiment analysis of documents holds great importance in today's world, when vast amounts of information are stored in databases and on the World Wide Web. An efficient algorithm to elicit such information would be beneficial for social, economic, and medical purposes. In this project, we have developed an algorithm to classify a document as positive or negative. Using our algorithm, we obtained a feature set from the data and classified the documents based on this feature set. It is important to note that, in the classification, we have not used the independence assumption, which is made by many procedures such as Naive Bayes. This makes the algorithm more general in scope. Moreover, because of the sparsity and high dimensionality of such data, we did not use the empirical distribution for estimation but developed a method based on finding the degree of close clustering of the data points. We applied our algorithm to a movie review data set obtained from IMDb and obtained satisfactory results.

Keywords: sentiment, Run's Test, cross validation, higher dimensional pmf estimation

Procedia PDF Downloads 372
40220 A Statistical Approach to Classification of Agricultural Regions

Authors: Hasan Vural

Abstract:

Turkey is a favorable country for producing a great variety of agricultural products because of its different geographic and climatic conditions, which have been used to divide the country into four main and seven sub-regions. This classification into seven regions has traditionally been used for data collection and publication, especially in relation to agricultural production. Afterwards, nine agricultural regions were considered. Recently, the governmental body responsible for data collection and dissemination (Turkish Institute of Statistics, TIS) has used 12 classes, which include 11 sub-regions and Istanbul province. This study aims to evaluate these classification efforts based on the acreage of ten main crops over a ten-year period (1996-2005). The panel data, grouped into 11 sub-regions, were evaluated by cluster and multivariate statistical methods. It was concluded that, from the agricultural production point of view, it would be more meaningful to consider three main and eight sub-agricultural regions throughout the country.

Keywords: agricultural region, factorial analysis, cluster analysis

Procedia PDF Downloads 377
40219 Response Analysis of a Steel Reinforced Concrete High-Rise Building during the 2011 Tohoku Earthquake

Authors: Naohiro Nakamura, Takuya Kinoshita, Hiroshi Fukuyama

Abstract:

The 2011 off the Pacific Coast of Tohoku Earthquake caused considerable damage to wide areas of eastern Japan. A large number of earthquake observation records were obtained at various places. To design more earthquake-resistant buildings and improve earthquake disaster prevention, it is necessary to utilize these data to analyze and evaluate the behavior of a building during an earthquake. This paper presents an earthquake response simulation analysis (hereafter a seismic response analysis) that was conducted using data recorded during the main earthquake (hereafter the main shock) as well as the earthquakes before and after it. The data were obtained at a high-rise steel-reinforced concrete (SRC) building in the bay area of Tokyo. We first give an overview of the building, along with the characteristics of the earthquake motion and the building during the main shock. The data indicate that there was a change in the natural period before and after the earthquake. Next, we present the results of our seismic response analysis. First, the analysis model and conditions are shown, and then, the analysis result is compared with the observational records. Using the analysis result, we then study the effect of soil-structure interaction on the response of the building. By identifying the characteristics of the building during the earthquake (i.e., the 1st natural period and the 1st damping ratio) by the Auto-Regressive eXogenous (ARX) model, we compare the analysis result with the observational records so as to evaluate the accuracy of the response analysis. In this study, a lumped-mass system SR model was used to conduct a seismic response analysis using observational data as input waves. The main results of this study are as follows: 1) The observational records of the 3/11 main shock put it between a level 1 and level 2 earthquake. 
The result of the ground response analysis showed that the maximum shear strain in the ground was about 0.1% and that the possibility of liquefaction occurring was low. 2) During the 3/11 main shock, the observed wave showed that the eigenperiod of the building became longer; this behavior could be generally reproduced in the response analysis. This prolonged eigenperiod was due to the nonlinearity of the superstructure, and the effect of the nonlinearity of the ground seems to have been small. 3) As for the 4/11 aftershock, a continuous analysis in which the subject seismic wave was input after the 3/11 main shock was input was conducted. The analyzed values generally corresponded well with the observed values. This means that the effect of the nonlinearity of the main shock was retained by the building. It is important to consider this when conducting the response evaluation. 4) The first period and the damping ratio during a vibration were evaluated by an ARX model. Our results show that the response analysis model in this study is generally good at estimating a change in the response of the building during a vibration.
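
The abstract does not state how the first natural period and damping ratio are read off the fitted ARX model. The following is a minimal sketch of one common convention, assuming that a complex pole z of the identified discrete-time model maps to a continuous-time pole s through z = exp(s * dt), where dt is the sampling interval. The function name `modal_parameters` and the numbers in the example are hypothetical and are not taken from the paper.

```python
import cmath
import math


def modal_parameters(pole, dt):
    """Convert one discrete-time pole into (natural period, damping ratio).

    Assumes the sampled-system relation z = exp(s * dt); this is an
    illustrative convention, not necessarily the authors' procedure.
    """
    s = cmath.log(pole) / dt          # continuous-time pole
    omega = abs(s)                    # undamped natural circular frequency (rad/s)
    period = 2.0 * math.pi / omega    # natural period (s)
    damping = -s.real / omega         # damping ratio (dimensionless)
    return period, damping


if __name__ == "__main__":
    # Hypothetical example: build the pole of a mode with a 2.0 s period and
    # 3% damping sampled at 100 Hz, then recover both values from the pole.
    dt = 0.01
    omega = 2.0 * math.pi / 2.0
    h = 0.03
    s = complex(-h * omega, omega * math.sqrt(1.0 - h * h))
    z = cmath.exp(s * dt)
    print(modal_parameters(z, dt))
```

Tracking these two values over successive time windows is one way to observe the lengthening of the eigenperiod that the abstract reports for the main shock.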

Keywords: ARX model, response analysis, SRC building, the 2011 off the Pacific Coast of Tohoku Earthquake

Procedia PDF Downloads 140
40218 Performance Evaluation and Comparison between the Empirical Mode Decomposition, Wavelet Analysis, and Singular Spectrum Analysis Applied to the Time Series Analysis in Atmospheric Science

Authors: Olivier Delage, Hassan Bencherif, Alain Bourdier

Abstract:

Signal decomposition approaches represent an important step in time series analysis, providing useful knowledge and insight into the data and the characteristics of the underlying dynamics while also facilitating tasks such as noise removal and feature extraction. Because most observational time series are nonlinear and nonstationary, resulting from the interaction of several physical processes at different time scales, experimental time series fluctuate at all time scales and require the development of specific signal decomposition techniques. The most commonly used techniques are data-driven, making it possible to obtain well-behaved signal components without making any prior assumptions about the input data. Among the most popular time series decomposition techniques, and the most cited in the literature, are the empirical mode decomposition and its variants, the empirical wavelet transform, and singular spectrum analysis. With the increasing popularity and utility of these methods in wide-ranging applications, it is imperative to gain a good understanding of and insight into the operation of these algorithms. In this work, we describe all of the techniques mentioned above as well as their ability to denoise signals, to capture trends, to identify components corresponding to the physical processes involved in the evolution of the observed system, and to deduce the dimensionality of the underlying dynamics. Results obtained with all of these methods on experimental total ozone column and rainfall time series will be discussed and compared.

Keywords: denoising, empirical mode decomposition, singular spectrum analysis, time series, underlying dynamics, wavelet analysis

Procedia PDF Downloads 76
40217 Charting Sentiments with Naive Bayes and Logistic Regression

Authors: Jummalla Aashrith, N. L. Shiva Sai, K. Bhavya Sri

Abstract:

The swift progress of web technology has not only amassed a vast reservoir of internet data but also triggered a substantial surge in data generation. The internet has become one of the most dynamic hubs for online education, idea dissemination, and opinion sharing. Notably, the widely used social networking platform Twitter is expanding considerably, enabling users to share viewpoints, participate in discussions spanning diverse communities, and broadcast messages on a global scale. This upswing in online engagement has sparked significant interest in subjectivity analysis, particularly of Twitter data. This research examines sentiment analysis, focusing specifically on Twitter. It aims to offer insights into deciphering the information within tweets, where opinions are expressed in a highly unstructured and diverse manner, ranging from positive to negative and occasionally neutral. In this paper, we offer a comprehensive exploration and comparative assessment of modern approaches to opinion mining. Employing machine learning algorithms such as Naive Bayes and Logistic Regression, we investigate Twitter data streams and discuss the overarching challenges and applications of subjectivity analysis over Twitter.
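
To make the Naive Bayes side of this comparison concrete, the sketch below shows a small multinomial Naive Bayes classifier with add-one (Laplace) smoothing, written with the Python standard library only. It is an illustration under stated assumptions, not the authors' pipeline: the function names `train_naive_bayes` and `predict`, the whitespace tokenizer, and the four toy training sentences are all hypothetical, and Logistic Regression is not covered here.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs):
    """Count documents per class and words per class from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)            # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue                                  # skip unseen words
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data standing in for labelled tweets.
training_data = [
    ("i love this phone", "positive"),
    ("great service and great people", "positive"),
    ("i hate the delay", "negative"),
    ("terrible service awful app", "negative"),
]
model = train_naive_bayes(training_data)
print(predict(model, "love great"))    # → positive
print(predict(model, "awful delay"))   # → negative
```

Real tweets would need more careful tokenization (hashtags, mentions, emoticons) and far more training data than this toy set.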

Keywords: machine learning, sentiment analysis, visualisation, python

Procedia PDF Downloads 24
40216 Rodriguez Diego, Del Valle Martin, Hargreaves Matias, Riveros Jose Luis

Authors: Nathainail Bashir, Neil Anderson

Abstract:

The objective of this study was to investigate the current state of practice with regard to karst detection methods and to recommend the best method and array pattern for acquiring the desired results. Proper site investigation in karst-prone regions is extremely valuable in determining the location of possible voids. Two geophysical techniques were employed: multichannel analysis of surface waves (MASW) and electrical resistivity tomography (ERT). The MASW data were acquired at each test location using different array lengths and different array orientations (to increase the probability of obtaining interpretable data in karst terrain). The ERT data were acquired using a dipole-dipole array consisting of 168 electrodes. The MASW data were interpreted (to estimate the depth to the physical top of rock) and used to constrain and verify the interpretation of the ERT data. The ERT data indicate that poorer-quality MASW data were acquired in areas where there was significant local variation in the depth to top of rock.

Keywords: dipole-dipole, ERT, Karst terrains, MASW

Procedia PDF Downloads 285
40215 Nonlinear Analysis in Investigating the Complexity of Neurophysiological Data during Reflex Behavior

Authors: Juliana A. Knocikova

Abstract:

Methods of nonlinear signal analysis are based on the finding that random behavior can arise in deterministic nonlinear systems with only a few degrees of freedom. In dynamical systems, entropy is usually understood as the rate of information production. Changes in the temporal dynamics of physiological data indicate how the system evolves in time and thus reflect the level of new signal pattern generation. During recent decades, many algorithms have been introduced to assess patterns of physiological responses to external stimuli. However, reflex responses are usually characterized by short durations. This characteristic represents a great limitation for the usual methods of nonlinear analysis. To solve the problem of short recordings, the approximate entropy parameter was introduced as a measure of system complexity. A low value of this parameter reflects regularity and predictability in the analyzed time series. On the other hand, an increase in this parameter indicates unpredictability and random behavior, and hence higher system complexity. Reduced complexity of neurophysiological data has been observed repeatedly when analyzing electroneurogram and electromyogram activity during defence reflex responses. Quantitative changes in the phrenic neurogram are also evident during severe hypoxia, as well as during airway reflex episodes. In conclusion, the approximate entropy parameter serves as a convenient tool for the analysis of reflex behavior characterized by short time series.
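
The behavior described above (low values for regular signals, higher values for irregular ones) can be seen in a short sketch of the standard approximate entropy definition, ApEn(m, r) = Φ(m) − Φ(m + 1), with self-matches included. The abstract does not give the parameter choices used in the study, so the embedding dimension `m = 2`, the absolute tolerance `r = 0.2`, and the two test signals below are illustrative assumptions only.

```python
import math


def approximate_entropy(series, m=2, r=0.2):
    """Approximate entropy ApEn(m, r) of a sequence; r is an absolute tolerance."""
    n = len(series)

    def phi(dim):
        count = n - dim + 1
        templates = [series[i:i + dim] for i in range(count)]
        total = 0.0
        for a in templates:
            # Number of templates within tolerance r of a (Chebyshev distance).
            # The template always matches itself, so the logarithm is defined.
            matches = sum(
                1 for b in templates
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            total += math.log(matches / count)
        return total / count

    return phi(m) - phi(m + 1)


if __name__ == "__main__":
    # A strictly alternating signal is highly regular.
    regular = [1.0, 2.0] * 25

    # A logistic-map trajectory serves as a hypothetical irregular signal.
    irregular = [0.4]
    for _ in range(49):
        irregular.append(3.9 * irregular[-1] * (1.0 - irregular[-1]))

    print(approximate_entropy(regular))
    print(approximate_entropy(irregular))
```

In practice the tolerance is often scaled to the standard deviation of the recording rather than fixed, which makes values comparable across signals with different amplitudes.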

Keywords: approximate entropy, neurophysiological data, nonlinear dynamics, reflex

Procedia PDF Downloads 278
40214 The Establishment of Probabilistic Risk Assessment Analysis Methodology for Dry Storage Concrete Casks Using SAPHIRE 8

Authors: J. R. Wang, W. Y. Cheng, J. S. Yeh, S. W. Chen, Y. M. Ferng, J. H. Yang, W. S. Hsu, C. Shih

Abstract:

To understand the risk for dry storage concrete casks during the cask loading, transfer, and storage phases, this research establishes a probabilistic risk assessment (PRA) analysis methodology for dry storage concrete casks using the SAPHIRE 8 code. This methodology is used to study the dry storage systems of Taiwan's nuclear power plants (NPPs). The research process has three steps. First, data on the concrete casks and Taiwan's NPPs are collected. Second, the PRA analysis methodology is developed using SAPHIRE 8. Third, the PRA analysis is performed using this methodology. According to the analysis results, the case with the maximum risk is the multipurpose canister (MPC) drop.

Keywords: PRA, dry storage, concrete cask, SAPHIRE

Procedia PDF Downloads 186
40213 Exploring the Correlation between Population Distribution and Urban Heat Island under Urban Data: Taking Shenzhen Urban Heat Island as an Example

Authors: Wang Yang

Abstract:

Shenzhen is a modern city born of China's reform and opening-up policy, and the development of its urban morphology has been established under the administration of the Chinese government. The city's planning paradigm is primarily affected by spatial structure and human behavior. The subjective urban agglomeration center is divided into several groups and centers. In comparison with this effect, the law of city development has largely been neglected. With the continuous development of the internet, extensive data technology has been introduced in China. Data mining and data analysis have become important tools in municipal research. Data mining has been used to improve data cleaning for sources such as business data, traffic data, and population data. Prior to data mining, government data were collected by traditional means and then analyzed using city-relationship research, which delayed the timeliness of urban development research, especially for the contemporary city. Internet-based data are updated very quickly. The city's points of interest (POIs) obtained through data mining serve as a data source informing city design, while satellite remote sensing is used as a reference; with city analysis conducted in both directions, the administrative paradigm of government is broken and urban research is restored. Therefore, the use of data mining in urban analysis is very important. Satellite remote sensing data for Shenzhen from July 2018, measured by the MODIS sensor, can be used to perform land surface temperature inversion and to analyze the distribution of Shenzhen's urban heat island. This article acquired and classified the data from Shenzhen using data crawler technology. Data on the Shenzhen heat island and points of interest were simulated and analyzed on a GIS platform to discover the main features of functional equivalent distribution influence. Shenzhen is located in the east-west area of China. 
The city’s main streets are also laid out according to the direction of city development. Therefore, the functional areas of the city are also distributed in the east-west direction. The urban heat island can be expressed as a heat map according to the functional urban areas, and regional POIs correspond to it. The results clearly show that the distribution of the urban heat island and the distribution of urban POIs are in one-to-one correspondence. The urban heat island is primarily influenced by the properties of the underlying surface, avoiding the impact of urban climate. With urban POIs as the object of analysis, the distribution of POIs and population aggregation are closely connected, so the distribution of the population corresponds with the distribution of the urban heat island.

Keywords: POI, satellite remote sensing, the population distribution, urban heat island thermal map

Procedia PDF Downloads 82
40212 On-line Control of the Natural and Anthropogenic Safety in Krasnoyarsk Region

Authors: T. Penkova, A. Korobko, V. Nicheporchuk, L. Nozhenkova, A. Metus

Abstract:

This paper presents an approach to on-line control of the state of technosphere and environmental objects based on the integration of Data Warehouse, OLAP, and Expert Systems technologies. It examines the structure and content of a data warehouse that provides consolidation and storage of monitoring data. It describes OLAP models that provide multidimensional analysis of monitoring data and dynamic analysis of the principal parameters of controlled objects. The authors suggest criteria for emergency risk assessment using expert knowledge about danger levels. It is demonstrated how some of the proposed solutions could be adopted in territorial decision making support systems. Operational control allows authorities to detect threats, prevent natural and anthropogenic emergencies, and ensure the comprehensive safety of the territory.

Keywords: decision making support systems, emergency risk assessment, natural and anthropogenic safety, on-line control, territory

Procedia PDF Downloads 376