Search results for: cloud data privacy and integrity
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25488

23808 Minimum Data of a Speech Signal as Special Indicators of Identification in Phonoscopy

Authors: Nazaket Gazieva

Abstract:

Voice biometric data associated with physiological, psychological and other factors are widely used in forensic phonoscopy. There are various methods for identifying and verifying a person by voice. This article explores the minimum speech signal data as individual parameters of a speech signal. Monozygotic twins are believed to be genetically identical. Using the minimum data of the speech signal, we found that the voice imprint of monozygotic twins is individual. Based on the experiment, we conclude that the minimum indicators of the speech signal are more stable and reliable for phonoscopic examinations.

Keywords: phonogram, speech signal, temporal characteristics, fundamental frequency, biometric fingerprints

Procedia PDF Downloads 131
23807 A Non-parametric Clustering Approach for Multivariate Geostatistical Data

Authors: Francky Fouedjio

Abstract:

Multivariate geostatistical data have become omnipresent in the geosciences and pose substantial analysis challenges. One of them is the grouping of data locations into spatially contiguous clusters so that data locations within the same cluster are more similar while clusters are different from each other, in some sense. Spatially contiguous clusters can significantly improve the interpretation that turns the resulting clusters into meaningful geographical subregions. In this paper, we develop an agglomerative hierarchical clustering approach that takes into account the spatial dependency between observations. It relies on a dissimilarity matrix built from a non-parametric kernel estimator of the spatial dependence structure of data. It integrates existing methods to find the optimal cluster number and to evaluate the contribution of variables to the clustering. The capability of the proposed approach to provide spatially compact, connected and meaningful clusters is assessed using a bivariate synthetic dataset and a multivariate geochemical dataset. The proposed clustering method gives satisfactory results compared to other similar geostatistical clustering methods.

Keywords: clustering, geostatistics, multivariate data, non-parametric

Procedia PDF Downloads 471
23806 Big Data in Telecom Industry: Effective Predictive Techniques on Call Detail Records

Authors: Sara ElElimy, Samir Moustafa

Abstract:

Mobile network operators face many challenges in the digital era, especially high demands from customers. Mobile network operators are a source of big data, and traditional techniques are not effective in the new era of big data, the Internet of Things (IoT) and 5G. As a result, handling different big datasets effectively becomes a vital task for operators, given the continuous growth of data and the move from long term evolution (LTE) to 5G. There is therefore an urgent need for effective big data analytics to predict future demands, traffic, and network performance in order to fulfill the requirements of the fifth generation of mobile network technology. In this paper, we introduce data science techniques using machine learning and deep learning algorithms: the autoregressive integrated moving average (ARIMA), Bayesian-based curve fitting, and a recurrent neural network (RNN) are employed for a data-driven application to mobile network operators. The framework for each model comprises identification of the model parameters, estimation, prediction, and a final data-driven application of the prediction to business and network performance. These models are applied to the Telecom Italia Big Data Challenge call detail records (CDRs) datasets. Evaluation of the models using well-known criteria shows that ARIMA (the machine learning-based model) is more accurate as a predictive model on this dataset than the RNN (the deep learning model).

Keywords: big data analytics, machine learning, CDRs, 5G

Procedia PDF Downloads 128
23805 A Data Mining Approach for Analysing and Predicting the Bank's Asset Liability Management Based on Basel III Norms

Authors: Nidhin Dani Abraham, T. K. Sri Shilpa

Abstract:

Asset liability management is an important aspect of the banking business. Moreover, today’s banking is based on BASEL III, which strictly regulates counterparty default. This paper focuses on the prediction and analysis of counterparty default risk, a type of risk that occurs when customers fail to repay the amount owed to the lender (a bank or any financial institution). This paper proposes an approach to reduce the counterparty risk occurring in financial institutions using an appropriate data mining technique, and thus predicts the occurrence of NPA. It also helps in asset building and restructuring quality. Liability management is very important for carrying out banking business. To know and analyze the depth of a bank’s liability, a suitable technique is required. For that purpose, a data mining technique is used to predict the dormant behaviour of various deposit bank customers. Various models are implemented, and the results are analyzed for saving bank deposit customers. All the data are cleaned using a data cleansing approach from the bank data warehouse.

Keywords: data mining, asset liability management, BASEL III, banking

Procedia PDF Downloads 541
23804 A Dynamic Ensemble Learning Approach for Online Anomaly Detection in Alibaba Datacenters

Authors: Wanyi Zhu, Xia Ming, Huafeng Wang, Junda Chen, Lu Liu, Jiangwei Jiang, Guohua Liu

Abstract:

Anomaly detection is a first and imperative step needed to respond to unexpected problems and to assure high performance and security in large data center management. This paper presents an online anomaly detection system through an innovative approach of ensemble machine learning and adaptive differentiation algorithms, and applies them to performance data collected from a continuous monitoring system for multi-tier web applications running in Alibaba data centers. We evaluate the effectiveness and efficiency of this algorithm with production traffic data and compare with the traditional anomaly detection approaches such as a static threshold and other deviation-based detection techniques. The experiment results show that our algorithm correctly identifies the unexpected performance variances of any running application, with an acceptable false positive rate. This proposed approach has already been deployed in real-time production environments to enhance the efficiency and stability in daily data center operations.
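
The two baseline families the abstract compares against, a static threshold and a deviation-based detector, can be illustrated with a minimal sketch. This is not the authors' ensemble algorithm; the function names, the window size, the z-score limit, and the sample latency values are all hypothetical choices made for illustration.

```python
from collections import deque
from statistics import mean, pstdev


def static_threshold_flags(values, limit):
    """Flag every point that exceeds a fixed limit."""
    return [v > limit for v in values]


def rolling_zscore_flags(values, window=5, z_limit=3.0):
    """Flag points that deviate strongly from the trailing window."""
    history = deque(maxlen=window)
    flags = []
    for v in values:
        if len(history) == window:
            mu = mean(history)
            sigma = pstdev(history)
            # A perfectly flat window has sigma == 0; treat it as "no anomaly".
            flags.append(sigma > 0 and abs(v - mu) / sigma > z_limit)
        else:
            flags.append(False)  # not enough history yet
        history.append(v)
    return flags


# Hypothetical latency samples (ms) with one spike at index 6.
latency_ms = [100, 102, 99, 101, 100, 103, 180, 101, 100, 102]
static_flags = static_threshold_flags(latency_ms, limit=200)
rolling_flags = rolling_zscore_flags(latency_ms, window=5, z_limit=3.0)
```

With these made-up numbers the spike stays below the fixed limit of 200, so the static threshold misses it, while the deviation-based check flags it relative to recent history. That gap is the kind of weakness an adaptive approach is meant to address.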

Keywords: Alibaba data centers, anomaly detection, big data computation, dynamic ensemble learning

Procedia PDF Downloads 189
23803 Numerical Investigation of Hot Oil Velocity Effect on Force Heat Convection and Impact of Wind Velocity on Convection Heat Transfer in Receiver Tube of Parabolic Trough Collector System

Authors: O. Afshar

Abstract:

A solar receiver is designed for operation under extremely uneven heat flux distribution, cyclic weather, and cloud transient cycle conditions, which can induce large thermal stress and even receiver failure. In this study, the effect of different oil velocities on the convection coefficient and the impact of wind velocity on the local Nusselt number are analyzed using the Finite Volume Method. This study is organized to give an overview of numerical modeling using MATLAB software as an accurate, time-efficient and economical way of analyzing heat transfer trends over a stationary receiver tube for different Reynolds numbers. The results reveal that when the oil velocity is below 0.33 m/s, the value of the convection coefficient is negligible at low temperature. The numerical graphs indicate that when the oil velocity increases up to 1.2 m/s, the heat convection coefficient increases significantly. In fact, a reduction in oil velocity causes a reduction in heat conduction through the glass envelope. In addition, the different local Nusselt number is reduced when the wind blows toward the concave side of the collector, and it has a significant effect on reducing heat losses through the glass envelope.
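
For reference, the dimensionless groups named in the abstract have standard textbook definitions. The symbols below are the conventional ones and are an assumption here, since the abstract does not state its notation: $\rho$ is the fluid density, $u$ the mean flow velocity, $D$ the tube diameter, $\mu$ the dynamic viscosity, $h$ the convection coefficient, and $k$ the thermal conductivity of the fluid.

```latex
\[
\mathrm{Re} = \frac{\rho\, u\, D}{\mu}, \qquad
\mathrm{Nu} = \frac{h\, D}{k}
\quad\Longrightarrow\quad
h = \frac{\mathrm{Nu}\, k}{D}
\]
```

Because $\mathrm{Re}$ grows linearly with $u$, a higher oil velocity raises the Reynolds number, which in forced convection generally raises $\mathrm{Nu}$ and therefore $h$.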

Keywords: receiver tube, heat convection, heat conduction, Nusselt number

Procedia PDF Downloads 344
23802 Unsupervised Text Mining Approach to Early Warning System

Authors: Ichihan Tai, Bill Olson, Paul Blessner

Abstract:

Traditional early warning systems that alarm against crisis are generally based on structured or numerical data; therefore, a system that can make predictions based on unstructured textual data, an uncorrelated data source, is a great complement to the traditional early warning systems. The Chicago Board Options Exchange (CBOE) Volatility Index (VIX), commonly referred to as the fear index, measures the cost of insurance against market crash, and spikes in the event of crisis. In this study, news data is consumed for prediction of whether there will be a market-wide crisis by predicting the movement of the fear index, and the historical references to similar events are presented in an unsupervised manner. Topic modeling-based prediction and representation are made based on daily news data between 1990 and 2015 from The Wall Street Journal against VIX index data from CBOE.

Keywords: early warning system, knowledge management, market prediction, topic modeling

Procedia PDF Downloads 326
23801 The Role of Synthetic Data in Aerial Object Detection

Authors: Ava Dodd, Jonathan Adams

Abstract:

The purpose of this study is to explore the characteristics of developing a machine learning application using synthetic data. The study is structured to develop the application for the purpose of deploying the computer vision model. The findings discuss the realities of attempting to develop a computer vision model for practical purpose, and detail the processes, tools, and techniques that were used to meet accuracy requirements. The research reveals that synthetic data represents another variable that can be adjusted to improve the performance of a computer vision model. Further, a suite of tools and tuning recommendations are provided.

Keywords: computer vision, machine learning, synthetic data, YOLOv4

Procedia PDF Downloads 210
23800 Perception-Oriented Model Driven Development for Designing Data Acquisition Process in Wireless Sensor Networks

Authors: K. Indra Gandhi

Abstract:

Wireless Sensor Networks (WSNs) have always been characterized by application-specific sensing, relaying and collection of information for further analysis. However, software development was not considered as a separate entity in this process of data collection, which has posed severe limitations on software development for WSNs. Software development for WSNs is a complex process since the components involved are data-driven, network-driven and application-driven in nature. This implies that there is a tremendous need for the separation of concern from the software development perspective. A layered approach for developing data acquisition design based on Model Driven Development (MDD) has been proposed, as the sensed data collection process itself varies depending upon the application taken into consideration. This work focuses on the layered view of the data acquisition process so as to ease software development. A metamodel has been proposed that enables reusability and realization of the software development as an adaptable component for WSN systems. Further, observation of users’ perception indicates that the proposed model helps improve the programmer's productivity by realizing the collaborative system involved.

Keywords: data acquisition, model-driven development, separation of concern, wireless sensor networks

Procedia PDF Downloads 423
23799 Comparative Analysis of Data Gathering Protocols with Multiple Mobile Elements for Wireless Sensor Network

Authors: Bhat Geetalaxmi Jairam, D. V. Ashoka

Abstract:

Wireless Sensor Networks are used in many applications to collect sensed data from different sources. Sensed data has to be delivered through the sensors’ wireless interface using multi-hop communication towards the sink. Data collection in wireless sensor networks consumes energy, and energy consumption is the major constraint in WSNs. Reducing the energy consumption while increasing the amount of generated data is a great challenge. In this paper, we have implemented two data gathering protocols with multiple mobile sinks/elements to collect data from sensor nodes. The first is Energy-Efficient Data Gathering with Tour Length-Constrained Mobile Elements in Wireless Sensor Networks (EEDG), in which mobile sinks use a vehicle routing protocol to collect data. The second is An Intelligent Agent-based Routing Structure for Mobile Sinks in WSNs (IAR), in which mobile sinks use Prim’s algorithm to collect data. We have implemented concepts that are common to both protocols, such as deployment of mobile sinks, generating a visiting schedule, and collecting data from the cluster members. We have compared the performance of both protocols using statistics based on performance parameters such as Delay, Packet Drop, Packet Delivery Ratio, Energy Available, and Control Overhead. We conclude by showing that EEDG is more efficient than the IAR protocol, but with a few limitations, which include unaddressed issues such as redundancy removal, idle listening, and the mobile sink’s pause/wait state at the node. In future work, we plan to concentrate on these limitations to develop a new energy-efficient protocol that will help improve the lifetime of the WSN.

Keywords: aggregation, consumption, data gathering, efficiency

Procedia PDF Downloads 481
23798 Comparison of Reserve Strength Ratio and Capacity Curve Parameters of Offshore Platforms with Distinct Bracing Arrangements

Authors: Aran Dezhban, Hooshang Dolatshahi Pirooz

Abstract:

The phenomenon of corrosion, especially in the Persian Gulf region, is the main cause of the deterioration of offshore platforms, due to the high corrosion of its water. This phenomenon occurs mostly in the area of water spraying, threatening the members of the first floor of the jacket, legs, and piles in this area. In the current study, the effect of bracing arrangement on the Capacity Curve and Reserve Strength Ratio of Fixed-Type Offshore Platforms is investigated. In order to continue the operation of the platform, two modes of robust and damaged structures are considered, while checking the adequacy of the platform capacity based on the allowable values of API RP-2SIM regulations. The platform in question is located in the Persian Gulf, which is modeled on the OpenSEES software. In this research, the Nonlinear Pushover Analysis has been used. After validation, the Capacity Curve of the studied platforms is obtained and then their Reserve Strength Ratio is calculated. Results are compared with the criteria in the API-2SIM regulations.

Keywords: fixed-type jacket structure, structural integrity management, nonlinear pushover analysis, robust and damaged structure, reserve strength ratio, capacity curve

Procedia PDF Downloads 106
23797 Status and Results from EXO-200

Authors: Ryan Maclellan

Abstract:

EXO-200 has provided one of the most sensitive searches for neutrinoless double-beta decay utilizing 175 kg of enriched liquid xenon in an ultra-low background time projection chamber. This detector has demonstrated excellent energy resolution and background rejection capabilities. Using the first two years of data, EXO-200 has set a limit of 1.1x10^25 years at 90% C.L. on the neutrinoless double-beta decay half-life of Xe-136. The experiment has experienced a brief hiatus in data taking during a temporary shutdown of its host facility: the Waste Isolation Pilot Plant. EXO-200 expects to resume data taking in earnest this fall with upgraded detector electronics. Results from the analysis of EXO-200 data and an update on the current status of EXO-200 will be presented.

Keywords: double-beta, Majorana, neutrino, neutrinoless

Procedia PDF Downloads 403
23796 Test Method Development for Evaluation of Process and Design Effect on Reinforced Tube

Authors: Cathal Merz, Gareth O’Donnell

Abstract:

Coil reinforced thin-walled (CRTW) tubes are used in medicine to treat problems affecting blood vessels within the body through minimally invasive procedures. The CRTW tube considered in this research makes up part of such a device and is inserted into the patient via their femoral or brachial arteries and manually navigated to the site in need of treatment. This procedure replaces the requirement to perform open surgery but is limited by reduction of blood vessel lumen diameter and increase in tortuosity of blood vessels deep in the brain. In order to maximize the capability of these procedures, CRTW tube devices are being manufactured with decreasing wall thicknesses in order to deliver treatment deeper into the body and to allow passage of other devices through its inner diameter. This introduces significant stresses to the device materials which have resulted in an observed increase in the breaking of the proximal segment of the device into two separate pieces after it has failed by buckling. As there is currently no international standard for measuring the mechanical properties of these CRTW tube devices, it is difficult to accurately analyze this problem. The aim of the current work is to address this discrepancy in the biomedical device industry by developing a measurement system that can be used to quantify the effect of process and design changes on CRTW tube performance, aiding in the development of better performing, next generation devices. Using materials testing frames, micro-computed tomography (micro-CT) imaging, experiment planning, analysis of variance (ANOVA), T-tests and regression analysis, test methods have been developed for assessing the impact of process and design changes on the device. 
The major findings of this study have been an insight into the suitability of buckle and three-point bend tests for the measurement of the effect of varying processing factors on the device’s performance, and guidelines for interpreting the output data from the test methods. The findings of this study are of significant interest with respect to verifying and validating key process and design changes associated with the device structure and material condition. Test method integrity evaluation is explored throughout.

Keywords: neurovascular catheter, coil reinforced tube, buckling, three-point bend, tensile

Procedia PDF Downloads 105
23795 Michel Foucault’s Docile Bodies and The Matrix Trilogy: A Close Reading Applied to the Human Pods and Growing Fields in the Films

Authors: Julian Iliev

Abstract:

The recent release of The Matrix Resurrections persuaded many film scholars that The Matrix trilogy had lost its appeal and its concepts were largely outdated. This study examines the human pods and growing fields in the trilogy. Their functionality is compared to Michel Foucault’s concept of docile bodies: linking fictional and contemporary worlds. This paradigm is scrutinized through surveillance literature. The analogy brings to light common elements of hidden surveillance practices in technologies. The comparison illustrates the effects of body manipulation portrayed in the movies and their relevance with contemporary surveillance practices. Many scholars have utilized a close reading methodology in film studies (J.Bizzocchi, J.Tanenbaum, P.Larsen, S. Herbrechter, and Deacon et al.). The use of a particular lens through which media text is examined is an indispensable factor that needs to be incorporated into the methodology. The study spotlights both scenes from the trilogy depicting the human pods and growing fields. The functionality of the pods and the fields compare directly with Foucault’s concept of docile bodies. By utilizing Foucault’s study as a lens, the research will unearth hidden components and insights into the films. Foucault recognizes three disciplines that produce docile bodies: 1) manipulation and the interchangeability of individual bodies, 2) elimination of unnecessary movements and management of time, and 3) command system guaranteeing constant supervision and continuity protection. These disciplines can be found in the pods and growing fields. Each body occupies a single pod aiding easier manipulation and fast interchangeability. The movement of the bodies in the pods is reduced to the absolute minimum. Thus, the body is transformed into the ultimate object of control – minimum movement correlates to maximum energy generation. Supervision is exercised by wiring the body with numerous types of cables. 
This ultimate supervision of body activity reduces the body’s purpose to mere functioning. If a body does not function as an energy source, then it’s unplugged, ejected, and liquefied. The command system secures the constant supervision and continuity of the process. To Foucault, the disciplines are distinctly different from slavery because they stop short of a total takeover of the bodies. This is a clear difference from the slave system implemented in the films. Even though their system might lack sophistication, it makes up for it in the elevation of functionality. Further, surveillance literature illustrates the connection between the generation of body energy in The Matrix trilogy to the generation of individual data in contemporary society. This study found that the three disciplines producing docile bodies were present in the portrayal of the pods and fields in The Matrix trilogy. The above comparison combined with surveillance literature yields insights into analogous processes and contemporary surveillance practices. Thus, the constant generation of energy in The Matrix trilogy can be equated to the consistent data generation in contemporary society. This essay shows the relevance of the body manipulation concept in the Matrix films with contemporary surveillance practices.

Keywords: docile bodies, film trilogies, matrix movies, michel foucault, privacy loss, surveillance

Procedia PDF Downloads 83
23794 Remaining Useful Life (RUL) Assessment Using Progressive Bearing Degradation Data and ANN Model

Authors: Amit R. Bhende, G. K. Awari

Abstract:

Remaining useful life (RUL) prediction is one of the key technologies for realizing prognostics and health management, which is being widely applied in many industrial systems to ensure high system availability over their life cycles. The present work proposes a data-driven method of RUL prediction based on multiple health state assessment for rolling element bearings. Bearing degradation data from run to failure at three different conditions is used. A RUL prediction model is built separately for each condition. Feed-forward back-propagation neural network models are developed for prediction modeling.

Keywords: bearing degradation data, remaining useful life (RUL), back propagation, prognosis

Procedia PDF Downloads 427
23793 Spatio-Temporal Data Mining with Association Rules for Lake Van

Authors: Tolga Aydin, M. Fatih Alaeddinoğlu

Abstract:

Throughout history, people have made estimates and inferences about the future by using their past experiences. Developing information technologies and improvements in database management systems make it possible to extract useful information from the knowledge in hand for strategic decisions. Therefore, different methods have been developed. Data mining by association rule learning is one such method. The Apriori algorithm, one of the well-known association rule learning algorithms, is not commonly used on spatio-temporal data sets. However, it is possible to embed time and space features into the data sets and make the Apriori algorithm a suitable data mining technique for learning spatio-temporal association rules. Lake Van, the largest lake in Turkey, is a closed basin. This feature causes the volume of the lake to increase or decrease as a result of changes in the amount of water it holds. In this study, evaporation, humidity, lake altitude, amount of rainfall and temperature parameters recorded in the Lake Van region over the years are used by the Apriori algorithm, and a spatio-temporal data mining application is developed to identify overflows and newly formed soil regions (underflows) occurring in the coastal parts of Lake Van. Identifying possible reasons for overflows and underflows may be used to alert experts to take precautions and make the necessary investments.
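
The idea of embedding time and space features as ordinary items, so that a plain Apriori pass can learn spatio-temporal rules, can be sketched as follows. This is a minimal illustration under stated assumptions and not the authors' application: the item names (`season=...`, `shore=...`, `rain=...`, `level=...`), the five records, and the support threshold are all invented for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level-1 candidates: every single item seen in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical records: time (season) and space (shore) are embedded as items.
records = [
    {"season=spring", "shore=east", "rain=high", "level=rise"},
    {"season=spring", "shore=east", "rain=high", "level=rise"},
    {"season=summer", "shore=east", "rain=low", "level=fall"},
    {"season=spring", "shore=west", "rain=high", "level=rise"},
    {"season=summer", "shore=west", "rain=low", "level=fall"},
]
frequent = apriori(records, min_support=0.4)
rule_conf = confidence(frequent, {"season=spring", "rain=high"}, {"level=rise"})
```

In this toy data the rule "spring and high rainfall implies a rising lake level" holds in every record where the antecedent appears. Real measurements such as evaporation or temperature would first have to be discretized into categorical items of this kind.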

Keywords: apriori algorithm, association rules, data mining, spatio-temporal data

Procedia PDF Downloads 362
23792 Building Data Infrastructure for Public Use and Informed Decision Making in Developing Countries-Nigeria

Authors: Busayo Fashoto, Abdulhakeem Shaibu, Justice Agbadu, Samuel Aiyeoribe

Abstract:

Data has gone from just rows and columns to being an infrastructure itself. Traditionally, data infrastructure has been managed by individuals in different industries and saved on personal work tools; one such tool is the laptop. This hinders data sharing and Sustainable Development Goal (SDG) 9 for infrastructure sustainability across all countries and regions. However, there has been a constant demand for data across different agencies and ministries by investors and decision-makers. The rapid development and adoption of open-source technologies that promote the collection and processing of data in new ways and in ever-increasing volumes are creating new data infrastructure in sectors such as lands and health, among others. This paper examines the process of developing data infrastructure and, by extension, a data portal to provide baseline data for sustainable development and decision making in Nigeria. This paper employs the FAIR principle (Findable, Accessible, Interoperable, and Reusable) of data management using open-source technology tools to develop data portals for public use. eHealth Africa, an organization that uses technology to drive public health interventions in Nigeria, developed a data portal which is a typical data infrastructure that serves as a repository for various datasets on administrative boundaries, points of interest, settlements, social infrastructure, amenities, and others. This portal makes it possible for users to have access to datasets of interest at any point in time at no cost. A skeletal infrastructure of this data portal encompasses the use of open-source technology such as Postgres database, GeoServer, GeoNetwork, and CKAN. These tools made the infrastructure sustainable, thus promoting the achievement of SDG 9 (Industries, Innovation, and Infrastructure). As of 6th August 2021, a wider cross-section of 8192 users had been created, 2262 datasets had been downloaded, and 817 maps had been created from the platform. 
This paper shows the use of rapid development and adoption of technologies that facilitates data collection, processing, and publishing in new ways and in ever-increasing volumes. In addition, the paper is explicit on new data infrastructure in sectors such as health, social amenities, and agriculture. Furthermore, this paper reveals the importance of cross-sectional data infrastructures for planning and decision making, which in turn can form a central data repository for sustainable development across developing countries.

Keywords: data portal, data infrastructure, open source, sustainability

Procedia PDF Downloads 81
23791 An Empirical Study of Determinants Influencing Telemedicine Services Acceptance by Healthcare Professionals: Case of Selected Hospitals in Ghana

Authors: Jonathan Kissi, Baozhen Dai, Wisdom W. K. Pomegbe, Abdul-Basit Kassim

Abstract:

Protecting patients’ digital information is a growing concern for healthcare institutions, as people nowadays perpetually live their lives through telemedicine services. These telemedicine services have been confronted with several determinants that hinder their successful implementation, especially in developing countries. Identifying the determinants that influence the acceptance of telemedicine services is also a problem for healthcare professionals. Despite the tremendous increase in telemedicine services, their adoption and use have been quite slow in some healthcare settings. It is generally accepted in today’s globalizing world that the success of telemedicine services relies on users’ satisfaction. Satisfying health professionals and patients is one of the crucial objectives of telemedicine success. This study seeks to investigate the determinants that influence health professionals’ intention to utilize telemedicine services in clinical activities in a sub-Saharan African country in West Africa (Ghana). A hybridized model comprising health adoption models, including technology acceptance theory, diffusion of innovation theory, and protection motivation theory, was used to investigate these quandaries. The study was carried out in four government health institutions that apply and regulate telemedicine services in their clinical activities. A structured questionnaire was developed and used for data collection. Purposive and convenience sampling methods were used in the selection of healthcare professionals from different medical fields for the study. The collected data were analyzed using a structural equation modeling (SEM) approach. 
All selected constructs showed a significant relationship with health professionals’ behavioral intention in the direction expected from prior literature, including perceived usefulness, perceived ease of use, management strategies, financial sustainability, communication channels, patient security threat, patient privacy risk, self-efficacy, actual service use, user satisfaction, and telemedicine service system security threats. Surprisingly, user characteristics and response efficacy of health professionals were not significant in the hybridized model. The findings and insights from this research show that health professionals are pragmatic when making choices for technology applications and in their willingness to use telemedicine services. They are, however, anxious about its threats and coping appraisals. The identified significant constructs in the study may help to increase efficiency, quality of services, quality of patient care delivery, and user satisfaction among healthcare professionals. The implementation and effective utilization of telemedicine services in the selected hospitals will serve as a strategy to eradicate hardships in healthcare service delivery. The service will help attain universal health access coverage for the whole populace. This study contributes to empirical knowledge by identifying the vital factors influencing health professionals’ behavioral intentions to adopt telemedicine services. The study will also help healthcare stakeholders to formulate better policies towards telemedicine service usage.

Keywords: telemedicine service, perceived usefulness, perceived ease of use, management strategies, security threats

Procedia PDF Downloads 129
23790 Process Data-Driven Representation of Abnormalities for Efficient Process Control

Authors: Hyun-Woo Cho

Abstract:

Unexpected operational events or abnormalities of industrial processes have a serious impact on the quality of the final product of interest. In terms of statistical process control, fault detection and diagnosis of processes is one of the essential tasks needed to run the process safely. In this work, a nonlinear representation of process measurement data is presented and evaluated using a simulation process. The effect of using different representation methods on the diagnosis performance is tested in terms of computational efficiency and data handling. The results show that the nonlinear representation technique produces more reliable diagnosis results and outperforms linear methods. The use of a data filtering step improved computational speed and diagnosis performance for test data sets. The presented scheme is different from existing ones in that it attempts to extract the fault pattern in the reduced space, not in the original process variable space. Thus, this scheme helps to reduce the sensitivity of empirical models to noise.

Keywords: fault diagnosis, nonlinear technique, process data, reduced spaces

Procedia PDF Downloads 238
23789 Text-to-Speech in Azerbaijani Language via Transfer Learning in a Low Resource Environment

Authors: Dzhavidan Zeinalov, Bugra Sen, Firangiz Aslanova

Abstract:

Most text-to-speech models do not perform well in low-resource languages and require a large amount of high-quality training data to reach acceptable quality. Yet, with the improvements made in ASR systems, it is now much easier to collect data for the design of custom text-to-speech models. In this work, we outline our use of an ASR model to collect data for building a viable text-to-speech system for one of the leading financial institutions of Azerbaijan. NVIDIA’s implementation of the Tacotron 2 model was used along with the HiFiGAN vocoder. The model was first trained on high-quality audio data collected from the Internet, then fine-tuned on the bank’s single-speaker call center data. The results were evaluated by 50 different listeners and received a mean opinion score of 4.17, indicating that our method is viable. With this, we have successfully designed the first text-to-speech model in Azerbaijani and publicly shared 12 hours of audiobook data for everyone to use.

Keywords: Azerbaijani language, HiFiGAN, Tacotron 2, text-to-speech, transfer learning, whisper

Procedia PDF Downloads 31
23788 An Empirical Evaluation of Performance of Machine Learning Techniques on Imbalanced Software Quality Data

Authors: Ruchika Malhotra, Megha Khanna

Abstract:

The development of change prediction models can help software practitioners plan testing and inspection resources in the early phases of software development. However, a major challenge faced during the training of any classification model is the imbalanced nature of software quality data. A dataset with very few instances of the minority outcome category leads to an inefficient learning process, and a classification model developed from imbalanced data generally does not predict these minority categories correctly. In a given dataset, a minority of classes may be change prone whereas a majority of classes may be non-change prone. This study explores various alternatives for handling imbalanced software quality data using different sampling methods and effective MetaCost learners. The study also analyzes and justifies the use of different performance metrics when dealing with imbalanced data. To empirically validate the alternatives, the study uses change data from three application packages of an open-source Android dataset and evaluates the performance of six different machine learning techniques. The results indicate extensive improvement in the performance of the classification models when resampling methods and robust performance measures are used.
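As a rough illustration of one sampling method of the kind the abstract mentions, the sketch below performs plain random oversampling of the minority class. It is a minimal example under stated assumptions, not the study's procedure: the function name `random_oversample` and the toy metric rows are hypothetical, and the abstract does not say which resampling variants were used.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: label 1 = change prone (minority), 0 = not change prone.
rows = [[10, 2], [12, 3], [50, 9], [11, 2], [9, 1], [13, 2]]
labels = [0, 0, 1, 0, 0, 0]
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Oversampling should be applied to the training split only; otherwise duplicated rows leak into the test data and inflate the measured performance.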

Keywords: change proneness, empirical validation, imbalanced learning, machine learning techniques, object-oriented metrics

Procedia PDF Downloads 411
23787 Quality of Age Reporting from Tanzania 2012 Census Results: An Assessment Using Whipple’s Index, Myer’s Blended Index, and Age-Sex Accuracy Index

Authors: A. Sathiya Susuman, Hamisi F. Hamisi

Abstract:

Background: Many socio-economic and demographic data are attributed by age and sex. However, a variety of irregularities and misstatements are noted in age-related data, and less so in sex data because of the biological differences between the sexes. Noting the misstatement/misreporting of age data despite its importance in demographic and epidemiological studies, this study aims to assess the quality of the 2012 Tanzania Population and Housing Census results. Methods: Data for the analysis were downloaded from the Tanzania National Bureau of Statistics. Age heaping and digit preference were measured using summary indices, viz. Whipple’s index, Myers’ blended index, and the age-sex accuracy index. Results: The recorded Whipple’s index for both sexes was 154.43; males had the lower index, about 152.65, while females had the higher index, about 156.07. For Myers’ blended index, the preferred digits were ‘0’ and ‘5’ while the avoided digits were ‘1’ and ‘3’ for both sexes. Finally, the age-sex accuracy index stood at 59.8, where the sex ratio score was 5.82 and the age ratio scores were 20.89 and 21.4 for males and females respectively. Conclusion: The evaluation of the 2012 PHC data using demographic techniques found the data inaccurate as a result of systematic heaping and digit preference/avoidance. Thus, innovative methods in data collection, along with measuring and minimizing errors using statistical techniques, should be used to ensure the accuracy of age data.
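For readers unfamiliar with Whipple’s index, the sketch below shows the standard calculation: the population reporting ages ending in 0 or 5 within the 23 to 62 age range, divided by one fifth of the total population in that range, times 100. The function name and the toy age counts are hypothetical and are not taken from the census data.

```python
def whipples_index(pop_by_age):
    """Whipple's index from a dict mapping single-year age -> population count.

    A value of 100 indicates no preference for ages ending in 0 or 5;
    500 would mean that every reported age ends in 0 or 5.
    """
    ages = range(23, 63)  # ages 23..62 inclusive
    total = sum(pop_by_age.get(a, 0) for a in ages)
    heaped = sum(pop_by_age.get(a, 0) for a in ages if a % 5 == 0)
    return 100.0 * heaped / (total / 5.0)


# Hypothetical populations: one with no heaping, one with excess counts at 25, 30, ..., 60.
uniform = {age: 1000 for age in range(0, 100)}
heaped = dict(uniform)
for age in range(25, 61, 5):
    heaped[age] = 1500

print(whipples_index(uniform))
print(whipples_index(heaped))
```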

Keywords: age heaping, digit preference/avoidance, summary indices, Whipple’s index, Myer’s index, age-sex accuracy index

Procedia PDF Downloads 463
23786 Fracture Toughness Characterizations of Single Edge Notch (SENB) Testing Using DIC System

Authors: Amr Mohamadien, Ali Imanpour, Sylvester Agbo, Nader Yoosef-Ghodsi, Samer Adeeb

Abstract:

The fracture toughness resistance curve (e.g., J-R curve and crack tip opening displacement (CTOD) or δ-R curve) is important in facilitating strain-based design and integrity assessment of oil and gas pipelines. This paper aims to present laboratory experimental data to characterize the fracture behavior of pipeline steel. The influential parameters associated with the fracture of API 5L X52 pipeline steel, including different initial crack sizes, were experimentally investigated using single edge notch bend (SENB) specimens. A total of 9 small-scale specimens with different crack length to specimen depth ratios were prepared and tested in single edge notch bending. ASTM E1820 and BS 7448 provide testing procedures to construct the fracture resistance curve (Load-CTOD, CTOD-R, or J-R) from test results. However, these procedures are limited by standard specimens’ dimensions, displacement gauges, and calibration curves. To overcome these limitations, this paper presents the use of small-scale specimens and a 3D digital image correlation (DIC) system to extract the parameters required for fracture toughness estimation. Fracture resistance curve parameters in terms of crack mouth opening displacement (CMOD), crack tip opening displacement (CTOD), and crack growth length (∆a) were extracted from test results using the DIC system, and an improved regression-fitted resistance function (CTOD vs. crack growth, or J-integral vs. crack growth) that depends on a variety of initial crack sizes was constructed and presented. The obtained results were compared to the available results of classical physical measurement techniques, and acceptable agreement was observed. Moreover, a case study was implemented to estimate the maximum strain value that initiates stable crack growth. This might be of interest for developing more accurate strain-based damage models.
The results of laboratory testing in this study offer a valuable database to develop and validate damage models that are able to predict crack propagation of pipeline steel, accounting for the influential parameters associated with fracture toughness.

Keywords: fracture toughness, crack propagation in pipeline steels, CTOD-R, strain-based damage model

Procedia PDF Downloads 56
23785 Model for Introducing Products to New Customers through Decision Tree Using Algorithm C4.5 (J-48)

Authors: Komol Phaisarn, Anuphan Suttimarn, Vitchanan Keawtong, Kittisak Thongyoun, Chaiyos Jamsawang

Abstract:

This article analyzes insurance data containing information on customer decisions when purchasing a life insurance pay package. The data were analyzed in order to present the Life Insurance Perfect Pay package to new customers in a way that meets their needs as closely as possible. The basic data on the insurance pay package were collected for data mining, thus reducing the scattering of information. The data were then classified to obtain a decision model, or decision tree, using Algorithm C4.5 (J-48). For the classification, WEKA tools were used to build the model, and testing datasets were used to test the accuracy of the decision tree. The validation of this model showed that 68.43% of predictions were accurate while 31.25% were errors. The same set of data was then tested with other models, i.e., Naive Bayes and Zero R. The results showed that the J-48 method could predict more accurately. The researchers therefore applied the decision tree in writing a program that introduces the product to new customers, to support their decision to purchase the insurance package that best meets their needs.
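C4.5 chooses the attribute to split on by gain ratio, that is, information gain divided by the split information of the attribute. The sketch below computes it for a categorical attribute. It is a minimal illustration with hypothetical customer attributes and labels, not the WEKA J-48 implementation used in the study.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of the value frequencies in a list."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the categorical attribute `values`."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the labels after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical customers: did they buy the package?
bought = ["yes", "yes", "no", "no"]
age_group = ["young", "young", "old", "old"]  # separates the classes perfectly
gender = ["m", "f", "m", "f"]                 # carries no information here

print(gain_ratio(age_group, bought))
print(gain_ratio(gender, bought))
```

The attribute with the highest gain ratio becomes the test at the current tree node; the procedure then recurses on each branch.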

Keywords: decision tree, data mining, customers, life insurance pay package

Procedia PDF Downloads 419
23784 Screening Methodology for Seismic Risk Assessment of Aging Structures in Oil and Gas Plants

Authors: Mohammad Nazri Mustafa, Pedram Hatami Abdullah, M. Fakhrur Razi Ahmad Faizul

Abstract:

With the issuance of the Malaysian National Annex 2017 as a part of MS EN 1998-1:2015, the seismic mapping of Peninsular Malaysia, Sabah and Sarawak has undergone some changes in terms of the Peak Ground Acceleration (PGA) value. The revision to the PGA has raised a concern about the safety of onshore oil and gas structures, as these structures were not designed to accommodate the new PGA values, which are much higher than the values used in the original design. In view of the large number of structures and buildings to be re-assessed, a risk assessment methodology has been developed to prioritize and rank the assets in terms of their criticality against the new seismic loading. To date, such a risk assessment method for onshore oil and gas structures has been lacking, and the main intention of this technical paper is to share the risk assessment methodology and the risk element scoring finalized via the Delphi method. The finalized methodology and the values used to rank the risk elements have been established based on years of relevant experience in the subject matter and on a series of rigorous discussions with professionals in the industry. The risk scoring is mapped against the risk matrix (i.e., the likelihood of failure (LOF) versus the consequence of failure (COF)), and hence the overall risk for the assets can be obtained. The overall risk can be used to prioritize and optimize integrity assessment, repair and strengthening work against the new seismic mapping of the country.
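The mapping of scores onto an LOF versus COF risk matrix can be sketched as follows. This is only an illustration of the general idea: the 1 to 5 category scale, the product score, the thresholds, and the asset names are all hypothetical assumptions, since the abstract does not give the scoring values or the matrix used.

```python
def risk_level(lof, cof):
    """Map likelihood-of-failure and consequence-of-failure categories
    (each 1..5, an assumed scale) to a coarse risk level."""
    if not (1 <= lof <= 5 and 1 <= cof <= 5):
        raise ValueError("LOF and COF categories must be between 1 and 5")
    score = lof * cof
    # Hypothetical thresholds; a real matrix would come from the methodology.
    if score >= 15:
        return "high"
    if score >= 6:
        return "medium"
    return "low"


# Hypothetical assets with (LOF, COF) categories.
assets = {
    "pipe rack A": (4, 5),
    "control building": (2, 3),
    "storage shed": (1, 2),
}

# Rank assets from highest to lowest risk score to prioritize re-assessment.
ranked = sorted(assets, key=lambda name: assets[name][0] * assets[name][1], reverse=True)
for name in ranked:
    print(name, risk_level(*assets[name]))
```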

Keywords: methodology, PGA, risk, seismic

Procedia PDF Downloads 145
23783 Biodegradable Magnesium Alloys with Addition of Rare Earth Elements for Biomedical Applications

Authors: Yuncang Li, Cuie Wen

Abstract:

Biodegradable metallic materials such as magnesium (Mg)-based alloys have attracted extensive interest for use as bone implant materials. However, the high biodegradation rate of existing Mg alloys in the physiological environment of the human body leads to a loss of mechanical integrity before adequate bone healing and produces a large volume of hydrogen gas. Therefore, slowing down the biodegradation rate of Mg alloys is a critical task in developing new biodegradable Mg alloy implant materials. One of the most effective approaches to achieve this is to strategically design new Mg alloys with a low biodegradation rate, excellent biocompatibility, and enhanced mechanical properties. Our research selected biocompatible and biofunctional alloying elements such as zirconium (Zr), strontium (Sr), and rare earth elements (REEs) to alloy Mg and has developed a new series of Mg-Zr-Sr-REEs alloys for biodegradable implant applications. Research results indicated that Sr and Zr additions could refine the grain size, decrease the biodegradation rate, and enhance the biological behaviors of the Mg alloys. The addition of REEs such as holmium (Ho) and dysprosium (Dy) to Mg-Zr-Sr alloys resulted in enhanced mechanical strength and a decreased biodegradation rate. In addition, Ho and Dy additions (≤ 5 wt.%) to Mg-Zr-Sr alloys enhanced the adhesion and proliferation of osteoblast cells on the Mg-Zr-Sr-Ho/Dy alloys.

Keywords: biocompatibility, magnesium, mechanical and biodegradation properties, rare earth elements

Procedia PDF Downloads 107
23782 On the Optimality Assessment of Nano-Particle Size Spectrometry and Its Association to the Entropy Concept

Authors: A. Shaygani, R. Saifi, M. S. Saidi, M. Sani

Abstract:

Particle size distribution, the most important characteristic of aerosols, is obtained through electrical characterization techniques. The dynamics of charged nano-particles under the influence of an electric field in an electrical mobility spectrometer (EMS) reveals the size distribution of these particles. The accuracy of this measurement is influenced by flow conditions, geometry, electric field and the particle charging process, and therefore by the transfer function (transfer matrix) of the instrument. In this work, a wire-cylinder corona charger was designed, and the combined field-diffusion charging process of injected poly-disperse aerosol particles was numerically simulated as a prerequisite for the study of a multi-channel EMS. The result, a cloud of particles with non-uniform charge distribution, was introduced to the EMS. The flow pattern and electric field in the EMS were simulated using computational fluid dynamics (CFD) to obtain particle trajectories in the device and thereby calculate the signal reported by each electrometer. Based on the output signals (resulting from the bombardment of particles and the transfer of their charges as currents), we proposed a modification to the size of the detecting rings (which are connected to electrometers) in order to evaluate particle size distributions more accurately. Based on the capability of the system to transfer information about the size distribution of the injected particles, we proposed a benchmark for assessing the optimality of the design. This method applies the concept of von Neumann entropy and borrows the definition of entropy from information theory (Shannon entropy) to measure optimality. Shannon entropy is the "average amount of information contained in an event, sample or character extracted from a data stream".
Among the responses (signals) obtained from the various configurations of detecting rings, the modified configuration gave the best predictions of the size distributions of the injected particles. It was also the configuration with the maximum entropy. A reasonable consistency was also observed between the accuracy of the predictions and the entropy content of each configuration. In this method, entropy is extracted from the transfer matrix of the instrument for each configuration. Ultimately, various clouds of particles were introduced to the simulations, and the predicted size distributions were compared to the exact size distributions.
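The Shannon entropy quoted above can be computed for any discrete distribution as H = -Σ p·log2(p). The sketch below applies it to a vector of non-negative channel signals after normalizing them to sum to one. It illustrates the definition only: how the authors extract entropy from the transfer matrix (the von Neumann form) is not described in the abstract, and the signal values here are hypothetical.

```python
import math


def shannon_entropy(weights):
    """Shannon entropy in bits of a non-negative weight vector,
    normalized so that the weights sum to one."""
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]  # zero terms contribute nothing
    return -sum(p * math.log2(p) for p in probs)


# Hypothetical electrometer signals for two detecting-ring configurations.
evenly_spread = [1.0, 1.0, 1.0, 1.0]   # every channel carries signal
concentrated = [97.0, 1.0, 1.0, 1.0]   # almost all signal lands on one channel

print(shannon_entropy(evenly_spread))
print(shannon_entropy(concentrated))
```

A configuration that spreads the signal over more channels has higher entropy in this sense, which matches the abstract's observation that the best-performing configuration was also the one with maximum entropy.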

Keywords: aerosol nano-particle, CFD, electrical mobility spectrometer, von Neumann entropy

Procedia PDF Downloads 331
23781 Exploring the Role of Data Mining in Crime Classification: A Systematic Literature Review

Authors: Faisal Muhibuddin, Ani Dijah Rahajoe

Abstract:

This in-depth exploration, through a systematic literature review, scrutinizes the nuanced role of data mining in the classification of criminal activities. The research focuses on investigating various methodological aspects and recent developments in leveraging data mining techniques to enhance the effectiveness and precision of crime categorization. Commencing with an exposition of the foundational concepts of crime classification and its evolutionary dynamics, this study details the paradigm shift from conventional methods towards approaches supported by data mining, addressing the challenges and complexities inherent in the modern crime landscape. Specifically, the research delves into various data mining techniques, including K-means clustering, Naïve Bayes, K-nearest neighbour, and clustering methods. A comprehensive review of the strengths and limitations of each technique provides insights into their respective contributions to improving crime classification models. The integration of diverse data sources takes centre stage in this research. A detailed analysis explores how the amalgamation of structured data (such as criminal records) and unstructured data (such as social media) can offer a holistic understanding of crime, enriching classification models with more profound insights. Furthermore, the study explores the temporal implications in crime classification, emphasizing the significance of considering temporal factors to comprehend long-term trends and seasonality. The availability of real-time data is also elucidated as a crucial element in enhancing responsiveness and accuracy in crime classification.

Keywords: data mining, classification algorithm, naïve bayes, k-means clustering, k-nearest neighbor, crime, data analysis, systematic literature review

Procedia PDF Downloads 56
23780 Assessing Supply Chain Performance through Data Mining Techniques: A Case of Automotive Industry

Authors: Emin Gundogar, Burak Erkayman, Nusret Sazak

Abstract:

Managing performance effectively across the whole supply chain is a critical issue and hard to put into practice. Proper evaluation of integrated data can yield accurate information. Analysing supply chain data through OLAP (On-Line Analytical Processing) technologies may provide a multi-angle, consolidated view of the work. In this study, association rules and classification techniques are applied to measure the supply chain performance metrics of an automotive manufacturer in Turkey. The main criteria and important rules are determined, and a comparison of the results of the algorithms is presented.
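Association rules of the kind mentioned above are usually scored by support and confidence. The sketch below computes both for a single rule. It is a minimal illustration with hypothetical delivery records; the abstract does not state which rule-mining algorithm or which metrics the study used.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of `consequent` given `antecedent`."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical delivery records, each a set of observed attributes.
records = [
    {"supplier_B", "late"},
    {"supplier_B", "late"},
    {"supplier_B", "on_time"},
    {"supplier_A", "on_time"},
]

# Score the rule {supplier_B} -> {late}.
print(support(records, {"supplier_B", "late"}))
print(confidence(records, {"supplier_B"}, {"late"}))
```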

Keywords: supply chain performance, performance measurement, data mining, automotive

Procedia PDF Downloads 500
23779 Information Theoretic Approach for Beamforming in Wireless Communications

Authors: Syed Khurram Mahmud, Athar Naveed, Shoaib Arif

Abstract:

Beamforming is a signal processing technique extensively used in wireless communications and radar to strengthen the desired signal and minimize interference through spatial selectivity. In this paper, we present a method for calculating optimal weight vectors for a smart antenna array to achieve a directive pattern during transmission and selective reception in an interference-prone environment. In the proposed scheme, Mutual Information (MI) extrema are evaluated through an energy-constrained objective function, which is based on a-priori information about the interference source and the desired array factor. Signal to Interference plus Noise Ratio (SINR) performance is evaluated for both transmission and reception. In our scheme, MI is presented as an index to identify the trade-off between information gain, SINR, illumination time and spatial selectivity in an energy-constrained optimization problem. The employed method has lower computational complexity, which is shown through comparative analysis with conventional methods in current use. MI-based beamforming enhances signal integrity in degraded environments while reducing computational complexity and correlating key performance indicators.
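To make the terms weight vector, array factor, and SINR concrete, the sketch below evaluates them for a uniform linear array with conventional (delay-and-sum) weights. This is not the MI-based method of the paper, whose objective function is not given in the abstract; the half-wavelength spacing, the angles, the powers, and all function names are assumptions chosen for illustration.

```python
import cmath
import math


def steering_vector(n_elements, theta_deg, spacing=0.5):
    """Steering vector of a uniform linear array; `spacing` is in wavelengths."""
    phase = 2 * math.pi * spacing * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * phase * n) for n in range(n_elements)]


def response(weights, theta_deg, spacing=0.5):
    """Array factor w^H a(theta) for the given weight vector."""
    a = steering_vector(len(weights), theta_deg, spacing)
    return sum(w.conjugate() * x for w, x in zip(weights, a))


def sinr(weights, theta_sig, theta_int, p_sig=1.0, p_int=1.0, noise=0.01):
    """Output SINR for one desired source, one interferer, and white noise."""
    gain_sig = abs(response(weights, theta_sig)) ** 2
    gain_int = abs(response(weights, theta_int)) ** 2
    noise_out = noise * sum(abs(w) ** 2 for w in weights)
    return p_sig * gain_sig / (p_int * gain_int + noise_out)


# Delay-and-sum weights steered to an assumed desired direction of 20 degrees.
n = 8
weights = [a / n for a in steering_vector(n, 20.0)]

print(abs(response(weights, 20.0)))   # gain toward the desired source
print(abs(response(weights, -40.0)))  # gain toward an assumed interferer
print(sinr(weights, theta_sig=20.0, theta_int=-40.0))
```

An optimized weight vector would replace the delay-and-sum weights here; the same `sinr` function could then be used to compare candidates.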

Keywords: beamforming, interference, mutual information, wireless communications

Procedia PDF Downloads 269