Search results for: data consistency
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24914

Search results for: data consistency

24134 A Study of Safety of Data Storage Devices of Graduate Students at Suan Sunandha Rajabhat University

Authors: Komol Phaisarn, Natcha Wattanaprapa

Abstract:

This research is a survey research with an objective to study the safety of data storage devices of graduate students of academic year 2013, Suan Sunandha Rajabhat University. Data were collected by questionnaire on the safety of data storage devices according to CIA principle. A sample size of 81 was drawn from population by purposive sampling method. The results show that most of the graduate students of academic year 2013 at Suan Sunandha Rajabhat University use handy drive to store their data and the safety level of the devices is at good level.

Keywords: security, safety, storage devices, graduate students

Procedia PDF Downloads 341
24133 Prompt Design for Code Generation in Data Analysis Using Large Language Models

Authors: Lu Song Ma Li Zhi

Abstract:

With the rapid advancement of artificial intelligence technology, large language models (LLMs) have become a milestone in the field of natural language processing, demonstrating remarkable capabilities in semantic understanding, intelligent question answering, and text generation. These models are gradually penetrating various industries, particularly showcasing significant application potential in the data analysis domain. However, retraining or fine-tuning these models requires substantial computational resources and ample downstream task datasets, which poses a significant challenge for many enterprises and research institutions. Without modifying the internal parameters of the large models, prompt engineering techniques can rapidly adapt these models to new domains. This paper proposes a prompt design strategy aimed at leveraging the capabilities of large language models to automate the generation of data analysis code. By carefully designing prompts, data analysis requirements can be described in natural language, which the large language model can then understand and convert into executable data analysis code, thereby greatly enhancing the efficiency and convenience of data analysis. This strategy not only lowers the threshold for using large models but also significantly improves the accuracy and efficiency of data analysis. Our approach includes requirements for the precision of natural language descriptions, coverage of diverse data analysis needs, and mechanisms for immediate feedback and adjustment. Experimental results show that with this prompt design strategy, large language models perform exceptionally well in multiple data analysis tasks, generating high-quality code and significantly shortening the data analysis cycle. This method provides an efficient and convenient tool for the data analysis field and demonstrates the enormous potential of large language models in practical applications.

Keywords: large language models, prompt design, data analysis, code generation

Procedia PDF Downloads 11
24132 Comparison of Different Methods to Produce Fuzzy Tolerance Relations for Rainfall Data Classification in the Region of Central Greece

Authors: N. Samarinas, C. Evangelides, C. Vrekos

Abstract:

The aim of this paper is the comparison of three different methods, in order to produce fuzzy tolerance relations for rainfall data classification. More specifically, the three methods are correlation coefficient, cosine amplitude and max-min method. The data were obtained from seven rainfall stations in the region of central Greece and refers to 20-year time series of monthly rainfall height average. Three methods were used to express these data as a fuzzy relation. This specific fuzzy tolerance relation is reformed into an equivalence relation with max-min composition for all three methods. From the equivalence relation, the rainfall stations were categorized and classified according to the degree of confidence. The classification shows the similarities among the rainfall stations. Stations with high similarity can be utilized in water resource management scenarios interchangeably or to augment data from one to another. Due to the complexity of calculations, it is important to find out which of the methods is computationally simpler and needs fewer compositions in order to give reliable results.

Keywords: classification, fuzzy logic, tolerance relations, rainfall data

Procedia PDF Downloads 304
24131 The Role of Extrovert and Introvert Personality in Second Language Acquisition

Authors: Fatma Hsain Ali Suliman

Abstract:

Personality plays an important role in acquiring a second language. For second language learners to make maximum progress with their own learning styles, their individual differences must be recognized and attended to. Personality is considered to be a pattern of unique characteristics that give a person’s behavior a kind of consistency and individuality. Therefore, the enclosed study, which is entitled “The Role of Personality in Second language Acquisition: Extroversion and Introversion”, tends to shed light on the relationship between learners’ personalities and second language acquisition process. In other words, it aims at drawing attention to how individual differences of students as being extroverts or introverts could affect the language acquisition process. As a literature review, this paper discusses the results of some studies concerning this issue as well as the point views of researchers and scholars who have focused on the effect of extrovert and introvert personality on acquiring a second language. To accomplish the goals of this study, which is divided into 5 chapters including introduction, review of related literature, research method and design, results and discussions and conclusions and recommendations, 20 students of English Department, Faculty of Arts, Misurata University, Libya were handed out a questionnaire to figure out the effect of their personalities on the learning process. Finally, to be more sure about the role of personality in a second language acquisition process, the same students who were given the questionnaire were observed in their ESL classes.

Keywords: second language acquisition, personality, extroversion, introversion, individual differences, language learning strategy, personality factors, psycho linguistics

Procedia PDF Downloads 636
24130 Customer Satisfaction and Effective HRM Policies: Customer and Employee Satisfaction

Authors: S. Anastasiou, C. Nathanailides

Abstract:

The purpose of this study is to examine the possible link between employee and customer satisfaction. The service provided by employees, help to build a good relationship with customers and can help at increasing their loyalty. Published data for job satisfaction and indicators of customer services were gathered from relevant published works which included data from five different countries. The reviewed data indicate a significant correlation between indicators of customer and employee satisfaction in the Banking sector. There was a significant correlation between the two parameters (Pearson correlation R2=0.52 P<0.05) The reviewed data provide evidence that there is some practical evidence which links these two parameters.

Keywords: job satisfaction, job performance, customer’ service, banks, human resources management

Procedia PDF Downloads 311
24129 Evaluation of Australian Open Banking Regulation: Balancing Customer Data Privacy and Innovation

Authors: Suman Podder

Abstract:

As Australian ‘Open Banking’ allows customers to share their financial data with accredited Third-Party Providers (‘TPPs’), it is necessary to evaluate whether the regulators have achieved the balance between protecting customer data privacy and promoting data-related innovation. Recognising the need to increase customers’ influence on their own data, and the benefits of data-related innovation, the Australian Government introduced ‘Consumer Data Right’ (‘CDR’) to the banking sector through Open Banking regulation. Under Open Banking, TPPs can access customers’ banking data that allows the TPPs to tailor their products and services to meet customer needs at a more competitive price. This facilitated access and use of customer data will promote innovation by providing opportunities for new products and business models to emerge and grow. However, the success of Open Banking depends on the willingness of the customers to share their data, so the regulators have augmented the protection of data by introducing new privacy safeguards to instill confidence and trust in the system. The dilemma in policymaking is that, on the one hand, lenient data privacy laws will help the flow of information, but at the risk of individuals’ loss of privacy, on the other hand, stringent laws that adequately protect privacy may dissuade innovation. Using theoretical and doctrinal methods, this paper examines whether the privacy safeguards under Open Banking will add to the compliance burden of the participating financial institutions, resulting in the undesirable effect of stifling other policy objectives such as innovation. The contribution of this research is three-fold. In the emerging field of customer data sharing, this research is one of the few academic studies on the objectives and impact of Open Banking in the Australian context. Additionally, Open Banking is still in the early stages of implementation, so this research traces the evolution of Open Banking through policy debates regarding the desirability of customer data-sharing. Finally, the research focuses not only on the customers’ data privacy and juxtaposes it with another important objective of promoting innovation, but it also highlights the critical issues facing the data-sharing regime. This paper argues that while it is challenging to develop a regulatory framework for protecting data privacy without impeding innovation and jeopardising yet unknown opportunities, data privacy and innovation promote different aspects of customer welfare. This paper concludes that if a regulation is appropriately designed and implemented, the benefits of data-sharing will outweigh the cost of compliance with the CDR.

Keywords: consumer data right, innovation, open banking, privacy safeguards

Procedia PDF Downloads 131
24128 Debt Relief for Emerging Economies: An Empirical Investigation

Authors: Hummad Ch. Umar

Abstract:

Most of the developing economies, including Pakistan, are confronted with high level of external debt which is adversely affecting their economic performance. The hypothesis of debt overhang is often used to assess the negative relationship between foreign debt and the economic growth of the indebted country. As first objective of the present study, this hypothesis is tested by using Pooled OLS (POLS), Generalized Method of Moment (GMM), Random Effect (RE), and Fixed effect (FE) techniques. As second objective, the study uses the concept of debt Laffer Curve to determine the eligibility condition of the indebted countries for the relief programs. According to this approach, countries lying on the right side of the Laffer Curve are stated to be trapped in the strong debt overhang making them unable to come out of the vicious circle of low growth and high foreign debt. The empirical analysis confirms that only two countries out of twenty two completely fulfill the conditions of being eligible for the debt relief. All other countries continue to face debt burden of different magnitudes. The study further confirms that the debt relief alone is not sufficient for overcoming the debt problem. Instead, sound economic policies and conducive investment decisions are required to lay the foundations of long-term growth and development. Debt relief should be the option for only those countries that meet a minimum measurable criterion of good governance, economic freedom, and consistency of policies.

Keywords: external debt, debt burden, debt overhang, debt laffer curve, debt relief, investment decisions

Procedia PDF Downloads 314
24127 Generation of Automated Alarms for Plantwide Process Monitoring

Authors: Hyun-Woo Cho

Abstract:

Earlier detection of incipient abnormal operations in terms of plant-wide process management is quite necessary in order to improve product quality and process safety. And generating warning signals or alarms for operating personnel plays an important role in process automation and intelligent plant health monitoring. Various methodologies have been developed and utilized in this area such as expert systems, mathematical model-based approaches, multivariate statistical approaches, and so on. This work presents a nonlinear empirical monitoring methodology based on the real-time analysis of massive process data. Unfortunately, the big data includes measurement noises and unwanted variations unrelated to true process behavior. Thus the elimination of such unnecessary patterns of the data is executed in data processing step to enhance detection speed and accuracy. The performance of the methodology was demonstrated using simulated process data. The case study showed that the detection speed and performance was improved significantly irrespective of the size and the location of abnormal events.

Keywords: detection, monitoring, process data, noise

Procedia PDF Downloads 237
24126 Meanings and Concepts of Standardization in Systems Medicine

Authors: Imme Petersen, Wiebke Sick, Regine Kollek

Abstract:

In systems medicine, high-throughput technologies produce large amounts of data on different biological and pathological processes, including (disturbed) gene expressions, metabolic pathways and signaling. The large volume of data of different types, stored in separate databases and often located at different geographical sites have posed new challenges regarding data handling and processing. Tools based on bioinformatics have been developed to resolve the upcoming problems of systematizing, standardizing and integrating the various data. However, the heterogeneity of data gathered at different levels of biological complexity is still a major challenge in data analysis. To build multilayer disease modules, large and heterogeneous data of disease-related information (e.g., genotype, phenotype, environmental factors) are correlated. Therefore, a great deal of attention in systems medicine has been put on data standardization, primarily to retrieve and combine large, heterogeneous datasets into standardized and incorporated forms and structures. However, this data-centred concept of standardization in systems medicine is contrary to the debate in science and technology studies (STS) on standardization that rather emphasizes the dynamics, contexts and negotiations of standard operating procedures. Based on empirical work on research consortia that explore the molecular profile of diseases to establish systems medical approaches in the clinic in Germany, we trace how standardized data are processed and shaped by bioinformatics tools, how scientists using such data in research perceive such standard operating procedures and which consequences for knowledge production (e.g. modeling) arise from it. Hence, different concepts and meanings of standardization are explored to get a deeper insight into standard operating procedures not only in systems medicine, but also beyond.

Keywords: data, science and technology studies (STS), standardization, systems medicine

Procedia PDF Downloads 328
24125 Big Data in Construction Project Management: The Colombian Northeast Case

Authors: Sergio Zabala-Vargas, Miguel Jiménez-Barrera, Luz VArgas-Sánchez

Abstract:

In recent years, information related to project management in organizations has been increasing exponentially. Performance data, management statistics, indicator results have forced the collection, analysis, traceability, and dissemination of project managers to be essential. In this sense, there are current trends to facilitate efficient decision-making in emerging technology projects, such as: Machine Learning, Data Analytics, Data Mining, and Big Data. The latter is the most interesting in this project. This research is part of the thematic line Construction methods and project management. Many authors present the relevance that the use of emerging technologies, such as Big Data, has taken in recent years in project management in the construction sector. The main focus is the optimization of time, scope, budget, and in general mitigating risks. This research was developed in the northeastern region of Colombia-South America. The first phase was aimed at diagnosing the use of emerging technologies (Big-Data) in the construction sector. In Colombia, the construction sector represents more than 50% of the productive system, and more than 2 million people participate in this economic segment. The quantitative approach was used. A survey was applied to a sample of 91 companies in the construction sector. Preliminary results indicate that the use of Big Data and other emerging technologies is very low and also that there is interest in modernizing project management. There is evidence of a correlation between the interest in using new data management technologies and the incorporation of Building Information Modeling BIM. The next phase of the research will allow the generation of guidelines and strategies for the incorporation of technological tools in the construction sector in Colombia.

Keywords: big data, building information modeling, tecnology, project manamegent

Procedia PDF Downloads 119
24124 Monitoring of Sustainability of Extruded Soya Product TRADKON SPC-TEX in Order to Define Expiration Date

Authors: Radovan Čobanović, Milica Rankov Šicar

Abstract:

New attitudes about nutrition impose new styles, and therefore a neNew attitudes about nutrition impose new styles, and therefore a new kind of food. The goal of our work was to define the shelf life of new extruded soya product with minimum 65% of protein based on the analyses. According to the plan it was defined that a certain quantity of the same batch of new product (soybean flakes) which had predicted shelf life of 2 years had to be stored for 24 months in storage and analyzed at the beginning and end of sustainability plan on instrumental analyses (heavy metals, pesticides and mycotoxins) and every month on sensory analyses (odor, taste, color, consistency), microbiological analyses (Salmonella spp., Escherichia coli, Enterobacteriaceae, sulfite-reducing clostridia, Listeria monocytogenes), chemical analyses (protein, ash, fat, crude cellulose, granulation) and at the beginning on GMO analyses. All analyses were tested according to: sensory analyses ISO 6658, Salmonella spp ISO 6579, Escherichia coli ISO 16649-2, Enterobacteriaceae ISO 21528-2, sulfite-reducing clostridia ISO 15213 and Listeria monocytogenes ISO 11290-2, chemical and instrumental analyses Serbian ordinance on the methods of physico-chemical analyses and GMO analyses JRC Compendium. The results obtained after the analyses which were done according to the plan during the 24 months indicate that are no changes of products concerning both sensory and chemical analyses. As far as microbiological results are concerned Salmonella spp was not detected and all other quantitative analyses showed values <10 cfu/g. The other parameters for food safety (heavy metals, pesticides and mycotoxins) were not present in analyzed samples and also all analyzed samples were negative concerning genetic testing. On the basis of monitoring the sample under defined storage conditions and analyses of quality control, GMO analyses and food safety of the sample during the shelf within two years, the results showed that all the parameters of the sample during defined period is in accordance with Serbian regulative so that indicate that predicted shelf life can be adopted.w kind of food. The goal of our work was to define the shelf life of new extruded soya product with minimum 65% of protein based on the analyses. According to the plan it was defined that a certain quantity of the same batch of new product (soybean flakes) which had predicted shelf life of 2 years had to be stored for 24 months in storage and analyzed at the beginning and end of sustainability plan on instrumental analyses (heavy metals, pesticides and mycotoxins) and every month on sensory analyses (odor, taste, color, consistency), microbiological analyses (Salmonella spp., Escherichia coli, Enterobacteriaceae, sulfite-reducin clostridia, Listeria monocytogenes), chemical analyses (protein, ash, fat, crude cellulose, granulation) and at the beginning on GMO analyses. All analyses were tested according: sensory analyses ISO 6658, Salmonella spp ISO 6579, Escherichia coli ISO 16649-2, Enterobacteriaceae ISO 21528-2, sulfite-reducing clostridia ISO 15213 and Listeria monocytogenes ISO 11290-2, chemical and instrumental analyses Serbian ordinance on the methods of physico-chemical analyses and GMO analyses JRC Compendium. The results obtained after the analyses which were done according to the plan during the 24 months indicate that are no changes of products concerning both sensory and chemical analyses. As far as microbiological results are concerned Salmonella spp was not detected and all other quantitative analyses showed values <10 cfu/g. The other parameters for food safety (heavy metals, pesticides and mycotoxins) were not present in analyzed samples and also all analyzed samples were negative concerning genetic testing. On the basis of monitoring the sample under defined storage conditions and analyses of quality control, GMO analyses and food safety of the sample during the shelf within two years, the results showed that all the parameters of the sample during defined period is in accordance with Serbian regulative so that indicate that predicted shelf life can be adopted.

Keywords: extruded soya product, food safety analyses, GMO analyses, shelf life

Procedia PDF Downloads 284
24123 Assessing Antimicrobial Activity of Various Plant Extracts on Midgutmicroflora of Aedesaegypti

Authors: V. Baweja, K. K. Gupta, V. Dubey, C. Keshavam

Abstract:

Antimicrobial activity of six indigenous plants such as Tulsi Ocimum sanctum, Neem Azadirachta indica, Aloe vera, Turmeric Curcuma longa, Lantana Lantana camara, and Clove Syzygium aromaticum was assessed against the gut microbiota of the dengue fever mosquito Aedes aegypti, keeping in view that the presence of midgut bacteria may affect the ability of the vector to transmit pathogens. Eleven different types of bacterial clones were isolated from the midgut of lab-reared fourth instar larvae of Aedes aegypti and were grown on LB agar medium at an optimum temperature of 25 ºC. Identification of these bacteria was done on the basis of their colony characteristic such as colony size, shape, opacity, elevation, consistency, and growth. Light microscopic studies of the gut microbiota revealed dominance of Gram-negative cocci over gram positive cocci and bacilli and Gram-negative bacilli. Identification of species was done by chemical characterization of the colonies. Crude extracts of all test plants were screened for their antimicrobial activities against gut microbiota by disc diffusion assay. The zone of exclusion seen after 24 hr of incubation in different assays revealed the most potent antibacterial activities in neem followed by clove and turmeric. Lantana and Aloe vera were least effective.

Keywords: plant extract, aedes, dengue, antimicrobial activity

Procedia PDF Downloads 393
24122 Minimum Data of a Speech Signal as Special Indicators of Identification in Phonoscopy

Authors: Nazaket Gazieva

Abstract:

Voice biometric data associated with physiological, psychological and other factors are widely used in forensic phonoscopy. There are various methods for identifying and verifying a person by voice. This article explores the minimum speech signal data as individual parameters of a speech signal. Monozygotic twins are believed to be genetically identical. Using the minimum data of the speech signal, we came to the conclusion that the voice imprint of monozygotic twins is individual. According to the conclusion of the experiment, we can conclude that the minimum indicators of the speech signal are more stable and reliable for phonoscopic examinations.

Keywords: phonogram, speech signal, temporal characteristics, fundamental frequency, biometric fingerprints

Procedia PDF Downloads 127
24121 Nurse's Use of Power to Standardize Nursing Terminology in Electronic Health Record

Authors: Samira Ali

Abstract:

Aim: The purpose of this study was to describe nurses’ potential use of power levels to influence the standardization of nursing terminology (SNT) in electronic health records. Also, to examine the relationship between nurses’ use of power levels and variables such as position, communication and the potential goal of achieving SNT in electronic health records. Background: In an era of evidence-based nursing care, with an emphasis on nursing’s ability to measure the care rendered and improve outcomes of care, little is known about the nurse’s potential use of their power to SNT in electronic health records and lack of use of an SNT in electronic health records. Method: This descriptive, correlational, and cross-sectional study was conducted using survey methodology to assess the nurse’s use of power to influence the SNT in electronic health records. The Theory of Group Power within Organizations (TGPO) provided the conceptual framework for this study. A total of (n=232) nurses responded to the survey, posted on three nursing organizations’ websites. Results revealed the mean Cronbach’s alpha of the subscales was .94, suggesting high internal consistency. The mean power capability score was moderately high, at 134.22 (SD = 18.49). Power Capacity was significantly correlated with Power Capability (r = .96, p < .001). Power Capacity subscales were significantly correlated with Power Capacity and Power Capability. Conclusion: The mean Cronbach’s alpha of the subscales was .94 suggestive of reliability of the instrument. Nurses could potentially use power to achieve their goals, such as the implementation of SNT in electronic health records.

Keywords: nurses, power, actualized power, nursing terminology, electronic health records

Procedia PDF Downloads 235
24120 Assessing Professionalism, Communication, and Collaboration among Emergency Physicians by Implementing a 360-Degree Evaluation

Authors: Ahmed Al Ansari, Khalid Al Khalifa

Abstract:

Objective: Multisource feedback (MSF), also called the 360-Degree evaluation is an evaluation process by which questionnaires are distributed amongst medical peers and colleagues to assess physician performance from different sources other than the attending or the supervising physicians. The aim of this study was to design, implement, and evaluate a 360-Degree process in assessing emergency physicians trainee in the Kingdom of Bahrain. Method: The study was undertaken in Bahrain Defense Force Hospital which is a military teaching hospital in the Kingdom of Bahrain. Thirty emergency physicians (who represent the total population of the emergency physicians in our hospital) were assessed in this study. We developed an instrument modified from the Physician achievement review instrument PAR which was used to assess Physician in Alberta. We focused in our instrument to assess professionalism, communication skills and collaboration only. To achieve face and content validity, table of specification was constructed and a working group was involved in constructing the instrument. Expert opinion was considered as well. The instrument consisted of 39 items; were 15 items to assess professionalism, 13 items to assess communication skills, and 11 items to assess collaboration. Each emergency physicians was evaluated with 3 groups of raters, 4 Medical colleague emergency physicians, 4 medical colleague who are considered referral physicians from different departments, and 4 Coworkers from the emergency department. Independent administrative team was formed to carry on the responsibility of distributing the instruments and collecting them in closed envelopes. Each envelope was consisted of that instrument and a guide for the implementation of the MSF and the purpose of the study. Results: A total of 30 emergency physicians 16 males and 14 females who represent the total number of the emergency physicians in our hospital were assessed. The total collected forms is 269, were 105 surveys from coworkers working in emergency department, 93 surveys from medical colleague emergency physicians, and 116 surveys from referral physicians from different departments. The total mean response rates were 71.2%. The whole instrument was found to be suitable for factor analysis (KMO = 0.967; Bartlett test significant, p<0.00). Factor analysis showed that the data on the questionnaire decomposed into three factors which counted for 72.6% of the total variance: professionalism, collaboration, and communication. Reliability analysis indicated that the instrument full scale had high internal consistency (Cronbach’s α 0.98). The generalizability coefficients (Ep2) were 0.71 for the surveys. Conclusions: Based on the present results, the current instruments and procedures have high reliability, validity, and feasibility in assessing emergency physicians trainee in the emergency room.

Keywords: MSF system, emergency, validity, generalizability

Procedia PDF Downloads 348
24119 The Great Mimicker: A Case of Disseminated Tuberculosis

Authors: W. Ling, Mohamed Saufi Bin Awang

Abstract:

Introduction: Mycobacterium tuberculosis post a major health problem worldwide. Central nervous system (CNS) infection by mycobacterium tuberculosis is one of the most devastating complications of tuberculosis. Although with advancement in medical fields, we are yet to understand the pathophysiology of how mycobacterium tuberculosis was able to cross the blood-brain barrier (BBB) and infect the CNS. CNS TB may present with nonspecific clinical symptoms which can mimic other diseases/conditions; this is what makes the diagnosis relatively difficult and challenging. Public health has to be informed and educated about the spread of TB, and early identification of TB is important as it is a curable disease. Case Report: A young 21-year-old Malay gentleman was initially presented to us with symptoms of ear discharge, tinnitus, and right-sided headache for the past one year. Further history reveals that the symptoms have been mismanaged and neglected over the period of 1 year. Initial investigation reveals features of inflammation of the ear. Further imaging showed the feature of chronic inflammation of the otitis media and atypical right cerebral abscess, which has the same characteristic features and consistency. He further underwent a biopsy, and results reveal positive Mycobacterium tuberculosis of the otitis media. With the results and the available imaging, we were certain that this is likely a case of disseminated tuberculosis causing CNS TB. Conclusion: We aim to highlight the challenge and difficult face in our health care system and public health in early identification and treatment.

Keywords: central nervous system tuberculosis, intracranial tuberculosis, tuberculous encephalopathy, tuberculous meningitis

Procedia PDF Downloads 173
24118 A Non-parametric Clustering Approach for Multivariate Geostatistical Data

Authors: Francky Fouedjio

Abstract:

Multivariate geostatistical data have become omnipresent in the geosciences and pose substantial analysis challenges. One of them is the grouping of data locations into spatially contiguous clusters so that data locations within the same cluster are more similar while clusters are different from each other, in some sense. Spatially contiguous clusters can significantly improve the interpretation that turns the resulting clusters into meaningful geographical subregions. In this paper, we develop an agglomerative hierarchical clustering approach that takes into account the spatial dependency between observations. It relies on a dissimilarity matrix built from a non-parametric kernel estimator of the spatial dependence structure of data. It integrates existing methods to find the optimal cluster number and to evaluate the contribution of variables to the clustering. The capability of the proposed approach to provide spatially compact, connected and meaningful clusters is assessed using bivariate synthetic dataset and multivariate geochemical dataset. The proposed clustering method gives satisfactory results compared to other similar geostatistical clustering methods.

Keywords: clustering, geostatistics, multivariate data, non-parametric

Procedia PDF Downloads 470
24117 Big Data in Telecom Industry: Effective Predictive Techniques on Call Detail Records

Authors: Sara ElElimy, Samir Moustafa

Abstract:

Mobile network operators start to face many challenges in the digital era, especially with high demands from customers. Since mobile network operators are considered a source of big data, traditional techniques are not effective with new era of big data, Internet of things (IoT) and 5G; as a result, handling effectively different big datasets becomes a vital task for operators with the continuous growth of data and moving from long term evolution (LTE) to 5G. So, there is an urgent need for effective Big data analytics to predict future demands, traffic, and network performance to full fill the requirements of the fifth generation of mobile network technology. In this paper, we introduce data science techniques using machine learning and deep learning algorithms: the autoregressive integrated moving average (ARIMA), Bayesian-based curve fitting, and recurrent neural network (RNN) are employed for a data-driven application to mobile network operators. The main framework included in models are identification parameters of each model, estimation, prediction, and final data-driven application of this prediction from business and network performance applications. These models are applied to Telecom Italia Big Data challenge call detail records (CDRs) datasets. The performance of these models is found out using a specific well-known evaluation criteria shows that ARIMA (machine learning-based model) is more accurate as a predictive model in such a dataset than the RNN (deep learning model).

Keywords: big data analytics, machine learning, CDRs, 5G

Procedia PDF Downloads 126
24116 A Data Mining Approach for Analysing and Predicting the Bank's Asset Liability Management Based on Basel III Norms

Authors: Nidhin Dani Abraham, T. K. Sri Shilpa

Abstract:

Asset liability management is an important aspect in banking business. Moreover, the today’s banking is based on BASEL III which strictly regulates on the counterparty default. This paper focuses on prediction and analysis of counter party default risk, which is a type of risk occurs when the customers fail to repay the amount back to the lender (bank or any financial institutions). This paper proposes an approach to reduce the counterparty risk occurring in the financial institutions using an appropriate data mining technique and thus predicts the occurrence of NPA. It also helps in asset building and restructuring quality. Liability management is very important to carry out banking business. To know and analyze the depth of liability of bank, a suitable technique is required. For that a data mining technique is being used to predict the dormant behaviour of various deposit bank customers. Various models are implemented and the results are analyzed of saving bank deposit customers. All these data are cleaned using data cleansing approach from the bank data warehouse.

Keywords: data mining, asset liability management, BASEL III, banking

Procedia PDF Downloads 539
24115 A Dynamic Ensemble Learning Approach for Online Anomaly Detection in Alibaba Datacenters

Authors: Wanyi Zhu, Xia Ming, Huafeng Wang, Junda Chen, Lu Liu, Jiangwei Jiang, Guohua Liu

Abstract:

Anomaly detection is a first and imperative step needed to respond to unexpected problems and to assure high performance and security in large data center management. This paper presents an online anomaly detection system through an innovative approach of ensemble machine learning and adaptive differentiation algorithms, and applies them to performance data collected from a continuous monitoring system for multi-tier web applications running in Alibaba data centers. We evaluate the effectiveness and efficiency of this algorithm with production traffic data and compare with the traditional anomaly detection approaches such as a static threshold and other deviation-based detection techniques. The experiment results show that our algorithm correctly identifies the unexpected performance variances of any running application, with an acceptable false positive rate. This proposed approach has already been deployed in real-time production environments to enhance the efficiency and stability in daily data center operations.

Keywords: Alibaba data centers, anomaly detection, big data computation, dynamic ensemble learning

Procedia PDF Downloads 185
24114 Crop Classification using Unmanned Aerial Vehicle Images

Authors: Iqra Yaseen

Abstract:

One of the well-known areas of computer science and engineering, image processing in the context of computer vision has been essential to automation. In remote sensing, medical science, and many other fields, it has made it easier to uncover previously undiscovered facts. Grading of diverse items is now possible because of neural network algorithms, categorization, and digital image processing. Its use in the classification of agricultural products, particularly in the grading of seeds or grains and their cultivars, is widely recognized. A grading and sorting system enables the preservation of time, consistency, and uniformity. Global population growth has led to an increase in demand for food staples, biofuel, and other agricultural products. To meet this demand, available resources must be used and managed more effectively. Image processing is rapidly growing in the field of agriculture. Many applications have been developed using this approach for crop identification and classification, land and disease detection and for measuring other parameters of crop. Vegetation localization is the base of performing these task. Vegetation helps to identify the area where the crop is present. The productivity of the agriculture industry can be increased via image processing that is based upon Unmanned Aerial Vehicle photography and satellite. In this paper we use the machine learning techniques like Convolutional Neural Network, deep learning, image processing, classification, You Only Live Once to UAV imaging dataset to divide the crop into distinct groups and choose the best way to use it.

Keywords: image processing, UAV, YOLO, CNN, deep learning, classification

Procedia PDF Downloads 94
24113 Unsupervised Text Mining Approach to Early Warning System

Authors: Ichihan Tai, Bill Olson, Paul Blessner

Abstract:

Traditional early warning systems that alarm against crisis are generally based on structured or numerical data; therefore, a system that can make predictions based on unstructured textual data, an uncorrelated data source, is a great complement to the traditional early warning systems. The Chicago Board Options Exchange (CBOE) Volatility Index (VIX), commonly referred to as the fear index, measures the cost of insurance against market crash, and spikes in the event of crisis. In this study, news data is consumed for prediction of whether there will be a market-wide crisis by predicting the movement of the fear index, and the historical references to similar events are presented in an unsupervised manner. Topic modeling-based prediction and representation are made based on daily news data between 1990 and 2015 from The Wall Street Journal against VIX index data from CBOE.

Keywords: early warning system, knowledge management, market prediction, topic modeling.

Procedia PDF Downloads 324
24112 The Role of Synthetic Data in Aerial Object Detection

Authors: Ava Dodd, Jonathan Adams

Abstract:

The purpose of this study is to explore the characteristics of developing a machine learning application using synthetic data. The study is structured to develop the application for the purpose of deploying the computer vision model. The findings discuss the realities of attempting to develop a computer vision model for practical purpose, and detail the processes, tools, and techniques that were used to meet accuracy requirements. The research reveals that synthetic data represents another variable that can be adjusted to improve the performance of a computer vision model. Further, a suite of tools and tuning recommendations are provided.

Keywords: computer vision, machine learning, synthetic data, YOLOv4

Procedia PDF Downloads 209
24111 Perception-Oriented Model Driven Development for Designing Data Acquisition Process in Wireless Sensor Networks

Authors: K. Indra Gandhi

Abstract:

Wireless Sensor Networks (WSNs) have always been characterized for application-specific sensing, relaying and collection of information for further analysis. However, software development was not considered as a separate entity in this process of data collection which has posed severe limitations on the software development for WSN. Software development for WSN is a complex process since the components involved are data-driven, network-driven and application-driven in nature. This implies that there is a tremendous need for the separation of concern from the software development perspective. A layered approach for developing data acquisition design based on Model Driven Development (MDD) has been proposed as the sensed data collection process itself varies depending upon the application taken into consideration. This work focuses on the layered view of the data acquisition process so as to ease the software point of development. A metamodel has been proposed that enables reusability and realization of the software development as an adaptable component for WSN systems. Further, observing users perception indicates that proposed model helps in improving the programmer's productivity by realizing the collaborative system involved.

Keywords: data acquisition, model-driven development, separation of concern, wireless sensor networks

Procedia PDF Downloads 421
24110 A Convergent Interacting Particle Method for Computing Kpp Front Speeds in Random Flows

Authors: Tan Zhang, Zhongjian Wang, Jack Xin, Zhiwen Zhang

Abstract:

We aim to efficiently compute the spreading speeds of reaction-diffusion-advection (RDA) fronts in divergence-free random flows under the Kolmogorov-Petrovsky-Piskunov (KPP) nonlinearity. We study a stochastic interacting particle method (IPM) for the reduced principal eigenvalue (Lyapunov exponent) problem of an associated linear advection-diffusion operator with spatially random coefficients. The Fourier representation of the random advection field and the Feynman-Kac (FK) formula of the principal eigenvalue (Lyapunov exponent) form the foundation of our method implemented as a genetic evolution algorithm. The particles undergo advection-diffusion and mutation/selection through a fitness function originated in the FK semigroup. We analyze the convergence of the algorithm based on operator splitting and present numerical results on representative flows such as 2D cellular flow and 3D Arnold-Beltrami-Childress (ABC) flow under random perturbations. The 2D examples serve as a consistency check with semi-Lagrangian computation. The 3D results demonstrate that IPM, being mesh-free and self-adaptive, is simple to implement and efficient for computing front spreading speeds in the advection-dominated regime for high-dimensional random flows on unbounded domains where no truncation is needed.

Keywords: KPP front speeds, random flows, Feynman-Kac semigroups, interacting particle method, convergence analysis

Procedia PDF Downloads 34
24109 Comparative Analysis of Data Gathering Protocols with Multiple Mobile Elements for Wireless Sensor Network

Authors: Bhat Geetalaxmi Jairam, D. V. Ashoka

Abstract:

Wireless Sensor Networks are used in many applications to collect sensed data from different sources. Sensed data has to be delivered through sensors wireless interface using multi-hop communication towards the sink. The data collection in wireless sensor networks consumes energy. Energy consumption is the major constraints in WSN .Reducing the energy consumption while increasing the amount of generated data is a great challenge. In this paper, we have implemented two data gathering protocols with multiple mobile sinks/elements to collect data from sensor nodes. First, is Energy-Efficient Data Gathering with Tour Length-Constrained Mobile Elements in Wireless Sensor Networks (EEDG), in which mobile sinks uses vehicle routing protocol to collect data. Second is An Intelligent Agent-based Routing Structure for Mobile Sinks in WSNs (IAR), in which mobile sinks uses prim’s algorithm to collect data. Authors have implemented concepts which are common to both protocols like deployment of mobile sinks, generating visiting schedule, collecting data from the cluster member. Authors have compared the performance of both protocols by taking statistics based on performance parameters like Delay, Packet Drop, Packet Delivery Ratio, Energy Available, Control Overhead. Authors have concluded this paper by proving EEDG is more efficient than IAR protocol but with few limitations which include unaddressed issues likes Redundancy removal, Idle listening, Mobile Sink’s pause/wait state at the node. In future work, we plan to concentrate more on these limitations to avail a new energy efficient protocol which will help in improving the life time of the WSN.

Keywords: aggregation, consumption, data gathering, efficiency

Procedia PDF Downloads 479
24108 Status and Results from EXO-200

Authors: Ryan Maclellan

Abstract:

EXO-200 has provided one of the most sensitive searches for neutrinoless double-beta decay utilizing 175 kg of enriched liquid xenon in an ultra-low background time projection chamber. This detector has demonstrated excellent energy resolution and background rejection capabilities. Using the first two years of data, EXO-200 has set a limit of 1.1x10^25 years at 90% C.L. on the neutrinoless double-beta decay half-life of Xe-136. The experiment has experienced a brief hiatus in data taking during a temporary shutdown of its host facility: the Waste Isolation Pilot Plant. EXO-200 expects to resume data taking in earnest this fall with upgraded detector electronics. Results from the analysis of EXO-200 data and an update on the current status of EXO-200 will be presented.

Keywords: double-beta, Majorana, neutrino, neutrinoless

Procedia PDF Downloads 401
24107 Remaining Useful Life (RUL) Assessment Using Progressive Bearing Degradation Data and ANN Model

Authors: Amit R. Bhende, G. K. Awari

Abstract:

Remaining useful life (RUL) prediction is one of key technologies to realize prognostics and health management that is being widely applied in many industrial systems to ensure high system availability over their life cycles. The present work proposes a data-driven method of RUL prediction based on multiple health state assessment for rolling element bearings. Bearing degradation data at three different conditions from run to failure is used. A RUL prediction model is separately built in each condition. Feed forward back propagation neural network models are developed for prediction modeling.

Keywords: bearing degradation data, remaining useful life (RUL), back propagation, prognosis

Procedia PDF Downloads 424
24106 Spatio-Temporal Data Mining with Association Rules for Lake Van

Authors: Tolga Aydin, M. Fatih Alaeddinoğlu

Abstract:

People, throughout the history, have made estimates and inferences about the future by using their past experiences. Developing information technologies and the improvements in the database management systems make it possible to extract useful information from knowledge in hand for the strategic decisions. Therefore, different methods have been developed. Data mining by association rules learning is one of such methods. Apriori algorithm, one of the well-known association rules learning algorithms, is not commonly used in spatio-temporal data sets. However, it is possible to embed time and space features into the data sets and make Apriori algorithm a suitable data mining technique for learning spatio-temporal association rules. Lake Van, the largest lake of Turkey, is a closed basin. This feature causes the volume of the lake to increase or decrease as a result of change in water amount it holds. In this study, evaporation, humidity, lake altitude, amount of rainfall and temperature parameters recorded in Lake Van region throughout the years are used by the Apriori algorithm and a spatio-temporal data mining application is developed to identify overflows and newly-formed soil regions (underflows) occurring in the coastal parts of Lake Van. Identifying possible reasons of overflows and underflows may be used to alert the experts to take precautions and make the necessary investments.

Keywords: apriori algorithm, association rules, data mining, spatio-temporal data

Procedia PDF Downloads 360
24105 Building Data Infrastructure for Public Use and Informed Decision Making in Developing Countries-Nigeria

Authors: Busayo Fashoto, Abdulhakeem Shaibu, Justice Agbadu, Samuel Aiyeoribe

Abstract:

Data has gone from just rows and columns to being an infrastructure itself. The traditional medium of data infrastructure has been managed by individuals in different industries and saved on personal work tools; one of such is the laptop. This hinders data sharing and Sustainable Development Goal (SDG) 9 for infrastructure sustainability across all countries and regions. However, there has been a constant demand for data across different agencies and ministries by investors and decision-makers. The rapid development and adoption of open-source technologies that promote the collection and processing of data in new ways and in ever-increasing volumes are creating new data infrastructure in sectors such as lands and health, among others. This paper examines the process of developing data infrastructure and, by extension, a data portal to provide baseline data for sustainable development and decision making in Nigeria. This paper employs the FAIR principle (Findable, Accessible, Interoperable, and Reusable) of data management using open-source technology tools to develop data portals for public use. eHealth Africa, an organization that uses technology to drive public health interventions in Nigeria, developed a data portal which is a typical data infrastructure that serves as a repository for various datasets on administrative boundaries, points of interest, settlements, social infrastructure, amenities, and others. This portal makes it possible for users to have access to datasets of interest at any point in time at no cost. A skeletal infrastructure of this data portal encompasses the use of open-source technology such as Postgres database, GeoServer, GeoNetwork, and CKan. These tools made the infrastructure sustainable, thus promoting the achievement of SDG 9 (Industries, Innovation, and Infrastructure). As of 6th August 2021, a wider cross-section of 8192 users had been created, 2262 datasets had been downloaded, and 817 maps had been created from the platform. This paper shows the use of rapid development and adoption of technologies that facilitates data collection, processing, and publishing in new ways and in ever-increasing volumes. In addition, the paper is explicit on new data infrastructure in sectors such as health, social amenities, and agriculture. Furthermore, this paper reveals the importance of cross-sectional data infrastructures for planning and decision making, which in turn can form a central data repository for sustainable development across developing countries.

Keywords: data portal, data infrastructure, open source, sustainability

Procedia PDF Downloads 80