Search results for: Data Analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 31270

30970 Government Big Data Ecosystem: A Systematic Literature Review

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Data that are high in volume, velocity, and veracity and that come from a variety of sources are generated in all sectors, including the government sector. Globally, public administrations are pursuing (big) data as a new technology and trying to adopt a data-centric architecture for hosting and sharing data. Properly executed, big data and data analytics in the government (big) data ecosystem can lead to data-driven government and have a direct impact on the way policymakers work and citizens interact with governments. In this research paper, we conduct a systematic literature review. The main aims of this paper are to highlight essential aspects of the government (big) data ecosystem and to explore the most critical socio-technical factors that contribute to the successful implementation of a government (big) data ecosystem. The essential aspects of the government (big) data ecosystem include definition, data types, data lifecycle models, and actors and their roles. We also discuss the potential impact of (big) data in public administration and gaps in the government data ecosystems literature. As this is a new topic, we did not find specific articles on the government (big) data ecosystem and therefore focused our research on various relevant areas such as humanitarian data, open government data, scientific research data, and industry data.

Keywords: applications of big data, big data, big data types, big data ecosystem, critical success factors, data-driven government, egovernment, gaps in data ecosystems, government (big) data, literature review, public administration, systematic review

Procedia PDF Downloads 221
30969 Advances in Mathematical Sciences: Unveiling the Power of Data Analytics

Authors: Zahid Ullah, Atlas Khan

Abstract:

The rapid advancements in data collection, storage, and processing capabilities have led to an explosion of data in various domains. In this era of big data, mathematical sciences play a crucial role in uncovering valuable insights and driving informed decision-making through data analytics. The purpose of this abstract is to present the latest advances in mathematical sciences and their application in harnessing the power of data analytics. This abstract highlights the interdisciplinary nature of data analytics, showcasing how mathematics intersects with statistics, computer science, and other related fields to develop cutting-edge methodologies. It explores key mathematical techniques such as optimization, mathematical modeling, network analysis, and computational algorithms that underpin effective data analysis and interpretation. The abstract emphasizes the role of mathematical sciences in addressing real-world challenges across different sectors, including finance, healthcare, engineering, social sciences, and beyond. It showcases how mathematical models and statistical methods extract meaningful insights from complex datasets, facilitating evidence-based decision-making and driving innovation. Furthermore, the abstract emphasizes the importance of collaboration and knowledge exchange among researchers, practitioners, and industry professionals. It recognizes the value of interdisciplinary collaborations and the need to bridge the gap between academia and industry to ensure the practical application of mathematical advancements in data analytics. The abstract highlights the significance of ongoing research in mathematical sciences and its impact on data analytics. It emphasizes the need for continued exploration and innovation in mathematical methodologies to tackle emerging challenges in the era of big data and digital transformation. 
In summary, this abstract sheds light on the advances in mathematical sciences and their pivotal role in unveiling the power of data analytics. It calls for interdisciplinary collaboration, knowledge exchange, and ongoing research to further unlock the potential of mathematical methodologies in addressing complex problems and driving data-driven decision-making in various domains.

Keywords: mathematical sciences, data analytics, advances, unveiling

Procedia PDF Downloads 84
30968 Research and Application of Multi-Scale Three Dimensional Plant Modeling

Authors: Weiliang Wen, Xinyu Guo, Ying Zhang, Jianjun Du, Boxiang Xiao

Abstract:

Reconstructing and analyzing three-dimensional (3D) models from in situ measured data is important for many research areas and applications in plant science, including plant phenotyping, functional-structural plant modeling (FSPM), plant germplasm resource protection, and agricultural technology popularization. Plant modeling spans many scales, from micro to macro: cell, tissue, organ, plant, and canopy. The techniques currently used for data capture, feature analysis, and 3D reconstruction differ considerably between scales. In this context, morphological data acquisition, 3D analysis, and modeling of plants at different scales are introduced systematically. The data capture equipment commonly used at these scales is introduced. Then the hot issues and difficulties at each scale are described. Some examples are also given, such as micron-scale phenotyping quantification and 3D microstructure reconstruction of vascular bundles within maize stalks based on micro-CT scanning, 3D reconstruction of leaf surfaces and feature extraction from point clouds acquired with a handheld 3D scanner, and plant modeling by combining parameter-driven 3D organ templates. Several applications of the 3D models and analysis results are also introduced. A 3D maize canopy was constructed, and light distribution was simulated within the canopy, which was used for the design of an ideal plant type. A grape tree model was constructed from 3D digital and point cloud data, which was used to produce science content for the 11th International Conference on Grapevine Breeding and Genetics. Using the tissue models of plants, Google Glass was used to look around inside the plant and understand its internal structure. With the development of information technology, 3D data acquisition and data processing techniques will play a greater role in plant science.

Keywords: plant, three dimensional modeling, multi-scale, plant phenotyping, three dimensional data acquisition

Procedia PDF Downloads 273
30967 Gendered Labelling and Its Effects on Vhavenda Women

Authors: Matodzi Rapalalani

Abstract:

In line with Spencer's (2018) classic labelling theory, labels influence the perceptions of both the individual and other members of society. That is, once labelled, the individual acts in ways that confirm the stereotypes attached to the label. This study, therefore, investigates the understanding of gendered labelling and its effects on Vhavenda women. Gender socialization and patriarchy have been viewed as the core causes of the problem. The literature review presented the development of gendered labelling, its forms, and other aspects. A qualitative method of data collection was used in this study, and semi-structured interviews were conducted. A total of six participants took part, as a small sample is easier to manage. Thematic analysis was used to interpret and analyze the data. Ethical issues such as confidentiality, informed consent, and voluntary participation were considered. Through the analysis and data interpretation, causes such as lack of Christian values, insecurities, and lust were mentioned, as well as effects such as frustration, increased divorce, and low self-esteem.

Keywords: gender, naming, Venda, women, African culture

Procedia PDF Downloads 87
30966 Methodologies, Findings, Discussion, and Limitations in Global, Multi-Lingual Research: We Are All Alone - Chinese Internet Drama

Authors: Patricia Portugal Marques de Carvalho Lourenco

Abstract:

A three-phase methodological multi-lingual path was designed, constructed and carried out using the 2020 Chinese Internet Drama Series We Are All Alone as a case study. Phase one, the backbone of the research, comprised secondary data analysis, providing the structure on which the next two phases would be built. Phase one incorporated a Google Scholar and a Baidu Index analysis, the Star Network Influence Index and the top two drama reviews on Mydramalist.com, along with an article written about the drama and scrutiny of related Chinese blogs and websites. Phase two was field research carried out across Latin Europe, and phase three focused on social media, taking into account that perceptions are conditioned by memory and the recall of past ideas. Overall, the research has shown the poor cultural expression of Chinese entertainment in Latin Europe and demonstrated the absence of Chinese content at French, Italian, Portuguese and Spanish business-to-consumer retailers; this reflects its low significance in Latin European markets and the short life cycle of entertainment products in general, which are bubble-gum, disposable goods without a mid- to long-term effect on consumers' lives. The process of conducting comprehensive international research was complex and time-consuming, complicated by data not always being available in Mandarin, the researcher’s linguistic deficiency, limited Chinese cultural knowledge, and issues of cultural equivalence. Despite steps taken to narrow the proposed international research, theoretical limitations common to Latin Europe and China still occurred. Data accuracy was disputable; sampling and data collection/analysis methods were heterogeneous; ascertaining data requirements and the method of analysis needed to achieve construct equivalence was challenging and slow to operationalize. Secondary data was also not often readily available in Mandarin; yet, in spite of the array of limitations, research was done, and results were produced.

Keywords: research methodologies, international research, primary data, secondary data, research limitations, online dramas, china, latin europe

Procedia PDF Downloads 64
30965 The Relationship Between Hourly Compensation and Unemployment Rate Using the Panel Data Regression Analysis

Authors: S. K. Ashiquer Rahman

Abstract:

This paper concentrates on the importance of hourly compensation, emphasizing its significance for the unemployment rate. Two of the most important indicators for a nation are its unemployment rate and hourly compensation. These are not merely statistics; they have profound effects on individuals, families, and the economy. They are inversely related to one another: the unemployment rate will probably decline as hourly compensation in manufacturing rises, so reduced unemployment and improved job prospects could result from higher compensation. For this reason, increased hourly compensation in the manufacturing sector could have a favorable effect on job-changing issues. Moreover, the relationship between hourly compensation and unemployment is complex and influenced by broader economic factors. In this paper, we use panel data regression models to evaluate the expected link between hourly compensation and the unemployment rate in order to determine the effect of hourly compensation on the unemployment rate. We estimate the fixed effects model (FEM), evaluate the error components model (ECM), and determine which of the two is better by pooling all 60 observations. We then analyze and review the data by comparing three countries (the United States, Canada and the United Kingdom) using panel data regression models. Finally, we provide results, analysis and a summary of the extensive research on how hourly compensation affects the unemployment rate. Additionally, this paper offers relevant and useful information to help the government and the academic community use an econometric and social approach to address the effect of hourly compensation on the unemployment rate.
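The fixed effects model mentioned in the abstract can be illustrated with a minimal sketch of the within (demeaning) estimator. This is not the authors' code or data: the three-country, 60-observation panel below is synthetic, and the function name `within_estimator`, the country intercepts, and the slope of -0.05 are all assumptions chosen only for illustration.

```python
import numpy as np

def within_estimator(y, x, groups):
    """Fixed-effects (within) slope: demean y and x inside each group, then run OLS."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    groups = np.asarray(groups)
    y_dm = y.copy()
    x_dm = x.copy()
    for g in np.unique(groups):
        mask = groups == g
        # Subtracting the group mean removes the time-invariant country effect.
        y_dm[mask] -= y[mask].mean()
        x_dm[mask] -= x[mask].mean()
    # One-regressor OLS on the demeaned data.
    return float(x_dm @ y_dm / (x_dm @ x_dm))

# Synthetic panel (hypothetical numbers): 3 countries x 20 periods = 60 observations.
rng = np.random.default_rng(0)
n_countries, n_periods = 3, 20
groups = np.repeat(np.arange(n_countries), n_periods)
country_effect = np.array([6.0, 7.5, 5.0])[groups]           # assumed country intercepts
compensation = rng.normal(100.0, 10.0, size=groups.size)     # assumed compensation index
true_slope = -0.05                                            # assumed inverse relationship
unemployment = (country_effect + true_slope * compensation
                + rng.normal(0.0, 0.1, size=groups.size))

beta_hat = within_estimator(unemployment, compensation, groups)
print(f"estimated slope: {beta_hat:.4f}")
```

Because each country's mean is removed before the regression, the estimate is unaffected by the country-specific intercepts; a random effects (error components) model would instead treat those intercepts as part of the error term.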

Keywords: hourly compensation, Unemployment rate, panel data regression models, dummy variables, random effects model, fixed effects model, the linear regression model

Procedia PDF Downloads 71
30964 Using Corpora in Semantic Studies of English Adjectives

Authors: Oxana Lukoshus

Abstract:

The methods of corpus linguistics, a well-established field of research, are being increasingly applied in cognitive linguistics. Corpus data are especially useful for quantitative studies of grammatical and other aspects of language. The main objective of this paper is to demonstrate how present-day corpora can be applied in semantic studies in general and in semantic studies of adjectives in particular. Polysemantic adjectives have been the subject of numerous studies, but most of them have been carried out on dictionaries. Undoubtedly, dictionaries are one of the basic data sources, but only at the initial stage of research. The author usually starts with the analysis of the lexicographic data, after which s/he comes up with a hypothesis. In the research conducted, three polysemantic synonyms (true, loyal, faithful) have been analyzed in terms of differences and similarities in their semantic structure. A corpus-based approach to the study of these adjectives involves the following. After the analysis of the dictionary data, two corpora were consulted to study the distributional patterns of the words under study: the British National Corpus (BNC) and the Corpus of Contemporary American English (COCA). These corpora are continually updated and contain thousands of examples of the words under research, which makes them a useful and convenient data source. For the purpose of this study there were no special requirements regarding genre, mode or time of the texts included in the corpora. Out of the range of possibilities offered by corpus-analysis software (e.g. word lists, statistics of word frequencies, etc.), the most useful tool for the semantic analysis was extracting a list of co-occurrences for the given search words. Searching by lemmas, e.g. true, true to, and grouping the results by lemmas proved to be the most efficient corpus feature for the adjectives under study.
Following the search process, the corpora provided a list of co-occurrences, which were then analyzed and classified. Not every co-occurrence was relevant for the analysis. For example, phrases like "An enormous sense of responsibility to protect the minds and hearts of the faithful from incursions by the state was perceived to be the basic duty of the church leaders" or "‘True,’ said Phoebe, ‘but I'd probably get to be a Union Official immediately’" were left out, as in the first example the faithful is a substantivized adjective and in the second example true is used alone with no other parts of speech. The subsequent analysis of the corpus data provided the grounds for the distribution groups of the adjectives under study, which were then investigated with the help of a semantic experiment. To sum up, the corpus-based approach has proved to be a powerful, reliable and convenient tool for obtaining data for further semantic study.

Keywords: corpora, corpus-based approach, polysemantic adjectives, semantic studies

Procedia PDF Downloads 308
30963 Intelligent Production Machine

Authors: A. Şahinoğlu, R. Gürbüz, A. Güllü, M. Karhan

Abstract:

This study aims to enable production machines to automatically perceive cutting data and alter cutting parameters. The two most important parameters that must be checked in the machine control unit are feed rate and speed. These parameters are to be controlled using the sounds of the machine. The features of the optimum sound are introduced to the computer. During the process, real-time data is received and converted into numerical values by Matlab software. According to these values, feed rate and speed are decreased or increased at a certain rate until the optimum sound is acquired. The cutting process is then carried out with the optimum cutting parameters. During chip removal, the features of the cutting tool, the kind of material being cut, the cutting parameters, and the machine used all affect various parameters. Instead of measuring parameters such as temperature, vibration, and tool wear that arise during cutting, detailed analysis of the sound produced during cutting allows the various data involved in the cutting process to be detected in a much easier and more economical way. The relation between cutting parameters and sound is being identified.

Keywords: cutting process, sound processing, intelligent lathe, sound analysis

Procedia PDF Downloads 329
30962 Legal Regulation of Personal Information Data Transmission Risk Assessment: A Case Study of the EU’s DPIA

Authors: Cai Qianyi

Abstract:

In the midst of a global digital revolution, the flow of data poses security threats that call China's existing legislative framework for protecting personal information into question. As a preliminary procedure for risk analysis and prevention, the risk assessment of personal data transmission lacks detailed supporting guidelines. Existing provisions reveal unclear responsibilities for network operators and weakened rights for data subjects. Furthermore, the regulatory system's weak operability and a lack of industry self-regulation heighten data transmission hazards. This paper aims to compare the regulatory pathways for data transmission risks in China and Europe from the perspectives of legal framework and content. It draws on the “Data Protection Impact Assessment Guidelines” to empower multiple stakeholders, including data processors, controllers, and subjects, while also defining obligations. In conclusion, this paper intends to address China's digital security shortcomings by developing a more mature regulatory framework and industry self-regulation mechanisms, resulting in a win-win situation for personal data protection and the development of the digital economy.

Keywords: personal information data transmission, risk assessment, DPIA, internet service provider

Procedia PDF Downloads 52
30961 Industry 4.0 and Supply Chain Integration: Case of Tunisian Industrial Companies

Authors: Rym Ghariani, Ghada Soltane, Younes Boujelbene

Abstract:

Industry 4.0, a set of emerging smart and digital technologies, has been the main focus of operations management researchers and practitioners in recent years. The objective of this research paper is to study the impact of Industry 4.0 on supply chain integration (SCI) in Tunisian industrial companies. A conceptual model to study the relationship between Industry 4.0 technologies and supply chain integration was designed. This model contains three explanatory variables (Big data, Internet of Things, and Robotics) and one variable to be explained (supply chain integration). In order to answer our research questions and investigate the research hypotheses, principal component analysis and discriminant analysis were performed using SPSS 26 software. The results reveal a statistically significant positive impact of Industry 4.0 (Big data, Internet of Things and Robotics) on the integration of the supply chain. Interestingly, big data has a greater positive impact on supply chain integration than the Internet of Things and robotics.

Keywords: industry 4.0 (I4.0), big data, internet of things, robotics, supply chain integration

Procedia PDF Downloads 48
30960 Blood Glucose Measurement and Analysis: Methodology

Authors: I. M. Abd Rahim, H. Abdul Rahim, R. Ghazali

Abstract:

Researchers have developed numerous non-invasive blood glucose measurement techniques, and near infrared (NIR) is currently among the most promising. However, there is some disagreement on the optimal wavelength range to use as the reference for glucose in the blood. This paper focuses on the experimental data collection technique and on the analysis method used to analyze the data gained from the experiment. The selection of a suitable linear or non-linear model structure is essential in a prediction system, as the system developed needs to be reasonably accurate.

Keywords: linear, near-infrared (NIR), non-invasive, non-linear, prediction system

Procedia PDF Downloads 452
30959 Application of Neutron Stimulated Gamma Spectroscopy for Soil Elemental Analysis and Mapping

Authors: Aleksandr Kavetskiy, Galina Yakubova, Nikolay Sargsyan, Stephen A. Prior, H. Allen Torbert

Abstract:

Determining soil elemental content and distribution (mapping) within a field is a key feature of modern agricultural practice. While traditional chemical analysis is a time-consuming and labor-intensive multi-step process (e.g., sample collection, transport to laboratory, physical preparation, and chemical analysis), neutron-gamma soil analysis can be performed in situ. This analysis is based on the registration of gamma rays emitted by nuclei upon interaction with neutrons. Soil elements such as Si, C, Fe, O, Al, K, and H (moisture) can be assessed with this method. The neutron-gamma analysis system developed for field application consisted of an MP320 Neutron Generator (Thermo Fisher Scientific, Inc.), 3 sodium iodide gamma detectors (SCIONIX, Inc.) with a total volume of 7 liters, 'split electronics' (XIA, LLC), a power system, and an operational computer. Paired with GPS, this system can be used in scanning mode to acquire gamma spectra while traversing a field. Using the acquired spectra, soil elemental content can be calculated. These data can be combined with geographical coordinates in a geographical information system (i.e., ArcGIS) to produce elemental distribution maps suitable for agricultural purposes. Special software has been developed that acquires gamma spectra, processes and sorts data, calculates soil elemental content, and combines these data with measured geographic coordinates to create soil elemental distribution maps. For example, 5.5 hours was needed to acquire the data necessary for creating a carbon distribution map of an 8.5 ha field. This paper will briefly describe the physics behind the neutron-gamma analysis method, the physical construction of the measurement system, and its main characteristics and modes of operation when conducting field surveys.
Soil elemental distribution maps resulting from field surveys will be presented and discussed. These maps were similar to maps created on the basis of chemical analysis and to soil moisture measurements determined by soil electrical conductivity. The maps created by neutron-gamma analysis were also reproducible. Based on these facts, it can be asserted that neutron stimulated soil gamma spectroscopy paired with a GPS system is fully applicable to agricultural field mapping of soil elements.

Keywords: ArcGIS mapping, neutron gamma analysis, soil elemental content, soil gamma spectroscopy

Procedia PDF Downloads 130
30958 Emerging Research Trends in Routing Protocol for Wireless Sensor Network

Authors: Subhra Prosun Paul, Shruti Aggarwal

Abstract:

Nowadays, routing protocols in Wireless Sensor Networks have become a promising technique in different fields of the latest computer technology. Routing in a Wireless Sensor Network is a demanding task due to the different design issues of the sensor nodes. Network architecture, number of nodes, routing traffic, the capacity of each sensor node, network consistency, and service value are important factors in the design and analysis of routing protocols in Wireless Sensor Networks. Additionally, internal energy, the distance between nodes, and the load of sensor nodes play a significant role in an efficient routing protocol. In this paper, our intention is to analyze the research trends in different routing protocols of Wireless Sensor Networks in terms of different parameters. To explain these research trends, data related to this research topic are analyzed with the help of the Web of Science and Scopus databases, which we used separately. The data analysis is performed from a global perspective, taking into account parameters such as author, source, document, country, organization, keyword, year, and number of publications. Different types of experiments are also performed, which help us to evaluate the recent research tendency in routing protocols of Wireless Sensor Networks. We have observed tremendous development of research on this topic in the last few years, as it has become increasingly popular.

Keywords: analysis, routing protocol, research trends, wireless sensor network

Procedia PDF Downloads 208
30957 Developing Logistics Indices for Turkey as an Indicator of Economic Activity

Authors: Gizem İntepe, Eti Mizrahi

Abstract:

Investment and financing decisions are influenced by various economic features. Detailed analysis should be conducted to support decisions not only by companies but also by governments. Such analysis can be conducted either at the company level or on a sectoral basis to reduce risks and to maximize profits. Sectoral disaggregation caused by seasonality effects, subventions, and data advantages or disadvantages may appear in sectors that behave in parallel with the BIST (Borsa Istanbul stock exchange) Index. The proposed logistics indices could serve market needs as a decision parameter on a sectoral basis and also help forecast changes in import and export volume. They are also an indicator of logistics activity, which is in turn a sign of economic mobility at the national level. Publicly available data from the “Ministry of Transport, Maritime Affairs and Communications” and the “Turkish Statistical Institute” are used to obtain five logistics indices, namely the exLogistic, imLogistic, fLogistic, dLogistic and cLogistic indices. Then, the efficiency and reliability of these indices are tested.

Keywords: economic activity, export trade data, import trade data, logistics indices

Procedia PDF Downloads 328
30956 The Quality of Food and Drink Product Labels Translation from Indonesian into English

Authors: Rudi Hartono, Bambang Purwanto

Abstract:

The translation quality of food and drink labels from Indonesian into English is poor because the translations are inaccurate, unnatural, and difficult to read. Such label translations can be found on the cans and packages of food and drink products produced and marketed by several companies in Indonesia. If this problem is left unchecked, it will lead to misunderstanding of the translations and confuse consumers. This study was conducted to analyze the translation errors on food and drink product labels and to formulate a solution for better translation quality. The research design was evaluation research with a holistic criticism approach. The data used were words, phrases, and sentences translated from Indonesian into English and printed on food and drink product labels. The data were processed using Interactive Model Analysis, which involves three main steps: collecting, classifying, and verifying data. Furthermore, the data were analyzed using content analysis to assess the accuracy, naturalness, and readability of the translations. The results showed that the translations of food and drink product labels from Indonesian into English reach a level of accuracy of 60%, a level of naturalness of 50%, and a level of readability of 60%. This finding calls for an effective strategy for translating food and drink product labels in the future.

Keywords: translation quality, food and drink product labels, a holistic criticism approach, interactive model, content analysis

Procedia PDF Downloads 358
30955 Empirical Acceleration Functions and Fuzzy Information

Authors: Muhammad Shafiq

Abstract:

In accelerated life testing approaches, lifetime data are obtained under various conditions that are more severe than the usual condition. Classical techniques are based on precise measurements and are used to model variation among the observations. In fact, there are two types of uncertainty in data: variation among the observations and fuzziness. Analysis techniques that do not consider fuzziness and are based only on precise lifetime observations lead to pseudo results. This study aimed to examine the behavior of empirical acceleration functions using fuzzy lifetime data. The results showed increased fuzziness in the transformed lifetimes as compared to the input data.

Keywords: acceleration function, accelerated life testing, fuzzy number, non-precise data

Procedia PDF Downloads 291
30954 Performance Study of PV Power Plants in Algeria

Authors: Razika Ihaddadene, Nabila Ihaddadene

Abstract:

This paper aims to highlight the importance of applying the IEC 61724 standard in the performance analysis of photovoltaic power plants on a monthly and annual scale. In addition, two photovoltaic power plants in two different climates were compared in order to determine the effect of climatic parameters on photovoltaic performance. All data from the Ain Skhouna and Adrar photovoltaic power plants for 2018 and the data from the Saida1 field for one month in 2019 were used. The results of the performance analysis according to the indicated standard show that the Saida PV power plant performs better than the Adrar PV power plant, which is attributed to the effect of higher ambient temperature. Increasing ambient temperature increases losses and decreases system efficiency and performance ratio. Ambient temperature is therefore a key factor in the proper functioning of PV plants.
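The performance ratio referred to in the abstract is defined in IEC 61724 as the final yield divided by the reference yield. The sketch below shows that arithmetic only; the function name `performance_ratio` and all input values are hypothetical and are not taken from the Algerian plants studied.

```python
def performance_ratio(energy_kwh, rated_power_kwp, irradiation_kwh_m2, g_ref_kw_m2=1.0):
    """Performance ratio PR = Yf / Yr.

    Yf (final yield)     = net energy output / rated DC power        [kWh/kWp]
    Yr (reference yield) = in-plane irradiation / reference irradiance [h]
    g_ref_kw_m2 is the reference irradiance at standard test conditions (1 kW/m^2).
    """
    final_yield = energy_kwh / rated_power_kwp
    reference_yield = irradiation_kwh_m2 / g_ref_kw_m2
    return final_yield / reference_yield

# Hypothetical monthly figures for a 20 MWp plant (illustrative only).
pr = performance_ratio(
    energy_kwh=3_000_000.0,      # assumed net energy delivered in the month
    rated_power_kwp=20_000.0,    # assumed installed capacity
    irradiation_kwh_m2=187.5,    # assumed in-plane irradiation for the month
)
print(f"performance ratio: {pr:.2f}")
```

Because the ratio normalizes output by the available irradiation, it isolates losses such as those caused by high module temperature, which is why it is convenient for comparing plants in different climates.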

Keywords: pv power plants, IEC 61724 norm, grid connected pv, algeria

Procedia PDF Downloads 72
30953 A Social Cognitive Investigation in the Context of Vocational Training Performance of People with Disabilities

Authors: Majid A. AlSayari

Abstract:

The study reported here investigated social cognitive theory (SCT) in the context of Vocational Rehab (VR) for people with disabilities. The prime purpose was to increase knowledge of VR phenomena and make recommendations for improving VR services. The sample consisted of 242 persons with Spinal Cord Injuries (SCI) who completed questionnaires. A further 32 participants were Trainers. Analysis of questionnaire data was carried out using factor analysis, multiple regression analysis, and thematic analysis. The analysis suggested that, in motivational terms, and consistent with research carried out in other academic contexts, self-efficacy was the best predictor of VR performance. The author concludes that VR self-efficacy predicted VR training performance.

Keywords: people with physical disabilities, social cognitive theory, self-efficacy, vocational training

Procedia PDF Downloads 305
30952 An Analysis on Clustering Based Gene Selection and Classification for Gene Expression Data

Authors: K. Sathishkumar, V. Thiagarasu

Abstract:

Due to recent advances in DNA microarray technology, it is now feasible to obtain gene expression profiles of tissue samples at relatively low cost. Many scientists around the world take advantage of this gene profiling to characterize complex biological circumstances and diseases. Microarray techniques used in genome-wide gene expression and genome mutation analysis help scientists and physicians in understanding pathophysiological mechanisms, in diagnoses and prognoses, and in choosing treatment plans. DNA microarray technology has now made it possible to simultaneously monitor the expression levels of thousands of genes during important biological processes and across collections of related samples. Elucidating the patterns hidden in gene expression data offers a tremendous opportunity for an enhanced understanding of functional genomics. However, the large number of genes and the complexity of biological networks greatly increase the challenges of comprehending and interpreting the resulting mass of data, which often consists of millions of measurements. A first step toward addressing this challenge is the use of clustering techniques, which are essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. This work presents an analysis of several clustering algorithms proposed to deal with gene expression data effectively. Existing algorithms such as Support Vector Machine (SVM), the K-means algorithm, and evolutionary algorithms are analyzed thoroughly to identify their advantages and limitations. The performance of the existing algorithms is evaluated to determine the best approach. In order to improve the classification performance of the best approach in terms of accuracy, convergence behavior and processing time, a hybrid clustering-based optimization approach has been proposed.
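As a rough illustration of the clustering step the abstract describes, the following sketch runs plain K-means (Lloyd's algorithm) on a small synthetic gene-by-sample expression matrix. It is not the hybrid approach proposed by the authors; the data, the function name `kmeans`, and the farthest-point initialization are assumptions made here to keep the example deterministic.

```python
import numpy as np

def kmeans(data, k, n_iter=50):
    """Plain K-means: assign each row to its nearest center, then recompute centers."""
    data = np.asarray(data, dtype=float)
    # Farthest-point initialization (an assumed, deterministic choice):
    # start from the first row, then repeatedly add the row farthest from all centers.
    centers = [data[0]]
    for _ in range(1, k):
        d = np.min([np.sum((data - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(data[np.argmax(d)])
    centers = np.array(centers)

    for _ in range(n_iter):
        # Squared Euclidean distance from every row to every center.
        dist = np.sum((data[:, None, :] - centers[None, :, :]) ** 2, axis=2)
        labels = np.argmin(dist, axis=1)
        new_centers = centers.copy()
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster is empty
                new_centers[j] = data[labels == j].mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic expression matrix: 20 genes x 6 samples, two co-expressed groups.
rng = np.random.default_rng(1)
low = rng.normal(0.0, 0.5, size=(10, 6))    # assumed weakly expressed genes
high = rng.normal(5.0, 0.5, size=(10, 6))   # assumed strongly expressed genes
expression = np.vstack([low, high])

labels, centers = kmeans(expression, k=2)
print(labels)
```

Real microarray data would normally be normalized and filtered before clustering, and the number of clusters is rarely known in advance; both are outside the scope of this sketch.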

Keywords: microarray technology, gene expression data, clustering, gene selection

Procedia PDF Downloads 319
30951 Status and Results from EXO-200

Authors: Ryan Maclellan

Abstract:

EXO-200 has provided one of the most sensitive searches for neutrinoless double-beta decay utilizing 175 kg of enriched liquid xenon in an ultra-low background time projection chamber. This detector has demonstrated excellent energy resolution and background rejection capabilities. Using the first two years of data, EXO-200 has set a limit of 1.1x10^25 years at 90% C.L. on the neutrinoless double-beta decay half-life of Xe-136. The experiment has experienced a brief hiatus in data taking during a temporary shutdown of its host facility: the Waste Isolation Pilot Plant. EXO-200 expects to resume data taking in earnest this fall with upgraded detector electronics. Results from the analysis of EXO-200 data and an update on the current status of EXO-200 will be presented.

Keywords: double-beta, Majorana, neutrino, neutrinoless

Procedia PDF Downloads 407
30950 Risk Factors’ Analysis on Shanghai Carbon Trading

Authors: Zhaojun Wang, Zongdi Sun, Zhiyuan Liu

Abstract:

First, the carbon trading price and trading volume in Shanghai are transformed by Fourier transform to obtain the frequency response diagram. The frequency response diagram is then analyzed and a Blackman filter is designed. The Blackman filter is applied to the data, yielding the filtered carbon trading time-domain and frequency response diagrams. After wavelet analysis, the carbon trading data are processed to obtain the average values over 5, 10, 20, 30, and 60 days. Finally, these data are used as input to a Back Propagation (BP) Neural Network model for prediction.
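The filtering and averaging steps of this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the price series is synthetic, the filter length (31 taps) and cutoff (0.05 cycles per day) are arbitrary choices, the helper names `blackman_lowpass` and `moving_average` are hypothetical, and the wavelet analysis and BP neural network stages are omitted.

```python
import numpy as np

def blackman_lowpass(num_taps: int, cutoff: float) -> np.ndarray:
    """Windowed-sinc low-pass FIR kernel.

    cutoff is in cycles per sample (0 < cutoff < 0.5). Hypothetical helper.
    """
    n = np.arange(num_taps) - (num_taps - 1) / 2.0
    kernel = 2.0 * cutoff * np.sinc(2.0 * cutoff * n)  # ideal low-pass response
    kernel *= np.blackman(num_taps)                    # taper with a Blackman window
    return kernel / kernel.sum()                       # normalise to unit gain at DC

def moving_average(x: np.ndarray, window: int) -> np.ndarray:
    """Mean over each run of `window` consecutive samples."""
    return np.convolve(x, np.ones(window) / window, mode="valid")

# Synthetic stand-in for a daily carbon price series (assumption, not real data).
rng = np.random.default_rng(0)
t = np.arange(300)
prices = 40.0 + 0.01 * t + 2.0 * np.sin(2 * np.pi * t / 60) + rng.normal(0, 0.5, t.size)

# Magnitude spectrum of the mean-removed series, used to choose a cutoff.
spectrum = np.abs(np.fft.rfft(prices - prices.mean()))

# Low-pass filter the series, then compute the averages named in the abstract.
kernel = blackman_lowpass(num_taps=31, cutoff=0.05)
smoothed = np.convolve(prices, kernel, mode="same")
averages = {w: moving_average(smoothed, w) for w in (5, 10, 20, 30, 60)}
```

The resulting `averages` arrays are the kind of features that could then be passed to a prediction model; how the paper actually assembles its network inputs is not stated in the abstract.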

Keywords: Shanghai carbon trading, carbon trading price, carbon trading volume, wavelet analysis, BP neural network model

Procedia PDF Downloads 385
30949 Application of Observational Medical Outcomes Partnership-Common Data Model (OMOP-CDM) Database in Nursing Health Problems with Prostate Cancer-a Pilot Study

Authors: Hung Lin-Zin, Lai Mei-Yen

Abstract:

Prostate cancer is the most commonly diagnosed male cancer in the U.S., with a prevalence of around 1 in 8. The etiology of prostate cancer is still unknown, but some predisposing factors, such as age, black race, family history, and obesity, may increase the risk of the disease. In 2020, a total of 7,178 Taiwanese people were newly diagnosed with prostate cancer, accounting for 5.88% of all cancer cases, and the incidence rate ranked fifth among men. In that year, the total number of deaths from prostate cancer was 1,730, accounting for 3.45% of all cancer deaths, and the death rate ranked sixth among men, accounting for 94.34% of the cases of cancer of the male reproductive organs. A search of the domestic and foreign literature on the use of the OMOP (Observational Medical Outcomes Partnership) database for analysis finds nearly a hundred published articles, but literature on nursing-related health problems and nursing measures built in the OMOP common data model databases of medical institutions is extremely rare. The OMOP common data model analysis platform is a system developed by the FDA in 2007 that uses a common data model (CDM) to analyze and monitor healthcare data. It is important to build up relevant nursing information from the OMOP-CDM database to assist our daily practice. Therefore, we chose prostate cancer patients, who are among the patients we care for most often, and used the OMOP-CDM database to explore their common associated health problems. With the assistance of OMOP-CDM database analysis, we can expect early diagnosis and prevention of prostate cancer patients' comorbidities to improve patient care.

Keywords: OMOP, nursing diagnosis, health problem, prostate cancer

Procedia PDF Downloads 46
30948 Business Intelligence for Profiling of Telecommunication Customer

Authors: Rokhmatul Insani, Hira Laksmiwati Soemitro

Abstract:

Business intelligence is a methodology that systematically exploits data to produce information and knowledge, and it can support the decision-making process. Two methods in business intelligence are data warehousing and data mining. A data warehouse stores historical data derived from transactional data. For data modelling in the data warehouse, we apply Kimball's dimensional modelling. Data mining is used to extract patterns and gain insight from the data. Data mining has many techniques, one of which is segmentation. To profile telecommunication customers, we segment them according to their usage of services, invoices, and payments. Customers can be grouped according to their characteristics, and the profitable customers can be identified. We apply the K-Means clustering algorithm for segmentation, with the RFM (Recency, Frequency, and Monetary) model as its input variables. All data mining processes are carried out with IBM SPSS Modeler.
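As a rough illustration of RFM-based segmentation with K-Means, the sketch below clusters a tiny, made-up customer table. It is written in plain NumPy rather than IBM SPSS Modeler, which the abstract names as the actual tool. The six customers, the choice of two clusters, and the `kmeans` helper are all assumptions for illustration.

```python
import numpy as np

def kmeans(points: np.ndarray, k: int, n_iter: int = 100, seed: int = 0):
    """Plain Lloyd's algorithm; returns (labels, centroids). Hypothetical helper."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Recompute each centroid; keep the old one if a cluster is empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Final assignment against the final centroids.
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1), centroids

# Made-up RFM table: recency (days since last payment), frequency (invoices),
# monetary (total amount paid). Values are illustrative only.
rfm = np.array([
    [5.0, 40.0, 900.0],
    [7.0, 35.0, 850.0],
    [3.0, 42.0, 950.0],
    [90.0, 2.0, 40.0],
    [120.0, 1.0, 25.0],
    [100.0, 3.0, 60.0],
])

# Standardise each column so no single variable dominates the distance.
scaled = (rfm - rfm.mean(axis=0)) / rfm.std(axis=0)
labels, centroids = kmeans(scaled, k=2)

# Treat the segment with the highest mean monetary value as the profitable one.
profitable = int(np.argmax([rfm[labels == j, 2].mean() for j in range(2)]))
```

Standardising before clustering is a design choice made here because the three RFM variables have very different scales; the abstract does not say how the authors preprocessed their inputs.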

Keywords: business intelligence, customer segmentation, data warehouse, data mining

Procedia PDF Downloads 473
30947 Network Analysis of Genes Involved in the Biosynthesis of Medicinally Important Naphthodianthrone Derivatives of Hypericum perforatum

Authors: Nafiseh Noormohammadi, Ahmad Sobhani Najafabadi

Abstract:

Hypericins (hypericin and pseudohypericin) are natural naphthodianthrone derivatives produced by Hypericum perforatum (St. John’s Wort), which have many medicinal properties such as antitumor, antineoplastic, antiviral, and antidepressant activities. Production and accumulation of hypericin in the plant are influenced by both genetic and environmental conditions. Despite the existence of different high-throughput data on the plant, the genetic dimensions of hypericin biosynthesis have not yet been completely understood. In this research, 21 high-quality RNA-seq datasets on different parts of the plant were integrated with metabolic data to reconstruct a coexpression network. Results showed that a cluster of 30 transcripts was correlated with total hypericin. The identified transcripts were divided into three main groups based on their functions, including hypericin biosynthesis genes, transporters, detoxification genes, and transcription factors (TFs). In the biosynthetic group, different isoforms of polyketide synthases (PKSs) and phenolic oxidative coupling proteins (POCPs) were identified. Phylogenetic analysis of protein sequences integrated with gene expression analysis showed that some of the POCPs seem to be very important in the biosynthetic pathway of hypericin. In the TFs group, six TFs were correlated with total hypericin. qPCR analysis of these six TFs confirmed that three of them were highly correlated. The identified genes in this research are a rich resource for further studies on the molecular breeding of H. perforatum in order to obtain varieties with high hypericin production.

Keywords: hypericin, St. John’s Wort, data mining, transcription factors, secondary metabolites

Procedia PDF Downloads 84
30946 An Exploratory Sequential Design: A Mixed Methods Model for the Statistics Learning Assessment with a Bayesian Network Representation

Authors: Zhidong Zhang

Abstract:

This study established a mixed methods model for assessing statistics learning with Bayesian network models. There are three variants of exploratory sequential designs. One of the designs has three linked steps: qualitative data collection and analysis; quantitative measure, instrument, or intervention development; and quantitative data collection and analysis. The study used a scoring model of analysis of variance (ANOVA) as the content domain. The study aimed to examine students’ learning in both semantic and performance aspects at a fine-grained level. The ANOVA score model, y = α + βx1 + γx1 + ε, was used as a cognitive task to collect data during the student learning process. When the learning processes were decomposed into multiple steps in both semantic and performance aspects, a hierarchical Bayesian network was established. This is a theory-driven process. The hierarchical structure was obtained from qualitative cognitive analysis. The data from students’ ANOVA score model learning were used as evidence for the hierarchical Bayesian network model through the evidential variables. Finally, the assessment results of students’ ANOVA score model learning were reported. In brief, this was a mixed methods research design applied to statistics learning assessment. Mixed methods designs open more possibilities for researchers to establish advanced quantitative models that start from a theory-driven qualitative mode.

Keywords: exploratory sequential design, ANOVA score model, Bayesian network model, mixed methods research design, cognitive analysis

Procedia PDF Downloads 166
30945 Assessment of Routine Health Information System (RHIS) Quality Assurance Practices in Tarkwa Sub-Municipal Health Directorate, Ghana

Authors: Richard Okyere Boadu, Judith Obiri-Yeboah, Kwame Adu Okyere Boadu, Nathan Kumasenu Mensah, Grace Amoh-Agyei

Abstract:

Routine health information system (RHIS) quality assurance has become an important issue, not only because of its significance in promoting a high standard of patient care but also because of its impact on government budgets for the maintenance of health services. A routine health information system comprises healthcare data collection, compilation, storage, analysis, report generation, and dissemination on a routine basis in various healthcare settings. The data from RHIS give a representation of health status, health services, and health resources. The sources of RHIS data are normally individual health records, records of services delivered, and records of health resources. Using reliable information from routine health information systems is fundamental in the healthcare delivery system. Quality assurance practices are measures that are put in place to ensure the health data that are collected meet required quality standards. Routine health information system quality assurance practices ensure that data that are generated from the system are fit for use. This study considered quality assurance practices in the RHIS processes. Methods: A cross-sectional study was conducted in eight health facilities in Tarkwa Sub-Municipal Health Service in the western region of Ghana. The study involved routine quality assurance practices among the 90 health staff and management selected from facilities in Tarkwa Sub-Municipal who collected or used data routinely from 24th December 2019 to 20th January 2020. Results: Generally, Tarkwa Sub-Municipal health service appears to practice quality assurance during data collection, compilation, storage, analysis and dissemination. The results show some achievement in quality control performance in report dissemination (77.6%), data analysis (68.0%), data compilation (67.4%), report compilation (66.3%), data storage (66.3%) and collection (61.1%). 
Conclusions: Even though the Tarkwa Sub-Municipal Health Directorate engages in some control measures to ensure data quality, there is a need to strengthen the process to achieve the targeted percentage of performance (90.0%). There was a significant shortfall in quality assurance practices performance, especially during data collection, with respect to the expected performance.

Keywords: quality assurance practices, assessment of routine health information system quality, routine health information system, data quality

Procedia PDF Downloads 70
30944 Investigating Dynamic Transition Process of Issues Using Unstructured Text Analysis

Authors: Myungsu Lim, William Xiu Shun Wong, Yoonjin Hyun, Chen Liu, Seongi Choi, Dasom Kim, Namgyu Kim

Abstract:

The amount of real-time data generated through various mass media has been increasing rapidly. In this study, we performed topic analysis using unstructured text data distributed through news articles. As one of the most prevalent applications of topic analysis, the issue tracking technique investigates changes in the social issues identified through topic analysis. Currently, traditional issue tracking is conducted by identifying the main topics of documents that cover an entire period at the same time and analyzing the occurrence of each topic by period. However, this traditional approach has the limitation that it cannot discover the dynamic mutation process of complex social issues. The purpose of this study is to overcome this limitation of the existing issue tracking method. We first derived the core issues of each period and then discovered the dynamic mutation process of various issues. We further analyzed the mutation process from the perspective of issue categories in order to figure out the pattern of issue flow, including the frequency and reliability of the pattern. In other words, this study allows us to understand the components of complex issues by tracking the dynamic history of issues. This methodology can facilitate a clearer understanding of complex social phenomena by providing the mutation history and related category information of the phenomena.

Keywords: data mining, issue tracking, text mining, topic analysis, topic detection, trend detection

Procedia PDF Downloads 399
30943 Multivariate Data Analysis for Automatic Atrial Fibrillation Detection

Authors: Zouhair Haddi, Stephane Delliaux, Jean-Francois Pons, Ismail Kechaf, Jean-Claude De Haro, Mustapha Ouladsine

Abstract:

Atrial fibrillation (AF) is considered the most common cardiac arrhythmia and a major public health burden associated with significant morbidity and mortality. Nowadays, telemedical approaches targeting cardiac outpatients place AF among the most challenging medical issues. Automatic, early, and fast AF detection is still a major concern for healthcare professionals. Several algorithms based on univariate analysis have been developed to detect atrial fibrillation. However, the published results do not show satisfactory classification accuracy. This work aimed to resolve this shortcoming by proposing multivariate data analysis methods for automatic AF detection. Four publicly accessible sets of clinical data (AF Termination Challenge Database, MIT-BIH AF, Normal Sinus Rhythm RR Interval Database, and MIT-BIH Normal Sinus Rhythm Database) were used for assessment. All time series were segmented into 1-min RR interval windows, and four specific features were then calculated. Two pattern recognition methods, i.e., Principal Component Analysis (PCA) and a Learning Vector Quantization (LVQ) neural network, were used to develop classification models. PCA, as a feature reduction method, was employed to find important features to discriminate between AF and Normal Sinus Rhythm. Despite its very simple structure, the LVQ model performs better on the analyzed databases than existing algorithms, with high sensitivity and specificity (99.19% and 99.39%, respectively). The proposed AF detection method has several interesting properties and can be implemented with just a few arithmetical operations, which makes it a suitable choice for telecare applications.
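The feature-extraction and PCA steps might look roughly like the sketch below. The abstract does not name its four features, so the ones used here (mean RR, standard deviation, RMSSD, and the fraction of successive differences above 50 ms) are assumptions taken from common heart-rate-variability practice. The RR windows are synthetic, and the LVQ classifier is not shown.

```python
import numpy as np

def rr_features(rr: np.ndarray) -> np.ndarray:
    """Four illustrative features of one window of RR intervals (in seconds).

    These are assumed features, not necessarily those used in the paper.
    """
    diffs = np.diff(rr)
    return np.array([
        rr.mean(),                       # mean RR interval
        rr.std(),                        # overall variability
        np.sqrt(np.mean(diffs ** 2)),    # RMSSD: beat-to-beat variability
        np.mean(np.abs(diffs) > 0.05),   # fraction of successive differences > 50 ms
    ])

def pca(features: np.ndarray, n_components: int):
    """PCA on standardised features via SVD; returns (scores, explained ratio)."""
    scaled = (features - features.mean(axis=0)) / features.std(axis=0)
    _, singular_values, vt = np.linalg.svd(scaled, full_matrices=False)
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    return scaled @ vt[:n_components].T, explained[:n_components]

# Synthetic windows: 20 regular-rhythm and 20 irregular-rhythm (assumed values).
rng = np.random.default_rng(0)
regular = [0.8 + rng.normal(0.0, 0.02, 75) for _ in range(20)]
irregular = [np.clip(0.6 + rng.normal(0.0, 0.15, 75), 0.3, None) for _ in range(20)]

features = np.array([rr_features(w) for w in regular + irregular])
scores, explained = pca(features, n_components=2)
```

On data like this, the first principal component alone separates the two groups, which is the sense in which PCA can help find features that discriminate between rhythms. Real recordings would be far less clean.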

Keywords: atrial fibrillation, multivariate data analysis, automatic detection, telemedicine

Procedia PDF Downloads 260
30942 Bayesian Analysis of Topp-Leone Generalized Exponential Distribution

Authors: Najrullah Khan, Athar Ali Khan

Abstract:

The Topp-Leone distribution was introduced by Topp and Leone in 1955. In this paper, an attempt has been made to fit the Topp-Leone generalized exponential (TPGE) distribution. A real survival data set is used for illustration. Implementation is done using R and JAGS, and R and JAGS codes are provided to implement the censoring mechanism using both optimization and simulation tools. The main aim of this paper is to describe and illustrate the Bayesian modelling approach to the analysis of survival data. Emphasis is placed on the modeling of data and the interpretation of the results. Crucial to this is an understanding of the nature of the incomplete or 'censored' data encountered. Analytic approximation and simulation tools are covered here, but most of the emphasis is on Markov chain Monte Carlo methods, including the independent Metropolis algorithm, which is currently the most popular technique. For analytic approximation, the trust region method is found to be the best among the various optimization algorithms. The TPGE model is also used to analyze lifetime data in the Bayesian paradigm. Results are evaluated on the above-mentioned real survival data set. The analytic approximation and simulation methods are implemented using software packages. It is clear from our findings that simulation tools provide better results than those obtained by asymptotic approximation.
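To make the idea of an independence Metropolis sampler with right-censored data concrete, here is a minimal sketch. It deliberately swaps in a much simpler model than the paper's: an exponential lifetime model with a gamma prior instead of the TPGE distribution, and Python instead of R and JAGS. The data, the prior parameters, and the function names are all made up for illustration.

```python
import numpy as np

def log_posterior(rate, times, observed, prior_shape=1.0, prior_rate=1.0):
    """Unnormalised log posterior for an exponential model with right censoring.

    Observed events contribute rate * exp(-rate * t); censored ones exp(-rate * t).
    The Gamma(prior_shape, prior_rate) prior is an assumption.
    """
    if rate <= 0:
        return -np.inf
    log_lik = observed.sum() * np.log(rate) - rate * times.sum()
    log_prior = (prior_shape - 1.0) * np.log(rate) - prior_rate * rate
    return log_lik + log_prior

def independence_metropolis(times, observed, n_draws=5000, seed=0):
    """Independence Metropolis: candidates come from one fixed gamma proposal."""
    rng = np.random.default_rng(seed)
    shape = float(observed.sum())   # proposal shape: number of observed events
    scale = 1.0 / times.sum()       # proposal scale: 1 / total follow-up time

    def log_proposal(rate):
        return (shape - 1.0) * np.log(rate) - rate / scale

    current = shape * scale         # start at the proposal mean
    draws = np.empty(n_draws)
    for i in range(n_draws):
        candidate = rng.gamma(shape, scale)
        # Acceptance ratio uses target / proposal at both points.
        log_ratio = (
            log_posterior(candidate, times, observed) - log_proposal(candidate)
        ) - (log_posterior(current, times, observed) - log_proposal(current))
        if np.log(rng.uniform()) < log_ratio:
            current = candidate
        draws[i] = current
    return draws

# Made-up lifetimes; observed = 1 marks an event, 0 marks a right-censored time.
times = np.array([2.0, 3.5, 1.2, 4.0, 0.8, 5.0, 2.7, 3.1])
observed = np.array([1, 1, 1, 0, 1, 0, 1, 1])
draws = independence_metropolis(times, observed)
posterior_mean = draws.mean()
```

For this conjugate toy model the exact posterior is Gamma(1 + 6, 1 + 22.3), so the sampled mean can be checked against 7 / 23.3. No such closed form would be expected for the TPGE model, which is where simulation becomes useful.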

Keywords: Bayesian inference, JAGS, Laplace approximation, LaplacesDemon, posterior, R software, simulation

Procedia PDF Downloads 524
30941 An Investigation into the Views of Distant Science Education Students Regarding Teaching Laboratory Work Online

Authors: Abraham Motlhabane

Abstract:

This research analysed the written views of science education students regarding the teaching of laboratory work using the online mode. The research adopted a qualitative methodology, aimed at investigating small and distinct groups normally regarded as a single-site study. Qualitative research was used to describe and analyse the phenomena from the students’ perspective. This means the research began with assumptions and a world view, used theoretical lenses, and studied research problems by inquiring into the meanings held by individual students. The research was conducted with three groups of students studying for the Postgraduate Certificate in Education, the Bachelor of Education, and the honours Bachelor of Education, respectively. In each of the study programmes, the science education module is compulsory. Five science education students from each study programme were purposively selected to participate in this research, so 15 students participated in total. To analyse the data, the data were first printed, and the hard copies were used in the analysis. The data were read several times, and key concepts and ideas were highlighted. Themes and patterns were identified to describe the data. Coding was used as a process of organising and sorting the data. The findings of the study are very diverse; some students are in favour of the online laboratory, whereas other students argue that science can only be learnt through hands-on experimentation.

Keywords: online learning, laboratory work, views, perceptions

Procedia PDF Downloads 137