Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 42679

Search results for: exploratory data analysis

42049 Implementation and Performance Analysis of Data Encryption Standard and RSA Algorithm with Image Steganography and Audio Steganography

Authors: S. C. Sharma, Ankit Gambhir, Rajeev Arya

Abstract:

In today’s era data security is an important concern and most demanding issues because it is essential for people using online banking, e-shopping, reservations etc. The two major techniques that are used for secure communication are Cryptography and Steganography. Cryptographic algorithms scramble the data so that intruder will not able to retrieve it; however steganography covers that data in some cover file so that presence of communication is hidden. This paper presents the implementation of Ron Rivest, Adi Shamir, and Leonard Adleman (RSA) Algorithm with Image and Audio Steganography and Data Encryption Standard (DES) Algorithm with Image and Audio Steganography. The coding for both the algorithms have been done using MATLAB and its observed that these techniques performed better than individual techniques. The risk of unauthorized access is alleviated up to a certain extent by using these techniques. These techniques could be used in Banks, RAW agencies etc, where highly confidential data is transferred. Finally, the comparisons of such two techniques are also given in tabular forms.

Keywords: audio steganography, data security, DES, image steganography, intruder, RSA, steganography

Procedia PDF Downloads 296

42048 The Age Difference in Social Skills Constructs for School Adaptation: A Cross-Sectional Study of Japanese Students at Elementary, Junior, and Senior High School

Authors: Hiroki Shinkawa, Tadaaki Tomiie

Abstract:

Many interventions for social skills acquisition aim to decrease the gap between social skills deficits in the individual and normative social skills; nevertheless little is known of typical social skills according to age difference in students. In this study, we developed new quintet of Hokkaido Social Skills Inventory (HSSI) in order to identify age-appropriate social skills for school adaptation. First, we selected 13 categories of social skills for school adaptation from previous studies, and created questionnaire items through discussion by 25 teachers in all three levels from elementary schools to senior high schools. Second, the factor structures of five versions of the social skills scale were investigated on 2nd grade (n = 1,864), 4th grade (n = 1,936), 6th grade (n = 2,085), 7th grade (n = 2,007), and 10th grade (n = 912) students, respectively. The exploratory factor analysis showed that a number of constructing factors of social skills increased as one’s grade in school advanced. The results in the present study can be useful to characterize the age-appropriate social skills for school adaptation.

Keywords: social skills, age difference, children, adolescents

Procedia PDF Downloads 400

42047 Anomaly Detection in Financial Markets Using Tucker Decomposition

Authors: Salma Krafessi

Abstract:

The financial markets have a multifaceted, intricate environment, and enormous volumes of data are produced every day. To find investment possibilities, possible fraudulent activity, and market oddities, accurate anomaly identification in this data is essential. Conventional methods for detecting anomalies frequently fail to capture the complex organization of financial data. In order to improve the identification of abnormalities in financial time series data, this study presents Tucker Decomposition as a reliable multi-way analysis approach. We start by gathering closing prices for the S&P 500 index across a number of decades. The information is converted to a three-dimensional tensor format, which contains internal characteristics and temporal sequences in a sliding window structure. The tensor is then broken down using Tucker Decomposition into a core tensor and matching factor matrices, allowing latent patterns and relationships in the data to be captured. A possible sign of abnormalities is the reconstruction error from Tucker's Decomposition. We are able to identify large deviations that indicate unusual behavior by setting a statistical threshold. A thorough examination that contrasts the Tucker-based method with traditional anomaly detection approaches validates our methodology. The outcomes demonstrate the superiority of Tucker's Decomposition in identifying intricate and subtle abnormalities that are otherwise missed. This work opens the door for more research into multi-way data analysis approaches across a range of disciplines and emphasizes the value of tensor-based methods in financial analysis.

Keywords: tucker decomposition, financial markets, financial engineering, artificial intelligence, decomposition models

Procedia PDF Downloads 74

42046 Exploring the Role of Data Mining in Crime Classification: A Systematic Literature Review

Authors: Faisal Muhibuddin, Ani Dijah Rahajoe

Abstract:

This in-depth exploration, through a systematic literature review, scrutinizes the nuanced role of data mining in the classification of criminal activities. The research focuses on investigating various methodological aspects and recent developments in leveraging data mining techniques to enhance the effectiveness and precision of crime categorization. Commencing with an exposition of the foundational concepts of crime classification and its evolutionary dynamics, this study details the paradigm shift from conventional methods towards approaches supported by data mining, addressing the challenges and complexities inherent in the modern crime landscape. Specifically, the research delves into various data mining techniques, including K-means clustering, Naïve Bayes, K-nearest neighbour, and clustering methods. A comprehensive review of the strengths and limitations of each technique provides insights into their respective contributions to improving crime classification models. The integration of diverse data sources takes centre stage in this research. A detailed analysis explores how the amalgamation of structured data (such as criminal records) and unstructured data (such as social media) can offer a holistic understanding of crime, enriching classification models with more profound insights. Furthermore, the study explores the temporal implications in crime classification, emphasizing the significance of considering temporal factors to comprehend long-term trends and seasonality. The availability of real-time data is also elucidated as a crucial element in enhancing responsiveness and accuracy in crime classification.

Keywords: data mining, classification algorithm, naïve bayes, k-means clustering, k-nearest neigbhor, crime, data analysis, sistematic literature review

Procedia PDF Downloads 75

42045 A Novel Heuristic for Analysis of Large Datasets by Selecting Wrapper-Based Features

Authors: Bushra Zafar, Usman Qamar

Abstract:

Large data sample size and dimensions render the effectiveness of conventional data mining methodologies. A data mining technique are important tools for collection of knowledgeable information from variety of databases and provides supervised learning in the form of classification to design models to describe vital data classes while structure of the classifier is based on class attribute. Classification efficiency and accuracy are often influenced to great extent by noisy and undesirable features in real application data sets. The inherent natures of data set greatly masks its quality analysis and leave us with quite few practical approaches to use. To our knowledge first time, we present a new approach for investigation of structure and quality of datasets by providing a targeted analysis of localization of noisy and irrelevant features of data sets. Machine learning is based primarily on feature selection as pre-processing step which offers us to select few features from number of features as a subset by reducing the space according to certain evaluation criterion. The primary objective of this study is to trim down the scope of the given data sample by searching a small set of important features which may results into good classification performance. For this purpose, a heuristic for wrapper-based feature selection using genetic algorithm and for discriminative feature selection an external classifier are used. Selection of feature based on its number of occurrence in the chosen chromosomes. Sample dataset has been used to demonstrate proposed idea effectively. A proposed method has improved average accuracy of different datasets is about 95%. Experimental results illustrate that proposed algorithm increases the accuracy of prediction of different diseases.

Keywords: data mining, generic algorithm, KNN algorithms, wrapper based feature selection

Procedia PDF Downloads 320

42044 Analyzing Keyword Networks for the Identification of Correlated Research Topics

Authors: Thiago M. R. Dias, Patrícia M. Dias, Gray F. Moita

Abstract:

The production and publication of scientific works have increased significantly in the last years, being the Internet the main factor of access and distribution of these works. Faced with this, there is a growing interest in understanding how scientific research has evolved, in order to explore this knowledge to encourage research groups to become more productive. Therefore, the objective of this work is to explore repositories containing data from scientific publications and to characterize keyword networks of these publications, in order to identify the most relevant keywords, and to highlight those that have the greatest impact on the network. To do this, each article in the study repository has its keywords extracted and in this way the network is characterized, after which several metrics for social network analysis are applied for the identification of the highlighted keywords.

Keywords: bibliometrics, data analysis, extraction and data integration, scientometrics

Procedia PDF Downloads 264

42043 Using RASCAL Code to Analyze the Postulated UF6 Fire Accident

Authors: J. R. Wang, Y. Chiang, W. S. Hsu, S. H. Chen, J. H. Yang, S. W. Chen, C. Shih, Y. F. Chang, Y. H. Huang, B. R. Shen

Abstract:

In this research, the RASCAL code was used to simulate and analyze the postulated UF₆ fire accident which may occur in the Institute of Nuclear Energy Research (INER). There are four main steps in this research. In the first step, the UF₆ data of INER were collected. In the second step, the RASCAL analysis methodology and model was established by using these data. Third, this RASCAL model was used to perform the simulation and analysis of the postulated UF₆ fire accident. Three cases were simulated and analyzed in this step. Finally, the analysis results of RASCAL were compared with the hazardous levels of the chemicals. According to the compared results of three cases, Case 3 has the maximum danger in human health.

Keywords: RASCAL, UF₆, safety, hydrogen fluoride

Procedia PDF Downloads 226

42042 Community Health Workers’ Performance and Their Influence in the Adoption of Strategies to Address Malaria Burden at a Subnational Level Health System in Cameroon

Authors: Tacho Rubby Kong

Abstract:

Community health workers’ performances are known to influence members’ behaviours and practices while translating policies into service delivery. However, little remains known about the extent to which this remains true within interventions aimed at addressing malaria burden in low-resource settings like Cameroon. The objective of this study was to examine the health workers’ performance and their influence on the adoption of strategies to address the malaria burden at a subnational level health system in Cameroon. A qualitative exploratory design was adopted on a purposively selected sample of 18 key informants. The study was conducted in Konye health district among sub-national health systems, managers, health facility in-charges, and frontline community health workers. Data was collected using semi-structured interview guides in a face-to-face interview with respondents. The analysis adopted a thematic approach utilising journals, credible authors, and peer review articles for data management. Participants acknowledged that workplace networks were influential during the implementation of policies to address malaria. The influence exerted was in form of linkage with other services, caution, and advice regarding strict adherence to policy recommendations, perhaps reflective of the level of trust in providers’ ability to adhere to policy provisions. At the district health management level and among non-state actors, support in perceived areas of weak performance in policy implementation was observed. In addition, timely initiation of contact and subsequent referral was another aspect where community health workers exerted influence while translating policies to address the malaria burden. While the level of support from among network peers was observed to influence community health workers’ adoption and implementation of strategies to address the malaria burden, different mechanisms triggered subsequent response and level of adherence to recommended policy aspects. Drawing from the elicited responses, it was infer that community health workers’ performance influence the direction and extent of success in policy implementation to address the malaria burden at the subnational level.

Keywords: subnational, community, malaria, strategy

Procedia PDF Downloads 96

42041 A Safety Analysis Method for Multi-Agent Systems

Authors: Ching Louis Liu, Edmund Kazmierczak, Tim Miller

Abstract:

Safety analysis for multi-agent systems is complicated by the, potentially nonlinear, interactions between agents. This paper proposes a method for analyzing the safety of multi-agent systems by explicitly focusing on interactions and the accident data of systems that are similar in structure and function to the system being analyzed. The method creates a Bayesian network using the accident data from similar systems. A feature of our method is that the events in accident data are labeled with HAZOP guide words. Our method uses an Ontology to abstract away from the details of a multi-agent implementation. Using the ontology, our methods then constructs an “Interaction Map,” a graphical representation of the patterns of interactions between agents and other artifacts. Interaction maps combined with statistical data from accidents and the HAZOP classifications of events can be converted into a Bayesian Network. Bayesian networks allow designers to explore “what it” scenarios and make design trade-offs that maintain safety. We show how to use the Bayesian networks, and the interaction maps to improve multi-agent system designs.

Keywords: multi-agent system, safety analysis, safety model, integration map

Procedia PDF Downloads 422

42040 Hybrid Approach for Country’s Performance Evaluation

Authors: C. Slim

Abstract:

This paper presents an integrated model, which hybridized data envelopment analysis (DEA) and support vector machine (SVM) together, to class countries according to their efficiency and performance. This model takes into account aspects of multi-dimensional indicators, decision-making hierarchy and relativity of measurement. Starting from a set of indicators of performance as exhaustive as possible, a process of successive aggregations has been developed to attain an overall evaluation of a country’s competitiveness.

Keywords: Artificial Neural Networks (ANN), Support vector machine (SVM), Data Envelopment Analysis (DEA), Aggregations, indicators of performance

Procedia PDF Downloads 344

42039 Probability Sampling in Matched Case-Control Study in Drug Abuse

Authors: Surya R. Niraula, Devendra B Chhetry, Girish K. Singh, S. Nagesh, Frederick A. Connell

Abstract:

Background: Although random sampling is generally considered to be the gold standard for population-based research, the majority of drug abuse research is based on non-random sampling despite the well-known limitations of this kind of sampling. Method: We compared the statistical properties of two surveys of drug abuse in the same community: one using snowball sampling of drug users who then identified “friend controls” and the other using a random sample of non-drug users (controls) who then identified “friend cases.” Models to predict drug abuse based on risk factors were developed for each data set using conditional logistic regression. We compared the precision of each model using bootstrapping method and the predictive properties of each model using receiver operating characteristics (ROC) curves. Results: Analysis of 100 random bootstrap samples drawn from the snowball-sample data set showed a wide variation in the standard errors of the beta coefficients of the predictive model, none of which achieved statistical significance. One the other hand, bootstrap analysis of the random-sample data set showed less variation, and did not change the significance of the predictors at the 5% level when compared to the non-bootstrap analysis. Comparison of the area under the ROC curves using the model derived from the random-sample data set was similar when fitted to either data set (0.93, for random-sample data vs. 0.91 for snowball-sample data, p=0.35); however, when the model derived from the snowball-sample data set was fitted to each of the data sets, the areas under the curve were significantly different (0.98 vs. 0.83, p < .001). Conclusion: The proposed method of random sampling of controls appears to be superior from a statistical perspective to snowball sampling and may represent a viable alternative to snowball sampling.

Keywords: drug abuse, matched case-control study, non-probability sampling, probability sampling

Procedia PDF Downloads 496

42038 The Development of the Website Learning the Local Wisdom in Phra Nakhon Si Ayutthaya Province

Authors: Bunthida Chunngam, Thanyanan Worasesthaphong

Abstract:

This research had objective to develop of the website learning the local wisdom in Phra Nakhon Si Ayutthaya province and studied satisfaction of system user. This research sample was multistage sample for 100 questionnaires, analyzed data to calculated reliability value with Cronbach’s alpha coefficient method α=0.82. This system had 3 functions which were system using, system feather evaluation and system accuracy evaluation which the statistics used for data analysis was descriptive statistics to explain sample feature so these statistics were frequency, percentage, mean and standard deviation. This data analysis result found that the system using performance quality had good level satisfaction (4.44 mean), system feather function analysis had good level satisfaction (4.11 mean) and system accuracy had good level satisfaction (3.74 mean).

Keywords: website, learning, local wisdom, Phra Nakhon Si Ayutthaya province

Procedia PDF Downloads 126

42037 Positive Affect, Negative Affect, Organizational and Motivational Factor on the Acceptance of Big Data Technologies

Authors: Sook Ching Yee, Angela Siew Hoong Lee

Abstract:

Big data technologies have become a trend to exploit business opportunities and provide valuable business insights through the analysis of big data. However, there are still many organizations that have yet to adopt big data technologies especially small and medium organizations (SME). This study uses the technology acceptance model (TAM) to look into several constructs in the TAM and other additional constructs which are positive affect, negative affect, organizational factor and motivational factor. The conceptual model proposed in the study will be tested on the relationship and influence of positive affect, negative affect, organizational factor and motivational factor towards the intention to use big data technologies to produce an outcome. Empirical research is used in this study by conducting a survey to collect data.

Keywords: big data technologies, motivational factor, negative affect, organizational factor, positive affect, technology acceptance model (TAM)

Procedia PDF Downloads 365

42036 The Data Quality Model for the IoT based Real-time Water Quality Monitoring Sensors

Authors: Rabbia Idrees, Ananda Maiti, Saurabh Garg, Muhammad Bilal Amin

Abstract:

IoT devices are the basic building blocks of IoT network that generate enormous volume of real-time and high-speed data to help organizations and companies to take intelligent decisions. To integrate this enormous data from multisource and transfer it to the appropriate client is the fundamental of IoT development. The handling of this huge quantity of devices along with the huge volume of data is very challenging. The IoT devices are battery-powered and resource-constrained and to provide energy efficient communication, these IoT devices go sleep or online/wakeup periodically and a-periodically depending on the traffic loads to reduce energy consumption. Sometime these devices get disconnected due to device battery depletion. If the node is not available in the network, then the IoT network provides incomplete, missing, and inaccurate data. Moreover, many IoT applications, like vehicle tracking and patient tracking require the IoT devices to be mobile. Due to this mobility, If the distance of the device from the sink node become greater than required, the connection is lost. Due to this disconnection other devices join the network for replacing the broken-down and left devices. This make IoT devices dynamic in nature which brings uncertainty and unreliability in the IoT network and hence produce bad quality of data. Due to this dynamic nature of IoT devices we do not know the actual reason of abnormal data. If data are of poor-quality decisions are likely to be unsound. It is highly important to process data and estimate data quality before bringing it to use in IoT applications. In the past many researchers tried to estimate data quality and provided several Machine Learning (ML), stochastic and statistical methods to perform analysis on stored data in the data processing layer, without focusing the challenges and issues arises from the dynamic nature of IoT devices and how it is impacting data quality. A comprehensive review on determining the impact of dynamic nature of IoT devices on data quality is done in this research and presented a data quality model that can deal with this challenge and produce good quality of data. This research presents the data quality model for the sensors monitoring water quality. DBSCAN clustering and weather sensors are used in this research to make data quality model for the sensors monitoring water quality. An extensive study has been done in this research on finding the relationship between the data of weather sensors and sensors monitoring water quality of the lakes and beaches. The detailed theoretical analysis has been presented in this research mentioning correlation between independent data streams of the two sets of sensors. With the help of the analysis and DBSCAN, a data quality model is prepared. This model encompasses five dimensions of data quality: outliers’ detection and removal, completeness, patterns of missing values and checks the accuracy of the data with the help of cluster’s position. At the end, the statistical analysis has been done on the clusters formed as the result of DBSCAN, and consistency is evaluated through Coefficient of Variation (CoV).

Keywords: clustering, data quality, DBSCAN, and Internet of things (IoT)

Procedia PDF Downloads 144

42035 Customers’ Acceptability of Islamic Banking: Employees’ Perspective in Peshawar

Authors: Tahira Imtiaz, Karim Ullah

Abstract:

This paper aims to incorporate the banks employees’ perspective on acceptability of Islamic banking by the customers of Peshawar. A qualitative approach is adopted for which six in-depth interviews with employees of Islamic banks are conducted. The employees were asked to share their experience regarding customers’ acceptance attitude towards acceptability of Islamic banking. Collected data was analyzed through thematic analysis technique and its synthesis with the current literature. Through data analysis a theoretical framework is developed, which highlights the factors which drive customers towards Islamic banking, as witnessed by the employees. The practical implication of analyzed data evident that a new model could be developed on the basis of four determinants of human preference namely: inner satisfaction, time, faith and market forces.

Keywords: customers’ attraction, employees’ perspective, Islamic banking, Riba

Procedia PDF Downloads 336

42034 Creativity as a National System: An Exploratory Model towards Enhance Innovation Ecosystems

Authors: Oscar Javier Montiel Mendez

Abstract:

The link between knowledge-creativity-innovation-entrepreneurship is well established, and broadly emphasized the importance of national innovation systems (NIS) as an approach stresses that the flow of information and technology among people, organizations and institutions are key to its process. Understanding the linkages among the actors involved in innovation is relevant to NIS. Creativity is supposed to fuel NIS, mainly focusing on a personal, group or organizational level, leaving aside the fourth one, as a national system. It is suggested that NIS takes Creativity for granted, an ex-ante stage already solved through some mechanisms, like programs for nurturing it at elementary and secondary schools, universities, or public/organizational specific programs. Or worse, that the individual already has this competence, and that the elements of the NIS will communicate between in a way that will lead to the creation of S curves, with an impact on national systems/programs on entrepreneurship, clusters, and the economy. But creativity constantly appears at any time during NIS, being the key input. Under an initial, exploratory, focused and refined literature review, based on Csikszentmihalyi’s systemic model, Amabile's componential theory, Kaufman and Beghetto’s 4C model, and the OECD’s (Organisation for Economic Co-operation and Development) NIS model (expanded), an NCS theoretical model is elaborated. Its suggested that its implementation could become a significant factor helping strengthen local, regional and national economies. The results also suggest that the establishment of a national creativity system (NCS), something that appears not been previously addressed, as a strategic/vital companion for a NIS, installing it not only as a national education strategy, but as its foundation, managing it and measuring its impact on NIS, entrepreneurship and the rest of the ecosystem, could make more effective public policies. Likewise, it should have a beneficial impact on the efforts of all the stakeholders involved and should help prevent some of the possible failures that NIS present.

Keywords: national creativity system, national innovation system, entrepreneurship ecosystem, systemic creativity

Procedia PDF Downloads 439

42033 Public Participation Best Practices in Environmental Decision-making in Newfoundland and Labrador: Analyzing the Forestry Management Planning Process

Authors: Kimberley K. Whyte-Jones

Abstract:

Public participation may improve the quality of environmental management decisions. However, the quality of such a decision is strongly dependent on the quality of the process that leads to it. In order to ensure an effective and efficient process, key features of best practice in participation should be carefully observed; this would also combat disillusionment of citizens, decision-makers and practitioners. The overarching aim of this study is to determine what constitutes an effective public participation process relevant to the Newfoundland and Labrador, Canada context, and to discover whether the public participation process that led to the 2014-2024 Provincial Sustainable Forest Management Strategy (PSFMS) met best practices criteria. The research design uses an exploratory case study strategy to consider a specific participatory process in environmental decision-making in Newfoundland and Labrador. Data collection methods include formal semi-structured interviews and the review of secondary data sources. The results of this study will determine the validity of a specific public participation best practice framework. The findings will be useful for informing citizen participation processes in general and will deduce best practices in public participation in environmental management in the province. The study is, therefore, meaningful for guiding future policies and practices in the management of forest resources in the province of Newfoundland and Labrador, and will help in filling a noticeable gap in research compiling best practices for environmentally related public participation processes.

Keywords: best practices, environmental decision-making, forest management, public participation

Procedia PDF Downloads 326

42032 Confirmatory Factor Analysis of Smartphone Addiction Inventory (SPAI) in the Yemeni Environment

Authors: Mohammed Al-Khadher

Abstract:

Currently, we are witnessing rapid advancements in the field of information and communications technology, forcing us, as psychologists, to combat the psychological and social effects of such developments. It also drives us to continually look for the development and preparation of measurement tools compatible with the changes brought about by the digital revolution. In this context, the current study aimed to identify the factor analysis of the Smartphone Addiction Inventory (SPAI) in the Republic of Yemen. The sample consisted of (1920) university students (1136 males and 784 females) who answered the inventory, and the data was analyzed using the statistical software (AMOS V25). The factor analysis results showed a goodness-of-fit of the data five-factor model with excellent indicators, as RMSEA-(.052), CFI-(.910), GFI-(.931), AGFI-(.915), TLI-(.897), NFI-(.895), RFI-(.880), and RMR-(.032). All within the ideal range to prove the model's fit of the scale’s factor analysis. The confirmatory factor analysis results showed factor loading in (4) items on (Time Spent), (4) items on (Compulsivity), (8) items on (Daily Life Interference), (5) items on (Craving), and (3) items on (Sleep interference); and all standard values of factor loading were statistically significant at the significance level (>.001).

Keywords: smartphone addiction inventory (SPAI), confirmatory factor analysis (CFA), yemeni students, people at risk of smartphone addiction

Procedia PDF Downloads 101

42031 Geographic Information Systems and Remotely Sensed Data for the Hydrological Modelling of Mazowe Dam

Authors: Ellen Nhedzi Gozo

Abstract:

Unavailability of adequate hydro-meteorological data has always limited the analysis and understanding of hydrological behaviour of several dam catchments including Mazowe Dam in Zimbabwe. The problem of insufficient data for Mazowe Dam catchment analysis was solved by extracting catchment characteristics and aerial hydro-meteorological data from ASTER, LANDSAT, Shuttle Radar Topographic Mission SRTM remote sensing (RS) images using ILWIS, ArcGIS and ERDAS Imagine geographic information systems (GIS) software. Available observed hydrological as well as meteorological data complemented the use of the remotely sensed information. Ground truth land cover was mapped using a Garmin Etrex global positioning system (GPS) system. This information was then used to validate land cover classification detail that was obtained from remote sensing images. A bathymetry survey was conducted using a SONAR system connected to GPS. Hydrological modelling using the HBV model was then performed to simulate the hydrological process of the catchment in an effort to verify the reliability of the derived parameters. The model output shows a high Nash-Sutcliffe Coefficient that is close to 1 indicating that the parameters derived from remote sensing and GIS can be applied with confidence in the analysis of Mazowe Dam catchment.

Keywords: geographic information systems, hydrological modelling, remote sensing, water resources management

Procedia PDF Downloads 342

42030 TAXAPRO, A Streamlined Pipeline to Analyze Shotgun Metagenomes

Authors: Sofia Sehli, Zainab El Ouafi, Casey Eddington, Soumaya Jbara, Kasambula Arthur Shem, Islam El Jaddaoui, Ayorinde Afolayan, Olaitan I. Awe, Allissa Dillman, Hassan Ghazal

Abstract:

The ability to promptly sequence whole genomes at a relatively low cost has revolutionized the way we study the microbiome. Microbiologists are no longer limited to studying what can be grown in a laboratory and instead are given the opportunity to rapidly identify the makeup of microbial communities in a wide variety of environments. Analyzing whole genome sequencing (WGS) data is a complex process that involves multiple moving parts and might be rather unintuitive for scientists that don’t typically work with this type of data. Thus, to help lower the barrier for less-computationally inclined individuals, TAXAPRO was developed at the first Omics Codeathon held virtually by the African Society for Bioinformatics and Computational Biology (ASBCB) in June 2021. TAXAPRO is an advanced metagenomics pipeline that accurately assembles organelle genomes from whole-genome sequencing data. TAXAPRO seamlessly combines WGS analysis tools to create a pipeline that automatically processes raw WGS data and presents organism abundance information in both a tabular and graphical format. TAXAPRO was evaluated using COVID-19 patient gut microbiome data. Analysis performed by TAXAPRO demonstrated a high abundance of Clostridia and Bacteroidia genera and a low abundance of Proteobacteria genera relative to others in the gut microbiome of patients hospitalized with COVID-19, consistent with the original findings derived using a different analysis methodology. This provides crucial evidence that the TAXAPRO workflow dispenses reliable organism abundance information overnight without the hassle of performing the analysis manually.

Keywords: metagenomics, shotgun metagenomic sequence analysis, COVID-19, pipeline, bioinformatics

Procedia PDF Downloads 228

42029 Optimizing Quantum Machine Learning with Amplitude and Phase Encoding Techniques

Authors: Om Viroje

Abstract:

Quantum machine learning represents a frontier in computational technology, promising significant advancements in data processing capabilities. This study explores the significance of data encoding techniques, specifically amplitude and phase encoding, in this emerging field. By employing a comparative analysis methodology, the research evaluates how these encoding techniques affect the accuracy, efficiency, and noise resilience of quantum algorithms. Our findings reveal that amplitude encoding enhances algorithmic accuracy and noise tolerance, whereas phase encoding significantly boosts computational efficiency. These insights are crucial for developing robust quantum frameworks that can be effectively applied in real-world scenarios. In conclusion, optimizing encoding strategies is essential for advancing quantum machine learning, potentially transforming various industries through improved data processing and analysis.

Keywords: quantum machine learning, data encoding, amplitude encoding, phase encoding, noise resilience

Procedia PDF Downloads 30

42028 Seismic Performance Evaluation of Existing Building Using Structural Information Modeling

Authors: Byungmin Cho, Dongchul Lee, Taejin Kim, Minhee Lee

Abstract:

The procedure for the seismic retrofit of existing buildings includes the seismic evaluation. In the evaluation step, it is assessed whether the buildings have satisfactory performance against seismic load. Based on the results of that, the buildings are upgraded. To evaluate seismic performance of the buildings, it usually goes through the model transformation from elastic analysis to inelastic analysis. However, when the data is not delivered through the interwork, engineers should manually input the data. In this process, since it leads to inaccuracy and loss of information, the results of the analysis become less accurate. Therefore, in this study, the process for the seismic evaluation of existing buildings using structural information modeling is suggested. This structural information modeling makes the work economic and accurate. To this end, it is determined which part of the process could be computerized through the investigation of the process for the seismic evaluation based on ASCE 41. The structural information modeling process is developed to apply to the seismic evaluation using Perform 3D program usually used for the nonlinear response history analysis. To validate this process, the seismic performance of an existing building is investigated.

Keywords: existing building, nonlinear analysis, seismic performance, structural information modeling

Procedia PDF Downloads 389

42027 Self-Management among the Ethnic Groups with Type 2 Diabetes Mellitus in Thailand

Authors: Siwarak Kitchanapaibul, Warren Gillibrand, Rob Burton

Abstract:

The prevalence of diabetes mellitus has been rising all over the world. Self-management is required for diabetes mellitus patients. The objective of this study is to explore the self-management among the ethnic groups with type 2 diabetes mellitus in Thailand, an upper middle-income country which is located in South East Asia. The ethnic groups in Thailand are a minority group which has limited education and a different culture, language, costume and lifestyle from Thai people. The qualitative exploratory study was used in this study. In-depth interviews with semi-structured open questions were conducted by 20 participants from purposive sampling. These participants were the ethnic groups who have type 2 diabetes mellitus, received the services from a region hospital, understood Thai and were willing to participate. Content analysis was adopted for the study. The results showed that all of the participants controlled their diet before the appointment day and never miss their appointment. Only 3 participants did their exercise while 2 participants stated that they occasionally forgot to take medicine. 10 participants use the herbs for reducing the sugar level. 12 participants drank a lot of water after a lapse in the diet because they believed that water could dilute the sugar. The findings identified 5 themes; ‘controlling diet before appointment day’; ‘drinking water after a lapse in diet’; ‘medication being a vital importance’; ‘exercise is unimportant’; and ‘taking herbs for sugar reduction’. The results of this study are important to the health professionals to understand the self-management of Ethnic groups and use the data to create the appropriate intervention for promoting health among the ethnic groups with type 2 diabetes mellitus in Thailand. The findings will lead to the revision of health policy and the procedure for promoting health in this special ethnic groups.

Keywords: self-management, diabetes, ethnic groups, Thailand

Procedia PDF Downloads 307

42026 Comparative Sustainability Performance Analysis of Australian Companies Using Composite Measures

Authors: Ramona Zharfpeykan, Paul Rouse

Abstract:

Organizational sustainability is important to both organizations themselves and their stakeholders. Despite its increasing popularity and increasing numbers of organizations reporting sustainability, research on evaluating and comparing the sustainability performance of companies is limited. The aim of this study was to develop models to measure sustainability performance for both cross-sectional and longitudinal comparisons across companies in the same or different industries. A secondary aim was to see if sustainability reports can be used to evaluate sustainability performance. The study used both a content analysis of Australian sustainability reports in mining and metals and financial services for 2011-2014 and a survey of Australian and New Zealand organizations. Two methods ranging from a composite index using uniform weights to data envelopment analysis (DEA) were employed to analyze the data and develop the models. The results show strong statistically significant relationships between the developed models, which suggests that each model provides a consistent, systematic and reasonably robust analysis. The results of the models show that for both industries, companies that had sustainability scores above or below the industry average stayed almost the same during the study period. These indices and models can be used by companies to evaluate their sustainability performance and compare it with previous years, or with other companies in the same or different industries. These methods can also be used by various stakeholders and sustainability ranking companies such as the Global Reporting Initiative (GRI).

Keywords: data envelopment analysis, sustainability, sustainability performance measurement system, sustainability performance index, global reporting initiative

Procedia PDF Downloads 184

42025 An Architectural Model for APT Detection

Authors: Nam-Uk Kim, Sung-Hwan Kim, Tai-Myoung Chung

Abstract:

Typical security management systems are not suitable for detecting APT attack, because they cannot draw the big picture from trivial events of security solutions. Although SIEM solutions have security analysis engine for that, their security analysis mechanisms need to be verified in academic field. Although this paper proposes merely an architectural model for APT detection, we will keep studying on correlation analysis mechanism in the future.

Keywords: advanced persistent threat, anomaly detection, data mining

Procedia PDF Downloads 532

42024 Sentiment Classification of Documents

Authors: Swarnadip Ghosh

Abstract:

Sentiment Analysis is the process of detecting the contextual polarity of text. In other words, it determines whether a piece of writing is positive, negative or neutral.Sentiment analysis of documents holds great importance in today's world, when numerous information is stored in databases and in the world wide web. An efficient algorithm to illicit such information, would be beneficial for social, economic as well as medical purposes. In this project, we have developed an algorithm to classify a document into positive or negative. Using our algorithm, we obtained a feature set from the data, and classified the documents based on this feature set. It is important to note that, in the classification, we have not used the independence assumption, which is considered by many procedures like the Naive Bayes. This makes the algorithm more general in scope. Moreover, because of the sparsity and high dimensionality of such data, we did not use empirical distribution for estimation, but developed a method by finding degree of close clustering of the data points. We have applied our algorithm on a movie review data set obtained from IMDb and obtained satisfactory results.

Keywords: sentiment, Run's Test, cross validation, higher dimensional pmf estimation

Procedia PDF Downloads 407

42023 Rodriguez Diego, Del Valle Martin, Hargreaves Matias, Riveros Jose Luis

Authors: Nathainail Bashir, Neil Anderson

Abstract:

The objective of this study site was to investigate the current state of the practice with regards to karst detection methods and recommend the best method and pattern of arrays to acquire the desire results. Proper site investigation in karst prone regions is extremely valuable in determining the location of possible voids. Two geophysical techniques were employed: multichannel analysis of surface waves (MASW) and electric resistivity tomography (ERT).The MASW data was acquired at each test location using different array lengths and different array orientations (to increase the probability of getting interpretable data in karst terrain). The ERT data were acquired using a dipole-dipole array consisting of 168 electrodes. The MASW data was interpreted (re: estimated depth to physical top of rock) and used to constrain and verify the interpretation of the ERT data. The ERT data indicates poorer quality MASW data were acquired in areas where there was significant local variation in the depth to top of rock.

Keywords: dipole-dipole, ERT, Karst terrains, MASW

Procedia PDF Downloads 318

42022 Recent Advances in Data Warehouse

Authors: Fahad Hanash Alzahrani

Abstract:

This paper describes some recent advances in a quickly developing area of data storing and processing based on Data Warehouses and Data Mining techniques, which are associated with software, hardware, data mining algorithms and visualisation techniques having common features for any specific problems and tasks of their implementation.

Keywords: data warehouse, data mining, knowledge discovery in databases, on-line analytical processing

Procedia PDF Downloads 406

42021 An Analysis System for Integrating High-Throughput Transcript Abundance Data with Metabolic Pathways in Green Algae

Authors: Han-Qin Zheng, Yi-Fan Chiang-Hsieh, Chia-Hung Chien, Wen-Chi Chang

Abstract:

As the most important non-vascular plants, algae have many research applications, including high species diversity, biofuel sources, adsorption of heavy metals and, following processing, health supplements. With the increasing availability of next-generation sequencing (NGS) data for algae genomes and transcriptomes, an integrated resource for retrieving gene expression data and metabolic pathway is essential for functional analysis and systems biology in algae. However, gene expression profiles and biological pathways are displayed separately in current resources, and making it impossible to search current databases directly to identify the cellular response mechanisms. Therefore, this work develops a novel AlgaePath database to retrieve gene expression profiles efficiently under various conditions in numerous metabolic pathways. AlgaePath, a web-based database, integrates gene information, biological pathways, and next-generation sequencing (NGS) datasets in Chlamydomonasreinhardtii and Neodesmus sp. UTEX 2219-4. Users can identify gene expression profiles and pathway information by using five query pages (i.e. Gene Search, Pathway Search, Differentially Expressed Genes (DEGs) Search, Gene Group Analysis, and Co-Expression Analysis). The gene expression data of 45 and 4 samples can be obtained directly on pathway maps in C. reinhardtii and Neodesmus sp. UTEX 2219-4, respectively. Genes that are differentially expressed between two conditions can be identified in Folds Search. Furthermore, the Gene Group Analysis of AlgaePath includes pathway enrichment analysis, and can easily compare the gene expression profiles of functionally related genes in a map. Finally, Co-Expression Analysis provides co-expressed transcripts of a target gene. The analysis results provide a valuable reference for designing further experiments and elucidating critical mechanisms from high-throughput data. More than an effective interface to clarify the transcript response mechanisms in different metabolic pathways under various conditions, AlgaePath is also a data mining system to identify critical mechanisms based on high-throughput sequencing.

Keywords: next-generation sequencing (NGS), algae, transcriptome, metabolic pathway, co-expression

Procedia PDF Downloads 410

42020 Chemometric-Based Voltammetric Method for Analysis of Vitamins and Heavy Metals in Honey Samples

Authors: Marwa A. A. Ragab, Amira F. El-Yazbi, Amr El-Hawiet

Abstract:

The analysis of heavy metals in honey samples is crucial. When found in honey, they denote environmental pollution. Some of these heavy metals as lead either present at low or high concentrations are considered to be toxic. Other heavy metals, for example, copper and zinc, if present at low concentrations, they considered safe even vital minerals. On the contrary, if they present at high concentrations, they are toxic. Their voltammetric determination in honey represents a challenge due to the presence of other electro-active components as vitamins, which may overlap with the peaks of the metal, hindering their accurate and precise determination. The simultaneous analysis of some vitamins: nicotinic acid (B3) and riboflavin (B2), and heavy metals: lead, cadmium, and zinc, in honey samples, was addressed. The analysis was done in 0.1 M Potassium Chloride (KCl) using a hanging mercury drop electrode (HMDE), followed by chemometric manipulation of the voltammetric data using the derivative method. Then the derivative data were convoluted using discrete Fourier functions. The proposed method allowed the simultaneous analysis of vitamins and metals though their varied responses and sensitivities. Although their peaks were overlapped, the proposed chemometric method allowed their accurate and precise analysis. After the chemometric treatment of the data, metals were successfully quantified at low levels in the presence of vitamins (1: 2000). The heavy metals limit of detection (LOD) values after the chemometric treatment of data decreased by more than 60% than those obtained from the direct voltammetric method. The method applicability was tested by analyzing the selected metals and vitamins in real honey samples obtained from different botanical origins.

Keywords: chemometrics, overlapped voltammetric peaks, derivative and convoluted derivative methods, metals and vitamins

Procedia PDF Downloads 154

‹
1
2
...
19
20
21
22
23
24
25
...
1422
1423
›