Search results for: predictive data mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25423

24043 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course

Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu

Abstract:

This paper presents the use of machine learning algorithms to predict the success of students in an introductory physics course. A dataset of 140 rows describing the performance of two batches of students was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. Pair plots were made to check the similarity between the real data and each synthetic dataset. The synthetic data was used to train machine learning models using the PyCaret package. For the CTGAN data, the Ada Boost Classifier (ADA) was found to be the ML model with the best fit, whereas the CTGAN with Gaussian Copula data yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy with the real data. ROC-AUC analysis was performed for all ten classes of the target variable (Grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, whereas the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each Grade value separately. The LR model with Gaussian data showed consistently better AUC scores than the ADA model with CTGAN data, except for two Grade values, C- and A-.
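
The mean AUC over the grade classes reported above can be illustrated with a small one-vs-rest computation. The following is a minimal sketch in plain Python, not the authors' PyCaret pipeline; the function names and the toy probabilities are assumptions made for illustration only.

```python
from typing import Dict, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_ovr_auc(y_true: Sequence[str], proba: Sequence[Dict[str, float]]) -> float:
    """Average the one-vs-rest AUC over every class present in y_true."""
    aucs = []
    for c in sorted(set(y_true)):
        labels = [1 if y == c else 0 for y in y_true]
        scores = [p.get(c, 0.0) for p in proba]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)


# Toy grades and predicted class probabilities (invented, not from the paper).
y_true = ["A", "B", "C", "A"]
proba = [
    {"A": 0.70, "B": 0.20, "C": 0.10},
    {"A": 0.30, "B": 0.50, "C": 0.20},
    {"A": 0.20, "B": 0.20, "C": 0.60},
    {"A": 0.25, "B": 0.45, "C": 0.30},
]
print(mean_ovr_auc(y_true, proba))
```

A mean one-vs-rest AUC near 0.5 corresponds to chance-level ranking, which is a useful reference point when reading the two scores quoted in the abstract.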

Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN

Procedia PDF Downloads 36
24042 Data Access, AI Intensity, and Scale Advantages

Authors: Chuping Lo

Abstract:

This paper presents a simple model demonstrating that, ceteris paribus, countries with lower barriers to accessing global data tend to earn higher incomes than other countries. Therefore, large countries that inherently have greater data resources tend to have higher incomes than smaller countries, so the former may be more hesitant than the latter to liberalize cross-border data flows in order to maintain this advantage. Furthermore, countries with higher artificial intelligence (AI) intensity in production technologies tend to benefit more from economies of scale in data aggregation, leading to higher income and more trade as they are better able to utilize global data.

Keywords: digital intensity, digital divide, international trade, scale of economics

Procedia PDF Downloads 56
24041 Secured Transmission and Reserving Space in Images Before Encryption to Embed Data

Authors: G. R. Navaneesh, E. Nagarajan, C. H. Rajam Raju

Abstract:

Nowadays, multimedia data are used to store secure information. All previous methods allocate space in the image for data embedding after encryption. In this paper, we propose a novel method that reserves space in the image, surrounded by a boundary, before encryption with a traditional reversible data hiding (RDH) algorithm, which makes it easy for the data hider to reversibly embed data in the encrypted images. The proposed method can achieve real-time performance, that is, data extraction and image recovery are free of any error. A secure transmission process is also discussed in this paper, which improves the efficiency by ten times compared to the other processes discussed.

Keywords: secure communication, reserving room before encryption, least significant bits, image encryption, reversible data hiding

Procedia PDF Downloads 406
24040 Leveraging Digital Transformation Initiatives and Artificial Intelligence to Optimize Readiness and Simulate Mission Performance across the Fleet

Authors: Justin Woulfe

Abstract:

Siloed logistics and supply chain management systems throughout the Department of Defense (DoD) have led to disparate approaches to modeling and simulation (M&S), a lack of understanding of how one system impacts the whole, and issues with “optimal” solutions that are good for one organization but have dramatic negative impacts on another. Many different systems have evolved to try to understand and account for uncertainty and to reduce the consequences of the unknown. As the DoD undertakes expansive digital transformation initiatives, there is an opportunity to fuse and leverage traditionally disparate data into a centrally hosted source of truth. With a streamlined process incorporating machine learning (ML) and artificial intelligence (AI), advanced M&S will enable informed decisions guiding program success via optimized operational readiness and improved mission success. One of the current challenges is to leverage the terabytes of data generated by monitored systems to provide actionable information for all levels of users. The implementation of a cloud-based application analyzing data transactions, learning and predicting future states from current and past states in real-time, and communicating those anticipated states is an appropriate solution for the purposes of reduced latency and improved confidence in decisions. Decisions made from an ML and AI application combined with advanced optimization algorithms will improve the mission success and performance of systems, which will improve the overall cost and effectiveness of any program. The Systecon team constructs and employs model-based simulations, cutting across traditional silos of data, aggregating maintenance and supply data, incorporating sensor information, and applying optimization and simulation methods to an as-maintained digital twin with the ability to aggregate results across a system’s lifecycle and across logical and operational groupings of systems.
This coupling of data throughout the enterprise enables tactical, operational, and strategic decision support, detachable and deployable logistics services, and configuration-based automated distribution of digital technical and product data to enhance supply and logistics operations. As a complete solution, this approach significantly reduces program risk by allowing flexible configuration of data, data relationships, business process workflows, and early test and evaluation, especially budget trade-off analyses. A true capability to tie resources (dollars) to weapon system readiness in alignment with the real-world scenarios a warfighter may experience has been an objective yet to be realized. By developing and solidifying an organic capability to directly relate dollars to readiness and to inform the digital twin, the decision-maker is now empowered through valuable insight and traceability. This type of educated decision-making provides an advantage over adversaries who struggle with maintaining system readiness at an affordable cost. The M&S capability developed allows program managers to independently evaluate system design and support decisions by quantifying their impact on operational availability and on operations and support cost, resulting in the ability to simultaneously optimize readiness and cost. This will allow the stakeholders to make data-driven decisions when trading cost and readiness throughout the life of the program. Finally, sponsors are able to validate product deliverables with efficiency and much higher accuracy than in previous years.

Keywords: artificial intelligence, digital transformation, machine learning, predictive analytics

Procedia PDF Downloads 152
24039 Identity Verification Using k-NN Classifiers and Autistic Genetic Data

Authors: Fuad M. Alkoot

Abstract:

DNA data have been used in forensics for decades. However, current research looks at using DNA as a biometric identity verification modality. The goal is to improve the speed of identification. We use gene data that was initially collected for autism detection to determine whether, and how accurately, this data can be used for identification applications. Our main goal is to find out whether our data preprocessing technique yields data useful as a biometric identification tool. We experiment with using the nearest neighbor classifier to identify subjects. Results show that the optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and a standard deviation of 1. The classification rate remains close to optimal at higher noise standard deviations, up to 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN).
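
The evaluation described above (a nearest neighbor classifier queried with noise-corrupted test vectors) can be sketched as follows. This is a minimal illustration in plain Python under stated assumptions: the subject identifiers, the three-dimensional feature vectors, and the helper names are hypothetical, and the paper's gene data and preprocessing are not reproduced.

```python
import math
import random
from collections import Counter
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def knn_predict(train: List[Tuple[Vector, str]], query: Vector, k: int = 1) -> str:
    """Return the majority label among the k training vectors closest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def add_gaussian_noise(x: Vector, sigma: float, rng: random.Random) -> List[float]:
    """Corrupt each feature with zero-mean normal noise of standard deviation sigma."""
    return [v + rng.gauss(0.0, sigma) for v in x]


def classification_rate(train, test, sigma: float, k: int = 1, seed: int = 0) -> float:
    """Fraction of noised test vectors assigned to the correct subject."""
    rng = random.Random(seed)
    hits = sum(
        knn_predict(train, add_gaussian_noise(x, sigma, rng), k) == label
        for x, label in test
    )
    return hits / len(test)


# Hypothetical enrolled subjects with toy feature vectors.
train = [([0.0, 0.0, 0.0], "s1"), ([10.0, 10.0, 10.0], "s2"), ([20.0, 0.0, 20.0], "s3")]
test = list(train)  # verification attempts: the same vectors, noised at query time
for sigma in (0.0, 1.0, 3.0):
    print(sigma, classification_rate(train, test, sigma))
```

How quickly the rate degrades with sigma depends on how far apart the subjects' vectors are relative to the noise scale, which is the property the abstract's experiment probes.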

Keywords: biometrics, genetic data, identity verification, k nearest neighbor

Procedia PDF Downloads 245
24038 Fractional, Component and Morphological Composition of Ambient Air Dust in the Areas of Mining Industry

Authors: S.V. Kleyn, S.Yu. Zagorodnov, A.A. Kokoulina

Abstract:

Technogenic emissions of the mining and processing complex are characterized by a high content of chemical components and solid dust particles. However, each industrial enterprise and its surrounding area have features that require refinement and parameterization. Numerous studies have shown the negative impact of fine dust (PM10 and PM2.5) on health, as well as the possibility of dust particles absorbing toxic components, including heavy metals. The aim of the study was a quantitative assessment of the fractional and particle size composition of ambient air dust in the area affected by a primary magnesium production complex. We also sought to describe the morphological features of the dust particles. Study methods. To identify the dust emission sources, the production process was analyzed. The particulate composition of the emissions was measured using a Microtrac S3500 laser particle analyzer (covered particle size range: 20 nm to 2000 µm). Particle morphology and component composition were established by high-resolution scanning electron microscopy (magnification 5 to 300,000 times) with the X-ray fluorescence device S3400N ‘HITACHI’. The chemical composition was identified by X-ray analysis of the samples using an X-ray diffractometer XRD-700 ‘Shimadzu’. The dust pollution level was determined using model calculations of emission dispersion in the atmosphere. The calculations were verified by instrumental studies. Results of the study. The results demonstrated that the dust emissions of different technical processes are heterogeneous and that their fractional structure is complex. The percentage of particles up to and including 2.5 micrometres ranged from 0.00 to 56.70%; particles up to and including 10 micrometres, from 0.00 to 85.60%; and particles larger than 10 micrometres, from 14.40 to 100.00%. Microscopy also detected nanoscale particles.
The studied dust particles have round, irregular, cubic and integral shapes. The dust contains magnesium, sodium, potassium, calcium, iron, and chlorine. On the basis of the obtained results, model calculations of dust emission dispersion were performed and the distribution areas of fine dust PM10 and PM2.5 were established. It was found that the fine dust fractions PM10 and PM2.5 are dispersed over large distances, beyond the border of the industrial site of the enterprise. The population living near the enterprise is exposed to the risk of diseases associated with dust exposure. The data are transferred to the economic entity to support decisions on measures to minimize the risks. Exposure and health risk indicators are used to provide targeted health and preventive care to the citizens living in the area of negative impact of the facility.

Keywords: dust emissions, exposure assessment, PM 10, PM 2.5

Procedia PDF Downloads 248
24037 A Review on Intelligent Systems for Geoscience

Authors: R Palson Kennedy, P.Kiran Sai

Abstract:

This article introduces machine learning (ML) researchers to the hurdles that geoscience problems present, as well as the opportunities for improvement in both ML and the geosciences. To meet that need, it presents a review from the data life cycle perspective. Numerous facets of the geosciences present unique difficulties for the study of intelligent systems. Geoscience data is notoriously difficult to analyze since it is frequently unpredictable, intermittent, sparse, multi-resolution, and multi-scale. The first section addresses data science’s essential concepts and theoretical underpinnings, while the second section covers key themes and shared experiences from current publications focused on each stage of the data life cycle. Finally, themes such as open science, smart data, and team science are considered.

Keywords: Data science, intelligent system, machine learning, big data, data life cycle, recent development, geo science

Procedia PDF Downloads 126
24036 Survey of the Relationship between Functional Movement Screening Tests and Anthropometric Dimensions in Healthy People, 2018

Authors: Akram Sadat Jafari Roodbandi, Parisa Kahani, Fatollah Rahimi Bafrani, Ali Dehghan, Nava Seyedi, Vafa Feyzi, Zohreh Forozanfar

Abstract:

Introduction: Movement function is considered the ability to produce and maintain balance, stability, and movement throughout the movement chain. A score of 14 or above on the 7 sub-tests of the functional movement screening (FMS) test indicates agility and optimal movement performance. On the other hand, a person's body is an important factor in physical fitness and optimal movement performance. The aim of this study was to identify the anthropometric dimensions that are effective in increasing motor function. Methods: This was a descriptive-analytical, cross-sectional study using simple random sampling. The FMS test, 25 anthropometric dimensions, and subcutaneous fat in five body regions were measured in 139 healthy students of Bam University of Medical Sciences. Data analysis was performed using SPSS software with univariate tests and linear regressions at a significance level of 0.05. Results: 139 students were enrolled in the study; 51.1% (71 subjects) were male and the rest were female. The mean and standard deviation of age, weight, height, and arm subcutaneous fat were 21.5 ± 1.45, 64.3 ± 12.6, 168.7 ± 9.8, and 15.3 ± 7, respectively. 17 participants (12.2%) scored less than 14, and the rest scored above 14. Using regression analysis, it was found that exercise and arm subcutaneous fat are predictive variables associated with obtaining a high score in the FMS test. Conclusion: Exercise and weight loss are effective factors for increasing the movement performance of individuals, and this effect is independent of the size of other physical dimensions.

Keywords: functional movement, screening test, anthropometry, ergonomics

Procedia PDF Downloads 140
24035 Identifying Protein-Coding and Non-Coding Regions in Transcriptomes

Authors: Angela U. Makolo

Abstract:

Protein-coding and Non-coding regions determine the biology of a sequenced transcriptome. Research advances have shown that Non-coding regions are important in disease progression and clinical diagnosis. Existing bioinformatics tools have been targeted towards Protein-coding regions alone. Therefore, there are challenges associated with gaining biological insights from transcriptome sequence data. These tools are also limited to computationally intensive sequence alignment, which is inadequate and less accurate to identify both Protein-coding and Non-coding regions. Alignment-free techniques can overcome the limitation of identifying both regions. Therefore, this study was designed to develop an efficient sequence alignment-free model for identifying both Protein-coding and Non-coding regions in sequenced transcriptomes. Feature grouping and randomization procedures were applied to the input transcriptomes (37,503 data points). Successive iterations were carried out to compute the gradient vector that converged the developed Protein-coding and Non-coding Region Identifier (PNRI) model to the approximate coefficient vector. The logistic regression algorithm was used with a sigmoid activation function. A parameter vector was estimated for every sample in 37,503 data points in a bid to reduce the generalization error and cost. Maximum Likelihood Estimation (MLE) was used for parameter estimation by taking the log-likelihood of six features and combining them into a summation function. Dynamic thresholding was used to classify the Protein-coding and Non-coding regions, and the Receiver Operating Characteristic (ROC) curve was determined. The generalization performance of PNRI was determined in terms of F1 score, accuracy, sensitivity, and specificity. The average generalization performance of PNRI was determined using a benchmark of multi-species organisms. 
The generalization error for identifying Protein-coding and Non-coding regions decreased from 0.514 to 0.508 and to 0.378 over the first three iterations. The cost (difference between the predicted and the actual outcome) also decreased from 1.446 to 0.842 and to 0.718 for the first, second and third iterations, respectively. The iterations terminated at the 390th epoch, with an error of 0.036 and a cost of 0.316. The computed elements of the parameter vector that maximized the objective function were 0.043, 0.519, 0.715, 0.878, 1.157, and 2.575. The PNRI gave an area under the ROC curve of 0.97, indicating an improved predictive ability. The PNRI identified both Protein-coding and Non-coding regions with an F1 score of 0.970, an accuracy of 0.969, a sensitivity of 0.966, and a specificity of 0.973. Using 13 non-human multi-species model organisms, the average generalization performance of the traditional method was 74.4%, while that of the developed model was 85.2%, making the developed model better at identifying Protein-coding and Non-coding regions in transcriptomes. The developed Protein-coding and Non-coding region identifier model efficiently identified the Protein-coding and Non-coding transcriptomic regions. It could be used in genome annotation and in the analysis of transcriptomes.
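
The ingredients named in this abstract (a sigmoid activation, iterative gradient updates that maximize the log-likelihood, a classification threshold, and the F1, accuracy, sensitivity, and specificity metrics) can be sketched generically. The code below is not the PNRI model: the two toy features, the learning rate, the epoch count, and the fixed 0.5 threshold are assumptions chosen for illustration, whereas the paper uses six features and a dynamic threshold whose rule is not given here.

```python
import math
from typing import Dict, List, Sequence


def sigmoid(z: float) -> float:
    """Logistic (sigmoid) function mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w: Sequence[float], x: Sequence[float]) -> float:
    """Probability of the positive class; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr: float = 0.5, epochs: int = 500) -> List[float]:
    """Maximum-likelihood fit by batch gradient ascent on the mean log-likelihood."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = yi - predict_proba(w, xi)  # d(log-likelihood)/d(logit)
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj + lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def classify(w, X, threshold: float = 0.5) -> List[int]:
    """Label a sample 1 when its predicted probability reaches the threshold."""
    return [1 if predict_proba(w, x) >= threshold else 0 for x in X]


def metrics(y_true, y_pred) -> Dict[str, float]:
    """Confusion-matrix metrics; assumes both classes occur in y_true."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Toy, linearly separable data (invented): 1 = coding, 0 = non-coding.
X = [[0.1, 0.2], [0.3, 0.1], [0.2, 0.4], [0.9, 0.8], [0.8, 0.7], [0.7, 0.9]]
y = [0, 0, 0, 1, 1, 1]
w = fit_logistic(X, y)
print(metrics(y, classify(w, X)))
```

Varying the `threshold` argument and recomputing sensitivity and specificity at each value is what traces out an ROC curve.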

Keywords: sequence alignment-free model, dynamic thresholding classification, input randomization, genome annotation

Procedia PDF Downloads 57
24034 Islamic Extremist Groups' Usage of Populism in Social Media to Radicalize Muslim Migrants in Europe

Authors: Muhammad Irfan

Abstract:

The rise of radicalization within Islam has spawned a new era of global terror. The battlefield successes of ISIS and the Taliban are fuelled by an ideological war waged, largely and successfully, in the media arena. This research will examine how Islamic extremist groups are using media modalities and populist narratives to influence migrant Muslim populations in Europe towards extremism. In 2014, ISIS shocked the world by exporting horrifically graphic forms of violence on social media. Their Muslim support base was largely disgusted and repulsed. In response, they reconfigured their narrative by introducing populist 'hooks', astutely portraying the Muslim populace as oppressed and exploited by unjust, corrupt autocratic regimes and Western power structures. Within this crucible of real and perceived oppression, hundreds of thousands of the most desperate, vulnerable and abused migrants left their homelands, risking their lives in the hope of finding peace, justice, and prosperity in Europe. Instead, many encountered social stigmatization, detention and/or discrimination for being illegal migrants, for lacking resources and for simply being Muslim. This research will examine how Islamic extremist groups are exploiting the disenfranchisement of these migrant populations and using populist messaging on social media to influence them towards violent extremism. ISIS, in particular, formulates specific encoded messages for newly-arriving Muslims in Europe, preying upon their vulnerability. Violence is posited as a populist response to the tyranny of European oppression. This research will analyze the factors and indicators which propel Muslim migrants along the spectrum from resilience to violent extremism. Expected outcomes are identification of factors which influence vulnerability towards violent extremism; an early-warning detection framework; predictive analysis models; and de-radicalization frameworks.
This research will provide valuable tools (practical and policy level) for European governments, security stakeholders, communities, policy-makers, and educators; it is anticipated to contribute to a de-escalation of Islamic extremism globally.

Keywords: populism, radicalization, de-radicalization, social media, ISIS, Taliban, shariah, jihad, Islam, Europe, political communication, terrorism, migrants, refugees, extremism, global terror, predictive analysis, early warning detection, models, strategic communication, populist narratives, Islamic extremism

Procedia PDF Downloads 113
24033 Applying the Regression Technique for Prediction of the Acute Heart Attack

Authors: Paria Soleimani, Arezoo Neshati

Abstract:

Myocardial infarction is one of the leading causes of death in the world. Some of these deaths occur even before the patient reaches the hospital. Myocardial infarction occurs as a result of impaired blood supply. Because most of these deaths are due to coronary artery disease, awareness of the warning signs of a heart attack is essential. Some heart attacks are sudden and intense, but most start slowly, with mild pain or discomfort, so early detection and successful treatment of these symptoms is vital to save patients. Therefore, the importance and usefulness of a system designed to assist physicians in the early diagnosis of acute heart attacks is obvious. The purpose of this study is to determine how well a predictive model would perform based only on patient-reportable clinical history factors, without using diagnostic tests or physical exams. This type of prediction model might have application outside of the hospital setting to give accurate advice to patients and influence them to seek care in appropriate situations. For this purpose, data were collected on 711 heart patients in hospitals in Iran. 28 clinical attributes that can be reported by patients were studied. Three logistic regression models were built on the basis of the 28 features to predict the risk of heart attacks. The best-performing logistic regression model had a C-index of 0.955 and an accuracy of 94.9%. The variables severe chest pain, back pain, cold sweats, shortness of breath, nausea, and vomiting were selected as the main features.

Keywords: Coronary heart disease, Acute heart attacks, Prediction, Logistic regression

Procedia PDF Downloads 442
24032 A Study of Mortars with Granulated Blast Furnace Slag as Fine Aggregate and Its Influence on Properties of Burnt Clay Brick Masonry

Authors: Vibha Venkataramu, B. V. Venkatarama Reddy

Abstract:

Natural river sand is the most preferred choice as fine aggregate in masonry mortars. Uncontrolled mining of sand from riverbeds for several decades has had detrimental effects on the environment. Several countries across the world have put strict restrictions on sand mining from riverbeds. However, in countries like India, the huge infrastructural boom has pushed the local construction industry to look for alternative materials to sand. This study aims at understanding the suitability of granulated blast furnace slag (GBS) as fine aggregate in masonry mortars. Apart from characterising the material properties of GBS, such as particle size distribution, pH, and chemical composition, tests were performed on mortars with GBS as fine aggregate. Additionally, the properties of five-brick-tall, stack-bonded masonry prisms with various types of GBS mortars were studied. Mortars with mix proportions of 1: 0: 6 (cement: lime: fine aggregate), 1: 1: 6, and 1: 0: 3 were considered for the study. Fresh and hardened properties of mortar, such as flow and compressive strength, were studied. To understand the behaviour of GBS mortars in masonry, tests such as compressive strength and flexure bond strength were performed on masonry prisms made with different types of GBS mortars. Furthermore, the elastic properties of masonry with GBS mortars were also studied under compression. For comparison purposes, the properties of corresponding control mortars with natural sand as fine aggregate and masonry prisms with sand mortars were also studied under similar testing conditions. From the study, it was observed that the addition of GBS negatively influenced the flow of mortars and positively influenced the compressive strength. The GBS mortars showed 20 to 25 % higher compressive strength at 28 days of age, compared to corresponding control mortars. Furthermore, masonry made with GBS mortars showed nearly 10 % higher compressive strengths compared to control specimens. However, the impact of GBS on the flexural strength of masonry was marginal.

Keywords: building materials, fine aggregate, granulated blast furnace slag in mortars, masonry properties

Procedia PDF Downloads 114
24031 Executive Function Assessment with Aboriginal Australians

Authors: T. Keiller, E. Hindman, P. Hassmen, K. Radford, L. Lavrencic

Abstract:

Background: Psychosocial disadvantage is associated with impaired cognitive abilities, with executive functioning (EF) abilities particularly vulnerable. EF abilities strongly predict general daily functioning, educational and career prospects, and health choices. A reliable and valid assessment of EF is important to support appropriate care and intervention strategies. However, evidence-based EF assessment tools for use with Aboriginal Australians are limited. Aim and Method: This research aims to develop and validate a culturally appropriate EF tool for use with Indigenous Australians. To this end, Study One aims to review current literature examining the benefits and disadvantages of current EF assessment tools for use with Indigenous Australians. Study Two aims to collate expert opinion on the strengths and weaknesses of various current EF assessment tools for use with Indigenous Australians using Delphi methodology with experienced psychologists (n = 10). The initial two studies will inform the development of a culturally appropriate assessment tool. Study Three aims to evaluate the psychometric properties of the tool with an Indigenous sample living in the New South Wales Mid-North Coast. The study aims to quantify the predictive validity of this tool via comparison to functionality predictors and neuropsychological assessment scores. Study Four aims to collect qualitative data surrounding the feasibility and acceptability of the tool among Indigenous Australians and health professionals. Expected Results: Findings from this research are likely to inform cognitive assessment practices and tool selection for health professionals conducting cognitive assessments with Indigenous Australians. Improved assessment of EF will inform appropriate care and intervention strategies for individuals with EF deficits.

Keywords: Aboriginal Australians, assessment tool, cognition, executive functioning

Procedia PDF Downloads 260
24030 Data Quality as a Pillar of Data-Driven Organizations: Exploring the Benefits of Data Mesh

Authors: Marc Bachelet, Abhijit Kumar Chatterjee, José Manuel Avila

Abstract:

Data quality is a key component of any data-driven organization. Without data quality, organizations cannot effectively make data-driven decisions, which often leads to poor business performance. Therefore, it is important for an organization to ensure that the data they use is of high quality. This is where the concept of data mesh comes in. Data mesh is an organizational and architectural decentralized approach to data management that can help organizations improve the quality of data. The concept of data mesh was first introduced in 2020. Its purpose is to decentralize data ownership, making it easier for domain experts to manage the data. This can help organizations improve data quality by reducing the reliance on centralized data teams and allowing domain experts to take charge of their data. This paper intends to discuss how a set of elements, including data mesh, are tools capable of increasing data quality. One of the key benefits of data mesh is improved metadata management. In a traditional data architecture, metadata management is typically centralized, which can lead to data silos and poor data quality. With data mesh, metadata is managed in a decentralized manner, ensuring accurate and up-to-date metadata, thereby improving data quality. Another benefit of data mesh is the clarification of roles and responsibilities. In a traditional data architecture, data teams are responsible for managing all aspects of data, which can lead to confusion and ambiguity in responsibilities. With data mesh, domain experts are responsible for managing their own data, which can help provide clarity in roles and responsibilities and improve data quality. Additionally, data mesh can also contribute to a new form of organization that is more agile and adaptable. 
By decentralizing data ownership, organizations can respond more quickly to changes in their business environment, which in turn can help improve overall performance by allowing better insights into the business through better reports and visualization tools. Monitoring and analytics are also important aspects of data quality. With data mesh, monitoring and analytics are decentralized, allowing domain experts to monitor and analyze their own data. This helps identify and address data quality problems quickly, leading to improved data quality. Data culture is another major aspect of data quality. With data mesh, domain experts are encouraged to take ownership of their data, which can help create a data-driven culture within the organization. This can lead to improved data quality and better business outcomes. Finally, the paper explores the contribution of AI in the coming years. AI can help enhance data quality by automating many data-related tasks, like data cleaning and data validation. By integrating AI into data mesh, organizations can further enhance the quality of their data. The concepts mentioned above are illustrated by AEKIDEN experience feedback. AEKIDEN is an international data-driven consultancy that has successfully implemented a data mesh approach. By sharing their experience, AEKIDEN can help other organizations understand the benefits and challenges of implementing data mesh and improving data quality.

Keywords: data culture, data-driven organization, data mesh, data quality for business success

Procedia PDF Downloads 125
24029 DNpro: A Deep Learning Network Approach to Predicting Protein Stability Changes Induced by Single-Site Mutations

Authors: Xiao Zhou, Jianlin Cheng

Abstract:

A single amino acid mutation can have a significant impact on the stability of protein structure. Thus, the prediction of protein stability change induced by single site mutations is critical and useful for studying protein function and structure. Here, we presented a deep learning network with the dropout technique for predicting protein stability changes upon single amino acid substitution. While using only protein sequence as input, the overall prediction accuracy of the method on a standard benchmark is >85%, which is higher than existing sequence-based methods and is comparable to the methods that use not only protein sequence but also tertiary structure, pH value and temperature. The results demonstrate that deep learning is a promising technique for protein stability prediction. The good performance of this sequence-based method makes it a valuable tool for predicting the impact of mutations on most proteins whose experimental structures are not available. Both the downloadable software package and the user-friendly web server (DNpro) that implement the method for predicting protein stability changes induced by amino acid mutations are freely available for the community to use.

Keywords: bioinformatics, deep learning, protein stability prediction, biological data mining

Procedia PDF Downloads 452
24028 Big Data Analysis with RHadoop

Authors: Ji Eun Shin, Byung Ho Jung, Dong Hoon Lim

Abstract:

It is almost impossible to store or analyze exponentially growing big data with traditional technologies. Hadoop is a new technology that makes this possible. The R programming language is by far the most popular statistical tool for big data analysis based on distributed processing with Hadoop technology. With RHadoop, which integrates the R and Hadoop environments, we implemented parallel multiple regression analysis on actual data of different sizes. Experimental results showed that our RHadoop system became much faster as the number of data nodes increased. We also compared the performance of our RHadoop system with the lm function and the biglm package available with bigmemory. The results showed that our RHadoop system was faster than the other packages owing to parallel processing, with the number of map tasks increasing as the size of the data increases.
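A minimal sketch of how a multiple regression can be split into map and reduce steps is given here in Python with NumPy rather than R, purely for illustration. It assumes the common sufficient-statistics formulation (each chunk contributes X'X and X'y, which are summed and then solved once); the abstract does not state that the authors used this formulation, and the function names `map_chunk`, `reduce_stats` and `fit_chunked_ols` are hypothetical.

```python
import numpy as np

def map_chunk(X, y):
    """Map step: per-chunk sufficient statistics X'X and X'y."""
    return X.T @ X, X.T @ y

def reduce_stats(stats):
    """Reduce step: add up the statistics from all chunks."""
    xtx = sum(s[0] for s in stats)
    xty = sum(s[1] for s in stats)
    return xtx, xty

def fit_chunked_ols(chunks):
    """Solve the normal equations once, after the reduce step."""
    xtx, xty = reduce_stats([map_chunk(X, y) for X, y in chunks])
    return np.linalg.solve(xtx, xty)

# Synthetic data (an assumption for the demo): intercept plus 3 predictors.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(1000), rng.normal(size=(1000, 3))])
true_beta = np.array([1.0, 2.0, -0.5, 0.3])
y = X @ true_beta + rng.normal(scale=0.1, size=1000)

# Four chunks stand in for four map tasks.
chunks = [(X[i:i + 250], y[i:i + 250]) for i in range(0, 1000, 250)]
beta = fit_chunked_ols(chunks)
```

Because the per-chunk statistics are small (p × p and p × 1), only they need to travel between nodes, which is one reason such a scheme can scale with the number of map tasks.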

Keywords: big data, Hadoop, parallel regression analysis, R, RHadoop

Procedia PDF Downloads 427
24027 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

To address memorization overfitting in the model-agnostic meta-learning (MAML) algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by mapping one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data would produce a large amount of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key data extraction method that extracts only part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively mitigate memorization overfitting in the meta-learning MAML algorithm.
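The idea of mapping one feature to multiple labels can be sketched as follows. This is an assumption-laden illustration, not the authors' procedure: it uses a random non-identity permutation of class ids, a common way to make tasks mutually exclusive, and the helper name `make_mutex_task` is hypothetical. The key data extraction step is not shown.

```python
import random

def make_mutex_task(examples, num_classes, rng):
    """Relabel a task with a random non-identity permutation of class ids.

    The same input then carries a different label than in the base task,
    so no single input-to-label mapping fits both tasks.
    Requires num_classes >= 2.
    """
    identity = list(range(num_classes))
    perm = identity[:]
    while perm == identity:
        rng.shuffle(perm)
    return [(x, perm[y]) for x, y in examples], perm

# Toy text-classification examples (hypothetical data).
base_task = [("good movie", 0), ("bad movie", 1), ("average movie", 2)]
mutex_task, perm = make_mutex_task(base_task, 3, random.Random(0))
```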

Keywords: data augmentation, mutex task generation, meta-learning, text classification.

Procedia PDF Downloads 85
24026 Efficient Positioning of Data Aggregation Point for Wireless Sensor Network

Authors: Sifat Rahman Ahona, Rifat Tasnim, Naima Hassan

Abstract:

Data aggregation is a helpful technique for reducing the data communication overhead in wireless sensor networks. One of the important tasks in data aggregation is positioning the aggregator points. Much work has been done on data aggregation, but efficient positioning of the aggregator points has received little attention. In this paper, the authors focus on the positioning, or placement, of the aggregation points in a wireless sensor network. The authors propose an algorithm to select the aggregator positions for a scenario where aggregator nodes are more powerful than sensor nodes.

Keywords: aggregation point, data communication, data aggregation, wireless sensor network

Procedia PDF Downloads 148
24025 The Clash between Environmental and Heritage Laws: An Australian Case Study

Authors: Andrew R. Beatty

Abstract:

The exploitation of Australia’s vast mineral wealth is regulated by a matrix of planning, environment and heritage legislation, and despite the desire for a ‘balance’ between economic, environmental and heritage values, Aboriginal objects and places are often detrimentally impacted by mining approvals. The Australian experience is not novel. There are other cases of clashes between the rights of traditional landowners and businesses seeking to exploit mineral or other resources on or beneath those lands, including in the United States, Canada, and Brazil. How one reconciles the rights of traditional owners with those of resource companies is an ongoing legal problem of general interest. In Australia, planning and environmental approvals for resource projects are ordinarily issued by State or Territory governments. Federal legislation such as the Aboriginal and Torres Strait Islander Heritage Protection Act 1984 (Cth) is intended to act as a safety net when State or Territory legislation is incapable of protecting Indigenous objects or places in the context of approvals for resource projects. This paper will analyse the context and effectiveness of legislation enacted to protect Indigenous heritage in the planning process. In particular, the paper will analyse how the statutory objects of such legislation need to be weighed against the statutory objects of competing legislation designed to facilitate and control resource exploitation. Using a current claim in the Federal Court of Australia for the protection of a culturally significant landscape as a case study, this paper will examine the challenges faced in ascribing value to cultural heritage within the wider context of environmental and planning laws. Our findings will reveal that there is an inherent difficulty in defining and weighing competing economic, environmental and heritage considerations. 
An alternative framework will be proposed to guide regulators towards making decisions that result in better protection of Indigenous heritage in the context of resource management.

Keywords: environmental law, heritage law, indigenous rights, mining

Procedia PDF Downloads 90
24024 Spatial Econometric Approaches for Count Data: An Overview and New Directions

Authors: Paula Simões, Isabel Natário

Abstract:

This paper reviews a number of theoretical aspects of implementing an explicit spatial perspective in econometrics for modelling non-continuous data in general, and count data in particular. It provides an overview of the spatial econometric approaches available for modelling data collected with reference to location in space, from the classical spatial econometrics approaches to recent developments in spatial econometrics for count data in a Bayesian hierarchical setting. Considerable attention is paid to the inferential framework necessary for structurally consistent spatial econometric count models incorporating spatial lag autocorrelation, to the corresponding estimation and testing procedures under different assumptions, and to the constraints and implications embedded in the various specifications in the literature. This review combines insights from the classical spatial econometrics literature with those from hierarchical modeling and analysis of spatial data, in order to look for possible new directions in the processing of count data in a spatial hierarchical Bayesian econometric context.

Keywords: spatial data analysis, spatial econometrics, Bayesian hierarchical models, count data

Procedia PDF Downloads 582
24023 MCD-017: Potential Candidate from the Class of Nitroimidazoles to Treat Tuberculosis

Authors: Gurleen Kour, Mowkshi Khullar, B. K. Chandan, Parvinder Pal Singh, Kushalava Reddy Yumpalla, Gurunadham Munagala, Ram A. Vishwakarma, Zabeer Ahmed

Abstract:

New chemotherapeutic compounds against multidrug-resistant Mycobacterium tuberculosis (Mtb) are urgently needed to combat drug resistance in tuberculosis (TB). Apart from in-vitro potency against the target, physicochemical and pharmacokinetic properties play an imperative role in the process of drug discovery. We have identified novel nitroimidazole derivatives with potential activity against Mycobacterium tuberculosis. One lead candidate, MCD-017, showed potent activity against the H37Rv strain (MIC = 0.5 µg/ml) and was further evaluated in the process of drug development. Methods: Basic physicochemical parameters such as solubility and lipophilicity (Log P) were evaluated. Thermodynamic solubility was determined in PBS buffer (pH 7.4) using LC-MS/MS. The partition coefficient (Log P) of the compound was determined between octanol and phosphate buffered saline (PBS at pH 7.4) at 25°C by the microscale shake flask method. The compound followed Lipinski’s rule of five, which is predictive of good oral bioavailability, and was further evaluated for metabolic stability. In-vitro metabolic stability was determined in rat liver microsomes. The hepatotoxicity of the compound was also determined in the HepG2 cell line. The in vivo pharmacokinetic profile of the compound after oral dosing was also obtained using BALB/c mice. Results: The compound exhibited favorable solubility and lipophilicity. The physical and chemical properties of the compound were used as the first determination of drug-like properties. The compound obeyed Lipinski’s rule of five, with molecular weight < 500, number of hydrogen bond donors (HBD) < 5 and number of hydrogen bond acceptors (HBA) not more than 10. The Log P of the compound was less than 5, and therefore the compound is predicted to exhibit good absorption and permeation. Pooled rat liver microsomes were prepared from rat liver homogenate for measuring the metabolic stability. 
99% of the compound was not metabolized and remained intact. The compound did not exhibit cytotoxicity in HepG2 cells up to 40 µg/ml. The compound showed a good pharmacokinetic profile at a dose of 5 mg/kg administered orally, with a half-life (t1/2) of 1.15 hours, Cmax of 642 ng/ml, clearance of 4.84 ml/min/kg and a volume of distribution of 8.05 l/kg. Conclusion: The emergence of multidrug-resistant (MDR) and extensively drug-resistant (XDR) tuberculosis emphasizes the need for novel drugs active against tuberculosis. Physicochemical and pharmacokinetic properties therefore need to be evaluated in the early stages of drug discovery to reduce the attrition associated with poor drug exposure. In summary, it can be concluded that MCD-017 may be considered a good candidate for further preclinical and clinical evaluations.
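The rule-of-five screen mentioned above can be expressed as a short check. The sketch below uses the commonly quoted thresholds (molecular weight at most 500, log P at most 5, at most 5 hydrogen bond donors, at most 10 acceptors); the function name `lipinski_violations` and the input values are hypothetical, since the abstract does not report the actual descriptors of MCD-017.

```python
def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Return the names of the rule-of-five criteria a compound violates."""
    checks = {
        "molecular weight": mol_weight <= 500,
        "log P": log_p <= 5,
        "hydrogen bond donors": h_donors <= 5,
        "hydrogen bond acceptors": h_acceptors <= 10,
    }
    return [name for name, passed in checks.items() if not passed]

# Hypothetical descriptor values, NOT measurements from the paper.
print(lipinski_violations(350.0, 2.1, 1, 6))
print(lipinski_violations(620.0, 5.6, 1, 6))
```

An empty list means the compound passes all four criteria; the rule is usually read as a screening heuristic for oral absorption rather than a guarantee.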

Keywords: mycobacterium tuberculosis, pharmacokinetics, physicochemical properties, hepatotoxicity

Procedia PDF Downloads 451
24022 A NoSQL Based Approach for Real-Time Managing of Robotics's Data

Authors: Gueidi Afef, Gharsellaoui Hamza, Ben Ahmed Samir

Abstract:

This paper deals with the continual growth of data, in response to which new data management solutions have emerged: NoSQL databases. They have spread across several areas such as personalization, profile management, big data in real time, content management, catalogs, customer views, mobile applications, the internet of things, digital communication and fraud detection. These database management systems are increasingly used. They store data very well, and with the trend towards big data, new storage challenges demand new structures and methods for managing enterprise data. New intelligent machines in the e-learning sector thrive on more data, so smart machines can learn more and faster. Robotics is the use case on which our tests focus. Implementing NoSQL for robotics helps wrangle all the data that robots acquire into usable form, because with ordinary robotics data management we face severe limits in managing and finding the exact information in real time. Our proposed approach is demonstrated by experimental studies and a running example used as a use case.

Keywords: NoSQL databases, database management systems, robotics, big data

Procedia PDF Downloads 341
24021 Development of a Novel Clinical Screening Tool, Using the BSGE Pain Questionnaire, Clinical Examination and Ultrasound to Predict the Severity of Endometriosis Prior to Laparoscopic Surgery

Authors: Marlin Mubarak

Abstract:

Background: Endometriosis is a complex, disabling disease that mainly affects young women of reproductive age. The aim of this project is to generate a diagnostic model to predict the severity and stage of endometriosis prior to laparoscopic surgery. This will help to improve the pre-operative diagnostic accuracy of stage 3 and 4 endometriosis and, as a result, allow relevant women to be referred to a specialist centre for complex laparoscopic surgery. The model is based on the British Society of Gynaecological Endoscopy (BSGE) pain questionnaire, clinical examination and ultrasound scan. Design: This is a prospective, observational study in which women completed the BSGE pain questionnaire, a BSGE requirement. In addition, as part of the routine preoperative assessment, patients had a routine ultrasound scan, and when recto-vaginal and deep infiltrating endometriosis was suspected, an MRI was performed. Setting: Luton & Dunstable University Hospital. Patients: Symptomatic women (n = 56) scheduled for laparoscopy due to pelvic pain. Age ranged from 17 to 52 years (mean 33.8 years, SD 8.7 years). Interventions: None outside the recognised and established endometriosis centre protocol set up by the BSGE. Main Outcome Measure(s): Sensitivity and specificity of endometriosis diagnosis predicted by symptoms based on the BSGE pain questionnaire, clinical examinations and imaging. Findings: The prevalence of diagnosed endometriosis was calculated to be 76.8% and the prevalence of advanced stage was 55.4%. Deep infiltrating endometriosis (DIE) in various locations was diagnosed in 32/56 women (57.1%), and some had DIE involving several locations. Logistic regression analysis was performed on 36 clinical variables to create a simple clinical prediction model. After creating the scoring system using variables with P < 0.05, the model was applied to the whole dataset. The sensitivity was 83.87% and the specificity 96%. 
The positive likelihood ratio was 20.97 and the negative likelihood ratio was 0.17, indicating that the model has good predictive value and could be useful in predicting advanced-stage endometriosis. Conclusions: This is a hypothesis-generating project with one operator, but future proposed research would provide validation of the model and establish its usefulness in the general setting. Predictive tools based on such a model could help organise the appropriate investigation in clinical practice, reduce risks associated with surgery and improve outcomes. It could be of value for future research to standardise the assessment of women presenting with pelvic pain. The model needs further testing in a general setting to assess whether the initial results are reproducible.
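The reported likelihood ratios follow from the reported sensitivity and specificity through the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The short check below reproduces them; the function name `likelihood_ratios` is an illustrative choice, and only the two input percentages come from the abstract.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios from proportions in [0, 1]."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg

# Sensitivity 83.87% and specificity 96%, as reported in the abstract.
lr_pos, lr_neg = likelihood_ratios(0.8387, 0.96)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```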

Keywords: deep endometriosis, endometriosis, minimally invasive, MRI, ultrasound.

Procedia PDF Downloads 343
24020 Numerical Modelling of 3-D Fracture Propagation and Damage Evolution of an Isotropic Heterogeneous Rock with a Pre-Existing Surface Flaw under Uniaxial Compression

Authors: S. Mondal, L. M. Olsen-Kettle, L. Gross

Abstract:

Fracture propagation and damage evolution are extremely important for many industrial applications, including the mining industry, composite materials, earthquake simulations and hydraulic fracturing. The influence of pre-existing flaws and rock heterogeneity on the processes and mechanisms of rock fracture has important ramifications in many mining and reservoir engineering applications. We simulate the damage evolution and fracture propagation in an isotropic sandstone specimen containing a pre-existing 3-D surface flaw in different configurations under uniaxial compression. We apply a damage model based on the unified strength theory and solve the solid deformation and damage evolution equations using the Finite Element Method (FEM) with tetrahedron elements on unstructured meshes through the simulation software eScript. Unstructured meshes provide higher geometrical flexibility and allow a more accurate way to model the varying flaw depth, angle, and length through locally adapted FEM meshes. The heterogeneity of the rock is considered by initializing material properties using a Weibull distribution sampled over a cubic grid. In our model, we introduce a length scale related to the rock heterogeneity which is independent of the mesh size. We investigate the effect of parameters including the heterogeneity of the elastic moduli and the geometry of the single flaw on the stress-strain response. Three typical surface cracking patterns, called wing cracks, anti-wing cracks and far-field cracks, were identified, and these depend on the geometry of the pre-existing surface flaw. These model results help to advance our understanding of fracture and damage growth in heterogeneous rock, with the aim of developing fracture simulators for different industry applications.
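The heterogeneity initialization described above (material properties drawn from a Weibull distribution on a cubic grid) can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the eScript implementation: the function name `weibull_field`, the grid size, the mean modulus and the shape parameter are all hypothetical, and the mapping of the grid onto the FEM mesh is not shown.

```python
import math
import numpy as np

def weibull_field(grid_shape, mean_value, shape_m, rng):
    """Sample a property (e.g. an elastic modulus) on a cubic grid.

    shape_m is the Weibull shape parameter: larger values give a more
    homogeneous material. The scale is chosen so that the expected value
    of the field equals mean_value, using E[X] = scale * Gamma(1 + 1/m).
    """
    scale = mean_value / math.gamma(1.0 + 1.0 / shape_m)
    return scale * rng.weibull(shape_m, size=grid_shape)

rng = np.random.default_rng(42)
# Hypothetical values: 20 x 20 x 20 cells, mean modulus 20 GPa, m = 3.
modulus = weibull_field((20, 20, 20), 20.0e9, 3.0, rng)
```

Sampling on a grid whose cell size is fixed independently of the mesh is one way to obtain a heterogeneity length scale that does not change when the mesh is refined.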

Keywords: finite element method, heterogeneity, isotropic damage, uniaxial compression

Procedia PDF Downloads 206
24019 Research on Internet Attention of Tourism and Marketing Strategy in Northeast Sichuan Economic Zone in China Based on Baidu Index

Authors: Chuanqiao Zheng, Wei Zeng, Haozhen Lin

Abstract:

As of March 2020, the number of Chinese netizens had reached 904 million. The proportion of Internet users accessing the Internet through mobile phones is as high as 99.3%. Against the background of 'Internet +', tourists have a stronger sense of independence in their choice of tourism destinations and tourism products. Tourists are more inclined to learn about tourism destinations and other tourists' evaluations of tourist products through the Internet. The search engine, as an integrated platform that contains a wealth of information, is highly valuable for analyzing the characteristics of the Internet attention given to various tourism destinations through big data mining and analysis. This article uses the Baidu Index, one of the products of Baidu Search, as the data source. The Baidu Index is based on big data and collects and shares the search results of a large number of Internet users on the Baidu search engine. The big data used in this article includes the search index, demand map, population profile, etc. The main research methods used are: (1) based on the search index, analyzing the Internet attention given to tourism in five cities in Northeast Sichuan at different times, so as to obtain the overall trend and individual characteristics of tourism development in the region; (2) based on the demand map and the population profile, analyzing the demographic characteristics and market positioning of the tourist groups in these cities to understand the characteristics and needs of the target groups; (3) correlating the Internet attention data with the corresponding permanent population of each province in China to construct the Boston matrix of the Internet attention rate of Northeast Sichuan tourism, obtain the tourism target markets, and then propose development strategies for different markets. 
The study has found that: a) the Internet attention given to tourism in the region can be categorized into a tourist off-season and a peak season, and the Internet attention given to tourism differs considerably between cities; b) tourists look for information including tour guide information, ticket information, traffic information, weather information, and information on competing tourism cities; with regard to the population profile, the main group of potential tourists searching for tourism keywords for the five prefecture-level cities in Northeast Sichuan is young people, and the male to female ratio is about 6 to 4, with males predominant; c) through the construction of the Boston matrix, it is concluded that the star market for tourism in the Northeast Sichuan Economic Zone includes Sichuan and Shaanxi; the cash cow market includes Hainan and Ningxia; the question mark market includes Jiangsu and Shanghai; the dog market includes Hubei and Jiangxi. The study concludes with the following planning strategies and recommendations: i) creating a diversified business format that integrates culture and tourism; ii) creating a brand image of niche tourism; iii) focusing on the development of tourism products; iv) innovating composite three-dimensional marketing channels.

Keywords: Baidu Index, big data, internet attention, tourism

Procedia PDF Downloads 115
24018 Contribution to the Decision-Making Process for Selecting the Suitable Maintenance Policy

Authors: Nasser Y. Mahamoud, Pierre Dehombreux, Hassan E. Robleh

Abstract:

Industrial companies may be confronted with questions about their choice of maintenance policy. This choice must be guided by several decision criteria or objectives related to their production or service activities, but also to their level of development and their investment prospects. A decision-support methodology for choosing a maintenance policy (corrective, systematic or conditional preventive, predictive, opportunistic or not) is proposed to facilitate this choice using the main categories of the most important decision criteria. The different steps of this methodology are illustrated using a theoretical case: identifying the different maintenance alternatives, determining the structure of the most important categories of decision criteria, assessing the different maintenance policies against the criteria by using an ordinal preference relation, and finally ranking the different maintenance policies.
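Since the keywords list AHP, the weighting step of such a methodology can be sketched with the standard AHP principal-eigenvector calculation. This is only an illustration: the abstract describes an ordinal preference relation and does not give the authors' matrices, so the function name `ahp_weights`, the three criteria and the pairwise comparison values below are all hypothetical.

```python
import numpy as np

def ahp_weights(pairwise):
    """Priority weights from a reciprocal pairwise comparison matrix.

    Returns the normalized principal eigenvector and the principal
    eigenvalue (equal to the matrix size when judgments are consistent).
    """
    values, vectors = np.linalg.eig(pairwise)
    k = np.argmax(values.real)
    weights = np.abs(vectors[:, k].real)
    return weights / weights.sum(), values[k].real

# Hypothetical criteria: cost, availability, safety.
# Entry [i, j] states how much more important criterion i is than j.
pairwise = np.array([
    [1.0,       5.0 / 3.0, 5.0 / 2.0],
    [3.0 / 5.0, 1.0,       3.0 / 2.0],
    [2.0 / 5.0, 2.0 / 3.0, 1.0],
])
weights, lambda_max = ahp_weights(pairwise)
```

The resulting weights can then be combined with per-criterion scores of each maintenance policy to produce a ranking.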

Keywords: maintenance policy, decision criteria, decision-making process, AHP

Procedia PDF Downloads 323
24017 The Determinants of Corporate Hedging Strategy

Authors: Ademola Ajibade

Abstract:

Previous studies have explored several rationales for hedging strategies, but the evidence provided by these studies remains ambiguous. Using a hand-collected dataset of 2460 observations of non-financial firms in eight African countries covering 2013-2022, this paper investigates the determinants and extent of corporate hedge use. In particular, this paper focuses on the link between country-specific conditions and the corporate hedging behaviour of firms. To our knowledge, this is the first African study investigating the association between country-specific factors and corporate hedging policy. The evidence from both univariate and multivariate analyses reveals that country-level corruption and government quality are important indicators of the decision to hedge and the extent of hedge use among African firms. However, the connection between country-specific factors and corporate hedge use is stronger for firms located in highly corrupt countries. This suggests that firms located in corrupt countries are more motivated to hedge due to the large exposure they face. In addition, we test the risk management theories and observe that CEOs' educational qualifications and experience shape corporate hedging behaviour. We use lagged variables in a panel data setting to address endogeneity concerns and include an interaction term between governance indices and firm-specific variables to test for robustness. Overall, our findings reveal that institutional factors shape risk management decisions and have predictive power in explaining corporate hedging strategy.

Keywords: corporate hedging, governance quality, corruption, derivatives

Procedia PDF Downloads 79
24016 Reductive Control in the Management of Redundant Actuation

Authors: Mkhinini Maher, Knani Jilani

Abstract:

In this work, we present the performance of a mobile omnidirectional robot by evaluating its management of actuation redundancy, which leads us to the predictive control that was implemented. The distribution of the wrench over the robot's actuators through the Moore-Penrose pseudo-inverse corresponds to a "geometric" distribution of efforts. We show that the load on the vehicle's wheels is not equally distributed; it depends on the wheel configuration and on the robot's movement. Thus, the sliding threshold is not the same for the three wheels of the vehicle. We suggest exploiting the redundancy of actuation to reduce the risk of wheel sliding and thereby improve the accuracy of displacement. This kind of approach has previously been studied for legged robots.
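Distributing a desired wrench over redundant actuators with the Moore-Penrose pseudo-inverse can be sketched as below. The geometry is an assumption made for the example: the matrix `A`, which maps four actuator efforts to a planar wrench (two forces and a moment), is hypothetical and is not the robot of the paper. Among all effort vectors that produce the wrench, the pseudo-inverse returns the one with the smallest Euclidean norm.

```python
import numpy as np

def distribute_wrench(A, wrench):
    """Minimum-norm actuator efforts u such that A @ u equals the wrench."""
    return np.linalg.pinv(A) @ wrench

# Hypothetical 3 x 4 map: more actuators (4) than wrench components (3).
A = np.array([
    [1.0, 0.0, -1.0,  0.0],   # force along x
    [0.0, 1.0,  0.0, -1.0],   # force along y
    [0.2, 0.2,  0.2,  0.2],   # moment about the vertical axis
])
wrench = np.array([1.0, 0.5, 0.1])
efforts = distribute_wrench(A, wrench)

# Any null-space component changes the efforts without changing the wrench.
null_vector = np.array([1.0, -1.0, 1.0, -1.0])
alternative = efforts + 0.3 * null_vector
```

The null-space freedom shown in the last lines is what a redundancy-exploiting controller could use, for example to shift effort away from a wheel that is close to its sliding threshold.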

Keywords: mobile robot, actuation, redundancy, omnidirectional, Moore-Penrose pseudo-inverse, reductive control

Procedia PDF Downloads 502
24015 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

To address memorization overfitting in the model-agnostic meta-learning (MAML) algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by mapping one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data would produce a large amount of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key data extraction method that extracts only part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively mitigate memorization overfitting in the meta-learning MAML algorithm.

Keywords: mutex task generation, data augmentation, meta-learning, text classification.

Procedia PDF Downloads 129
24014 Revolutionizing Traditional Farming Using Big Data/Cloud Computing: A Review on Vertical Farming

Authors: Milind Chaudhari, Suhail Balasinor

Abstract:

Due to massive deforestation and an ever-increasing population, the organic content of the soil is depleting at a much faster rate. As a result, there is a significant chance that world food production will drop by 40% in the next two decades. Vertical farming can aid food production by leveraging big data and cloud computing to ensure plants are grown naturally, providing the optimum nutrients and sunlight by analyzing millions of data points. This paper outlines the most important parameters in vertical farming and how a combination of big data and AI helps in calculating and analyzing these millions of data points. Finally, the paper outlines how different organizations are controlling the indoor environment by leveraging big data to enhance food quantity and quality.

Keywords: big data, IoT, vertical farming, indoor farming

Procedia PDF Downloads 165