Search results for: data discovery
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24828

24768 Helping the Development of Public Policies with Knowledge of Criminal Data

Authors: Diego De Castro Rodrigues, Marcelo B. Nery, Sergio Adorno

Abstract:

The project aims to develop a framework for social data analysis, particularly by mobilizing criminal records and applying descriptive computational techniques, such as association algorithms and the extraction of decision tree rules, among others. The methods and instruments discussed in this work will enable the discovery of patterns, providing a guided means to identify similarities between recurring situations in the social sphere using descriptive techniques and data visualization. The study area has been defined as the city of São Paulo, with the structuring of social data as the central idea and a particular focus on the quality of the information. Given this, a set of tools will be validated, including the use of a database and tools for visualizing the results. Among the main deliverables related to products and the development of articles are the discoveries made during the research phase. The effectiveness and utility of the results will depend on studies involving real data, validated both by domain experts and by identifying and comparing the patterns found in this study with other phenomena described in the literature. The intention is to contribute to evidence-based understanding and decision-making in the social field.

Keywords: social data analysis, criminal records, computational techniques, data mining, big data

Procedia PDF Downloads 65
24767 Whole Coding Genome Inter-Clade Comparison to Predict Global Cancer-Protecting Variants

Authors: Lamis Naddaf, Yuval Tabach

Abstract:

In this research, we identified the missense genetic variants that have the potential to enhance resistance against cancer. This field has not been widely explored, as researchers tend to investigate mutations that cause diseases, in response to the suffering of patients, rather than mutations that protect from them. In conjunction with the genomic revolution and the advances in genetic engineering and synthetic biology, identifying the protective variants will increase the power of genotype-phenotype predictions and can have significant implications for improved risk estimation, diagnostics, prognosis and even personalized therapy and drug discovery. To approach our goal, we systematically investigated the sites of the coding genomes and selected the alleles that showed a correlation with the species’ cancer resistance. We predicted 250 protecting variants (PVs) with a 0.01 false discovery rate and more than 20 thousand PVs with a 0.25 false discovery rate. Cancer resistance in mammals and reptiles was significantly predicted by the number of PVs a species has. Moreover, genes enriched with the protecting variants are enriched in pathways relevant to tumor suppression, such as Hedgehog signaling and silencing, whose improper activation is associated with the most common form of cancer malignancy. We also showed that the PVs are more abundant in healthy people compared to cancer patients within different human races.
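
The abstract reports variant counts at two false discovery rate (FDR) thresholds but does not say which procedure was used. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up procedure, the following shows why a looser threshold admits more discoveries. The function name and the p-values are hypothetical and are not data from the study.

```python
def benjamini_hochberg(p_values, fdr):
    """Return the indices of hypotheses rejected at the given FDR level.

    Assumption: Benjamini-Hochberg step-up procedure; the abstract does
    not state which FDR method the authors applied.
    """
    m = len(p_values)
    # Rank hypotheses by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    return sorted(order[:cutoff_rank])


# Invented p-values, for illustration only.
toy_p_values = [0.0004, 0.003, 0.02, 0.04, 0.2, 0.6]
strict = benjamini_hochberg(toy_p_values, fdr=0.01)
loose = benjamini_hochberg(toy_p_values, fdr=0.25)
```

On this toy input the strict threshold keeps two hypotheses and the loose one keeps five, which mirrors the pattern in the abstract: far more predicted variants at 0.25 than at 0.01, at the price of a larger expected share of false positives.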

Keywords: comparative genomics, machine learning, cancer resistance, cancer-protecting alleles

Procedia PDF Downloads 83
24766 The Parallelization of Algorithm Based on Partition Principle for Association Rules Discovery

Authors: Khadidja Belbachir, Hafida Belbachir

Abstract:

With the expansion of physical storage media and the ceaseless need to accumulate data, sequential algorithms for association rule mining have proved ineffective. The introduction of new parallel versions is therefore imperative. In this paper, we propose a parallel version of the sequential algorithm “Partition”. This algorithm is fundamentally different from the other sequential algorithms because it scans the database only twice to generate the significant association rules. Consequently, the parallel approach does not require much communication between the sites. The proposed approach was implemented for an experimental study. The results show a great reduction in execution time compared to the sequential version and the Count Distribution algorithm.
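
The two-scan idea can be sketched as follows. This is a simplified, single-process illustration of the general Partition principle, not the authors' parallel implementation: the function names, the brute-force local miner, the `max_size` limit and the toy transactions are all assumptions made for the example. In a parallel setting, each partition in the first scan would be handled by a different site.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force itemsets (up to max_size items) frequent within one chunk."""
    counts = {}
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] = counts.get(itemset, 0) + 1
    threshold = min_support * len(transactions)
    return {s for s, c in counts.items() if c >= threshold}


def partition_mine(transactions, min_support, n_partitions, max_size=2):
    """Two-scan mining: local candidates first, then one global count."""
    # Scan 1: mine each partition on its own. An itemset that is frequent
    # globally must be frequent in at least one partition, so the union of
    # the local results is a complete candidate set.
    size = max(1, -(-len(transactions) // n_partitions))  # ceiling division
    partitions = [transactions[i:i + size]
                  for i in range(0, len(transactions), size)]
    candidates = set()
    for part in partitions:
        candidates |= frequent_itemsets(part, min_support, max_size)

    # Scan 2: count every candidate once over the whole database.
    counts = dict.fromkeys(candidates, 0)
    for transaction in transactions:
        items = set(transaction)
        for candidate in candidates:
            if items.issuperset(candidate):
                counts[candidate] += 1
    threshold = min_support * len(transactions)
    return {c: n for c, n in counts.items() if n >= threshold}


# Hypothetical toy database.
toy_db = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["milk", "butter"],
    ["bread", "milk", "butter"],
    ["bread", "milk"],
    ["milk"],
]
result = partition_mine(toy_db, min_support=0.5, n_partitions=2)
```

Only the candidate sets and their counts need to be exchanged between partitions, which is consistent with the abstract's remark that the parallel approach requires little communication between sites.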

Keywords: association rules, distributed data mining, partition, parallel algorithms

Procedia PDF Downloads 386
24765 The Different Learning Path Analysis of Students with Different Learning Attitudes and Styles in Arts Creation

Authors: Tracy Ho, Huann-Shyang Lin, Mina Lin

Abstract:

This study investigated the different learning paths of students with different learning attitudes and learning styles in Arts Creation. Based on direct instruction, guided-discovery learning, and discovery learning theories, a tablet app including the following three learning areas was developed for students: (1) replication and remix practice area, (2) guided creation area, and (3) free creation area. Thirty students with different learning attitudes and learning styles were invited to use this app. Students’ learning behaviors were categorized and defined. The results will provide both educators and researchers with insights that can form a useful foundation for designing different content and strategies with the application of new technologies in school teaching. They also shed light on how an educational app can be designed to enhance Arts Creation.

Keywords: App, arts creation, learning attitude, learning style, tablet

Procedia PDF Downloads 257
24764 Exploring the History of Chinese Music Acoustic Technology through Data Fluctuations

Authors: Yang Yang, Lu Xin

Abstract:

The study of extant musical sites can provide a side-by-side picture of historical ethnomusicological information. In their data collection on Chinese opera houses, researchers found that one Ming Dynasty opera house reached a width of nearly 18 meters, while all opera houses of the same period and later were significantly narrower. This transient historical fluctuation in width, which occurred in the absence of constraints on construction scale, has piqued the interest of researchers: why does the width data vary, and what factors have contributed to the lack of further expansion in the width of theatres? To address this question, this study used a comparative approach to conduct a venue experiment between this theater stage and another theater stage for non-heritage opera performances, collecting the subjective perceptions of performers and audiences at the different theater stages and using BK Connect platform software to measure data such as echo and delay. From the subjective and objective results, it is inferred that during the Ming Dynasty the Chinese ancients discovered and understood the acoustical phenomenon of the Haas effect by exploring the effect of stage width on musical performance and listening, and utilized this discovery to serve music in subsequent stage construction. This discovery marked a node of evolution in Chinese architectural acoustics technology driven by musical demands. It is also instructive to note that, in contrast to many of the world's "unsuccessful civilizations," China can use a combination of heritage and intangible cultural research to chart a clear, demand-driven course for the evolution of human music technology. The findings of such research will complete the journey of human exploration of music acoustics, and this practical experience can be applied to the exploration and understanding of other musical heritage base data.

Keywords: Haas effect, musical acoustics, history of acoustical technology, Chinese opera stage, structure

Procedia PDF Downloads 169
24763 Harnessing Emerging Creative Technology for Knowledge Discovery of Multiwavelength Datasets

Authors: Basiru Amuneni

Abstract:

Astronomy is one domain with a rise in data. Traditional tools for data management have been employed in the quest for knowledge discovery. However, these traditional tools become limited in the face of big data. One means of maximizing knowledge discovery for big data is the use of scientific visualisation. The aim of the work is to explore the possibilities offered by emerging creative technologies of Virtual Reality (VR) systems and game engines to visualize multiwavelength datasets. Game engines are primarily used for developing video games; however, their advanced graphics could be exploited for scientific visualization, which provides a means to graphically illustrate scientific data to ease human comprehension. Modern astronomy is now in the era of multiwavelength data, where a single galaxy, for example, is captured by the telescope several times and at different electromagnetic wavelengths to give a more comprehensive picture of the physical characteristics of the galaxy. Visualising this in an immersive environment would be more intuitive and natural for an observer. This work presents a standalone VR application that accesses galaxy FITS files. The application was built using the Unity Game Engine for the graphics underpinning and the OpenXR API for the VR infrastructure. The work used a methodology known as Design Science Research (DSR), which entails the act of ‘using design as a research method or technique’. The key stages of the galaxy modelling pipeline are FITS data preparation, Galaxy Modelling, Unity 3D Visualisation and VR Display. The FITS data format cannot be read by the Unity Game Engine directly. A DLL (CSHARPFITS), which provides native support for reading and writing FITS files, was used. The Galaxy modeller uses an approach that integrates cleaned FITS image pixels into the graphics pipeline of the Unity3d game Engine. 
The cleaned FITS images are then input to the galaxy modeller pipeline phase, which has a pre-processing script that extracts the pixels and galaxy world positions and colour-maps the FITS image pixels. The user can visualise image galaxies in different light bands, control the blend of the image with similar images from different sources or fuse images for a holistic view. The framework will allow users to build tools to realise complex workflows for public outreach and possibly scientific work with increased scalability, near real-time interactivity and ease of access. The application is presented in an immersive environment and can use all commercially available headsets built on the OpenXR API. The user can select galaxies in the scene, teleport to the galaxy, pan, zoom in/out, and change colour gradients of the galaxy. The findings and design lessons learnt in the implementation of different use cases will contribute to the development and design of game-based visualisation tools in immersive environments by enabling informed decisions to be made.

Keywords: astronomy, visualisation, multiwavelength dataset, virtual reality

Procedia PDF Downloads 75
24762 A Study of Various Ontology Learning Systems from Text and a Look into Future

Authors: Fatima Al-Aswadi, Chan Yong

Abstract:

With the volume of unstructured data on the web increasing day by day, the motivation to represent the knowledge in this data in a machine-processable form has grown. Ontology is one of the major cornerstones of representing information in a more meaningful way on the Semantic Web. The goal of ontology learning from text is to elicit and represent domain knowledge in a machine-readable form. This paper aims to give a follow-up review of ontology learning systems from text and some of their defects. Furthermore, it discusses how far the ontology learning process may be enhanced in the future.

Keywords: concept discovery, deep learning, ontology learning, semantic relation, semantic web

Procedia PDF Downloads 493
24761 A Photoredox (C)sp³-(C)sp² Coupling Method Comparison Study

Authors: Shasline Gedeon, Tiffany W. Ardley, Ying Wang, Nathan J. Gesmundo, Katarina A. Sarris, Ana L. Aguirre

Abstract:

Drug discovery and delivery involve drug targeting, an approach that helps find a drug against a chosen target through high throughput screening and other methods by way of identifying the physical properties of the potential lead compound. Physical properties of potential drug candidates have been an imperative focus since the unveiling of Lipinski's Rule of 5 for oral drugs. Throughout a compound's journey from discovery through clinical phase trials to becoming a classified drug on the market, the desirable properties are optimized while toxicity and undesirable properties are minimized or eliminated. In the pharmaceutical industry, the ability to generate molecules in parallel with maximum efficiency is a substantial factor achieved through sp²-sp² carbon coupling reactions, e.g., Suzuki Coupling reactions. These reaction types allow for the addition of aromatic fragments onto a compound. More recent literature has found benefits to decreasing aromaticity, calling for more sp³-sp² carbon coupling reactions instead. The objective of this project is to provide a comparison between various sp³-sp² carbon coupling methods and reaction conditions, collecting data on production of the desired product. Four different coupling methods were tested across three cores and 4-5 installation groups per method; each method was run under three distinct reaction conditions. The tested methods include the Photoredox Decarboxylative Coupling, the Photoredox Potassium Alkyl Trifluoroborate (BF3K) Coupling, the Photoredox Cross-Electrophile (PCE) Coupling, and the Weix Cross-Electrophile (WCE) Coupling. The results showed that the Decarboxylative method had great difficulty yielding product despite the several literature conditions chosen. The BF3K and PCE methods produced competitive results. Of the two Cross-Electrophile coupling methods, the Photoredox method surpassed the Weix method on numerous accounts. The results will be used to build future libraries.

Keywords: drug discovery, high throughput chemistry, photoredox chemistry, sp³-sp² carbon coupling methods

Procedia PDF Downloads 125
24760 Artificial Intelligence in Bioscience: The Next Frontier

Authors: Parthiban Srinivasan

Abstract:

With recent advances in computational power and access to enough data in biosciences, artificial intelligence methods are increasingly being used in drug discovery research. These methods are essentially a series of advanced statistics-based exercises that review the past to indicate the likely future. Our goal is to develop a model that accurately predicts biological activity and toxicity parameters for novel compounds. We have compiled a robust library of over 150,000 chemical compounds with different pharmacological properties from literature and public domain databases. The compounds are stored in the simplified molecular-input line-entry system (SMILES), a commonly used text encoding for organic molecules. We utilize an automated process to generate an array of numerical descriptors (features) for each molecule. Redundant and irrelevant descriptors are eliminated iteratively. Our prediction engine is based on a portfolio of machine learning algorithms. We found the Random Forest algorithm to be a better choice for this analysis. We captured non-linear relationships in the data and formed a prediction model with reasonable accuracy by averaging across a large number of randomized decision trees. Our next step is to apply a deep neural network (DNN) algorithm to predict the biological activity and toxicity properties. We expect the DNN algorithm to give better results and improve the accuracy of the prediction. This presentation will review all these prominent machine learning and deep learning methods and our implementation protocols, and discuss the usefulness of these techniques in biomedical and health informatics.

Keywords: deep learning, drug discovery, health informatics, machine learning, toxicity prediction

Procedia PDF Downloads 345
24759 Machine Learning Application in Shovel Maintenance

Authors: Amir Taghizadeh Vahed, Adithya Thaduri

Abstract:

Shovels are the main components in the mining transportation system. The productivity of mines depends on the availability of shovels due to their high capital and operating costs. Unplanned failures or shutdowns of a shovel result in higher repair costs, increased downtime, and increased indirect costs (i.e. loss of production and of the company’s reputation). To mitigate these failures, predictive maintenance based on failure prediction can be a useful approach. Modern mining machinery such as shovels collects huge datasets automatically; these consist of reliability and maintenance data. However, the gathered datasets are useless until the information and knowledge in the data are extracted. Machine learning and data mining, which have played a major role in recent studies, have been used for the knowledge discovery process. In this study, data mining and machine learning approaches are implemented to detect not only anomalies but also patterns in a dataset, and to further detect failures.

Keywords: maintenance, machine learning, shovel, condition-based monitoring

Procedia PDF Downloads 192
24758 Educating Empathy: Combining Active Listening and Moral Discovery to Facilitate Prosocial Connection

Authors: Erika Price, Lisa Johnson

Abstract:

Cognitive and dispositional empathy is decreasing among students worldwide, particularly those at university. This paper looks at the effects of encouraging empathetic positioning in divisive topics by teaching listening skills and moral discovery to university students. Two groups of university students were given the assignment to interview individuals they disagreed with on social issues (e.g. abortion, gun control, legalization of drugs, involvement in Ukraine, etc.). One group completed the assignment with no other instruction. The second group completed the assignment after receiving instruction in active listening and Jonathan Haidt’s theory of moral foundations in politics. Results show that when students are given both active listening techniques and awareness of moral foundations, they are significantly more likely to have socially positive interactions with those they disagree with on issues as compared to those who listen passively to ideological opponents. As students interacted with those they disagreed with, they evidenced prosocial behaviors of acknowledgement, validation, and even commonalities with their opponents’ viewpoints, signifying a heartening trend of empathetic connection that is waning in students. The research suggests that empathy is a skill that can be nurtured by active listening but that it is more fully cultivated when paired with the concept of moral foundations underpinning political ideologies. These findings shed light on how to create more effective pedagogies for social and emotional learning, as well as inclusion.

Keywords: empathy, listening skills, moral discovery, pedagogy, prosocial behavior

Procedia PDF Downloads 52
24757 Efficient Subgoal Discovery for Hierarchical Reinforcement Learning Using Local Computations

Authors: Adrian Millea

Abstract:

In hierarchical reinforcement learning, one of the main issues encountered is the discovery of subgoal states or options (which are policies reaching subgoal states) by partitioning the environment in a meaningful way. This partitioning usually requires an expensive global clustering operation or eigendecomposition of the Laplacian of the state graph. We propose a local solution to this issue, much more efficient than algorithms using global information, which successfully discovers subgoal states by computing a simple function, which we call heterogeneity, for each state as a function of its neighbors. Moreover, we construct a value function using the difference in heterogeneity from one step to the next as the reward, such that we are able to explore the state space much more efficiently than, say, epsilon-greedy. The same principle can then be applied to higher levels of the hierarchy, where the states are now the subgoals discovered at the level below.

Keywords: exploration, hierarchical reinforcement learning, locality, options, value functions

Procedia PDF Downloads 153
24756 Combating Malaria: A Drug Discovery Approach Using Thiazole Derivatives Against Prolific Parasite Enzyme PfPKG

Authors: Hari Bezwada, Michelle Cheon, Ryan Divan, Hannah Escritor, Michelle Kagramian, Isha Korgaonkar, Maya MacAdams, Udgita Pamidigantam, Richard Pilny, Eleanor Race, Angadh Singh, Nathan Zhang, LeeAnn Nguyen, Fina Liotta

Abstract:

Malaria is a deadly disease caused by the Plasmodium parasite, which continues to develop resistance to current antimalarial drugs. In this research project, the effectiveness of numerous thiazole derivatives was explored in inhibiting the PfPKG, a crucial part of the Plasmodium life cycle. This study involved the synthesis of six thiazole-derived amides to inhibit the PfPKG pathway. Nuclear Magnetic Resonance (NMR) spectroscopy and Infrared (IR) spectroscopy were used to characterize these compounds. Furthermore, AutoDocking software was used to predict binding affinities of these thiazole-derived amides in silico. In silico, compound 6 exhibited the highest predicted binding affinity to PfPKG, while compound 5 had the lowest affinity. Compounds 1-4 displayed varying degrees of predicted binding affinity. In-vitro, it was found that compound 4 had the best percent inhibition, while compound 5 had the worst percent inhibition. Overall, all six compounds had weak inhibition (approximately 30-39% at 10 μM), but these results provide a foundation for future drug discovery experiments.

Keywords: Medicinal Chemistry, Malaria, drug discovery, PfPKG, Thiazole, Plasmodium

Procedia PDF Downloads 71
24755 Realistic Study Discover Some Posture Deformities According to Some Biomechanical Variables for Schoolchildren

Authors: Basman Abdul Jabbar

Abstract:

The researcher aimed to highlight the importance of good posture free of deviations and deformities. The importance of the research lies in discovering posture deformities early, so that they can be treated easily before they develop into advanced abnormalities that are difficult to treat and may need surgical intervention. The research problem was that some previous studies detected posture deformities through self-evaluation, which is not accurate enough to discover deformities. The sample comprised 500 male schoolchildren aged 9-11 years at primary schools in Baghdad al Karak. The measurements covered all posture deformities. The researcher used a video camera to analyze the posture deformities according to biomechanical variables, using Kinovea software for motion analysis. The researcher recommended the use of accurate scientific methods for early detection of posture deformities in children, which contributes to the prevention and reduction of deformities.

Keywords: biomechanics, children, deformities, posture

Procedia PDF Downloads 270
24754 Research on Construction of Subject Knowledge Base Based on Literature Knowledge Extraction

Authors: Yumeng Ma, Fang Wang, Jinxia Huang

Abstract:

Researchers put forward higher requirements for efficient acquisition and utilization of domain knowledge in the big data era. As literature is an effective way for researchers to quickly and accurately understand the research situation in their field, the knowledge discovery based on literature has become a new research method. As a tool to organize and manage knowledge in a specific domain, the subject knowledge base can be used to mine and present the knowledge behind the literature to meet the users' personalized needs. This study designs the construction route of the subject knowledge base for specific research problems. Information extraction method based on knowledge engineering is adopted. Firstly, the subject knowledge model is built through the abstraction of the research elements. Then under the guidance of the knowledge model, extraction rules of knowledge points are compiled to analyze, extract and correlate entities, relations, and attributes in literature. Finally, a database platform based on this structured knowledge is developed that can provide a variety of services such as knowledge retrieval, knowledge browsing, knowledge q&a, and visualization correlation. Taking the construction practices in the field of activating blood circulation and removing stasis as an example, this study analyzes how to construct subject knowledge base based on literature knowledge extraction. As the system functional test shows, this subject knowledge base can realize the expected service scenarios such as a quick query of knowledge, related discovery of knowledge and literature, knowledge organization. As this study enables subject knowledge base to help researchers locate and acquire deep domain knowledge quickly and accurately, it provides a transformation mode of knowledge resource construction and personalized precision knowledge services in the data-intensive research environment.

Keywords: knowledge model, literature knowledge extraction, precision knowledge services, subject knowledge base

Procedia PDF Downloads 151
24753 Psychosocial Consequences of Discovering Misattributed Paternity in Adulthood: Insider Action Research

Authors: Alyona Cerfontyne, Levita D'Souza, Lefteris Patlamazoglou

Abstract:

Unlike adoption and donor-assisted reproduction, misattributed paternity occurring within the context of spontaneous conception and outside of formally recognised practices of having a child remains largely an understudied phenomenon. In adulthood, to discover misattributed paternity, i.e., that the man you call your father is not related to you genetically, can have profound implications for everyone affected. Until the advent of direct-to-consumer DNA testing 20 years ago, such discoveries were relatively rare. Despite the growing number of individuals uncovering their biogenetic paternity through genetic testing, there is very limited research on misattributed paternity from the perspective of adult children affected by it. No research exists on how to support these individuals through counselling post-discovery. Framed as insider action research, this study aimed to explore the perceived psychosocial consequences of misattributed paternity discoveries and coping strategies used by individuals who discover their misattributed paternity status in adulthood. In total, 12 individuals with misattributed paternity participated in semi-structured interviews in July-August 2022. The collected data was analysed using reflexive thematic analysis. The study’s results indicate that discovering misattributed paternity in adulthood can be likened to a watershed moment forever changing the trajectory of one’s life. Psychological experiences consistent with trauma, as well as grief and loss, re-evaluation of close family relationships, reestablishment of one’s identity, as well as experiencing a profound need to belong are the key themes emerging from the analysis of psychosocial experiences. 
Post-discovery, individuals with misattributed paternity employ a wide range of emotional and problem-focused coping strategies, amongst which seeking connection with those who understand, searching for information on the new biogenetic family and finding new meanings to life are most prominent. The study contributes both to the academic and practical knowledge of experiences of misattributed paternity and highlights the importance of further research on the topic.

Keywords: discovery of misattributed paternity, misattributed paternity, paternal discrepancy, psychosocial consequences, coping

Procedia PDF Downloads 70
24752 Performance Analysis with the Combination of Visualization and Classification Technique for Medical Chatbot

Authors: Shajida M., Sakthiyadharshini N. P., Kamalesh S., Aswitha B.

Abstract:

Natural Language Processing (NLP) continues to play a strategic part in complaint discovery and medicine discovery during the current epidemic. This abstract provides an overview of performance analysis with a combination of visualization and classification techniques of NLP for a medical chatbot. Sentiment analysis is an important aspect of NLP that is used to determine the emotional tone behind a piece of text. This technique has been applied to various domains, including medical chatbots. In this work, we compared the combination of a decision tree with a heatmap against the combination of Naïve Bayes with a word cloud. The performance of the chatbot was evaluated using accuracy, and the results indicate that the combination of visualization and classification techniques significantly improves the chatbot's performance.

Keywords: sentimental analysis, NLP, medical chatbot, decision tree, heatmap, naïve bayes, word cloud

Procedia PDF Downloads 55
24751 Detection of Important Biological Elements in Drug-Drug Interaction Occurrence

Authors: Reza Ferdousi, Reza Safdari, Yadollah Omidi

Abstract:

Drug-drug interactions (DDIs) are a main cause of adverse drug reactions, and the functional and molecular complexity of drug behavior in the human body makes them hard to prevent and treat. With the aid of new technologies derived from mathematical and computational science, DDI problems can be addressed with minimum cost and effort. Market basket analysis is known as a powerful method for identifying co-occurrences of items in order to discover patterns and the frequency of elements. In this research, we used market basket analysis to identify important bio-elements in DDI occurrence. For this, we collected all known DDIs from DrugBank. The obtained data were analyzed by the market basket analysis method. We investigated all drug-enzyme, drug-carrier, drug-transporter and drug-target associations. To determine the importance of the extracted bio-elements, the extracted rules were evaluated in terms of confidence and support. Market basket analysis of over 45,000 known DDIs revealed more than 300 important rules that can be used to identify DDIs; the CYP 450 family were the most frequently shared bio-elements. We applied the extracted rules to over 2,000,000 unknown drug pairs, which led to the discovery of more than 200,000 potential DDIs. Analysis of the underlying reasons behind the DDI phenomenon can help to predict and prevent DDI occurrence. Ranking the extracted rules by their strength can be a supportive tool for predicting the outcome of an unknown DDI.
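
Support and confidence, the two measures used above to evaluate rules, can be illustrated with a short sketch. The standard definitions are used: the support of an itemset is the fraction of baskets containing it, and the confidence of a rule A → B is support(A ∪ B) / support(A). The basket contents and labels such as `enzyme_X` are hypothetical placeholders, not DrugBank data, and the basket encoding is an assumption about how such data might be laid out.

```python
def support(baskets, itemset):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for basket in baskets if itemset <= set(basket))
    return hits / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    base = support(baskets, antecedent)
    if base == 0:
        return 0.0  # the rule never applies, so report zero confidence
    joint = support(baskets, set(antecedent) | set(consequent))
    return joint / base


# Hypothetical baskets: each one lists the bio-elements shared by a drug
# pair, plus an "interaction" marker when the pair is a known DDI.
toy_baskets = [
    {"enzyme_X", "transporter_Y", "interaction"},
    {"enzyme_X", "interaction"},
    {"enzyme_X", "target_Z"},
    {"transporter_Y"},
]
rule_support = support(toy_baskets, {"enzyme_X", "interaction"})
rule_confidence = confidence(toy_baskets, {"enzyme_X"}, {"interaction"})
```

Here the rule `enzyme_X → interaction` appears in two of four baskets (support 0.5) and holds in two of the three baskets that contain `enzyme_X` (confidence 2/3). Ranking rules by such measures is one way to decide which shared bio-elements matter most.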

Keywords: drug-drug interaction, market basket analysis, rule discovery, important bio-elements

Procedia PDF Downloads 296
24750 What the Future Holds for Social Media Data Analysis

Authors: P. Wlodarczak, J. Soar, M. Ally

Abstract:

The dramatic rise in the use of Social Media (SM) platforms such as Facebook and Twitter provides access to an unprecedented amount of user data. Users may post reviews on products and services they bought, write about their interests, share ideas or give their opinions and views on political issues. There is a growing interest among organisations in the analysis of SM data for detecting new trends, obtaining user opinions on their products and services or finding out about their online reputations. A recent research trend in SM analysis is making predictions based on sentiment analysis of SM. Often, indicators derived from historic SM data are represented as time series and correlated with a variety of real-world phenomena such as the outcome of elections, the development of financial indicators, box office revenue and disease outbreaks. This paper examines the current state of research in the area of SM mining and predictive analysis and gives an overview of the analysis methods using opinion mining and machine learning techniques.

Keywords: social media, text mining, knowledge discovery, predictive analysis, machine learning

Procedia PDF Downloads 409
24749 Comparison of Data Mining Models to Predict Future Bridge Conditions

Authors: Pablo Martinez, Emad Mohamed, Osama Mohsen, Yasser Mohamed

Abstract:

Highway and bridge agencies, such as the Ministry of Transportation in Ontario, use the Bridge Condition Index (BCI), defined as the weighted condition of all bridge elements, to determine the rehabilitation priorities for their bridges. Accurate forecasting of the BCI is therefore essential for bridge rehabilitation budget planning. The large amount of data available on bridge conditions over several years makes traditional mathematical models infeasible as analysis methods. This research study investigates different classification models developed to predict the bridge condition index in the province of Ontario, Canada, based on publicly available data for 2800 bridges over a period of more than 10 years. Data preparation is a key factor in developing acceptable classification models, even with the simplest one, the k-NN model. All the models were tested, compared and statistically validated via cross-validation and t-test. A simple k-NN model showed reasonable results (within 0.5% relative error) when predicting the bridge condition in an upcoming year.

Keywords: asset management, bridge condition index, data mining, forecasting, infrastructure, knowledge discovery in databases, maintenance, predictive models

Procedia PDF Downloads 175
24748 An Enhanced Connectivity Aware Routing Protocol for Vehicular Ad Hoc Networks

Authors: Ahmadu Maidorawa, Kamalrulnizam Abu Bakar

Abstract:

This paper proposes an Enhanced Connectivity Aware Routing (ECAR) protocol for Vehicular Ad hoc Networks (VANETs). The protocol uses a control broadcast to reduce the number of overhead packets needed in the route discovery process. It is also equipped with an alternative backup route that is used whenever the primary path to the destination fails, which greatly reduces the frequent launching and re-launching of the route discovery process that wastes useful bandwidth and unnecessarily prolongs the average packet delay. NS2 simulation results show that the ECAR protocol outperformed the original connectivity aware routing (CAR) protocol, reducing the average packet delay by 28% and control overheads by 27% and increasing the packet delivery ratio by 22%.

Keywords: alternative path, primary path, protocol, routing, VANET, vehicular ad hoc networks

Procedia PDF Downloads 384
24747 Proteomic Evaluation of Sex Differences in the Plasma of Non-human Primates Exposed to Ionizing Radiation for Biomarker Discovery

Authors: Christina Williams, Mehari Weldemariam, Ann M. Farese, Thomas J. MacVittie, Maureen A. Kane

Abstract:

Radiation exposure results in dose-dependent and time-dependent multi-organ damage. Drug development of medical countermeasures (MCMs) for radiation-induced injury occurs under the FDA Animal Rule because human efficacy studies are not ethical or feasible. The FDA Animal Rule requires the representation of both sexes and describes several uses for biomarkers in MCM drug development studies. Currently, MCMs are limited and there is no FDA-approved biomarker for any radiation injury. Sex as a variable is essential to identifying biomarkers and developing effective MCMs for acute radiation syndrome (ARS) and delayed effects of acute radiation exposure (DEARE). These studies aim to address the dearth of information on sex differences, which have not been determined by studies that included only male, single-sex cohorts. Studies have reported differences in radiosensitivity according to sex. As such, biomarker discovery for radiation-induced damage must consider sex as a variable. This study evaluated the plasma proteomic profile of Rhesus macaque non-human primates after different exposures and doses, as well as at different time points after radiation. Exposures and doses included total body irradiation between 5 and 7.5 Gy and partial body irradiation with 5% bone marrow sparing at 9, 9.5 and 10 Gy. Time points after irradiation included days 1, 3, 60, and 180, which encompassed both acute radiation syndromes and delayed effects of acute radiation exposure. Bottom-up proteomic analyses of plasma included equal numbers of males and females. In the control animals, few proteomic differences were observed between the sexes. In the irradiated animals, there were a few sex differences, with changes mostly consisting of proteins upregulated in the female animals. Multiple canonical pathways were upregulated in irradiated animals relative to the control animals when subjected to pathway analysis, but differential responses between the sexes were limited.
These data provide critical baseline differences according to sex and establish sex differences in non-human primate models relevant to drug development of MCM under the FDA Animal Rule.

Keywords: ionizing radiation, sex differences, plasma proteomics, biomarker discovery

Procedia PDF Downloads 68
24746 Spatial Relationship of Drug Smuggling Based on Geographic Information System Knowledge Discovery Using Decision Tree Algorithm

Authors: S. Niamkaeo, O. Robert, O. Chaowalit

Abstract:

In this investigation, we focus on discovering the spatial relationship of drug smuggling along the northern border of Thailand. Thailand is no longer a drug production site, but it is still one of the major drug trafficking hubs because its topographic characteristics facilitate drug smuggling from neighboring countries. Our study areas cover three districts (Mae-jan, Mae-fahluang, and Mae-sai) in Chiangrai city and four districts (Chiangdao, Mae-eye, Chaiprakarn, and Wienghang) in Chiangmai city, where most smuggling of methamphetamine crystal and amphetamine occurs. Data on drug smuggling incidents from 2011 to 2017 were collected from several national and local published news sources, and a geo-spatial drug smuggling database was prepared. A decision tree algorithm was applied to discover the spatial relationship of factors related to drug smuggling, which was then converted into rules using a rule-based system. The factors found to be related to drug smuggling in terms of rule-based relationships were land use type, smuggling route, season and distance within 500 meters from checkpoints. Drug smuggling occurred mostly in forest areas in winter, and it was discovered mainly along topographic roads where checkpoints were not reachable. This spatial relationship could support the Thai Office of Narcotics Control Board in the surveillance of drug smuggling.

Keywords: decision tree, drug smuggling, Geographic Information System, GIS knowledge discovery, rule-based system

Procedia PDF Downloads 156
24745 Code Embedding for Software Vulnerability Discovery Based on Semantic Information

Authors: Joseph Gear, Yue Xu, Ernest Foo, Praveen Gauravaran, Zahra Jadidi, Leonie Simpson

Abstract:

Deep learning methods have been seeing an increasing application to the long-standing security research goal of automatic vulnerability detection for source code. Attention, however, must still be paid to the task of producing vector representations for source code (code embeddings) as input for these deep learning models. Graphical representations of code, most predominantly Abstract Syntax Trees and Code Property Graphs, have received some use in this task of late; however, for very large graphs representing very large code snippets, learning becomes prohibitively computationally expensive. This expense may be reduced by intelligently pruning this input to only vulnerability-relevant information; however, little research in this area has been performed. Additionally, most existing work comprehends code based solely on the structure of the graph at the expense of the information contained by the node in the graph. This paper proposes Semantic-enhanced Code Embedding for Vulnerability Discovery (SCEVD), a deep learning model which uses semantic-based feature selection for its vulnerability classification model. It uses information from the nodes as well as the structure of the code graph in order to select features which are most indicative of the presence or absence of vulnerabilities. This model is implemented and experimentally tested using the SARD Juliet vulnerability test suite to determine its efficacy. It is able to improve on existing code graph feature selection methods, as demonstrated by its improved ability to discover vulnerabilities.

Keywords: code representation, deep learning, source code semantics, vulnerability discovery

Procedia PDF Downloads 140
24744 A Near-Optimal Domain Independent Approach for Detecting Approximate Duplicates

Authors: Abdelaziz Fellah, Allaoua Maamir

Abstract:

We propose a domain-independent merging-cluster filter approach, complemented with a set of algorithms, for identifying approximate duplicate entities efficiently and accurately within a single data source and across multiple data sources. The near-optimal merging-cluster filter (MCF) approach is based on the well-tuned Monge-Elkan algorithm and extended with an affine variant of the Smith-Waterman similarity measure. We then present constant, variable, and function threshold algorithms that work conceptually in a divide-merge filtering fashion to detect near duplicates as hierarchical clusters along with their corresponding representatives. The algorithms take recursive refinement approaches in the spirit of filtering, merging, and updating cluster representatives to detect approximate duplicates at each level of the cluster tree. Experiments show the high effectiveness and accuracy of the MCF approach in detecting approximate duplicates, outperforming the seminal Monge-Elkan algorithm on several real-world benchmarks and generated datasets.
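
To make the underlying similarity measure concrete, here is a minimal sketch of a Monge-Elkan style record comparison with a constant threshold. It is not the authors' MCF implementation: the inner token similarity uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for the affine Smith-Waterman variant, and the function names and the 0.8 threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def token_similarity(a, b):
    # Stand-in inner measure (assumption): the abstract describes an
    # affine Smith-Waterman variant, which is not reproduced here.
    return SequenceMatcher(None, a, b).ratio()


def monge_elkan(record_a, record_b):
    """Average, over tokens of record_a, of the best match in record_b."""
    tokens_a = record_a.lower().split()
    tokens_b = record_b.lower().split()
    if not tokens_a or not tokens_b:
        return 0.0
    best = [max(token_similarity(a, b) for b in tokens_b) for a in tokens_a]
    return sum(best) / len(best)


def is_approximate_duplicate(record_a, record_b, threshold=0.8):
    # Constant-threshold test; 0.8 is a hypothetical value.
    return monge_elkan(record_a, record_b) >= threshold


print(monge_elkan("John Smith", "smith john"))
print(is_approximate_duplicate("John Smith", "Mary Jones"))
```

Because each token is matched to its best counterpart, token order does not affect the score. The score is not symmetric in general, so a practical system might average both directions before applying the threshold.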

Keywords: data mining, data cleaning, approximate duplicates, near-duplicates detection, data mining applications and discovery

Procedia PDF Downloads 369
24743 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well-known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can contain either categorical or numeric data, and each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: k-means for clustering numeric datasets and k-modes for categorical datasets. The main problem encountered in data mining applications is clustering the categorical data that are so prevalent in datasets. One way to cluster categorical values is to transform the categorical attributes into numeric measures and apply the k-means algorithm directly instead of k-modes. In this paper, we experiment with an approach based on this idea, transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previous method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are evaluated experimentally. The obtained results show that our proposed method outperforms the binary method in all cases.
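
The relative-frequency transformation described above can be sketched as follows. This is one plausible reading of the abstract, not the authors' code: each categorical value is replaced by the fraction of rows in which that value appears for the same attribute. The function name `relative_frequency_encode` and the toy rows are hypothetical.

```python
from collections import Counter


def relative_frequency_encode(rows):
    """Replace each categorical value by its relative frequency per column."""
    n_rows = len(rows)
    n_cols = len(rows[0])
    # One frequency table per attribute (column).
    counts = [Counter(row[j] for row in rows) for j in range(n_cols)]
    return [
        [counts[j][row[j]] / n_rows for j in range(n_cols)]
        for row in rows
    ]


# Hypothetical toy dataset with two categorical attributes.
rows = [
    ("red", "small"),
    ("red", "large"),
    ("blue", "small"),
    ("red", "small"),
]

encoded = relative_frequency_encode(rows)
for original, numeric in zip(rows, encoded):
    print(original, numeric)
```

The resulting numeric rows can be passed to any standard k-means implementation. One consequence of this encoding is that two different modalities with the same frequency map to the same number, which is a trade-off against the binary (one-hot style) encoding used by the comparison method.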

Keywords: clustering, unsupervised learning, pattern recognition, categorical datasets, knowledge discovery, k-means

Procedia PDF Downloads 244
24742 Improving University Operations with Data Mining: Predicting Student Performance

Authors: Mladen Dragičević, Mirjana Pejić Bach, Vanja Šimičević

Abstract:

The purpose of this paper is to develop models that would enable the prediction of student success. These models could improve the allocation of students among colleges and optimize the newly introduced model of government subsidies for higher education. To collect data, an anonymous survey was carried out among students in the final year of their undergraduate degree using a random sampling method. Decision trees were created, of which the two most successful in predicting student success were chosen, based on two criteria: Grade Point Average (GPA) and the time that a student needs to finish the undergraduate program (time-to-degree). Decision trees proved to be a good method for classifying student success, and they could be improved further by increasing the survey sample and developing specialized decision trees for each type of college. These types of methods have great potential for use in decision support systems.

Keywords: data mining, knowledge discovery in databases, prediction models, student success

Procedia PDF Downloads 395
24741 Valorization, Conservation and Sustainable Production of Medicinal Plants in Morocco

Authors: Elachouri Mostafa, Fakchich Jamila, Lazaar Jamila, Elmadmad Mohammed, Marhom Mostafa

Abstract:

There has been great growth in scientific information about medicinal plants in recent decades, but in many ways this has proved poor compensation, because such information is accessible, in practice, only to very few people and, in any case, rather little of it is relevant to the problems of management and utilization encountered in the field. Active compounds are used in most traditional medicines and play an important role in advancing sustainable rural livelihoods through their conservation, cultivation, propagation, marketing and commercialization. Medicinal herbs are great resources for various pharmaceutical compounds, and urgent measures are required to protect these plant species from destruction and disappearance. Indeed, there is a real danger of indigenous Arab medicinal practices and knowledge disappearing altogether, further weakening traditional Arab culture and creating more insecurity, as well as forsaking a resource of inestimable economic and health care importance. As a scientific approach, ethnopharmacological investigation remains the principal way to improve, evaluate, and increase the odds of finding biologically active compounds derived from medicinal plants. As a developing country belonging to the Mediterranean basin, Morocco is endowed with resources of medicinal and aromatic plants. These plants have been used over the millennia for human welfare, and still are today. Moreover, Morocco has a large plant biodiversity: its medicinal flora accounts for more than 4200 species growing in various bioclimatic zones, from subhumid to arid and Saharan. Nevertheless, the human and animal pressure resulting from the growing needs of the rural population has led to degradation of this patrimony. In this paper, we focus our attention on ethnopharmacological studies carried out in Morocco.
The goal of this work is to clarify the importance of herbs as a platform for drug discovery and further development, to highlight the importance of ethnopharmacological study as an approach to the discovery of natural products in the health care field, and to discuss the limits of ethnopharmacological investigation for drug discovery in Morocco.

Keywords: Morocco, medicinal plants, ethnopharmacology, natural products, drug-discovery

Procedia PDF Downloads 300
24740 Towards a Distributed Computation Platform Tailored for Educational Process Discovery and Analysis

Authors: Awatef Hicheur Cairns, Billel Gueni, Hind Hafdi, Christian Joubert, Nasser Khelifa

Abstract:

Given the ever-changing needs of the job markets, education and training centers are increasingly held accountable for student success. Therefore, education and training centers have to focus on ways to streamline their offers and educational processes in order to achieve the highest level of quality in curriculum contents and managerial decisions. Educational process mining is an emerging field in the educational data mining (EDM) discipline, concerned with developing methods to discover, analyze and provide a visual representation of complete educational processes. In this paper, we present our distributed computation platform, which allows different education centers and institutions to load their data and access advanced data mining and process mining services. To achieve this, we also present a comparative study of the different clustering techniques developed in the context of process mining to efficiently partition educational traces. Our goal is to find the best strategy for distributing heavy analysis computations across the many processing nodes of our platform.

Keywords: educational process mining, distributed process mining, clustering, distributed platform, educational data mining, ProM

Procedia PDF Downloads 438
24739 Information Needs and Information Usage of the Older Person Club’s Members in Bangkok

Authors: Siriporn Poolsuwan

Abstract:

This research aims to explore the information needs, information usage, and problems of information usage of the members of older people's clubs in Dusit District, Bangkok. There are 12 clubs and 746 club members in this district. The research results are intended for use in services for older people in this district. Data were gathered from 252 club members using questionnaires. A quantitative approach is used, with percentages, means and standard deviations. The results are as follows: (1) The older people need information for entertainment, occupation and academic purposes, in the fields of short stories, computer work, and religion and morality. (2) The participants use information from various sources. (3) The main problem of information usage is language skills, owing to literacy problems among the older people.

Keywords: information behavior, older person, information seeking, knowledge discovery and data mining

Procedia PDF Downloads 255