Search results for: data discovery
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24649

Search results for: data discovery

24619 Road Traffic Accidents Analysis in Mexico City through Crowdsourcing Data and Data Mining Techniques

Authors: Gabriela V. Angeles Perez, Jose Castillejos Lopez, Araceli L. Reyes Cabello, Emilio Bravo Grajales, Adriana Perez Espinosa, Jose L. Quiroz Fabian

Abstract:

Road traffic accidents are among the principal causes of traffic congestion, causing human losses, damages to health and the environment, economic losses and material damages. Studies about traditional road traffic accidents in urban zones represents very high inversion of time and money, additionally, the result are not current. However, nowadays in many countries, the crowdsourced GPS based traffic and navigation apps have emerged as an important source of information to low cost to studies of road traffic accidents and urban congestion caused by them. In this article we identified the zones, roads and specific time in the CDMX in which the largest number of road traffic accidents are concentrated during 2016. We built a database compiling information obtained from the social network known as Waze. The methodology employed was Discovery of knowledge in the database (KDD) for the discovery of patterns in the accidents reports. Furthermore, using data mining techniques with the help of Weka. The selected algorithms was the Maximization of Expectations (EM) to obtain the number ideal of clusters for the data and k-means as a grouping method. Finally, the results were visualized with the Geographic Information System QGIS.

Keywords: data mining, k-means, road traffic accidents, Waze, Weka

Procedia PDF Downloads 382
24618 High-Throughput, Purification-Free, Multiplexed Profiling of Circulating miRNA for Discovery, Validation, and Diagnostics

Authors: J. Hidalgo de Quintana, I. Stoner, M. Tackett, G. Doran, C. Rafferty, A. Windemuth, J. Tytell, D. Pregibon

Abstract:

We have developed the Multiplexed Circulating microRNA assay that allows the detection of up to 68 microRNA targets per sample. The assay combines particle­based multiplexing, using patented Firefly hydrogel particles, with single­ step RT-PCR signal. Thus, the Circulating microRNA assay leverages PCR sensitivity while eliminating the need for separate reverse transcription reactions and mitigating amplification biases introduced by target­-specific qPCR. Furthermore, the ability to multiplex targets in each well eliminates the need to split valuable samples into multiple reactions. Results from the Circulating microRNA assay are interpreted using Firefly Analysis Workbench, which allows visualization, normalization, and export of experimental data. To aid discovery and validation of biomarkers, we have generated fixed panels for Oncology, Cardiology, Neurology, Immunology, and Liver Toxicology. Here we present the data from several studies investigating circulating and tumor microRNA, showcasing the ability of the technology to sensitively and specifically detect microRNA biomarker signatures from fluid specimens.

Keywords: biomarkers, biofluids, miRNA, photolithography, flowcytometry

Procedia PDF Downloads 338
24617 Enhancing Students’ Achievement, Interest and Retention in Chemistry through an Integrated Teaching/Learning Approach

Authors: K. V. F. Fatokun, P. A. Eniayeju

Abstract:

This study concerns the effects of concept mapping-guided discovery integrated teaching approach on the learning style and achievement of chemistry students. The sample comprised 162 senior secondary school (SS 2) students drawn from two science schools in Nasarawa State which have equivalent mean scores of 9.68 and 9.49 in their pre-test. Five instruments were developed and validated while the sixth was purely adopted by the investigator for the study, Four null hypotheses were tested at α = 0.05 level of significance. Chi square analysis showed that there is a significant shift in students’ learning style from accommodating and diverging to converging and assimilating when exposed to concept mapping- guided discovery approach. Also t-test and ANOVA that those in experimental group achieve and retain content learnt better. Results of the Scheffe’s test for multiple comparisons showed that boys in the experimental group performed better than girls. It is therefore concluded that the concept mapping-guided discovery integrated approach should be used in secondary schools to successfully teach electrochemistry. It is strongly recommended that chemistry teachers should be encouraged to adopt this method for teaching difficult concepts.

Keywords: integrated teaching approach, concept mapping-guided discovery, achievement, retention, learning styles and interest

Procedia PDF Downloads 308
24616 Summarizing Data Sets for Data Mining by Using Statistical Methods in Coastal Engineering

Authors: Yunus Doğan, Ahmet Durap

Abstract:

Coastal regions are the one of the most commonly used places by the natural balance and the growing population. In coastal engineering, the most valuable data is wave behaviors. The amount of this data becomes very big because of observations that take place for periods of hours, days and months. In this study, some statistical methods such as the wave spectrum analysis methods and the standard statistical methods have been used. The goal of this study is the discovery profiles of the different coast areas by using these statistical methods, and thus, obtaining an instance based data set from the big data to analysis by using data mining algorithms. In the experimental studies, the six sample data sets about the wave behaviors obtained by 20 minutes of observations from Mersin Bay in Turkey and converted to an instance based form, while different clustering techniques in data mining algorithms were used to discover similar coastal places. Moreover, this study discusses that this summarization approach can be used in other branches collecting big data such as medicine.

Keywords: clustering algorithms, coastal engineering, data mining, data summarization, statistical methods

Procedia PDF Downloads 340
24615 Using Combination of Sets of Features of Molecules for Aqueous Solubility Prediction: A Random Forest Model

Authors: Muhammet Baldan, Emel Timuçin

Abstract:

Generally, absorption and bioavailability increase if solubility increases; therefore, it is crucial to predict them in drug discovery applications. Molecular descriptors and Molecular properties are traditionally used for the prediction of water solubility. There are various key descriptors that are used for this purpose, namely Drogan Descriptors, Morgan Descriptors, Maccs keys, etc., and each has different prediction capabilities with differentiating successes between different data sets. Another source for the prediction of solubility is structural features; they are commonly used for the prediction of solubility. However, there are little to no studies that combine three or more properties or descriptors for prediction to produce a more powerful prediction model. Unlike available models, we used a combination of those features in a random forest machine learning model for improved solubility prediction to better predict and, therefore, contribute to drug discovery systems.

Keywords: solubility, random forest, molecular descriptors, maccs keys

Procedia PDF Downloads 0
24614 Zika Virus NS5 Protein Potential Inhibitors: An Enhanced in silico Approach in Drug Discovery

Authors: Pritika Ramharack, Mahmoud E. S. Soliman

Abstract:

The re-emerging Zika virus is an arthropod-borne virus that has been described to have explosive potential as a worldwide pandemic. The initial transmission of the virus was through a mosquito vector, however, evolving modes of transmission has allowed the spread of the disease over continents. The virus already been linked to irreversible chronic central nervous system (CNS) conditions. The concerns of the scientific and clinical community are the consequences of Zika viral mutations, thus suggesting the urgent need for viral inhibitors. There have been large strides in vaccine development against the virus but there are still no FDA-approved drugs available. Rapid rational drug design and discovery research is fundamental in the production of potent inhibitors against the virus that will not just mask the virus, but destroy it completely. In silico drug design allows for this prompt screening of potential leads, thus decreasing the consumption of precious time and resources. This study demonstrates an optimized and proven screening technique in the discovery of two potential small molecule inhibitors of Zika virus Methyltransferase and RNA-dependent RNA polymerase. This in silico “per-residue energy decomposition pharmacophore” virtual screening approach will be critical in aiding scientists in the discovery of not only effective inhibitors of Zika viral targets, but also a wide range of anti-viral agents.

Keywords: NS5 protein inhibitors, per-residue decomposition, pharmacophore model, virtual screening, Zika virus

Procedia PDF Downloads 201
24613 Modeling Optimal Lipophilicity and Drug Performance in Ligand-Receptor Interactions: A Machine Learning Approach to Drug Discovery

Authors: Jay Ananth

Abstract:

The drug discovery process currently requires numerous years of clinical testing as well as money just for a single drug to earn FDA approval. For drugs that even make it this far in the process, there is a very slim chance of receiving FDA approval, resulting in detrimental hurdles to drug accessibility. To minimize these inefficiencies, numerous studies have implemented computational methods, although few computational investigations have focused on a crucial feature of drugs: lipophilicity. Lipophilicity is a physical attribute of a compound that measures its solubility in lipids and is a determinant of drug efficacy. This project leverages Artificial Intelligence to predict the impact of a drug’s lipophilicity on its performance by accounting for factors such as binding affinity and toxicity. The model predicted lipophilicity and binding affinity in the validation set with very high R² scores of 0.921 and 0.788, respectively, while also being applicable to a variety of target receptors. The results expressed a strong positive correlation between lipophilicity and both binding affinity and toxicity. The model helps in both drug development and discovery, providing every pharmaceutical company with recommended lipophilicity levels for drug candidates as well as a rapid assessment of early-stage drugs prior to any testing, eliminating significant amounts of time and resources currently restricting drug accessibility.

Keywords: drug discovery, lipophilicity, ligand-receptor interactions, machine learning, drug development

Procedia PDF Downloads 80
24612 Estimation of Coefficients of Ridge and Principal Components Regressions with Multicollinear Data

Authors: Rajeshwar Singh

Abstract:

The presence of multicollinearity is common in handling with several explanatory variables simultaneously due to exhibiting a linear relationship among them. A great problem arises in understanding the impact of explanatory variables on the dependent variable. Thus, the method of least squares estimation gives inexact estimates. In this case, it is advised to detect its presence first before proceeding further. Using the ridge regression degree of its occurrence is reduced but principal components regression gives good estimates in this situation. This paper discusses well-known techniques of the ridge and principal components regressions and applies to get the estimates of coefficients by both techniques. In addition to it, this paper also discusses the conflicting claim on the discovery of the method of ridge regression based on available documents.

Keywords: conflicting claim on credit of discovery of ridge regression, multicollinearity, principal components and ridge regressions, variance inflation factor

Procedia PDF Downloads 384
24611 Evaluation of Classification Algorithms for Diagnosis of Asthma in Iranian Patients

Authors: Taha SamadSoltani, Peyman Rezaei Hachesu, Marjan GhaziSaeedi, Maryam Zolnoori

Abstract:

Introduction: Data mining defined as a process to find patterns and relationships along data in the database to build predictive models. Application of data mining extended in vast sectors such as the healthcare services. Medical data mining aims to solve real-world problems in the diagnosis and treatment of diseases. This method applies various techniques and algorithms which have different accuracy and precision. The purpose of this study was to apply knowledge discovery and data mining techniques for the diagnosis of asthma based on patient symptoms and history. Method: Data mining includes several steps and decisions should be made by the user which starts by creation of an understanding of the scope and application of previous knowledge in this area and identifying KD process from the point of view of the stakeholders and finished by acting on discovered knowledge using knowledge conducting, integrating knowledge with other systems and knowledge documenting and reporting.in this study a stepwise methodology followed to achieve a logical outcome. Results: Sensitivity, Specifity and Accuracy of KNN, SVM, Naïve bayes, NN, Classification tree and CN2 algorithms and related similar studies was evaluated and ROC curves were plotted to show the performance of the system. Conclusion: The results show that we can accurately diagnose asthma, approximately ninety percent, based on the demographical and clinical data. The study also showed that the methods based on pattern discovery and data mining have a higher sensitivity compared to expert and knowledge-based systems. On the other hand, medical guidelines and evidence-based medicine should be base of diagnostics methods, therefore recommended to machine learning algorithms used in combination with knowledge-based algorithms.

Keywords: asthma, datamining, classification, machine learning

Procedia PDF Downloads 426
24610 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has, over recent years, seen big advances because of the spread of internet, which generates everyday a tremendous volume of data, and also the immense advances in technologies which facilitate the analysis of these data. In particular, classification techniques are a subdomain of Data Mining which determines in which group each data instance is related within a given dataset. It is used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or machine learning. Each type of these techniques has its own limits. Nowadays, current data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach which is different from actual algorithms by its reliability computations. Results of the proposed approach exceeded most common classification techniques with an f-measure exceeding 97% on the IRIS Dataset.

Keywords: data mining, knowledge discovery, machine learning, similarity measurement, supervised classification

Procedia PDF Downloads 441
24609 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni

Abstract:

The problem of Entity relation discovery in structured data, a well covered topic in literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary, or a specific collection of named items. In many cases machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations - a cooccurrence graph. Indeed each cooccurrence highlights some grade of semantic correlation between the words because it is more common to have related words close each other than having them in the opposite sides of the text. Some authors have used sliding windows for such problem: they count all the occurrences within a sliding windows running over the whole text. In this paper we generalise such technique, coming up to a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is accounted with a weight depending on the distance between items: a closer distance implies a stronger evidence of a relationship. We develop an experiment in order to support this intuition, by applying this technique to a data set consisting in the text of the Bible, split into verses.

Keywords: cooccurrence graph, entity relation graph, unstructured text, weighted distance

Procedia PDF Downloads 125
24608 Data Mining and Knowledge Management Application to Enhance Business Operations: An Exploratory Study

Authors: Zeba Mahmood

Abstract:

The modern business organizations are adopting technological advancement to achieve competitive edge and satisfy their consumer. The development in the field of Information technology systems has changed the way of conducting business today. Business operations today rely more on the data they obtained and this data is continuously increasing in volume. The data stored in different locations is difficult to find and use without the effective implementation of Data mining and Knowledge management techniques. Organizations who smartly identify, obtain and then convert data in useful formats for their decision making and operational improvements create additional value for their customers and enhance their operational capabilities. Marketers and Customer relationship departments of firm use Data mining techniques to make relevant decisions, this paper emphasizes on the identification of different data mining and Knowledge management techniques that are applied to different business industries. The challenges and issues of execution of these techniques are also discussed and critically analyzed in this paper.

Keywords: knowledge, knowledge management, knowledge discovery in databases, business, operational, information, data mining

Procedia PDF Downloads 509
24607 Analysis of Users’ Behavior on Book Loan Log Based on Association Rule Mining

Authors: Kanyarat Bussaban, Kunyanuth Kularbphettong

Abstract:

This research aims to create a model for analysis of student behavior using Library resources based on data mining technique in case of Suan Sunandha Rajabhat University. The model was created under association rules, apriori algorithm. The results were found 14 rules and the rules were tested with testing data set and it showed that the ability of classify data was 79.24 percent and the MSE was 22.91. The results showed that the user’s behavior model by using association rule technique can use to manage the library resources.

Keywords: behavior, data mining technique, a priori algorithm, knowledge discovery

Procedia PDF Downloads 380
24606 Knowledge Discovery and Data Mining Techniques in Textile Industry

Authors: Filiz Ersoz, Taner Ersoz, Erkin Guler

Abstract:

This paper addresses the issues and technique for textile industry using data mining techniques. Data mining has been applied to the stitching of garments products that were obtained from a textile company. Data mining techniques were applied to the data obtained from the CHAID algorithm, CART algorithm, Regression Analysis and, Artificial Neural Networks. Classification technique based analyses were used while data mining and decision model about the production per person and variables affecting about production were found by this method. In the study, the results show that as the daily working time increases, the production per person also decreases. In addition, the relationship between total daily working and production per person shows a negative result and the production per person show the highest and negative relationship.

Keywords: data mining, textile production, decision trees, classification

Procedia PDF Downloads 327
24605 Open educational Resources' Metadata: Towards the First Star to Quality of Open Educational Resources

Authors: Audrey Romero-Pelaez, Juan Carlos Morocho-Yunga

Abstract:

The increasing amount of open educational resources (OER) published on the web for consumption in teaching and learning environments also generates a growing need to ensure the quality of these resources. The low level of OER discovery is one of the most significant drawbacks when faced with its reuse, and as a consequence, high-quality educational resources can go unnoticed. Metadata enables the discovery of resources on the web. The purpose of this study is to lay the foundations for open educational resources to achieve their first quality star within the Quality4OER Framework. In this study, we evaluate the quality of OER metadata and establish the main guidelines on metadata quality in this context.

Keywords: open educational resources, OER quality, quality metadata

Procedia PDF Downloads 212
24604 Analysis and Rule Extraction of Coronary Artery Disease Data Using Data Mining

Authors: Rezaei Hachesu Peyman, Oliyaee Azadeh, Salahzadeh Zahra, Alizadeh Somayyeh, Safaei Naser

Abstract:

Coronary Artery Disease (CAD) is one major cause of disability in adults and one main cause of death in developed. In this study, data mining techniques including Decision Trees, Artificial neural networks (ANNs), and Support Vector Machine (SVM) analyze CAD data. Data of 4948 patients who had suffered from heart diseases were included in the analysis. CAD is the target variable, and 24 inputs or predictor variables are used for the classification. The performance of these techniques is compared in terms of sensitivity, specificity, and accuracy. The most significant factor influencing CAD is chest pain. Elderly males (age > 53) have a high probability to be diagnosed with CAD. SVM algorithm is the most useful way for evaluation and prediction of CAD patients as compared to non-CAD ones. Application of data mining techniques in analyzing coronary artery diseases is a good method for investigating the existing relationships between variables.

Keywords: classification, coronary artery disease, data-mining, knowledge discovery, extract

Procedia PDF Downloads 633
24603 Web-Based Cognitive Writing Instruction (WeCWI): A Theoretical-and-Pedagogical e-Framework for Language Development

Authors: Boon Yih Mah

Abstract:

Web-based Cognitive Writing Instruction (WeCWI)’s contribution towards language development can be divided into linguistic and non-linguistic perspectives. In linguistic perspective, WeCWI focuses on the literacy and language discoveries, while the cognitive and psychological discoveries are the hubs in non-linguistic perspective. In linguistic perspective, WeCWI draws attention to free reading and enterprises, which are supported by the language acquisition theories. Besides, the adoption of process genre approach as a hybrid guided writing approach fosters literacy development. Literacy and language developments are interconnected in the communication process; hence, WeCWI encourages meaningful discussion based on the interactionist theory that involves input, negotiation, output, and interactional feedback. Rooted in the e-learning interaction-based model, WeCWI promotes online discussion via synchronous and asynchronous communications, which allows interactions happened among the learners, instructor, and digital content. In non-linguistic perspective, WeCWI highlights on the contribution of reading, discussion, and writing towards cognitive development. Based on the inquiry models, learners’ critical thinking is fostered during information exploration process through interaction and questioning. Lastly, to lower writing anxiety, WeCWI develops the instructional tool with supportive features to facilitate the writing process. To bring a positive user experience to the learner, WeCWI aims to create the instructional tool with different interface designs based on two different types of perceptual learning style.

Keywords: WeCWI, literacy discovery, language discovery, cognitive discovery, psychological discovery

Procedia PDF Downloads 537
24602 Intra-miR-ExploreR, a Novel Bioinformatics Platform for Integrated Discovery of MiRNA:mRNA Gene Regulatory Networks

Authors: Surajit Bhattacharya, Daniel Veltri, Atit A. Patel, Daniel N. Cox

Abstract:

miRNAs have emerged as key post-transcriptional regulators of gene expression, however identification of biologically-relevant target genes for this epigenetic regulatory mechanism remains a significant challenge. To address this knowledge gap, we have developed a novel tool in R, Intra-miR-ExploreR, that facilitates integrated discovery of miRNA targets by incorporating target databases and novel target prediction algorithms, using statistical methods including Pearson and Distance Correlation on microarray data, to arrive at high confidence intragenic miRNA target predictions. We have explored the efficacy of this tool using Drosophila melanogaster as a model organism for bioinformatics analyses and functional validation. A number of putative targets were obtained which were also validated using qRT-PCR analysis. Additional features of the tool include downloadable text files containing GO analysis from DAVID and Pubmed links of literature related to gene sets. Moreover, we are constructing interaction maps of intragenic miRNAs, using both micro array and RNA-seq data, focusing on neural tissues to uncover regulatory codes via which these molecules regulate gene expression to direct cellular development.

Keywords: miRNA, miRNA:mRNA target prediction, statistical methods, miRNA:mRNA interaction network

Procedia PDF Downloads 478
24601 Using Printouts as Social Media Evidence and Its Authentication in the Courtroom

Authors: Chih-Ping Chang

Abstract:

Different from traditional objective evidence, social media evidence has its own characteristics with easily tampering, recoverability, and cannot be read without using other devices (such as a computer). Simply taking a screenshot from social network sites must be questioned its original identity. When the police search and seizure digital information, a common way they use is to directly print out digital data obtained and ask the signature of the parties at the presence, without taking original digital data back. In addition to the issue on its original identity, this conduct to obtain evidence may have another two results. First, it will easily allege that is tampering evidence because the police wanted to frame the suspect and falsified evidence. Second, it is not easy to discovery hidden information. The core evidence associated with crime may not appear in the contents of files. Through discovery the original file, data related to the file, such as the original producer, creation time, modification date, and even GPS location display can be revealed from hidden information. Therefore, how to show this kind of evidence in the courtroom will be arguably the most important task for ruling social media evidence. This article, first, will introduce forensic software, like EnCase, TCT, FTK, and analyze their function to prove the identity with another digital data. Then turning back to the court, the second part of this article will discuss legal standard for authentication of social media evidence and application of that forensic software in the courtroom. As the conclusion, this article will provide a rethinking, that is, what kind of authenticity is this rule of evidence chase for. Does legal system automatically operate the transcription of scientific knowledge? Or furthermore, it wants to better render justice, not only under scientific fact, but through multivariate debating.

Keywords: federal rule of evidence, internet forensic, printouts as evidence, social media evidence, United States v. Vayner

Procedia PDF Downloads 271
24600 Improvement of the Aerodynamic Behaviour of a Land Rover Discovery 4 in Turbulent Flow Using Computational Fluid Dynamics (CFD)

Authors: Ahmed Al-Saadi, Ali Hassanpour, Tariq Mahmud

Abstract:

The main objective of this study is to investigate ways to reduce the aerodynamic drag coefficient and to increase the stability of the full-size Sport Utility Vehicle using three-dimensional Computational Fluid Dynamics (CFD) simulation. The baseline model in the simulation was the Land Rover Discovery 4. Many aerodynamic devices and external design modifications were used in this study. These reduction aerodynamic techniques were tested individually or in combination to get the best design. All new models have the same capacity and comfort of the baseline model. Uniform freestream velocity of the air at inlet ranging from 28 m/s to 40 m/s was used. ANSYS Fluent software (version 16.0) was used to simulate all models. The drag coefficient obtained from the ANSYS Fluent for the baseline model was validated with experimental data. It is found that the use of modern aerodynamic add-on devices and modifications has a significant effect in reducing the aerodynamic drag coefficient.

Keywords: aerodynamics, RANS, sport utility vehicle, turbulent flow

Procedia PDF Downloads 290
24599 Screening for Hit Identification against Mycobacterium abscessus

Authors: Jichan Jang

Abstract:

Mycobacterium abscessus is a rapidly growing life-threatening mycobacterium with multiple drug-resistance mechanisms. In this study, we screened the library to identify active molecules targeting Mycobacterium abscessus using resazurin live/dead assays. In this screening assay, the Z-factor was 0.7, as an indication of the statistical confidence of the assay. A cut-off of 80% growth inhibition in the screening resulted in the identification of four different compounds at a single concentration (20 μM). Dose-response curves identified three different hit candidates, which generated good inhibitory curves. All hit candidates were expected to have different molecular targets. Thus, we found that compound X, identified, may be a promising candidate in the M. abscessus drug discovery pipeline.

Keywords: Mycobacterium abscessus, antibiotics, drug discovery, emerging Pathogen

Procedia PDF Downloads 180
24598 Improving Cryptographically Generated Address Algorithm in IPv6 Secure Neighbor Discovery Protocol through Trust Management

Authors: M. Moslehpour, S. Khorsandi

Abstract:

As transition to widespread use of IPv6 addresses has gained momentum, it has been shown to be vulnerable to certain security attacks such as those targeting Neighbor Discovery Protocol (NDP) which provides the address resolution functionality in IPv6. To protect this protocol, Secure Neighbor Discovery (SEND) is introduced. This protocol uses Cryptographically Generated Address (CGA) and asymmetric cryptography as a defense against threats on integrity and identity of NDP. Although SEND protects NDP against attacks, it is computationally intensive due to Hash2 condition in CGA. To improve the CGA computation speed, we parallelized CGA generation process and used the available resources in a trusted network. Furthermore, we focused on the influence of the existence of malicious nodes on the overall load of un-malicious ones in the network. According to the evaluation results, malicious nodes have adverse impacts on the average CGA generation time and on the average number of tries. We utilized a Trust Management that is capable of detecting and isolating the malicious node to remove possible incentives for malicious behavior. We have demonstrated the effectiveness of the Trust Management System in detecting the malicious nodes and hence improving the overall system performance.

Keywords: CGA, ICMPv6, IPv6, malicious node, modifier, NDP, overall load, SEND, trust management

Procedia PDF Downloads 165
24597 Constructing a Semi-Supervised Model for Network Intrusion Detection

Authors: Tigabu Dagne Akal

Abstract:

While advances in computer and communications technology have made the network ubiquitous, they have also rendered networked systems vulnerable to malicious attacks devised from a distance. These attacks or intrusions start with attackers infiltrating a network through a vulnerable host and then launching further attacks on the local network or Intranet. Nowadays, system administrators and network professionals can attempt to prevent such attacks by developing intrusion detection tools and systems using data mining technology. In this study, the experiments were conducted following the Knowledge Discovery in Database Process Model. The Knowledge Discovery in Database Process Model starts from selection of the datasets. The dataset used in this study has been taken from Massachusetts Institute of Technology Lincoln Laboratory. After taking the data, it has been pre-processed. The major pre-processing activities include fill in missed values, remove outliers; resolve inconsistencies, integration of data that contains both labelled and unlabelled datasets, dimensionality reduction, size reduction and data transformation activity like discretization tasks were done for this study. A total of 21,533 intrusion records are used for training the models. For validating the performance of the selected model a separate 3,397 records are used as a testing set. For building a predictive model for intrusion detection J48 decision tree and the Naïve Bayes algorithms have been tested as a classification approach for both with and without feature selection approaches. The model that was created using 10-fold cross validation using the J48 decision tree algorithm with the default parameter values showed the best classification accuracy. The model has a prediction accuracy of 96.11% on the training datasets and 93.2% on the test dataset to classify the new instances as normal, DOS, U2R, R2L and probe classes. The findings of this study have shown that the data mining methods generates interesting rules that are crucial for intrusion detection and prevention in the networking industry. Future research directions are forwarded to come up an applicable system in the area of the study.

Keywords: intrusion detection, data mining, computer science, data mining

Procedia PDF Downloads 270
24596 Role of Medicinal Plants in Treatment of Diseases and Drug Discovery in Azad Kashmir, Pakistan

Authors: Neelam Rashid, Muhammad Zafar, Mushtaq Ahmad, Khafsa Malik, Syed Nasar Shah

Abstract:

The present study was conducted to study the role of medicinal plants used to cure different ailments in Azad Kashmir. Various ethno medicinal surveys were carried out during 2016 to enlist the uses of plants against various ailments by rural communities of the area. Information was obtained from 60 local people including 45 males (10 traditional health practitioners) and 15 females by semi structured interviews and group discussions. 65 plant species belonging to 45 families were reported. The dominant plant habit was herbaceous (56%) while decoction was the most common method of utilization (40%). The most cited turmoil was the gastrointestinal disorders. The data obtained were analyzed using ethno medicinal indices such as FL, UV, ICF, FC, and RFC. Results revealed that various species had numerous uses in curing of diseases. So conservation of biodiversity of these medicinal plants and traditional knowledge can play important role in improving the local health conditions of rural people and modern drug discovery and development.

Keywords: medicinal plants, ailments, drug, health, traditional

Procedia PDF Downloads 219
24595 Discovery the Relics of Buddhist Stupa at Thanesar, Kurukshetra

Authors: Chander Shekhar, Manoj Kumar

Abstract:

Present paper deal with the discovery of the stupa’s relics which belongs to the Kushana period. These remains were found during the scientific clearance work at a mound near Brahma-SarovarThanesar, Kurukshetra. This archaeological work was done by Department of Archaeology & Museums Haryana Government. The relics of stupa show that it would have been similar to Assandh and Damekhstupa. As per-Buddhist literature, GoutamBudhha reached Thanesar. In memory of Buddh’s Journey, King Ashoka built a big Stupa at Thanesar on the bank of Sarasvati River. Chinese pilgrim Yuan Chuang also referred a Monastery and stupa near Aujas-ghatof Brahma-sarovar. It may be part of that settlement which was mentioned by Yuan Chuang.

Keywords: archaeology, stupa, buddhism, excavtoin

Procedia PDF Downloads 158
24594 Object-Centric Process Mining Using Process Cubes

Authors: Anahita Farhang Ghahfarokhi, Alessandro Berti, Wil M.P. van der Aalst

Abstract:

Process mining provides ways to analyze business processes. Common process mining techniques consider the process as a whole. However, in real-life business processes, different behaviors exist that make the overall process too complex to interpret. Process comparison is a branch of process mining that isolates different behaviors of the process from each other by using process cubes. Process cubes organize event data using different dimensions. Each cell contains a set of events that can be used as an input to apply process mining techniques. Existing work on process cubes assume single case notions. However, in real processes, several case notions (e.g., order, item, package, etc.) are intertwined. Object-centric process mining is a new branch of process mining addressing multiple case notions in a process. To make a bridge between object-centric process mining and process comparison, we propose a process cube framework, which supports process cube operations such as slice and dice on object-centric event logs. To facilitate the comparison, the framework is integrated with several object-centric process discovery approaches.

Keywords: multidimensional process mining, mMulti-perspective business processes, OLAP, process cubes, process discovery, process mining

Procedia PDF Downloads 229
24593 Using Textual Pre-Processing and Text Mining to Create Semantic Links

Authors: Ricardo Avila, Gabriel Lopes, Vania Vidal, Jose Macedo

Abstract:

This article offers a approach to the automatic discovery of semantic concepts and links in the domain of Oil Exploration and Production (E&P). Machine learning methods combined with textual pre-processing techniques were used to detect local patterns in texts and, thus, generate new concepts and new semantic links. Even using more specific vocabularies within the oil domain, our approach has achieved satisfactory results, suggesting that the proposal can be applied in other domains and languages, requiring only minor adjustments.

Keywords: semantic links, data mining, linked data, SKOS

Procedia PDF Downloads 148
24592 Medical Knowledge Management since the Integration of Heterogeneous Data until the Knowledge Exploitation in a Decision-Making System

Authors: Nadjat Zerf Boudjettou, Fahima Nader, Rachid Chalal

Abstract:

Knowledge management is to acquire and represent knowledge relevant to a domain, a task or a specific organization in order to facilitate access, reuse and evolution. This usually means building, maintaining and evolving an explicit representation of knowledge. The next step is to provide access to that knowledge, that is to say, the spread in order to enable effective use. Knowledge management in the medical field aims to improve the performance of the medical organization by allowing individuals in the care facility (doctors, nurses, paramedics, etc.) to capture, share and apply collective knowledge in order to make optimal decisions in real time. In this paper, we propose a knowledge management approach based on integration technique of heterogeneous data in the medical field by creating a data warehouse, a technique of extracting knowledge from medical data by choosing a technique of data mining, and finally an exploitation technique of that knowledge in a case-based reasoning system.

Keywords: data warehouse, data mining, knowledge discovery in database, KDD, medical knowledge management, Bayesian networks

Procedia PDF Downloads 363
24591 Helping the Development of Public Policies with Knowledge of Criminal Data

Authors: Diego De Castro Rodrigues, Marcelo B. Nery, Sergio Adorno

Abstract:

The project aims to develop a framework for social data analysis, particularly by mobilizing criminal records and applying descriptive computational techniques, such as associative algorithms and extraction of tree decision rules, among others. The methods and instruments discussed in this work will enable the discovery of patterns, providing a guided means to identify similarities between recurring situations in the social sphere using descriptive techniques and data visualization. The study area has been defined as the city of São Paulo, with the structuring of social data as the central idea, with a particular focus on the quality of the information. Given this, a set of tools will be validated, including the use of a database and tools for visualizing the results. Among the main deliverables related to products and the development of articles are the discoveries made during the research phase. The effectiveness and utility of the results will depend on studies involving real data, validated both by domain experts and by identifying and comparing the patterns found in this study with other phenomena described in the literature. The intention is to contribute to evidence-based understanding and decision-making in the social field.

Keywords: social data analysis, criminal records, computational techniques, data mining, big data

Procedia PDF Downloads 56
24590 Discovery of Two-dimensional Hexagonal MBene HfBO

Authors: Nanxi Miao, Junjie Wang

Abstract:

The discovery of 2D materials with distinct compositions and properties has been a research aim since the report of graphene. One of the latest members of the 2D material family is MXene, which is produced from the topochemical deintercalation of the A layer from a laminate MAX phase. Recently, analogous 2D MBenes (transitional metal borides) have been predicted by theoretical calculations as excellent alternatives in applications such as metal-ion batteries, magnetic devices, and catalysts. However, the practical applications of two-dimensional (2D) transition-metal borides (MBenes) have been severely hindered by the lack of accessible MBenes because of the difficulties in the selective etching of traditional ternary MAB phases with orthorhombic symmetry (ort-MAB). Here, we discover a family of ternary hexagonal MAB (h-MAB) phases and 2D hexagonal MBenes (h-MBenes) by ab initio predictions and experiments. Calculations suggest that the ternary h-MAB phases are more suitable precursors for MBenes than the ort-MAB phases. Based on the prediction, we report the experimental synthesis of h-MBene HfBO by selective removal of in from h-MAB Hf2InB2. The synthesized 2D HfBO delivered a specific capacity of 420 mAh g-1 as an anode material in lithium-ion batteries, demonstrating the potential for energy-storage applications. The discovery of this h-MBene HfBO added a new member to the growing family of 2D materials and provided opportunities for a wide range of novel applications.

Keywords: 2D materials, DFT calculations, high-throughput screening, lithium-ion batteries

Procedia PDF Downloads 38