Search results for: Data mining techniques

8766 A Comparative Study of Virus Detection Techniques

Authors: Sulaiman Al Amro, Ali Alkhalifah

Abstract:

The growing number of computer viruses and the detection of zero day malware have been the concern for security researchers for a large period of time. Existing antivirus products (AVs) rely on detecting virus signatures which do not provide a full solution to the problems associated with these viruses. The use of logic formulae to model the behaviour of viruses is one of the most encouraging recent developments in virus research, which provides alternatives to classic virus detection methods. In this paper, we proposed a comparative study about different virus detection techniques. This paper provides the advantages and drawbacks of different detection techniques. Different techniques will be used in this paper to provide a discussion about what technique is more effective to detect computer viruses.

Keywords: Computer viruses, virus detection, signature-based, behaviour-based, heuristic-based.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4592

8765 Design and Construction Validation of Pile Performance through High Strain Pile Dynamic Tests for both Contiguous Flight Auger and Drilled Displacement Piles

Authors: S. Pirrello

Abstract:

Sydney’s booming real estate market has pushed property developers to invest in historically “no-go” areas, which were previously too expensive to develop. These areas are usually near rivers where the sites are underlain by deep alluvial and estuarine sediments. In these ground conditions, conventional bored pile techniques are often not competitive. Contiguous Flight Auger (CFA) and Drilled Displacement (DD) Piles techniques are on the other hand suitable for these ground conditions. This paper deals with the design and construction challenges encountered with these piling techniques for a series of high-rise towers in Sydney’s West. The advantages of DD over CFA piles such as reduced overall spoil with substantial cost savings and achievable rock sockets in medium strength bedrock are discussed. Design performances were assessed with PIGLET. Pile performances are validated in two stages, during constructions with the interpretation of real-time data from the piling rigs’ on-board computer data, and after construction with analyses of results from high strain pile dynamic testing (PDA). Results are then presented and discussed. High Strain testing data are presented as Case Pile Wave Analysis Program (CAPWAP) analyses.

Keywords: Contiguous flight auger, case pile wave analysis, high strain pile, drilled displacement, pile performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 972

8764 Prediction of a Human Facial Image by ANN using Image Data and its Content on Web Pages

Authors: Chutimon Thitipornvanid, Siripun Sanguansintukul

Abstract:

Choosing the right metadata is a critical, as good information (metadata) attached to an image will facilitate its visibility from a pile of other images. The image-s value is enhanced not only by the quality of attached metadata but also by the technique of the search. This study proposes a technique that is simple but efficient to predict a single human image from a website using the basic image data and the embedded metadata of the image-s content appearing on web pages. The result is very encouraging with the prediction accuracy of 95%. This technique may become a great assist to librarians, researchers and many others for automatically and efficiently identifying a set of human images out of a greater set of images.

Keywords: Metadata, Prediction, Multi-layer perceptron, Human facial image, Image mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1208

8763 Heuristic Optimization Techniques for Network Reconfiguration in Distribution System

Authors: A. Charlangsut, N. Rugthaicharoencheep, S. Auchariyamet

Abstract:

Network reconfiguration is an operation to modify the network topology. The implementation of network reconfiguration has many advantages such as loss minimization, increasing system security and others. In this paper, two topics about the network reconfiguration in distribution system are briefly described. The first topic summarizes its impacts while the second explains some heuristic optimization techniques for solving the network reconfiguration problem.

Keywords: Network Reconfiguration, Optimization Techniques, Distribution System

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2747

8762 The Role of Synthetic Data in Aerial Object Detection

Authors: Ava Dodd, Jonathan Adams

Abstract:

The purpose of this study is to explore the characteristics of developing a machine learning application using synthetic data. The study is structured to develop the application for the purpose of deploying the computer vision model. The findings discuss the realities of attempting to develop a computer vision model for practical purpose, and detail the processes, tools and techniques that were used to meet accuracy requirements. The research reveals that synthetic data represent another variable that can be adjusted to improve the performance of a computer vision model. Further, a suite of tools and tuning recommendations are provided.

Keywords: computer vision, machine learning, synthetic data, YOLOv4

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 838

8761 A Brain Inspired Approach for Multi-View Patterns Identification

Authors: Yee Ling Boo, Damminda Alahakoon

Abstract:

Biologically human brain processes information in both unimodal and multimodal approaches. In fact, information is progressively abstracted and seamlessly fused. Subsequently, the fusion of multimodal inputs allows a holistic understanding of a problem. The proliferation of technology has exponentially produced various sources of data, which could be likened to being the state of multimodality in human brain. Therefore, this is an inspiration to develop a methodology for exploring multimodal data and further identifying multi-view patterns. Specifically, we propose a brain inspired conceptual model that allows exploration and identification of patterns at different levels of granularity, different types of hierarchies and different types of modalities. A structurally adaptive neural network is deployed to implement the proposed model. Furthermore, the acquisition of multi-view patterns with the proposed model is demonstrated and discussed with some experimental results.

Keywords: Multimodal, Granularity, Hierarchical Clustering, Growing Self Organising Maps, Data Mining

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1535

8760 Combined Beamforming and Channel Estimation in WCDMA Communication Systems

Authors: Nermin A. Mohamed, Mohamed F. Madkour

Abstract:

We address the problem of joint beamforming and multipath channel parameters estimation in Wideband Code Division Multiple Access (WCDMA) communication systems that employ Multiple-Access Interference (MAI) suppression techniques in the uplink (from mobile to base station). Most of the existing schemes rely on time multiplex a training sequence with the user data. In WCDMA, the channel parameters can also be estimated from a code multiplexed common pilot channel (CPICH) that could be corrupted by strong interference resulting in a bad estimate. In this paper, we present new methods to combine interference suppression together with channel estimation when using multiple receiving antennas by using adaptive signal processing techniques. Computer simulation is used to compare between the proposed methods and the existing conventional estimation techniques.

Keywords: Adaptive arrays, channel estimation, interferencecancellation, wideband code division multiple access (WCDMA).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2310

8759 Computer Aided Diagnosis of Polycystic Kidney Disease Using ANN

Authors: Anjan Babu G, Sumana G, Rajasekhar M

Abstract:

Many inherited diseases and non-hereditary disorders are common in the development of renal cystic diseases. Polycystic kidney disease (PKD) is a disorder developed within the kidneys in which grouping of cysts filled with water like fluid. PKD is responsible for 5-10% of end-stage renal failure treated by dialysis or transplantation. New experimental models, application of molecular biology techniques have provided new insights into the pathogenesis of PKD. Researchers are showing keen interest for developing an automated system by applying computer aided techniques for the diagnosis of diseases. In this paper a multilayered feed forward neural network with one hidden layer is constructed, trained and tested by applying back propagation learning rule for the diagnosis of PKD based on physical symptoms and test results of urinalysis collected from the individual patients. The data collected from 50 patients are used to train and test the network. Among these samples, 75% of the data used for training and remaining 25% of the data are used for testing purpose. Further, this trained network is used to implement for new samples. The output results in normality and abnormality of the patient.

Keywords: Dialysis, Hereditary, Transplantation, Polycystic, Pathogenesis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1996

8758 Comparison of Inter Cell Interference Coordination Approaches

Authors: Selma Sbit, Mohamed Bechir Dadi, Belgacem Chibani Rhaimi

Abstract:

This work aims to compare various techniques used in order to mitigate Inter-Cell Interference (ICI) in Long Term Evolution (LTE) and LTE-Advanced systems. For that, we will evaluate the performance of each one. In mobile communication networks, systems are limited by ICI particularly caused by deployment of small cells in conventional cell’s implementation. Therefore, various mitigation techniques, named Inter-Cell Interference Coordination techniques (ICIC), enhanced Inter-Cell Interference Coordination (eICIC) techniques and Coordinated Multi-Point transmission and reception (CoMP) are proposed. This paper presents a comparative study of these strategies. It can be concluded that CoMP techniques can ameliorate SINR and capacity system compared to ICIC and eICIC. In fact, SINR value reaches 15 dB for a distance of 0.5 km between user equipment and servant base station if we use CoMP technology whereas it cannot exceed 12 dB and 9 dB for eICIC and ICIC approaches respectively as reflected in simulations.

Keywords: 4th generation, interference, coordination, ICIC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1000

8757 Image Retrieval: Techniques, Challenge, and Trend

Authors: Hui Hui Wang, Dzulkifli Mohamad, N.A Ismail

Abstract:

This paper attempts to discuss the evolution of the retrieval techniques focusing on development, challenges and trends of the image retrieval. It highlights both the already addressed and outstanding issues. The explosive growth of image data leads to the need of research and development of Image Retrieval. However, Image retrieval researches are moving from keyword, to low level features and to semantic features. Drive towards semantic features is due to the problem of the keywords which can be very subjective and time consuming while low level features cannot always describe high level concepts in the users- mind.

Keywords: content based image retrieval, keyword based imageretrieval, semantic gap, semantic image retrieval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2515

8756 Transcutaneous Inductive Powering Links Based on ASK Modulation Techniques

Authors: S. M. Abbas, M. A. Hannan, S. A. Samad, A. Hussain

Abstract:

This paper presented a modified efficient inductive powering link based on ASK modulator and proposed efficient class- E power amplifier. The design presents the external part which is located outside the body to transfer power and data to the implanted devices such as implanted Microsystems to stimulate and monitoring the nerves and muscles. The system operated with low band frequency 10MHZ according to industrial- scientific – medical (ISM) band to avoid the tissue heating. For external part, the modulation index is 11.1% and the modulation rate 7.2% with data rate 1 Mbit/s assuming Tbit = 1us. The system has been designed using 0.35-μm fabricated CMOS technology. The mathematical model is given and the design is simulated using OrCAD P Spice 16.2 software tool and for real-time simulation, the electronic workbench MULISIM 11 has been used.

Keywords: Implanted devices, ASK techniques, Class-E power amplifier, Inductive powering and low-frequency ISM band.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2367

8755 Techniques with Statistics for Web Page Watermarking

Authors: Mohamed Lahcen BenSaad, Sun XingMing

Abstract:

Information hiding, especially watermarking is a promising technique for the protection of intellectual property rights. This technology is mainly advanced for multimedia but the same has not been done for text. Web pages, like other documents, need a protection against piracy. In this paper, some techniques are proposed to show how to hide information in web pages using some features of the markup language used to describe these pages. Most of the techniques proposed here use the white space to hide information or some varieties of the language in representing elements. Experiments on a very small page and analysis of five thousands web pages show that these techniques have a wide bandwidth available for information hiding, and they might form a solid base to develop a robust algorithm for web page watermarking.

Keywords: Digital Watermarking, Information Hiding, Markup Language, Text watermarking, Software Watermarking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1786

8754 Hydrogeological Risk and Mining Tunnels: the Fontane-Rodoretto Mine Turin (Italy)

Authors: Paola Gattinoni, Laura Scesi, Elena Cerino Adbin, Daniele Cremonesi

Abstract:

The interaction of tunneling or mining with groundwater has become a very relevant problem not only due to the need to guarantee the safety of workers and to assure the efficiency of the tunnel drainage systems, but also to safeguard water resources from impoverishment and pollution risk. Therefore it is very important to forecast the drainage processes (i.e., the evaluation of drained discharge and drawdown caused by the excavation). The aim of this study was to know better the system and to quantify the flow drained from the Fontane mines, located in Val Germanasca (Turin, Italy). This allowed to understand the hydrogeological local changes in time. The work has therefore been structured as follows: the reconstruction of the conceptual model with the geological, hydrogeological and geological-structural study; the calculation of the tunnel inflows (through the use of structural methods) and the comparison with the measured flow rates; the water balance at the basin scale. In this way it was possible to understand what are the relationships between rainfall, groundwater level variations and the effect of the presence of tunnels as a means of draining water. Subsequently, it the effects produced by the excavation of the mining tunnels was quantified, through numerical modeling. In particular, the modeling made it possible to observe the drawdown variation as a function of number, excavation depth and different mines linings.

Keywords: Groundwater, Italy, numerical model, tunneling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1919

8753 Fuzzy Ideology based Long Term Load Forecasting

Authors: Jagadish H. Pujar

Abstract:

Fuzzy Load forecasting plays a paramount role in the operation and management of power systems. Accurate estimation of future power demands for various lead times facilitates the task of generating power reliably and economically. The forecasting of future loads for a relatively large lead time (months to few years) is studied here (long term load forecasting). Among the various techniques used in forecasting load, artificial intelligence techniques provide greater accuracy to the forecasts as compared to conventional techniques. Fuzzy Logic, a very robust artificial intelligent technique, is described in this paper to forecast load on long term basis. The paper gives a general algorithm to forecast long term load. The algorithm is an Extension of Short term load forecasting method to Long term load forecasting and concentrates not only on the forecast values of load but also on the errors incorporated into the forecast. Hence, by correcting the errors in the forecast, forecasts with very high accuracy have been achieved. The algorithm, in the paper, is demonstrated with the help of data collected for residential sector (LT2 (a) type load: Domestic consumers). Load, is determined for three consecutive years (from April-06 to March-09) in order to demonstrate the efficiency of the algorithm and to forecast for the next two years (from April-09 to March-11).

Keywords: Fuzzy Logic Control (FLC), Data DependantFactors(DDF), Model Dependent Factors(MDF), StatisticalError(SE), Short Term Load Forecasting (STLF), MiscellaneousError(ME).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2460

8752 Survey on Arabic Sentiment Analysis in Twitter

Authors: Sarah O. Alhumoud, Mawaheb I. Altuwaijri, Tarfa M. Albuhairi, Wejdan M. Alohaideb

Abstract:

Large-scale data stream analysis has become one of the important business and research priorities lately. Social networks like Twitter and other micro-blogging platforms hold an enormous amount of data that is large in volume, velocity and variety. Extracting valuable information and trends out of these data would aid in a better understanding and decision-making. Multiple analysis techniques are deployed for English content. Moreover, one of the languages that produce a large amount of data over social networks and is least analyzed is the Arabic language. The proposed paper is a survey on the research efforts to analyze the Arabic content in Twitter focusing on the tools and methods used to extract the sentiments for the Arabic content on Twitter.

Keywords: Big Data, Social Networks, Sentiment Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4343

8751 A Distance Function for Data with Missing Values and Its Application

Authors: Loai AbdAllah, Ilan Shimshoni

Abstract:

Missing values in data are common in real world applications. Since the performance of many data mining algorithms depend critically on it being given a good metric over the input space, we decided in this paper to define a distance function for unlabeled datasets with missing values. We use the Bhattacharyya distance, which measures the similarity of two probability distributions, to define our new distance function. According to this distance, the distance between two points without missing attributes values is simply the Mahalanobis distance. When on the other hand there is a missing value of one of the coordinates, the distance is computed according to the distribution of the missing coordinate. Our distance is general and can be used as part of any algorithm that computes the distance between data points. Because its performance depends strongly on the chosen distance measure, we opted for the k nearest neighbor classifier to evaluate its ability to accurately reflect object similarity. We experimented on standard numerical datasets from the UCI repository from different fields. On these datasets we simulated missing values and compared the performance of the kNN classifier using our distance to other three basic methods. Our experiments show that kNN using our distance function outperforms the kNN using other methods. Moreover, the runtime performance of our method is only slightly higher than the other methods.

Keywords: Missing values, Distance metric, Bhattacharyya distance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2741

8750 Utilization of Process Mapping Tool to Enhance Production Drilling in Underground Metal Mining Operations

Authors: Sidharth Talan, Sanjay Kumar Sharma, Eoin Joseph Wallace, Nikita Agrawal

Abstract:

Underground mining is at the core of rapidly evolving metals and minerals sector due to the increasing mineral consumption globally. Even though the surface mines are still more abundant on earth, the scales of industry are slowly tipping towards underground mining due to rising depth and complexities of orebodies. Thus, the efficient and productive functioning of underground operations depends significantly on the synchronized performance of key elements such as operating site, mining equipment, manpower and mine services. Production drilling is the process of conducting long hole drilling for the purpose of charging and blasting these holes for the production of ore in underground metal mines. Thus, production drilling is the crucial segment in the underground metal mining value chain. This paper presents the process mapping tool to evaluate the production drilling process in the underground metal mining operation by dividing the given process into three segments namely Input, Process and Output. The three segments are further segregated into factors and sub-factors. As per the study, the major input factors crucial for the efficient functioning of production drilling process are power, drilling water, geotechnical support of the drilling site, skilled drilling operators, services installation crew, oils and drill accessories for drilling machine, survey markings at drill site, proper housekeeping, regular maintenance of drill machine, suitable transportation for reaching the drilling site and finally proper ventilation. The major outputs for the production drilling process are ore, waste as a result of dilution, timely reporting and investigation of unsafe practices, optimized process time and finally well fragmented blasted material within specifications set by the mining company. The paper also exhibits the drilling loss matrix, which is utilized to appraise the loss in planned production meters per day in a mine on account of availability loss in the machine due to breakdowns, underutilization of the machine and productivity loss in the machine measured in drilling meters per unit of percussion hour with respect to its planned productivity for the day. The given three losses would be essential to detect the bottlenecks in the process map of production drilling operation so as to instigate the action plan to suppress or prevent the causes leading to the operational performance deficiency. The given tool is beneficial to mine management to focus on the critical factors negatively impacting the production drilling operation and design necessary operational and maintenance strategies to mitigate them.

Keywords: Process map, drilling loss matrix, availability, utilization, productivity, percussion rate.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1078

8749 Learning to Order Terms: Supervised Interestingness Measures in Terminology Extraction

Authors: Jérôme Azé, Mathieu Roche, Yves Kodratoff, Michèle Sebag

Abstract:

Term Extraction, a key data preparation step in Text Mining, extracts the terms, i.e. relevant collocation of words, attached to specific concepts (e.g. genetic-algorithms and decisiontrees are terms associated to the concept “Machine Learning" ). In this paper, the task of extracting interesting collocations is achieved through a supervised learning algorithm, exploiting a few collocations manually labelled as interesting/not interesting. From these examples, the ROGER algorithm learns a numerical function, inducing some ranking on the collocations. This ranking is optimized using genetic algorithms, maximizing the trade-off between the false positive and true positive rates (Area Under the ROC curve). This approach uses a particular representation for the word collocations, namely the vector of values corresponding to the standard statistical interestingness measures attached to this collocation. As this representation is general (over corpora and natural languages), generality tests were performed by experimenting the ranking function learned from an English corpus in Biology, onto a French corpus of Curriculum Vitae, and vice versa, showing a good robustness of the approaches compared to the state-of-the-art Support Vector Machine (SVM).

Keywords: Text-mining, Terminology Extraction, Evolutionary algorithm, ROC Curve.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1654

8748 A New Version of Annotation Method with a XML-based Knowledge Base

Authors: Mohammad Yasrebi, Somayeh Khosravi

Abstract:

Machine-understandable data when strongly interlinked constitutes the basis for the SemanticWeb. Annotating web documents is one of the major techniques for creating metadata on the Web. Annotating websitexs defines the containing data in a form which is suitable for interpretation by machines. In this paper, we present a better and improved approach than previous [1] to annotate the texts of the websites depends on the knowledge base.

Keywords: Knowledge base, ontology, semantic annotation, XML.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1558

8747 From Industry 4.0 to Agriculture 4.0: A Framework to Manage Product Data in Agri-Food Supply Chain for Voluntary Traceability

Authors: Angelo Corallo, Maria Elena Latino, Marta Menegoli

Abstract:

Agri-food value chain involves various stakeholders with different roles. All of them abide by national and international rules and leverage marketing strategies to advance their products. Food products and related processing phases carry with it a big mole of data that are often not used to inform final customer. Some data, if fittingly identified and used, can enhance the single company, and/or the all supply chain creates a math between marketing techniques and voluntary traceability strategies. Moreover, as of late, the world has seen buying-models’ modification: customer is careful on wellbeing and food quality. Food citizenship and food democracy was born, leveraging on transparency, sustainability and food information needs. Internet of Things (IoT) and Analytics, some of the innovative technologies of Industry 4.0, have a significant impact on market and will act as a main thrust towards a genuine ‘4.0 change’ for agriculture. But, realizing a traceability system is not simple because of the complexity of agri-food supply chain, a lot of actors involved, different business models, environmental variations impacting products and/or processes, and extraordinary climate changes. In order to give support to the company involved in a traceability path, starting from business model analysis and related business process a Framework to Manage Product Data in Agri-Food Supply Chain for Voluntary Traceability was conceived. Studying each process task and leveraging on modeling techniques lead to individuate information held by different actors during agri-food supply chain. IoT technologies for data collection and Analytics techniques for data processing supply information useful to increase the efficiency intra-company and competitiveness in the market. The whole information recovered can be shown through IT solutions and mobile application to made accessible to the company, the entire supply chain and the consumer with the view to guaranteeing transparency and quality.

Keywords: Agriculture 4.0, agri-food supply chain, Industry 4.0, voluntary traceability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2341

8746 Influenza Pattern Analysis System through Mining Weblogs

Authors: Pei Lin Khoo, Yunli Lee

Abstract:

Weblogs are resource of social structure to discover and track the various type of information written by blogger. In this paper, we proposed to use mining weblogs technique for identifying the trends of influenza where blogger had disseminated their opinion for the anomaly disease. In order to identify the trends, web crawler is applied to perform a search and generated a list of visited links based on a set of influenza keywords. This information is used to implement the analytics report system for monitoring and analyzing the pattern and trends of influenza (H1N1). Statistical and graphical analysis reports are generated. Both types of the report have shown satisfactory reports that reflect the awareness of Malaysian on the issue of influenza outbreak through blogs.

Keywords: H1N1, Weblogs, Web Crawler, Analytics Report System.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2460

8745 Union is Strength in Lossy Image Compression

Authors: Mario Mastriani

Abstract:

In this work, we present a comparison between different techniques of image compression. First, the image is divided in blocks which are organized according to a certain scan. Later, several compression techniques are applied, combined or alone. Such techniques are: wavelets (Haar's basis), Karhunen-Loève Transform, etc. Simulations show that the combined versions are the best, with minor Mean Squared Error (MSE), and higher Peak Signal to Noise Ratio (PSNR) and better image quality, even in the presence of noise.

Keywords: Haar's basis, Image compression, Karhunen-LoèveTransform, Morton's scan, row-rafter scan.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1741

8744 An Approach for Reducing the Computational Complexity of LAMSTAR Intrusion Detection System using Principal Component Analysis

Authors: V. Venkatachalam, S. Selvan

Abstract:

The security of computer networks plays a strategic role in modern computer systems. Intrusion Detection Systems (IDS) act as the 'second line of defense' placed inside a protected network, looking for known or potential threats in network traffic and/or audit data recorded by hosts. We developed an Intrusion Detection System using LAMSTAR neural network to learn patterns of normal and intrusive activities, to classify observed system activities and compared the performance of LAMSTAR IDS with other classification techniques using 5 classes of KDDCup99 data. LAMSAR IDS gives better performance at the cost of high Computational complexity, Training time and Testing time, when compared to other classification techniques (Binary Tree classifier, RBF classifier, Gaussian Mixture classifier). we further reduced the Computational Complexity of LAMSTAR IDS by reducing the dimension of the data using principal component analysis which in turn reduces the training and testing time with almost the same performance.

Keywords: Binary Tree Classifier, Gaussian Mixture, IntrusionDetection System, LAMSTAR, Radial Basis Function.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1734

8743 Multidimensional Visualization Tools for Analysis of Expression Data

Authors: Urska Cvek, Marjan Trutschl, Randolph Stone II, Zanobia Syed, John L. Clifford, Anita L. Sabichi

Abstract:

Expression data analysis is based mostly on the statistical approaches that are indispensable for the study of biological systems. Large amounts of multidimensional data resulting from the high-throughput technologies are not completely served by biostatistical techniques and are usually complemented with visual, knowledge discovery and other computational tools. In many cases, in biological systems we only speculate on the processes that are causing the changes, and it is the visual explorative analysis of data during which a hypothesis is formed. We would like to show the usability of multidimensional visualization tools and promote their use in life sciences. We survey and show some of the multidimensional visualization tools in the process of data exploration, such as parallel coordinates and radviz and we extend them by combining them with the self-organizing map algorithm. We use a time course data set of transitional cell carcinoma of the bladder in our examples. Analysis of data with these tools has the potential to uncover additional relationships and non-trivial structures.

Keywords: microarrays, visualization, parallel coordinates, radviz, self-organizing maps.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2500

8742 Modeling of Reinforcement in Concrete Beams Using Machine Learning Tools

Authors: Yogesh Aggarwal

Abstract:

The paper discusses the results obtained to predict reinforcement in singly reinforced beam using Neural Net (NN), Support Vector Machines (SVM-s) and Tree Based Models. Major advantage of SVM-s over NN is of minimizing a bound on the generalization error of model rather than minimizing a bound on mean square error over the data set as done in NN. Tree Based approach divides the problem into a small number of sub problems to reach at a conclusion. Number of data was created for different parameters of beam to calculate the reinforcement using limit state method for creation of models and validation. The results from this study suggest a remarkably good performance of tree based and SVM-s models. Further, this study found that these two techniques work well and even better than Neural Network methods. A comparison of predicted values with actual values suggests a very good correlation coefficient with all four techniques.

Keywords: Linear Regression, M5 Model Tree, Neural Network, Support Vector Machines.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2024

8741 Masquerade and “What Comes Behind Six Is More Than Seven”: Thoughts on Art History and Visual Culture Research Methods

Authors: Osa D Egonwa

Abstract:

In the 21^st century, the disciplinary boundaries of past centuries that we often create through mainstream art historical classification, techniques and sources may have been eroded by visual culture, which seems to provide a more inclusive umbrella for the new ways artists go about the creative process and its resultant commodities. Over the past four decades, artists in Africa have resorted to new materials, techniques and themes which have affected our ways of research on these artists and their art. Frontline artists such as El Anatsui, Yinka Shonibare, Erasmus Onyishi are demonstrating that any material is just suitable for artistic expression. Most of times, these materials come with their own techniques/effects and visual syntax: a combination of materials compounds techniques, formal aesthetic indexes, halo effects, and iconography. This tends to challenge the categories and we lean on to view, think and talk about them. This renders our main stream art historical research methods inadequate, thus suggesting new discursive concepts, terms and theories. This paper proposed the Africanist eclectic methods derived from the dual framework of Masquerade Theory and What Comes Behind Six is More Than Seven. This paper shares thoughts/research on art historical methods, terminological re-alignments on classification/source data, presentational format and interpretation arising from the emergent trends in our subject. The outcome provides useful tools to mediate new thoughts and experiences in recent African art and visual culture.

Keywords: Art Historical Methods, Classifications, Concepts , Re-alignment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 632

8740 Cross Project Software Fault Prediction at Design Phase

Authors: Pradeep Singh, Shrish Verma

Abstract:

Software fault prediction models are created by using the source code, processed metrics from the same or previous version of code and related fault data. Some company do not store and keep track of all artifacts which are required for software fault prediction. To construct fault prediction model for such company, the training data from the other projects can be one potential solution. Earlier we predicted the fault the less cost it requires to correct. The training data consists of metrics data and related fault data at function/module level. This paper investigates fault predictions at early stage using the cross-project data focusing on the design metrics. In this study, empirical analysis is carried out to validate design metrics for cross project fault prediction. The machine learning techniques used for evaluation is Naïve Bayes. The design phase metrics of other projects can be used as initial guideline for the projects where no previous fault data is available. We analyze seven datasets from NASA Metrics Data Program which offer design as well as code metrics. Overall, the results of cross project is comparable to the within company data learning.

Keywords: Software Metrics, Fault prediction, Cross project, Within project.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2536

8739 Inverse Sets-based Recognition of Video Clips

Authors: Alexei M. Mikhailov

Abstract:

The paper discusses the mathematics of pattern indexing and its applications to recognition of visual patterns that are found in video clips. It is shown that (a) pattern indexes can be represented by collections of inverted patterns, (b) solutions to pattern classification problems can be found as intersections and histograms of inverted patterns and, thus, matching of original patterns avoided.

Keywords: Artificial neural cortex, computational biology, data mining, pattern recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2105

8738 Use of Bayesian Network in Information Extraction from Unstructured Data Sources

Authors: Quratulain N. Rajput, Sajjad Haider

Abstract:

This paper applies Bayesian Networks to support information extraction from unstructured, ungrammatical, and incoherent data sources for semantic annotation. A tool has been developed that combines ontologies, machine learning, and information extraction and probabilistic reasoning techniques to support the extraction process. Data acquisition is performed with the aid of knowledge specified in the form of ontology. Due to the variable size of information available on different data sources, it is often the case that the extracted data contains missing values for certain variables of interest. It is desirable in such situations to predict the missing values. The methodology, presented in this paper, first learns a Bayesian network from the training data and then uses it to predict missing data and to resolve conflicts. Experiments have been conducted to analyze the performance of the presented methodology. The results look promising as the methodology achieves high degree of precision and recall for information extraction and reasonably good accuracy for predicting missing values.

Keywords: Information Extraction, Bayesian Network, ontology, Machine Learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2223

8737 Sequential Partitioning Brainbow Image Segmentation Using Bayesian

Authors: Yayun Hsu, Henry Horng-Shing Lu

Abstract:

This paper proposes a data-driven, biology-inspired neural segmentation method of 3D drosophila Brainbow images. We use Bayesian Sequential Partitioning algorithm for probabilistic modeling, which can be used to detect somas and to eliminate crosstalk effects. This work attempts to develop an automatic methodology for neuron image segmentation, which nowadays still lacks a complete solution due to the complexity of the image. The proposed method does not need any predetermined, risk-prone thresholds, since biological information is inherently included inside the image processing procedure. Therefore, it is less sensitive to variations in neuron morphology; meanwhile, its flexibility would be beneficial for tracing the intertwining structure of neurons.

Keywords: Brainbow, 3D imaging, image segmentation, neuron morphology, biological data mining, non-parametric learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2250