Search results for: Text Data Mining
7429 IT Systems of the US Federal Courts, Justice, and Governance
Authors: Joseph Zernik
Abstract:
Validity, integrity, and impacts of the IT systems of the US federal courts have been studied as part of the Human Rights Alert-NGO (HRA) submission for the 2015 Universal Periodic Review (UPR) of human rights in the United States by the Human Rights Council (HRC) of the United Nations (UN). The current report includes overview of IT system analysis, data-mining and case studies. System analysis and data-mining show: Development and implementation with no lawful authority, servers of unverified identity, invalidity in implementation of electronic signatures, authentication instruments and procedures, authorities and permissions; discrimination in access against the public and unrepresented (pro se) parties and in favor of attorneys; widespread publication of invalid judicial records and dockets, leading to their false representation and false enforcement. A series of case studies documents the impacts on individuals' human rights, on banking regulation, and on international matters. Significance is discussed in the context of various media and expert reports, which opine unprecedented corruption of the US justice system today, and which question, whether the US Constitution was in fact suspended. Similar findings were previously reported in IT systems of the State of California and the State of Israel, which were incorporated, subject to professional HRC staff review, into the UN UPR reports (2010 and 2013). Solutions are proposed, based on the principles of publicity of the law and the separation of power: Reliance on US IT and legal experts under accountability to the legislative branch, enhancing transparency, ongoing vigilance by human rights and internet activists. IT experts should assume more prominent civic duties in the safeguard of civil society in our era.
Keywords: E-justice, federal courts, United States, human rights, banking regulation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21497428 Sequential Partitioning Brainbow Image Segmentation Using Bayesian
Authors: Yayun Hsu, Henry Horng-Shing Lu
Abstract:
This paper proposes a data-driven, biology-inspired neural segmentation method of 3D drosophila Brainbow images. We use Bayesian Sequential Partitioning algorithm for probabilistic modeling, which can be used to detect somas and to eliminate crosstalk effects. This work attempts to develop an automatic methodology for neuron image segmentation, which nowadays still lacks a complete solution due to the complexity of the image. The proposed method does not need any predetermined, risk-prone thresholds, since biological information is inherently included inside the image processing procedure. Therefore, it is less sensitive to variations in neuron morphology; meanwhile, its flexibility would be beneficial for tracing the intertwining structure of neurons.
Keywords: Brainbow, 3D imaging, image segmentation, neuron morphology, biological data mining, non-parametric learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22617427 On the Efficient Implementation of a Serial and Parallel Decomposition Algorithm for Fast Support Vector Machine Training Including a Multi-Parameter Kernel
Authors: Tatjana Eitrich, Bruno Lang
Abstract:
This work deals with aspects of support vector machine learning for large-scale data mining tasks. Based on a decomposition algorithm for support vector machine training that can be run in serial as well as shared memory parallel mode we introduce a transformation of the training data that allows for the usage of an expensive generalized kernel without additional costs. We present experiments for the Gaussian kernel, but usage of other kernel functions is possible, too. In order to further speed up the decomposition algorithm we analyze the critical problem of working set selection for large training data sets. In addition, we analyze the influence of the working set sizes onto the scalability of the parallel decomposition scheme. Our tests and conclusions led to several modifications of the algorithm and the improvement of overall support vector machine learning performance. Our method allows for using extensive parameter search methods to optimize classification accuracy.
Keywords: Support Vector Machine Training, Multi-ParameterKernels, Shared Memory Parallel Computing, Large Data
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14437426 Seasonal Variation of the Impact of Mining Activities on Ga-Selati River in Limpopo Province, South Africa
Authors: Joshua N. Edokpayi, John O. Odiyo, Patience P. Shikwambana
Abstract:
Water is a very rare natural resource in South Africa. Ga-Selati River is used for both domestic and industrial purposes. This study was carried out in order to assess the quality of Ga-Selati River in a mining area of Limpopo Province-Phalaborwa. The pH, Electrical Conductivity (EC) and Total Dissolved Solids (TDS) were determined using a Crinson multimeter while turbidity was measured using a Labcon Turbidimeter. The concentrations of Al, Ca, Cd, Cr, Fe, K, Mg, Mn, Na and Pb were analysed in triplicate using a Varian 520 flame atomic absorption spectrometer (AAS) supplied by PerkinElmer, after acid digestion with nitric acid in a fume cupboard. The average pH of the river from eight different sampling sites was 8.00 and 9.38 in wet and dry season respectively. Higher EC values were determined in the dry season (138.7 mS/m) than in the wet season (96.93 mS/m). Similarly, TDS values were higher in dry (929.29 mg/L) than in the wet season (640.72 mg/L) season. These values exceeded the recommended guideline of South Africa Department of Water Affairs and Forestry (DWAF) for domestic water use (70 mS/m) and that of the World Health Organization (WHO) (600 mS/m), respectively. Turbidity varied between 1.78-5.20 and 0.95-2.37 NTU in both wet and dry seasons. Total hardness of 312.50 mg/L and 297.75 mg/L as the concentration of CaCO3 was computed for the river in both the wet and the dry seasons and the river water was categorised as very hard. Mean concentration of the metals studied in both the wet and the dry seasons are: Na (94.06 mg/L and 196.3 mg/L), K (11.79 mg/L and 13.62 mg/L), Ca (45.60 mg/L and 41.30 mg/L), Mg (48.41 mg/L and 44.71 mg/L), Al (0.31 mg/L and 0.38 mg/L), Cd (0.01 mg/L and 0.01 mg/L), Cr (0.02 mg/L and 0.09 mg/L), Pb (0.05 mg/L and 0.06 mg/L), Mn (0.31 mg/L and 0.11 mg/L) and Fe (0.76 mg/L and 0.69 mg/L). Results from this study reveal that most of the metals were present in concentrations higher than the recommended guidelines of DWAF and WHO for domestic use and the protection of aquatic life.Keywords: Contamination, mining activities, surface water, trace metals.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19897425 Improved C-Fuzzy Decision Tree for Intrusion Detection
Authors: Krishnamoorthi Makkithaya, N. V. Subba Reddy, U. Dinesh Acharya
Abstract:
As the number of networked computers grows, intrusion detection is an essential component in keeping networks secure. Various approaches for intrusion detection are currently being in use with each one has its own merits and demerits. This paper presents our work to test and improve the performance of a new class of decision tree c-fuzzy decision tree to detect intrusion. The work also includes identifying best candidate feature sub set to build the efficient c-fuzzy decision tree based Intrusion Detection System (IDS). We investigated the usefulness of c-fuzzy decision tree for developing IDS with a data partition based on horizontal fragmentation. Empirical results indicate the usefulness of our approach in developing the efficient IDS.Keywords: Data mining, Decision tree, Feature selection, Fuzzyc- means clustering, Intrusion detection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15777424 Development of a Serial Signal Monitoring Program for Educational Purposes
Authors: Jungho Moon, Lae-Jeong Park
Abstract:
This paper introduces a signal monitoring program developed with a view to helping electrical engineering students get familiar with sensors with digital output. Because the output of digital sensors cannot be simply monitored by a measuring instrument such as an oscilloscope, students tend to have a hard time dealing with digital sensors. The monitoring program runs on a PC and communicates with an MCU that reads the output of digital sensors via an asynchronous communication interface. Receiving the sensor data from the MCU, the monitoring program shows time and/or frequency domain plots of the data in real time. In addition, the monitoring program provides a serial terminal that enables the user to exchange text information with the MCU while the received data is plotted. The user can easily observe the output of digital sensors and configure the digital sensors in real time, which helps students who do not have enough experiences with digital sensors. Though the monitoring program was programmed in the Matlab programming language, it runs without the Matlab since it was compiled as a standalone executable.Keywords: Digital sensor, MATLAB, MCU, signal monitoring program.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21157423 MHD Stagnation Point Flow towards a Shrinking Sheet with Suction in an Upper-Convected Maxwell (UCM) Fluid
Authors: K. Jafar, R. Nazar, A. Ishak, I. Pop
Abstract:
The present analysis considers the steady stagnation point flow and heat transfer towards a permeable shrinking sheet in an upper-convected Maxwell (UCM) electrically conducting fluid, with a constant magnetic field applied in the transverse direction to flow and a local heat generation within the boundary layer, with a heat generation rate proportional to (T-T)p Using a similarity transformation, the governing system of partial differential equations is first transformed into a system of ordinary differential equations, which is then solved numerically using a finite-difference scheme known as the Keller-box method. Numerical results are obtained for the flow and thermal fields for various values of the stretching/shrinking parameter λ, the magnetic parameter M, the elastic parameter K, the Prandtl number Pr, the suction parameter s, the heat generation parameter Q, and the exponent p. The results indicate the existence of dual solutions for the shrinking sheet up to a critical value λc whose value depends on the value of M, K, and s. In the presence of internal heat absorption (Q<0) the surface heat transfer rate decreases with increasing p but increases with parameters Q and s when the sheet is either stretched or shrunk.
Keywords: Magnetohydrodynamic (MHD), boundary layer flow, UCM fluid, stagnation point, shrinking sheet.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20717422 Examining the Dubbing Strategies Used in the Egyptian Dubbed Version of Mulan (1998)
Authors: Shaza Melies, Saadeya Salem, Seham Kareh
Abstract:
Cartoon films are multisemiotic as various modes integrate in the production of meaning. This study aims to examine the cultural and linguistic specific references in the Egyptian dubbed cartoon film Mulan. The study examines the translation strategies implemented in the Egyptian dubbed version of Mulan to meet the cultural preferences of the audience. The study reached the following findings: Using the traditional translation strategies does not deliver the intended meaning of the source text and causes loss in the intended humor. As a result, the findings showed that in the dubbed version, translators tend to omit, change, or add information to the target text to be accepted by the audience. The contrastive analysis of the Mulan (English and dubbed versions) proves the connotations that the dubbing has taken to be accepted by the target audience. Cartoon films are multisemiotic as various modes integrate in the production of meaning. This study aims to examine the cultural and linguistic specific references in the Egyptian dubbed cartoon film Mulan. The study examines the translation strategies implemented in the Egyptian dubbed version of Mulan to meet the cultural preferences of the audience. The study reached the following findings: Using the traditional translation strategies does not deliver the intended meaning of the source text and causes loss in the intended humor. As a result, the findings showed that in the dubbed version, translators tend to omit, change, or add information to the target text to be accepted by the audience. The contrastive analysis of the Mulan (English and dubbed versions) proves the connotations that the dubbing has taken to be accepted by the target audience.Keywords: Domestication, dubbing, Mulan, translation theories.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7417421 The Robust Clustering with Reduction Dimension
Authors: Dyah E. Herwindiati
Abstract:
A clustering is process to identify a homogeneous groups of object called as cluster. Clustering is one interesting topic on data mining. A group or class behaves similarly characteristics. This paper discusses a robust clustering process for data images with two reduction dimension approaches; i.e. the two dimensional principal component analysis (2DPCA) and principal component analysis (PCA). A standard approach to overcome this problem is dimension reduction, which transforms a high-dimensional data into a lower-dimensional space with limited loss of information. One of the most common forms of dimensionality reduction is the principal components analysis (PCA). The 2DPCA is often called a variant of principal component (PCA), the image matrices were directly treated as 2D matrices; they do not need to be transformed into a vector so that the covariance matrix of image can be constructed directly using the original image matrices. The decomposed classical covariance matrix is very sensitive to outlying observations. The objective of paper is to compare the performance of robust minimizing vector variance (MVV) in the two dimensional projection PCA (2DPCA) and the PCA for clustering on an arbitrary data image when outliers are hiden in the data set. The simulation aspects of robustness and the illustration of clustering images are discussed in the end of paperKeywords: Breakdown point, Consistency, 2DPCA, PCA, Outlier, Vector Variance
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16987420 Localizing and Recognizing Integral Pitches of Cheque Document Images
Authors: Bremananth R., Veerabadran C. S., Andy W. H. Khong
Abstract:
Automatic reading of handwritten cheque is a computationally complex process and it plays an important role in financial risk management. Machine vision and learning provide a viable solution to this problem. Research effort has mostly been focused on recognizing diverse pitches of cheques and demand drafts with an identical outline. However most of these methods employ templatematching to localize the pitches and such schemes could potentially fail when applied to different types of outline maintained by the bank. In this paper, the so-called outline problem is resolved by a cheque information tree (CIT), which generalizes the localizing method to extract active-region-of-entities. In addition, the weight based density plot (WBDP) is performed to isolate text entities and read complete pitches. Recognition is based on texture features using neural classifiers. Legal amount is subsequently recognized by both texture and perceptual features. A post-processing phase is invoked to detect the incorrect readings by Type-2 grammar using the Turing machine. The performance of the proposed system was evaluated using cheque and demand drafts of 22 different banks. The test data consists of a collection of 1540 leafs obtained from 10 different account holders from each bank. Results show that this approach can easily be deployed without significant design amendments.Keywords: Cheque reading, Connectivity checking, Text localization, Texture analysis, Turing machine, Signature verification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16587419 Integration of Support Vector Machine and Bayesian Neural Network for Data Mining and Classification
Authors: Essam Al-Daoud
Abstract:
Several combinations of the preprocessing algorithms, feature selection techniques and classifiers can be applied to the data classification tasks. This study introduces a new accurate classifier, the proposed classifier consist from four components: Signal-to- Noise as a feature selection technique, support vector machine, Bayesian neural network and AdaBoost as an ensemble algorithm. To verify the effectiveness of the proposed classifier, seven well known classifiers are applied to four datasets. The experiments show that using the suggested classifier enhances the classification rates for all datasets.Keywords: AdaBoost, Bayesian neural network, Signal-to-Noise, support vector machine, MCMC.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20217418 Graphical Password Security Evaluation by Fuzzy AHP
Authors: Arash Habibi Lashkari, Azizah Abdul Manaf, Maslin Masrom
Abstract:
In today's day and age, one of the important topics in information security is authentication. There are several alternatives to text-based authentication of which includes Graphical Password (GP) or Graphical User Authentication (GUA). These methods stems from the fact that humans recognized and remembers images better than alphanumerical text characters. This paper will focus on the security aspect of GP algorithms and what most researchers have been working on trying to define these security features and attributes. The goal of this study is to develop a fuzzy decision model that allows automatic selection of available GP algorithms by taking into considerations the subjective judgments of the decision makers who are more than 50 postgraduate students of computer science. The approach that is being proposed is based on the Fuzzy Analytic Hierarchy Process (FAHP) which determines the criteria weight as a linear formula.Keywords: Graphical Password, Authentication Security, Attack Patterns, Brute force attack, Dictionary attack, Guessing Attack, Spyware attack, Shoulder surfing attack, Social engineering Attack, Password Entropy, Password Space.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19377417 Evaluating some Feature Selection Methods for an Improved SVM Classifier
Authors: Daniel Morariu, Lucian N. Vintan, Volker Tresp
Abstract:
Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of features selection methods to reduce the dimensionality of the document-representation vector. Four feature selection methods are evaluated: Random Selection, Information Gain (IG), Support Vector Machine (called SVM_FS) and Genetic Algorithm with SVM (GA_FS). We showed that the best results were obtained with SVM_FS and GA_FS methods for a relatively small dimension of the features vector comparative with the IG method that involves longer vectors, for quite similar classification accuracies. Also we present a novel method to better correlate SVM kernel-s parameters (Polynomial or Gaussian kernel).
Keywords: Features selection, learning with kernels, support vector machine, genetic algorithms and classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15407416 Skin Effect: A Natural Phenomenon for Minimization of Ground Bounce in VLSI RC Interconnect
Authors: Shilpi Lavania
Abstract:
As the frequency of operation has attained a range of GHz and signal rise time continues to increase interconnect technology is suffering due to various high frequency effects as well as ground bounce problem. In some recent studies a high frequency effect i.e. skin effect has been modeled and its drawbacks have been discussed. This paper strives to make an impression on the advantage side of modeling skin effect for interconnect line. The proposed method has considered a CMOS with RC interconnect. Delay and noise considering ground bounce problem and with skin effect are discussed. The simulation results reveal an advantage of considering skin effect for minimization of ground bounce problem during the working of the model. Noise and delay variations with temperature are also presented.
Keywords: Interconnect, Skin effect, Ground Bounce, Delay, Noise.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 31417415 Context-aware Recommender Systems using Data Mining Techniques
Authors: Kyoung-jae Kim, Hyunchul Ahn, Sangwon Jeong
Abstract:
This study proposes a novel recommender system to provide the advertisements of context-aware services. Our proposed model is designed to apply a modified collaborative filtering (CF) algorithm with regard to the several dimensions for the personalization of mobile devices – location, time and the user-s needs type. In particular, we employ a classification rule to understand user-s needs type using a decision tree algorithm. In addition, we collect primary data from the mobile phone users and apply them to the proposed model to validate its effectiveness. Experimental results show that the proposed system makes more accurate and satisfactory advertisements than comparative systems.Keywords: Location-based advertisement, Recommender system, Collaborative filtering, User needs type, Mobile user.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21757414 Semi-Automatic Method to Assist Expert for Association Rules Validation
Authors: Amdouni Hamida, Gammoudi Mohamed Mohsen
Abstract:
In order to help the expert to validate association rules extracted from data, some quality measures are proposed in the literature. We distinguish two categories: objective and subjective measures. The first one depends on a fixed threshold and on data quality from which the rules are extracted. The second one consists on providing to the expert some tools in the objective to explore and visualize rules during the evaluation step. However, the number of extracted rules to validate remains high. Thus, the manually mining rules task is very hard. To solve this problem, we propose, in this paper, a semi-automatic method to assist the expert during the association rule's validation. Our method uses rule-based classification as follow: (i) We transform association rules into classification rules (classifiers), (ii) We use the generated classifiers for data classification. (iii) We visualize association rules with their quality classification to give an idea to the expert and to assist him during validation process.Keywords: Association rules, Rule-based classification, Classification quality, Validation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17927413 Use and Relationship of Shell Nouns as Cohesive Devices in the Quality of Second Language Writing
Authors: Kristine D. de Leon, Junifer A. Abatayo, Jose Cristina M. Pariña
Abstract:
The current study is a comparative analysis of the use of shell nouns as a cohesive device (CD) in an English for Second Language (ESL) setting in order to identify their use and relationship in the quality of second language (L2) writing. As these nouns were established to anticipate the meaning within, across or outside the text, their use has fascinated writing researchers. The corpus of the study included published articles from reputable journals and graduate students’ papers in order to analyze the frequency of shell nouns using “highly prevalent” nouns in the academic community, to identify the different lexicogrammatical patterns where these nouns occur and to the functions connected with these patterns. The result of the study implies that published authors used more shell nouns in their paper than graduate students. However, the functions of the different lexicogrammatical patterns for the frequently occurring shell nouns are somewhat similar. These results could help students in enhancing the cohesion of their text and in comprehending it.
Keywords: Anaphoric-cataphoric, cohesive device, lexicogrammatical, shell nouns.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6537412 Exploring Influence Range of Tainan City Using Electronic Toll Collection Big Data
Authors: Chen Chou, Feng-Tyan Lin
Abstract:
Big Data has been attracted a lot of attentions in many fields for analyzing research issues based on a large number of maternal data. Electronic Toll Collection (ETC) is one of Intelligent Transportation System (ITS) applications in Taiwan, used to record starting point, end point, distance and travel time of vehicle on the national freeway. This study, taking advantage of ETC big data, combined with urban planning theory, attempts to explore various phenomena of inter-city transportation activities. ETC, one of government's open data, is numerous, complete and quick-update. One may recall that living area has been delimited with location, population, area and subjective consciousness. However, these factors cannot appropriately reflect what people’s movement path is in daily life. In this study, the concept of "Living Area" is replaced by "Influence Range" to show dynamic and variation with time and purposes of activities. This study uses data mining with Python and Excel, and visualizes the number of trips with GIS to explore influence range of Tainan city and the purpose of trips, and discuss living area delimited in current. It dialogues between the concepts of "Central Place Theory" and "Living Area", presents the new point of view, integrates the application of big data, urban planning and transportation. The finding will be valuable for resource allocation and land apportionment of spatial planning.
Keywords: Big Data, ITS, influence range, living area, central place theory, visualization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9777411 Machine Learning Methods for Network Intrusion Detection
Authors: Mouhammad Alkasassbeh, Mohammad Almseidin
Abstract:
Network security engineers work to keep services available all the time by handling intruder attacks. Intrusion Detection System (IDS) is one of the obtainable mechanisms that is used to sense and classify any abnormal actions. Therefore, the IDS must be always up to date with the latest intruder attacks signatures to preserve confidentiality, integrity, and availability of the services. The speed of the IDS is a very important issue as well learning the new attacks. This research work illustrates how the Knowledge Discovery and Data Mining (or Knowledge Discovery in Databases) KDD dataset is very handy for testing and evaluating different Machine Learning Techniques. It mainly focuses on the KDD preprocess part in order to prepare a decent and fair experimental data set. The J48, MLP, and Bayes Network classifiers have been chosen for this study. It has been proven that the J48 classifier has achieved the highest accuracy rate for detecting and classifying all KDD dataset attacks, which are of type DOS, R2L, U2R, and PROBE.
Keywords: IDS, DDoS, MLP, KDD.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7287410 Multi Language Text Editor for Burushaski and Urdu through Unicode
Authors: Irfan Qadir Baig, Muhammad Sharif, Aman Ullah Khan
Abstract:
This paper introduces an isolated and unique ancient language Burushaski, spoken in Hunza, Nagar, Yasin and parts of Gilgit in the Northern Areas of Pakistan. It explains the working mechanism of Multi Language Text Editor for Urdu and Burushaski. It is developed under the use of ISO/IEC 10646 Unicode standards for Urdu and Burushaski open-type fonts. It gives an ample opportunity to this regional ancient language to have a modern Information technology for its promotion and preservation. The main objective of this research paper is to help preserve the heritage of such rare languages and give smart way of automation. It also facilitates to those who are interested in undertaking research on Burushaski or keen to trace fonatic relationship between the national Urdu language and Burushaski. Since this editor covers both Burushaski and Urdu so it can play an important role to introduce Burusho linguistic culture to the world at large. Precisely, as a result of this research paper, Burushaski publication through IT means would be possible.Keywords: Burushaski, Bri Naqsh, Unicode, Burusho, Hunza, Meshaski.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21127409 Glass Bottle Inspector Based on Machine Vision
Authors: Huanjun Liu, Yaonan Wang, Feng Duan
Abstract:
This text studies glass bottle intelligent inspector based machine vision instead of manual inspection. The system structure is illustrated in detail in this paper. The text presents the method based on watershed transform methods to segment the possible defective regions and extract features of bottle wall by rules. Then wavelet transform are used to exact features of bottle finish from images. After extracting features, the fuzzy support vector machine ensemble is putted forward as classifier. For ensuring that the fuzzy support vector machines have good classification ability, the GA based ensemble method is used to combining the several fuzzy support vector machines. The experiments demonstrate that using this inspector to inspect glass bottles, the accuracy rate may reach above 97.5%.Keywords: Intelligent Inspection, Support Vector Machines, Ensemble Methods, watershed transform, Wavelet Transform
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 38987408 Performance Analysis of Chrominance Red and Chrominance Blue in JPEG
Authors: Mamta Garg
Abstract:
While compressing text files is useful, compressing still image files is almost a necessity. A typical image takes up much more storage than a typical text message and without compression images would be extremely clumsy to store and distribute. The amount of information required to store pictures on modern computers is quite large in relation to the amount of bandwidth commonly available to transmit them over the Internet and applications. Image compression addresses the problem of reducing the amount of data required to represent a digital image. Performance of any image compression method can be evaluated by measuring the root-mean-square-error & peak signal to noise ratio. The method of image compression that will be analyzed in this paper is based on the lossy JPEG image compression technique, the most popular compression technique for color images. JPEG compression is able to greatly reduce file size with minimal image degradation by throwing away the least “important" information. In JPEG, both color components are downsampled simultaneously, but in this paper we will compare the results when the compression is done by downsampling the single chroma part. In this paper we will demonstrate more compression ratio is achieved when the chrominance blue is downsampled as compared to downsampling the chrominance red in JPEG compression. But the peak signal to noise ratio is more when the chrominance red is downsampled as compared to downsampling the chrominance blue in JPEG compression. In particular we will use the hats.jpg as a demonstration of JPEG compression using low pass filter and demonstrate that the image is compressed with barely any visual differences with both methods.Keywords: JPEG, Discrete Cosine Transform, Quantization, Color Space Conversion, Image Compression, Peak Signal to Noise Ratio & Compression Ratio.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16797407 AI-Based Techniques for Online Social Media Network Sentiment Analysis: A Methodical Review
Authors: A. M. John-Otumu, M. M. Rahman, O. C. Nwokonkwo, M. C. Onuoha
Abstract:
Online social media networks have long served as a primary arena for group conversations, gossip, text-based information sharing and distribution. The use of natural language processing techniques for text classification and unbiased decision making has not been far-fetched. Proper classification of these textual information in a given context has also been very difficult. As a result, a systematic review was conducted from previous literature on sentiment classification and AI-based techniques. The study was done in order to gain a better understanding of the process of designing and developing a robust and more accurate sentiment classifier that could correctly classify social media textual information of a given context between hate speech and inverted compliments with a high level of accuracy using the knowledge gain from the evaluation of different artificial intelligence techniques reviewed. The study evaluated over 250 articles from digital sources like ACM digital library, Google Scholar, and IEEE Xplore; and whittled down the number of research to 52 articles. Findings revealed that deep learning approaches such as Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Bidirectional Encoder Representations from Transformer (BERT), and Long Short-Term Memory (LSTM) outperformed various machine learning techniques in terms of performance accuracy. A large dataset is also required to develop a robust sentiment classifier. Results also revealed that data can be obtained from places like Twitter, movie reviews, Kaggle, Stanford Sentiment Treebank (SST), and SemEval Task4 based on the required domain. The hybrid deep learning techniques like CNN+LSTM, CNN+ Gated Recurrent Unit (GRU), CNN+BERT outperformed single deep learning techniques and machine learning techniques. Python programming language outperformed Java programming language in terms of development simplicity and AI-based library functionalities. Finally, the study recommended the findings obtained for building robust sentiment classifier in the future.
Keywords: Artificial Intelligence, Natural Language Processing, Sentiment Analysis, Social Network, Text.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5957406 CVOIP-FRU: Comprehensive VoIP Forensics Report Utility
Authors: Alejandro Villegas, Cihan Varol
Abstract:
Voice over Internet Protocol (VoIP) products is an emerging technology that can contain forensically important information for a criminal activity. Without having the user name and passwords, this forensically important information can still be gathered by the investigators. Although there are a few VoIP forensic investigative applications available in the literature, most of them are particularly designed to collect evidence from the Skype product. Therefore, in order to assist law enforcement with collecting forensically important information from variety of Betamax VoIP tools, CVOIP-FRU framework is developed. CVOIP-FRU provides a data gathering solution that retrieves usernames, contact lists, as well as call and SMS logs from Betamax VoIP products. It is a scripting utility that searches for data within the registry, logs and the user roaming profiles in Windows and Mac OSX operating systems. Subsequently, it parses the output into readable text and html formats. One superior way of CVOIP-FRU compared to the other applications that due to intelligent data filtering capabilities and cross platform scripting back end of CVOIP-FRU, it is expandable to include other VoIP solutions as well. Overall, this paper reveals the exploratory analysis performed in order to find the key data paths and locations, the development stages of the framework, and the empirical testing and quality assurance of CVOIP-FRU.
Keywords: Betamax, digital forensics, report utility, VoIP, VoIP Buster, VoIPWise.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 31247405 SUPAR: System for User-Centric Profiling of Association Rules in Streaming Data
Authors: Sarabjeet Kaur Kochhar
Abstract:
With a surge of stream processing applications novel techniques are required for generation and analysis of association rules in streams. The traditional rule mining solutions cannot handle streams because they generally require multiple passes over the data and do not guarantee the results in a predictable, small time. Though researchers have been proposing algorithms for generation of rules from streams, there has not been much focus on their analysis. We propose Association rule profiling, a user centric process for analyzing association rules and attaching suitable profiles to them depending on their changing frequency behavior over a previous snapshot of time in a data stream. Association rule profiles provide insights into the changing nature of associations and can be used to characterize the associations. We discuss importance of characteristics such as predictability of linkages present in the data and propose metric to quantify it. We also show how association rule profiles can aid in generation of user specific, more understandable and actionable rules. The framework is implemented as SUPAR: System for Usercentric Profiling of Association Rules in streaming data. The proposed system offers following capabilities: i) Continuous monitoring of frequency of streaming item-sets and detection of significant changes therein for association rule profiling. ii) Computation of metrics for quantifying predictability of associations present in the data. iii) User-centric control of the characterization process: user can control the framework through a) constraint specification and b) non-interesting rule elimination.Keywords: Data Streams, User subjectivity, Change detection, Association rule profiles, Predictability.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14597404 Evaluation of Energy and Environmental Aspects of Reduced Tillage Systems Applied in Maize Cultivation
Authors: E. Sarauskis, L. Masilionyte, Z. Kriauciuniene, K. Romaneckas, S. Buragiene
Abstract:
In maize growing technologies, tillage technological operations are the most time-consuming and require the greatest fuel input. Substitution of conventional tillage, involving deep ploughing, by other reduced tillage methods can reduce technological production costs, diminish soil degradation and environmental pollution from greenhouse gas emissions, as well as improve economic competitiveness of agricultural produce.
Experiments designed to assess energy and environmental aspects associated with different reduced tillage systems, applied in maize cultivation were conducted at Aleksandras Stulginskis University taking into account Lithuania’s economic and climate conditions. The study involved 5 tillage treatments: deep ploughing (DP, control), shallow ploughing (SP), deep cultivation (DC), shallow cultivation (SC) and no-tillage (NT).
Our experimental evidence suggests that with the application of reduced tillage systems it is feasible to reduce fuel consumption by 13-58% and working time input by 8.4% to nearly 3-fold, to reduce the cost price of maize cultivation technological operations, decrease environmental pollution with CO2 gas by 30 to 146 kg ha-1, compared with the deep ploughing.
Keywords: Reduced tillage, energy and environmental assessment, fuel consumption, CO2 emission, maize.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20967403 Non-Homogeneous Layered Fiber Reinforced Concrete
Authors: Vitalijs Lusis, Andrejs Krasnikovs
Abstract:
Fiber reinforced concrete is important material for load bearing structural elements. Usually fibers are homogeneously distributed in a concrete body having arbitrary spatial orientations. At the same time, in many situations, fiber concrete with oriented fibers is more optimal. Is obvious, that is possible to create constructions with oriented short fibers in them, in different ways. Present research is devoted to one of such approaches- fiber reinforced concrete prisms having dimensions 100mm ×100mm ×400mmwith layers of non-homogeneously distributed fibers inside them were fabricated.
Simultaneously prisms with homogeneously dispersed fibers were produced for reference as well. Prisms were tested under four point bending conditions. During the tests vertical deflection at the center of every prism and crack opening were measured (using linear displacements transducers in real timescale). Prediction results were discussed.
Keywords: Fiber reinforced concrete, 4-point bending, steel fiber.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 30097402 Application of Artificial Neural Network to Classification Surface Water Quality
Authors: S. Wechmongkhonkon, N.Poomtong, S. Areerachakul
Abstract:
Water quality is a subject of ongoing concern. Deterioration of water quality has initiated serious management efforts in many countries. This study endeavors to automatically classify water quality. The water quality classes are evaluated using 6 factor indices. These factors are pH value (pH), Dissolved Oxygen (DO), Biochemical Oxygen Demand (BOD), Nitrate Nitrogen (NO3N), Ammonia Nitrogen (NH3N) and Total Coliform (TColiform). The methodology involves applying data mining techniques using multilayer perceptron (MLP) neural network models. The data consisted of 11 sites of canals in Dusit district in Bangkok, Thailand. The data is obtained from the Department of Drainage and Sewerage Bangkok Metropolitan Administration during 2007-2011. The results of multilayer perceptron neural network exhibit a high accuracy multilayer perception rate at 96.52% in classifying the water quality of Dusit district canal in Bangkok Subsequently, this encouraging result could be applied with plan and management source of water quality.Keywords: artificial neural network, classification, surface water quality
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 32107401 Measuring Text-Based Semantics Relatedness Using WordNet
Authors: Madiha Khan, Sidrah Ramzan, Seemab Khan, Shahzad Hassan, Kamran Saeed
Abstract:
Measuring semantic similarity between texts is calculating semantic relatedness between texts using various techniques. Our web application (Measuring Relatedness of Concepts-MRC) allows user to input two text corpuses and get semantic similarity percentage between both using WordNet. Our application goes through five stages for the computation of semantic relatedness. Those stages are: Preprocessing (extracts keywords from content), Feature Extraction (classification of words into Parts-of-Speech), Synonyms Extraction (retrieves synonyms against each keyword), Measuring Similarity (using keywords and synonyms, similarity is measured) and Visualization (graphical representation of similarity measure). Hence the user can measure similarity on basis of features as well. The end result is a percentage score and the word(s) which form the basis of similarity between both texts with use of different tools on same platform. In future work we look forward for a Web as a live corpus application that provides a simpler and user friendly tool to compare documents and extract useful information.
Keywords: GraphViz representation, semantic relatedness, similarity measurement, WordNet similarity.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8387400 An Intelligent Approach of Rough Set in Knowledge Discovery Databases
Authors: Hrudaya Ku. Tripathy, B. K. Tripathy, Pradip K. Das
Abstract:
Knowledge Discovery in Databases (KDD) has evolved into an important and active area of research because of theoretical challenges and practical applications associated with the problem of discovering (or extracting) interesting and previously unknown knowledge from very large real-world databases. Rough Set Theory (RST) is a mathematical formalism for representing uncertainty that can be considered an extension of the classical set theory. It has been used in many different research areas, including those related to inductive machine learning and reduction of knowledge in knowledge-based systems. One important concept related to RST is that of a rough relation. In this paper we presented the current status of research on applying rough set theory to KDD, which will be helpful for handle the characteristics of real-world databases. The main aim is to show how rough set and rough set analysis can be effectively used to extract knowledge from large databases.Keywords: Data mining, Data tables, Knowledge discovery in database (KDD), Rough sets.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2337