Abstracts | Computer and Information Engineering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3708

World Academy of Science, Engineering and Technology


Online ISSN: 1307-6892

2448 Communication in a Heterogeneous Ad Hoc Network

Authors: C. Benjbara, A. Habbani

Abstract:

Wireless networks are increasingly used in new technologies and features, especially infrastructure-less (ad hoc mode) networks, which provide a low-cost alternative to infrastructure-mode wireless networks and great flexibility for application domains such as environmental monitoring, smart cities, precision agriculture, and so on. These application domains share a common characteristic: the need for coexistence and intercommunication between modules belonging to different types of ad hoc networks, such as wireless sensor networks, mesh networks, mobile ad hoc networks, vehicular ad hoc networks, etc. Bringing such heterogeneous networks to life will make everyday tasks easier, but the development path is full of challenges. One of these challenges is the complexity of communication between components, due to the lack of common or compatible protocol standards. This article proposes a new patented routing protocol based on the OLSR standard to resolve the communication issue in heterogeneous ad hoc networks. This new protocol is applied to a specific network architecture composed of MANET, VANET, and FANET.

Keywords: Ad hoc, heterogeneous, ID-Node, OLSR

2447 Automatic Lexicon Generation for Domain Specific Dataset for Mining Public Opinion on China Pakistan Economic Corridor

Authors: Tayyaba Azim, Bibi Amina

Abstract:

The growing popularity of opinion mining, together with the rapid growth in the availability of social networks, has opened many opportunities for research in the various domains of Sentiment Analysis and Natural Language Processing (NLP) using Artificial Intelligence approaches. The latest trend allows the public to actively use the internet to analyze individuals’ opinions and explore the effectiveness of published facts. The main theme of this research is to assess public opinion on one of the most crucial and extensively discussed development projects, the China Pakistan Economic Corridor (CPEC), considered a game changer due to its promise of bringing economic prosperity to the region. So far, to the best of our knowledge, the theme of CPEC has not been analyzed for sentiment determination through a machine learning (ML) approach. This research aims to demonstrate the use of ML approaches to automatically analyze public sentiment in tweets about CPEC. A Support Vector Machine (SVM) is used to classify tweets into positive, negative and neutral classes. Word2vec and TF-IDF features are used with the SVM model, and a comparison is performed between the model trained on manually labelled tweets and the one trained on the automatically generated lexicon. The contributions of this work are: development of a sentiment analysis system for public tweets on the subject of CPEC; automatic generation of a lexicon from public tweets on CPEC; and identification of different themes among tweets, with sentiments assigned to each theme. It is worth noting that applications of web mining that empower e-democracy by improving political transparency and public participation in decision making via social media have not yet been explored or practised in Pakistan with regard to CPEC.

Keywords: machine learning, natural language processing, sentiment analysis, support vector machine, Word2vec

2446 Bug Localization on Single-Line Bugs of Apache Commons Math Library

Authors: Cherry Oo, Hnin Min Oo

Abstract:

Software bug localization is one of the most costly tasks in program repair. Therefore, there is high demand for automated bug localization techniques that can guide programmers to the locations of bugs with little human intervention. Spectrum-based bug localization aims to help software developers discover bugs rapidly by analyzing abstractions of program traces to produce a ranked list of the most likely buggy modules. Using the Apache Commons Math library project, we study the diagnostic accuracy of our spectrum-based bug localization metric. Our results show that the superior performance of a specific similarity coefficient, used to inspect the program spectra, is most effective for localizing single-line bugs.
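To make the idea of ranking code by a similarity coefficient concrete, here is a minimal sketch of spectrum-based ranking. The abstract does not name the coefficient the authors use, so the Ochiai coefficient is an assumption chosen purely for illustration, and the function name `ochiai_ranking` and its input layout are hypothetical.

```python
import math

def ochiai_ranking(coverage, failed):
    """Rank program elements by Ochiai suspiciousness (illustrative choice).

    coverage: list of sets; coverage[i] holds the line ids executed by test i.
    failed:   list of bools; failed[i] is True if test i failed.
    Returns (line, score) pairs, most suspicious first.
    """
    total_failed = sum(failed)
    lines = set().union(*coverage)
    scores = {}
    for line in lines:
        # ef / ep: failing / passing tests that executed this line.
        ef = sum(1 for cov, f in zip(coverage, failed) if f and line in cov)
        ep = sum(1 for cov, f in zip(coverage, failed) if not f and line in cov)
        denom = math.sqrt(total_failed * (ef + ep))
        scores[line] = ef / denom if denom else 0.0
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Three tests over lines 1..3; tests 0 and 2 fail.
ranking = ochiai_ranking([{1, 2, 3}, {1, 3}, {1, 2}], [True, False, True])
```

In this toy spectrum, line 2 is executed by both failing tests and by no passing test, so it ranks first.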

Keywords: software testing, bug localization, program spectra, bug

2445 Classification Based on Deep Neural Cellular Automata Model

Authors: Yasser F. Hassan

Abstract:

Deep learning is a branch of machine learning that has achieved great success in research and applications. Cellular neural networks are regarded as arrays of nonlinear analog processors, called cells, connected in a way that allows parallel computation. The paper discusses how to use a deep learning structure to represent a neural cellular automata model. The proposed learning technique for the cellular automata model is examined from the perspective of deep learning structure. A deep neural cellular automata system modifies each neuron based on the behavior of the individual and its decision as a result of multi-level deep structure learning. The paper presents the architecture of the model and the results of simulating the approach. Results from the implementation enrich the deep neural cellular automata system and shed light on the concept formulation of the model and the learning in it.

Keywords: cellular automata, neural cellular automata, deep learning, classification

2444 Comparison between XGBoost, LightGBM and CatBoost Using a Home Credit Dataset

Authors: Essam Al Daoud

Abstract:

Gradient boosting methods have been proven to be a very important strategy. Many successful machine learning solutions were developed using XGBoost and its derivatives. The aim of this study is to investigate and compare the efficiency of three gradient boosting methods. The Home Credit dataset, which contains 219 features and 356251 records, is used in this work. In addition, new features are generated and several techniques are used to rank and select the best features. The implementation indicates that LightGBM is faster and more accurate than CatBoost and XGBoost across varying numbers of features and records.

Keywords: gradient boosting, XGBoost, LightGBM, CatBoost, home credit

2443 Functional and Efficient Query Interpreters: Principle, Application and Performances’ Comparison

Authors: Laurent Thiry, Michel Hassenforder

Abstract:

This paper presents a general approach to implementing efficient query interpreters in a functional programming language. Most of the standard tools currently available use an imperative and/or object-oriented language for their implementation (e.g. Java for Jena-Fuseki), but other paradigms are possible, perhaps with better performance. The paper first explains how to model data structures and queries from a functional point of view. It then proposes a general methodology to measure performance (i.e. the number of computation steps needed to answer a query) and explains how to integrate some optimization techniques (short-cut fusion and, more importantly, data transformations). Finally, it compares the proposed functional server to a standard tool (Fuseki), demonstrating that the former can be two to ten times faster at answering queries.

Keywords: data transformation, functional programming, information server, optimization

2442 Single Valued Neutrosophic Hesitant Fuzzy Rough Set and Its Application

Authors: K. M. Alsager, N. O. Alshehri

Abstract:

In this paper, we proposed the notion of single valued neutrosophic hesitant fuzzy rough set, by combining single valued neutrosophic hesitant fuzzy set and rough set. The combination of single valued neutrosophic hesitant fuzzy set and rough set is a powerful tool for dealing with uncertainty, granularity and incompleteness of knowledge in information systems. We presented both definition and some basic properties of the proposed model. Finally, we gave a general approach which is applied to a decision making problem in disease diagnoses, and demonstrated the effectiveness of the approach by a numerical example.

Keywords: single valued neutrosophic fuzzy set, single valued neutrosophic fuzzy hesitant set, rough set, single valued neutrosophic hesitant fuzzy rough set

2441 Multimodal Convolutional Neural Network for Musical Instrument Recognition

Authors: Yagya Raj Pandeya, Joonwhoan Lee

Abstract:

The dynamic behavior of music and video makes it difficult for a computer system to evaluate musical instrument playing in a video. Television or film video clips with music are rich sources for analyzing musical instruments using modern machine learning technologies. In this research, we integrate the audio and video information sources using convolutional neural networks (CNN) and pass the learned features through a recurrent neural network (RNN) to preserve the dynamic behaviors of audio and video. We use different pre-trained CNNs for music and video feature extraction and then fine-tune each model. The music network uses a 2D convolutional network and the video network uses 3D convolution (C3D). We then concatenate the music and video features while preserving the time-varying features. A long short-term memory (LSTM) network is used for long-term dynamic feature characterization, followed by late fusion with a generalized mean. The proposed audio-video multimodal neural network achieves better performance in recognizing musical instruments.
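The late-fusion step can be illustrated with the generalized (power) mean, M_p(x) = ((1/n) Σ x_i^p)^(1/p). The sketch below is only an assumption about how per-class scores from the two modalities might be combined; the abstract does not give the exponent p or the exact fusion pipeline, and the names `generalized_mean` and `late_fusion` are hypothetical.

```python
import math

def generalized_mean(values, p):
    """Power mean of positive numbers; p=1 is arithmetic, p->0 is geometric."""
    n = len(values)
    if p == 0:
        # Limit case: geometric mean (requires strictly positive values).
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** p for v in values) / n) ** (1.0 / p)

def late_fusion(audio_probs, video_probs, p=2.0):
    """Fuse per-class scores from two modalities, then renormalize to sum to 1.

    Assumes both lists are aligned by class and at least one fused score is > 0.
    """
    fused = [generalized_mean([a, v], p) for a, v in zip(audio_probs, video_probs)]
    total = sum(fused)
    return [f / total for f in fused]

# With p=1 the fusion reduces to a plain average of the two modalities.
fused = late_fusion([0.7, 0.3], [0.5, 0.5], p=1.0)
```

Larger values of p push the fused score toward the more confident modality, while p = 1 treats both equally.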

Keywords: multimodal, 3D convolution, music-video feature extraction, generalized mean

2440 Bypassing Docker Transport Layer Security Using Remote Code Execution

Authors: Michael J. Hahn

Abstract:

Docker is a powerful tool used by many companies such as PayPal, MetLife, Expedia, Visa, and many others. Docker works by bundling multiple applications, binaries, and libraries together on top of an operating system image called a container. The container runs on a Docker engine that in turn runs on top of a standard operating system. This centralization saves a lot of system resources. In this paper, we will be demonstrating how to bypass Transport Layer Security and execute remote code within Docker containers built on a base image of Alpine Linux version 3.7.0 through the use of .apk files due to flaws in the Alpine Linux package management program. This exploit renders any applications built using Docker with a base image of Alpine Linux vulnerable to unwanted outside forces.

Keywords: cloud, cryptography, Docker, Linux, security

2439 Using Artificial Intelligence Technology to Build the User-Oriented Platform for Integrated Archival Service

Authors: Lai Wenfang

Abstract:

This study describes how to use artificial intelligence (AI) technology to build a user-oriented platform for integrated archival service. The platform will be launched in 2020 by the National Archives Administration (NAA) in Taiwan. With the progress of information and communication technology (ICT), the NAA has built many systems to provide archival services. To cope with new challenges, such as new ICT, artificial intelligence, or blockchain, the NAA will use natural language processing (NLP) and machine learning (ML) techniques to build a trained model and propose suggestions based on the data sent to the platform. The NAA expects that the platform will not only automatically inform staff at the sending agencies which record catalogues violate the transfer or destruction rules, but also use the model to find details hidden in the catalogues and advise NAA staff on the appropriate disposition of the records, thereby shortening the auditing time. The platform keeps all users’ browsing trails so that it can predict what kinds of archives a user may be interested in, recommend search terms through visualization, and notify users of newly arrived archives. In addition, according to the Archives Act, NAA staff must spend a lot of time marking or removing personal data, classified data, etc. before archives are provided. To improve the archives access service process, the platform will use text recognition patterns to black out such data automatically; staff will only need to correct errors and upload the corrected version, and the accuracy will increase as the platform learns. In short, the purpose of the platform is to drive the government’s digital transformation and implement the vision of a service-oriented smart government.

Keywords: artificial intelligence, natural language processing, machine learning, visualization

2438 Fuzzy Neuro Approach for Integrated Water Management System

Authors: Stuti Modi, Aditi Kambli

Abstract:

This paper addresses the need for an intelligent water management and distribution system in smart cities to ensure optimal consumption and distribution of water for drinking and sanitation purposes. Water, being a limited resource in cities, requires an effective system for collection, storage and distribution. In this paper, applications of the two most widely used types of data-driven models, namely artificial neural networks (ANN) and fuzzy logic-based models, to modelling in the water resources management field are considered. The objective of this paper is to review the principles of various types and architectures of neural networks and fuzzy adaptive systems and their applications to integrated water resources management. The final goal of the review is to identify promising directions for the applicability of, and further research on, AI-related and data-driven techniques, and to demonstrate the applicability of neural networks, fuzzy systems and other machine learning techniques to practical issues of regional water management. Apart from this, the paper deals with water storage, using ANN to find the optimum reservoir level and to predict peak daily demands.

Keywords: artificial neural networks, fuzzy systems, peak daily demand prediction, water management and distribution

2437 A Recognition Method of Ancient Yi Script Based on Deep Learning

Authors: Shanxiong Chen, Xu Han, Xiaolong Wang, Hui Ma

Abstract:

The Yi are an ethnic group living mainly in mainland China, with their own spoken and written language systems developed over thousands of years. Ancient Yi is one of the six ancient languages in the world; it keeps a record of the history of the Yi people and offers documents valuable for research into human civilization. Recognition of the characters in ancient Yi helps to transform the documents into an electronic form, making their storage and dissemination convenient. Due to historical and regional limitations, research on recognition of ancient characters is still inadequate. Thus, deep learning technology was applied to the recognition of such characters. Five models were developed on the basis of the four-layer convolutional neural network (CNN). Alpha-Beta divergence was taken as a penalty term to re-encode output neurons of the five models. Two fully connected layers compressed the features. Finally, at the softmax layer, the orthographic features of ancient Yi characters were re-evaluated, their probability distributions were obtained, and characters with features of the highest probability were recognized. Tests show that the method achieves higher precision than the traditional CNN model for handwriting recognition of ancient Yi.

Keywords: recognition, CNN, Yi character, divergence

2436 An Improved Data Aided Channel Estimation Technique Using Genetic Algorithm for Massive Multi-Input Multiple-Output

Authors: M. Kislu Noman, Syed Mohammed Shamsul Islam, Shahriar Hassan, Raihana Pervin

Abstract:

With the growing number of wireless devices and high-bandwidth operations, wireless networks and communications are becoming overcrowded. To cope with this congestion, massive MIMO is designed to work with hundreds of low-cost serving antennas at a time while also improving spectral efficiency. TDD has been used to obtain beamforming, a major part of massive MIMO, and to gain the most benefit from transmitting and receiving pilot sequences. All of these benefits are possible only if the channel state information, or channel estimate, is obtained accurately. The common methods used so far to estimate the channel matrix are LS, MMSE, and a linear version of MMSE, which have been proposed in many research works. We have optimized these methods using a genetic algorithm (GA) to minimize the mean squared error (MSE) and find the best channel matrix from existing algorithms with less computational complexity. Our simulation results show that the GA worked well on existing algorithms in a slow-fading Rayleigh channel in the presence of Additive White Gaussian Noise. We found that the GA-optimized LS is better than existing algorithms, as the GA provides an optimal result within a few iterations in terms of MSE with respect to SNR and computational complexity.
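For reference, the LS baseline has a standard closed form. The following is a sketch under the usual pilot-based model; the symbols Y (received pilot matrix), X (known transmitted pilots), H (channel matrix) and N (noise) are notation assumed here and do not come from the abstract, and X X^H is assumed to be invertible.

```latex
\mathbf{Y} = \mathbf{H}\mathbf{X} + \mathbf{N}, \qquad
\hat{\mathbf{H}}_{\mathrm{LS}}
  = \arg\min_{\mathbf{H}} \left\lVert \mathbf{Y} - \mathbf{H}\mathbf{X} \right\rVert_F^2
  = \mathbf{Y}\mathbf{X}^{H}\left(\mathbf{X}\mathbf{X}^{H}\right)^{-1}, \qquad
\mathrm{MSE} = \mathbb{E}\left[ \lVert \mathbf{H} - \hat{\mathbf{H}} \rVert_F^2 \right]
```

The MSE on the right is the quantity the genetic algorithm is described as minimizing.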

Keywords: channel estimation, LMMSE, LS, MIMO, MMSE

2435 Selecting Answers for Questions with Multiple Answer Choices in Arabic Question Answering Based on Textual Entailment Recognition

Authors: Anes Enakoa, Yawei Liang

Abstract:

Question Answering (QA) is one of the most important and demanding tasks in the field of Natural Language Processing (NLP). In QA systems, the answer generation task generates a list of candidate answers to the user's question, of which only one is correct. Answer selection is one of the main components of QA; it is concerned with selecting the best answer choice from the candidate answers suggested by the system. However, the selection process can be very challenging, especially in Arabic, due to the particularities of the language. To address this challenge, an approach is proposed to answer questions with multiple answer choices for Arabic QA systems based on Textual Entailment (TE) recognition. The developed approach employs a Support Vector Machine that considers lexical, semantic and syntactic features in order to recognize the entailment between the generated hypotheses (H) and the text (T). A set of experiments was conducted for performance evaluation, and the proposed method reached an overall accuracy of 67.5% with a C@1 score of 80.46%. The obtained results are promising and demonstrate that the proposed method is effective for the TE recognition task.
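Since the C@1 score is reported alongside accuracy, a short sketch of how it is computed may help. It uses the standard definition c@1 = (n_R + n_U · n_R / n) / n, where n_R is the number of correctly answered questions, n_U the number left unanswered, and n the total. The function name `c_at_1` is hypothetical and the counts in the example are made up, not taken from the paper.

```python
def c_at_1(n_correct, n_unanswered, n_total):
    """c@1: accuracy that gives partial credit for leaving questions unanswered.

    Unanswered questions are credited at the system's observed accuracy rate
    (n_correct / n_total) instead of being counted as wrong.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return (n_correct + n_unanswered * n_correct / n_total) / n_total

# Illustrative counts only: 5 correct, 2 unanswered, 10 questions in total.
score = c_at_1(5, 2, 10)
```

When no question is left unanswered, c@1 equals plain accuracy; it exceeds accuracy only when the system abstains on some questions.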

Keywords: information retrieval, machine learning, natural language processing, question answering, textual entailment

2434 Non-Targeted Adversarial Object Detection Attack: Fast Gradient Sign Method

Authors: Bandar Alahmadi, Manohar Mareboyana, Lethia Jackson

Abstract:

Today, many applications use computer vision models for tasks such as face recognition, image classification, and object detection. The accuracy of these models is very important for the performance of these applications. One challenge facing computer vision models is the adversarial example attack. In computer vision, an adversarial example is an image that is intentionally designed to cause a machine learning model to misclassify it. One well-known method used to attack Convolutional Neural Networks (CNN) is the Fast Gradient Sign Method (FGSM). The goal of this method is to find a perturbation that can fool the CNN using the gradient of the CNN's cost function. In this paper, we introduce a novel model that uses FGSM to attack the Region-based Convolutional Neural Network (R-CNN). We first extract the regions detected by R-CNN and resize them to the size of regular images. Then, we find the best perturbation of the regions that can fool the CNN using FGSM. Next, we add the resulting perturbation to the attacked region to get a new region image that looks similar to the original image to human eyes. Finally, we place the regions back into the original image and test the R-CNN with the attacked images. Our model reduced the accuracy of the R-CNN when tested on the Pascal VOC 2012 dataset.
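The core FGSM update is x_adv = x + ε · sign(∇_x J(x, y)), where J is the model's cost function. The sketch below applies it to a toy logistic-regression classifier rather than a CNN or R-CNN, purely to show the sign-of-gradient step; the model, the weights and the name `fgsm_perturb` are illustrative assumptions, not the authors' setup.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm_perturb(x, y, w, b, eps):
    """One FGSM step: move each input feature by eps in the direction
    that increases the cross-entropy loss for the true label y (0 or 1)."""
    p = predict(x, w, b)
    # For logistic regression, d(loss)/d(x_i) = (p - y) * w_i.
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Toy example: the perturbed input lowers the model's confidence in the true class.
w, b = [2.0, -1.0], 0.0
x, y = [1.0, 0.5], 1
x_adv = fgsm_perturb(x, y, w, b, eps=0.25)
```

In the region-based attack described above, the same step would be applied to each resized region using the gradient of the CNN's cost function instead of this closed-form gradient.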

Keywords: adversarial examples, attack, computer vision, image processing

2433 Machine Learning and Internet of Thing for Smart-Hydrology of the Mantaro River Basin

Authors: Julio Jesus Salazar, Julio Jesus De Lama

Abstract:

The fundamental objective of hydrological studies applied to the engineering field is to determine the statistically consistent volumes or water flows that, in each case, allow us to size or design a series of elements or structures to effectively manage and develop a river basin. To determine these values, there are several ways of working within the framework of traditional hydrology: (1) study each of the factors that influence the hydrological cycle, (2) study the historical behavior of the hydrology of the area, (3) study the historical behavior of hydrologically similar zones, and (4) other studies (rain simulators or experimental basins). Of course, this range of studies in a given basin is very varied and complex, and it presents the difficulty of collecting data in real time. In this complex space, the difficulty of studying the variables can only be overcome by collecting and transmitting data to decision centers through the Internet of Things and artificial intelligence. Thus, this research work implemented the learning project for the sub-basin of the Shullcas river in the Andean basin of the Mantaro river in Peru. The sensor firmware to collect and communicate hydrological parameter data was programmed and tested in similar basins of the European Union. The machine learning applications were programmed to choose the algorithms that lead to the best solution for determining the rainfall-runoff relationship captured in the different polygons of the sub-basin. Tests were carried out in the mountains of Europe and in the sub-basins of the Shullcas river (Huancayo) and the Yauli river (Jauja), at heights close to 5000 m.a.s.l., leading to the following conclusions. To guarantee correct communication, the distance between devices should not exceed 15 km. To minimize the energy consumption of the devices and avoid collisions between packets, distances between 5 and 10 km are advisable; in this way the transmission power can be reduced and a higher bitrate can be used. If the communication elements of the network devices (Internet of Things) installed in the basin do not have good visibility between them, the distance should be reduced to the range of 1-3 km. The energy efficiency of the Atmel microcontrollers present in Arduino is not adequate to meet the requirements of system autonomy. To increase the autonomy of the system, it is recommended to use low-consumption systems, such as the ARM Cortex L (Ultra Low Power) microcontrollers or even the Cortex M, and high-performance direct current (DC) to DC converters. The machine learning system has begun learning the Shullcas system to generate the best hydrology of the sub-basin. This will improve as machine learning and the data entered in the big data coincide every second. This will provide services to each of the applications of the complex system to return the best data on determined flows.

Keywords: hydrology, internet of things, machine learning, river basin

2432 Adaptation of Hough Transform Algorithm for Text Document Skew Angle Detection

Authors: Kayode A. Olaniyi, Olabanji F. Omotoye, Adeola A. Ogunleye

Abstract:

Skew detection and correction form an important part of digital document analysis, because uncompensated skew can deteriorate document features and complicate further document image processing steps. Efficient text document analysis and digitization can rarely be achieved when a document is skewed, even at a small angle. Once documents have been digitized through the scanning system and binarized, skew correction is required before further image analysis. Research efforts in this area have produced algorithms to eliminate document skew. Skew angle correction algorithms can be compared based on performance criteria, the most important of which are the accuracy of skew angle detection, the range of detectable skew angles, the speed of processing the image, the computational complexity and, consequently, the memory space used. The standard Hough Transform has been successfully applied to skew angle estimation for text documents. However, the accuracy of the standard Hough Transform algorithm depends largely on how fine the angular step size is. Increasing accuracy consequently consumes more time and memory, especially where the number of pixels is considerably large. Whenever the Hough Transform is used, there is always a tradeoff between accuracy and speed, so a more efficient solution is needed that optimizes space as well as time. In this paper, an improved Hough Transform (HT) technique that optimizes space as well as time to robustly detect document skew is presented. The modified Hough Transform algorithm resolves the conflict between memory space, running time and accuracy. Our algorithm starts with a first angle estimate accurate to zero decimal places using the standard Hough Transform algorithm, which achieves minimal running time and space but lacks accuracy. Then, to increase accuracy, supposing the angle estimated by the basic Hough algorithm is x degrees, we run the basic algorithm again over a narrow range around x degrees with an accuracy of one decimal place. The same process is iterated until the desired level of accuracy is achieved. Our skew estimation and correction algorithm for text images is implemented in MATLAB. The memory space and processing time are also tabulated for skew angles assumed to lie between 0° and 45°. The simulation results, demonstrated in MATLAB, show the high performance of our algorithm, with less computational time and memory space used in detecting document skew for a variety of documents with different levels of complexity.
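The coarse-to-fine search can be sketched as follows. This is an illustrative Python version, not the authors' MATLAB implementation. It assumes the input is a list of (x, y) foreground pixel coordinates, scores an angle by the tallest peak in a one-dimensional Hough accumulator, and reads the refinement window as one previous step on either side of the current best angle; the names `hough_score` and `estimate_skew` are hypothetical.

```python
import math
from collections import Counter

def hough_score(points, angle_deg):
    """Peak accumulator count for lines tilted by angle_deg from horizontal.

    Each pixel votes for the signed distance rho of its candidate line from
    the origin; pixels on one straight text line share the same rho bin.
    """
    a = math.radians(angle_deg)
    s, c = math.sin(a), math.cos(a)
    votes = Counter(round(y * c - x * s) for x, y in points)
    return max(votes.values())

def estimate_skew(points, lo=0.0, hi=45.0, decimals=1):
    """Coarse-to-fine search: 1 degree steps first, then 0.1, 0.01, ..."""
    step = 1.0
    best = lo
    for _ in range(decimals + 1):
        n = int(round((hi - lo) / step))
        candidates = [lo + i * step for i in range(n + 1)]
        best = max(candidates, key=lambda ang: hough_score(points, ang))
        # Narrow the window to one step on either side of the current best.
        lo, hi = best - step, best + step
        step /= 10.0
    return round(best, decimals)
```

Each refinement level evaluates about 21 angles instead of the roughly 450 that a single 0.1 degree sweep over 0 to 45 degrees would need, which is where the time saving comes from.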

Keywords: hough-transform, skew-detection, skew-angle, skew-correction, text-document

2431 Cooperation of Unmanned Vehicles for Accomplishing Missions

Authors: Ahmet Ozcan, Onder Alparslan, Anil Sezgin, Omer Cetin

Abstract:

The use of unmanned systems for different purposes has become very popular over the past decade, and expectations of these systems have increased in parallel. However, meeting the demands of a task is often not possible with a single unmanned vehicle, so it is necessary to use multiple autonomous vehicles with different abilities together in coordination. Using vehicles of the same type together as a swarm especially helps to satisfy the time constraints of missions effectively; in other words, it allows the workload to be shared among a number of homogeneous platforms. In addition, many kinds of problems require the different capabilities of heterogeneous platforms to be used cooperatively to achieve successful results. In this case, cooperative working brings additional problems beyond those of homogeneous clusters. In the scenario presented as an example problem, an autonomous ground vehicle that lacks position information is expected to perform point-to-point navigation without losing its way in a previously unknown labyrinth. Furthermore, the ground vehicle is equipped with very limited sensors, such as ultrasonic sensors that can detect obstacles. It is very hard for the ground vehicle to plan or complete the mission by itself without losing its way in the unknown labyrinth. Thus, to assist the ground vehicle, an autonomous air drone is also used to solve the problem cooperatively. The autonomous drone also has limited sensors, such as a downward-looking camera and an IMU, and it is also unable to compute its global position. In this context, the aim is to solve the problem effectively without additional support or input from outside, relying only on the capabilities of the two autonomous vehicles. To manage point-to-point navigation in a previously unknown labyrinth, the platforms have to work together in coordination. In this paper, the cooperative work of heterogeneous unmanned systems is addressed in an applied sample scenario, and it is described how an autonomous ground vehicle and an autonomous flying platform can work together in harmony to take advantage of their platform-specific capabilities. The difficulties of using multiple heterogeneous autonomous platforms in a mission are put forward, and successful solutions are defined and implemented for problems such as spatially distributed task planning, simultaneous coordinated motion, effective communication, and sensor fusion.

Keywords: unmanned systems, heterogeneous autonomous vehicles, coordination, task planning

2430 Optimizing Network Latency with Fast Path Assignment for Incoming Flows

Authors: Qing Lyu, Hang Zhu

Abstract:

Various flows in a network are required to pass through different types of middleboxes. Improper middlebox placement and path assignment for flows can greatly increase network latency and decrease network performance. Minimizing the total end-to-end latency of all the flows requires assigning paths to the incoming flows. In this paper, the flow path assignment problem with regard to the placement of various kinds of middleboxes is studied. The flow path assignment problem is formulated as a linear programming problem, which is very time-consuming to solve. A naive greedy algorithm is also studied; it is very fast but causes much more latency than the linear programming algorithm. Finally, the paper presents a heuristic algorithm named FPA, which takes bottleneck link information and estimated bandwidth occupancy into consideration and achieves near-optimal latency in much less time. Evaluation results validate the effectiveness of the proposed algorithm.

Keywords: flow path, latency, middlebox, network

2429 A Comprehensive Study and Evaluation on Image Fashion Features Extraction

Authors: Yuanchao Sang, Zhihao Gong, Longsheng Chen, Long Chen

Abstract:

Clothing fashion represents a human’s aesthetic appreciation towards everyday outfits and appetite for fashion, and it reflects the development of status in society, humanity, and economics. However, modelling fashion by machine is extremely challenging because fashion is too abstract to be efficiently described by machines. Even human beings can hardly reach a consensus about fashion. In this paper, we are dedicated to answering a fundamental fashion-related problem: what image feature best describes clothing fashion? To address this issue, we have designed and evaluated various image features, ranging from traditional low-level hand-crafted features to mid-level style awareness features to various current popular deep neural network-based features, which have shown state-of-the-art performance in various vision tasks. In summary, we tested the following 9 feature representations: color, texture, shape, style, convolutional neural networks (CNNs), CNNs with distance metric learning (CNNs&DML), AutoEncoder, CNNs with multiple layer combination (CNNs&MLC) and CNNs with dynamic feature clustering (CNNs&DFC). Finally, we validated the performance of these features on two publicly available datasets. Quantitative and qualitative experimental results on both intra-domain and inter-domain fashion clothing image retrieval showed that deep learning based feature representations far outweigh traditional hand-crafted feature representation. Additionally, among all deep learning based methods, CNNs with explicit feature clustering performs best, which shows feature clustering is essential for discriminative fashion feature representation.

Keywords: convolutional neural network, feature representation, image processing, machine modelling

Procedia PDF Downloads 139
2428 Designing Directed Network with Optimal Controllability

Authors: Liang Bai, Yandong Xiao, Haorang Wang, Songyang Lao

Abstract:

The directedness of links is crucial in determining controllability in complex networks. For a given network, we therefore wish to design edge directions that bring the network as close as possible to optimal controllability. In this work, we first introduce two methods that enhance network controllability by assigning edge directions. However, these two methods cannot completely mitigate the negative effects of inaccessibility and dilations. To approach optimal network controllability, the edge directions must mitigate these negative effects as much as possible. Finally, we propose an edge direction assignment for optimal controllability. The proposed method is shown to be effective on real-world and synthetic networks.

Keywords: complex network, dynamics, network control, optimization

Procedia PDF Downloads 185
2427 A Lexicographic Approach to Obstacles Identified in the Ontological Representation of the Tree of Life

Authors: Sandra Young

Abstract:

The biodiversity literature is vast and heterogeneous. In today’s data age, a number of data integration and standardisation initiatives aim to facilitate simultaneous access to all the literature across biodiversity domains for research and forecasting purposes. Ontologies are being used increasingly to organise this information, but the rationalisation intrinsic to ontologies can hit obstacles when faced with the intrinsic fluidity and inconsistency found in the domains comprising biodiversity. Essentially the problem is a conceptual one: biological taxonomies are formed on the basis of specific, physical specimens, yet nomenclatural rules are used to provide labels to describe these physical objects. These labels are ambiguous representations of the physical specimen. An example of this is the name Melpomene, the scientific nomenclatural representation of a genus of ferns, but also of a genus of spiders. The physical specimens for each of these are vastly different, but they have been assigned the same nomenclatural reference. While there is much research into the conceptual stability of the taxonomic concept versus the nomenclature used, to the best of our knowledge no research has yet looked empirically at the literature to see the conceptual plurality or singularity of the use of these species’ names, the linguistic representation of a physical entity. Language itself uses words as symbols to represent real world concepts, whether physical entities or otherwise, and as such lexicography has a well-founded history in the conceptual mapping of words in context for dictionary making. This makes it an ideal candidate to explore this problem. The lexicographic approach uses corpus-based analysis to look at word use in context, with a specific focus on collocated word frequencies (the frequencies of words used in specific grammatical and collocational contexts).
It allows for inconsistencies and contradictions in the source data and in fact includes these in the word characterisation so that 100% of the available evidence is counted. Corpus analysis is indeed suggested as one of the ways to identify concepts for ontology building, because of its ability to look empirically at data and show patterns in language usage, which can indicate conceptual ideas which go beyond words themselves. In this sense it could potentially be used to identify if the hierarchical structures present within the empirical body of literature match those which have been identified in ontologies created to represent them. The first stages of this research have revealed a hierarchical structure that becomes apparent in the biodiversity literature when annotating scientific species’ names, common names and more general names as classes, which will be the focus of this paper. The next step in the research is focusing on a larger corpus in which specific words can be analysed and then compared with existing ontological structures looking at the same material, to evaluate the methods by means of an alternative perspective. This research aims to provide evidence as to the validity of the current methods in knowledge representation for biological entities, and also shed light on the way that scientific nomenclature is used within the literature.

Keywords: ontology, biodiversity, lexicography, knowledge representation, corpus linguistics

Procedia PDF Downloads 137
2426 Enhancing the Performance of Bug Reporting System by Handling Duplicate Reporting Reports: Artificial Intelligence Based Mantis

Authors: Afshan Saad, Muhammad Saad, Shah Muhammad Emaduddin

Abstract:

Bug reporting systems are among the most important tools guiding maintenance activities in software engineering. Duplicate bug reports in the repository increase the processing time of the bug triage that monitors these activities, and waste the time of the programmers who work on the reports assigned to them by triage. These reports can reveal imperfections and degrade software quality. As the number of potential duplicate bug reports increases, the number of bug reports in the repository increases as well. Identifying duplicate bug reports helps decrease the development workload of fixing defects. However, it is difficult to identify all possible duplicates manually because of the huge number of bug reports already filed. In this paper, an artificial intelligence based system using Mantis is proposed to automatically detect duplicate bug reports. When a new bug is submitted to the repository, the triager marks it with a tag. It is then checked, by matching, whether the new report is a duplicate of an existing bug report. Reports with duplicate tags are eliminated from the repository, which not only improves the performance of the system but also saves the cost and effort wasted on bug triage and on finding duplicate bugs.
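
The abstract does not say how reports are matched. As a rough illustration only, the sketch below flags a new report as a duplicate when its word overlap (Jaccard similarity) with an existing report exceeds a threshold. The function names, the threshold of 0.6, and the choice of Jaccard similarity are all assumptions made for this example and are not taken from the paper or from Mantis.

```python
import re


def tokens(text):
    """Lower-case word tokens of a report, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of the token sets of two texts (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def find_duplicate(new_report, existing, threshold=0.6):
    """Return the id of the most similar existing report, or None.

    existing: dict mapping a report id to its text (hypothetical data).
    threshold: assumed cut-off; a real system would tune this value.
    """
    best_id, best_score = None, 0.0
    for report_id, text in existing.items():
        score = jaccard(new_report, text)
        if score > best_score:
            best_id, best_score = report_id, score
    return best_id if best_score >= threshold else None
```

A report for which `find_duplicate` returns an id would be the one given the duplicate tag in the workflow described above.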

Keywords: bug tracking, triager, tool, quality assurance

Procedia PDF Downloads 194
2425 A Mean–Variance–Skewness Portfolio Optimization Model

Authors: Kostas Metaxiotis

Abstract:

Portfolio optimization is one of the most important topics in finance. This paper proposes a mean–variance–skewness (MVS) portfolio optimization model. Traditionally, the portfolio optimization problem is solved by using the mean–variance (MV) framework. In this study, we formulate the proposed model as a three-objective optimization problem, where the portfolio's expected return and skewness are maximized whereas the portfolio risk is minimized. For solving the proposed three-objective portfolio optimization model we apply an adapted version of the non-dominated sorting genetic algorithm (NSGAII). Finally, we use a real dataset from FTSE-100 for validating the proposed model.
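
To make the three objectives concrete, the following minimal sketch computes the mean, variance, and skewness of a portfolio's return series for a given weight vector. It assumes simple population moments over historical per-period returns; the paper's exact estimators and the NSGAII search itself are not reproduced here, and the function name is hypothetical.

```python
from statistics import fmean


def portfolio_moments(returns, weights):
    """Return (mean, variance, skewness) of a portfolio's per-period returns.

    returns: list of periods, each a list of asset returns (hypothetical data).
    weights: portfolio weights, assumed to sum to 1.
    """
    # The portfolio return in each period is the weighted sum of asset returns.
    series = [sum(w * r for w, r in zip(weights, period)) for period in returns]
    mean = fmean(series)
    # Population (biased) central moments, kept simple for illustration.
    variance = fmean([(x - mean) ** 2 for x in series])
    third = fmean([(x - mean) ** 3 for x in series])
    skewness = third / variance ** 1.5 if variance > 0 else 0.0
    return mean, variance, skewness
```

A three-objective optimizer would then search over `weights` to maximize the first and third values while minimizing the second.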

Keywords: evolutionary algorithms, portfolio optimization, skewness, stock selection

Procedia PDF Downloads 198
2424 Efficient Frequent Itemset Mining Methods over Real-Time Spatial Big Data

Authors: Hamdi Sana, Emna Bouazizi, Sami Faiz

Abstract:

In recent years, there has been a huge increase in the use of spatio-temporal applications in which data and queries are continuously moving. As a result, the need to process real-time spatio-temporal data is clear, and real-time stream data management has become a hot topic. The sliding window model and frequent itemset mining over dynamic data are among the most important problems in data mining. The sliding window model is widely used for frequent itemset mining over data streams because of its emphasis on recent data and its bounded memory requirement. Existing methods use the traditional transaction-based sliding window model, in which the window size is a fixed number of transactions. This model assumes that transactions arrive at a constant rate, which is not suited to real-time applications, and using it in such applications endangers their performance. Based on these observations, this paper relaxes the notion of window size and proposes a timestamp-based sliding window model. In our proposed frequent itemset mining algorithm, support conditions are used to differentiate frequent from infrequent patterns. A tree is then built to incrementally maintain the essential information. We evaluate our contribution, and the preliminary results are quite promising.
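
The difference between a transaction-based and a timestamp-based window can be illustrated with a small sketch. The class below keeps every transaction whose timestamp falls within the last `span` time units, however many transactions that is, and counts itemsets incrementally. It is an assumption-laden illustration: it uses a plain counter instead of the tree structure mentioned in the abstract, and the class name, the `max_size` limit, and the support definition are all hypothetical.

```python
from collections import Counter, deque
from itertools import combinations


class TimestampWindow:
    """Keep only transactions whose timestamp lies within the last `span` units."""

    def __init__(self, span, max_size=2):
        self.span = span
        self.max_size = max_size  # largest itemset size to count
        self.window = deque()     # (timestamp, frozenset of items)
        self.counts = Counter()   # itemset tuple -> occurrences in the window

    def _itemsets(self, items):
        # All itemsets of size 1..max_size, as sorted tuples.
        for k in range(1, self.max_size + 1):
            yield from combinations(sorted(items), k)

    def add(self, timestamp, items):
        # Assumes timestamps arrive in non-decreasing order.
        items = frozenset(items)
        self.window.append((timestamp, items))
        self.counts.update(self._itemsets(items))
        # Expire transactions that have fallen out of the time window.
        while self.window and self.window[0][0] <= timestamp - self.span:
            _, old = self.window.popleft()
            self.counts.subtract(self._itemsets(old))

    def frequent(self, min_support):
        # min_support is a fraction of the transactions currently in the window.
        n = len(self.window)
        return {s for s, c in self.counts.items() if n and c / n >= min_support}
```

Because expiry depends on time rather than on a transaction count, a burst of arrivals enlarges the window and a quiet period shrinks it, which is the behaviour a fixed-size transaction window cannot provide.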

Keywords: real-time spatial big data, frequent itemset, transaction-based sliding window model, timestamp-based sliding window model, weighted frequent patterns, tree, stream query

Procedia PDF Downloads 161
2423 The Design of Information Technology System for Traceability of Thailand’s Tubtimjun Roseapple

Authors: Pimploi Tirastittam, Phutthiwat Waiyawuththanapoom, Sawanath Treesathon

Abstract:

Several countries that import agricultural products from Thailand require Thailand to establish a traceability system. A traceability system is an effective tool for reducing risk in the supply chain, as it helps stakeholders identify the point at which a defect occurred, which reduces the cost of operation in the supply chain. This qualitative research aims to design a traceability system for Tubtimjun roseapple exported to China. Data were collected from experts in the Tubtimjun roseapple and fruit exporting industry and used to design the traceability system. The design follows supply chain theory, running from the upstream to the downstream of the supply chain to support the export process and conditions, and includes the database design, system architecture, user interface design, and information technology of the traceability system.

Keywords: design information, technology system, traceability, tubtimjun roseapple

Procedia PDF Downloads 170
2422 Object Recognition Approach Based on Generalized Hough Transform and Color Distribution Serving in Generating Arabic Sentences

Authors: Nada Farhani, Naim Terbeh, Mounir Zrigui

Abstract:

Recognizing the objects contained in images has always been a research challenge because of the variability of shape, position, contrast of objects, etc. In this paper, we address the recognition of objects. The classical Hough Transform (HT) is a tool for detecting straight line segments in images. The HT technique has been generalized (GHT) to detect arbitrary shapes. With GHT, the shapes sought are not necessarily defined analytically but rather by a particular silhouette. For more precision, we propose to combine the results of the GHT with the results of a similarity calculation between the histograms and the spatiograms of the images. The main purpose of our work is to use the recognition results to generate sentences in Arabic that summarize the content of the image.
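
As an illustration of a histogram similarity calculation, the sketch below builds normalised intensity histograms and compares them with histogram intersection. The abstract does not name the similarity measure used, so histogram intersection is an assumption chosen for simplicity, and spatiograms (which add spatial statistics per bin) are not covered. The function names and the bin range are hypothetical.

```python
def histogram(values, bins, lo=0, hi=256):
    """Normalised histogram of intensity values assumed to lie in [lo, hi)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp so that a value equal to the upper edge lands in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical normalised histograms,
    0.0 for histograms with no overlapping mass."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

A score of this kind could be combined with a shape-based detection score to rank candidate objects, which is the general idea of combining the two sources of evidence.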

Keywords: recognition of shape, generalized hough transformation, histogram, spatiogram, learning

Procedia PDF Downloads 158
2421 Early Recognition and Grading of Cataract Using a Combined Log Gabor/Discrete Wavelet Transform with ANN and SVM

Authors: Hadeer R. M. Tawfik, Rania A. K. Birry, Amani A. Saad

Abstract:

The eye is considered to be the most sensitive and important organ of the human being. Thus, any eye disorder will affect the patient in all aspects of life. Cataract is one of those eye disorders that lead to blindness if not treated correctly and quickly. This paper demonstrates a model for automatic detection, classification, and grading of cataracts based on image processing techniques and artificial intelligence. The proposed system is developed to ease the cataract diagnosis process for both ophthalmologists and patients. The wavelet transform combined with the 2D Log Gabor wavelet transform was used for feature extraction on a dataset of 120 eye images, followed by a classification process that classified the image set into three classes: normal, early, and advanced stage. The two classifiers used, the support vector machine (SVM) and the artificial neural network (ANN), were compared on the same dataset of 120 eye images. It was concluded that SVM gave better results than ANN: SVM achieved 96.8% accuracy, whereas ANN achieved 92.3% accuracy.

Keywords: cataract, classification, detection, feature extraction, grading, log-gabor, neural networks, support vector machines, wavelet

Procedia PDF Downloads 332
2420 Arabic Character Recognition Using Regression Curves with the Expectation Maximization Algorithm

Authors: Abdullah A. AlShaher

Abstract:

In this paper, we demonstrate how regression curves can be used to recognize 2D non-rigid handwritten shapes. Each shape is represented by a set of non-overlapping, uniformly distributed landmarks. The underlying models use second-order polynomials to model the shapes within a training set. To estimate the regression models, we need to extract the coefficients that describe the variations within a shape class. Hence, a least squares method is used to estimate such models. We then proceed by training these coefficients using the apparatus of the Expectation Maximization algorithm. Recognition is carried out by finding the least-error displacement of the landmarks with respect to the model curves. Handwritten isolated Arabic characters are used to evaluate our approach.
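
The least squares step can be sketched as follows. The example fits a second-order polynomial y = a + b*x + c*x^2 to a set of landmarks by solving the normal equations, and measures how far a set of landmarks lies from a fitted curve. Treating y as a function of x is an assumption made for illustration, since the abstract does not give the exact parametrisation, and the Expectation Maximization training stage is not shown. Function names are hypothetical.

```python
def fit_quadratic(points):
    """Least-squares fit of y = a + b*x + c*x**2 to (x, y) landmarks.

    Requires at least three distinct x values; returns [a, b, c].
    """
    # Build the augmented 3x3 normal equations (X^T X) beta = X^T y.
    s = [sum(x ** k for x, _ in points) for k in range(5)]
    t = [sum(y * x ** k for x, y in points) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def squared_error(coeffs, points):
    """Sum of squared vertical displacements of landmarks from the curve."""
    a, b, c = coeffs
    return sum((y - (a + b * x + c * x * x)) ** 2 for x, y in points)
```

Under this simplification, a test shape would be assigned to the class whose model curve gives the smallest `squared_error` for its landmarks.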

Keywords: character recognition, regression curves, handwritten Arabic letters, expectation maximization algorithm

Procedia PDF Downloads 145
2419 Short Text Classification Using Part of Speech Feature to Analyze Students' Feedback of Assessment Components

Authors: Zainab Mutlaq Ibrahim, Mohamed Bader-El-Den, Mihaela Cocea

Abstract:

Students' textual feedback can hold unique patterns and useful information about the learning process, including the advantages and disadvantages of teaching methods, assessment components, facilities, and other aspects of teaching. The results of analysing such feedback can form a key input for institutions’ decision makers to advance and update their systems accordingly. This paper proposes a data mining framework for analysing end-of-unit general textual feedback using the part of speech (PoS) feature with four machine learning algorithms: support vector machines, decision tree, random forest, and naive Bayes. The proposed framework has two tasks. The first is to use the above algorithms to build an optimal model that automatically classifies the whole data set into two subsets: one related to assessment practices (assessment related) and the other containing the non-assessment related data. The second is to use the same algorithms to build an optimal model that automatically detects the sentiment of the whole data set and of the new data subsets. The significance of this paper lies in comparing the performance of the above four algorithms using the part of speech feature with the performance of the same algorithms using the n-grams feature. The paper follows the Knowledge Discovery and Data Mining (KDDM) framework to construct the classification and sentiment analysis models; the framework consists of understanding the assessment domain, cleaning and pre-processing the data set, selecting and running the data mining algorithm, interpreting the mined patterns, and consolidating the discovered knowledge. The experimental results show that the models using either feature performed very well on the first task. On the second task, however, models that used the part of speech feature underperformed in comparison with models that used unigrams and bigrams.

Keywords: assessment, part of speech, sentiment analysis, student feedback

Procedia PDF Downloads 142