Search results for: text localization and extraction.

1287 A Novel Arabic Text Steganography Method Using Letter Points and Extensions

Authors: Adnan Abdul-Aziz Gutub, Manal Mohammad Fattani

Abstract:

This paper presents a new steganography approach suitable for Arabic texts. It can be classified under steganography feature coding methods. The approach hides secret information bits within the letters benefiting from their inherited points. To note the specific letters holding secret bits, the scheme considers the two features, the existence of the points in the letters and the redundant Arabic extension character. We use the pointed letters with extension to hold the secret bit 'one' and the un-pointed letters with extension to hold 'zero'. This steganography technique is found attractive to other languages having similar texts to Arabic such as Persian and Urdu.

Keywords: Arabic text, Cryptography, Feature coding, Information security, Text steganography, Text watermarking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3460

1286 Microwave Pretreatment of Seeds to Extract High Quality Vegetable Oil

Authors: S. Azadmard-Damirchi, K. Alirezalu, B. Fathi Achachlouei

Abstract:

Microwave energy is a superior alternative to several other thermal treatments. Extraction techniques are widely employed for the isolation of bioactive compounds and vegetable oils from oil seeds. Among the different and new available techniques, microwave pretreatment of seeds is a simple and desirable method for production of high quality vegetable oils. Microwave pretreatment for oil extraction has many advantages as follow: improving oil extraction yield and quality, direct extraction capability, lower energy consumption, faster processing time and reduced solvent levels compared with conventional methods. It allows also for better retention and availability of desirable nutraceuticals, such as phytosterols and tocopherols, canolol and phenolic compounds in the extracted oil such as rapeseed oil. This can be a new step to produce nutritional vegetable oils with improved shelf life because of high antioxidant content.

Keywords: Microwave pretreatment, vegetable oil extraction, nutraceuticals, oil quality

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4842

1285 Oil Extraction from Microalgae Dunalliela sp. by Polar and Non-Polar Solvents

Authors: A. Zonouzi, M. Auli, M. Javanmard Dakheli, M. A. Hejazi

Abstract:

Microalgae are tiny photosynthetic plants. Nowadays, microalgae are being used as nutrient-dense foods and sources of fine chemicals. They have significant amounts of lipid, carotenoids, vitamins, protein, minerals, chlorophyll, and pigments. Oil extraction from algae is a hotly debated topic currently because introducing an efficient method could decrease the process cost. This can determine the sustainability of algae-based foods. Scientific research works show that solvent extraction using chloroform/methanol (2:1) mixture is one of the efficient methods for oil extraction from algal cells, but both methanol and chloroform are toxic solvents, and therefore, the extracted oil will not be suitable for food application. In this paper, the effect of two food grade solvents (hexane and hexane/ isopropanol) on oil extraction yield from microalgae Dunaliella sp. was investigated and the results were compared with chloroform/methanol (2:1) extraction yield. It was observed that the oil extraction yield using hexane, hexane/isopropanol (3:2) and chloroform/methanol (2:1) mixture were 5.4, 13.93, and 17.5 (% w/w, dry basis), respectively. The fatty acid profile derived from GC illustrated that the palmitic (36.62%), oleic (18.62%), and stearic acids (19.08%) form the main portion of fatty acid composition of microalgae Dunalliela sp. oil. It was concluded that, the addition of isopropanol as polar solvent could increase the extraction yield significantly. Isopropanol solves cell wall phospholipids and enhances the release of intercellular lipids, which improves accessing of hexane to fatty acids.

Keywords: Fatty acid profile, Microalgae, Oil extraction, Polar solvent.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2132

1284 Experimental Study of Hyperparameter Tuning a Deep Learning Convolutional Recurrent Network for Text Classification

Authors: Bharatendra Rai

Abstract:

Sequences of words in text data have long-term dependencies and are known to suffer from vanishing gradient problem when developing deep learning models. Although recurrent networks such as long short-term memory networks help overcome this problem, achieving high text classification performance is a challenging problem. Convolutional recurrent networks that combine advantages of long short-term memory networks and convolutional neural networks, can be useful for text classification performance improvements. However, arriving at suitable hyperparameter values for convolutional recurrent networks is still a challenging task where fitting of a model requires significant computing resources. This paper illustrates the advantages of using convolutional recurrent networks for text classification with the help of statistically planned computer experiments for hyperparameter tuning.

Keywords: Convolutional recurrent networks, hyperparameter tuning, long short-term memory networks, Tukey honest significant differences

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 35

1283 Degraded Document Analysis and Extraction of Original Text Document: An Approach without Optical Character Recognition

Authors: L. Hamsaveni, Navya Prakash, Suresha

Abstract:

Document Image Analysis recognizes text and graphics in documents acquired as images. An approach without Optical Character Recognition (OCR) for degraded document image analysis has been adopted in this paper. The technique involves document imaging methods such as Image Fusing and Speeded Up Robust Features (SURF) Detection to identify and extract the degraded regions from a set of document images to obtain an original document with complete information. In case, degraded document image captured is skewed, it has to be straightened (deskew) to perform further process. A special format of image storing known as YCbCr is used as a tool to convert the Grayscale image to RGB image format. The presented algorithm is tested on various types of degraded documents such as printed documents, handwritten documents, old script documents and handwritten image sketches in documents. The purpose of this research is to obtain an original document for a given set of degraded documents of the same source.

Keywords: Grayscale image format, image fusing, SURF detection, YCbCr image format.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1113

1282 Eclectic Rule-Extraction from Support Vector Machines

Authors: Nahla Barakat, Joachim Diederich

Abstract:

Support vector machines (SVMs) have shown superior performance compared to other machine learning techniques, especially in classification problems. Yet one limitation of SVMs is the lack of an explanation capability which is crucial in some applications, e.g. in the medical and security domains. In this paper, a novel approach for eclectic rule-extraction from support vector machines is presented. This approach utilizes the knowledge acquired by the SVM and represented in its support vectors as well as the parameters associated with them. The approach includes three stages; training, propositional rule-extraction and rule quality evaluation. Results from four different experiments have demonstrated the value of the approach for extracting comprehensible rules of high accuracy and fidelity.

Keywords: Data mining, hybrid rule-extraction algorithms, medical diagnosis, SVMs

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1657

1281 Data Embedding Based on Better Use of Bits in Image Pixels

Authors: Rehab H. Alwan, Fadhil J. Kadhim, Ahmad T. Al-Taani

Abstract:

In this study, a novel approach of image embedding is introduced. The proposed method consists of three main steps. First, the edge of the image is detected using Sobel mask filters. Second, the least significant bit LSB of each pixel is used. Finally, a gray level connectivity is applied using a fuzzy approach and the ASCII code is used for information hiding. The prior bit of the LSB represents the edged image after gray level connectivity, and the remaining six bits represent the original image with very little difference in contrast. The proposed method embeds three images in one image and includes, as a special case of data embedding, information hiding, identifying and authenticating text embedded within the digital images. Image embedding method is considered to be one of the good compression methods, in terms of reserving memory space. Moreover, information hiding within digital image can be used for security information transfer. The creation and extraction of three embedded images, and hiding text information is discussed and illustrated, in the following sections.

Keywords: Image embedding, Edge detection, gray level connectivity, information hiding, digital image compression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2100

1280 Preliminary Evaluation of Feasibility for Wind Energy Production on Offshore Extraction Platforms

Authors: M. Raciti Castelli, S. De Betta, E. Benini

Abstract:

A preliminary evaluation of the feasibility of installing small wind turbines on offshore oil and gas extraction platforms is presented. Some aerodynamic considerations are developed in order to determine the best rotor architecture to exploit the wind potential on such installations, assuming that wind conditions over the platforms are similar to those registered on the roofs of urban buildings. Economical considerations about both advantages and disadvantages of the exploitation of wind energy on offshore extraction platforms with respect to conventional offshore wind plants, is also presented. Finally, wind charts of European offshore winds are presented together with a map of the major offshore installations.

Keywords: Extraction platform, offshore wind energy, verticalaxis wind turbine (VAWT).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1765

1279 Influences of Juice Extraction and Drying Methods on the Chemical Analysis of Lemon Peels

Authors: Azza A. Abou-Arab, Marwa H. Mahmoud, Ferial M. Abu-Salem

Abstract:

This study aimed to determine the influence of some different juice extraction methods (screw type hand operated juice extractor and pressed squeeze juice extractor) as well as drying methods (microwave, solar and oven drying) on the chemical properties of lemon peels. It could be concluded that extraction of juice by screw type and drying of peel using the microwave drying method were the best preparative processing steps methods for lemon peel utilization as food additives.

Keywords: Lemon peel, extraction of juice methods, chemical analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 968

1278 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni

Abstract:

The problem of Entity relation discovery in structured data, a well covered topic in literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary, or a specific collection of named items. In many cases machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations - a cooccurrence graph. Indeed each cooccurrence highlights some grade of semantic correlation between the words because it is more common to have related words close each other than having them in the opposite sides of the text. Some authors have used sliding windows for such problem: they count all the occurrences within a sliding windows running over the whole text. In this paper we generalise such technique, coming up to a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is accounted with a weight depending on the distance between items: a closer distance implies a stronger evidence of a relationship. We develop an experiment in order to support this intuition, by applying this technique to a data set consisting in the text of the Bible, split into verses.

Keywords: Cooccurrence graph, entity relation graph, unstructured text, weighted distance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 637

1277 Stroke Extraction and Approximation with Interpolating Lagrange Curves

Authors: Bence Kővári, ZSolt Kertész

Abstract:

This paper proposes a stroke extraction method for use in off-line signature verification. After giving a brief overview of the current ongoing researches an algorithm is introduced for detecting and following strokes in static images of signatures. Problems like the handling of junctions and variations in line width and line intensity are discussed in detail. Results are validated by both using an existing on-line signature database and by employing image registration methods.

Keywords: Stroke extraction, spline fitting, off-line signatureverification, image registration.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1937

1276 Biocompatible Ionic Liquids in Liquid – Liquid Extraction of Lactic Acid: A Comparative Study

Authors: Konstantza Tonova, Ivan Svinyarov, Milen G. Bogdanov

Abstract:

Ionic liquids consisting of a phosphonium cationic moiety and a saccharinate anion are synthesized and compared with their precursors, phosphonium chlorides, in reference to their extraction efficiency towards L-lactic acid. On the base of measurements of the acid and the water partitioning in the equilibrium biphasic systems, the molar ratios between acid, water and ionic liquid are estimated which allows to deduce the lactic acid extractive pathway. The effect of a salting-out addition that strengthens hydrophobicity in both phases is studied in view to reveal the best biphasic system with respect to IL low toxicity and high extraction efficiency.

Keywords: Biphasic system, Extraction, Ionic liquids, Lactic acid.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2672

1275 HaskellFL: A Tool for Detecting Logical Errors in Haskell

Authors: Vanessa Vasconcelos, Mariza A. S. Bigonha

Abstract:

Understanding and using the functional paradigm is a challenge for many programmers. Looking for logical errors in code may take a lot of a developer’s time when a program grows in size. In order to facilitate both processes, this paper presents HaskellFL, a tool that uses fault localization techniques to locate a logical error in Haskell code. The Haskell subset used in this work is sufficiently expressive for those studying Functional Programming to get immediate help debugging their code and to answer questions about key concepts associated with the functional paradigm. HaskellFL was tested against Functional Programming assignments submitted by students enrolled at the Functional Programming class at the Federal University of Minas Gerais and against exercises from the Exercism Haskell track that are publicly available in GitHub. This work also evaluated the effectiveness of two fault localization techniques, Tarantula and Ochiai, in the Haskell context. Furthermore, the EXAM score was chosen to evaluate the tool’s effectiveness, and results showed that HaskellFL reduced the effort needed to locate an error for all tested scenarios. The results also showed that the Ochiai method was more effective than Tarantula.

Keywords: Debug, fault localization, functional programming, Haskell.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 655

1274 An Automatic Feature Extraction Technique for 2D Punch Shapes

Authors: Awais Ahmad Khan, Emad Abouel Nasr, H. M. A. Hussein, Abdulrahman Al-Ahmari

Abstract:

Sheet-metal parts have been widely applied in electronics, communication and mechanical industries in recent decades; but the advancement in sheet-metal part design and manufacturing is still behind in comparison with the increasing importance of sheet-metal parts in modern industry. This paper presents a methodology for automatic extraction of some common 2D internal sheet metal features. The features used in this study are taken from Unipunch ™ catalogue. The extraction process starts with the data extraction from STEP file using an object oriented approach and with the application of suitable algorithms and rules, all features contained in the catalogue are automatically extracted. Since the extracted features include geometry and engineering information, they will be effective for downstream application such as feature rebuilding and process planning.

Keywords: Feature Extraction, Internal Features, Punch Shapes, Sheet metal, STEP.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2050

1273 Development of Multimodal e-Slide Presentation to Support Self-Learning for the Visually Impaired

Authors: Rustam Asnawi, Wan Fatimah Wan Ahmad

Abstract:

Currently electronic slide (e-slide) is one of the most common styles in educational presentation. Unfortunately, the utilization of e-slide for the visually impaired is uncommon since they are unable to see the content of such e-slides which are usually composed of text, images and animation. This paper proposes a model for presenting e-slide in multimodal presentation i.e. using conventional slide concurrent with voicing, in both languages Malay and English. At the design level, live multimedia presentation concept is used, while at the implementation level several components are used. The text content of each slide is extracted using COM component, Microsoft Speech API for voicing the text in English language and the text in Malay language is voiced using dictionary approach. To support the accessibility, an auditory user interface is provided as an additional feature. A prototype of such model named as VSlide has been developed and introduced.

Keywords: presentation, self-learning, slide, visually impaired

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1519

1272 Object Localization in Medical Images Using Genetic Algorithms

Authors: George Karkavitsas, Maria Rangoussi

Abstract:

We present a genetic algorithm application to the problem of object registration (i.e., object detection, localization and recognition) in a class of medical images containing various types of blood cells. The genetic algorithm approach taken here is seen to be most appropriate for this type of image, due to the characteristics of the objects. Successful cell registration results on real life microscope images of blood cells show the potential of the proposed approach.

Keywords: Genetic algorithms, object registration, pattern recognition, blood cell microscope images.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1923

1271 The Influence of Preprocessing Parameters on Text Categorization

Authors: Jan Pomikalek, Radim Rehurek

Abstract:

Text categorization (the assignment of texts in natural language into predefined categories) is an important and extensively studied problem in Machine Learning. Currently, popular techniques developed to deal with this task include many preprocessing and learning algorithms, many of which in turn require tuning nontrivial internal parameters. Although partial studies are available, many authors fail to report values of the parameters they use in their experiments, or reasons why these values were used instead of others. The goal of this work then is to create a more thorough comparison of preprocessing parameters and their mutual influence, and report interesting observations and results.

Keywords: Text categorization, machine learning, electronic documents, classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1538

1270 Comparison of Different Solvents and Extraction Methods for Isolation of Phenolic Compounds from Horseradish Roots (Armoracia rusticana)

Authors: Lolita Tomsone, Zanda Kruma, Ruta Galoburda

Abstract:

Horseradish (Armoracia rusticana) is a perennial herb belonging to the Brassicaceae family and contains biologically active substances. The aim of the current research was to determine best method for extraction of phenolic compounds from horseradish roots showing high antiradical activity. Three genotypes (No. 105; No. 106 and variety ‘Turku’) of horseradish roots were extracted with eight different solvents: n-hexane, ethyl acetate, diethyl ether, 2-propanol, acetone, ethanol (95%), ethanol / water / acetic acid (80/20/1 v/v/v) and ethanol / water (80/20 by volume) using two extraction methods (conventional and Soxhlet). As the best solvents ethanol and ethanol / water solutions can be chosen. Although in Soxhlet extracts TPC was higher, scavenging activity of DPPH˙ radicals did not increase. It can be concluded that using Soxhlet extraction method more compounds that are not effective antioxidants.

Keywords: DPPH˙, extraction, solvent, Soxhlet, TPC

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14451

1269 Slovenian Text-to-Speech Synthesis for Speech User Interfaces

Authors: Jerneja Žganec Gros, Aleš Mihelič, Nikola Pavešić, Mario Žganec, Stanislav Gruden

Abstract:

The paper presents the design concept of a unitselection text-to-speech synthesis system for the Slovenian language. Due to its modular and upgradable architecture, the system can be used in a variety of speech user interface applications, ranging from server carrier-grade voice portal applications, desktop user interfaces to specialized embedded devices. Since memory and processing power requirements are important factors for a possible implementation in embedded devices, lexica and speech corpora need to be reduced. We describe a simple and efficient implementation of a greedy subset selection algorithm that extracts a compact subset of high coverage text sentences. The experiment on a reference text corpus showed that the subset selection algorithm produced a compact sentence subset with a small redundancy. The adequacy of the spoken output was evaluated by several subjective tests as they are recommended by the International Telecommunication Union ITU.

Keywords: text-to-speech synthesis, prosody modeling, speech user interface.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1401

1268 A Text Classification Approach Based on Natural Language Processing and Machine Learning Techniques

Authors: Rim Messaoudi, Nogaye-Gueye Gning, François Azelart

Abstract:

Automatic text classification applies mostly natural language processing (NLP) and other artificial intelligence (AI)-guided techniques to automatically classify text in a faster and more accurate manner. This paper discusses the subject of using predictive maintenance to manage incident tickets inside the sociality. It focuses on proposing a tool that treats and analyses comments and notes written by administrators after resolving an incident ticket. The goal here is to increase the quality of these comments. Additionally, this tool is based on NLP and machine learning techniques to realize the textual analytics of the extracted data. This approach was tested using real data taken from the French National Railways (SNCF) company and was given a high-quality result.

Keywords: Machine learning, text classification, NLP techniques, semantic representation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 76

1267 Calculus of Turbojet Performances for Ideal Case

Authors: S. Bennoud, S. Hocine, H. Slme

Abstract:

Developments in turbine cooling technology play an important role in increasing the thermal efficiency and the power output of recent gas turbines, in particular the turbojets.

Advanced turbojets operate at high temperatures to improve thermal efficiency and power output. These temperatures are far above the permissible metal temperatures. Therefore, there is a critical need to cool the blades in order to give theirs a maximum life period for safe operation.

The focused objective of this work is to calculate the turbojet performances, as well as the calculation of turbine blades cooling.

The developed application able the calculation of turbojet performances to different altitudes in order to find a point of optimal use making possible to maintain the turbine blades at an acceptable maximum temperature and to limit the local variations in temperatures in order to guarantee their integrity during all the lifespan of the engine.

Keywords: Brayton cycle, Turbine Blades Cooling, Turbojet Cycle, turbojet performances.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2166

1266 Edge-end Pixel Extraction for Edge-based Image Segmentation

Authors: Mahinda P. Pathegama, Özdemir Göl

Abstract:

Extraction of edge-end-pixels is an important step for the edge linking process to achieve edge-based image segmentation. This paper presents an algorithm to extract edge-end pixels together with their directional sensitivities as an augmentation to the currently available mathematical models. The algorithm is implemented in the Java environment because of its inherent compatibility with web interfaces since its main use is envisaged to be for remote image analysis on a virtual instrumentation platform.

Keywords: edge-end pixels, image processing, imagesegmentation, pixel extraction

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2113

1265 Study on Extraction of Ceric Oxide from Monazite Concentrate

Authors: Lwin Thuzar Shwe, Nwe Nwe Soe, Kay Thi Lwin

Abstract:

Cerium oxide is to be recovered from monazite, which contains about 27.35% CeO2. The principal objective of this study is to be able to extract cerium oxide from monazite of Moemeik Myitsone Area. The treatment of monazite in this study involves three main steps; extraction of cerium hydroxide from monazite, solvent extraction of cerium hydroxide, and precipitation with oxalic acid and calcination of cerium oxalate.

Keywords: Calcination, Digestion, Precipitation, SolventExtraction

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2547

1264 Understanding and Political Participation in Constitutional Monarchy of Dusit District Residents

Authors: Sudaporn Arundee

Abstract:

The purposes of this research were to study in three areas: 1) to study political understanding and participating of the constitutional monarchy, 2) to study the level of participation. This paper drew upon data collected from 395 Dusit residents by using questionnaire. In addition, a simple random sampling was utilized to collect data.

The findings revealed that 94 percent of respondents had a very good understanding of constitution monarchy with a mean of 4.8. However, the respondents overall had a very low level of participation with the mean score of 1.69 and standard deviation of .719.

Keywords: Constitution Monarchy, Political Understanding, Political Participating.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1728

1263 Classifier Based Text Mining for Neural Network

Authors: M. Govindarajan, R. M. Chandrasekaran

Abstract:

Text Mining is around applying knowledge discovery techniques to unstructured text is termed knowledge discovery in text (KDT), or Text data mining or Text Mining. In Neural Network that address classification problems, training set, testing set, learning rate are considered as key tasks. That is collection of input/output patterns that are used to train the network and used to assess the network performance, set the rate of adjustments. This paper describes a proposed back propagation neural net classifier that performs cross validation for original Neural Network. In order to reduce the optimization of classification accuracy, training time. The feasibility the benefits of the proposed approach are demonstrated by means of five data sets like contact-lenses, cpu, weather symbolic, Weather, labor-nega-data. It is shown that , compared to exiting neural network, the training time is reduced by more than 10 times faster when the dataset is larger than CPU or the network has many hidden units while accuracy ('percent correct') was the same for all datasets but contact-lences, which is the only one with missing attributes. For contact-lences the accuracy with Proposed Neural Network was in average around 0.3 % less than with the original Neural Network. This algorithm is independent of specify data sets so that many ideas and solutions can be transferred to other classifier paradigms.

Keywords: Back propagation, classification accuracy, textmining, time complexity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4164

1262 Web Data Scraping Technology Using Term Frequency Inverse Document Frequency to Enhance the Big Data Quality on Sentiment Analysis

Authors: Sangita Pokhrel, Nalinda Somasiri, Rebecca Jeyavadhanam, Swathi Ganesan

Abstract:

Tourism is a booming industry with huge future potential for global wealth and employment. There are countless data generated over social media sites every day, creating numerous opportunities to bring more insights to decision-makers. The integration of big data technology into the tourism industry will allow companies to conclude where their customers have been and what they like. This information can then be used by businesses, such as those in charge of managing visitor centres or hotels, etc., and the tourist can get a clear idea of places before visiting. The technical perspective of natural language is processed by analysing the sentiment features of online reviews from tourists, and we then supply an enhanced long short-term memory (LSTM) framework for sentiment feature extraction of travel reviews. We have constructed a web review database using a crawler and web scraping technique for experimental validation to evaluate the effectiveness of our methodology. The text form of sentences was first classified through VADER and RoBERTa model to get the polarity of the reviews. In this paper, we have conducted study methods for feature extraction, such as Count Vectorization and Term Frequency – Inverse Document Frequency (TFIDF) Vectorization and implemented Convolutional Neural Network (CNN) classifier algorithm for the sentiment analysis to decide if the tourist’s attitude towards the destinations is positive, negative, or simply neutral based on the review text that they posted online. The results demonstrated that from the CNN algorithm, after pre-processing and cleaning the dataset, we received an accuracy of 96.12% for the positive and negative sentiment analysis.

Keywords: Counter vectorization, Convolutional Neural Network, Crawler, data technology, Long Short-Term Memory, LSTM, Web Scraping, sentiment analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 109

1261 Localizing and Recognizing Integral Pitches of Cheque Document Images

Authors: Bremananth R., Veerabadran C. S., Andy W. H. Khong

Abstract:

Automatic reading of handwritten cheque is a computationally complex process and it plays an important role in financial risk management. Machine vision and learning provide a viable solution to this problem. Research effort has mostly been focused on recognizing diverse pitches of cheques and demand drafts with an identical outline. However most of these methods employ templatematching to localize the pitches and such schemes could potentially fail when applied to different types of outline maintained by the bank. In this paper, the so-called outline problem is resolved by a cheque information tree (CIT), which generalizes the localizing method to extract active-region-of-entities. In addition, the weight based density plot (WBDP) is performed to isolate text entities and read complete pitches. Recognition is based on texture features using neural classifiers. Legal amount is subsequently recognized by both texture and perceptual features. A post-processing phase is invoked to detect the incorrect readings by Type-2 grammar using the Turing machine. The performance of the proposed system was evaluated using cheque and demand drafts of 22 different banks. The test data consists of a collection of 1540 leafs obtained from 10 different account holders from each bank. Results show that this approach can easily be deployed without significant design amendments.

Keywords: Cheque reading, Connectivity checking, Text localization, Texture analysis, Turing machine, Signature verification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1608

1260 Principle Components Updates via Matrix Perturbations

Authors: Aiman Elragig, Hanan Dreiwi, Dung Ly, Idriss Elmabrook

Abstract:

This paper highlights a new approach to look at online principle components analysis (OPCA). Given a data matrix X ∈ R,^m x n we characterise the online updates of its covariance as a matrix perturbation problem. Up to the principle components, it turns out that online updates of the batch PCA can be captured by symmetric matrix perturbation of the batch covariance matrix. We have shown that as n→ n0 >> 1, the batch covariance and its update become almost similar. Finally, utilize our new setup of online updates to find a bound on the angle distance of the principle components of X and its update.

Keywords: Online data updates, covariance matrix, online principle component analysis (OPCA), matrix perturbation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 990

1259 Model of Obstacle Avoidance on Hard Disk Drive Manufacturing with Distance Constraint

Authors: Rawinun Praserttaweelap, Somyot Kiatwanidvilai

Abstract:

Obstacle avoidance is the one key for the robot system in unknown environment. The robots should be able to know their position and safety region. This research starts on the path planning which are SLAM and AMCL in ROS system. In addition, the best parameters of the obstacle avoidance function are required. In situation on Hard Disk Drive Manufacturing, the distance between robots and obstacles are very serious due to the manufacturing constraint. The simulations are accomplished by the SLAM and AMCL with adaptive velocity and safety region calculation.

Keywords: Obstacle avoidance, simultaneous localization and mapping, adaptive Monte Carlo localization, KLD sampling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 432

1258 n-Butanol as an Extractant for Lactic Acid Recovery

Authors: Kanungnit Chawong, Panarat Rattanaphanee

Abstract:

Extraction of lactic acid from aqueous solution using n-butanol as an extractant was studied. Effect of mixing time, pH of the aqueous solution, initial lactic acid concentration, and volume ratio between the organic and the aqueous phase were investigated. Distribution coefficient and degree of lactic acid extraction was found to increase when the pH of aqueous solution was decreased. The pH Effect was substantially pronounced at pH of the aqueous solution less than 1. Initial lactic acid concentration and organic-toaqueous volume ratio appeared to have positive effect on the distribution coefficient and the degree of extraction. Due to the nature of n-butanol that is partially miscible in water, incorporation of aqueous solution into organic phase was observed in the extraction with large organic-to-aqueous volume ratio.

Keywords: Lactic acid, liquid-liquid extraction, n-Butanol, Solvating extractant.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3116