Search results for: pattern recognition approach
16974 An End-to-end Piping and Instrumentation Diagram Information Recognition System
Authors: Taekyong Lee, Joon-Young Kim, Jae-Min Cha
Abstract:
A piping and instrumentation diagram (P&ID) is an essential design drawing that describes the interconnection of process equipment and the instrumentation installed to control the process. P&IDs are modified and managed throughout the whole life cycle of a process plant. For ease of data transfer, P&IDs are generally handed over from a design company to an engineering company in portable document format (PDF), which is hard to modify. Engineering companies therefore have to spend a great deal of time and human resources just to convert P&ID images manually into a computer-aided design (CAD) file format. To reduce the inefficiency of P&ID conversion, the various symbols and texts in P&ID images should be recognized automatically. However, recognizing information in P&ID images is not an easy task. A P&ID image usually contains hundreds of symbol and text objects. Most objects are small compared to the size of the whole image and are densely packed together. Traditional recognition methods based on geometrical features are not capable of recognizing every element of a P&ID image. To overcome these difficulties, two state-of-the-art deep learning models, RetinaNet and the connectionist text proposal network (CTPN), were used to build a system for recognizing symbols and texts in a P&ID image. Using RetinaNet and CTPN models carefully modified and tuned for a P&ID image dataset, the developed system recognizes texts, equipment symbols, piping symbols, and instrumentation symbols in an input P&ID image and saves the recognition results in a pre-defined extensible markup language format. In a test using a commercial P&ID image, the P&ID information recognition system correctly recognized 97% of the symbols and 81.4% of the texts.
Keywords: object recognition system, P&ID, symbol recognition, text recognition
Procedia PDF Downloads 153
16973 Understanding the Interactive Nature in Auditory Recognition of Phonological/Grammatical/Semantic Errors at the Sentence Level: An Investigation Based upon Japanese EFL Learners’ Self-Evaluation and Actual Language Performance
Authors: Hirokatsu Kawashima
Abstract:
One important element of teaching and learning listening is intensive listening, such as listening for precise sounds, words, and grammatical and semantic units. Several classroom-based investigations have been conducted to explore the usefulness of auditory recognition of phonological, grammatical, and semantic errors in such a context. The current study reports the results of one such investigation, which targeted auditory recognition of phonological, grammatical, and semantic errors at the sentence level. Fifty-six Japanese EFL learners participated in this investigation, in which their recognition performance for phonological, grammatical, and semantic errors was measured on a 9-point scale by learners’ self-evaluation from the perspective of 1) two types of similar English sound (vowel and consonant minimal pair words), 2) two types of sentence word order (verb phrase-based and noun phrase-based word orders), and 3) two types of semantic consistency (verb-purpose and verb-place agreements), respectively, and their general listening proficiency was examined using standardized tests. A number of findings were made about the interactive relationships between the three types of auditory error recognition and general listening proficiency. Analyses based on the OPLS (Orthogonal Projections to Latent Structure) regression model disclosed, for example, that the three types of auditory error recognition are linked in a non-linear way: the highest explanatory power for general listening proficiency may be attained when quadratic interactions between auditory recognition of errors related to vowel minimal pair words and that of errors related to noun phrase-based word order are included (R² = .33, p = .01).
Keywords: auditory error recognition, intensive listening, interaction, investigation
Procedia PDF Downloads 514
16972 Transcultural Study on Social Intelligence
Authors: Martha Serrano-Arias, Martha Frías-Armenta
Abstract:
Significant results have been found supporting both the universality of emotion recognition and the influence of cultural background. Thus, the aim of this research was to test a Mexican version of the MTSI in different cultures to find differences in performance. The MTSI-Mx assesses social intelligence through a scenario approach in which subjects must evaluate real persons. Two target persons were used for its construction, a man (FS) and a woman (AD). The items were grouped into four variables: Picture, Video, and the FS and AD scenarios. The test was applied to 201 students from Mexico and Germany. T-tests for the Picture and FS scenario variables showed no significant difference. The Video and AD variables showed significant differences at the 5% level. The results show slight differences between cultures, although more comprehensive research is needed to conclude which culture performs better in this kind of assessment.
Keywords: emotion recognition, MTSI, social intelligence, transcultural study
Procedia PDF Downloads 327
16971 Makhraj Recognition Using Convolutional Neural Network
Authors: Zan Azma Nasruddin, Irwan Mazlin, Nor Aziah Daud, Fauziah Redzuan, Fariza Hanis Abdul Razak
Abstract:
This paper focuses on a machine learning system that learns the correct pronunciation of Makhraj Huroofs. Usually, people need to find an expert to learn how to pronounce a Huroof accurately. In this study, the researchers developed a system that is able to learn the selected Huroofs, which are ha, tsa, zho, and dza, using a Convolutional Neural Network (CNN). The researchers present the chosen type of CNN architecture, selected so that the system learns the data (Huroofs) as quickly as possible and produces high accuracy during prediction. The researchers ran experiments on the system to measure the accuracy and the cross entropy during the training process.
Keywords: convolutional neural network, Makhraj recognition, speech recognition, signal processing, tensorflow
Procedia PDF Downloads 335
16970 Identifying the Structural Components of Old Buildings from Floor Plans
Authors: Shi-Yu Xu
Abstract:
The top three risk factors that have contributed to building collapses during past earthquake events in Taiwan are: "irregular floor plans or elevations," "insufficient columns in single-bay buildings," and the "weak-story problem." Fortunately, these unsound structural characteristics can be directly identified from the floor plans. However, due to the vast number of old buildings, conducting manual inspections to identify these compromised structural features in all existing structures would be time-consuming and prone to human errors. This study aims to develop an algorithm that utilizes artificial intelligence techniques to automatically pinpoint the structural components within a building's floor plans. The obtained spatial information will be utilized to construct a digital structural model of the building. This information, particularly regarding the distribution of columns in the floor plan, can then be used to conduct preliminary seismic assessments of the building. The study employs various image processing and pattern recognition techniques to enhance detection efficiency and accuracy. The study enables a large-scale evaluation of structural vulnerability for numerous old buildings, providing ample time to arrange for structural retrofitting in those buildings that are at risk of significant damage or collapse during earthquakes.
Keywords: structural vulnerability detection, object recognition, seismic capacity assessment, old buildings, artificial intelligence
Procedia PDF Downloads 90
16969 Pattern Synthesis of Nonuniform Linear Arrays Including Mutual Coupling Effects Based on Gaussian Process Regression and Genetic Algorithm
Authors: Ming Su, Ziqiang Mu
Abstract:
This paper proposes a synthesis method for nonuniform linear antenna arrays that combines Gaussian process regression (GPR) and a genetic algorithm (GA). In this method, the GPR model is used to calculate the array radiation pattern in the presence of mutual coupling effects, and the GA is then used to optimize the excitations and locations of the elements so as to generate the desired radiation pattern. Taking a 9-element nonuniform linear array as an example, with the desired radiation pattern corresponding to a Chebyshev distribution as the optimization objective, the excitations and locations of the elements are optimized. Finally, the optimization results are verified with the electromagnetic simulation software CST, which shows that the method is effective.
Keywords: nonuniform linear antenna arrays, GPR, GA, mutual coupling effects, active element pattern
Procedia PDF Downloads 110
16968 The Artificial Intelligence Technologies Used in PhotoMath Application
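For orientation, the quantity a GA would score in this kind of synthesis is the array radiation pattern. The sketch below computes only the textbook array factor of an ideal nonuniform linear array, with no mutual coupling; in the abstract, that coupling is what the GPR model accounts for. The function name, the choice of measuring the angle from broadside, and the example positions are assumptions made for illustration and are not taken from the paper.

```python
import cmath
import math


def array_factor(positions_wl, excitations, theta_deg):
    """Ideal array factor of a linear array (mutual coupling ignored).

    positions_wl: element positions along the array axis, in wavelengths
    excitations:  complex excitation of each element
    theta_deg:    observation angle measured from broadside, in degrees
    """
    k = 2 * math.pi  # wavenumber when positions are expressed in wavelengths
    u = math.sin(math.radians(theta_deg))
    return sum(a * cmath.exp(1j * k * x * u)
               for x, a in zip(positions_wl, excitations))


# Hypothetical 9-element layout with unequal spacings and unit excitations.
positions = [0.0, 0.45, 0.95, 1.5, 2.0, 2.5, 3.05, 3.55, 4.0]
weights = [1.0] * 9
broadside_gain = abs(array_factor(positions, weights, 0.0))
```

At broadside every phase term is one, so the magnitude equals the sum of the excitations. A GA fitness function could compare such a pattern, sampled over many angles, against the desired Chebyshev pattern.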
Authors: Tala Toonsi, Marah Alagha, Lina Alnowaiser, Hala Rajab
Abstract:
This report is about the Photomath app, an AI application that uses image recognition technology, specifically optical character recognition (OCR) algorithms. The OCR algorithm translates an image into a mathematical equation, and the app automatically provides a step-by-step solution. The application supports decimals, basic arithmetic, fractions, linear equations, and multiple functions such as logarithms. Testing was conducted to examine the usage of this app, and results were collected by surveying ten participants. The results were then analyzed. This paper seeks to answer the question: how accurate are the artificial intelligence features of this app, and how fast is its processing? It is hoped that this study will inform users about the efficiency of AI in Photomath.
Keywords: photomath, image recognition, app, OCR, artificial intelligence, mathematical equations
Procedia PDF Downloads 172
16967 Relationship between Finger Print Pattern and Gender among Adolescents of Igala Ethnic Group, Kogi State, Nigeria
Authors: Paul Idoko Ukanu, Sunday Abba, Balogun Sadiya
Abstract:
The fingerprint patterns of the Igala ethnic group were studied in order to examine their association with gender. A cross-sectional study was conducted with a total of 602 subjects, 322 females and 280 males, who were mainly secondary school students aged 13-19 years. Each subject's fingerprint pattern was obtained by having the subject place the tip of each finger on a stamp pad and then imprint it on the questionnaire; this was done for both the left and right hands. Females had higher arch, whorl, and loop fingerprint patterns than males in most of the right fingers; the differences were statistically significant for the right index, right ring, and right little fingers but statistically insignificant for the right thumb and right middle finger (p = 0.207 and 0.726, respectively). The results also revealed that males had a higher arch fingerprint pattern than females in the right index and right little fingers, which was statistically significant (p = 0.001), and also a higher whorl fingerprint pattern than females in the right middle and ring fingers.
Keywords: arch, loop, whorl, fingers
Procedia PDF Downloads 149
16966 Computational Approach to the Interaction of Neurotoxins and Kv1.3 Channel
Authors: Janneth González, George Barreto, Ludis Morales, Angélica Sabogal
Abstract:
Sea anemone neurotoxins are peptides that interact with Na+ and K+ channels, resulting in specific alterations in their functions. Some of these neurotoxins (1ROO, 1BGK, 2K9E, 1BEI) are important for the treatment of nearly eighty autoimmune disorders due to their specificity for the Kv1.3 channel. The aim of this study was to identify the common residues among these neurotoxins by computational methods and to establish whether there is a pattern useful for the future generation of a treatment for autoimmune diseases. Our results showed eight new key common residues among the studied neurotoxins interacting with a histidine ring and the selectivity filter of the receptor, thus showing a possible pattern of interaction. This knowledge may serve as an input for the design of more promising drugs for autoimmune treatments.
Keywords: neurotoxins, potassium channel, Kv1.3, computational methods, autoimmune diseases
Procedia PDF Downloads 376
16965 The Design Inspired by Phra Maha Chedi of King Rama I-IV at Wat Phra Chetuphon Vimolmangklaram Rajwaramahaviharn
Authors: Taechit Cheuypoung
Abstract:
The research focuses on creating pattern designs inspired by the pagodas, Phra Maha Chedi of King Rama I-IV, located in the temple Wat Phra Chetuphon Vimolmangklaram Rajwaramahaviharn. Different aspects of the temple were studied, including its history, architecture, and significance, and the techniques used to decorate the pagodas. The composition of art and the forms of pattern design were also studied, all of which led to the outcome of four applied Thai patterns. The four patterns combine traditional Thai design with an international scheme while maintaining the distinctiveness of the glazed mosaic tiles of each Phra Maha Chedi. The patterns consist of rounded and notched petal flowers, leaves and vines, various square shapes, and original colors that are updated for modernity. These elements are then grouped and combined with new techniques, resulting in pattern designs that have modern aspects and simultaneously reflect the charm and aesthetics of Thai craftsmanship, which are eternally embedded in the designs.
Keywords: Chedi, Pagoda, pattern, Wat
Procedia PDF Downloads 389
16964 Non-Mammalian Pattern Recognition Receptor from Rock Bream (Oplegnathus fasciatus): Genomic Characterization and Transcriptional Profile upon Bacterial and Viral Inductions
Authors: Thanthrige Thiunuwan Priyathilaka, Don Anushka Sandaruwan Elvitigala, Bong-Soo Lim, Hyung-Bok Jeong, Jehee Lee
Abstract:
Toll like receptors (TLRs) are a phylogenetically conserved family of pattern recognition receptors, which participate in the host immune responses against various pathogens and pathogen-derived mitogens. TLR21, a non-mammalian type, is almost restricted to fish species, although it can occasionally be identified in avians and amphibians. This study was carried out to identify and characterize TLR21 from rock bream (Oplegnathus fasciatus), designated as RbTLR21, at the transcriptional and genomic levels. The full-length cDNA and genomic sequences of RbTLR21 were identified using a previously constructed cDNA sequence database and BAC library, respectively. The identified RbTLR21 sequence was characterized using several bioinformatics tools. A quantitative real-time PCR (qPCR) experiment was conducted to determine the tissue-specific expression distribution of RbTLR21. Further, transcriptional modulation of RbTLR21 upon stimulation with Streptococcus iniae (S. iniae), rock bream iridovirus (RBIV), and Edwardsiella tarda (E. tarda) was analyzed in spleen tissues. The complete coding sequence of RbTLR21 was 2919 bp in length, which can encode a protein consisting of 973 amino acid residues with a molecular mass of 112 kDa and a theoretical isoelectric point of 8.6. The anticipated protein sequence resembled a typical TLR domain architecture, including a C-terminal ectodomain with 16 leucine rich repeats, a transmembrane domain, a cytoplasmic TIR domain, and a signal peptide with 23 amino acid residues. Moreover, protein folding pattern prediction of RbTLR21 exhibited a well-structured and folded ectodomain, transmembrane domain, and cytoplasmic TIR domain. According to the pairwise sequence analysis data, RbTLR21 showed the closest homology with orange-spotted grouper (Epinephelus coioides) TLR21, with 76.9% amino acid identity. Furthermore, our phylogenetic analysis revealed that RbTLR21 shows a close evolutionary relationship with its ortholog from Danio rerio.
The genomic structure of RbTLR21 consisted of a single exon, similar to its zebrafish ortholog. Several putative transcription factor binding sites were also identified in the 5ʹ flanking region of RbTLR21. RbTLR21 was ubiquitously expressed in all the tissues we tested. Relatively high expression levels were found in spleen, liver, and blood tissues. Upon induction with rock bream iridovirus, RbTLR21 expression was upregulated in the early phase of the post-induction period, although its expression level fluctuated in the latter phase. After Edwardsiella tarda injection, RbTLR21 transcripts were upregulated throughout the experiment. Similarly, Streptococcus iniae induction led to significant upregulation of RbTLR21 mRNA expression in the spleen tissues. Collectively, our findings suggest that RbTLR21 is indeed a homolog of TLR21 family members and that RbTLR21 may be involved in host immune responses against bacterial and DNA viral infections.
Keywords: rock bream, toll like receptor 21 (TLR21), pattern recognition receptor, genomic characterization
Procedia PDF Downloads 543
16963 ARABEX: Automated Dotted Arabic Expiration Date Extraction using Optimized Convolutional Autoencoder and Custom Convolutional Recurrent Neural Network
Authors: Hozaifa Zaki, Ghada Soliman
Abstract:
In this paper, we introduce an approach for Automated Dotted Arabic Expiration Date Extraction using an Optimized Convolutional Autoencoder (ARABEX) with bidirectional LSTM. This approach is used for translating Arabic dot-matrix expiration dates into their corresponding filled-in dates. A custom lightweight Convolutional Recurrent Neural Network (CRNN) model is then employed to extract the expiration dates. Because no image dataset was available for Arabic dot-matrix expiration dates, we generated synthetic images by creating an Arabic dot-matrix True Type Font (TTF) to address this limitation. Our model was trained on a realistic synthetic dataset of 3287 images covering the period from 2019 to 2027, with dates represented in the format yyyy/mm/dd. We then trained our custom CRNN model using the generated synthetic images to assess the performance of our model (ARABEX) by extracting expiration dates from the translated images. Our proposed approach achieved an accuracy of 99.4% on the test dataset of 658 images, while also achieving a Structural Similarity Index (SSIM) of 0.46 for image translation on our dataset. The ARABEX approach can be applied to various downstream learning tasks, including image translation and reconstruction. Moreover, this pipeline (ARABEX+CRNN) can be integrated into automated sorting systems to extract expiry dates and sort products accordingly during the manufacturing stage. By eliminating the need for manual entry of expiration dates, which can be time-consuming and inefficient for merchants, our approach improves the efficiency and accuracy of Arabic dot-matrix expiration date recognition.
Keywords: computer vision, deep learning, image processing, character recognition
Procedia PDF Downloads 82
16962 Effect of Tillage Practices and Planting Patterns on Growth and Yield of Maize (Zee Maize)
Authors: O. R. Obalowu, F. B. Akande, T. P Abegunrin
Abstract:
Maize (Zea mays) is widely grown and consumed by Nigerian farmers, who use different tillage practices that have a great effect on its growth and yield. In order to maximize output, there is a need to recommend a suitable tillage practice that will increase the growth and yield of maize. This study investigated the effect of tillage practices and planting patterns on the growth and yield of maize. The experiment was arranged in a 4x3x3 Randomized Complete Block Design (RCBD) layout, with four tillage practices consisting of no-tillage (NT), disc ploughing only (Ponly), disc ploughing followed by harrowing (PH), and disc ploughing, harrowing, then ridging (PHR). Three planting patterns, with spacings of 65 x 75, 75 x 75, and 85 x 75 cm within and between the rows, respectively, were randomly applied to the plots. All treatments were replicated three times. Data on plant height, stem girth, leaf area, and weight of maize per plot were taken and recorded. The data gathered were analyzed using Analysis of Variance (ANOVA) in the Minitab software package. The results show that PHR under the third planting pattern had the highest growth rate (216.50 cm), while NT under the first planting pattern had the lowest mean value of growth rate (115.60 cm). Also, Ponly under the first planting pattern gave a better maize yield (19.45 kg) than the other tillage practices, while NT under the first planting pattern recorded the lowest maize yield (9.40 kg). In conclusion, considering the soil and weather conditions of the research area, ploughing only under the first planting pattern (65 x 75 cm) is the best alternative for the production of the Swan maize variety.
Keywords: tillage practice, planting pattern, disc ploughing, harrowing, ridging
Procedia PDF Downloads 493
16961 Features Vector Selection for the Recognition of the Fragmented Handwritten Numeric Chains
Authors: Salim Ouchtati, Aissa Belmeguenai, Mouldi Bedda
Abstract:
In this study, we propose an offline system for the recognition of fragmented handwritten numeric chains. First, we built a recognition system for isolated handwritten digits; in this part, the study is based mainly on the evaluation of the performance of neural networks trained with the gradient backpropagation algorithm. The parameters used to form the input vector of the neural network are extracted from the binary images of the isolated handwritten digits by several methods: the distribution sequence, sondes application, the Barr features, and the centered moments of the different projections and profiles. Second, the study is extended to the reading of fragmented handwritten numeric chains made up of a variable number of digits. The vertical projection was used to segment the numeric chain into isolated digits, and every digit (or segment) was presented separately to the input of the system built in the first part (the recognition system for isolated handwritten digits).
Keywords: features extraction, handwritten numeric chains, image processing, neural networks
Procedia PDF Downloads 267
16960 Evolution of Pop Art Pattern on Modern Ao Dai
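The vertical-projection segmentation step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a binary image stored as rows of 0/1 values and cuts wherever a column contains no ink. The function names and the `min_width` parameter are hypothetical, and the sketch does not handle the fragmented digits the paper targets, since a digit broken into pieces would be split at every empty column.

```python
def vertical_projection(image):
    """Count ink pixels (value 1) in each column of a binary image."""
    return [sum(column) for column in zip(*image)]


def segment_columns(image, min_width=1):
    """Return (start, end) column ranges, end exclusive, of runs with ink."""
    profile = vertical_projection(image)
    segments = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x  # a new run of inked columns begins
        elif count == 0 and start is not None:
            if x - start >= min_width:
                segments.append((start, x))
            start = None  # the run ended at an empty column
    if start is not None and len(profile) - start >= min_width:
        segments.append((start, len(profile)))  # run reaches the right edge
    return segments


# Toy 3x8 "chain" containing two ink blobs separated by empty columns.
chain = [
    [0, 1, 1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 0, 1, 0],
]
segments = segment_columns(chain)
```

Each returned range could then be cropped and passed to an isolated-digit classifier, which is the role the first part of the system plays in the abstract.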
Authors: Mai Anh Pham Ho
Abstract:
Ao Dai is the traditional dress of Vietnamese women, consisting of a long tunic with slits on either side and wide trousers. It is the Vietnamese national costume and is most commonly worn by women in daily life. Vietnamese men may wear Ao Dai on special occasions such as New Year's Eve or wedding ceremonies. Ao Dai is one of the few Vietnamese words that appear in English-language dictionaries. Nowadays, there are variations of the modern Ao Dai that consist of a short, knee-length tunic and slim trousers made of other materials such as khaki or jeans. This paper aims to apply a pop art pattern to the modern Ao Dai through the image of Vietnamese women by modifying the creation process of fashion design. It reflects on how modern culture is involved in Ao Dai and how it affects fashion design. The research method consists of surveying various examples of technological applications to fashion design and then applying the pop art pattern with the image of Vietnamese women to the modern Ao Dai. The results are shown through a collection of modern Ao Dai with three artworks to which the pop art pattern is applied. In conclusion, fashion technology supports and develops traditional values in order to establish the Vietnamese national identity and distinguish it from other cultural values in the world.
Keywords: pop art pattern, Vietnamese national costume, modern ao dai, fashion design
Procedia PDF Downloads 283
16959 The Importance of Visual Communication in Artificial Intelligence
Authors: Manjitsingh Rajput
Abstract:
Visual communication plays an important role in artificial intelligence (AI) because it enables machines to understand and interpret visual information, similar to how humans do. This abstract explores the importance of visual communication in AI and emphasizes various applications such as computer vision, object emphasis recognition, image classification, and autonomous systems. Going deeper, it considers the deep learning techniques and neural networks that shape visual understanding. In addition to AI programming, the abstract discusses challenges facing visual interfaces for AI, such as data scarcity, domain optimization, and interpretability. The abstract also explores the integration of visual communication with other modalities, such as natural language processing and speech recognition. Overall, it highlights the critical role that visual communication plays in advancing AI capabilities and enabling machines to perceive and understand the world around them. The methodology explores the importance of visual communication in AI development and implementation, highlighting its potential to enhance the effectiveness and accessibility of AI systems. It provides a comprehensive approach to integrating visual elements into AI systems, making them more user-friendly and efficient. In conclusion, visual communication is crucial in AI systems for object recognition, facial analysis, and augmented reality, but challenges such as data quality, interpretability, and ethics must be addressed. Visual communication enhances user experience, decision-making, accessibility, and collaboration.
Developers can integrate visual elements for efficient and accessible AI systems.
Keywords: visual communication AI, computer vision, visual aid in communication, essence of visual communication
Procedia PDF Downloads 97
16958 Speech Recognition Performance by Adults: A Proposal for a Battery for Marathi
Authors: S. B. Rathna Kumar, Pranjali A Ujwane, Panchanan Mohanty
Abstract:
The present study aimed to develop a battery for assessing speech recognition performance by adults in Marathi. A total of four word lists were developed by considering word frequency, word familiarity, words in common use, and phonemic balance. Each word list consists of 25 words (15 monosyllabic words in CVC structure and 10 monosyllabic words in CVCV structure). Equivalence analysis and performance-intensity function testing were carried out using the four word lists on a total of 150 native speakers of Marathi belonging to different regions of Maharashtra (Vidarbha, Marathwada, Khandesh and Northern Maharashtra, Pune, and Konkan). The subjects were further divided equally into five groups based on the above-mentioned regions. There was no significant difference (p > 0.05) in speech recognition performance between groups for each word list or between word lists for each group. Hence, the four word lists developed were equally difficult for all the groups and can be used interchangeably. The performance-intensity (PI) function curve showed a semi-linear function, and the groups’ mean slope of the linear portions of the curve indicated an average linear slope of 4.64%, 4.73%, 4.68%, and 4.85% increase in word recognition score per dB for list 1, list 2, list 3, and list 4, respectively. Although no data are available on speech recognition tests for adults in Marathi, most of the findings of the study are in line with the findings of research reports on other languages. The four word lists thus developed were found to have sufficient reliability and validity in assessing speech recognition performance by adults in Marathi.
Keywords: speech recognition performance, phonemic balance, equivalence analysis, performance-intensity function testing, reliability, validity
Procedia PDF Downloads 358
16957 Face Recognition Using Body-Worn Camera: Dataset and Baseline Algorithms
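A slope such as the per-dB increases reported in this abstract is commonly obtained by fitting a straight line to the linear portion of the PI curve. The sketch below shows an ordinary least-squares slope under that assumption; the abstract does not state how the authors computed their slopes, and the presentation levels and scores used here are invented for illustration only.

```python
def linear_slope(levels_db, scores_pct):
    """Least-squares slope of word recognition score (%) against level (dB)."""
    n = len(levels_db)
    mean_x = sum(levels_db) / n
    mean_y = sum(scores_pct) / n
    # Covariance and variance terms of the simple linear regression formula.
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(levels_db, scores_pct))
    sxx = sum((x - mean_x) ** 2 for x in levels_db)
    return sxy / sxx


# Hypothetical points from the linear portion of one PI curve.
levels = [10, 15, 20, 25]   # presentation level in dB
scores = [20, 44, 68, 92]   # word recognition score in percent
slope = linear_slope(levels, scores)  # percent per dB
```

With these made-up points the score rises by 24 percentage points for every 5 dB, so the fitted slope is 4.8% per dB, the same order as the values quoted for the four lists.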
Authors: Ali Almadan, Anoop Krishnan, Ajita Rattani
Abstract:
Facial recognition is a widely adopted technology in surveillance, border control, healthcare, banking services, and, lately, mobile user authentication, with Apple introducing the “Face ID” moniker with the iPhone X. A lot of research has been conducted in the area of face recognition on datasets captured by surveillance cameras, DSLRs, and mobile devices. Recently, face recognition technology has also been deployed on body-worn cameras to keep officers safe, enable situational awareness, and provide evidence for trial. However, limited academic research has been conducted on this topic so far, and no publicly available dataset with a sufficient sample size exists. This paper aims to advance research in the area of face recognition using body-worn cameras. To this end, the contribution of this work is two-fold: (1) collection of a dataset consisting of a total of 136,939 facial images of 102 subjects captured using body-worn cameras in indoor and daylight conditions and (2) evaluation of various deep-learning architectures for face identification on the collected dataset. Experimental results suggest a maximum True Positive Rate (TPR) of 99.86% at a False Positive Rate (FPR) of 0.000, obtained by a SphereFace-based deep learning architecture in daylight conditions. The collected dataset and the baseline algorithms will promote further research and development. A download link for the dataset and the algorithms is available by contacting the authors.
Keywords: face recognition, body-worn cameras, deep learning, person identification
Procedia PDF Downloads 163
16956 A Novel Method for Face Detection
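A figure such as "TPR at a given FPR" can be read as the best true positive rate achievable at any decision threshold whose false positive rate stays within the stated limit. The sketch below computes that from two lists of similarity scores. It is a generic illustration: the function name and the scores are hypothetical, and the authors' actual evaluation protocol is not described in the abstract.

```python
def tpr_at_fpr(genuine, impostor, max_fpr):
    """Best TPR over thresholds whose FPR does not exceed max_fpr.

    genuine:  similarity scores of matching (same-person) comparisons
    impostor: similarity scores of non-matching comparisons
    A comparison is accepted when its score is >= the threshold.
    """
    best = 0.0
    for threshold in sorted(set(genuine) | set(impostor)):
        fpr = sum(s >= threshold for s in impostor) / len(impostor)
        if fpr <= max_fpr:
            tpr = sum(s >= threshold for s in genuine) / len(genuine)
            best = max(best, tpr)
    return best


# Hypothetical similarity scores, for illustration only.
genuine_scores = [0.90, 0.80, 0.75, 0.40]
impostor_scores = [0.70, 0.30, 0.20, 0.10]
strict = tpr_at_fpr(genuine_scores, impostor_scores, 0.0)
relaxed = tpr_at_fpr(genuine_scores, impostor_scores, 0.25)
```

In this toy data, allowing no false accepts forces the threshold above 0.70 and rejects one genuine comparison, while tolerating one false accept in four lets every genuine comparison through.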
Authors: H. Abas Nejad, A. R. Teymoori
Abstract:
Facial expression recognition is one of the open problems in computer vision. Robust neutral face recognition in real time is a major challenge for various supervised learning based facial expression recognition methods. This is because supervised methods cannot accommodate all appearance variability across faces with respect to race, pose, lighting, facial biases, etc., in the limited amount of training data. Moreover, processing each and every frame to classify emotions is not required, as the user stays neutral for the majority of the time in usual applications like video chat or photo album/web browsing. Detecting the neutral state at an early stage, and thereby excluding those frames from emotion classification, would save computational power. In this work, we propose a lightweight neutral vs. emotion classification engine, which acts as a preprocessor to the traditional supervised emotion classification approaches. It dynamically learns neutral appearance at Key Emotion (KE) points using a textural statistical model constructed from a set of reference neutral frames for each user. The proposed method is made robust to various types of user head motions by accounting for affine distortions based on a textural statistical model. Robustness to dynamic shifts of KE points is achieved by evaluating the similarities on a subset of neighborhood patches around each KE point, using prior information regarding the directionality of the specific facial action units acting on the respective KE point. As a result, the proposed method improves emotion recognition (ER) accuracy and simultaneously reduces the computational complexity of the ER system, as validated on multiple databases.
Keywords: neutral vs. emotion classification, Constrained Local Model, procrustes analysis, Local Binary Pattern Histogram, statistical model
Procedia PDF Downloads 340
16955 Pre-Analysis of Printed Circuit Boards Based on Multispectral Imaging for Vision Based Recognition of Electronics Waste
Authors: Florian Kleber, Martin Kampel
Abstract:
The increasing demand for gallium, indium and rare-earth elements for the production of electronics, e.g. solid-state lighting, photovoltaics, integrated circuits, and liquid crystal displays, will exceed the world-wide supply according to current forecasts. Recycling systems to reclaim these materials are not yet in place, which challenges the sustainability of these technologies. This paper proposes a multispectral imaging system as a basis for a vision based recognition system for valuable components of electronics waste. Multispectral images are intended to enhance the contrast of images of printed circuit boards (single components, as well as labels) for further analysis, such as optical character recognition and entire printed circuit board recognition. The results show that a higher contrast is achieved in the near infrared compared to ultraviolet and visible light.
Keywords: electronics waste, multispectral imaging, printed circuit boards, rare-earth elements
Procedia PDF Downloads 416
16954 Improved Performance in Content-Based Image Retrieval Using Machine Learning Approach
Authors: B. Ramesh Naik, T. Venugopal
Abstract:
This paper presents a novel machine learning approach that improves the high-level semantics of images. Contemporary approaches for image retrieval and object recognition include Fourier transforms, Wavelets, SIFT and HoG. Though these descriptors are helpful in a wide range of applications, they exploit zero-order statistics and therefore lack high descriptiveness of image features. These descriptors usually take advantage of primitive visual features such as shape, color, texture and spatial locations to describe images. These features are not adequate to describe the high-level semantics of the images. This leads to a semantic gap that causes unacceptable performance in image retrieval systems. A novel method, referred to as discriminative learning, is proposed; it is derived from a machine learning approach and efficiently discriminates image features. The analysis and results of the proposed approach were validated thoroughly on the WANG and Caltech-101 databases. The results show that this approach is very competitive in content-based image retrieval.
Keywords: CBIR, discriminative learning, region weight learning, scale invariant feature transforms
Procedia PDF Downloads 183
16953 Data Mining of Students' Performance Using Artificial Neural Network: Turkish Students as a Case Study
Authors: Samuel Nii Tackie, Oyebade K. Oyedotun, Ebenezer O. Olaniyi, Adnan Khashman
Abstract:
Artificial neural networks have been used in different fields of artificial intelligence, and more specifically in machine learning. Although other machine learning options are feasible in most situations, the ease with which neural networks lend themselves to different problems, including pattern recognition, image compression, classification, computer vision, regression, etc., has earned them a remarkable place in the machine learning field. This research exploits neural networks as a data mining tool for predicting the number of times a student repeats a course, considering some attributes relating to the course itself, the teacher, and the particular student. Neural networks were used in this work to map the relationship between some attributes related to students’ course assessment and the number of times a student will possibly repeat a course before passing. It is hoped that the ability to predict students’ performance from such complex relationships can help facilitate the fine-tuning of academic systems and policies implemented in learning environments. To validate the power of neural networks in data mining, a Turkish students’ performance database was used; feedforward and radial basis function networks were trained for this task; and the performances obtained from these networks were evaluated in terms of achieved recognition rates and training time.
Keywords: artificial neural network, data mining, classification, students’ evaluation
Procedia PDF Downloads 615
16952 The Combination of the Mel Frequency Cepstral Coefficients, Perceptual Linear Prediction, Jitter and Shimmer Coefficients for the Improvement of Automatic Recognition System for Dysarthric Speech
Authors: Brahim Fares Zaidi
Abstract:
Our work aims to improve our Automatic Recognition System for Dysarthric Speech, based on Hidden Markov Models and the Hidden Markov Model Toolkit, to help people who have pronunciation problems. We applied two speech parameterization techniques, based on Mel Frequency Cepstral Coefficients and Perceptual Linear Prediction, and concatenated them with jitter and shimmer coefficients in order to increase the recognition rate for dysarthric speech. For our tests, we used the NEMOURS database, which represents speakers with dysarthria and normal speakers.
Keywords: ARSDS, HTK, HMM, MFCC, PLP
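The feature combination described in this abstract can be illustrated with a minimal sketch. It assumes the common "local" definitions of jitter and shimmer (mean absolute difference between consecutive cycles, divided by the mean), and it assumes the two scalars are appended to every frame; the abstract specifies neither choice. The function names `relative_perturbation` and `augment_frames` are hypothetical and are not part of HTK.

```python
def relative_perturbation(values):
    """Mean absolute difference of consecutive values divided by their mean.

    Applied to pitch periods this gives a local-jitter style measure;
    applied to cycle peak amplitudes it gives a local-shimmer style measure.
    (Assumed definitions; the abstract does not state which variant was used.)
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def augment_frames(mfcc_frames, plp_frames, jitter, shimmer):
    """Concatenate per-frame MFCC and PLP vectors and append jitter and shimmer."""
    if len(mfcc_frames) != len(plp_frames):
        raise ValueError("MFCC and PLP streams must have the same number of frames")
    return [list(m) + list(p) + [jitter, shimmer]
            for m, p in zip(mfcc_frames, plp_frames)]


# Hypothetical usage with made-up cycle measurements and tiny feature vectors.
periods = [10.0, 12.0, 10.0, 12.0]      # pitch periods (arbitrary units)
amplitudes = [1.0, 1.0, 1.0, 1.0]       # cycle peak amplitudes
jitter = relative_perturbation(periods)
shimmer = relative_perturbation(amplitudes)
features = augment_frames([[1.0, 2.0]], [[3.0]], jitter, shimmer)
```

Appending utterance-level scalars to each frame keeps the frame count unchanged, so the augmented vectors could feed the same HMM training pipeline as the original ones.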
Procedia PDF Downloads 110
16951 Current Drainage Attack Correction via Adjusting the Attacking Saw-Function Asymmetry
Authors: Yuri Boiko, Iluju Kiringa, Tet Yeap
Abstract:
The previously suggested current drainage attack is further studied in regular settings of a closed-loop controlled Brushless DC (BLDC) motor with a Kalman filter in the feedback loop. Modeling and simulation experiments are conducted in a Matlab environment, implementing the closed-loop control model of BLDC motor operation in position sensorless mode under Kalman filter drive. The current increase in the motor windings is caused by the controller (a P-controller in our case) affected by false data injection, in which the angular velocity estimates are substituted with distorted values. The distortion is applied by multiplying the estimates by a distortion coefficient whose values are taken from a distortion function synchronized in its periodicity with the rotor’s position change. A saw function with a triangular tooth shape is studied here for the purpose of carrying out the bias injection with current drainage consequences. The specific focus is on how the asymmetry of the tooth in the saw function affects the flow of current drainage. The purpose is two-fold: (i) to produce and collect the signature of an asymmetric saw in the attack for a further pattern recognition process, and (ii) to determine conditions for improving the stealthiness of such an attack by regulating the asymmetry of the saw function used. It is found that modification of the symmetry in the saw tooth affects the periodicity of current drainage modulation. Specifically, the modulation frequency of the drained current for a fully asymmetric tooth shape coincides with the saw function modulation frequency itself. Increasing the symmetry parameter for the triangle tooth shape leads to an increase in the modulation frequency for the drained current. Moreover, this frequency reaches the switching frequency of the motor windings for fully symmetric triangular shapes, thus becoming undetectable and improving the stealthiness of the attack. 
Therefore, the collected signatures of the attack can serve for attack parameter identification via the pattern recognition route.
Keywords: bias injection attack, Kalman filter, BLDC motor, control system, closed loop, P-controller, PID-controller, current drainage, saw-function, asymmetry
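The distortion function described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' Matlab model: the symmetry parameter is assumed to lie in [0, 1] (0 for a pure ramp, 1 for an isosceles triangle), and the coefficient form `1 + depth * tooth` together with the names `saw_tooth`, `distortion_coefficient`, `inject_bias` and `depth` are hypothetical.

```python
import math


def saw_tooth(phase, symmetry):
    """Triangular tooth over one period, with values in [0, 1].

    symmetry = 0 gives a fully asymmetric tooth (linear ramp, abrupt drop);
    symmetry = 1 gives a fully symmetric triangle (apex at mid-period).
    """
    if not 0.0 <= symmetry <= 1.0:
        raise ValueError("symmetry must lie in [0, 1]")
    phase = phase % 1.0
    peak = 1.0 - symmetry / 2.0            # apex position within the period
    if phase < peak or peak >= 1.0:
        return phase / peak                # rising edge
    return (1.0 - phase) / (1.0 - peak)    # falling edge


def distortion_coefficient(rotor_angle, symmetry, depth=0.1):
    """Coefficient synchronized with rotor position (assumed form: 1 + depth * tooth)."""
    phase = rotor_angle / (2.0 * math.pi)  # one tooth per mechanical revolution (assumption)
    return 1.0 + depth * saw_tooth(phase, symmetry)


def inject_bias(omega_estimate, rotor_angle, symmetry, depth=0.1):
    """Replace the angular velocity estimate with its distorted value."""
    return omega_estimate * distortion_coefficient(rotor_angle, symmetry, depth)


# Hypothetical usage: sweep the rotor angle for a half-symmetric tooth.
samples = [inject_bias(100.0, 2.0 * math.pi * k / 8, symmetry=0.5) for k in range(8)]
```

Sweeping `symmetry` from 0 to 1 while holding `depth` fixed is one way to generate the family of attack signatures the abstract refers to.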
Procedia PDF Downloads 81
16950 Industrial Process Mining Based on Data Pattern Modeling and Nonlinear Analysis
Authors: Hyun-Woo Cho
Abstract:
Unexpected events may occur with serious impacts on industrial processes. This work utilizes a data representation technique to model and analyze process data patterns for the purpose of diagnosis. The use of a triangular representation of process data is evaluated using a simulated process. Furthermore, the effect of using different pre-treatment techniques based on linear or nonlinear reduced spaces is compared. This work extracts the fault pattern in the reduced space, not in the original data space. The results show that the diagnosis method based on the nonlinear technique produces more reliable results and outperforms the linear method.
Keywords: process monitoring, data analysis, pattern modeling, fault, nonlinear techniques
Procedia PDF Downloads 388
16949 Multimodal Data Fusion Techniques in Audiovisual Speech Recognition
Authors: Hadeer M. Sayed, Hesham E. El Deeb, Shereen A. Taie
Abstract:
In the big data era, we are facing a diversity of datasets from different sources in different domains that describe a single life event. These datasets consist of multiple modalities, each of which has a different representation, distribution, scale, and density. Multimodal fusion is the concept of integrating information from multiple modalities in a joint representation with the goal of predicting an outcome through a classification or regression task. In this paper, multimodal fusion techniques are classified into two main classes: model-agnostic techniques and model-based approaches. The paper provides a comprehensive study of recent research in each class and outlines the benefits and limitations of each. Furthermore, the audiovisual speech recognition task is presented as a case study of multimodal data fusion approaches, and the open issues arising from the limitations of current studies are discussed. This paper can serve as a guide for researchers interested in multimodal data fusion, and in audiovisual speech recognition in particular.
Keywords: multimodal data, data fusion, audio-visual speech recognition, neural networks
Procedia PDF Downloads 114
16948 Distant Speech Recognition Using Laser Doppler Vibrometer
Authors: Yunbin Deng
Abstract:
Most existing applications of automatic speech recognition rely on cooperative subjects at a short distance from a microphone. Standoff speech recognition using microphone arrays can extend the subject-to-sensor distance somewhat, but it is still limited to only a few feet. As such, most deployed applications of standoff speech recognition are limited to indoor use at short range. Moreover, these applications require an air pathway between the subject and the sensor to achieve a reasonable signal-to-noise ratio. This study reports long range (50 feet) automatic speech recognition experiments using a Laser Doppler Vibrometer (LDV) sensor. This study shows that the LDV sensor modality can extend the speech acquisition standoff distance far beyond microphone arrays to hundreds of feet. In addition, LDV enables 'listening' through windows for uncooperative subjects. This enables new capabilities in automatic audio and speech intelligence, surveillance, and reconnaissance (ISR) for law enforcement, homeland security and counter terrorism applications. The Polytec LDV model OFV-505 is used in this study. To investigate the impact of different vibrating materials, five parallel LDV speech corpora, each consisting of 630 speakers, are collected from the vibrations of a glass window, a metal plate, a plastic box, a wood slate, and a concrete wall. These are common materials the application could encounter in daily life. These data were compared with their microphone counterpart to show the impact of the various materials on the spectrum of the LDV speech signal. State-of-the-art deep neural network modeling approaches are used to conduct continuous speaker-independent speech recognition on these LDV speech datasets. Preliminary phoneme recognition results using a time-delay neural network, bi-directional long short-term memory, and model fusion show great promise for using LDV for long range speech recognition. 
To the author’s best knowledge, this is the first time an LDV has been reported for a long distance speech recognition application.
Keywords: covert speech acquisition, distant speech recognition, DSR, laser Doppler vibrometer, LDV, speech intelligence surveillance and reconnaissance, ISR
Procedia PDF Downloads 180
16947 Interactive Shadow Play Animation System
Authors: Bo Wan, Xiu Wen, Lingling An, Xiaoling Ding
Abstract:
The paper describes a Chinese shadow play animation system based on Kinect. Users, without any professional training, can personally manipulate the shadow characters with their body actions to complete a shadow play performance, and can obtain a shadow play video by giving the record command to the system. In our system, Kinect is responsible for capturing human movement and voice command data. A gesture recognition module is used to control the change of the shadow play scenes. After packaging the data from Kinect and the recognition result from the gesture recognition module, VRPN transmits them to the server side. Finally, the server side uses the information to control the motion of the shadow characters and the video recording. This system not only achieves human-computer interaction, but also realizes interaction between people. It brings an entertaining experience to users and is easy to operate for all ages. Even more important, the application background of Chinese shadow play embodies the protection of the art of shadow play animation.
Keywords: shadow play animation, Kinect, gesture recognition, VRPN, HCI
Procedia PDF Downloads 402
16946 Reed: An Approach Towards Quickly Bootstrapping Multilingual Acoustic Models
Authors: Bipasha Sen, Aditya Agarwal
Abstract:
A multilingual automatic speech recognition (ASR) system is a single entity capable of transcribing multiple languages sharing a common phone space. The performance of such a system is highly dependent on the compatibility of the languages. State-of-the-art speech recognition systems are built using sequential architectures based on recurrent neural networks (RNN), limiting the computational parallelization in training. This poses a significant challenge in terms of the time taken to bootstrap and validate the compatibility of multiple languages for building a robust multilingual system. Complex architectural choices based on self-attention networks are made to improve the parallelization, thereby reducing the training time. In this work, we propose Reed, a simple system based on 1D convolutions which uses very short context to improve the training time. To improve the performance of our system, we use raw time-domain speech signals directly as input. This enables the convolutional layers to learn feature representations rather than relying on handcrafted features such as MFCC. We report improvements in training and inference times by at least a factor of 4x and 7.4x respectively, with comparable WERs against standard RNN based baseline systems on SpeechOcean's multilingual low resource dataset.
Keywords: convolutional neural networks, language compatibility, low resource languages, multilingual automatic speech recognition
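The core operation behind a system of this kind, a 1D convolution applied directly to raw waveform samples, can be sketched in a few lines. This is a generic illustration and not the Reed architecture: the function name `conv1d`, the kernel values and the stride are hypothetical, and the kernel length stands in for the "very short context" mentioned in the abstract.

```python
def conv1d(signal, kernel, stride=1):
    """Valid (no padding) 1D convolution as used in neural networks.

    Each output sample depends only on len(kernel) consecutive input samples,
    so all outputs can be computed independently of one another, unlike the
    step-by-step recurrence of an RNN.
    """
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    k = len(kernel)
    if k == 0 or k > len(signal):
        raise ValueError("kernel must be non-empty and no longer than the signal")
    return [
        sum(signal[start + j] * kernel[j] for j in range(k))
        for start in range(0, len(signal) - k + 1, stride)
    ]


# Hypothetical usage: a three-sample context over a toy "waveform".
waveform = [1.0, 2.0, 3.0, 4.0, 5.0]
features = conv1d(waveform, [1.0, 0.0, -1.0])
downsampled = conv1d(waveform, [1.0, 0.0, -1.0], stride=2)
```

In a trained model the kernel values would be learned from data, which is how convolutional layers can replace handcrafted features such as MFCC.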
Procedia PDF Downloads 124
16945 Neural Network Approach to Classifying Truck Traffic
Authors: Ren Moses
Abstract:
The process of classifying vehicles on a highway is viewed here as a pattern recognition problem in which connectionist techniques such as artificial neural networks (ANN) can be used to assign vehicles to their correct classes and hence to establish optimum axle spacing thresholds. In the United States, vehicles are typically classified into 13 classes using a methodology commonly referred to as “Scheme F”. In this research, an ANN model was developed, trained, and applied to field data of vehicles. The data comprised three vehicular features: axle spacing, number of axles per vehicle, and overall vehicle weight. The ANN reduced the classification error rate from 9.5 percent to 6.2 percent when compared to an existing classification algorithm that is not ANN-based and which uses two vehicular features for classification, that is, axle spacing and number of axles. The inclusion of overall vehicle weight as a third classification variable further reduced the error rate from 6.2 percent to only 3.0 percent. The promising results from the neural networks were used to set up new thresholds that reduce the classification error rate.
Keywords: artificial neural networks, vehicle classification, traffic flow, traffic analysis, and highway operations
Procedia PDF Downloads 312