Search results for: speaker recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1727

Search results for: speaker recognition

1367 Design and Development of Novel Anion Selective Chemosensors Derived from Vitamin B6 Cofactors

Authors: Darshna Sharma, Suban K. Sahoo

Abstract:

The detection of intracellular fluoride in human cancer cell HeLa was achieved by chemosensors derived from vitamin B6 cofactors using fluorescence imaging technique. These sensors were first synthesized by condensation of pyridoxal/pyridoxal phosphate with 2-amino(thio)phenol. The anion recognition ability was explored by experimental (UV-VIS, fluorescence and 1H NMR) and theoretical DFT [(B3LYP/6-31G(d,p)] methods in DMSO and mixed DMSO-H2O system. All the developed sensors showed both naked-eye detectable color change and remarkable fluorescence enhancement in the presence of F- and AcO-. The anion recognition was occurred through the formation of hydrogen bonded complexes between these anions and sensor, followed by the partial deprotonation of sensor. The detection limit of these sensors were down to micro(nano) molar level of F- and AcO-.

Keywords: chemosensors, fluoride, acetate, turn-on, live cells imaging, DFT

Procedia PDF Downloads 377
1366 A Review on Predictive Sound Recognition System

Authors: Ajay Kadam, Ramesh Kagalkar

Abstract:

The proposed research objective is to add to a framework for programmed recognition of sound. In this framework the real errand is to distinguish any information sound stream investigate it & anticipate the likelihood of diverse sounds show up in it. To create and industrially conveyed an adaptable sound web crawler a flexible sound search engine. The calculation is clamor and contortion safe, computationally productive, and hugely adaptable, equipped for rapidly recognizing a short portion of sound stream caught through a phone microphone in the presence of frontal area voices and other predominant commotion, and through voice codec pressure, out of a database of over accessible tracks. The algorithm utilizes a combinatorial hashed time-recurrence group of stars examination of the sound, yielding ordinary properties, for example, transparency, in which numerous tracks combined may each be distinguished.

Keywords: fingerprinting, pure tone, white noise, hash function

Procedia PDF Downloads 296
1365 Dynamic Gabor Filter Facial Features-Based Recognition of Emotion in Video Sequences

Authors: T. Hari Prasath, P. Ithaya Rani

Abstract:

In the world of visual technology, recognizing emotions from the face images is a challenging task. Several related methods have not utilized the dynamic facial features effectively for high performance. This paper proposes a method for emotions recognition using dynamic facial features with high performance. Initially, local features are captured by Gabor filter with different scale and orientations in each frame for finding the position and scale of face part from different backgrounds. The Gabor features are sent to the ensemble classifier for detecting Gabor facial features. The region of dynamic features is captured from the Gabor facial features in the consecutive frames which represent the dynamic variations of facial appearances. In each region of dynamic features is normalized using Z-score normalization method which is further encoded into binary pattern features with the help of threshold values. The binary features are passed to Multi-class AdaBoost classifier algorithm with the well-trained database contain happiness, sadness, surprise, fear, anger, disgust, and neutral expressions to classify the discriminative dynamic features for emotions recognition. The developed method is deployed on the Ryerson Multimedia Research Lab and Cohn-Kanade databases and they show significant performance improvement owing to their dynamic features when compared with the existing methods.

Keywords: detecting face, Gabor filter, multi-class AdaBoost classifier, Z-score normalization

Procedia PDF Downloads 246
1364 Image Rotation Using an Augmented 2-Step Shear Transform

Authors: Hee-Choul Kwon, Heeyong Kwon

Abstract:

Image rotation is one of main pre-processing steps for image processing or image pattern recognition. It is implemented with a rotation matrix multiplication. It requires a lot of floating point arithmetic operations and trigonometric calculations, so it takes a long time to execute. Therefore, there has been a need for a high speed image rotation algorithm without two major time-consuming operations. However, the rotated image has a drawback, i.e. distortions. We solved the problem using an augmented two-step shear transform. We compare the presented algorithm with the conventional rotation with images of various sizes. Experimental results show that the presented algorithm is superior to the conventional rotation one.

Keywords: high-speed rotation operation, image rotation, transform matrix, image processing, pattern recognition

Procedia PDF Downloads 240
1363 Musical Instrument Recognition in Polyphonic Audio Through Convolutional Neural Networks and Spectrograms

Authors: Rujia Chen, Akbar Ghobakhlou, Ajit Narayanan

Abstract:

This study investigates the task of identifying musical instruments in polyphonic compositions using Convolutional Neural Networks (CNNs) from spectrogram inputs, focusing on binary classification. The model showed promising results, with an accuracy of 97% on solo instrument recognition. When applied to polyphonic combinations of 1 to 10 instruments, the overall accuracy was 64%, reflecting the increasing challenge with larger ensembles. These findings contribute to the field of Music Information Retrieval (MIR) by highlighting the potential and limitations of current approaches in handling complex musical arrangements. Future work aims to include a broader range of musical sounds, including electronic and synthetic sounds, to improve the model's robustness and applicability in real-time MIR systems.

Keywords: binary classifier, CNN, spectrogram, instrument

Procedia PDF Downloads 1
1362 Auto Classification of Multiple ECG Arrhythmic Detection via Machine Learning Techniques: A Review

Authors: Ng Liang Shen, Hau Yuan Wen

Abstract:

Arrhythmia analysis of ECG signal plays a major role in diagnosing most of the cardiac diseases. Therefore, a single arrhythmia detection of an electrocardiographic (ECG) record can determine multiple pattern of various algorithms and match accordingly each ECG beats based on Machine Learning supervised learning. These researchers used different features and classification methods to classify different arrhythmia types. A major problem in these studies is the fact that the symptoms of the disease do not show all the time in the ECG record. Hence, a successful diagnosis might require the manual investigation of several hours of ECG records. The point of this paper presents investigations cardiovascular ailment in Electrocardiogram (ECG) Signals for Cardiac Arrhythmia utilizing examination of ECG irregular wave frames via heart beat as correspond arrhythmia which with Machine Learning Pattern Recognition.

Keywords: electrocardiogram, ECG, classification, machine learning, pattern recognition, detection, QRS

Procedia PDF Downloads 339
1361 Vision-Based Daily Routine Recognition for Healthcare with Transfer Learning

Authors: Bruce X. B. Yu, Yan Liu, Keith C. C. Chan

Abstract:

We propose to record Activities of Daily Living (ADLs) of elderly people using a vision-based system so as to provide better assistive and personalization technologies. Current ADL-related research is based on data collected with help from non-elderly subjects in laboratory environments and the activities performed are predetermined for the sole purpose of data collection. To obtain more realistic datasets for the application, we recorded ADLs for the elderly with data collected from real-world environment involving real elderly subjects. Motivated by the need to collect data for more effective research related to elderly care, we chose to collect data in the room of an elderly person. Specifically, we installed Kinect, a vision-based sensor on the ceiling, to capture the activities that the elderly subject performs in the morning every day. Based on the data, we identified 12 morning activities that the elderly person performs daily. To recognize these activities, we created a HARELCARE framework to investigate into the effectiveness of existing Human Activity Recognition (HAR) algorithms and propose the use of a transfer learning algorithm for HAR. We compared the performance, in terms of accuracy, and training progress. Although the collected dataset is relatively small, the proposed algorithm has a good potential to be applied to all daily routine activities for healthcare purposes such as evidence-based diagnosis and treatment.

Keywords: daily activity recognition, healthcare, IoT sensors, transfer learning

Procedia PDF Downloads 107
1360 On Overcoming Common Oral Speech Problems through Authentic Films

Authors: Tamara Matevosyan

Abstract:

The present paper discusses the main problems that students face while developing oral skills through authentic films. It states that special attention should be paid not only to the study of verbal speech but also to non-verbal communication. Authentic films serve as an important tool to understand both native speaker’s gestures and their culture of pausing while speaking. Various phonetic difficulties causing phonetic interference in actual speech are covered in the paper emphasizing the role of authentic films in overcoming them.

Keywords: compressive speech, filled pauses, unfilled pauses, pausing culture

Procedia PDF Downloads 315
1359 Effect of the Keyword Strategy on Lexical Semantic Acquisition: Recognition, Retention and Comprehension in an English as Second Language Context

Authors: Fatima Muhammad Shitu

Abstract:

This study seeks to investigate the effect of the keyword strategy on lexico–semantic acquisition, recognition, retention and comprehension in an ESL context. The aim of the study is to determine whether the keyword strategy can be used to enhance acquisition. As a quasi- experimental research, the objectives of the study include: To determine the extent to which the scores obtained by the subjects, who were trained on the use of the keyword strategy for acquisition, differ at the pre-tests and the post–tests and also to find out the relationship in the scores obtained at these tests levels. The sample for the study consists of 300 hundred undergraduate ESL Students in the Federal College of Education, Kano. The seventy-five lexical items for acquisition belong to the lexical field category known as register, and they include Medical, Agriculture and Photography registers (MAP). These were divided in the ratio twenty-five (25) lexical items in each lexical field. The testing technique was used to collect the data while the descriptive and inferential statistics were employed for data analysis. For the purpose of testing, the two kinds of tests administered at each test level include the WARRT (Word Acquisition, Recognition, and Retention Test) and the CCPT (Cloze Comprehension Passage Test). The results of the study revealed that there are significant differences in the scores obtained between the pre-tests, and the post–tests and there are no correlations in the scores obtained as well. This implies that the keyword strategy has effectively enhanced the acquisition of the lexical items studied.

Keywords: keyword, lexical, semantics, strategy

Procedia PDF Downloads 284
1358 Speech Emotion Recognition: A DNN and LSTM Comparison in Single and Multiple Feature Application

Authors: Thiago Spilborghs Bueno Meyer, Plinio Thomaz Aquino Junior

Abstract:

Through speech, which privileges the functional and interactive nature of the text, it is possible to ascertain the spatiotemporal circumstances, the conditions of production and reception of the discourse, the explicit purposes such as informing, explaining, convincing, etc. These conditions allow bringing the interaction between humans closer to the human-robot interaction, making it natural and sensitive to information. However, it is not enough to understand what is said; it is necessary to recognize emotions for the desired interaction. The validity of the use of neural networks for feature selection and emotion recognition was verified. For this purpose, it is proposed the use of neural networks and comparison of models, such as recurrent neural networks and deep neural networks, in order to carry out the classification of emotions through speech signals to verify the quality of recognition. It is expected to enable the implementation of robots in a domestic environment, such as the HERA robot from the RoboFEI@Home team, which focuses on autonomous service robots for the domestic environment. Tests were performed using only the Mel-Frequency Cepstral Coefficients, as well as tests with several characteristics of Delta-MFCC, spectral contrast, and the Mel spectrogram. To carry out the training, validation and testing of the neural networks, the eNTERFACE’05 database was used, which has 42 speakers from 14 different nationalities speaking the English language. The data from the chosen database are videos that, for use in neural networks, were converted into audios. It was found as a result, a classification of 51,969% of correct answers when using the deep neural network, when the use of the recurrent neural network was verified, with the classification with accuracy equal to 44.09%. The results are more accurate when only the Mel-Frequency Cepstral Coefficients are used for the classification, using the classifier with the deep neural network, and in only one case, it is possible to observe a greater accuracy by the recurrent neural network, which occurs in the use of various features and setting 73 for batch size and 100 training epochs.

Keywords: emotion recognition, speech, deep learning, human-robot interaction, neural networks

Procedia PDF Downloads 130
1357 Bedouin Dispersion in Israel: Between Sustainable Development and Social Non-Recognition

Authors: Tamir Michal

Abstract:

The subject of Bedouin dispersion has accompanied the State of Israel from the day of its establishment. From a legal point of view, this subject has offered a launchpad for creative judicial decisions. Thus, for example, the first court decision in Israel to recognize affirmative action (Avitan), dealt with a petition submitted by a Jew appealing the refusal of the State to recognize the Petitioner’s entitlement to the long-term lease of a plot designated for Bedouins. The Supreme Court dismissed the petition, holding that there existed a public interest in assisting Bedouin to establish permanent urban settlements, an interest which justifies giving them preference by selling them plots at subsidized prices. In another case (The Forum for Coexistence in the Negev) the Supreme Court extended equitable relief for the purpose of constructing a bridge, even though the construction infringed the Law, in order to allow the children of dispersed Bedouin to reach school. Against this background, the recent verdict, delivered during the Protective Edge military campaign, which dismissed a petition aimed at forcing the State to spread out Protective Structures in Bedouin villages in the Negev against the risk of being hit from missiles launched from Gaza (Abu Afash) is disappointing. Even if, in arguendo, no selective discrimination was involved in the State’s decision not to provide such protection, the decision, and its affirmation by the Court, is problematic when examined through the prism of the Theory of Recognition. The article analyses the issue by tools of theory of Recognition, according to which people develop their identities through mutual relations of recognition in different fields. In the social context, the path to recognition is cognitive respect, which is provided by means of legal rights. By seeing other participants in Society as bearers of rights and obligations, the individual develops an understanding of his legal condition as reflected in the attitude to others. Consequently, even if the Court’s decision may be justified on strict legal grounds, the fact that Jewish settlements were protected during the military operation, whereas Bedouin villages were not, is a setback in the struggle to make the Bedouin citizens with equal rights in Israeli society. As the Court held, ‘Beyond their protective function, the Migunit [Protective Structures] may make a moral and psychological contribution that should not be undervalued’. This contribution is one that the Bedouin did not receive in the Abu Afash verdict. The basic thesis is that the Court’s verdict analyzed above clearly demonstrates that the reliance on classical liberal instruments (e.g., equality) cannot secure full appreciation of all aspects of Bedouin life, and hence it can in fact prejudice them. Therefore, elements of the recognition theory should be added, in order to find the channel for cognitive dignity, thereby advancing the Bedouins’ ability to perceive themselves as equal human beings in the Israeli society.

Keywords: bedouin dispersion, cognitive respect, recognition theory, sustainable development

Procedia PDF Downloads 327
1356 2L1, a Bridge between L1 and L2

Authors: Elena Ginghina

Abstract:

There are two major categories of language acquisition: first and second language acquisition, which distinguish themselves in their learning process and in their ultimate attainment. However, in the case of a bilingual child, one of the languages he grows up with receives gradually the features of a second language. This phenomenon characterizes the successive first language acquisition, when the initial state of the child is already marked by another language. Nevertheless, the dominance of the languages can change throughout the life, if the exposure to language and the quality of the input are better in 2L1. Related to the exposure to language and the quality of the input, there are cases even at the simultaneous bilingualism, where the two languages although learned from birth one, differ from one another at some point. This paper aims to see, what makes a 2L1 to become a second language and under what circumstances can a L2 learner reach a native or a near native speaker level.

Keywords: bilingualism, first language acquisition, native speakers of German, second language acquisition

Procedia PDF Downloads 540
1355 Myanmar Consonants Recognition System Based on Lip Movements Using Active Contour Model

Authors: T. Thein, S. Kalyar Myo

Abstract:

Human uses visual information for understanding the speech contents in noisy conditions or in situations where the audio signal is not available. The primary advantage of visual information is that it is not affected by the acoustic noise and cross talk among speakers. Using visual information from the lip movements can improve the accuracy and robustness of automatic speech recognition. However, a major challenge with most automatic lip reading system is to find a robust and efficient method for extracting the linguistically relevant speech information from a lip image sequence. This is a difficult task due to variation caused by different speakers, illumination, camera setting and the inherent low luminance and chrominance contrast between lip and non-lip region. Several researchers have been developing methods to overcome these problems; the one is lip reading. Moreover, it is well known that visual information about speech through lip reading is very useful for human speech recognition system. Lip reading is the technique of a comprehensive understanding of underlying speech by processing on the movement of lips. Therefore, lip reading system is one of the different supportive technologies for hearing impaired or elderly people, and it is an active research area. The need for lip reading system is ever increasing for every language. This research aims to develop a visual teaching method system for the hearing impaired persons in Myanmar, how to pronounce words precisely by identifying the features of lip movement. The proposed research will work a lip reading system for Myanmar Consonants, one syllable consonants (င (Nga)၊ ည (Nya)၊ မ (Ma)၊ လ (La)၊ ၀ (Wa)၊ သ (Tha)၊ ဟ (Ha)၊ အ (Ah) ) and two syllable consonants ( က(Ka Gyi)၊ ခ (Kha Gway)၊ ဂ (Ga Nge)၊ ဃ (Ga Gyi)၊ စ (Sa Lone)၊ ဆ (Sa Lain)၊ ဇ (Za Gwe) ၊ ဒ (Da Dway)၊ ဏ (Na Gyi)၊ န (Na Nge)၊ ပ (Pa Saug)၊ ဘ (Ba Gone)၊ ရ (Ya Gaug)၊ ဠ (La Gyi) ). In the proposed system, there are three subsystems, the first one is the lip localization system, which localizes the lips in the digital inputs. The next one is the feature extraction system, which extracts features of lip movement suitable for visual speech recognition. And the final one is the classification system. In the proposed research, Two Dimensional Discrete Cosine Transform (2D-DCT) and Linear Discriminant Analysis (LDA) with Active Contour Model (ACM) will be used for lip movement features extraction. Support Vector Machine (SVM) classifier is used for finding class parameter and class number in training set and testing set. Then, experiments will be carried out for the recognition accuracy of Myanmar consonants using the only visual information on lip movements which are useful for visual speech of Myanmar languages. The result will show the effectiveness of the lip movement recognition for Myanmar Consonants. This system will help the hearing impaired persons to use as the language learning application. This system can also be useful for normal hearing persons in noisy environments or conditions where they can find out what was said by other people without hearing voice.

Keywords: feature extraction, lip reading, lip localization, Active Contour Model (ACM), Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), Two Dimensional Discrete Cosine Transform (2D-DCT)

Procedia PDF Downloads 260
1354 Understanding Europe’s Role in the Area of Liberty, Security, and Justice as an International Actor

Authors: Barrere Sarah

Abstract:

The area of liberty, security, and justice within the European Union is still a work in progress. No one can deny that the EU struggles between a monistic and a dualist approach. The aim of our essay is to first review how the European law is perceived by the rest of the international scene. It will then discuss two main mechanisms at play: the interpretation of larger international treaties and the penal mechanisms of European law. Finally, it will help us understand the role of a penal Europe on the international scene with concrete examples. Special attention will be paid to cases that deal with fundamental rights as they represent an interesting case study in Europe and in the rest of the World. It could illustrate the aforementioned duality currently present in the Union’s interpretation of international public law. On the other hand, it will explore some specific European penal mechanism through mutual recognition and the European arrest warrant in the transnational criminality frame. Concerning the interpretation of the treaties, it will first, underline the ambiguity and the general nature of some treaties that leave the EU exposed to tension and misunderstanding then it will review the validity of an EU act (whether or not it is compatible with the rules of International law). Finally, it will focus on the most complete manifestation of liberty, security and justice through the principle of mutual recognition. Used initially in commercial matters, it has become “the cornerstone” of European construction. It will see how it is applied in judicial decisions (its main event and achieving success is via the European arrest warrant) and how European member states have managed to develop this cooperation.

Keywords: European penal law, international scene, liberty security and justice area, mutual recognition

Procedia PDF Downloads 374
1353 Transportation Language Register as One of Language Community

Authors: Diyah Atiek Mustikawati

Abstract:

Language register refers to a variety of a language used for particular purpose or in a particular social setting. Language register also means as a concept of adapting one’s use of language to conform to standards or tradition in a given professional or social situation. This descriptive study tends to discuss about the form of language register in transportation aspect, factors, also the function of use it. Mostly, language register in transportation aspect uses short sentences in form of informal register. The factor caused language register used are speaker, word choice, background of language. The functions of language register in transportations aspect are to make communication between crew easily, also to keep safety when they were in bad condition. Transportation language register developed naturally as one of variety of language used.

Keywords: language register, language variety, communication, transportation

Procedia PDF Downloads 442
1352 The Image as an Initial Element of the Cognitive Understanding of Words

Authors: S. Pesina, T. Solonchak

Abstract:

An analysis of word semantics focusing on the invariance of advanced imagery in several pressing problems. Interest in the language of imagery is caused by the introduction, in the linguistics sphere, of a new paradigm, the center of which is the personality of the speaker (the subject of the language). Particularly noteworthy is the question of the place of the image when discussing the lexical, phraseological values and the relationship of imagery and metaphors. In part, the formation of a metaphor, as an interaction between two intellective entities, occurs at a cognitive level, and it is the category of the image, having cognitive roots, which aides in the correct interpretation of the results of this process on the lexical-semantic level.

Keywords: image, metaphor, concept, creation of a metaphor, cognitive linguistics, erased image, vivid image

Procedia PDF Downloads 323
1351 Hand Gestures Based Emotion Identification Using Flex Sensors

Authors: S. Ali, R. Yunus, A. Arif, Y. Ayaz, M. Baber Sial, R. Asif, N. Naseer, M. Jawad Khan

Abstract:

In this study, we have proposed a gesture to emotion recognition method using flex sensors mounted on metacarpophalangeal joints. The flex sensors are fixed in a wearable glove. The data from the glove are sent to PC using Wi-Fi. Four gestures: finger pointing, thumbs up, fist open and fist close are performed by five subjects. Each gesture is categorized into sad, happy, and excited class based on the velocity and acceleration of the hand gesture. Seventeen inspectors observed the emotions and hand gestures of the five subjects. The emotional state based on the investigators assessment and acquired movement speed data is compared. Overall, we achieved 77% accurate results. Therefore, the proposed design can be used for emotional state detection applications.

Keywords: emotion identification, emotion models, gesture recognition, user perception

Procedia PDF Downloads 244
1350 Comparison of Various Classification Techniques Using WEKA for Colon Cancer Detection

Authors: Beema Akbar, Varun P. Gopi, V. Suresh Babu

Abstract:

Colon cancer causes the deaths of about half a million people every year. The common method of its detection is histopathological tissue analysis, it leads to tiredness and workload to the pathologist. A novel method is proposed that combines both structural and statistical pattern recognition used for the detection of colon cancer. This paper presents a comparison among the different classifiers such as Multilayer Perception (MLP), Sequential Minimal Optimization (SMO), Bayesian Logistic Regression (BLR) and k-star by using classification accuracy and error rate based on the percentage split method. The result shows that the best algorithm in WEKA is MLP classifier with an accuracy of 83.333% and kappa statistics is 0.625. The MLP classifier which has a lower error rate, will be preferred as more powerful classification capability.

Keywords: colon cancer, histopathological image, structural and statistical pattern recognition, multilayer perception

Procedia PDF Downloads 547
1349 LuMee: A Centralized Smart Protector for School Children who are Using Online Education

Authors: Lumindu Dilumka, Ranaweera I. D., Sudusinghe S. P., Sanduni Kanchana A. M. K.

Abstract:

This study was motivated by the challenges experienced by parents and guardians in ensuring the safety of children in cyberspace. In the last two or three years, online education has become very popular all over the world due to the Covid 19 pandemic. Therefore, parents, guardians and teachers must ensure the safety of children in cyberspace. Children are more likely to go astray and there are plenty of online programs are waiting to get them on the wrong track and also, children who are engaging in the online education can be distracted at any moment. Therefore, parents should keep a close check on their children's online activity. Apart from that, due to the unawareness of children, they tempt to share their sensitive information, causing a chance of being a victim of phishing attacks from outsiders. These problems can be overcome through the proposed web-based system. We use feature extraction, web tracking and analysis mechanisms, image processing and name entity recognition to implement this web-based system.

Keywords: online education, cyber bullying, social media, face recognition, web tracker, privacy data

Procedia PDF Downloads 52
1348 Training a Neural Network to Segment, Detect and Recognize Numbers

Authors: Abhisek Dash

Abstract:

This study had three neural networks, one for number segmentation, one for number detection and one for number recognition all of which are coupled to one another. All networks were trained on the MNIST dataset and were convolutional. It was assumed that the images had lighter background and darker foreground. The segmentation network took 28x28 images as input and had sixteen outputs. Segmentation training starts when a dark pixel is encountered. Taking a window(7x7) over that pixel as focus, the eight neighborhood of the focus was checked for further dark pixels. The segmentation network was then trained to move in those directions which had dark pixels. To this end the segmentation network had 16 outputs. They were arranged as “go east”, ”don’t go east ”, “go south east”, “don’t go south east”, “go south”, “don’t go south” and so on w.r.t focus window. The focus window was resized into a 28x28 image and the network was trained to consider those neighborhoods which had dark pixels. The neighborhoods which had dark pixels were pushed into a queue in a particular order. The neighborhoods were then popped one at a time stitched to the existing partial image of the number one at a time and trained on which neighborhoods to consider when the new partial image was presented. The above process was repeated until the image was fully covered by the 7x7 neighborhoods and there were no more uncovered black pixels. During testing the network scans and looks for the first dark pixel. From here on the network predicts which neighborhoods to consider and segments the image. After this step the group of neighborhoods are passed into the detection network. The detection network took 28x28 images as input and had two outputs denoting whether a number was detected or not. Since the ground truth of the bounds of a number was known during training the detection network outputted in favor of number not found until the bounds were not met and vice versa. The recognition network was a standard CNN that also took 28x28 images and had 10 outputs for recognition of numbers from 0 to 9. This network was activated only when the detection network votes in favor of number detected. The above methodology could segment connected and overlapping numbers. Additionally the recognition unit was only invoked when a number was detected which minimized false positives. It also eliminated the need for rules of thumb as segmentation is learned. The strategy can also be extended to other characters as well.

Keywords: convolutional neural networks, OCR, text detection, text segmentation

Procedia PDF Downloads 129
1347 A Fast Version of the Generalized Multi-Directional Radon Transform

Authors: Ines Elouedi, Atef Hammouda

Abstract:

This paper presents a new fast version of the generalized Multi-Directional Radon Transform method. The new method uses the inverse Fast Fourier Transform to lead to a faster Generalized Radon projections. We prove in this paper that the fast algorithm leads to almost the same results of the eldest one but with a considerable lower time computation cost. The projection end result of the fast method is a parameterized Radon space where a high valued pixel allows the detection of a curve from the original image. The proposed fast inversion algorithm leads to an exact reconstruction of the initial image from the Radon space. We show examples of the impact of this algorithm on the pattern recognition domain.

Keywords: fast generalized multi-directional Radon transform, curve, exact reconstruction, pattern recognition

Procedia PDF Downloads 248
1346 Algorithm for Recognizing Trees along Power Grid Using Multispectral Imagery

Authors: C. Hamamura, V. Gialluca

Abstract:

Much of the Eclectricity Distributors has about 70% of its electricity interruptions arising from cause "trees", alone or associated with wind and rain and with or without falling branch and / or trees. This contributes inexorably and significantly to outages, resulting in high costs as compensation in addition to the operation and maintenance costs. On the other hand, there is little data structure and solutions to better organize the trees pruning plan effectively, minimizing costs and environmentally friendly. This work describes the development of an algorithm to provide data of trees associated to power grid. The method is accomplished on several steps using satellite imagery and geographically vectorized grid. A sliding window like approach is performed to seek the area around the grid. The proposed method counted 764 trees on a patch of the grid, which was very close to the 738 trees counted manually. The trees data was used as a part of a larger project that implements a system to optimize tree pruning plan.

Keywords: image pattern recognition, trees pruning, trees recognition, neural network

Procedia PDF Downloads 475
1345 Robust Features for Impulsive Noisy Speech Recognition Using Relative Spectral Analysis

Authors: Hajer Rahali, Zied Hajaiej, Noureddine Ellouze

Abstract:

The goal of speech parameterization is to extract the relevant information about what is being spoken from the audio signal. In speech recognition systems Mel-Frequency Cepstral Coefficients (MFCC) and Relative Spectral Mel-Frequency Cepstral Coefficients (RASTA-MFCC) are the two main techniques used. It will be shown in this paper that it presents some modifications to the original MFCC method. In our work the effectiveness of proposed changes to MFCC called Modified Function Cepstral Coefficients (MODFCC) were tested and compared against the original MFCC and RASTA-MFCC features. The prosodic features such as jitter and shimmer are added to baseline spectral features. The above-mentioned techniques were tested with impulsive signals under various noisy conditions within AURORA databases.

Keywords: auditory filter, impulsive noise, MFCC, prosodic features, RASTA filter

Procedia PDF Downloads 396
1344 Optimization for Autonomous Robotic Construction by Visual Guidance through Machine Learning

Authors: Yangzhi Li

Abstract:

Network transfer of information and performance customization is now a viable method of digital industrial production in the era of Industry 4.0. Robot platforms and network platforms have grown more important in digital design and construction. The pressing need for novel building techniques is driven by the growing labor scarcity problem and increased awareness of construction safety. Robotic approaches in construction research are regarded as an extension of operational and production tools. Several technological theories related to robot autonomous recognition, which include high-performance computing, physical system modeling, extensive sensor coordination, and dataset deep learning, have not been explored using intelligent construction. Relevant transdisciplinary theory and practice research still has specific gaps. Optimizing high-performance computing and autonomous recognition visual guidance technologies improves the robot's grasp of the scene and capacity for autonomous operation. Intelligent vision guidance technology for industrial robots has a serious issue with camera calibration, and the use of intelligent visual guiding and identification technologies for industrial robots in industrial production has strict accuracy requirements. It can be considered that visual recognition systems have challenges with precision issues. In such a situation, it will directly impact the effectiveness and standard of industrial production, necessitating a strengthening of the visual guiding study on positioning precision in recognition technology. To best facilitate the handling of complicated components, an approach for the visual recognition of parts utilizing machine learning algorithms is proposed. This study will identify the position of target components by detecting the information at the boundary and corner of a dense point cloud and determining the aspect ratio in accordance with the guidelines for the modularization of building components. To collect and use components, operational processing systems assign them to the same coordinate system based on their locations and postures. The RGB image's inclination detection and the depth image's verification will be used to determine the component's present posture. Finally, a virtual environment model for the robot's obstacle-avoidance route will be constructed using the point cloud information.

Keywords: robotic construction, robotic assembly, visual guidance, machine learning

Procedia PDF Downloads 54
1343 Working Conditions, Motivation and Job Performance of Hotel Workers

Authors: Thushel Jayaweera

Abstract:

In performance evaluation literature, there has been no investigation indicating the impact of job characteristics, working conditions and motivation on the job performance among the hotel workers in Britain. This study tested the relationship between working conditions (physical and psychosocial working conditions) and job performance (task and contextual performance) with motivators (e.g. recognition, achievement, the work itself, the possibility for growth and work significance) as the mediating variable. A total of 254 hotel workers in 25 hotels in Bristol, United Kingdom participated in this study. Working conditions influenced job performance and motivation moderated the relationship between working conditions and job performance. Poor workplace conditions resulted in decreasing employee performance. The results point to the importance of motivators among hotel workers and highlighted that work be designed to provide recognition and sense of autonomy on the job to enhance job performance of the hotel workers. These findings have implications for organizational interventions aimed at increasing employee job performance.

Keywords: hotel workers, working conditions, motivation, job characteristics, job performance

Procedia PDF Downloads 563
1342 Item-Trait Pattern Recognition of Replenished Items in Multidimensional Computerized Adaptive Testing

Authors: Jianan Sun, Ziwen Ye

Abstract:

Multidimensional computerized adaptive testing (MCAT) is a popular research topic in psychometrics. It is important for practitioners to clearly know the item-trait patterns of administered items when a test like MCAT is operated. Item-trait pattern recognition refers to detecting which latent traits in a psychological test are measured by each of the specified items. If the item-trait patterns of the replenished items in MCAT item pool are well detected, the interpretability of the items can be improved, which can further promote the abilities of the examinees who attending the MCAT to be accurately estimated. This research explores to solve the item-trait pattern recognition problem of the replenished items in MCAT item pool from the perspective of statistical variable selection. The popular multidimensional item response theory model, multidimensional two-parameter logistic model, is assumed to fit the response data of MCAT. The proposed method uses the least absolute shrinkage and selection operator (LASSO) to detect item-trait patterns of replenished items based on the essential information of item responses and ability estimates of examinees collected from a designed MCAT procedure. Several advantages of the proposed method are outlined. First, the proposed method does not strictly depend on the relative order between the replenished items and the selected operational items, so it allows the replenished items to be mixed into the operational items in reasonable order such as considering content constraints or other test requirements. Second, the LASSO used in this research improves the interpretability of the multidimensional replenished items in MCAT. Third, the proposed method can exert the advantage of shrinkage method idea for variable selection, so it can help to check item quality and key dimension features of replenished items and saves more costs of time and labors in response data collection than traditional factor analysis method. Moreover, the proposed method makes sure the dimensions of replenished items are recognized to be consistent with the dimensions of operational items in MCAT item pool. Simulation studies are conducted to investigate the performance of the proposed method under different conditions for varying dimensionality of item pool, latent trait correlation, item discrimination, test lengths and item selection criteria in MCAT. Results show that the proposed method can accurately detect the item-trait patterns of the replenished items in the two-dimensional and the three-dimensional item pool. Selecting enough operational items from the item pool consisting of high discriminating items by Bayesian A-optimality in MCAT can improve the recognition accuracy of item-trait patterns of replenished items for the proposed method. The pattern recognition accuracy for the conditions with correlated traits is better than those with independent traits especially for the item pool consisting of comparatively low discriminating items. To sum up, the proposed data-driven method based on the LASSO can accurately and efficiently detect the item-trait patterns of replenished items in MCAT.

Keywords: item-trait pattern recognition, least absolute shrinkage and selection operator, multidimensional computerized adaptive testing, variable selection

Procedia PDF Downloads 96
1341 Importance of Developing a Decision Support System for Diagnosis of Glaucoma

Authors: Murat Durucu

Abstract:

Glaucoma is a condition of irreversible blindness, early diagnosis and appropriate interventions to make the patients able to see longer time. In this study, it addressed that the importance of developing a decision support system for glaucoma diagnosis. Glaucoma occurs when pressure happens around the eyes it causes some damage to the optic nerves and deterioration of vision. There are different levels ranging blindness of glaucoma disease. The diagnosis at an early stage allows a chance for therapies that slows the progression of the disease. In recent years, imaging technology from Heidelberg Retinal Tomography (HRT), Stereoscopic Disc Photo (SDP) and Optical Coherence Tomography (OCT) have been used for the diagnosis of glaucoma. This better accuracy and faster imaging techniques in response technique of OCT have become the most common method used by experts. Although OCT images or HRT precision and quickness, especially in the early stages, there are still difficulties and mistakes are occurred in diagnosis of glaucoma. It is difficult to obtain objective results on diagnosis and placement process of the doctor's. It seems very important to develop an objective decision support system for diagnosis and level the glaucoma disease for patients. By using OCT images and pattern recognition systems, it is possible to develop a support system for doctors to make their decisions on glaucoma. Thus, in this recent study, we develop an evaluation and support system to the usage of doctors. Pattern recognition system based computer software would help the doctors to make an objective evaluation for their patients. It is intended that after development and evaluation processes of the software, the system is planning to be serve for the usage of doctors in different hospitals.

Keywords: decision support system, glaucoma, image processing, pattern recognition

Procedia PDF Downloads 259
1340 Evolution of the Speaker in Russian Military Poetry of the Second Half of the 20th Century

Authors: Ilya A. Snegirev

Abstract:

The article focuses on the comparative study of Russian military poetry of the 20th century. To make a complete description, the verse of different genres, mainly minor lyrical form, is taken. The study makes it possible to emphasize the idea that genre is not completely representative for a comprehensive research, as it is also necessary to dwell upon the strategies of war description in verse. Furthermore, the tendency of lyrical hero individualization is noted. This tendency can be traced throughout the whole second half of the 20th century – the poets of the Second World War – and further, to the whole post-war poetry. To characterize these changes, the texts by K.M. Simonov and A.A. Surkov are being analyzed as the examples of the qualitative transition to an individual hero.

Keywords: literature’s evolution, narrator, storytelling poetry, tradition

Procedia PDF Downloads 167
1339 Prosodic Realization of Focus in the Public Speeches Delivered by Spanish Learners of English and English Native Speakers

Authors: Raúl Jiménez Vilches

Abstract:

Native (L1) speakers can mark prosodically one part of an utterance and make it more relevant as opposed to the rest of the constituents. Conversely, non-native (L2) speakers encounter problems when it comes to marking prosodically information structure in English. In fact, the L2 speaker’s choice for the prosodic realization of focus is not so clear and often obscures the intended pragmatic meaning and the communicative value in general. This paper reports some of the findings obtained in an L2 prosodic training course for Spanish learners of English within the context of public speaking. More specifically, it analyses the effects of the course experiment in relation to the non-native production of the tonic syllable to mark focus and compares it with the public speeches delivered by native English speakers. The whole experimental training was executed throughout eighteen input sessions (1,440 minutes total time) and all the sessions took place in the classroom. In particular, the first part of the course provided explicit instruction on the recognition and production of the tonic syllable and how the tonic syllable is used to express focus. The non-native and native oral presentations were acoustically analyzed using Praat software for speech analysis (7,356 words in total). The investigation adopted mixed and embedded methodologies. Quantitative information is needed when measuring acoustically the phonetic realization of focus. Qualitative data such as questionnaires, interviews, and observations were also used to interpret the quantitative data. The embedded experiment design was implemented through the analysis of the public speeches before and after the intervention. Results indicate that, even after the L2 prosodic training course, Spanish learners of English still show some major inconsistencies in marking focus effectively. Although there was occasional improvement regarding the choice for location and word classes, Spanish learners were, in general, far from achieving similar results to the ones obtained by the English native speakers in the two types of focus. The prosodic realization of focus seems to be one of the hardest areas of the English prosodic system to be mastered by Spanish learners. A funded research project is in the process of moving the present classroom-based experiment to an online environment (mobile app) and determining whether there is a more effective focus usage through CAPT (Computer-Assisted Pronunciation) tools.

Keywords: focus, prosody, public speaking, Spanish learners of English

Procedia PDF Downloads 63
1338 The Face Sync-Smart Attendance

Authors: Bekkem Chakradhar Reddy, Y. Soni Priya, Mathivanan G., L. K. Joshila Grace, N. Srinivasan, Asha P.

Abstract:

Currently, there are a lot of problems related to marking attendance in schools, offices, or other places. Organizations tasked with collecting daily attendance data have numerous concerns. There are different ways to mark attendance. The most commonly used method is collecting data manually by calling each student. It is a longer process and problematic. Now, there are a lot of new technologies that help to mark attendance automatically. It reduces work and records the data. We have proposed to implement attendance marking using the latest technologies. We have implemented a system based on face identification and analyzing faces. The project is developed by gathering faces and analyzing data, using deep learning algorithms to recognize faces effectively. The data is recorded and forwarded to the host through mail. The project was implemented in Python and Python libraries used are CV2, Face Recognition, and Smtplib.

Keywords: python, deep learning, face recognition, CV2, smtplib, Dlib.

Procedia PDF Downloads 22