Search results for: speech processing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4226

3896 Development of a Social Assistive Robot for Elderly Care

Authors: Edwin Foo, Woei Wen, Lui, Meijun Zhao, Shigeru Kuchii, Chin Sai Wong, Chung Sern Goh, Yi Hao He

Abstract:

This work presents the development of a social assistive robot for elderly care. The robot, named JOS, is restricted to tabletop operation. JOS is designed to have a maximum volume of 3600 cm³ with its base restricted to 250 mm, and his mission is to provide companionship and assistance to the elderly. To accomplish this mission, he will be equipped with perception, reaction, and cognition capabilities. His appearance will not be human-like but rather cute and approachable, and he will be designed to be gender-neutral. However, the robot will still have eyes, eyelids, and a mouth. The eyes and eyelids will be built entirely with Robotis Dynamixel AX18 motors. To realize this complex task, JOS will also be equipped with a microphone array, a vision camera, and an Intel i5 NUC computer, and will be powered by a self-charging 12 V lithium battery. His face is constructed using 1 motor each for the eyelids, 2 motors for the eyeballs, 3 motors for the neck mechanism, and 1 motor for the lip movement. The vision sensor will be housed on JOS's forehead, and the microphone array will be placed somewhere below the mouth. For the vision system, Omron's latest OKAO vision sensor is used. It is a compact and versatile sensor that measures only 60 mm by 40 mm and operates from a 5 V supply. In addition, the OKAO vision sensor can identify the user and recognize the user's expression. With these functions, JOS is able to track and identify the user. If he cannot recognize the user, JOS will ask whether the user would like to be remembered. If so, JOS will store the user information together with the captured face image in a database, which allows JOS to recognize the user at their next encounter. JOS is also able to interpret the user's mood from facial expressions, which allows the robot to understand the user's mood and behavior and react accordingly.
Machine learning will later be incorporated to learn the user's behavior so as to better understand the user's mood and requirements. For the speech system, the Microsoft speech and grammar engine is used for speech recognition. To use the speech engine, we need to build a speech grammar database that captures the words commonly used by the elderly. This database is built from research journals and literature on elderly speech and from interviews asking elderly people what they want the robot to assist them with. Using the results of the interviews and the literature research, we derive a set of common words that the elderly frequently use to request help, and we build our grammar database from this set. When more than one person is near JOS, he is able to identify the person who is talking to him through an in-house developed microphone array structure. To make the robot more interactive, we have also included the capability for the robot to express emotion to the user through facial expressions, by changing the position and movement of the eyelids and mouth. All robot emotions will be in response to the user's mood and requests. Lastly, we expect to complete this phase of the project and test it with elderly people and delirium patients by Feb 2015.

Keywords: social robot, vision, elderly care, machine learning

Procedia PDF Downloads 421
3895 Semi-Supervised Learning for Spanish Speech Recognition Using Deep Neural Networks

Authors: B. R. Campomanes-Alvarez, P. Quiros, B. Fernandez

Abstract:

Automatic Speech Recognition (ASR) is a machine-based process of decoding and transcribing oral speech. A typical ASR system receives acoustic input from a speaker or an audio file, analyzes it using algorithms, and produces an output in the form of text. Some speech recognition systems use Hidden Markov Models (HMMs) to deal with the temporal variability of speech and Gaussian Mixture Models (GMMs) to determine how well each state of each HMM fits a short window of frames of coefficients that represents the acoustic input. Another way to evaluate the fit is to use a feed-forward neural network that takes several frames of coefficients as input and produces posterior probabilities over HMM states as output. Deep neural networks (DNNs) that have many hidden layers and are trained using new methods have been shown to outperform GMMs on a variety of speech recognition systems. Acoustic models for state-of-the-art ASR systems are usually trained on massive amounts of data. However, audio files with their corresponding transcriptions can be difficult to obtain, especially in the Spanish language. Hence, in these low-resource scenarios, building an ASR model is considered a complex task due to the lack of labeled data, which results in an under-trained system. Semi-supervised learning approaches become necessary given the high cost of transcribing audio data. The main goal of this proposal is to develop a procedure based on acoustic semi-supervised learning for Spanish ASR systems by using DNNs. This semi-supervised learning approach consists of: (a) Training a seed ASR model with a DNN using a set of audio recordings and their respective transcriptions. A DNN was initialized with one hidden layer, and the number of hidden layers was increased to five during training. A refinement, which consisted of the weight matrix plus bias term, and Stochastic Gradient Descent (SGD) training were also performed. The objective function was the cross-entropy criterion.
(b) Decoding/testing a set of unlabeled data with the obtained seed model. (c) Selecting a suitable subset of the validated data to retrain the seed model, thereby improving its performance on the target test set. To choose the most precise transcriptions, three lattice-based confidence scores (based on the graph cost, the acoustic cost, and a combination of both) were used as the selection technique. The performance of the ASR system was measured by means of the Word Error Rate (WER). The test dataset was renewed in order to extract the new transcriptions added to the training dataset. Some experiments were carried out in order to select the best ASR results. A comparison between a GMM-based model without retraining and the proposed DNN system was also made under the same conditions. Results showed that the semi-supervised ASR model based on DNNs outperformed the GMM model, in terms of WER, in all tested cases. The best result obtained an improvement of 6% relative WER. Hence, these promising results suggest that the proposed technique could be suitable for building ASR models in low-resource environments.
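
The evaluation metric named in this abstract, Word Error Rate, and the notion of a relative WER improvement can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names and the example sentences are hypothetical, and WER is computed here in the usual way as word-level edit distance divided by the number of reference words.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Word-level Levenshtein distance, keeping only the previous DP row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, improved_wer: float) -> float:
    """Relative improvement of improved_wer over baseline_wer (0.06 means 6%)."""
    return (baseline_wer - improved_wer) / baseline_wer


# Hypothetical example: one substitution and one deletion over four words.
print(word_error_rate("el gato come pescado", "el pato come"))
```

A relative improvement is measured against the baseline WER rather than in absolute percentage points, so a hypothetical drop from 25.0% to 23.5% WER is a 6% relative improvement.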

Keywords: automatic speech recognition, deep neural networks, machine learning, semi-supervised learning

Procedia PDF Downloads 322
3894 Synthesis and Characterisation of Bi-Substituted Magnetite Nanoparticles by Mechanochemical Processing (MCP)

Authors: Morteza Mohri Esfahani, Amir S. H. Rozatian, Morteza Mozaffari

Abstract:

Single-phase magnetite nanoparticles and Bi-substituted ones were prepared by mechanochemical processing (MCP). The effects of Bi-substitution on the structural and magnetic properties of the nanoparticles were studied by X-ray Diffraction (XRD) and magnetometry techniques, respectively. The XRD results showed that all samples have a spinel phase and that, with increasing Bi content, the main diffraction peaks shifted to higher angles, which means the lattice parameter decreases from 0.843 to 0.838 nm and then increases to 0.841 nm. The results also revealed that increasing Bi content led to a decrease in saturation magnetization (Ms) from 74.9 to 48.8 emu/g and an increase in coercivity (Hc) from 96.8 to 137.1 Oe.

Keywords: bi-substituted magnetite nanoparticles, mechanochemical processing, X-ray diffraction, magnetism

Procedia PDF Downloads 514
3893 Online Prediction of Nonlinear Signal Processing Problems Based Kernel Adaptive Filtering

Authors: Hamza Nejib, Okba Taouali

Abstract:

This paper presents two of the best-known kernel adaptive filtering (KAF) approaches, kernel least mean squares (KLMS) and kernel recursive least squares (KRLS), for predicting a new output in nonlinear signal processing. Both methods implement a nonlinear transfer function using kernel methods in a reproducing kernel Hilbert space (RKHS), where the model is a linear combination of kernel functions that map the observed data from the input space to a high-dimensional feature space; this idea is known as the kernel trick. KAF thus amounts to developing filters in the RKHS. We use two nonlinear signal processing problems, Mackey-Glass chaotic time series prediction and nonlinear channel equalization, to assess the performance of the presented approaches and to determine which of them is better suited.
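
The kernel least mean squares idea described in this abstract can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: the class name `KLMS`, the Gaussian kernel, the step size, and the absence of any sparsification of the stored centers are all assumptions made here. The model is a linear combination of kernel functions centered on past inputs, and each new sample adds one term weighted by the step size times the prediction error.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length input vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


class KLMS:
    """Minimal kernel least mean squares filter (assumed setup, no sparsification)."""

    def __init__(self, step_size=0.5, sigma=1.0):
        self.step_size = step_size
        self.sigma = sigma
        self.centers = []  # past inputs kept as kernel centers
        self.coeffs = []   # one coefficient per stored center

    def predict(self, x):
        # Output is a linear combination of kernels evaluated at the centers.
        return sum(
            coeff * gaussian_kernel(center, x, self.sigma)
            for coeff, center in zip(self.coeffs, self.centers)
        )

    def update(self, x, desired):
        # LMS step in the feature space: store x, weight it by step_size * error.
        error = desired - self.predict(x)
        self.centers.append(tuple(x))
        self.coeffs.append(self.step_size * error)
        return error


# Hypothetical usage: present the same sample twice and watch the error shrink.
model = KLMS(step_size=0.5, sigma=1.0)
print(model.update((0.0,), 1.0))  # first prediction is 0, so the error is 1.0
print(model.update((0.0,), 1.0))  # prediction is now 0.5, so the error is 0.5
```

Because every sample adds a center, the model grows linearly with the number of samples; practical variants usually limit this growth, which this sketch does not attempt.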

Keywords: online prediction, KAF, signal processing, RKHS, Kernel methods, KRLS, KLMS

Procedia PDF Downloads 375
3892 Efficient Pre-Processing of Single-Cell Assay for Transposase Accessible Chromatin with High-Throughput Sequencing Data

Authors: Fan Gao, Lior Pachter

Abstract:

The primary tool currently used to pre-process 10X Chromium single-cell ATAC-seq data is Cell Ranger, which can take a very long time to run on standard datasets. To facilitate rapid pre-processing that enables reproducible workflows, we present a suite of tools called scATAK for pre-processing single-cell ATAC-seq data that is 15 to 18 times faster than Cell Ranger on mouse and human samples. Our tool can also calculate chromatin interaction potential matrices and generate open chromatin signal and interaction traces for cell groups. We use the scATAK tool to explore the chromatin regulatory landscape of a healthy adult human brain and unveil cell-type-specific features, and show that it provides a convenient and computationally efficient approach for pre-processing single-cell ATAC-seq data.

Keywords: single-cell, ATAC-seq, bioinformatics, open chromatin landscape, chromatin interactome

Procedia PDF Downloads 136
3891 Design and Development of 5-DOF Color Sorting Manipulator for Industrial Applications

Authors: Atef A. Ata, Sohair F. Rezeka, Ahmed El-Shenawy, Mohammed Diab

Abstract:

Image processing attracts considerable attention today because it opens up broader applications in many high-technology fields. The real challenge is how to improve existing sorting systems, which consist of two integrated stations for processing and handling, with a new image processing feature. Existing color sorting techniques use a set of inductive, capacitive, and optical sensors to differentiate object color. This research presents a mechatronic color sorting system based on image processing. A 5-DOF robot arm with pick-and-place operation is designed and developed as the main part of the color sorting system. The image processing procedure detects circular objects in an image captured in real time by a webcam attached to the end-effector and then extracts their color and position information. This information is passed as a sequence of sorting commands to the manipulator, which has a pick-and-place mechanism. Performance analysis shows that this color-based object sorting system works very accurately under ideal conditions in terms of adequate illumination, circular object shape, and color. The circular objects tested for sorting are red, green, and blue. Under non-ideal conditions, such as unspecified colors, the accuracy drops to 80%.

Keywords: robotics manipulator, 5-DOF manipulator, image processing, color sorting, pick-and-place

Procedia PDF Downloads 345
3890 Capacity Enhancement for Agricultural Workers in Mangosteen Product

Authors: Cholpassorn Sitthiwarongchai, Chutikarn Sriviboon

Abstract:

The two primary objectives of this research were (1) to examine the current knowledge and actual circumstances of agricultural workers regarding mangosteen product processing; and (2) to analyze and evaluate ways to develop capacity in mangosteen product processing. The population of this study was 15,125 people who work in the agricultural sector, in this context mangosteen production, in the eastern part of Thailand, which included Chantaburi Province, Rayong Province, Trad Province and Pracheenburi Province. The sample size based on Yamane’s calculation with 95% reliability was therefore 392 samples. A mixed-method approach was employed, including a questionnaire and a focus group discussion with the Connoisseurship Model, in order to collect quantitative and qualitative data. Key informants in the focus group included agricultural business owners, academics in agro food processing, local academics, local community development staff, the OTOP subcommittee, and representatives of agro processing industry professional organizations. The study found that the majority of the respondents agreed at a high level (on a five-point rating scale) with most of the variables of knowledge management in agro food processing. The results on the current knowledge and actual circumstances of agricultural human resources in the area of mangosteen product processing revealed that most respondents agreed at a high level to establish 7 variables. The guideline for developing the body of knowledge in order to enhance the capacity of agricultural workers in mangosteen product processing was delivered in the focus group discussion. The discussion finally led to the idea of producing manuals for mangosteen product processing methods, with 4 products chosen: (1) mangosteen soap, (2) mangosteen juice, (3) mangosteen toffee, and (4) mangosteen preserves or jam.

Keywords: capacity enhancement, agricultural workers, mangosteen product processing, marketing management

Procedia PDF Downloads 191
3889 Sociolinguistics and Language Change

Authors: Banazzouz Halima

Abstract:

Throughout the ages, language has been viewed not only as a simple code for communicating information but also as the most powerful and versatile medium for maintaining relationships with other people. By the end of the 18th century, the scientific investigation of human language began to take place under the scope of “Linguistics”, generally defined as the scientific study of language. Linguistics thus provides a growing body of scientific knowledge about language which can guide the activity of both the language teacher and the student. Moreover, as time passed, linguistic development turned the study of language into a broadly practiced academic discipline related to other sciences such as psychology, sociology, and anthropology. Thus, “Sociolinguistics” was born during the 1960s. This abstract is mainly linguistic, falling under the scope of “Sociolinguistics”, and it highlights the process of linguistic variation and language change to show that all languages change through time and that linguistic systems may vary from one speech community to another, provided there is a sense of vitality whereby people from different parts of the globe may communicate intelligibly and comprehend each other.

Keywords: language change-sociolinguistics, social context-speech community, vitality of language, linguistic variation, urban dialectology

Procedia PDF Downloads 605
3888 The Resistance Reader Program Based on Image Processing

Authors: Janpen Srijan, Nahathai Tanmang, Thanit Purathanang, Anun Dowchern, Saksit Summart, Seangduan Kampimpa

Abstract:

This paper presents a resistance reader program based on image processing using MATLAB. The proposed program is divided into six parts: the first is the web camera; the second is the wattage selection before photographing the resistor; the third finds the position of the colors at the mid-point of the resistor; the fourth identifies the color code of the resistor; the fifth takes the numeric value of each color for the resistance calculation; and the last displays the resulting resistance value. In experiments, the resistance reader program was able to display the resistance value of the resistor. The accuracy of the proposed program is 85 percent for 1 watt resistors. The 15 percent reading error arises because the color code of some resistors was too bright.
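
The fifth step described above, turning identified band colors into a resistance value, can be sketched as follows. The abstract's program is written in MATLAB and its code is not shown, so this Python sketch is only an illustration under assumptions: the function and table names are hypothetical, it uses the standard resistor color code (black = 0 through white = 9), and it ignores the tolerance band and the gold and silver fractional multipliers.

```python
# Standard resistor color code: each color maps to one decimal digit.
COLOR_DIGITS = {
    "black": 0, "brown": 1, "red": 2, "orange": 3, "yellow": 4,
    "green": 5, "blue": 6, "violet": 7, "grey": 8, "white": 9,
}


def resistance_from_bands(bands):
    """Return the resistance in ohms from significant-digit bands plus a multiplier band.

    `bands` lists the digit colors first and the multiplier color last,
    e.g. ["yellow", "violet", "red"] for a 4-band resistor without its tolerance band.
    """
    *digit_colors, multiplier_color = bands
    significant = 0
    for color in digit_colors:
        significant = significant * 10 + COLOR_DIGITS[color]
    return significant * 10 ** COLOR_DIGITS[multiplier_color]


# Hypothetical usage: yellow (4), violet (7), red (x100) gives 4700 ohms.
print(resistance_from_bands(["yellow", "violet", "red"]))
```

An unrecognized color name raises a `KeyError`, which is one simple way to surface the kind of misread color that the abstract reports as the source of reading errors.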

Keywords: resistance reader program, image processing, resistor, MATLAB

Procedia PDF Downloads 360
3887 Mistranslation in Cross Cultural Communication: A Discourse Analysis on Former President Bush’s Speech in 2001

Authors: Lowai Abed

Abstract:

The differences in languages play a big role in cross-cultural communication. If meanings are not translated accurately, the risk can be crucial not only on an interpersonal level, but also on the international and political levels. The use of metaphorical language by politicians can cause great confusion, often leading to statements being misconstrued. In these situations, it is the translators who struggle to put forward the intended meaning with clarity and this makes translation an important field to study and analyze when it comes to cross-cultural communication. Owing to the growing importance of language and the power of translation in politics, this research analyzes part of President Bush’s speech in 2001 in which he used the word “Crusade” which caused his statement to be misconstrued. The research uses a discourse analysis of cross-cultural communication literature which provides answers supported by historical, linguistic, and communicative perspectives. The first finding indicates that the word ‘crusade’ carries different meaning and significance in the narratives of the Western world when compared to the Middle East. The second one is that, linguistically, maintaining cultural meanings through translation is quite difficult and challenging. Third, when it comes to the cross-cultural communication perspective, the common and frequent usage of literal translation is a sign of poor strategies being followed in translation training. Based on the example of Bush’s speech, this paper hopes to highlight the weak practices in translation in cross-cultural communication which are still commonly used across the world. Translation studies have to take issues such as this seriously and attempt to find a solution. In every language, there are words and phrases that have cultural, historical and social meanings that are woven into the language. 
Literal translation is not the solution for this problem because that strategy is unable to convey these meanings in the target language.

Keywords: crusade, metaphor, mistranslation, war in terror

Procedia PDF Downloads 86
3886 Patterns of TV Simultaneous Interpreting of Emotive Overtones in Trump’s Victory Speech from English into Arabic

Authors: Hanan Al-Jabri

Abstract:

Simultaneous interpreting is deemed to be the most challenging mode of interpreting by many scholars. The special constraints involved in this task including time constraints, different linguistic systems, and stress pose a great challenge to most interpreters. These constraints are likely to maximise when the interpreting task is done live on TV. The TV interpreter is exposed to a wide variety of audiences with different backgrounds and needs and is mostly asked to interpret high profile tasks which raise his/her levels of stress, which further complicate the task. Under these constraints, which require fast and efficient performance, TV interpreters of four TV channels were asked to render Trump's victory speech into Arabic. However, they had also to deal with the burden of rendering English emotive overtones employed by the speaker into a whole different linguistic system. The current study aims at investigating the way TV interpreters, who worked in the simultaneous mode, handled this task; it aims at exploring and evaluating the TV interpreters’ linguistic choices and whether the original emotive effect was maintained, upgraded, downgraded or abandoned in their renditions. It also aims at exploring the possible difficulties and challenges that emerged during this process and might have influenced the interpreters’ linguistic choices. To achieve its aims, the study analysed Trump’s victory speech delivered on November 6, 2016, along with four Arabic simultaneous interpretations produced by four TV channels: Al-Jazeera, RT, CBC News, and France 24. The analysis of the study relied on two frameworks: a macro and a micro framework. The former presents an overview of the wider context of the English speech as well as an overview of the speaker and his political background to help understand the linguistic choices he made in the speech, and the latter framework investigates the linguistic tools which were employed by the speaker to stir people’s emotions. 
These tools were investigated based on Shamaa’s (1978) classification of emotive meaning according to their linguistic level: phonological, morphological, syntactic, and semantic and lexical levels. Moreover, this level investigates the patterns of rendition which were detected in the Arabic deliveries. The results of the study identified different rendition patterns in the Arabic deliveries, including parallel rendition, approximation, condensation, elaboration, transformation, expansion, generalisation, explicitation, paraphrase, and omission. The emerging patterns, as suggested by the analysis, were influenced by factors such as speedy and continuous delivery of some stretches, and highly-dense segments among other factors. The study aims to contribute to a better understanding of TV simultaneous interpreting between English and Arabic, as well as the practices of TV interpreters when rendering emotiveness especially that little is known about interpreting practices in the field of TV, particularly between Arabic and English.

Keywords: emotive overtones, interpreting strategies, political speeches, TV interpreting

Procedia PDF Downloads 138
3885 Efficient Filtering of Graph Based Data Using Graph Partitioning

Authors: Nileshkumar Vaishnav, Aditya Tatu

Abstract:

An algebraic framework for processing graph signals axiomatically designates the graph adjacency matrix as the shift operator. In this setup, we often encounter a problem wherein we know the filtered output and the filter coefficients and need to find the input graph signal. Solving this problem directly requires O(N³) operations, where N is the number of vertices in the graph. In this paper, we adapt the spectral graph partitioning method for partitioning graphs and use it to reduce the computational cost of the filtering problem. We use the example of denoising temperature data to illustrate the efficacy of the approach.
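
The direct approach that this abstract takes as its baseline can be sketched as follows. This is not the partitioning method the paper proposes. It assumes the filter is a polynomial in the adjacency matrix A, that is H = h0·I + h1·A + h2·A² + ..., which is the usual form when A is the shift operator, and it recovers the input x from the output y = Hx by Gaussian elimination, the O(N³) step. All function names and the example graph are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_vec(m, x):
    """Multiply a matrix by a vector."""
    return [sum(entry * value for entry, value in zip(row, x)) for row in m]


def filter_matrix(adjacency, coeffs):
    """Build H = sum_k coeffs[k] * adjacency^k (polynomial graph filter)."""
    n = len(adjacency)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # A^0
    h = [[0.0] * n for _ in range(n)]
    for coeff in coeffs:
        for i in range(n):
            for j in range(n):
                h[i][j] += coeff * power[i][j]
        power = mat_mul(power, adjacency)
    return h


def solve(h, y):
    """Solve h x = y by Gaussian elimination with partial pivoting (O(N^3))."""
    n = len(h)
    m = [row[:] + [y[i]] for i, row in enumerate(h)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("filter matrix is singular; input is not recoverable")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


# Hypothetical example: 3-vertex path graph, filter H = I + 0.5 * A.
A = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
H = filter_matrix(A, [1.0, 0.5])
y = mat_vec(H, [1.0, 2.0, 3.0])  # filtered output of a known input
print(solve(H, y))               # recovers approximately [1.0, 2.0, 3.0]
```

The triple loop in the elimination is where the cubic cost comes from, which is the cost the abstract aims to reduce by partitioning the graph.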

Keywords: graph signal processing, graph partitioning, inverse filtering on graphs, algebraic signal processing

Procedia PDF Downloads 285
3884 Embedded System of Signal Processing on FPGA: Underwater Application Architecture

Authors: Abdelkader Elhanaoui, Mhamed Hadji, Rachid Skouri, Said Agounad

Abstract:

The purpose of this paper is to study the phenomenon of acoustic scattering by using a new method. Signal processing (Fast Fourier Transform (FFT), Inverse Fast Fourier Transform (iFFT), and Bessel functions) is widely applied to obtain information with high precision. Signal processing is most widely implemented on general-purpose processors. Our interest was focused on the use of FPGAs (Field-Programmable Gate Arrays) in order to minimize the computational complexity in a single-processor architecture, which is then accelerated on the FPGA to meet real-time and energy efficiency requirements. General-purpose processors are not efficient for signal processing. We implemented the acoustic backscattered signal processing model on the Altera DE-SOC board and compared it to the Odroid xu4. By comparison, the computing latency of the Odroid xu4 and the FPGA is 60 seconds and 3 seconds, respectively. The detailed SoC FPGA-based system has shown that acoustic spectra are computed up to 20 times faster than with the Odroid xu4 implementation. The FPGA-based system of processing algorithms is realized with an absolute error of about 10⁻³. This study underlines the increasing importance of embedded systems in underwater acoustics, especially in non-destructive testing. It is possible to obtain information related to the detection and characterization of submerged cells. We have thus achieved good experimental results in terms of real-time performance and energy efficiency.

Keywords: DE1 FPGA, acoustic scattering, form function, signal processing, non-destructive testing

Procedia PDF Downloads 58
3883 ViraPart: A Text Refinement Framework for Automatic Speech Recognition and Natural Language Processing Tasks in Persian

Authors: Narges Farokhshad, Milad Molazadeh, Saman Jamalabbasi, Hamed Babaei Giglou, Saeed Bibak

Abstract:

The Persian language is an inflectional subject-object-verb language. This fact makes Persian a more uncertain language. However, techniques such as Zero-Width Non-Joiner (ZWNJ) recognition, punctuation restoration, and Persian Ezafe construction lead to a more understandable and precise language. In most works on Persian, these techniques are addressed individually. Nevertheless, we believe that all of these tasks are necessary for text refinement in Persian. In this work, we propose the ViraPart framework, which uses embedded ParsBERT at its core for text clarification. First, we used the BERT variant for Persian followed by a classifier layer for the classification procedures. Next, we combined the models' outputs to produce clear text. In the end, the proposed models for ZWNJ recognition, punctuation restoration, and Persian Ezafe construction achieve averaged macro F1 scores of 96.90%, 92.13%, and 98.50%, respectively. Experimental results show that our proposed approach is very effective in text refinement for the Persian language.

Keywords: Persian Ezafe, punctuation, ZWNJ, NLP, ParsBERT, transformers

Procedia PDF Downloads 191
3882 Techniques to Characterize Subpopulations among Hearing Impaired Patients and Its Impact for Hearing Aid Fitting

Authors: Vijaya K. Narne, Gerard Loquet, Tobias Piechowiak, Dorte Hammershoi, Jesper H. Schmidt

Abstract:

BEAR, which stands for better hearing rehabilitation, is a large-scale project in Denmark designed and executed by three national universities, three hospitals, and the hearing aid industry with the aim of improving hearing aid fitting. A total of 1963 hearing-impaired people were included and segmented into subgroups based on hearing loss, demographics, audiological data, and questionnaire data (i.e., the Speech, Spatial and Qualities of Hearing Scale [SSQ-12] and the International Outcome Inventory for Hearing Aids [IOI-HA]). With the aim of providing a better hearing aid fit for individual patients, we applied modern machine learning techniques together with traditional audiogram rule-based systems. Results show that age, speech discrimination scores, and audiogram configurations emerged as important parameters in characterizing subpopulations in the dataset. The attempt to characterize subpopulations reveals a clearer picture of the individual hearing difficulties encountered and the benefits derived from more individualized hearing aids.

Keywords: hearing loss, audiological data, machine learning, hearing aids

Procedia PDF Downloads 134
3881 The Discourse Analysis of Friday Sermons in Pakistan: A Social Perspective

Authors: Syed Hamid Farooq Bukhari

Abstract:

This study intends to clarify the Friday sermon by evaluating the formation of its discourse, the composition, and selection of its subject matters, the structure, and functions of its rules as well as the outline of its communication proceeds, and the distinctiveness of its words along with definite provisions. In this research, a qualitative and descriptive method is used to draw out conclusions. This paper considers the sermon mechanism of the speech and advances it contextually. The information was composed in Pakistan and several of its mosques supposing the imams of the city and the location of the mosques. The presentation and analysis of the facts have directed to the subsequent conclusions: (1) the Friday sermon holds verbal discussion that has habitual and classic formation, (2) the approaches of the formation of the subjects consist of storytelling, quotation as well as the use of accepted terms, (3) the composition of the codes involves Arabic, English, Urdu, and many other local languages, (4) the expressions of the speech include all types of sermon acts, (5) different requisites emerge in the sermons demonstrating that the Friday sermon functions as an index or usage of verbal communication in an exacting field.

Keywords: Friday, sermons, Pakistan, social

Procedia PDF Downloads 143
3880 Contrastive Focus Marking in Brazilian Children under Typical and Atypical Phonological Development

Authors: Geovana Soncin, Larissa Berti

Abstract:

Some aspects of prosody acquisition remain still unclear, especially regarding atypical speech development processes. This work deals with prosody acquisition and its implications for clinical purposes. Therefore, we analyze speech samples produced by adult speakers, children in typical language development, and children with phonological disorders. Phonological disorder comprises deviating manifestations characterized by inconsistencies in the phonological representation of a linguistic system under acquisition. The clinical assessment is performed mostly based on contrasts whose manifestations occur in the segmental level of a phonological system. Prosodic organization of spoken utterances is not included in the standard assessment. However, assuming that prosody is part of the phonological system, it was hypothesized that children with Phonological Disorders could present inconsistencies that also occur at a prosodic level. Based on this hypothesis, the paper aims to analyze contrastive focus marking in the speech of children with Phonological Disorders in comparison with the speech of children under Typical Language Development and adults. The participants of all groups were native speakers of Brazilian Portuguese. The investigation was designed in such a way as to identify differences and similarities among the groups that could be interpreted as clues of normal or deviant processes of prosody acquisition. Contrastive focus in Brazilian Portuguese is marked by increasing duration, f0, and intensity on the focused element as well as by a particular type of pitch accent (L*+H). Thirty-nine subjects participated, thirteen from each group. Acoustic analysis was performed, considering duration, intensity, and intonation as parameters. 
Children with PD were recruited in sessions from a service provided by Speech-Language Pathology Therapy; children in TD, paired in age and sex with the first group, were recruited in a regular school; and 20-24 years old adults were recruited from a University class. In a game prepared to elicit focused sentences, all of them produced the sentence “Girls love red dress,” marking focus on different syntactic positions: subject, verb, and object. Results showed that adults, children in typical language development, and children with Phonological Disorders marked contrastive focus differently: typical children used all parameters like adults do; however, in comparison with them, they exaggerated duration and, in the opposite direction, they did not increase f0 in a sufficient magnitude as adults; children with Phonological Disorder presented inconsistencies in duration, not increasing it in some syntactic positions, and also in intonation, not producing the representative pitch accent of contrastive focus. The results suggest prosody is also affected by phonological disorder and give clues of developmental processes of prosody acquisition.

Keywords: Brazilian Portuguese, contrastive focus, phonological disorder, prosody acquisition

Procedia PDF Downloads 66
3879 Teaching Tools for Web Processing Services

Authors: Rashid Javed, Hardy Lehmkuehler, Franz Josef-Behr

Abstract:

Web Processing Services (WPS) are of growing interest in geoinformation research. However, teaching them is difficult because the circumstances of their use are generally complex, which limits the possibilities for hands-on exercises. To support understanding, a Training Tools Collection was initiated at the University of Applied Sciences Stuttgart (HFT). It is limited to the scope of geostatistical interpolation of sample point data, where different algorithms such as IDW and Nearest Neighbor can be used. The Tools Collection aims to support understanding of the scope, definition and deployment of Web Processing Services. For example, it is necessary to characterize the interpolation by its input data set, the parameters for the algorithm and the interpolation results (here a grid of interpolated values is assumed). This paper reports on first experiences using a pilot installation, which was intended to find suitable software interfaces for later full implementations and to draw conclusions on potential user interface characteristics. Experience was gained with Deegree software, one of several service suites (collections). Being strictly programmed in Java, Deegree offers several OGC-compliant service implementations that also promise to be of benefit for the project. The mentioned parameters for a WPS were formalized following the paradigm that any meaningful component will be defined in terms of suitable standards; for example, the data output can be defined as a GML file. However, the choice of meaningful information pieces and user interactions is not free but partially determined by the selected WPS processing suite.
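To make the interpolation inputs and outputs named in this abstract concrete, the following is a minimal Python sketch of inverse distance weighting (IDW) over sample points, producing a grid of interpolated values. It is an illustration only and not the HFT tools collection or any WPS interface; the function names, the `power` parameter default and the sample data are assumptions chosen for the example.

```python
import math

def idw_interpolate(samples, x, y, power=2.0):
    """Estimate a value at (x, y) from (sx, sy, value) samples using IDW."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            return value  # query point coincides with a sample point
        weight = 1.0 / dist ** power  # closer samples get larger weights
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

def idw_grid(samples, xs, ys, power=2.0):
    """Return a grid (one row per y) of interpolated values."""
    return [[idw_interpolate(samples, x, y, power) for x in xs] for y in ys]

# Hypothetical sample points: (x, y, measured value)
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
grid = idw_grid(samples, xs=[0.0, 1.0, 2.0], ys=[0.0])
print(grid)
```

The three ingredients mentioned in the abstract map directly onto the sketch: the data set is `samples`, the algorithm parameter is `power`, and the result is the grid returned by `idw_grid`.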

Keywords: deegree, interpolation, IDW, web processing service (WPS)

Procedia PDF Downloads 333
3878 Conversational Assistive Technology of Visually Impaired Person for Social Interaction

Authors: Komal Ghafoor, Tauqir Ahmad, Murtaza Hanif, Hira Zaheer

Abstract:

Assistive technology has been developed to support visually impaired people in their social interactions. Conversation assistive technology is designed to enhance communication skills, facilitate social interaction, and improve the quality of life of visually impaired individuals. This technology includes speech recognition, text-to-speech features, and other communication devices that enable users to communicate with others in real time. The technology uses natural language processing and machine learning algorithms to analyze spoken language and provide appropriate responses. It also includes features such as voice commands and audio feedback to provide users with a more immersive experience. These technologies have been shown to increase the confidence and independence of visually impaired individuals in social situations and have the potential to improve their social skills and relationships with others. Overall, conversation-assistive technology is a promising tool for empowering visually impaired people and improving their social interactions. One of the key benefits of conversation-assistive technology is that it allows visually impaired individuals to overcome communication barriers that they may face in social situations. It can help them to communicate more effectively with friends, family, and colleagues, as well as strangers in public spaces. By providing a more seamless and natural way to communicate, this technology can help to reduce feelings of isolation and improve overall quality of life. The main objective of this research is to give blind users the capability to move around in unfamiliar environments through a user-friendly device with a face, object, and activity recognition system. This model evaluates the accuracy of activity recognition. The device captures the view in front of the blind user, detects objects, recognizes activities, and answers the user's queries. It is implemented using the front view of the camera.
A local dataset was collected that includes different first-person human activities. The results obtained are the identification of the activities that the VGG-16 model was trained on, such as hugging, shaking hands, talking, walking, and waving.

Keywords: dataset, visually impaired person, natural language process, human activity recognition

Procedia PDF Downloads 36
3877 Recognition of Objects in a Maritime Environment Using a Combination of Pre- and Post-Processing of the Polynomial Fit Method

Authors: R. R. Hordijk, O. J. G. Somsen

Abstract:

Traditionally, radar systems are the eyes and ears of a ship. However, these systems have their drawbacks, and nowadays they are extended with systems that work with video and photos. Processing data from these videos and photos is, however, very labour-intensive, and efforts are being made to automate this process. A major problem when trying to recognize objects in water is that the 'background' is not homogeneous, so traditional image recognition techniques do not work well. The main question is whether a method can be developed that automates this recognition process. A large number of parameters are involved in facilitating the identification of objects in such images. One is the resolution. In this research, the resolution of some images was reduced to the extreme value of 1% of the original to reduce clutter before the polynomial fit (pre-processing). It turned out that the searched object was clearly recognizable, as its grey value was well above the average. Another approach is to take two images of the same scene shortly after each other and compare the results. Because the water (waves) fluctuates much faster than an object floating in the water, one can expect the object to be the only stable item in the two images. Both these methods (pre-processing and comparing two images of the same scene) delivered useful results. Though it is too early to conclude that all image problems can be solved with these methods, they are certainly worth further research.
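The two ideas in this abstract, reducing resolution to suppress clutter and comparing two frames of the same scene, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' method: it omits the polynomial fit entirely, uses plain block averaging for the resolution reduction, and the function names, the `max_diff` threshold and the tiny 4x4 frames are all hypothetical.

```python
def downsample(image, factor):
    """Block-average a greyscale image (list of rows) by an integer factor."""
    rows, cols = len(image) // factor, len(image[0]) // factor
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [image[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

def mean_grey(image):
    """Average grey value over all pixels."""
    return sum(map(sum, image)) / (len(image) * len(image[0]))

def stable_bright_pixels(frame_a, frame_b, max_diff=10.0):
    """Positions brighter than the mean in both frames that barely change."""
    mean_a, mean_b = mean_grey(frame_a), mean_grey(frame_b)
    hits = []
    for r, (row_a, row_b) in enumerate(zip(frame_a, frame_b)):
        for c, (a, b) in enumerate(zip(row_a, row_b)):
            if a > mean_a and b > mean_b and abs(a - b) <= max_diff:
                hits.append((r, c))
    return hits

# Hypothetical frames: fluctuating "waves" plus a bright object bottom-right.
frame_a = [[10, 90, 20, 80], [90, 10, 80, 20],
           [30, 70, 200, 200], [70, 30, 200, 200]]
frame_b = [[120, 120, 0, 0], [120, 120, 0, 0],
           [0, 0, 200, 200], [0, 0, 200, 200]]
small_a, small_b = downsample(frame_a, 2), downsample(frame_b, 2)
hits = stable_bright_pixels(small_a, small_b)
print(hits)
```

In this toy case the wave pixels either fall below the mean grey value or change between frames, so only the block containing the object is reported.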

Keywords: image processing, image recognition, polynomial fit, water

Procedia PDF Downloads 512
3876 The Grammatical Dictionary Compiler: A System for Kartvelian Languages

Authors: Liana Lortkipanidze, Nino Amirezashvili, Nino Javashvili

Abstract:

The purpose of the grammatical dictionary is to provide information on the morphological and syntactic characteristics of the basic word in the dictionary entry. The electronic grammatical dictionaries are used as a tool of automated morphological analysis for text processing. The Georgian Grammatical Dictionary should contain grammatical information for each word: part of speech, type of declension/conjugation, grammatical forms of the word (paradigm), alternative variants of the basic word/lemma. In this paper, we present a system for compiling the Georgian Grammatical Dictionary automatically. We propose dictionary-based methods for extending grammatical lexicons. The input lexicon contains only a small number of words with identical grammatical features. The extension is based on similarity measures between features of words; more precisely, we add to the extended lexicons words that are similar to those already in the grammatical dictionary. Our dictionaries are corpus-based, and for the compilation we introduce a method for lemmatization of unknown words, i.e., words of which neither the full form nor the lemma is in the grammatical dictionary.

Keywords: acquisition of lexicon, Georgian grammatical dictionary, lemmatization rules, morphological processor

Procedia PDF Downloads 124
3875 Resume Ranking Using Custom Word2vec and Rule-Based Natural Language Processing Techniques

Authors: Subodh Chandra Shakya, Rajendra Sapkota, Aakash Tamang, Shushant Pudasaini, Sujan Adhikari, Sajjan Adhikari

Abstract:

Considerable effort has gone into measuring the semantic similarity between text corpora in documents, and techniques have evolved to measure the similarity of two documents. One such state-of-the-art technique in the field of Natural Language Processing (NLP) is the word-to-vector model, which converts words into word embeddings and measures the similarity between the vectors. We found this to be quite useful for the task of resume ranking. This research paper therefore implements the word2vec model along with other Natural Language Processing techniques to rank resumes for a particular job description so as to automate the hiring process. The paper proposes the system and reports the findings made while building it.
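The embedding-and-similarity idea described in this abstract can be illustrated with a short Python sketch. It is not the authors' system: the three-dimensional vectors in `TOY_VECTORS` are hypothetical stand-ins for what a trained word2vec model would supply, documents are embedded by simple averaging (an assumption), and the rule-based NLP steps mentioned in the title are left out.

```python
import math

# Hypothetical 3-dimensional word vectors; a trained word2vec model
# would provide real embeddings in practice.
TOY_VECTORS = {
    "python": (0.9, 0.1, 0.0),
    "java": (0.8, 0.2, 0.1),
    "developer": (0.7, 0.3, 0.0),
    "chef": (0.0, 0.1, 0.9),
    "cooking": (0.1, 0.0, 0.95),
}

def embed(text):
    """Average the vectors of known words; unknown words are skipped."""
    vectors = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return (0.0, 0.0, 0.0)
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_resumes(job_description, resumes):
    """Return (name, score) pairs sorted from most to least similar."""
    job_vec = embed(job_description)
    scored = [(name, cosine(job_vec, embed(text))) for name, text in resumes.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

ranking = rank_resumes(
    "python developer",
    {"alice": "java developer", "bob": "chef cooking"},
)
print(ranking)
```

With these toy vectors, the resume whose words lie closest to the job description in the embedding space is ranked first, even though it shares only one literal word with the description.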

Keywords: chunking, document similarity, information extraction, natural language processing, word2vec, word embedding

Procedia PDF Downloads 134
3874 Crop Classification using Unmanned Aerial Vehicle Images

Authors: Iqra Yaseen

Abstract:

One of the well-known areas of computer science and engineering, image processing in the context of computer vision has been essential to automation. In remote sensing, medical science, and many other fields, it has made it easier to uncover previously undiscovered facts. Grading of diverse items is now possible because of neural network algorithms, categorization, and digital image processing. Its use in the classification of agricultural products, particularly in the grading of seeds or grains and their cultivars, is widely recognized. A grading and sorting system saves time and ensures consistency and uniformity. Global population growth has led to an increase in demand for food staples, biofuel, and other agricultural products. To meet this demand, available resources must be used and managed more effectively. Image processing is rapidly growing in the field of agriculture. Many applications have been developed using this approach for crop identification and classification, land and disease detection, and for measuring other crop parameters. Vegetation localization is the basis for performing these tasks. Vegetation helps to identify the area where the crop is present. The productivity of the agriculture industry can be increased via image processing based on Unmanned Aerial Vehicle (UAV) photography and satellite imagery. In this paper, we apply machine learning techniques such as Convolutional Neural Networks, deep learning, image processing, classification, and You Only Look Once (YOLO) to a UAV imaging dataset to divide the crop into distinct groups and choose the best way to use it.

Keywords: image processing, UAV, YOLO, CNN, deep learning, classification

Procedia PDF Downloads 79
3873 Identifying and Understand Pragmatic Failures in Portuguese Foreign Language by Chinese Learners in Macau

Authors: Carla Lopes

Abstract:

It is clear nowadays that the proper performance of different speech acts is one of the most difficult obstacles that a foreign language learner has to overcome to be considered communicatively competent. This communication presents the results of an investigation on the pragmatic performance of Portuguese Language students at the University of Macau. The research discussed herein is based on a survey consisting of fourteen speaking situations to which the participants must respond in writing, and that includes different types of speech acts: apology, response to a compliment, refusal, complaint, disagreement and the understanding of the illocutionary force of indirect speech acts. The responses were classified on a five-level Likert scale (quantified from 1 to 5) according to their suitability for the particular situation. In general terms, about 45% of the respondents' answers were pragmatically competent, 10% were acceptable and 45% showed weaknesses at the socio-pragmatic competence level. Given that the linguistic deviations were not taken into account, we can conclude that the faults are of cultural origin. It is natural that in the presence of orthogonal cultures, such as Chinese and Portuguese, there are failures of this type, barely solved in the four years of the undergraduate program. The target population, native speakers of Cantonese or Mandarin, make their first contact with the English language before joining the Bachelor of Portuguese Language. An analysis of the socio-pragmatic failures in the respondents’ answers suggests that many of them are due to a lack of cultural knowledge. They try to compensate for this either by using their native culture or by resorting to a Western culture that they consider close to the Portuguese, that is, the English or US culture, previously studied and also widely present in the media and on the internet.
This phenomenon, known as 'pragmatic transfer', can result in a linguistic behavior that may be considered inauthentic or pragmatically awkward. The resulting speech act is grammatically correct but not pragmatically feasible, since it is not suited to the culture of the target language, either because it does not exist there or because the conditions of its use are in fact different. Analysis of the responses also supports the conclusion that these students present large deviations from the expected and stereotyped behavior of Chinese students. We can speculate that this linguistic behavior is a consequence of Macao's globalization, which culturally shapes the students, makes them more open, and distinguishes them from typical Chinese students.

Keywords: Portuguese foreign language, pragmatic failures, pragmatic transfer, pragmatic competence

Procedia PDF Downloads 193
3872 Family Satisfaction with Neuro-Linguistic Care for Patients with Alzheimer’s Disease

Authors: Sara Sahraoui

Abstract:

This research studied the effect of Alzheimer's disease (AD) on language information processing in subjects with Alzheimer’s disease (AD) who were bilingual (French and dialectical Arabic). The results show a disorder of certain semantic aspects of their mother tongue (L1). On the other hand, grammatical levels appeared to be relatively unaffected in oral speech in L1 but were disturbed in the second language (L2). In consequence, we constructed a cognitive-language stimulation protocol for bilingual patients (PSCLAB) to respond to this disorder. The efficacy of this protocol in terms of rehabilitation was assessed in 30 such patients through discourse analysis carried out before and after initiating the protocol. The results show that cognitive/language training using the PSCLAB appears to improve the language behaviour of bilingual patients with AD. However, this survey study aims to verify the satisfaction of patients’ relatives with the results of cognitive language training by PSCLAB. We developed a brief instrument to measure the satisfaction of family members. The results report that the patient's relatives are satisfied with the results of cognitive training by PSCLAB.

Keywords: satisfaction, Alzheimer's disease, rehabilitation, levels language

Procedia PDF Downloads 45
3871 A Survey on Speech Emotion-Based Music Recommendation System

Authors: Chirag Kothawade, Gourie Jagtap, PreetKaur Relusinghani, Vedang Chavan, Smitha S. Bhosale

Abstract:

Psychological research has shown that music relieves stress, elevates mood, and is responsible for the release of “feel-good” chemicals like oxytocin, serotonin, and dopamine. It comes as no surprise that music has been a popular tool in rehabilitation centers and in therapy for various disorders. With the steadily rising number of people facing mental health-related issues across the globe, addressing mental health concerns is more crucial than ever. Despite the existing music recommendation systems, there is a dearth of holistically curated algorithms that take care of the needs of users. Given that an undeniable majority of people turn to music on a regular basis and that music has been shown to improve cognition, memory, and sleep quality while reducing anxiety, pain, and blood pressure, there is a pressing need for a product that extracts the benefits of music in the most extensive and deployable way possible. Our project aims to improve our users’ mental state by building a comprehensive mood-based music recommendation system called “Viby”.

Keywords: language, communication, speech recognition, interaction

Procedia PDF Downloads 38
3870 Giant Achievements in Food Processing

Authors: Farnaz Amidi Fazli

Abstract:

After a long period of human experience with food processing, from eating raw food to the canning of food in the last century, it is now time to use novel technologies that are sometimes completely different from conventional ones. It is possible to decontaminate food without using heat, or to store foods without a cold chain. Pulsed electric field (PEF) processing is a non-thermal method of food preservation that uses short bursts of electricity; it can be used for processing liquid and semi-liquid food products. PEF processing offers high-quality, fresh-like liquid foods with excellent flavor, nutritional value, and shelf-life. High pressure processing (HPP) technology has the potential to fulfill both consumer and scientific requirements. HPP has been used for over 50 years in non-food industries. For food applications, ‘high pressure’ can generally be considered to be up to 600 MPa for most food products. Freezing, too, retains high potential for food preservation thanks to new quick-freezing methods. Foods prepared by this technology are more acceptable and of higher quality than those produced by old-fashioned slow freezing. Thus, quick freezing has been adopted as a widespread commercial method for long-term preservation of perishable foods, which has improved both the health and convenience of everyone in the industrialised countries. These results are achieved by fluidised-bed freezing systems, immersion freezing and hydrofluidisation. In addition, new thawing methods such as high-pressure, microwave, ohmic, and acoustic thawing play a key role in the quality and adaptability of the final product.

Keywords: quick freezing, thawing, high pressure, pulse electric, hydrofluidisation

Procedia PDF Downloads 302
3869 AI-Based Techniques for Online Social Media Network Sentiment Analysis: A Methodical Review

Authors: A. M. John-Otumu, M. M. Rahman, O. C. Nwokonkwo, M. C. Onuoha

Abstract:

Online social media networks have long served as a primary arena for group conversations, gossip, text-based information sharing and distribution. The use of natural language processing techniques for text classification and unbiased decision-making has not been far-fetched. Proper classification of this textual information in a given context has also been very difficult. As a result, we decided to conduct a systematic review of previous literature on sentiment classification and AI-based techniques that have been used in order to gain a better understanding of the process of designing and developing a robust and more accurate sentiment classifier that can correctly classify social media textual information of a given context between hate speech and inverted compliments with a high level of accuracy by assessing different artificial intelligence techniques. We evaluated over 250 articles from digital sources like ScienceDirect, ACM, Google Scholar, and IEEE Xplore and whittled down the number of research to 31. Findings revealed that Deep learning approaches such as CNN, RNN, BERT, and LSTM outperformed various machine learning techniques in terms of performance accuracy. A large dataset is also necessary for developing a robust sentiment classifier and can be obtained from places like Twitter, movie reviews, Kaggle, SST, and SemEval Task4. Hybrid Deep Learning techniques like CNN+LSTM, CNN+GRU, CNN+BERT outperformed single Deep Learning techniques and machine learning techniques. Python programming language outperformed Java programming language in terms of sentiment analyzer development due to its simplicity and AI-based library functionalities. Based on some of the important findings from this study, we made a recommendation for future research.

Keywords: artificial intelligence, natural language processing, sentiment analysis, social network, text

Procedia PDF Downloads 97
3868 Decision Making, Reward Processing and Response Selection

Authors: Benmansour Nassima, Benmansour Souheyla

Abstract:

The appropriate integration of reward processing and decision making provided by the environment is vital for behavioural success and individuals’ well-being in everyday life. Functional neurological investigation has already provided an inclusive picture of affective and emotional (motivational) processing in the healthy human brain and has recently also turned to the assessment of brain function in anxious and depressed individuals. This article offers an overview of the theoretical approaches that relate emotion and decision-making, and highlights research with anxious or depressed individuals to reveal how emotions can interfere with decision-making. This research aims at incorporating the emotional structure based on response and stimulation with a Bayesian approach to decision-making in terms of probability and value processing. It seeks to show how studies of individuals with emotional dysfunctions bear out that alterations of decision-making can be considered in terms of altered probability and value subtraction. The ultimate objective is to determine critically whether the probabilistic representation of belief could afford an approach for scrutinizing alterations in probability and value representation in subjects with anxiety and depression, and to outline the general implications of this approach.

Keywords: decision-making, motivation, alteration, reward processing, response selection

Procedia PDF Downloads 448
3867 A Corpus-Based Study on the Lexical, Syntactic and Sequential Features across Interpreting Types

Authors: Qianxi Lv, Junying Liang

Abstract:

Among the various modes of interpreting, simultaneous interpreting (SI) is regarded as a ‘complex’ and ‘extreme condition’ of cognitive tasks, while consecutive interpreting (CI) does not require interpreters to share processing capacity between tasks. Given that SI exerts great cognitive demand, it makes sense to posit that the output of SI may be more compromised than that of CI in its linguistic features. The bulk of the research has stressed the varying cognitive demand and processes involved in different modes of interpreting; however, related empirical research is sparse. In keeping with our interest in investigating the quantitative linguistic factors discriminating between SI and CI, the current study seeks to examine the potential lexical simplification, syntactic complexity and sequential organization mechanism with a self-made inter-model corpus of transcribed simultaneous and consecutive interpretation, translated speech and original speech texts with a total of 321,960 running words. The lexical features are extracted in terms of lexical density, list head coverage, hapax legomena, and type-token ratio, as well as core vocabulary percentage. Dependency distance, an index of syntactic complexity that reflects processing demand, is employed. The frequency motif, a non-grammatically-bound sequential unit, is also used to visualize the local function distribution of the interpreting output. While SI is generally regarded as multitasking with high cognitive load, our findings show that CI may tax cognitive resources more heavily, or in a different way, and hence yields more lexically and syntactically simplified output. In addition, the sequential features show that SI and CI organize the sequences from the source text into the output in different ways, each minimizing the cognitive load. We interpret the results within a framework in which cognitive demand is exerted on both the maintaining and the coordinating components of Working Memory.
On the one hand, the information maintained in CI is inherently larger in volume compared to SI. On the other hand, time constraints directly influence the sentence reformulation process. The temporal pressure from the input in SI makes the interpreters keep only a small chunk of information in the focus of attention. Thus, SI interpreters usually produce the output by largely retaining the source structure so as to release the information from working memory immediately after it is formulated in the target language. Conversely, CI interpreters receive at least a few sentences before reformulation, when they are more self-paced. CI interpreters may thus tend to retain and generate the information in a way that lessens the demand. In other words, interpreters cope with the high demand in the reformulation phase of CI by generating output with densely distributed function words, more content words of higher frequency values and fewer variations, simpler structures and more frequently used language sequences. We consequently propose a revised effort model based on the results for a better illustration of cognitive demand during both interpreting types.
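Three of the measures named in this abstract (type-token ratio, hapax legomena and dependency distance) are simple to compute once a text is tokenized and parsed. The Python sketch below is a minimal illustration and not the authors' pipeline: the sentence and its head positions are hand-annotated assumptions, and mean dependency distance is taken here as the average absolute difference between each word's position and its head's position, with the root excluded.

```python
from collections import Counter

def type_token_ratio(tokens):
    """Number of distinct word forms divided by the number of tokens."""
    return len(set(tokens)) / len(tokens)

def hapax_legomena(tokens):
    """Word forms that occur exactly once, in alphabetical order."""
    counts = Counter(tokens)
    return sorted(word for word, n in counts.items() if n == 1)

def mean_dependency_distance(head_positions):
    """Average |head position - word position|, skipping the root.

    head_positions[i] is the 1-based position of the head of word i + 1;
    0 marks the root of the sentence.
    """
    distances = [abs(head - pos)
                 for pos, head in enumerate(head_positions, start=1)
                 if head != 0]
    return sum(distances) / len(distances)

# Hypothetical example sentence with hand-annotated dependency heads:
# the -> interpreter, interpreter -> heard, heard = root,
# the -> speech, speech -> heard
tokens = "the interpreter heard the speech".split()
heads = [2, 3, 0, 5, 3]

ttr = type_token_ratio(tokens)
hapaxes = hapax_legomena(tokens)
mdd = mean_dependency_distance(heads)
print(ttr, hapaxes, mdd)
```

Lower type-token ratios and shorter dependency distances would correspond to the lexically and syntactically simplified output the abstract attributes to CI; on a real corpus the head positions would come from a dependency parser rather than hand annotation.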

Keywords: cognitive demand, corpus-based, dependency distance, frequency motif, interpreting types, lexical simplification, sequential units distribution, syntactic complexity

Procedia PDF Downloads 148