Search results for: speaker segmentation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 572

152 The Effect of Speech-Shaped Noise and Speaker’s Voice Quality on First-Grade Children’s Speech Perception and Listening Comprehension

Authors: I. Schiller, D. Morsomme, A. Remacle

Abstract:

Children’s ability to process spoken language develops until the late teenage years. At school, where efficient spoken language processing is key to academic achievement, listening conditions are often unfavorable. High background noise and a teacher’s poor voice quality are typical sources of interference. It can be assumed that these factors particularly affect primary school children, because their language and literacy skills are still low. While it is generally accepted that background noise and impaired voice impede spoken language processing, there is an increasing need to analyze their impact within specific linguistic areas. Against this background, the aim of the study was to investigate the effect of speech-shaped noise and imitated dysphonic voice on first-grade primary school children’s speech perception and sentence comprehension. Via headphones, 5- to 6-year-old children, recruited within the French-speaking community of Belgium, listened to and performed a minimal-pair discrimination task and a sentence-picture matching task. Stimuli were randomly presented according to four experimental conditions: (1) normal voice / no noise, (2) normal voice / noise, (3) impaired voice / no noise, and (4) impaired voice / noise. The primary outcome measure was task score. How did performance vary with respect to listening condition? Preliminary results will be presented with respect to speech perception and sentence comprehension and carefully interpreted in the light of past findings. This study supports our understanding of children’s language processing skills under adverse conditions. Results shall serve as a starting point for probing new measures to optimize children’s learning environment.

Keywords: impaired voice, sentence comprehension, speech perception, speech-shaped noise, spoken language processing

Procedia PDF Downloads 167
151 On the Implementation of The Pulse Coupled Neural Network (PCNN) in the Vision of Cognitive Systems

Authors: Hala Zaghloul, Taymoor Nazmy

Abstract:

One of the great challenges of the 21st century is to build a robot that can perceive and act within its environment and communicate with people, while also exhibiting the cognitive capabilities that lead to performance like that of people. The Pulse Coupled Neural Network (PCNN) is a relatively new ANN model, derived from a mammalian neural model, with great potential in image processing as well as target recognition, feature extraction, speech recognition, combinatorial optimization, and compressed encoding. PCNN has unique features among neural network types, which make it a candidate to be an important approach for perception in cognitive systems. This work shows and emphasizes the potential of PCNN to perform different tasks related to image processing. The main drawback, or the obstacle that prevents the direct implementation of this technique, is the need to find a way to control the PCNN parameters so that the network performs a specific task. This paper evaluates the performance of the standard PCNN model for processing images with different properties, selects the important parameters that give a significant result, and considers approaches for adapting the PCNN parameters to perform a specific task.

Keywords: cognitive system, image processing, segmentation, PCNN kernels

Procedia PDF Downloads 257
150 Automatic Music Score Recognition System Using Digital Image Processing

Authors: Yuan-Hsiang Chang, Zhong-Xian Peng, Li-Der Jeng

Abstract:

Music has always been an integral part of human daily life. However, for most people, reading a musical score and turning it into a melody is not easy. This study aims to develop an automatic music score recognition system using digital image processing, which can be used to read and analyze musical score images automatically. The technical approaches included: (1) staff region segmentation; (2) image preprocessing; (3) note recognition; and (4) accidental and rest recognition. Digital image processing techniques (e.g., horizontal/vertical projections, connected component labeling, morphological processing, template matching, etc.) were applied according to musical notes, accidentals, and rests in staff notations. Preliminary results showed that our system could achieve detection and recognition rates of 96.3% and 91.7%, respectively. In conclusion, we presented an effective automated musical score recognition system that could be integrated in a system with a media player to play music/songs given input images of musical score. Ultimately, this system could also be incorporated in applications for mobile devices as a learning tool, such that a music player could learn to play music/songs.
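
The horizontal projection named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy image, and the 0.8 fill ratio are assumptions chosen for illustration. A row is treated as a staff-line candidate when most of its pixels are ink.

```python
def horizontal_projection(image):
    """Return the number of ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]


def find_staff_rows(image, ratio=0.8):
    """Return indices of rows whose ink count is at least `ratio` of the width.

    `ratio` is an assumed tuning parameter, not a value from the paper.
    """
    width = len(image[0])
    profile = horizontal_projection(image)
    return [y for y, count in enumerate(profile) if count >= ratio * width]


# Toy 7x10 binary image: rows 1, 3 and 5 are (nearly) full staff lines,
# row 2 holds a short note fragment that should not be detected.
toy_image = [
    [0] * 10,
    [1] * 10,
    [0, 0, 0, 1, 1, 0, 0, 0, 0, 0],
    [1] * 10,
    [0] * 10,
    [1] * 9 + [0],
    [0] * 10,
]
staff_rows = find_staff_rows(toy_image)
print(staff_rows)
```

A vertical projection works the same way on columns and is commonly used to separate symbols along a staff once the staff rows are known.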

Keywords: connected component labeling, image processing, morphological processing, optical musical recognition

Procedia PDF Downloads 393
149 Study of Aerosol Deposition and Shielding Effects on Fluorescent Imaging Quantitative Evaluation in Protective Equipment Validation

Authors: Shinhao Yang, Hsiao-Chien Huang, Chin-Hsiang Luo

Abstract:

The leakage of protective clothing is an important issue in the occupational health field. There is no quantitative method for measuring the leakage of personal protective equipment. This work aims to measure the leakage of personal protective equipment quantitatively by using a fluorochrome aerosol tracer. The fluorescent aerosols were employed as airborne particulates in a controlled chamber with ultraviolet (UV) light-detectable stickers. After an exposure-and-leakage test, the protective equipment was removed and photographed with UV-scanning to evaluate areas, color depth ratio, and aerosol deposition and shielding effects of the areas where fluorescent aerosols had adhered to the body through the protective equipment. Thus, this work built calculation software for the quantitative leakage ratio of protective clothing based on the fluorescent illumination depth/aerosol concentration ratio, illumination/Fa ratio, aerosol deposition and shielding effects, and the leakage area ratio on the segmentation. The results indicated that the two-repetition total leakage rates of the X, Y, and Z type protective clothing for subject T were about 3.05, 4.21, and 3.52 mg/m², respectively. For five repetitions, the leakage rates for subject T were about 4.12, 4.52, and 5.11 mg/m².

Keywords: fluorochrome, deposition, shielding effects, digital image processing, leakage ratio, personal protective equipment

Procedia PDF Downloads 299
148 Neural Machine Translation for Low-Resource African Languages: Benchmarking State-of-the-Art Transformer for Wolof

Authors: Cheikh Bamba Dione, Alla Lo, Elhadji Mamadou Nguer, Siley O. Ba

Abstract:

In this paper, we propose two neural machine translation (NMT) systems (French-to-Wolof and Wolof-to-French) based on sequence-to-sequence with attention and transformer architectures. We trained our models on a parallel French-Wolof corpus of about 83k sentence pairs. Because of the low-resource setting, we experimented with advanced methods for handling data sparsity, including subword segmentation, back translation, and the copied corpus method. We evaluate the models using the BLEU score and find that transformer outperforms the classic seq2seq model in all settings, in addition to being less sensitive to noise. In general, the best scores are achieved when training the models on word-level-based units. For subword-level models, using back translation proves to be slightly beneficial in low-resource (WO) to high-resource (FR) language translation for the transformer (but not for the seq2seq) models. A slight improvement can also be observed when injecting copied monolingual text in the target language. Moreover, combining the copied method data with back translation leads to a substantial improvement of the translation quality.

Keywords: backtranslation, low-resource language, neural machine translation, sequence-to-sequence, transformer, Wolof

Procedia PDF Downloads 119
147 Deep Supervision Based-Unet to Detect Buildings Changes from VHR Aerial Imagery

Authors: Shimaa Holail, Tamer Saleh, Xiongwu Xiao

Abstract:

Building change detection (BCD) from satellite imagery is an essential topic in urbanization monitoring, agricultural land management, and updating geospatial databases. Recently, deep learning methods for change detection have made significant progress and achieved impressive results. However, they are insensitive to changes in buildings with complex spectral differences, and the extracted features are not discriminative enough, resulting in incomplete buildings and irregular boundaries. To overcome these problems, we propose a dual Siamese network based on the Unet model with the addition of a deep supervision (DS) strategy. This network consists of a backbone (encoder) based on ImageNet pre-training, a fusion block, and feature pyramid networks (FPN) to enhance the step-by-step information of the changing regions and obtain a more accurate BCD map. To train the proposed method, we created a new dataset (EGY-BCD) of high-resolution, multi-temporal aerial images captured over New Cairo in Egypt. The experimental results showed that the proposed method is effective and performs well on the EGY-BCD dataset in terms of overall accuracy, F1-score, and mIoU, which were 91.6%, 80.1%, and 73.5%, respectively.
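
For readers unfamiliar with the three reported metrics, the following Python sketch shows how overall accuracy, F1-score, and mIoU are commonly computed from the pixel counts of a binary change map. The function name and the example counts are hypothetical, and the abstract does not state which exact definitions the authors used (for instance, whether mIoU averages over the change and no-change classes, as assumed here).

```python
def change_detection_metrics(tp, fp, fn, tn):
    """Compute common binary change-detection metrics from pixel counts.

    tp/fp/fn/tn are true/false positives/negatives for the 'change' class.
    mIoU is taken here as the mean IoU of the change and no-change classes
    (an assumption; other conventions exist).
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    iou_change = tp / (tp + fp + fn)
    iou_no_change = tn / (tn + fp + fn)
    return {
        "overall_accuracy": (tp + tn) / total,
        "f1": 2 * precision * recall / (precision + recall),
        "miou": (iou_change + iou_no_change) / 2,
    }


# Hypothetical counts for a 100-pixel map, not data from the paper.
metrics = change_detection_metrics(tp=8, fp=2, fn=2, tn=88)
print(metrics)
```

Because changed pixels are usually rare, overall accuracy tends to be much higher than F1 or mIoU, which is consistent with the ordering of the three figures reported above.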

Keywords: building change detection, deep supervision, semantic segmentation, EGY-BCD dataset

Procedia PDF Downloads 84
146 The Trajectory of the Ball in Football Game

Authors: Mahdi Motahari, Mojtaba Farzaneh, Ebrahim Sepidbar

Abstract:

Tracking moving and flying targets is one of the most important problems in image processing. Estimating the trajectory of a desired object on short-term and long-term scales is more important than tracking moving and flying targets. In this paper, a new way of identifying and estimating the future trajectory of a moving ball on a long-term scale is proposed. It combines image processing algorithms, including noise removal and image segmentation; a Kalman filter algorithm to estimate the trajectory of the ball in a football game on a short-term scale; and an intelligent adaptive neuro-fuzzy algorithm based on a time series of traversed distance. The proposed system attains more than 96% identification accuracy by using the aforesaid methods in combination and relying on the aforesaid algorithms and a video database. Although the present method has high precision, it is time consuming. Comparing this method with other methods demonstrates its accuracy and efficiency.
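
The short-term estimation step can be illustrated with a minimal constant-velocity Kalman filter in Python. This is a sketch under stated assumptions, not the authors' model: the abstract does not give the state vector or noise settings, so the single-coordinate state [position, velocity], the noise values q and r, and the synthetic measurements are all hypothetical.

```python
def kalman_step(state, cov, z, dt=1.0, q=1e-3, r=1.0):
    """One predict/update cycle of a constant-velocity Kalman filter.

    state = [position, velocity]; cov = 2x2 covariance as nested lists;
    z = measured position. q and r are assumed process and measurement
    noise variances.
    """
    p, v = state
    (a, b), (c, d) = cov
    # Predict: x' = F x and P' = F P F^T + Q, with F = [[1, dt], [0, 1]].
    p_pred = p + dt * v
    a_p = a + dt * (b + c) + dt * dt * d + q
    b_p = b + dt * d
    c_p = c + dt * d
    d_p = d + q
    # Update with H = [1, 0]: only the position is observed.
    innovation = z - p_pred
    s = a_p + r
    k0, k1 = a_p / s, c_p / s
    new_state = [p_pred + k0 * innovation, v + k1 * innovation]
    new_cov = [
        [(1 - k0) * a_p, (1 - k0) * b_p],
        [c_p - k1 * a_p, d_p - k1 * b_p],
    ]
    return new_state, new_cov


# Synthetic ball coordinate moving 2 units per frame (hypothetical data).
state, cov = [0.0, 0.0], [[100.0, 0.0], [0.0, 100.0]]
for frame in range(1, 31):
    state, cov = kalman_step(state, cov, z=2.0 * frame)

# Short-term prediction of the position in the next frame.
next_position = state[0] + state[1]
print(state, next_position)
```

In a real tracker the measurement z would come from the segmentation stage, and a second coordinate (and possibly gravity for a flying ball) would be added to the state.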

Keywords: tracking, signal processing, moving targets and flying, artificial intelligent systems, estimating of trajectory, Kalman filter

Procedia PDF Downloads 440
145 An Event-Related Potential Investigation of Speech-in-Noise Recognition in Native and Nonnative Speakers of English

Authors: Zahra Fotovatnia, Jeffery A. Jones, Alexandra Gottardo

Abstract:

Speech communication often occurs in environments where noise conceals part of a message. Listeners should compensate for the lack of auditory information by picking up distinct acoustic cues and using semantic and sentential context to recreate the speaker’s intended message. This situation seems to be more challenging in a nonnative than native language. On the other hand, early bilinguals are expected to show an advantage over the late bilingual and monolingual speakers of a language due to their better executive functioning components. In this study, English monolingual speakers were compared with early and late nonnative speakers of English to understand speech in noise processing (SIN) and the underlying neurobiological features of this phenomenon. Auditory mismatch negativities (MMNs) were recorded using a double-oddball paradigm in response to a minimal pair that differed in their middle vowel (beat/bit) at Wilfrid Laurier University in Ontario, Canada. The results did not show any significant structural and electroneural differences across groups. However, vocabulary knowledge correlated positively with performance on tests that measured SIN processing in participants who learned English after age 6. Moreover, their performance on the test negatively correlated with the integral area amplitudes in the left superior temporal gyrus (STG). In addition, the STG was engaged before the inferior frontal gyrus (IFG) in noise-free and low-noise test conditions in all groups. We infer that the pre-attentive processing of words engages temporal lobes earlier than the fronto-central areas and that vocabulary knowledge helps the nonnative perception of degraded speech.

Keywords: degraded speech perception, event-related brain potentials, mismatch negativities, brain regions

Procedia PDF Downloads 83
144 Video Text Information Detection and Localization in Lecture Videos Using Moments

Authors: Belkacem Soundes, Guezouli Larbi

Abstract:

This paper presents a robust and accurate method for text detection and localization in lecture videos. Frame regions are classified as text or background based on visual feature analysis. However, lecture videos show significant degradation, mainly related to acquisition conditions, camera motion, and environmental changes, resulting in low-quality videos and thus affecting feature extraction and description efficiency. Moreover, traditional text detection methods cannot be directly applied to lecture videos. Therefore, robust feature extraction methods dedicated to this specific video genre are required for robust and accurate text detection and extraction. The method consists of a three-step process: slide region detection and segmentation, feature extraction, and non-text filtering. For robust and effective feature extraction, moment functions are used. Two distinct types of moments are used: orthogonal and non-orthogonal. For orthogonal moments, both Zernike and pseudo-Zernike moments are used, whereas for non-orthogonal ones Hu moments are used. Expressivity and description efficiency are given and discussed. The results show that, in general, orthogonal moments achieve higher accuracy than non-orthogonal ones. Pseudo-Zernike moments are more effective than Zernike moments, with better computation time.

Keywords: text detection, text localization, lecture videos, pseudo zernike moments

Procedia PDF Downloads 128
143 Decision Making Approach through Generalized Fuzzy Entropy Measure

Authors: H. D. Arora, Anjali Dhiman

Abstract:

Uncertainty is found everywhere and its understanding is central to decision making. Uncertainty emerges when one has less information than the total information required to describe a system and its environment. Uncertainty and information are so closely associated that the information provided by an experiment, for example, is equal to the amount of uncertainty removed. It may be pertinent to point out that uncertainty manifests itself in several forms, and various kinds of uncertainty may arise from random fluctuations, incomplete information, imprecise perception, vagueness, etc. For instance, one encounters uncertainty due to vagueness in communication through natural language. Uncertainty in this sense is represented by fuzziness resulting from imprecision in the meaning of a concept expressed by linguistic terms. The fuzzy set concept provides an appropriate mathematical framework for dealing with vagueness. Both information theory, proposed by Shannon (1948), and fuzzy set theory, given by Zadeh (1965), play an important role in human intelligence and various practical problems such as image segmentation, medical diagnosis, etc. Numerous approaches and theories dealing with inaccuracy and uncertainty have been proposed by different researchers. In the present communication, we generalize the fuzzy entropy proposed by De Luca and Termini (1972) corresponding to Shannon entropy (1948). Further, some of the basic properties of the proposed measure are examined. We also apply the proposed measure to a real-life decision-making problem.
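
As background for the generalization mentioned above, the classical De Luca and Termini fuzzy entropy of a fuzzy set with membership grades μ_1, ..., μ_n is H(A) = -K Σ [μ_i ln μ_i + (1 - μ_i) ln(1 - μ_i)]. The Python sketch below computes it with the normalising constant K = 1/n, which is an assumption made here for illustration; the generalized measure proposed in the paper is not given in the abstract and is not reproduced.

```python
import math


def de_luca_termini_entropy(memberships):
    """Normalised De Luca-Termini fuzzy entropy of a list of membership grades.

    Uses K = 1/n (an assumed normalisation) and the convention 0 * ln(0) = 0.
    """
    def term(mu):
        # Crisp grades (0 or 1) carry no fuzziness and contribute nothing.
        if mu in (0.0, 1.0):
            return 0.0
        return mu * math.log(mu) + (1 - mu) * math.log(1 - mu)

    return -sum(term(mu) for mu in memberships) / len(memberships)


crisp = de_luca_termini_entropy([0.0, 1.0, 1.0])   # a crisp set has zero entropy
fuzziest = de_luca_termini_entropy([0.5, 0.5])     # maximal fuzziness gives ln 2
print(crisp, fuzziest)
```

The two calls show the defining properties of the measure: it vanishes for crisp sets and reaches its maximum when every membership grade equals 0.5.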

Keywords: entropy, fuzzy sets, fuzzy entropy, generalized fuzzy entropy, decision making

Procedia PDF Downloads 419
142 Brain Tumor Detection and Classification Using Pre-Trained Deep Learning Models

Authors: Aditya Karade, Sharada Falane, Dhananjay Deshmukh, Vijaykumar Mantri

Abstract:

Brain tumors pose a significant challenge in healthcare due to their complex nature and impact on patient outcomes. The application of deep learning (DL) algorithms in medical imaging has shown promise in accurate and efficient brain tumour detection. This paper explores the performance of various pre-trained DL models (ResNet50, Xception, InceptionV3, EfficientNetB0, DenseNet121, NASNetMobile, VGG19, VGG16, and MobileNet) on a brain tumour dataset sourced from Figshare. The dataset consists of MRI scans categorized into different types of brain tumour, including meningioma, pituitary, glioma, and no tumour. The study involves a comprehensive evaluation of these models’ accuracy and effectiveness in classifying brain tumour images. Data preprocessing, augmentation, and fine-tuning techniques are employed to optimize model performance. Among the evaluated deep learning models for brain tumour detection, ResNet50 emerges as the top performer with an accuracy of 98.86%. Following closely is Xception, exhibiting a strong accuracy of 97.33%. These models showcase robust capabilities in accurately classifying brain tumour images. On the other end of the spectrum, VGG16 trails with the lowest accuracy at 89.02%.

Keywords: brain tumour, MRI image, detecting and classifying tumour, pre-trained models, transfer learning, image segmentation, data augmentation

Procedia PDF Downloads 47
141 Enhancing Learners' Metacognitive, Cultural and Linguistic Proficiency through Egyptian Series

Authors: Hanan Eltayeb, Reem Al Refaie

Abstract:

To be able to connect and relate to shows spoken in a foreign language, advanced learners must understand not only linguistic inferences but also cultural, metacognitive, and pragmatic connotations in colloquial Egyptian TV series. These connotations are needed to understand the different facets of the dramas put before learners, and they are also consistently developed and shaped through watching these shows. The inferences have become a staple of Egyptian colloquial culture over the years, making their way into day-to-day conversations as Egyptians use them to speak, relate, joke, and connect with each other, even without having known one another before. Advanced learners need to understand these inferences not only to watch these shows, but also to be able to converse with Egyptians on a level that surpasses the formal, or standard. When faced with some of the fairly recent shows on Egyptian screens, learners had difficulty understanding the pragmatics and the cultural and religious background of the target language and were consequently not able to interact effectively with a native speaker in real-life situations. This study aims to enhance the linguistic and cultural proficiency of learners through studying two genres of colloquial Egyptian TV series. Study samples were derived from two recent comedy and social Egyptian series ('The Seventh Neighbor' سابع جار, and 'Nelly and Sherihan' نيللي و شريهان). When learners watch such series, they usually have trouble understanding inferences that have to do with social, religious, and political events addressed in the series. Using discourse analysis of the semantic, pragmatic, cultural, and linguistic characteristics of the target language, some major deductions were highlighted and repeated, showing a pattern in both.
The research paper concludes that there are many sets of lingual and para-lingual phrases, idioms, and proverbs to be acquired and used effectively by teaching these series. The strategies adopted in the study can be applied to different types of media, like movies, TV shows, and even cartoons, to enhance student proficiency.

Keywords: Egyptian series, culture, linguistic competence, pragmatics, semantics, social

Procedia PDF Downloads 115
140 Autonomous Vehicle Detection and Classification in High Resolution Satellite Imagery

Authors: Ali J. Ghandour, Houssam A. Krayem, Abedelkarim A. Jezzini

Abstract:

High-resolution satellite images and remote sensing can provide global information quickly compared to traditional methods of data collection. Under such high resolution, a road is not a thin line anymore. Objects such as cars and trees are easily identifiable. Automatic vehicle enumeration can be considered one of the most important applications in traffic management. In this paper, an autonomous vehicle detection and classification approach for highway environments is proposed. This approach consists mainly of three stages: (i) first, a set of preprocessing operations is applied, including soil, vegetation, and water suppression. (ii) Then, road network detection and delineation is implemented using a built-up area index, followed by several morphological operations. This step plays an important role in increasing the overall detection accuracy, since vehicle candidates are objects contained within the road networks only. (iii) Multi-level Otsu segmentation is implemented in the last stage, resulting in vehicle detection and classification, where detected vehicles are classified into cars and trucks. Accuracy assessment analysis is conducted over different study areas to show the efficiency of the proposed method, especially in highway environments.
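
Stage (iii) relies on multi-level Otsu thresholding, which picks the gray-level cut points that maximise the between-class variance of the histogram. The Python sketch below is an exhaustive-search illustration for three classes, not the authors' code; the function name, the number of classes, and the toy pixel values are assumptions.

```python
from itertools import combinations


def multi_otsu_thresholds(pixels, classes=3, levels=256):
    """Exhaustive multi-level Otsu thresholding.

    Returns the cut points that maximise between-class variance; a pixel g
    belongs to class k when cuts[k-1] <= g < cuts[k]. Returns None if the
    pixels cannot fill `classes` non-empty classes.
    """
    hist = [0] * levels
    for g in pixels:
        hist[g] += 1
    # Prefix sums of pixel counts and of intensity mass per gray level.
    count = [0] * (levels + 1)
    mass = [0] * (levels + 1)
    for g in range(levels):
        count[g + 1] = count[g] + hist[g]
        mass[g + 1] = mass[g] + g * hist[g]

    best_score, best_cuts = -1.0, None
    for cuts in combinations(range(1, levels), classes - 1):
        bounds = (0,) + cuts + (levels,)
        score = 0.0
        for lo, hi in zip(bounds, bounds[1:]):
            n = count[hi] - count[lo]
            if n == 0:
                break  # skip partitions that leave a class empty
            s = mass[hi] - mass[lo]
            # n * mean^2; maximising the sum maximises between-class variance.
            score += s * s / n
        else:
            if score > best_score:
                best_score, best_cuts = score, cuts
    return best_cuts


# Toy image with three intensity populations (hypothetical values).
pixels = [10] * 50 + [100] * 30 + [200] * 20
t1, t2 = multi_otsu_thresholds(pixels)
print(t1, t2)
```

The search is quadratic in the number of gray levels for three classes, which is acceptable for 8-bit images; library implementations typically use faster recursive formulations.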

Keywords: remote sensing, object identification, vehicle and road extraction, vehicle and road features-based classification

Procedia PDF Downloads 209
139 Sociolinguistic Aspects and Language Contact, Lexical Consequences in Francoprovençal Settings

Authors: Carmela Perta

Abstract:

In Italy, the coexistence of the standard language, its varieties, and different minority languages (historical and migration languages) has been a way to study language contact in different directions; the focus of most studies is either the relations among the languages of the social repertoire or contact phenomena occurring at a particular structural level. However, studies on contact facts in relation to the given sociolinguistic situation of the speech community are still absent from the literature. As regards the language level to investigate from the perspective of contact, it is commonly claimed that the lexicon is the most volatile part of language and the most likely to undergo change due to superstrate influence: lexical features are borrowed first; then, under long-term cultural pressure, structural features may also be borrowed. The aim of this paper is to analyse language contact in two historical minority communities where Francoprovençal is spoken, in relation to their sociolinguistic situation. In this perspective, firstly, lexical borrowings present in speakers’ speech production will be examined, trying to find a possible correlation between this part of the lexicon and informants’ sociolinguistic variables; secondly, a possible correlation between a particular community’s sociolinguistic situation and lexical borrowing will be sought. Methods used to collect data are based on the results obtained from 24 speakers in both the villages; the speaker group in the two communities consisted of 3 males and 3 females in each of four age groups, ranging in age from 9 to 85, and then divided into five groups according to their occupations. Speakers were asked to describe a sequence of pictures, naming common objects and then describing scenes in which they used these objects: these are common objects, frequently named and belonging to semantic areas which are usually resistant and which are thought to survive.
A subset of this task, involving 19 items with an Italian source, is examined here: in order to determine the significance of the independent variables (social factors) on the dependent variable (lexical variation), the statistical package SPSS, particularly linear regression, was used.

Keywords: borrowing, Francoprovençal, language change, lexicon

Procedia PDF Downloads 350
138 An Event-Related Potential Study of Individual Differences in Word Recognition: The Evidence from Morphological Knowledge of Sino-Korean Prefixes

Authors: Jinwon Kang, Seonghak Jo, Joohee Ahn, Junghye Choi, Sun-Young Lee

Abstract:

Morphological priming has proved its importance by showing that segmentation into morphemes occurs when visual words are recognized within a noticeably short time. Regarding Sino-Korean prefixes, this study conducted a visual masked priming experiment with a 57 ms stimulus-onset asynchrony (SOA) to see how individual differences in the amount of morphological knowledge affect morphological priming. The relationship between the prime and target words was classified into morphological (e.g., 미개척 migaecheog [unexplored] – 미해결 mihaegyel [unresolved]), semantic (e.g., 친환경 chinhwangyeong [eco-friendly] – 무공해 mugonghae [no-pollution]), and orthographic (e.g., 미용실 miyongsil [beauty shop] – 미확보 mihwagbo [uncertainty]) conditions. We then compared the priming by configuring irrelevant paired stimuli as the control for each condition. As a result, in the behavioral data, we observed facilitatory priming from the group with high morphological knowledge only under the morphological condition. In contrast, the group with low morphological knowledge showed priming only under the orthographic condition. In the event-related potential (ERP) data, the group with high morphological knowledge presented the N250 only under the morphological condition. The findings of this study imply that individual differences in morphological knowledge in Korean may have a significant influence on the segmental processing of Korean word recognition.

Keywords: ERP, individual differences, morphological priming, sino-Korean prefixes

Procedia PDF Downloads 187
137 Lexical Knowledge of Verb Particle Constructions with the Particle on by Mexican English Learners

Authors: Sarai Alvarado Pineda, Ricardo Maldonado Soto

Abstract:

The acquisition of Verb Particle Constructions is a challenge for Spanish speakers learning English. The acquisition is particularly difficult for speakers of languages with no verb particle constructions. The purpose of the current study is to define the procedural steps in the acquisition of constructions with the particle on. There are three outstanding meanings for the particle on: Surface (The movie is based on a true story), Activation (John turned on the light), and Continuity (The band played on all night). The central aim of this study is to measure how Mexican Spanish participants respond both to the three meanings mentioned above and to the degree of meaning transparency/opacity of on verb particle constructions. Forty Mexican Spanish learners of English (20 basic and 20 advanced) are compared against a control group of 20 American native English speakers through a reaction time test (PsychoPy2 2015). The participants were asked to discriminate 90 items based on their knowledge of these constructions. There are 30 items per meaning, divided into two groups of transparent and opaque meaning. Results revealed three major findings: Advanced students have a reaction time similar to that of native speakers (advanced 4.5s versus native 3.7s), while students with a lower level of English proficiency show a high reaction time (7s). Likewise, there is a shorter reaction time in constructions with lower opacity in the three groups of participants, with differences between each level (basic 6.7s, advanced 4.3s, and native 3.4s). Finally, a difference in reaction time can be identified according to the meaning provided by the construction. The reaction time for the activation category (5.27s) is greater than for continuity (5.04s), which in turn is slower than surface (4.94s).
The study shows that the level of sensitivity of English learners increases significantly aiming towards native speaker patterns as determined by the level of transparency of meaning of each construction as well as the degree of entrenchment of each constructional meaning.

Keywords: meaning of the particle, opacity, reaction time, verb particle constructions

Procedia PDF Downloads 243
136 Comparative Study of Case Files in the Context of H. P. Grice’s Pragmatic Theory

Authors: Tugce Arslan

Abstract:

For a communicative act to be carried out successfully, the speaker and the listener must consider certain principles in line with the intention-centered “Cooperative Principle” expressed by H. P. Grice. Violation of a communication principle causes the listener to make new inferences called “implicatures”. In this study, focusing on the linguistic use of H. P. Grice’s principles, we aim to find out which principles of conversation are generally followed in case files from different fields and which principles are frequently violated. Three case files were examined, and the violating and the abiding cases of the maxims were classified in terms of four categories (Quality, Quantity, Relevance and Manner). The results of this investigation are reported below (V: Violating, A: Abiding):

         Quality    Quantity   Relevance  Manner
         V    A     V    A     V    A     V    A
Case 1   10   8     5    9     3    15    16   6
Case 2   4    5     11   6     2    11    7    14
Case 3   21   13    7    12    9    14    15   9
Total    35   26    23   27    14   40    38   29

The excerpts were selected from files covering three different areas: the Assize Court, the Family Court and the Commercial Court of First Instance. In this way, the relations between the types of violations and the types of courts are examined. Our main finding is that in the 1st and the 3rd file, as the cases of violation in “Quality” and “Manner” increase, the cases of violation in “Quantity” and “Relevance” decrease. In the second file, on the other hand, as the cases of violation in “Quantity” increase, the cases of violation in “Quality”, “Relevance” and “Manner” decrease. In the talk, we shall compare these results with the results obtained in the study of Tajabadi, Dowlatabadi, and Mehric (2014), which examined various case files in Iran. Our main finding is that in the study conducted in Iran, violations were found only on the principles of “Quantity” and “Relevance”, while violations were found on the principles of “Quality”, “Quantity” and “Manner” in this study.
This suggests that there is a connection between at least two maxims. In both cases, the “Quantity” maxim is a common denominator. Studies in this field can be enlightening for many areas such as discourse analysis, legal studies, etc. Accordingly, comments will be made about the nature of the violations mentioned in H. P. Grice’s “Cooperation Principle”. We shall also discuss various conversational practices that cannot be analysed with these maxims.

Keywords: comparative analysis, cooperation principle, forensic linguistics, pragmatic.

Procedia PDF Downloads 193
135 Technology Transfer of Indigenous Technologies: Emerging Aid to Indian Health Sector

Authors: Tripta Dixit, Smita Sahu, William Selvamurthy, Sadhana Srivastava

Abstract:

India is battling with the issues of accessibility, affordability and availability of quality health for the masses. Indian medical heritage, which dates back to 3000 BC, unveils a rich knowledge pool that has undergone perceptible change over the years, such as the eradication of many communicable diseases, increasing individual awareness of quality health, and an import-driven medical device market. Despite a slew of initiatives, the holistic slogan of ‘health for all’ remains elusive and a concern for the nation. The 21st century presents a myriad of challenges such as cultural diversity, a large population, the demographic dividend, and geographical segmentation, leading to varied needs of people according to their regional conditions of climate, disease prevalence, nutrition and sanitation. But these challenges are also opportunities for the development of indigenous, low-cost and accessible technologies to tackle them. This requires reinforcing the potential of indigenous technologies in coordination with the prevailing health issues in various regions of the country. This paper emphasizes a strategy for exploring indigenous technologies with entrusted up-scaling to meet the diverse needs of the people. This review proposes to adopt technology transfer as a strategy to establish a vibrant ecosystem for identifying and up-scaling indigenous medical technologies with diligent hand-holding for public health.

Keywords: health, indigenous, medical technology, technology transfer

Procedia PDF Downloads 233
134 Random Forest Classification for Population Segmentation

Authors: Regina Chua

Abstract:

To reduce the costs of re-fielding a large survey, a Random Forest classifier was applied to measure the accuracy of classifying individuals into their assigned segments with the fewest possible questions. Given a long survey, one needed to determine the most predictive ten or fewer questions that would accurately assign new individuals to custom segments. Furthermore, the solution needed to be quick in its classification and usable in non-Python environments. In this paper, a supervised Random Forest classifier was modeled on a dataset with 7,000 individuals, 60 questions, and 254 features. The Random Forest consisted of an iterative collection of individual decision trees that result in a predicted segment with robust precision and recall scores compared to a single tree. A random 70-30 stratified sampling for training the algorithm was used, and accuracy trade-offs at different depths for each segment were identified. Ultimately, the Random Forest classifier performed at 87% accuracy at a depth of 10 with 20 instead of 254 features and 10 instead of 60 questions. With an acceptable accuracy in prioritizing feature selection, new tools were developed for non-Python environments: a worksheet with a formulaic version of the algorithm and an embedded function to predict the segment of an individual in real-time. Random Forest was determined to be an optimal classification model by its feature selection, performance, processing speed, and flexible application in other environments.
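The random 70-30 stratified sampling mentioned in the abstract can be sketched as follows. This is a minimal illustration using only the Python standard library, not the author's code: the function name `stratified_split`, the seed, and the toy segment labels are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_frac=0.7, seed=0):
    """Split indices so that each segment contributes about train_frac to training."""
    rng = random.Random(seed)
    by_segment = defaultdict(list)
    for idx, segment in enumerate(labels):
        by_segment[segment].append(idx)

    train_idx, test_idx = [], []
    for indices in by_segment.values():
        rng.shuffle(indices)                      # random order within the segment
        cut = round(len(indices) * train_frac)    # 70% of this segment goes to training
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical toy data: 20 respondents in segment "A", 10 in segment "B".
labels = ["A"] * 20 + ["B"] * 10
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 21 9
```

Splitting within each segment, rather than over the whole sample, keeps the segment proportions in the training and test sets close to those of the full survey, which matters when some segments are small.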

Keywords: machine learning, supervised learning, data science, random forest, classification, prediction, predictive modeling

Procedia PDF Downloads 73
133 Classification of Land Cover Usage from Satellite Images Using Deep Learning Algorithms

Authors: Shaik Ayesha Fathima, Shaik Noor Jahan, Duvvada Rajeswara Rao

Abstract:

Earth's environment and its evolution can be seen through satellite images in near real-time. Through satellite imagery, remote sensing data provide crucial information that can be used for a variety of applications, including image fusion, change detection, land cover classification, agriculture, mining, disaster mitigation, and monitoring climate change. The objective of this project is to propose a method for classifying satellite images according to multiple predefined land cover classes. The proposed approach involves collecting data in image format. The data is then pre-processed, fed into the proposed algorithm, and the obtained result is analyzed. Some of the algorithms used in satellite imagery classification are U-Net, Random Forest, DeepLabv3, CNN, ANN, ResNet, etc. In this project, we use the DeepLabv3 (atrous convolution) algorithm for land cover classification. The dataset used is the DeepGlobe land cover classification dataset. DeepLabv3 is a semantic segmentation system that uses atrous convolution to capture multi-scale context by adopting multiple atrous rates in cascade or in parallel to determine the scale of segments.
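To make the idea of an atrous rate concrete, here is a minimal one-dimensional sketch in plain Python. It is not DeepLabv3 itself, which applies the same principle to 2D feature maps inside a deep network; the function name `atrous_conv1d` and the toy signal and kernel are assumptions for illustration.

```python
def atrous_conv1d(signal, kernel, rate):
    """Valid-mode 1D atrous (dilated) convolution, in the cross-correlation
    form used by deep learning libraries. Kernel taps are `rate` samples apart."""
    span = (len(kernel) - 1) * rate  # distance between the first and last tap
    return [
        sum(kernel[k] * signal[i + k * rate] for k in range(len(kernel)))
        for i in range(len(signal) - span)
    ]


signal = [1, 2, 3, 4, 5, 6, 7, 8, 9]
kernel = [1, 0, -1]  # same three weights at every rate

out_rate1 = atrous_conv1d(signal, kernel, rate=1)  # each output sees 3 samples
out_rate2 = atrous_conv1d(signal, kernel, rate=2)  # each output sees 5 samples
print(out_rate1)  # [-2, -2, -2, -2, -2, -2, -2]
print(out_rate2)  # [-4, -4, -4, -4, -4]
```

Raising the rate widens the receptive field without adding weights, which is why running several rates in cascade or in parallel lets a network gather context at several scales.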

Keywords: area calculation, atrous convolution, deep globe land cover classification, deepLabv3, land cover classification, resnet 50

Procedia PDF Downloads 120
132 Intelligent Fishers Harness Aquatic Organisms and Climate Change

Authors: Shih-Fang Lo, Tzu-Wei Guo, Chih-Hsuan Lee

Abstract:

Tropical fisheries are vulnerable to the physical and biogeochemical oceanic changes associated with climate change. Warmer temperatures and extreme weather have been damaging the abundance and growth patterns of aquatic organisms. In recent years, shrinking fish stocks and labor shortages have increased the threat to global aquacultural production. Thus, building a climate-resilient and sustainable mechanism has become an urgent and important task for global citizens. To tackle the problem, Taiwanese fishermen apply artificial intelligence (AI) technology. In brief, the AI system (1) measures real-time water quality and chemical parameters in fish ponds; (2) monitors fish stock through segmentation, detection, and classification; and (3) implements fishermen’s previous experiences, perceptions, and real-life practices. Applying this system can stabilize aquacultural production and potentially increase the labor force. Furthermore, this AI technology can build a more resilient and sustainable system for the fishermen so that they can mitigate the influence of extreme weather while maintaining or even increasing their aquacultural production. In the future, as the AI system collects and analyzes more data, it can be applied to different regions of the world or even adapt to future technological or societal changes, continuously providing the most relevant and useful information for fishermen around the world.

Keywords: aquaculture, artificial intelligence (AI), real-time system, sustainable fishery

Procedia PDF Downloads 97
131 Iris Feature Extraction and Recognition Based on Two-Dimensional Gabor Wavelength Transform

Authors: Bamidele Samson Alobalorun, Ifedotun Roseline Idowu

Abstract:

Biometric technologies use parts of the human body for unique and reliable identification based on physiological traits. The iris recognition system is a biometric-based method for identification. The human iris has discriminating characteristics that make the method efficient. To achieve this efficiency, distinct features must be extracted from the human iris in order to authenticate persons accurately. In this study, an approach for an iris recognition system using a 2D Gabor filter for feature extraction is applied to iris templates. The 2D Gabor filter formulated the patterns that were used for training and were also sent to the Hamming distance matching technique for recognition. A comparison of results is presented using two iris image subjects of different matching indices of 1, 2, 3, 4, 5 filter based on the CASIA iris image database. By comparing the results for the two subjects, the actual computational time of the developed models, measured in terms of training time and average testing time in processing the Hamming distance classifier, is found, with a best recognition accuracy of 96.11%. Iris localization (segmentation) uses Daugman’s integro-differential operator, and normalization is confined to Daugman’s rubber sheet model.
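The Hamming distance matching step can be illustrated with a short sketch. It assumes that each iris has already been encoded as a fixed-length list of bits, and it omits the occlusion masks and rotation shifts that practical systems usually add. The function names and the toy codes are hypothetical and are not taken from the CASIA database.

```python
def hamming_distance(code_a, code_b):
    """Fraction of bit positions at which two equal-length binary codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("iris codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b)) / len(code_a)


def best_match(probe, gallery):
    """Return the enrolled identity whose code is closest to the probe."""
    return min(gallery, key=lambda name: hamming_distance(probe, gallery[name]))


# Hypothetical 8-bit codes; real iris codes are far longer.
gallery = {
    "subject_1": [0, 1, 1, 0, 1, 0, 0, 1],
    "subject_2": [1, 1, 0, 0, 0, 1, 1, 0],
}
probe = [0, 1, 1, 0, 1, 0, 1, 1]

print(hamming_distance(probe, gallery["subject_1"]))  # 0.125
print(best_match(probe, gallery))                     # subject_1
```

A distance near 0 indicates that the two codes probably come from the same iris, while codes from different irises tend to differ in roughly half of their bits.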

Keywords: Daugman rubber sheet, feature extraction, Hamming distance, iris recognition system, 2D Gabor wavelet transform

Procedia PDF Downloads 42
130 Robustness of the Deep Chroma Extractor and Locally-Normalized Quarter Tone Filters in Automatic Chord Estimation under Reverberant Conditions

Authors: Luis Alvarado, Victor Poblete, Isaac Gonzalez, Yetzabeth Gonzalez

Abstract:

In MIREX 2016 (http://www.music-ir.org/mirex), the deep neural network (DNN) Deep Chroma Extractor, proposed by Korzeniowski and Widmer, reached the highest score in an audio chord recognition task. In the present paper, this tool is assessed under reverberant acoustic environments and distinct source-microphone distances. The evaluation dataset comprises The Beatles and Queen datasets. These datasets are sequentially re-recorded with a single microphone in a real reverberant chamber at four reverberation times (0 -anechoic-, 1, 2, and 3 s, approximately), as well as four source-microphone distances (32, 64, 128, and 256 cm). It is expected that the performance of the trained DNN will decrease dramatically under these acoustic conditions, with signals degraded by room reverberation and distance to the source. Recently, the effect of the bio-inspired Locally-Normalized Cepstral Coefficients (LNCC) has been assessed in a text-independent speaker verification task using speech signals degraded by additive noise at different signal-to-noise ratios with variations of recording distance, and it has also been assessed under reverberant conditions with variations of recording distance. LNCC showed a performance as high as the state-of-the-art Mel Frequency Cepstral Coefficient filters. Based on these results, this paper proposes a variation of locally-normalized triangular filters called Locally-Normalized Quarter Tone (LNQT) filters. By using the LNQT spectrogram, robustness improvements of the trained Deep Chroma Extractor are expected, compared with classical triangular filters, thus compensating for the music signal degradation and improving the accuracy of the chord recognition system.

Keywords: chord recognition, deep neural networks, feature extraction, music information retrieval

Procedia PDF Downloads 209
129 Algorithm for Quantification of Pulmonary Fibrosis in Chest X-Ray Exams

Authors: Marcela de Oliveira, Guilherme Giacomini, Allan Felipe Fattori Alves, Ana Luiza Menegatti Pavan, Maria Eugenia Dela Rosa, Fernando Antonio Bacchim Neto, Diana Rodrigues de Pina

Abstract:

It is estimated that each year one death every 10 seconds (about 2 million deaths) in the world is attributed to tuberculosis (TB). Even after effective treatment, TB leaves sequelae such as pulmonary fibrosis, compromising patients’ quality of life. Evaluations of these sequelae are usually performed subjectively by radiology specialists, and subjective evaluation is prone to inter- and intra-observer variation. X-ray examination is the diagnostic imaging method most often used to monitor patients diagnosed with TB, and it has the lowest cost to the institution. The application of computational algorithms is of utmost importance for a more objective quantification of pulmonary impairment in individuals with tuberculosis. The purpose of this research is to use computer algorithms to quantify the pulmonary impairment of patients with pulmonary TB before and after treatment. The x-ray images of 10 patients with a TB diagnosis confirmed by examination of sputum smears were studied. Initially, the total lung area was segmented (posteroanterior and lateral views), and then the region compromised by pulmonary sequelae was targeted. Through morphological operators and the application of a signal noise tool, it was possible to determine the compromised lung volume. The largest difference found pre- and post-treatment was 85.85% and the smallest was 54.08%.

Keywords: algorithm, radiology, tuberculosis, x-rays exam

Procedia PDF Downloads 397
128 Podcasting: A Tool for an Enhanced Learning Experience of Introductory Courses to Science and Engineering Students

Authors: Yaser E. Greish, Emad F. Hindawy, Maryam S. Al Nehayan

Abstract:

Introductory courses such as General Chemistry I, General Physics I and General Biology need special attention, as students taking these courses are usually in their first year at university. In addition to the language barrier for most of them, they also face other difficulties if these elementary courses are taught in the traditional way. Changing the routine method of teaching these courses is therefore mandated. In this regard, podcasting of chemistry lectures was used as an add-on to the traditional and non-traditional methods of teaching chemistry to science and non-science students. Podcasts refer to video files that are distributed in a digital format through the Internet using personal computers or mobile devices. Pedagogical strategy is another way of identifying podcasts. Three distinct teaching approaches are evident in the current literature: receptive viewing, problem-solving, and created video podcasts. Although the digital format and dispensing of video podcasts have stabilized over the past eight years, the types of podcasts vary considerably according to their purpose, degree of segmentation, pedagogical strategy, and academic focus. In this regard, the whole syllabus of the 'General Chemistry I' course was developed as podcasts and delivered to students throughout the semester. Students used the podcasted files extensively during their studies, especially as part of their preparations for exams. Student feedback strongly supported the idea of using podcasting, as it reflected its effect on the overall understanding of the subject and a consequent improvement in their grades.

Keywords: podcasting, introductory course, interactivity, flipped classroom

Procedia PDF Downloads 245
127 Automatic Registration of Rail Profile Based Local Maximum Curvature Entropy

Authors: Hao Wang, Shengchun Wang, Weidong Wang

Abstract:

To address the influence of train vibration and environmental noise on the measurement of track wear, we proposed a method for automatic extraction of the circular arc on the inner or outer side of the rail waist and achieved high-precision registration of the rail profile. Firstly, a polynomial fitting method based on a truncated residual histogram was proposed to find the optimal fitting curve of the profile and reduce the influence of noise on profile curve fitting. Then, based on the curvature distribution characteristics of the fitting curve, an interval search algorithm based on the maximum curvature entropy of a dynamic window was proposed to realize the automatic segmentation of the small circular arc. Finally, we fit two circle centers as matching reference points based on the small circular arcs on both sides and realized the alignment of the measured profile to the standard designed profile. The static experimental results show that the mean and standard deviation of the method are controlled within 0.01 mm, with small measurement errors and high repeatability. The dynamic test also verified the repeatability of the method in the train-running environment, and the dynamic measurement deviation of rail wear is within 0.2 mm with high repeatability.
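The curvature of a fitted polynomial profile, on which the interval search operates, follows from the standard formula κ(x) = |y''(x)| / (1 + y'(x)²)^(3/2). The sketch below evaluates it for a polynomial given by its coefficients. It covers only the curvature computation, not the authors' truncated-residual fitting or their curvature-entropy search, and the coefficients and sampling grid are assumed toy values.

```python
def poly_derivative(coeffs):
    """Derivative of a polynomial whose coeffs[i] multiplies x**i."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def poly_eval(coeffs, x):
    """Evaluate the polynomial at x using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def curvature(coeffs, x):
    """Curvature |y''| / (1 + y'^2)^(3/2) of the curve y = p(x)."""
    first = poly_derivative(coeffs)
    second = poly_derivative(first)
    d1 = poly_eval(first, x)
    d2 = poly_eval(second, x)
    return abs(d2) / (1.0 + d1 * d1) ** 1.5


# Assumed toy profile y = x^2, sampled on a grid from -1 to 1.
coeffs = [0.0, 0.0, 1.0]
xs = [i / 10 for i in range(-10, 11)]
peak = max(xs, key=lambda x: curvature(coeffs, x))
print(peak, curvature(coeffs, peak))  # 0.0 2.0
```

On a circular arc of radius r the curvature is constant at 1/r, so a run of nearly constant curvature along the fitted curve is a natural candidate for an arc segment.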

Keywords: curvature entropy, profile registration, rail wear, structured light, train-running

Procedia PDF Downloads 239
126 Distant Speech Recognition Using Laser Doppler Vibrometer

Authors: Yunbin Deng

Abstract:

Most existing applications of automatic speech recognition rely on cooperative subjects at a short distance from a microphone. Standoff speech recognition using microphone arrays can extend the subject-to-sensor distance somewhat, but it is still limited to only a few feet. As such, most deployed applications of standoff speech recognition are limited to indoor use at short range. Moreover, these applications require an air pathway between the subject and the sensor to achieve a reasonable signal-to-noise ratio. This study reports long-range (50 feet) automatic speech recognition experiments using a Laser Doppler Vibrometer (LDV) sensor. This study shows that the LDV sensor modality can extend the speech acquisition standoff distance far beyond microphone arrays to hundreds of feet. In addition, LDV enables 'listening' through windows for uncooperative subjects. This enables new capabilities in automatic audio and speech intelligence, surveillance, and reconnaissance (ISR) for law enforcement, homeland security and counter-terrorism applications. The Polytec LDV model OFV-505 is used in this study. To investigate the impact of different vibrating materials, five parallel LDV speech corpora, each consisting of 630 speakers, are collected from the vibrations of a glass window, a metal plate, a plastic box, a wood slate, and a concrete wall. These are the common materials the application could encounter in daily life. These data were compared with their microphone counterpart to show the impact of various materials on the spectrum of the LDV speech signal. State-of-the-art deep neural network modeling approaches are used to conduct continuous speaker-independent speech recognition on these LDV speech datasets. Preliminary phoneme recognition results using time-delay neural network, bi-directional long short-term memory, and model fusion show great promise for using LDV for long-range speech recognition. To the author’s best knowledge, this is the first time an LDV has been reported for a long-distance speech recognition application.

Keywords: covert speech acquisition, distant speech recognition, DSR, laser Doppler vibrometer, LDV, speech intelligence surveillance and reconnaissance, ISR

Procedia PDF Downloads 156
125 Information Management Approach in the Prediction of Acute Appendicitis

Authors: Ahmad Shahin, Walid Moudani, Ali Bekraki

Abstract:

This research aims to present a predictive data mining model for accurate diagnosis of acute appendicitis in patients, for the purpose of maximizing health service quality, minimizing morbidity/mortality, and reducing cost. Acute appendicitis is the most common disease requiring timely, accurate diagnosis and surgical intervention. Although the treatment of acute appendicitis is simple and straightforward, its diagnosis is still difficult because no single sign, symptom, laboratory or image examination accurately confirms the diagnosis of acute appendicitis in all cases. This contributes to increased morbidity and negative appendectomy. In this study, the authors propose to generate an accurate model for predicting patients with acute appendicitis which is based, firstly, on the segmentation technique associated with the ABC algorithm to segment the patients; secondly, on applying fuzzy logic to process the massive volume of heterogeneous and noisy data (age, sex, fever, white blood cell, neutrophilia, CRP, urine, ultrasound, CT, appendectomy, etc.) in order to express knowledge and analyze the relationships among data in a comprehensive manner; and thirdly, on applying a dynamic programming technique to reduce the number of data attributes. The proposed model is evaluated based on a set of benchmark techniques and also on a set of benchmark classification problems of osteoporosis, diabetes and heart obtained from the UCI data and other data sources.

Keywords: healthcare management, acute appendicitis, data mining, classification, decision tree

Procedia PDF Downloads 330
124 Kannada HandWritten Character Recognition by Edge Hinge and Edge Distribution Techniques Using Manhatan and Minimum Distance Classifiers

Authors: C. V. Aravinda, H. N. Prakash

Abstract:

In this paper, we survey fusion approaches and the state of the art in South Indian language (SIL) character recognition systems. In the first step, the text is preprocessed and normalized so that text identification can be performed correctly. The second step involves extracting relevant and informative features. The third step implements the classification decision. The three stages involved are data acquisition and preprocessing, feature extraction, and classification. Here we concentrated on two techniques to obtain features: feature extraction and feature selection. Edge-hinge distribution is a feature that characterizes the changes in direction of a script stroke in handwritten text. The edge-hinge distribution is extracted by means of a window that is slid over an edge-detected binary handwriting image. Whenever the mid pixel of the window is on, the two edge fragments (i.e. connected sequences of pixels) emerging from this mid pixel are considered. Their directions are measured and stored as pairs. A joint probability distribution is obtained from a large sample of such pairs. Despite continuous effort, handwriting identification remains a challenging issue because different approaches use different varieties of features. Therefore, our study focuses on handwriting recognition based on feature selection to simplify the feature extraction task, optimize classification system complexity, reduce running time and improve classification accuracy.
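The last two steps of the edge-hinge feature, turning stored direction pairs into a joint probability distribution and comparing distributions with the Manhattan distance named in the title, can be sketched as follows. The sketch assumes that the direction pairs have already been measured from the edge image. The direction indices, class labels and function names are hypothetical.

```python
from collections import Counter


def edge_hinge_distribution(direction_pairs):
    """Joint probability of (direction_1, direction_2) pairs found at edge pixels."""
    counts = Counter(direction_pairs)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def manhattan_distance(p, q):
    """Sum of absolute differences between two sparse distributions."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


def classify(sample, references):
    """Minimum-distance decision: pick the reference class closest to the sample."""
    return min(references, key=lambda label: manhattan_distance(sample, references[label]))


# Hypothetical direction pairs (indices of quantized edge directions).
references = {
    "class_a": edge_hinge_distribution([(0, 1), (0, 1), (1, 2), (2, 3)]),
    "class_b": edge_hinge_distribution([(3, 0), (3, 0), (2, 1), (1, 2)]),
}
sample = edge_hinge_distribution([(0, 1), (1, 2), (0, 1), (0, 1)])

print(classify(sample, references))  # class_a
```

Storing the distribution as a dictionary keeps it sparse, which is convenient because most direction pairs never occur in a given sample.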

Keywords: word segmentation and recognition, character recognition, optical character recognition, hand written character recognition, South Indian languages

Procedia PDF Downloads 473
123 Characteristic Sentence Stems in Academic English Texts: Definition, Identification, and Extraction

Authors: Jingjie Li, Wenjie Hu

Abstract:

Phraseological units in academic English texts have been a central focus in recent corpus linguistic research. A wide variety of phraseological units have been explored, including collocations, chunks, lexical bundles, patterns, semantic sequences, etc. This paper describes a special category of clause-level phraseological units, namely, Characteristic Sentence Stems (CSSs), with a view to describing their defining criteria and extraction method. CSSs are contiguous lexico-grammatical sequences which contain a subject-predicate structure and which are frame expressions characteristic of academic writing. The extraction of CSSs consists of six steps: Part-of-speech tagging, n-gram segmentation, structure identification, significance of occurrence calculation, text range calculation, and overlapping sequence reduction. Significance of occurrence calculation is the crux of this study. It includes the computing of both the internal association and the boundary independence of a CSS and tests the occurring significance of the CSS from both inside and outside perspectives. A new normalization algorithm is also introduced into the calculation of LocalMaxs for reducing overlapping sequences. It is argued that many sentence stems are so recurrent in academic texts that the most typical of them have become the habitual ways of making meaning in academic writing. Therefore, studies of CSSs could have potential implications and reference value for academic discourse analysis, English for Academic Purposes (EAP) teaching and writing.
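Two of the six extraction steps, n-gram segmentation and text range calculation, are simple enough to sketch with the standard library. The sketch assumes whitespace tokenization and lowercasing, and it leaves out part-of-speech tagging, structure identification, the significance measures and LocalMaxs. The example sentences are invented.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-token sequences in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_stats(texts, n):
    """Return (frequency, text_range): total occurrences of each n-gram and
    the number of distinct texts in which it appears."""
    freq, text_range = Counter(), Counter()
    for text in texts:
        grams = ngrams(text.lower().split(), n)
        freq.update(grams)
        text_range.update(set(grams))  # count each text at most once
    return freq, text_range


# Invented toy corpus of three "texts".
texts = [
    "It is argued that many stems recur",
    "It is argued that results vary",
    "This paper argues that it is clear",
]
freq, text_range = ngram_stats(texts, 3)
print(freq[("it", "is", "argued")], text_range[("it", "is", "argued")])  # 2 2
```

Tracking text range separately from raw frequency separates sequences that recur across many texts from those repeated within a single text.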

Keywords: characteristic sentence stem, extraction method, phraseological unit, the statistical measure

Procedia PDF Downloads 144