Search results for: voice segmentation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 900

630 Hyperspectral Image Classification Using Tree Search Algorithm

Authors: Shreya Pare, Parvin Akhter

Abstract:

Remote sensing image classification is a very challenging task owing to the high dimensionality of hyperspectral images. Pixel-wise classification methods fail to take the spatial structure of an image into account. Therefore, to improve classification performance, spatial information can be integrated into the classification process. In this paper, a multilevel thresholding algorithm based on a modified fuzzy entropy (MFE) function is used to segment hyperspectral images. The fuzzy parameters of the MFE function are optimized using a new meta-heuristic algorithm based on the Tree-Search algorithm. The segmented image is classified by a large distribution machine (LDM) classifier. Experimental results are reported on a hyperspectral image dataset. They indicate that the proposed technique (MFE-TSA-LDM) achieves much higher classification accuracy for hyperspectral images than state-of-the-art classification techniques. The proposed algorithm provides accurate segmentation and classification maps, making it more suitable for classifying images with large spatial structures.

Keywords: classification, hyperspectral images, large distribution margin, modified fuzzy entropy function, multilevel thresholding, tree search algorithm, hyperspectral image classification using tree search algorithm

Procedia PDF Downloads 135
629 Intelligent Rheumatoid Arthritis Identification System Based Image Processing and Neural Classifier

Authors: Abdulkader Helwan

Abstract:

Rheumatoid arthritis is a chronic inflammatory disorder that affects the joints by damaging body tissues. Therefore, there is an urgent need for an effective intelligent system for identifying rheumatoid arthritis of the knee, especially in its early stages. This paper develops a new intelligent system for the identification of rheumatoid arthritis of the knee using image processing techniques and a neural classifier. The system involves two principal stages. The first is the image processing stage, in which the images are processed using techniques such as RGB to grayscale conversion, rescaling, median filtering, background extraction, image subtraction, segmentation using Canny edge detection, and feature extraction using pattern averaging. In the second stage, the extracted features are used as inputs to a neural network that classifies the X-ray knee images as normal or abnormal (arthritic). The network is trained with the backpropagation learning algorithm on 400 normal and abnormal X-ray knee images. The system was tested on 400 X-ray images, and the network performed well during that phase, achieving an identification rate of 97%.
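
Three of the image processing steps named in this abstract (grayscale conversion, median filtering, and pattern averaging) can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' implementation: the function names, the BT.601 luminance weights, the 3x3 window, the block size, and the toy image are all assumptions made for the example.

```python
from statistics import median

def to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples to luminance (BT.601 weights, assumed)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row]
            for row in rgb_image]

def median_filter3(img):
    """3x3 median filter; border pixels use only the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(median(window))
        out.append(row)
    return out

def pattern_average(img, block):
    """Average each non-overlapping block x block tile into one feature."""
    h = len(img) - len(img) % block
    w = len(img[0]) - len(img[0]) % block
    features = []
    for y in range(0, h, block):
        for x in range(0, w, block):
            tile = [img[j][i]
                    for j in range(y, y + block)
                    for i in range(x, x + block)]
            features.append(sum(tile) / len(tile))
    return features

# Hypothetical 4x4 image: uniform gray with one bright "noise" pixel.
image = [[(100, 100, 100)] * 4 for _ in range(4)]
image[1][1] = (255, 255, 255)

# The median filter suppresses the isolated pixel before averaging.
features = pattern_average(median_filter3(to_gray(image)), block=2)
```

The resulting feature vector (four values for this toy image) is the kind of fixed-length input a neural classifier would consume; the real system would work on rescaled X-ray images and include the other steps listed in the abstract.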

Keywords: rheumatoid arthritis, intelligent identification, neural classifier, segmentation, backpropagation

Procedia PDF Downloads 505
628 Seashore Debris Detection System Using Deep Learning and Histogram of Gradients-Extractor Based Instance Segmentation Model

Authors: Anshika Kankane, Dongshik Kang

Abstract:

Marine debris has a significant influence on coastal environments, damaging biodiversity and causing loss and damage to the marine and ocean sector. This work takes a functional, cost-effective, and automatic approach to the problem. Computer vision combined with a deep learning-based model is proposed to identify and categorize seven kinds of marine debris at different beach locations in Japan. The research compares state-of-the-art deep learning models with a suggested model architecture that is used as a feature extractor for debris categorization. The model detects seven categories of litter using a manually constructed debris dataset, with the help of Mask R-CNN for instance segmentation and a shape matching network called HOGShape; the detected litter can then be cleaned up in time by clean-up organizations that receive the system's warning notifications. The dataset is created by annotating images taken by a fixed KaKaXi camera with seven category labels using the CVAT annotation tool. A pre-trained HOG feature extractor on LIBSVM is used along with multiple template matching between HOG maps of images and HOG maps of templates to improve the predicted masked images obtained via Mask R-CNN training. The system is intended to alert clean-up organizations in a timely manner with warning notifications based on live recorded beach debris data. The suggested network improves the misclassified masks of debris objects with different illuminations, shapes, and viewpoints, and of occluded litter with vague visibility.

Keywords: computer vision, debris, deep learning, fixed live camera images, histogram of gradients feature extractor, instance segmentation, manually annotated dataset, multiple template matching

Procedia PDF Downloads 63
627 Acoustic Characteristics of Ḫijaiyaḫ Letters Pronunciation by Indonesian Native Speaker

Authors: Romi Hardiyansyah, Raden Sugeng Joko Sarwono, Agus Samsi

Abstract:

The mother tongue of Indonesian people is not Arabic. Nevertheless, they must be able to pronounce Arabic because Islam is the largest religion in Indonesia. Arabic is written with ḫijaiyaḫ letters, each of which has its own pronunciation. Sound production in humans can be divided into three physiological processes: the formation of airflow from the lungs, the conversion of that airflow into sound, and articulation (the modulation of the sound into a specific sound). Each ḫijaiyaḫ letter has its own articulation, and some of them seem strange to most people in Indonesia. Those letters are produced in the middle and upper throat, so they have their own acoustic characteristics. The acoustic characteristics of the voice can be observed with the source-filter approach, whose parameters are pitch, formant, and formant bandwidth. Pitch is the fundamental tone of a speaker's voice. A formant is a resonance frequency of the human voice. Formant bandwidth is the frequency width of a formant. After recording the sound from 21 subjects, the data were processed with the software Praat, version 5.3.39. The analysis showed that each pronunciation, syakal (vowel changer), and place of articulation of the letters has the same timbre, which is determined by the third and fourth formants.

Keywords: ḫijaiyaḫ, articulation, pitch, formant, formant bandwidth, timbre

Procedia PDF Downloads 356
626 Automated Ultrasound Carotid Artery Image Segmentation Using Curvelet Threshold Decomposition

Authors: Latha Subbiah, Dhanalakshmi Samiappan

Abstract:

In this paper, we propose denoising Common Carotid Artery (CCA) B-mode ultrasound images by a decomposition approach to curvelet thresholding, followed by automatic segmentation of the intima-media thickness and the adventitia boundary. Through decomposition, the local geometry of the image and the direction of its gradients are well preserved. The components are combined into a single vector-valued function, which removes noise patches. A double threshold is applied to remove the speckle noise inherent in the image. The denoised image is segmented by active contours without specifying seed points. Combined with level set theory, active contours provide subregions with continuous boundaries. The deformable contours adapt to the shapes and motion of objects in the images. A curve or surface under constraints is evolved from the image so that it is pulled toward the required features of the image. Region-based and boundary-based information are integrated to obtain the contour. The method handles multiplicative speckle noise well in both objective and subjective quality measurements and thus leads to better segmentation results. The proposed denoising method gives better performance metrics than other state-of-the-art denoising algorithms.

Keywords: curvelet, decomposition, levelset, ultrasound

Procedia PDF Downloads 308
625 Community Radio Broadcasting in Phutthamonthon District, Nakhon Pathom, Thailand

Authors: Anchana Sooksomchitra

Abstract:

This study aims to explore and compare the current condition of community radio stations in Phutthamonthon district, Nakhon Pathom province, Thailand, as well as the challenges they are facing. Qualitative research tools including in-depth interviews, documentary analysis, focus group interviews, and observation are used to examine the content, programming, and management structure of three community radio stations currently in operation within the district. Research findings indicate that the management and operational approaches adopted by the two non-profit stations included in the study, Salaya Pattana and Voice of Dhamma, are more structured and effective than that of the for-profit Tune Radio. Salaya Pattana, backed by the Faculty of Engineering, Mahidol University, and the charity-funded Voice of Dhamma are comparatively free from political and commercial influence, and able to provide more relevant and consistent community-oriented content to meet the real demand of the audience. Tune Radio, on the other hand, has to rely solely on financial support from political factions and business groups, which heavily influence its content.

Keywords: radio broadcasting, programming, management, community radio, Thailand

Procedia PDF Downloads 310
624 Content Based Video Retrieval System Using Principal Object Analysis

Authors: Van Thinh Bui, Anh Tuan Tran, Quoc Viet Ngo, The Bao Pham

Abstract:

Video retrieval is the problem of searching for videos or clips whose content is relatively close to an input image or video. Applications include selecting a video in a folder or recognizing a person in security camera footage. However, recent approaches remain challenged by the diversity of video types, frame transitions, and camera positions. In addition, choosing an appropriate measure for the problem remains an open question. To overcome these obstacles, we propose a content-based video retrieval system whose main steps yield good performance. From a source video, we extract keyframes and principal objects using the Segmentation of Aggregating Superpixels (SAS) algorithm. Speeded Up Robust Features (SURF) are then extracted from those principal objects. Finally, a bag-of-words model combined with SVM classification is applied to obtain the retrieval result. Our system is evaluated on over 300 videos of diverse types, including music, history, movies, sports, natural scenes, and TV shows. The performance compares promisingly with other approaches.

Keywords: video retrieval, principal objects, keyframe, segmentation of aggregating superpixels, speeded up robust features, bag-of-words, SVM

Procedia PDF Downloads 269
623 Analysis of Vocal Pathologies Through Subglottic Pressure Measurement

Authors: Perla Elizabeth Jimarez Rocha, Carolina Daniela Tejeda Franco, Arturo Minor Martínez, Annel Gomez Coello

Abstract:

One of the biggest problems in developing new therapies for the management and treatment of voice disorders is the difficulty of objectively evaluating the results of each treatment. We propose a system that captures and records voice signals and analyzes vocal quality (fundamental frequency, zero crossings, energy, and amplitude spectrum) as well as subglottic pressure (cm H2O) during sustained phonation of the vowel /a/. A recording system is implemented, together with an interactive system that records information on subglottic pressure. In Mexico City, a control group of 31 patients with phoniatric pathology was proposed; non-invasive tests were performed for the most common vocal pathologies (nodules, polyps, irritative laryngitis, ventricular dysphonia, laryngeal cancer, dysphonia, and dysphagia). The most common pathology was irritative laryngitis (32%), followed by vocal fold paralysis (unilateral and bilateral, 19.4%). Both men and women were included in the pathological groups, and they were separated by gender because of physiological differences in the morphology of the respiratory tract.
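
Three of the vocal quality measures listed in this abstract (energy, zero crossings, and fundamental frequency) can be computed on a single frame as in the sketch below. The abstract does not say how the authors estimate them, so the autocorrelation-based F0 estimator, the 75 to 400 Hz search range, the sample rate, and the synthetic 200 Hz tone standing in for a sustained /a/ are all assumptions chosen for illustration.

```python
import math

def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(s * s for s in frame) / len(frame)

def zero_crossings(frame):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))

def fundamental_frequency(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate F0 as the lag that maximises the autocorrelation.

    Only lags corresponding to frequencies in [fmin, fmax] are searched
    (an assumed range for adult voices).
    """
    lo = int(sample_rate / fmax)
    hi = int(sample_rate / fmin)

    def autocorr(lag):
        return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))

    best_lag = max(range(lo, hi + 1), key=autocorr)
    return sample_rate / best_lag

# Synthetic stand-in for a sustained vowel: a pure 200 Hz tone, 0.1 s long.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]

f0 = fundamental_frequency(frame, sample_rate)
energy = short_time_energy(frame)
crossings = zero_crossings(frame)
```

A real voice signal is not a pure tone, so a practical system would typically window the signal, process it frame by frame, and smooth the resulting F0 track.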

Keywords: amplitude spectrum, energy, fundamental frequency, subglottic pressure, zero crossings

Procedia PDF Downloads 88
622 An Investigation of Community Radio Broadcasting in Phutthamonthon District, Nakhon Pathom, Thailand

Authors: Anchana Sooksomchitra

Abstract:

This study aims to explore and compare the current condition of community radio stations in Phutthamonthon district, Nakhon Pathom province, Thailand, as well as the challenges they are facing. Qualitative research tools including in-depth interviews, documentary analysis, focus group interviews, and observation are used to examine the content, programming, and management structure of three community radio stations currently in operation within the district. Research findings indicate that the management and operational approaches adopted by the two non-profit stations included in the study, Salaya Pattana and Voice of Dhamma, are more structured and effective than that of the for-profit Tune Radio. Salaya Pattana, backed by the Faculty of Engineering, Mahidol University, and the charity-funded Voice of Dhamma are comparatively free from political and commercial influence, and able to provide more relevant and consistent community-oriented content to meet the real demand of the audience. Tune Radio, on the other hand, has to rely solely on financial support from political factions and business groups, which heavily influence its content.

Keywords: radio broadcasting, programming, management, community radio, Thailand

Procedia PDF Downloads 369
621 A Profile of the Patients at the Hearing and Speech Clinic at the University of Jordan: A Retrospective Study

Authors: Maisa Haj-Tas, Jehad Alaraifi

Abstract:

Significance of the study: This retrospective study examined the speech and language profiles of patients who received clinical services at the University of Jordan Hearing and Speech Clinic (UJ-HSC) from 2009 to 2014. The UJ-HSC is located in the capital, Amman, and was established in the late 1990s. It is the first hearing and speech clinic in Jordan and one of the first speech and hearing clinics in the Middle East. The clinic provides services to an annual average of 2000 patients who are diagnosed with different communication disorders. Examining the speech and language profiles of patients in this clinic could provide insight into the most common disorders seen in patients who attend similar clinics in Jordan. It could also provide information about community awareness of the role of speech therapists in the management of speech and language disorders. Methodology: The researchers examined the clinical records of 1140 patients (797 males and 343 females) who received clinical services at the UJ-HSC between 2009 and 2014. The main variables examined in the study were disorder type and gender. Participants were divided into four age groups: children, adolescents, adults, and older adults. The examined disorders were classified as speech disorders, language disorders, or dysphagia (i.e., swallowing problems). The disorders were further classified as childhood language impairments, articulation disorders, stuttering, cluttering, voice disorders, aphasia, and dysphagia. Results: The results indicated that the prevalence of language disorders was the highest (50.7%), followed by speech disorders (48.3%) and dysphagia (0.9%).
The largest share of patients seen at the UJ-HSC were diagnosed with childhood language impairments (47.3%), followed in order by articulation disorders (21.1%), stuttering (16.3%), voice disorders (12.1%), aphasia (2.2%), dysphagia (0.9%), and cluttering (0.2%). As for gender, the majority of patients seen at the clinic were males in all disorders except for voice disorders and cluttering. Discussion: The results of the present study indicate that the largest share of examined patients were diagnosed with childhood language impairments. Based on this result, the researchers suggest that there seems to be a high prevalence of childhood language impairments among children in Jordan compared to other types of speech and language disorders. The researchers also suggest that there is a need for further examination of the actual prevalence data on speech and language disorders in Jordan. The fact that many of the children seen at the UJ-HSC were brought to the clinic either as a result of parental concern or teacher referral indicates that there seems to be an increased awareness among parents and teachers of the services speech pathologists can provide for the assessment and treatment of childhood speech and language disorders. The small percentage of other disorders (i.e., stuttering, cluttering, dysphagia, aphasia, and voice disorders) seen at the UJ-HSC may indicate limited awareness in the local community of the role of speech pathologists in the assessment and treatment of these disorders.

Keywords: clinic, disorders, language, profile, speech

Procedia PDF Downloads 285
620 Robustness Conditions for the Establishment of Stationary Patterns of Drosophila Segmentation Gene Expression

Authors: Ekaterina M. Myasnikova, Andrey A. Makashov, Alexander V. Spirov

Abstract:

The first manifestation of a segmentation pattern in early Drosophila development is the formation of expression domains (along the main embryo axis) of genes belonging to the trunk gene class. The highly variable expression of gap family genes in the early Drosophila embryo is strongly reduced by the start of gastrulation due to gene cross-regulation. The dynamics of gene expression are described by a gene circuit model for a system of four gap genes. It is shown that, for the model to form a steep and stationary border, there must exist a nucleus (modeling point) in which the gene expression level is constant in time and hence is described by a stationary equation. All the other genes expressed in this nucleus are in dynamic equilibrium. The mechanism of border formation associated with the existence of a stationary nucleus is also confirmed by experiment. An important advantage of this approach is that the properties of the system in a stationary nucleus are described by algebraic equations and can be easily handled analytically. Thus, we explicitly characterize the cross-regulation properties necessary for robustness and formulate the conditions that provide this effect in terms of the properties of the initial input data. It is shown that our formally derived conditions are satisfied by the previously published model solutions.

Keywords: drosophila, gap genes, reaction-diffusion model, robustness

Procedia PDF Downloads 329
619 Visual, Zoological Metaphors and 'Urtiin Duu' (Long Song) in Alshaa, Inner Mongolia

Authors: Oyuna Weina

Abstract:

This study examines how musicians use visual and zoological metaphors for singing technique and voice quality in a genre of traditional music called urtiin duu (‘long song’) in Alshaa, Inner Mongolia, China. Previous studies have discussed melodic contour in Mongol music, but little study of the intersection of singing technique, visual and zoological metaphors has yet been undertaken. The purpose of this study is to address this lack by analysing urtiin duu itself, traditional pedagogy and performances, all of which have been inspired and are assessed by reference to nature and mobile pastoral herding practices. This study investigates the visual and zoological metaphors related to urtiin duu especially colour, the shape of the circle and animals in the Mongol community. Urtiin duu singing is associated with certain colours in song texts, in selection of repertoire and in the status of singers. Musicians also use colour to describe timbre. These colours in turn reference worship of nature, religions, and daily practices of most Mongols in Alshaa. Moreover, voice quality and singing technique are often related to the animals not only in song text but also in the approach to breathing and to melodic contour. Additionally, the concept of boronhoi (‘the shape of circle’), not only is applied to the melodic contour but also to the voice quality and singing technique. These three factors illustrate the connections among nature, spiritual world and everyday herding life of Mongols. These different connections provide evidence of multi-layered meanings. In contemporary Alshaa, urtiin duu singers received Western musical training from the city and returned to their homelands to perform urtiin duu. In doing so, they are also trying to reconnect with the history, nature and spiritual world in order to achieve their ideal sound. Within a multicultural society, singers negotiate amongst themselves, and with ethnic groups, audiences and government officials. 
The power of the metaphor therefore assists and reconnects the strength of regional identity and ethnic identity in Alshaa.

Keywords: Alshaa, urtiin duu, visual, zoological metaphors

Procedia PDF Downloads 325
618 Application of the Quantile Regression Approach to the Heterogeneity of the Fine Wine Prices

Authors: Charles-Olivier Amédée-Manesme, Benoit Faye, Eric Le Fur

Abstract:

In this paper, the heterogeneity of the Bordeaux Legends 50 wine market price segment is addressed. For this purpose, quantile regression (QR) is applied, with market segmentation based on wine bottle price quantiles, and the hedonic price of wine attributes is computed for various price segments of the market. The approach is applied to a major privately held data set consisting of approximately 30,000 transactions over the 2003–2014 period. The findings suggest that the relative hedonic prices of several wine attributes differ significantly among deciles. In particular, the elasticity coefficient of the expert ratings varies strongly across prices. Although, as suggested in the literature, expert ratings have a positive influence on wine price on average, their impact clearly decreases over the quantiles. Finally, the lower the wine price, the higher the potential for price appreciation over time. The effects of other variables, such as chateau or vintage, are also shown to vary across the distribution of wine prices. While enhancing our understanding of the complex market dynamics that underlie Bordeaux wine prices, this research provides empirical evidence that the QR approach adequately captures heterogeneity among wine price ranges, which applies simultaneously to wine stock, vintage, and auction house.
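
The idea behind quantile regression can be illustrated with its loss function. The sketch below is a deliberately reduced, intercept-only case: it finds the constant that minimises the check (pinball) loss at a given quantile level, which is the building block the full hedonic regression extends with covariates such as ratings or vintage. The prices are hypothetical and do not come from the paper's data set.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss minimised by quantile regression at level tau."""
    return tau * residual if residual >= 0 else (tau - 1) * residual

def total_loss(prices, estimate, tau):
    """Sum of pinball losses when every price is predicted by `estimate`."""
    return sum(pinball_loss(p - estimate, tau) for p in prices)

def best_constant(prices, tau):
    """Intercept-only quantile regression.

    A minimiser of the pinball loss always exists among the observed
    values, so it is enough to search those.
    """
    return min(sorted(prices), key=lambda c: total_loss(prices, c, tau))

# Hypothetical bottle prices, spanning cheap to very expensive wines.
prices = [20, 35, 50, 80, 120, 300, 900, 2500, 4000]

low_q, median_q, high_q = (best_constant(prices, t) for t in (0.1, 0.5, 0.9))
```

Fitting separate models at several values of `tau` is what lets the coefficient of an attribute differ between cheap and expensive wines, which is the heterogeneity the abstract describes.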

Keywords: hedonics, market segmentation, quantile regression, heterogeneity, wine economics

Procedia PDF Downloads 303
617 Multi-scale Geographic Object-Based Image Analysis (GEOBIA) Approach to Segment a Very High Resolution Images for Extraction of New Degraded Zones. Application to The Region of Mécheria in The South-West of Algeria

Authors: Bensaid A., Mostephaoui T., Nedjai R.

Abstract:

A considerable area of Algerian land is threatened by wind erosion. For a long time, wind erosion and its associated harmful effects on the natural environment have posed a serious threat, especially in the arid regions of the country. In recent years, as a result of increases in the irrational exploitation of natural resources (fodder) and extensive land clearing, wind erosion has become particularly pronounced. The extent of degradation in the arid region of the Algerian Mécheria department has generated a new situation characterized by the reduction of vegetation cover, the decrease of land productivity, and sand encroachment on urban development zones. In this study, we investigate the potential of remote sensing and geographic information systems for detecting the spatial dynamics of the ancient dune cords, based on the numerical processing of PlanetScope PSB.SB sensor images acquired on September 29, 2021. As a second step, we explore the use of a multi-scale geographic object-based image analysis (GEOBIA) approach to segment the high spatial resolution images acquired over heterogeneous surfaces that vary according to human influence on the environment. We used the fractal net evolution approach (FNEA) algorithm to segment the images (Baatz & Schäpe, 2000). Multispectral data, a digital terrain model layer, ground truth data, a normalized difference vegetation index (NDVI) layer, and a first-order texture (entropy) layer were used to segment the multispectral images at three segmentation scales, with an emphasis on accurately delineating the boundaries and components of the sand accumulation areas (dunes, dune fields, nebkas, and barchans). It is important to note that each auxiliary data layer contributed to improving the segmentation at different scales. The silted areas were classified using a nearest neighbor approach over the Naâma area using the imagery.
The classification of silted areas was successfully achieved over all study areas with an accuracy greater than 85%, although the results suggest that, overall, a higher degree of landscape heterogeneity may have a negative effect on segmentation and classification. Some areas suffered from the greatest over-segmentation and the lowest mapping accuracy (Kappa: 0.79), which was partially attributed to confounding a greater proportion of mixed siltation classes from both sandy areas and bare ground patches. This research has demonstrated a technique based on very high-resolution images for mapping sand-covered and degraded areas using GEOBIA, which can be applied to the study of other lands in the steppe areas of the northern countries of the African continent.
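
The NDVI layer mentioned in this abstract is computed per pixel from the near-infrared and red bands as (NIR - Red) / (NIR + Red). A minimal sketch follows; the reflectance values are hypothetical, and returning 0.0 where both bands are zero is an assumed convention rather than something stated in the source.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    # Assumed convention: report 0.0 when both bands are zero.
    return 0.0 if total == 0 else (nir - red) / total

def ndvi_layer(nir_band, red_band):
    """Apply NDVI pixel by pixel to two equally sized bands."""
    return [[ndvi(n, r) for n, r in zip(nir_row, red_row)]
            for nir_row, red_row in zip(nir_band, red_band)]

# Hypothetical reflectances: a vegetated pixel, then a bare sand pixel.
nir_band = [[0.50, 0.30]]
red_band = [[0.10, 0.25]]

layer = ndvi_layer(nir_band, red_band)
```

Vegetated pixels reflect strongly in the near infrared and give values well above those of bare sand, which is why such a layer helps separate sand accumulation areas from remaining vegetation cover during segmentation.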

Keywords: land development, GIS, sand dunes, segmentation, remote sensing

Procedia PDF Downloads 72
616 Real Time Traffic Performance Study over MPLS VPNs with DiffServ

Authors: Naveed Ghani

Abstract:

With the arrival of higher-speed communication links and mature applications running over the Internet, the requirement for reliable, efficient, and robust network designs is rising day by day. Multi-Protocol Label Switching (MPLS) Virtual Private Networks (VPNs) are committed to providing optimal network services and are steadily gaining popularity in industry. Enterprise customers are moving to service providers that offer MPLS VPNs. The main reason for this shift is the capability of MPLS VPNs to provide built-in security features and any-to-any connectivity. MPLS VPNs improve network performance through fast label switching compared with traditional IP forwarding, but traffic classification and policing are still required on a per-hop basis to enhance the performance of delay-sensitive real-time traffic (particularly voice and video). Quality of service (QoS) is the most important factor in prioritizing enterprise networks' real-time traffic such as voice and video. This thesis focuses on the study of QoS parameters, e.g., delay, jitter, and Mean Opinion Score (MOS), for real-time traffic over MPLS VPNs. The Differentiated Services (DiffServ) QoS model will be used over an MPLS VPN network to achieve end-to-end service quality.
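
Two of the QoS parameters named in this abstract, delay and jitter, can be computed from packet timestamps as sketched below. The abstract does not state which jitter definition the thesis uses; the smoothed interarrival jitter estimator from RFC 3550 (RTP) is assumed here as a common choice for voice and video traffic, and the timestamps are hypothetical.

```python
def mean_delay(send_times, recv_times):
    """Average one-way delay, assuming synchronised sender and receiver clocks."""
    delays = [r - s for s, r in zip(send_times, recv_times)]
    return sum(delays) / len(delays)

def interarrival_jitter(send_times, recv_times):
    """Smoothed interarrival jitter as defined in RFC 3550.

    For consecutive packets, D is the change in transit time, and the
    estimate is updated as J += (|D| - J) / 16.
    """
    jitter = 0.0
    prev_transit = None
    for s, r in zip(send_times, recv_times):
        transit = r - s
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            jitter += (d - jitter) / 16.0
        prev_transit = transit
    return jitter

# Hypothetical voice packets sent every 20 ms (all times in milliseconds).
send_times = [0, 20, 40, 60]
recv_times = [50, 70, 95, 110]

delay = mean_delay(send_times, recv_times)
jitter = interarrival_jitter(send_times, recv_times)
```

MOS is not computed here, since it depends on a perceptual model or on listener ratings that the abstract does not describe.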

Keywords: network, MPLS, VPN, DiffServ, MPLS VPN, DiffServ QoS, QoS Model, GNS2

Procedia PDF Downloads 392
615 Semiautomatic Calculation of Ejection Fraction Using Echocardiographic Image Processing

Authors: Diana Pombo, Maria Loaiza, Mauricio Quijano, Alberto Cadena, Juan Pablo Tello

Abstract:

In this paper, we present a semi-automatic tool for calculating ejection fraction from an echocardiographic video signal taken from a DICOM-format database of Clinica de la Costa, Barranquilla. We describe each of the steps and methods used in the calculation, which include acquisition and formation of the test samples, processing, and finally the calculation of the parameters needed to obtain the ejection fraction. Two image segmentation methods were compared following a methodological framework that is the same in the initial stages of processing (filtering and image enhancement) and differs only in the final stage, where the algorithms are implemented (Active Contour and Region Growing). The results were compared with the measurements obtained by two cardiology specialists, who calculated the ejection fraction of the study samples using the traditional method, which consists of drawing the region of interest directly on the echocardiography equipment and applying a simple equation to calculate the desired value. The results showed that if the quality of the video samples is good (i.e., the contrast is visibly improved after pre-processing), the values provided by the tool are substantially close to those reported by the physicians; in addition, the agreement between the physicians does not vary significantly.
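
The final step of such a tool reduces to the standard definition of ejection fraction from the end-diastolic volume (EDV) and end-systolic volume (ESV), both of which appear in the keywords of this entry: EF = (EDV - ESV) / EDV x 100. The sketch below covers only that step; how the volumes are obtained from the segmented frames is not described in the abstract and is left out. The function name and the example volumes are assumptions.

```python
def ejection_fraction(edv_ml, esv_ml):
    """Ejection fraction in percent from end-diastolic and end-systolic volumes."""
    if edv_ml <= 0:
        raise ValueError("EDV must be positive")
    if not 0 <= esv_ml <= edv_ml:
        raise ValueError("ESV must lie between 0 and EDV")
    return 100.0 * (edv_ml - esv_ml) / edv_ml

# Hypothetical volumes in millilitres.
ef = ejection_fraction(edv_ml=120.0, esv_ml=50.0)
```

The input checks reject physically impossible volume pairs, which is useful when the volumes come from an automatic segmentation that can occasionally fail.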

Keywords: echocardiography, DICOM, processing, segmentation, EDV, ESV, ejection fraction

Procedia PDF Downloads 399
614 The Evolution of Amazon Alexa: From Voice Assistant to Smart Home Hub

Authors: Abrar Abuzaid, Maha Alaaeddine, Haya Alesayi

Abstract:

This project is centered around understanding the usage and impact of Alexa, Amazon's popular virtual assistant, in everyday life. Alexa, known for its integration into devices like Amazon Echo, offers functionalities such as voice interaction, media control, providing real-time information, and managing smart home devices. Our primary focus is to conduct a straightforward survey aimed at uncovering how people use Alexa in their daily routines. We plan to reach out to a wide range of individuals to get a diverse perspective on how Alexa is being utilized for various tasks, the frequency and context of its use, and the overall user experience. The survey will explore the most common uses of Alexa, its impact on daily life, features that users find most beneficial, and improvements they are looking for. This project is not just about collecting data but also about understanding the real-world applications of a technology like Alexa and how it fits into different lifestyles. By examining the responses, we aim to gain a practical understanding of Alexa's role in homes and possibly in workplaces. This project will provide insights into user satisfaction and areas where Alexa could be enhanced to meet the evolving needs of its users. It’s a step towards connecting technology with everyday life, making it more accessible and user-friendly.

Keywords: Amazon Alexa, artificial intelligence, smart speaker, natural language processing

Procedia PDF Downloads 17
613 Finding a Paraguayan Voice: The Indigenous Language Guarani in Performances of Paraguayan Female Singers

Authors: Romy Martinez

Abstract:

This paper focuses on the use of the indigenous language Guarani in Paraguayan popular song and on some key interpreters born between the 1930s and 1980s. It analyses two representative musical genres of Paraguay, the Polka Paraguaya and the Guarania. The lyrics of these genres follow one of four poetic-linguistic forms: entirely in Guarani, entirely in Spanish, bilingual (alternating verses in Guarani and Spanish), or in Jopará, a form in which words of both languages may be mixed in a single verse. Through these forms, the lyrics alternate and combine the indigenous voice with the one introduced with colonisation, in turn reflecting how Guarani seems to move constantly back and forth between a position of disdain and one of value within Paraguayan society. Through analysing recordings of Polkas Paraguayas and Guaranias, the paper identifies three styles of singing adopted by female singers who include these genres in their repertoires, namely Paraguayan classical folk, Paraguayan folk, and Paraguayan pop-folk. This analysis is informed by a pilot study which consisted of online interviews with several Paraguayan artists, revealing significant aspects of their backgrounds and musical influences. In addition, it draws on autoethnographic approaches, building on the experience of the music researcher and singer. From a decolonising perspective, the paper brings together the distinctive voices and sounds expressed in popular songs from a marginalised country, language, and gender.

Keywords: female singers, Guarani, Paraguayan song, performance

Procedia PDF Downloads 167
612 Particle Filter Supported with the Neural Network for Aircraft Tracking Based on Kernel and Active Contour

Authors: Mohammad Izadkhah, Mojtaba Hoseini, Alireza Khalili Tehrani

Abstract:

In this paper, we present a new method for tracking flying targets in color video sequences based on contour and kernel information. The aim of this work is to overcome the problem of losing the target under changing light, large displacement, changing speed, and occlusion. The proposed method has three steps: estimating the target location with a particle filter, segmenting the target region with a neural network, and finding the exact contour with the greedy snake algorithm. The method uses both region and contour information to create the target candidate model, and this model is dynamically updated during tracking. To avoid the accumulation of errors when updating, the target region is given to a perceptron neural network to separate the target from the background. Its output is then used to calculate the exact size and center of the target. It is also used as the initial contour for the greedy snake algorithm to find the exact edge of the target. The proposed algorithm has been tested on a database that contains many challenges, such as the high speed and agility of aircraft, background clutter, occlusions, and camera movement. The experimental results show that the use of the neural network increases the accuracy of tracking and segmentation.
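
The first of the three steps, location estimation with a particle filter, can be illustrated with a minimal sketch. It is not the authors' implementation: the random-walk motion model, the Gaussian likelihood around a hypothetical measured position, and the name `particle_filter_step` are all assumptions made for illustration. In the paper the particle weights would come from the kernel and contour model rather than from a direct position measurement.

```python
import math
import random


def particle_filter_step(particles, measurement, motion_sigma, meas_sigma, rng):
    """One predict/update/resample cycle over 2-D target positions.

    Assumed models (not from the paper): random-walk motion and a Gaussian
    likelihood centred on a measured (x, y) position.
    """
    # Predict: diffuse every particle with random-walk motion noise.
    predicted = [
        (x + rng.gauss(0.0, motion_sigma), y + rng.gauss(0.0, motion_sigma))
        for x, y in particles
    ]

    # Update: weight each particle by how well it explains the measurement.
    mx, my = measurement
    weights = [
        math.exp(-((x - mx) ** 2 + (y - my) ** 2) / (2.0 * meas_sigma ** 2))
        for x, y in predicted
    ]
    total = sum(weights)
    if total == 0.0:
        # Every weight underflowed; fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]

    # Estimate: weighted mean of the predicted particles.
    estimate = (
        sum(w * x for w, (x, _) in zip(weights, predicted)),
        sum(w * y for w, (_, y) in zip(weights, predicted)),
    )

    # Resample: draw a new particle set in proportion to the weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


# Hypothetical usage: previous location (48, 52), new measurement (50, 50).
rng = random.Random(0)
particles = [(48.0, 52.0)] * 1000
particles, estimate = particle_filter_step(particles, (50.0, 50.0), 5.0, 3.0, rng)
```

The estimate falls between the previous location and the measurement, pulled toward whichever of the two has the smaller assumed noise.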

Keywords: video tracking, particle filter, greedy snake, neural network

Procedia PDF Downloads 307
611 Computer-Aided Classification of Liver Lesions Using Contrasting Features Difference

Authors: Hussein Alahmer, Amr Ahmed

Abstract:

Liver cancer is one of the common diseases that cause death. Early detection is important for diagnosis and for reducing mortality. Improvements in medical imaging and image processing techniques have significantly enhanced the interpretation of medical images. Computer-Aided Diagnosis (CAD) systems based on these techniques play a vital role in the early detection of liver disease and hence reduce the liver cancer death rate. This paper presents an automated CAD system consisting of three stages: first, automatic liver segmentation and lesion detection; second, feature extraction; and finally, classification of liver lesions as benign or malignant using the novel contrasting feature-difference approach. Several types of intensity and texture features are extracted from both the lesion area and its surrounding normal liver tissue. The difference between the features of the two areas is then used as the new lesion descriptor. Machine learning classifiers are then trained on the new descriptors to automatically classify liver lesions as benign or malignant. The experimental results show promising improvements. Moreover, the proposed approach can overcome the problems of varying ranges of intensity and texture between patients, demographics, and imaging devices and settings.
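
The contrasting feature-difference idea can be sketched in a few lines. This is only an illustration under stated assumptions: the two features used here (mean and standard deviation of intensity) and the names `region_features` and `contrast_descriptor` are hypothetical stand-ins for the intensity and texture features the abstract mentions.

```python
from statistics import mean, pstdev


def region_features(pixels):
    """Two simple intensity features for a region given as a flat pixel list."""
    return {"mean_intensity": mean(pixels), "std_intensity": pstdev(pixels)}


def contrast_descriptor(lesion_pixels, normal_pixels):
    """Lesion features minus the features of the surrounding normal tissue."""
    lesion = region_features(lesion_pixels)
    normal = region_features(normal_pixels)
    return {name: lesion[name] - normal[name] for name in lesion}


# Hypothetical pixel values for one scan.
lesion = [60, 62, 58, 60]
normal = [100, 104, 96, 100]
descriptor = contrast_descriptor(lesion, normal)

# The same scan with a uniform brightness offset of 30, as might result from
# a different device setting, yields the same descriptor.
shifted = contrast_descriptor([p + 30 for p in lesion], [p + 30 for p in normal])
```

Because the descriptor is a difference, a brightness offset that affects the lesion and its surroundings equally cancels out, which is one way to read the claim about varying intensity ranges between patients and devices.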

Keywords: CAD system, difference of feature, fuzzy c means, lesion detection, liver segmentation

Procedia PDF Downloads 287
610 Development of Internet of Things (IoT) with Mobile Voice Picking and Cargo Tracing Systems in Warehouse Operations of Third-Party Logistics

Authors: Eugene Y. C. Wong

Abstract:

Increased market competition, customer expectations, and warehouse operating costs in third-party logistics have motivated continuous efforts to improve operational efficiency in warehouse logistics. Cargo tracing in the order picking process consumes excessive time for warehouse operators when handling the enormous quantities of goods flowing through the warehouse each day. In this research, Internet of Things (IoT) technology with mobile cargo tracing apps and database management systems is developed to facilitate cargo tracing and reduce its time in the order picking process of a third-party logistics firm. An operation review is carried out in the firm, and opportunities for improvement are identified, including inaccurate inventory records in the warehouse management system, excessive tracing time for stored products, and product misdelivery. The facility layout has been improved by modifying the designated locations of various types of products. The relationships among the pick and pack processing time, cargo tracing time, delivery accuracy, inventory turnover, and inventory count operation time in the warehouse are evaluated. The correlation of the factors affecting the overall cycle time is analysed. A mobile app is developed with MIT App Inventor and the Access database management system to facilitate cargo tracking anytime, anywhere. The information flow framework from the warehouse database system to cloud document sharing, and further to the mobile app device, is developed. The improvement in cargo tracing performance within the order processing cycle time of warehouse operators has been measured and evaluated. The developed mobile voice picking and tracking systems bring significant benefits to the third-party logistics firm, including eliminating unnecessary cargo tracing time in the order picking process and reducing warehouse operators' overtime costs.
As future development, a voice picking system is planned for the developed mobile apps to further improve the picking time and cycle count of warehouse operators.

Keywords: warehouse, order picking process, cargo tracing, mobile app, third-party logistics

Procedia PDF Downloads 344
609 A Fast Parallel and Distributed Type-2 Fuzzy Algorithm Based on Cooperative Mobile Agents Model for High Performance Image Processing

Authors: Fatéma Zahra Benchara, Mohamed Youssfi, Omar Bouattane, Hassan Ouajji, Mohamed Ouadi Bensalah

Abstract:

The aim of this paper is to present a distributed implementation of the Type-2 Fuzzy algorithm in a parallel and distributed computing environment based on mobile agents. The proposed algorithm is designed to be implemented on an SPMD (Single Program Multiple Data) architecture based on cooperative mobile agents, following the AVPE (Agent Virtual Processing Element) model, in order to improve the processing resources needed to perform big data image segmentation. In this work, we focus on applying this algorithm to a big data MRI (Magnetic Resonance Imaging) image of size (n x m). The image is encapsulated in the mobile agent team leader in order to be split into (n x m) pixels, one per AVPE. Each AVPE performs the segmentation, exchanges its results, and maintains asynchronous communication with its team leader until the algorithm converges. Some interesting experimental results are obtained in terms of the accuracy and efficiency of the proposed implementation, thanks to the several useful capabilities that mobile agents introduce into this distributed computational model.

Keywords: distributed type-2 fuzzy algorithm, image processing, mobile agents, parallel and distributed computing

Procedia PDF Downloads 385
608 Indian Brands Speak Through Colors That Is ‘Culturally Vibrant’

Authors: Ranjana Dani

Abstract:

Brand communication narratives in India have evolved to reflect a vibrant and intriguing tone of voice inspired by a rich cultural heritage, while addressing the culturally alert attitude of the contemporary global Indian. Brands are strongly associated with the organization's values, vision, and mission and portray these through a specific ‘look and feel’ and ‘tone of voice’. It is within the brand’s visual language that COLOUR has evolved to become a most powerful weapon in the designer’s arsenal. Color is big business in Brand Design! A brand is a ‘collection of perceptions’; meaningful brand connect is about striving to occupy head and heart space in consumers. The persona of the young Indian reflects a deep attachment to cultural roots, as seen through the characteristic of ‘Indie Pride,’ blended with the ambitious, aspirational traits of a modern ‘global citizen’. Studies on ‘Color Perceptions’ indicate a trend that amplifies this, and hence brands reflect a GLOCAL palette, a Global and Local Blend. This paper establishes this through case studies that expand on the inspirations, selection processes, and use of innovative color palettes crafted by some dynamic brand designers. This throws light on the role of color as it generates visual impact and recall for successful brands.

Keywords: colour palettes, brand design and business, cultural context, colour perceptions, glocal, contemporaneity

Procedia PDF Downloads 49
607 Engendered Noises: The Gender Politics of Sensorial Pleasure in Neoliberal Korean Food Commercials

Authors: Eunyup Yeom

Abstract:

The roles of males and females in the context of cuisine have developed into stereotypes throughout history. However, with Korea’s fast advancement in politics, technology, society, and social standards, gender stereotypes have become blurred. This is not to say that such stereotypes no longer exist, for they remain present in media and advertisements, embedding ‘idealistic’ ideas in the unconscious minds of viewers. Many media outlets, especially commercials, portray males expressing pleasure in the food [that they are advertising] through audible qualities generally considered ‘rude’ and ‘unmannered’ in Korean society. Females, on the other hand, express such pleasures only verbally. This stereotype is displayed bluntly in commercials for instant noodles, namely ramen. This research explores the cultural significance of a type of audible gesture found in Korean speech, termed the Fricative Voice Gesture (FVG). There are two forms of FVG: the reactive and the prosodic. The reactive FVG is a legitimate form of expression, while the prosodic FVG works as a speech intensifier. To understand this stereotype of who is authorized to express sensorial pleasure as a reactive FVG as opposed to a prosodic FVG, information was extracted from interviews and from the dissection of numerous ramen/instant noodle commercials and their appearances in other media. The commercials were meticulously analyzed in all aspects: dialogue, featured contents, background music, the actors and/or actresses selling the product, body language, and voice gestures. To understand the exact impact these commercials have on the audience, each commercial was viewed with an interviewee. There were three main informants in this research, all of whom were Korean students residing in South Korea. All three interviewees were able to attend interview and commercial viewing sessions via Skype.
Overall, this research builds on and supports Harkness’s statement that the reactive FVG is a recognizable index of the privileging of males in Korean cultural norms and that, in parallel, food commercials still conform to male ideals and fantasies.

Keywords: advertisement, food politics, fricative voice gestures, gender politics

Procedia PDF Downloads 198
606 Automatic Differential Diagnosis of Melanocytic Skin Tumours Using Ultrasound and Spectrophotometric Data

Authors: Kristina Sakalauskiene, Renaldas Raisutis, Gintare Linkeviciute, Skaidra Valiukeviciene

Abstract:

Cutaneous melanoma is a melanocytic skin tumour that has a very poor prognosis, is highly resistant to treatment, and tends to metastasize. The thickness of a melanoma is one of the most important biomarkers for disease stage, prognosis, and surgery planning. In this study, we hypothesized that the automatic analysis of spectrophotometric images and high-frequency ultrasonic 2D data can improve the differential diagnosis of cutaneous melanoma and provide additional information about tumour penetration depth. This paper presents a novel complex automatic system for non-invasive differential diagnosis of melanocytic skin tumours (MST) and evaluation of their penetration depth. The system is composed of region-of-interest segmentation in spectrophotometric images and high-frequency ultrasound data, quantitative parameter evaluation, informative feature extraction, and classification with a linear regression classifier. The segmentation of the melanocytic skin tumour region in the ultrasound image is based on parametric integrated backscattering coefficient calculation. The segmentation of the optical image is based on Otsu thresholding. In total, 29 quantitative tissue characterization parameters were evaluated using ultrasound data (11 acoustical, 4 shape and 15 textural parameters), along with 55 quantitative features of dermatoscopic and spectrophotometric images (using total melanin, dermal melanin, blood and collagen SIAgraphs acquired with the spectrophotometric imaging device SIAscope). In total, 102 melanocytic skin lesions (including 43 cutaneous melanomas) were examined using the SIAscope and an ultrasound system with a 22 MHz center frequency single-element transducer. The diagnosis and Breslow thickness (pT) of each MST were evaluated during routine histological examination after excision and used as a reference.
The results of this study have shown that automatic analysis of spectrophotometric and high-frequency ultrasound data can improve the non-invasive classification accuracy of early-stage cutaneous melanoma and provide supplementary information about tumour penetration depth.
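
Otsu thresholding, which the abstract names for segmenting the optical image, picks the grey level that maximizes the between-class variance of the two resulting pixel groups. The following is a minimal textbook sketch, assuming 8-bit integer intensities in a flat list; the function name `otsu_threshold` and the sample values are illustrative and not taken from the study.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form one class and pixels with value > t the other.
    """
    # Build the intensity histogram.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    count0 = 0  # number of pixels at or below the candidate threshold
    sum0 = 0    # intensity sum of those pixels
    for t in range(levels):
        count0 += hist[t]
        sum0 += t * hist[t]
        if count0 == 0:
            continue  # lower class still empty
        count1 = total - count0
        if count1 == 0:
            break  # upper class empty from here on
        mean0 = sum0 / count0
        mean1 = (sum_all - sum0) / count1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = count0 * count1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Hypothetical image: a dark lesion (values near 10) on bright skin (near 200).
image = [10] * 50 + [12] * 50 + [200] * 50 + [202] * 50
t = otsu_threshold(image)
lesion_mask = [p <= t for p in image]
```

For a clearly bimodal histogram like this one, any threshold between the two intensity clusters gives the same split, and the sketch returns the lowest such value.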

Keywords: cutaneous melanoma, differential diagnosis, high-frequency ultrasound, melanocytic skin tumours, spectrophotometric imaging

Procedia PDF Downloads 243
605 Embedded Semantic Segmentation Network Optimized for Matrix Multiplication Accelerator

Authors: Jaeyoung Lee

Abstract:

Autonomous driving systems require high reliability to provide people with a safe and comfortable driving experience. However, despite the development of a number of vehicle sensors, it is difficult to always provide high perception performance in driving environments that vary with time and season. Image segmentation using deep learning, which has evolved rapidly in recent years, provides stable, high recognition performance in various road environments. However, since the system controls a vehicle in real time, a highly complex deep learning network cannot be used due to time and memory constraints. Moreover, efficient networks are optimized for GPU environments, so their performance degrades in embedded processor environments equipped with simple hardware accelerators. In this paper, a semantic segmentation network, the matrix multiplication accelerator network (MMANet), optimized for the matrix multiplication accelerator (MMA) on Texas Instruments digital signal processors (TI DSP), is proposed to improve the recognition performance of autonomous driving systems. The proposed method is designed to maximize the number of layers that can be executed in a limited time to provide reliable driving environment information in real time. First, the number of channels in the activation map is fixed to fit the structure of the MMA. The lack of information caused by fixing the number of channels is resolved by increasing the number of parallel branches. Second, an efficient convolution is selected depending on the size of the activation. Since the MMA is fixed hardware, normal convolution may be more efficient than depthwise separable convolution, depending on memory access overhead. Thus, the convolution type is decided according to the output stride to increase network depth. In addition, memory access time is minimized by processing operations only in the L3 cache. Lastly, reliable contexts are extracted using the extended atrous spatial pyramid pooling (ASPP).
The suggested method obtains stable features from an extended path by increasing the kernel size and accessing consecutive data. In addition, it consists of two ASPPs to obtain high-quality contexts using the restored shape without global average pooling paths, since the layer uses the MMA as a simple adder. To verify the proposed method, an experiment is conducted using perfsim, a timing simulator, and the Cityscapes validation set. The proposed network can process an image with 640 x 480 resolution in 6.67 ms, so six cameras can be used to identify the surroundings of the vehicle at 20 frames per second (FPS). In addition, it achieves 73.1% mean intersection over union (mIoU), which is the highest recognition rate among embedded networks on the Cityscapes validation set.
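
The six-camera claim can be checked with a short piece of arithmetic. The sketch below assumes that images from all cameras are processed one after another on a single accelerator and that nothing else consumes time; both are simplifying assumptions, and the function names are illustrative.

```python
def max_rate_per_camera(latency_ms, cameras):
    """Frames per second available to each camera under sequential processing."""
    images_per_second = 1000.0 / latency_ms
    return images_per_second / cameras


def budget_per_image_ms(cameras, fps):
    """Time available per image when every camera must reach the given frame rate."""
    return 1000.0 / (cameras * fps)


# Figures from the abstract: 6.67 ms per 640 x 480 image, six cameras, 20 FPS.
rate = max_rate_per_camera(6.67, 6)      # about 25 frames per second per camera
budget = budget_per_image_ms(6, 20)      # about 8.33 ms per image
fits = 6.67 <= budget                    # the measured latency is within budget
```

Under these assumptions the reported latency leaves roughly 1.7 ms of headroom per image at the stated 20 FPS.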

Keywords: edge network, embedded network, MMA, matrix multiplication accelerator, semantic segmentation network

Procedia PDF Downloads 94
604 Critical Thinking and Academic Writing: A Case Study

Authors: Mubina Rauf

Abstract:

Critical thinking is a highly valued outcome of university education. There is agreement in the literature that it is demonstrated through the abilities to highlight issues and assumptions, find links between ideas and concepts, make correct inferences, evaluate evidence or authority, and deduce conclusions (Tsui, 2002). Although critical thinking plays a significant role in developing all academic skills, its role in developing writing skills is particularly significant (Kurfiss, 1988). Student academic writing (SAW) is an observable output of critical thinking (Wilson, 2016). When students apply critical thinking to their writing, they present clear, accurate, significant, and logical arguments, constructing their own voice in the form of an essay or dissertation (Matsuda, 2001). This presentation will show how a rubric can be used to find evidence of critical thinking in SAW. Participants will experience how evidence-based written arguments supported by background knowledge and authorial voice can develop students into efficient critical thinkers. Participants will have an opportunity to use the rubric to find evidence of critical thinking in SAW samples. This presentation is intended for classroom teachers with or without basic knowledge of implementing critical thinking in academic settings. Participants will also learn tips on how various features of critical thinking can be developed among students. After the session, participants will be able to use or adapt the rubric according to their needs to find evidence of critical thinking in SAW within their own context.

Keywords: critical thinking, rubric, student academic writing, argumentation, text analysis

Procedia PDF Downloads 37
603 Vehicular Speed Detection Camera System Using Video Stream

Authors: C. A. Anser Pasha

Abstract:

In this paper, a new Vehicular Speed Detection Camera System (SDCS) is presented that is applicable as an alternative to traditional radars, with the same or even better accuracy. The real-time measurement and analysis of various traffic parameters, such as speed and number of vehicles, are increasingly required in traffic control and management. Image processing techniques are now considered an attractive and flexible method for automatic analysis and data collection in traffic engineering. Various algorithms based on image processing techniques have been applied to detect multiple vehicles and track them. The SDCS process can be divided into three successive phases. The first phase is object detection, which uses a hybrid algorithm that combines an adaptive background subtraction technique with a three-frame differencing algorithm, rectifying the major drawback of using adaptive background subtraction alone. The second phase is object tracking, which consists of three successive operations: object segmentation, object labeling, and object center extraction. The object tracking operation takes into consideration the different possible scenarios for a moving object: simple tracking, the object has left the scene, the object has entered the scene, the object is crossed by another object, and one object leaves while another enters the scene. The third phase is speed calculation, in which the speed is calculated from the number of frames the object takes to pass through the scene.
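
Two of the ideas above lend themselves to a small sketch: three-frame differencing for detection and the frame-count speed calculation. This is not the SDCS implementation. It omits the adaptive background subtraction half of the hybrid detector, and the threshold, frame rate, and scene length in metres are hypothetical parameters that a real system would have to calibrate.

```python
def three_frame_difference(prev, curr, nxt, threshold):
    """Mark a pixel as moving only if it differs from both neighbouring frames.

    Frames are equally sized 2-D lists of grey-level intensities.
    """
    rows, cols = len(curr), len(curr[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            changed_since_prev = abs(curr[r][c] - prev[r][c]) > threshold
            changes_by_next = abs(nxt[r][c] - curr[r][c]) > threshold
            mask[r][c] = changed_since_prev and changes_by_next
    return mask


def speed_kmh(frames_in_scene, fps, scene_length_m):
    """Speed from the number of frames an object needs to cross the scene."""
    seconds = frames_in_scene / fps
    return scene_length_m / seconds * 3.6  # m/s to km/h


# A bright object moving one pixel per frame along a single image row.
prev = [[0, 9, 0, 0, 0]]
curr = [[0, 0, 9, 0, 0]]
nxt = [[0, 0, 0, 9, 0]]
mask = three_frame_difference(prev, curr, nxt, threshold=5)

# Hypothetical calibration: a 20 m scene crossed in 30 frames at 30 frames per second.
speed = speed_kmh(frames_in_scene=30, fps=30, scene_length_m=20.0)
```

Requiring a change against both neighbouring frames keeps only the object's current position and suppresses the trailing "ghost" that a plain two-frame difference leaves behind.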

Keywords: radar, image processing, detection, tracking, segmentation

Procedia PDF Downloads 425
602 Fruit Identification System in Sweet Orange Citrus (L.) Osbeck Using Thermal Imaging and Fuzzy

Authors: Ingrid Argote, John Archila, Marcelo Becker

Abstract:

In agriculture, intelligent systems applications have generated great advances in automating some of the processes in the production chain. To improve the efficiency of those systems, a vision system is proposed to estimate the number of fruits on sweet orange trees. This work presents a system proposal using thermal image capture and fuzzy logic. A bibliographical review has been done to analyze the state of the art of the different systems used in fruit recognition, and also the different applications of thermography in agricultural systems. The algorithm developed for this project uses the metrics of the fuzziness parameter for contrast improvement and segmentation of the image; for the counting algorithm, the Hough transform was used. To validate the proposed algorithm, a bank of images of sweet orange Citrus (L.) Osbeck, acquired at the Maringá Farm, was created. The tests with the algorithm indicated that the temperature difference between the tree branches and the fruit is not very high, which makes image segmentation based on this difference difficult and increases the number of false positives in the fruit counting algorithm. Recognition of isolated fruits with the proposed algorithm had an overall accuracy of 90.5%; for grouped fruits, the accuracy was 81.3%. The experiments show the need for more suitable hardware to better recognize small temperature changes in the image.

Keywords: agricultural systems, Citrus, fuzzy logic, thermal images

Procedia PDF Downloads 205
601 An Intelligent Text Independent Speaker Identification Using VQ-GMM Model Based Multiple Classifier System

Authors: Ben Soltane Cheima, Ittansa Yonas Kelbesa

Abstract:

Speaker Identification (SI) is the task of establishing the identity of an individual based on his/her voice characteristics. The SI task is typically achieved by two-stage signal processing: training and testing. The training process calculates speaker-specific feature parameters from the speech and generates speaker models accordingly. In the testing phase, speech samples from unknown speakers are compared with the models and classified. Even though the performance of speaker identification systems has improved due to recent advances in speech processing techniques, there is still a need for improvement. In this paper, a Closed-Set Text-Independent Speaker Identification System (CISI) based on a Multiple Classifier System (MCS) is proposed, using Mel Frequency Cepstrum Coefficients (MFCC) for feature extraction and a suitable combination of vector quantization (VQ) and the Gaussian Mixture Model (GMM), together with the Expectation Maximization (EM) algorithm, for speaker modeling. The use of a Voice Activity Detector (VAD) with a hybrid approach based on Short Time Energy (STE) and statistical modeling of background noise in the pre-processing step of feature extraction yields a better and more robust automatic speaker identification system. In addition, using the Linde-Buzo-Gray (LBG) clustering algorithm to initialize the GMM parameters estimated in the EM step improved the convergence rate and system performance. The system also uses a relative index as a confidence measure when the GMM and VQ identification results contradict each other. Simulation results obtained on the voxforge.org speech database using MATLAB highlight the efficacy of the proposed method compared to earlier work.
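
The Short Time Energy part of the voice activity detector can be illustrated as follows. This sketch is a simplification and not the paper's hybrid VAD: it assumes non-overlapping frames, estimates the noise floor from the first few frames (which are assumed to contain no speech), and uses a hypothetical threshold factor in place of the statistical background noise model.

```python
def short_time_energy(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(s * s for s in frame) / frame_len)
    return energies


def detect_voice(samples, frame_len, noise_frames, factor=3.0):
    """Flag frames whose energy exceeds a multiple of the estimated noise floor.

    Assumes the first `noise_frames` frames hold background noise only.
    """
    energies = short_time_energy(samples, frame_len)
    noise_floor = sum(energies[:noise_frames]) / noise_frames
    threshold = factor * noise_floor
    return [energy > threshold for energy in energies]


# Hypothetical signal: quiet background, a louder burst, then quiet again.
quiet = [0.01, -0.01] * 4
loud = [0.5, -0.5] * 4
flags = detect_voice(quiet + loud + quiet, frame_len=4, noise_frames=2)
```

Only the frames flagged as voiced would then be passed on to feature extraction, so that silence and background noise do not contribute to the speaker models.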

Keywords: feature extraction, speaker modeling, feature matching, Mel frequency cepstrum coefficient (MFCC), Gaussian mixture model (GMM), vector quantization (VQ), Linde-Buzo-Gray (LBG), expectation maximization (EM), pre-processing, voice activity detection (VAD), short time energy (STE), background noise statistical modeling, closed-set text-independent speaker identification system (CISI)

Procedia PDF Downloads 278