Search results for: voice segmentation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 934

634 The Evolution of Amazon Alexa: From Voice Assistant to Smart Home Hub

Authors: Abrar Abuzaid, Maha Alaaeddine, Haya Alesayi

Abstract:

This project is centered around understanding the usage and impact of Alexa, Amazon's popular virtual assistant, in everyday life. Alexa, known for its integration into devices like Amazon Echo, offers functionalities such as voice interaction, media control, providing real-time information, and managing smart home devices. Our primary focus is to conduct a straightforward survey aimed at uncovering how people use Alexa in their daily routines. We plan to reach out to a wide range of individuals to get a diverse perspective on how Alexa is being utilized for various tasks, the frequency and context of its use, and the overall user experience. The survey will explore the most common uses of Alexa, its impact on daily life, features that users find most beneficial, and improvements they are looking for. This project is not just about collecting data but also about understanding the real-world applications of a technology like Alexa and how it fits into different lifestyles. By examining the responses, we aim to gain a practical understanding of Alexa's role in homes and possibly in workplaces. This project will provide insights into user satisfaction and areas where Alexa could be enhanced to meet the evolving needs of its users. It’s a step towards connecting technology with everyday life, making it more accessible and user-friendly.

Keywords: Amazon Alexa, artificial intelligence, smart speaker, natural language processing

Procedia PDF Downloads 62
633 Finding a Paraguayan Voice: The Indigenous Language Guarani in Performances of Paraguayan Female Singers

Authors: Romy Martinez

Abstract:

This paper focuses on the use of the indigenous language Guarani in Paraguayan popular song and on some key interpreters born between the 1930s and 1980s. It analyses two representative musical genres of Paraguay, the Polka Paraguaya and Guarania. The lyrics of these genres follow one of four poetic-linguistic forms: to be entirely in Guarani, entirely in Spanish, bilingual (alternating verses in Guarani and Spanish), or in Jopará; the last being a form where words of both languages may be mixed in a single verse. Through these forms, the lyrics alternate and combine the indigenous voice with the one introduced with colonisation, in turn reflecting how Guarani seems to move constantly back and forth between a position of disdain and one of value within Paraguayan society. Through analysing recordings of Polkas Paraguayas and Guaranias, it identifies three styles of singing adopted by female singers who include these genres in their repertoires, namely Paraguayan classical folk, Paraguayan folk, and Paraguayan pop-folk. This analysis is informed by a pilot study which consisted of online interviews with several Paraguayan artists, revealing significant aspects of their backgrounds and musical influences. In addition, it draws on autoethnographic approaches, building on the experience of the music researcher and singer. From a decolonising perspective, the paper brings together the distinctive voices and sounds expressed in popular songs from a marginalised country, language, and gender.

Keywords: female singers, Guarani, Paraguayan song, performance

Procedia PDF Downloads 201
632 The Reach, Influence, and Acceptance of International Media Institutions in Local Language Broadcasting in Africa: A Case Study of VOA, DW, and BBC Amharic Services in Ethiopia

Authors: Aster Misganaw

Abstract:

This study investigates the reach, influence, and credibility of international broadcasters—specifically Voice of America (VOA), Deutsche Welle (DW), and British Broadcasting Corporation (BBC)—among Ethiopian audiences, comparing these perceptions to local media sources. Utilizing a mixed-methods approach that included quantitative surveys and qualitative interviews, the research reveals that the majority of respondents engage regularly with international broadcasters, with younger audiences showing a marked preference. Findings indicate that most of the participants perceive these international sources as more credible than local media, largely due to concerns over government influence on local reporting. Furthermore, the study finds that the majority of respondents believe international broadcasters significantly shape their understanding of both domestic and international issues, highlighting their critical role in public discourse. To enhance their relevance, it is recommended that international broadcasters incorporate more localized content while local media must work to improve their credibility and independence to better serve the Ethiopian public. This research contributes to the understanding of media consumption dynamics in Ethiopia, emphasizing the interplay between local and international narratives in shaping public opinion.

Keywords: international media, BBC, Deutsche Welle, Ethiopian media, Voice of America, audience

Procedia PDF Downloads 14
631 Particle Filter Supported with the Neural Network for Aircraft Tracking Based on Kernel and Active Contour

Authors: Mohammad Izadkhah, Mojtaba Hoseini, Alireza Khalili Tehrani

Abstract:

In this paper, we present a new method for tracking flying targets in color video sequences based on contour and kernel. The aim of this work is to overcome the problem of losing the target under changing light, large displacement, changing speed, and occlusion. The proposed method consists of three steps: estimating the target location with a particle filter, segmenting the target region with a neural network, and finding the exact contours with the greedy snake algorithm. The proposed method uses both region and contour information to create the target candidate model, and this model is dynamically updated during tracking. To avoid the accumulation of errors when updating, the target region is given to a perceptron neural network to separate the target from the background. Its output is then used for the exact calculation of the size and center of the target. It is also used as the initial contour for the greedy snake algorithm to find the exact edge of the target. The proposed algorithm has been tested on a database which contains many challenges, such as the high speed and agility of aircraft, background clutter, occlusions, and camera movement. The experimental results show that the use of the neural network increases the accuracy of tracking and segmentation.

Keywords: video tracking, particle filter, greedy snake, neural network

Procedia PDF Downloads 342
630 Computer-Aided Classification of Liver Lesions Using Contrasting Features Difference

Authors: Hussein Alahmer, Amr Ahmed

Abstract:

Liver cancer is one of the common diseases that cause death. Early detection is important for diagnosis and for reducing mortality. Improvements in medical imaging and image processing techniques have significantly enhanced the interpretation of medical images. Computer-Aided Diagnosis (CAD) systems based on these techniques play a vital role in the early detection of liver disease and hence reduce the liver cancer death rate. This paper presents an automated CAD system that consists of three stages: first, automatic liver segmentation and lesion detection; second, feature extraction; and finally, classification of liver lesions into benign and malignant using the novel contrasting feature-difference approach. Several types of intensity and texture features are extracted from both the lesion area and its surrounding normal liver tissue. The difference between the features of the two areas is then used as the new lesion descriptor. Machine learning classifiers are then trained on the new descriptors to automatically classify liver lesions into benign or malignant. The experimental results show promising improvements. Moreover, the proposed approach can overcome the problems of varying ranges of intensity and texture between patients, demographics, and imaging devices and settings.
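
The feature-difference idea in this abstract can be illustrated with a minimal sketch. It assumes the lesion and its surrounding normal tissue have already been segmented into two lists of pixel intensities, and it uses only the mean and standard deviation as stand-in intensity features. The function names, the sample values, and the choice of features are illustrative assumptions, not the authors' implementation.

```python
from statistics import mean, pstdev

def intensity_features(pixels):
    """Return a small feature vector (mean, standard deviation) for a region."""
    return [mean(pixels), pstdev(pixels)]

def contrast_descriptor(lesion_pixels, surrounding_pixels):
    """Lesion features minus surrounding-tissue features, element by element."""
    lesion = intensity_features(lesion_pixels)
    normal = intensity_features(surrounding_pixels)
    return [l - n for l, n in zip(lesion, normal)]

# Hypothetical intensities: the second scan is the first one brightened by 50.
scan_a = contrast_descriptor([60, 70, 80], [100, 110, 120])
scan_b = contrast_descriptor([110, 120, 130], [150, 160, 170])
```

In this toy case the two descriptors coincide because a uniform brightness offset cancels in the subtraction, which is one plausible reading of why a difference-based descriptor would be less sensitive to intensity ranges that vary between devices.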

Keywords: CAD system, difference of feature, fuzzy c means, lesion detection, liver segmentation

Procedia PDF Downloads 325
629 A Fast Parallel and Distributed Type-2 Fuzzy Algorithm Based on Cooperative Mobile Agents Model for High Performance Image Processing

Authors: Fatéma Zahra Benchara, Mohamed Youssfi, Omar Bouattane, Hassan Ouajji, Mohamed Ouadi Bensalah

Abstract:

The aim of this paper is to present a distributed implementation of the Type-2 Fuzzy algorithm in a parallel and distributed computing environment based on mobile agents. The proposed algorithm is designed to be implemented on an SPMD (Single Program Multiple Data) architecture, which is based on cooperative mobile agents as an AVPE (Agent Virtual Processing Element) model, in order to improve the processing resources needed for performing big data image segmentation. In this work, we focus on the application of this algorithm to process a big data MRI (Magnetic Resonance Imaging) image of size (n x m). The image is encapsulated on the mobile agent team leader in order to be split into (m x n) pixels, one per AVPE. Each AVPE performs and exchanges the segmentation results and maintains asynchronous communication with its team leader until the convergence of the algorithm. Interesting experimental results are obtained in terms of the accuracy and efficiency of the proposed implementation, thanks to the several useful capabilities that mobile agents bring to this distributed computational model.

Keywords: distributed type-2 fuzzy algorithm, image processing, mobile agents, parallel and distributed computing

Procedia PDF Downloads 429
628 Development of Internet of Things (IoT) with Mobile Voice Picking and Cargo Tracing Systems in Warehouse Operations of Third-Party Logistics

Authors: Eugene Y. C. Wong

Abstract:

The increased market competition, customer expectations, and warehouse operating costs in third-party logistics have motivated continuous exploration of ways to improve operational efficiency in warehouse logistics. Cargo tracing in the order picking process consumes excessive time for warehouse operators when handling the enormous quantities of goods flowing through the warehouse each day. In this research, Internet of Things (IoT) with mobile cargo tracing apps and database management systems are developed to facilitate cargo tracing and reduce its time in the order picking process of a third-party logistics firm. An operation review is carried out in the firm, and opportunities for improvement are identified, including inaccurate inventory records in the warehouse management system, excessive tracing time on stored products, and product misdelivery. The facility layout has been improved by modifying the designated locations of various types of products. The relationships among the pick and pack processing time, cargo tracing time, delivery accuracy, inventory turnover, and inventory count operation time in the warehouse are evaluated. The correlation of the factors affecting the overall cycle time is analysed. A mobile app is developed with the use of MIT App Inventor and the Access management database to facilitate cargo tracking anytime, anywhere. The information flow framework from the warehouse database system to cloud computing document-sharing, and further to the mobile app device, is developed. The improved performance on cargo tracing in the order processing cycle time of warehouse operators has been collected and evaluated. The developed mobile voice picking and tracking systems bring significant benefits to the third-party logistics firm, including eliminating unnecessary cargo tracing time in the order picking process and reducing warehouse operators' overtime costs.
As future development, the mobile tracking device is further planned to improve the picking time and cycle count of warehouse operators with a voice picking system in the developed mobile apps.

Keywords: warehouse, order picking process, cargo tracing, mobile app, third-party logistics

Procedia PDF Downloads 374
627 Indian Brands Speak Through Colors That Is ‘Culturally Vibrant’

Authors: Ranjana Dani

Abstract:

Brand communication narratives in India have evolved today to reflect the vibrant and intriguing tone of voice inspired by a rich cultural heritage while addressing the culturally alert attitude of the contemporary global Indian. Brands are strongly associated with the organization's values, vision, and mission and portray this through a specific ‘look and feel’ and ‘tone of voice’. It is within the brand’s visual language that COLOUR has evolved to become a most powerful weapon in the designer’s arsenal. Color is big business in Brand Design! A brand is a ‘collection of perceptions’; meaningful brand connect is about striving to occupy head and heart space in consumers. The persona of the young Indian reflects a deep attachment to cultural roots as seen through the characteristic of ‘Indie Pride,’ blended with the ambitious, aspirational traits of a modern ‘global citizen’. Studies on ‘Color Perceptions’ indicate a trend that amplifies this, and hence brands reflect a GLOCAL palette, a Global and Local Blend. This paper establishes this through case studies that expand on the inspirations, selection processes, and use of innovative color palettes crafted by some dynamic brand designers. This throws light on the role of color as it generates visual impact and recall for successful brands.

Keywords: colour palettes, brand design and business, cultural context, colour perceptions, glocal, contemporaneity

Procedia PDF Downloads 76
626 Engendered Noises: The Gender Politics of Sensorial Pleasure in Neoliberal Korean Food Commercials

Authors: Eunyup Yeom

Abstract:

The roles of male and female in the context of cuisine have developed into stereotypes throughout history. However, with Korea’s fast advancement in politics, technology, society, and social standards, gender stereotypes have become blurred. This is not to say that such stereotypes no longer exist, for they still remain present in media and advertisements, embedding ‘idealistic’ ideas into the unconscious minds of viewers. Many media outlets, especially commercials, portray males expressing pleasure in food [that they are advertising] through audible qualities generally considered ‘rude’ and ‘unmannered’ in Korean society. Females, on the other hand, express such pleasures only verbally. This stereotype is displayed bluntly in instant noodle, namely ramen, commercials. This research explores the cultural significance of a type of audible gesture found in Korean speech, which is termed the Fricative Voice Gesture (FVG). There are two forms of FVGs: the reactive and the prosodic. The reactive FVG is a legitimate form of expression, while the prosodic FVG works as a speech intensifier. In order to understand this stereotype of who is authorized to express sensorial pleasure as a reactive FVG as opposed to a prosodic FVG, information has been extracted from interviews, and numerous ramen/instant noodle commercials and their appearances in other media have been dissected. The commercials were meticulously analyzed in all aspects of dialogue, featured contents, background music, actors and/or actresses selling the product, body language, and voice gestures. To effectively understand the exact impact these commercials have on the audience, each commercial was viewed with an interviewee. In this research, there were three main informants, all of whom were Korean students residing in South Korea. All three interviewees were able to attend interview and commercial viewing sessions via Skype.
This research, overall, focuses and concludes on Harkness’s statement of how the reactive FVG is a recognizable index of the privileging of males for Korean culture norms and, in parallel, food commercials are still conforming to male ideals and fantasies.

Keywords: advertisement, food politics, fricative voice gestures, gender politics

Procedia PDF Downloads 226
625 Automatic Differential Diagnosis of Melanocytic Skin Tumours Using Ultrasound and Spectrophotometric Data

Authors: Kristina Sakalauskiene, Renaldas Raisutis, Gintare Linkeviciute, Skaidra Valiukeviciene

Abstract:

Cutaneous melanoma is a melanocytic skin tumour which has a very poor prognosis, is highly resistant to treatment, and tends to metastasize. The thickness of the melanoma is one of the most important biomarkers for the stage of disease, prognosis, and surgery planning. In this study, we hypothesized that the automatic analysis of spectrophotometric images and high-frequency ultrasonic 2D data can improve the differential diagnosis of cutaneous melanoma and provide additional information about tumour penetration depth. This paper presents a novel complex automatic system for non-invasive melanocytic skin tumour differential diagnosis and penetration depth evaluation. The system is composed of region of interest segmentation in spectrophotometric images and high-frequency ultrasound data, quantitative parameter evaluation, informative feature extraction, and classification with a linear regression classifier. The segmentation of the melanocytic skin tumour region in the ultrasound image is based on parametric integrated backscattering coefficient calculation. The segmentation of the optical image is based on Otsu thresholding. In total 29 quantitative tissue characterization parameters were evaluated by using ultrasound data (11 acoustical, 4 shape and 15 textural parameters) and 55 quantitative features of dermatoscopic and spectrophotometric images (using total melanin, dermal melanin, blood and collagen SIAgraphs acquired using the spectrophotometric imaging device SIAscope). In total 102 melanocytic skin lesions (including 43 cutaneous melanomas) were examined by using the SIAscope and an ultrasound system with a 22 MHz center frequency single element transducer. The diagnosis and Breslow thickness (pT) of each MST were evaluated during routine histological examination after excision and used as a reference.
The results of this study have shown that automatic analysis of spectrophotometric and high frequency ultrasound data can improve non-invasive classification accuracy of early-stage cutaneous melanoma and provide supplementary information about tumour penetration depth.
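
The abstract names Otsu thresholding for the optical images. As a rough illustration of that general technique only (not the authors' pipeline), the sketch below picks the grey level that maximizes the between-class variance of a histogram. It assumes integer intensities in the range 0 to 255 and a flat list of pixels; `otsu_threshold` and the sample data are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximizes between-class variance.

    Pixels with value <= t form one class, all others the second class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # pixel count of the lower class
    sum0 = 0    # intensity sum of the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break               # upper class empty, nothing left to split
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

# Hypothetical bimodal data: dark background (10) and bright region (200).
sample = [10] * 50 + [200] * 50
t = otsu_threshold(sample)
mask = [p > t for p in sample]
```

On the bimodal sample the threshold falls between the two intensity clusters, so the mask separates the bright region from the background.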

Keywords: cutaneous melanoma, differential diagnosis, high-frequency ultrasound, melanocytic skin tumours, spectrophotometric imaging

Procedia PDF Downloads 270
624 Embedded Semantic Segmentation Network Optimized for Matrix Multiplication Accelerator

Authors: Jaeyoung Lee

Abstract:

Autonomous driving systems require high reliability to provide people with a safe and comfortable driving experience. However, despite the development of a number of vehicle sensors, it is difficult to always provide high perception performance in driving environments that vary with time and season. The image segmentation method using deep learning, which has recently evolved rapidly, stably provides high recognition performance in various road environments. However, since the system controls a vehicle in real time, a highly complex deep learning network cannot be used due to time and memory constraints. Moreover, efficient networks are optimized for GPU environments, which degrades their performance in embedded processor environments equipped with simple hardware accelerators. In this paper, a semantic segmentation network, the matrix multiplication accelerator network (MMANet), optimized for the matrix multiplication accelerator (MMA) on Texas Instruments digital signal processors (TI DSP), is proposed to improve the recognition performance of autonomous driving systems. The proposed method is designed to maximize the number of layers that can be performed in a limited time to provide reliable driving environment information in real time. First, the number of channels in the activation map is fixed to fit the structure of the MMA. By increasing the number of parallel branches, the lack of information caused by fixing the number of channels is resolved. Second, an efficient convolution is selected depending on the size of the activation. Since the MMA is fixed, normal convolution may be more efficient than depthwise separable convolution, depending on memory access overhead. Thus, a convolution type is decided according to the output stride to increase network depth. In addition, memory access time is minimized by processing operations only in the L3 cache. Lastly, reliable contexts are extracted using the extended atrous spatial pyramid pooling (ASPP).
The suggested method gets stable features from an extended path by increasing the kernel size and accessing consecutive data. In addition, it consists of two ASPPs to obtain high-quality contexts using the restored shape without global average pooling paths, since the layer uses the MMA as a simple adder. To verify the proposed method, an experiment is conducted using perfsim, a timing simulator, and the Cityscapes validation sets. The proposed network can process an image with 640 x 480 resolution in 6.67 ms, so six cameras can be used to identify the surroundings of the vehicle at 20 frames per second (FPS). In addition, it achieves 73.1% mean intersection over union (mIoU), which is the highest recognition rate among embedded networks on the Cityscapes validation set.
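
The six-camera claim can be checked with simple arithmetic. The sketch below assumes the six images are processed one after another on a single accelerator within each frame period; the abstract does not state how the cameras are scheduled, so that assumption and the variable names are illustrative.

```python
FRAME_TIME_MS = 6.67   # per-image processing time reported in the abstract
NUM_CAMERAS = 6
TARGET_FPS = 20

frame_budget_ms = 1000 / TARGET_FPS           # time available per frame period
total_time_ms = NUM_CAMERAS * FRAME_TIME_MS   # cameras handled sequentially
fits = total_time_ms <= frame_budget_ms
```

Under that assumption, six images take about 40 ms, which fits inside the 50 ms period available at 20 FPS.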

Keywords: edge network, embedded network, MMA, matrix multiplication accelerator, semantic segmentation network

Procedia PDF Downloads 129
623 Critical Thinking and Academic Writing: A Case Study

Authors: Mubina Rauf

Abstract:

Critical thinking is a highly valued outcome of university education. There is agreement in the literature that it is demonstrated through the abilities to highlight issues and assumptions, find links between ideas and concepts, make correct inferences, evaluate evidence or authority, and deduce conclusions (Tsui, 2002). Although critical thinking plays a significant role in developing all academic skills, its role in developing writing skills is especially significant (Kurfiss, 1988). SAW (student academic writing) is an observable output of critical thinking (Wilson K., 2016). When students apply critical thinking to their writing, they present clear, accurate, significant, and logical arguments, constructing their own voice in the form of an essay or dissertation (Matsuda, 2001). This presentation will show how a rubric can be used to find evidence of critical thinking in SAW. Participants will experience how evidence-based written arguments supported by background knowledge and authorial voice can develop students into efficient critical thinkers. Participants will have an opportunity to use the rubric to find evidence of critical thinking in SAW samples. This presentation is intended for classroom teachers with or without basic knowledge of implementing critical thinking in academic settings. Participants will also learn tips on how various features of critical thinking can be developed among students. After the session, the participants will be able to use or adapt the rubric according to their needs to find evidence of critical thinking in SAW within their context.

Keywords: critical thinking, rubric, student academic writing, argumentation, text analysis

Procedia PDF Downloads 73
622 Vehicular Speed Detection Camera System Using Video Stream

Authors: C. A. Anser Pasha

Abstract:

In this paper, a new Vehicular Speed Detection Camera System (SDCS) is presented as an alternative to traditional radars, with the same or even better accuracy. The real-time measurement and analysis of various traffic parameters, such as speed and number of vehicles, are increasingly required in traffic control and management. Image processing techniques are now considered an attractive and flexible method for automatic analysis and data collection in traffic engineering. Various algorithms based on image processing techniques have been applied to detect multiple vehicles and track them. The SDCS processes can be divided into three successive phases. The first phase is object detection, which uses a hybrid algorithm that combines an adaptive background subtraction technique with a three-frame differencing algorithm, which rectifies the major drawback of using only adaptive background subtraction. The second phase is object tracking, which consists of three successive operations: object segmentation, object labeling, and object center extraction. The object tracking operation takes into consideration the different possible scenarios of the moving object, such as simple tracking, the object has left the scene, the object has entered the scene, the object is crossed by another object, and one object leaves while another enters the scene. The third phase is speed calculation, in which the speed is calculated from the number of frames the object takes to pass through the scene.
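
Two of the steps named in this abstract can be sketched briefly. The example covers only the three-frame differencing part of the detection phase (the adaptive background subtraction it is combined with is omitted) and a frame-count speed estimate. It assumes frames are flat lists of grey levels and that the real-world length of the observed scene and the camera frame rate are known; the function names and parameters are hypothetical.

```python
def three_frame_difference(prev, curr, nxt, threshold):
    """Mark a pixel as moving only if it differs from both neighbouring frames."""
    return [
        abs(c - p) > threshold and abs(n - c) > threshold
        for p, c, n in zip(prev, curr, nxt)
    ]

def speed_kmh(scene_length_m, frames_in_scene, fps):
    """Average speed of an object that needs `frames_in_scene` frames to cross the scene."""
    seconds = frames_in_scene / fps
    return scene_length_m / seconds * 3.6   # m/s to km/h

# Hypothetical values: a 20 m scene crossed in 30 frames at 30 frames per second.
moving = three_frame_difference([0, 0, 0], [0, 9, 0], [0, 0, 0], threshold=5)
speed = speed_kmh(20.0, 30, 30)
```

With these sample values the object crosses 20 m in one second, which corresponds to 72 km/h.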

Keywords: radar, image processing, detection, tracking, segmentation

Procedia PDF Downloads 467
621 Fruit Identification System in Sweet Orange Citrus (L.) Osbeck Using Thermal Imaging and Fuzzy

Authors: Ingrid Argote, John Archila, Marcelo Becker

Abstract:

In agriculture, applications of intelligent systems have generated great advances in automating some of the processes in the production chain. To improve the efficiency of those systems, a vision system is proposed to estimate the number of fruits in sweet orange trees. This work presents a system proposal using thermal image capture and fuzzy logic. A bibliographical review has been done to analyze the state of the art of the different systems used in fruit recognition, as well as the different applications of thermography in agricultural systems. The algorithm developed for this project uses the metrics of the fuzziness parameter for contrast improvement and segmentation of the image; the Hough transform was used for the counting algorithm. To validate the proposed algorithm, a bank of images of sweet orange Citrus (L.) Osbeck acquired at the Maringá Farm was created. The tests with the algorithm indicated that the temperature difference between the tree branches and the fruit is not very high, which makes image segmentation based on this difference difficult and increases the number of false positives in the fruit counting algorithm. Recognition of isolated fruits with the proposed algorithm presents an overall accuracy of 90.5%; for grouped fruits, the accuracy was 81.3%. The experiments show the need for more suitable hardware to better recognize small temperature changes in the image.

Keywords: agricultural systems, Citrus, fuzzy logic, thermal images

Procedia PDF Downloads 229
620 An Intelligent Text Independent Speaker Identification Using VQ-GMM Model Based Multiple Classifier System

Authors: Ben Soltane Cheima, Ittansa Yonas Kelbesa

Abstract:

Speaker Identification (SI) is the task of establishing the identity of an individual based on his/her voice characteristics. The SI task is typically achieved by two-stage signal processing: training and testing. The training process calculates speaker-specific feature parameters from the speech and generates speaker models accordingly. In the testing phase, speech samples from unknown speakers are compared with the models and classified. Even though the performance of speaker identification systems has improved due to recent advances in speech processing techniques, there is still a need for improvement. In this paper, a Closed-Set Text-Independent Speaker Identification System (CISI) based on a Multiple Classifier System (MCS) is proposed, using Mel Frequency Cepstrum Coefficients (MFCC) for feature extraction and a suitable combination of vector quantization (VQ) and Gaussian Mixture Model (GMM) together with the Expectation Maximization (EM) algorithm for speaker modeling. The use of a Voice Activity Detector (VAD) with a hybrid approach based on Short Time Energy (STE) and Statistical Modeling of Background Noise in the pre-processing step of the feature extraction yields a better and more robust automatic speaker identification system. In addition, using the Linde-Buzo-Gray (LBG) clustering algorithm to initialize the GMM parameters estimated in the EM step improved the convergence rate and system performance. The system also uses a relative index as a confidence measure when the GMM and VQ classifiers disagree in the identification process. Simulation results carried out on the voxforge.org speech database using MATLAB highlight the efficacy of the proposed method compared to earlier work.
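
The short-time energy part of the voice activity detector can be sketched as follows. This covers only a plain energy threshold; the statistical modeling of background noise that the abstract combines it with is not shown. The frame length, hop size, fixed threshold, and function names are illustrative assumptions rather than values from the paper.

```python
def short_time_energy(samples, frame_len, hop):
    """Energy (sum of squares) of each frame of `frame_len` samples, advanced by `hop`."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies

def detect_voice(samples, frame_len=4, hop=4, threshold=1.0):
    """Flag frames whose short-time energy exceeds a fixed threshold."""
    return [e > threshold for e in short_time_energy(samples, frame_len, hop)]

# Hypothetical signal: silence, a short burst of activity, then silence again.
signal = [0.0] * 8 + [1.0, -1.0] * 4 + [0.0] * 4
flags = detect_voice(signal)
```

Only the two frames covering the burst are flagged as active; in a real system the threshold would be adapted to the estimated noise level rather than fixed.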

Keywords: feature extraction, speaker modeling, feature matching, Mel frequency cepstrum coefficient (MFCC), Gaussian mixture model (GMM), vector quantization (VQ), Linde-Buzo-Gray (LBG), expectation maximization (EM), pre-processing, voice activity detection (VAD), short time energy (STE), background noise statistical modeling, closed-set text-independent speaker identification system (CISI)

Procedia PDF Downloads 309
619 The Development, Composition, and Implementation of Vocalises as a Method of Technical Training for the Adult Musical Theatre Singer

Authors: Casey Keenan Joiner, Shayna Tayloe

Abstract:

Classical voice training for the novice singer has long relied on the guidance and instruction of vocalise collections, such as those written and compiled by Marchesi, Lütgen, Vaccai, and Lamperti. These vocalise collections purport to encourage healthy vocal habits and instill technical longevity in both aspiring and established singers, though their scope has long been somewhat confined to the classical idiom. For pedagogues and students specializing in other vocal genres, such as musical theatre and CCM (contemporary commercial music), low-impact and pertinent vocal training aids are in short supply, and much of the suggested literature derives from classical methodology. While the tenets of healthy vocal production remain ubiquitous, specific stylistic needs and technical emphases differ from genre to genre and may require a specified extension of vocal acuity. As musical theatre continues to grow in popularity at both the professional and collegiate levels, the need for specialized training grows as well. Pedagogical literature geared specifically towards musical theatre (MT) singing and vocal production, while relatively uncommon, is readily accessible to the contemporary educator. Practitioners such as Norman Spivey, Mary Saunders Barton, Claudia Friedlander, Wendy Leborgne, and Marci Rosenberg continue to publish relevant research in the field of musical theatre voice pedagogy and have successfully identified many common MT vocal faults, their subsequent diagnoses, and their eventual corrections. Where classical methodology would suggest specific vocalises or training exercises to maintain corrected vocal posture following successful fault diagnosis, musical theatre finds itself without a relevant body of work towards which to transition.
By analyzing the existing vocalise literature by means of a specialized set of parameters, including but not limited to melodic variation, rhythmic complexity, vowel utilization, and technical targeting, we have composed a set of vocalises meant specifically to address the training and conditioning of adult musical theatre voices. These vocalises target many pedagogical tenets in the musical theatre genre, including but not limited to thyroarytenoid-dominant production, twang resonance, lateral vowel formation, and “belt-mix.” By implementing these vocalises in the musical theatre voice studio, pedagogues can efficiently communicate proper musical theatre vocal posture and kinesthetic connection to their students, regardless of age or level of experience. The composition of these vocalises serves MT pedagogues on a technical level as well as a sociological one. MT is a relative newcomer on the collegiate stage, and the academization of musical theatre methodologies has been a slow and arduous process. The conflation of classical and MT techniques and training methods has long plagued the world of voice pedagogy, and teachers often find themselves in positions of “cross-training,” that is, teaching students of both genres in one combined voice studio. As MT continues to establish itself on academic platforms worldwide, genre-specific literature and focused studies are both rare and invaluable. To ensure that modern students receive exacting and definitive training in their chosen fields, it becomes increasingly necessary for genres such as musical theatre to boast specified literature, and a collection of musical theatre-specific vocalises only aids in this effort. This collection of musical theatre vocalises is the first of its kind and provides genre-specific studios with a basis upon which to grow healthy, balanced voices built for the harsh conditions of the modern theatre stage.

Keywords: voice pedagogy, targeted methodology, musical theatre, singing

Procedia PDF Downloads 156
618 Alphabet Recognition Using Pixel Probability Distribution

Authors: Vaidehi Murarka, Sneha Mehta, Dishant Upadhyay

Abstract:

Our project topic is “Alphabet Recognition using pixel probability distribution”. The project uses techniques of Image Processing and Machine Learning in Computer Vision. Alphabet recognition is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text. It is widely used to convert books and documents into electronic files. OCR applications based on alphabet recognition are sometimes used in signature recognition, which is used in banks and other high-security buildings. One popular mobile application reads a visiting card and stores it directly in the contacts. OCR systems are also known to be used in radar systems for reading speeders' license plates, among other things. The implementation of our project has been done using Visual Studio and OpenCV (Open Source Computer Vision). Our algorithm is based on Neural Networks (machine learning). The project was implemented in three modules: (1) Training: This module aims at “Database Generation”. The database was generated using two methods: (a) Run-time generation: the database is generated at compilation time using the inbuilt fonts of the OpenCV library. Human intervention is not necessary for generating this database. (b) Contour-detection: a ‘jpeg’ template containing different fonts of an alphabet is converted to the weighted matrix using specialized functions (contour detection and blob detection) of OpenCV. The main advantage of this type of database generation is that the algorithm becomes self-learning and the final database requires little memory to be stored (119kb precisely). (2) Preprocessing: The input image is pre-processed using image processing concepts such as adaptive thresholding, binarizing, and dilating, and is made ready for segmentation. “Segmentation” includes extraction of lines, words, and letters from the processed text image. 
(3) Testing and prediction: The extracted letters are classified and predicted using the neural networks algorithm. The algorithm recognizes an alphabet based on certain mathematical parameters calculated using the database and weight matrix of the segmented image.
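The extraction of lines and letters from a binarized image, as mentioned under segmentation, can be illustrated with projection profiles. The abstract does not state how the authors implemented segmentation, so the following is only a minimal sketch of one common technique, not their method. The function names and the list-of-lists image representation (1 for ink, 0 for background) are assumptions made for illustration.

```python
def split_runs(profile):
    """Return (start, end) index pairs, end exclusive, for each run of non-zero entries."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a run of ink begins
        elif not value and start is not None:
            runs.append((start, i))        # a blank entry ends the run
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(binary):
    """Split a binary image into text lines using the row-wise ink count."""
    row_profile = [sum(row) for row in binary]
    return split_runs(row_profile)


def segment_letters(binary, top, bottom):
    """Split one text line (rows top..bottom-1) into letters using the column-wise ink count."""
    width = len(binary[0])
    col_profile = [sum(binary[r][c] for r in range(top, bottom)) for c in range(width)]
    return split_runs(col_profile)


# Hypothetical 5x5 page: two text lines, the first containing two letters.
page = [
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
]
lines = segment_lines(page)
letters = segment_letters(page, *lines[0])
```

Projection profiles only separate glyphs that are divided by a fully blank row or column, so touching letters would need a different approach such as the contour detection the abstract mentions.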

Keywords: contour-detection, neural networks, pre-processing, recognition coefficient, runtime-template generation, segmentation, weight matrix

Procedia PDF Downloads 389
617 Detecting Tomato Flowers in Greenhouses Using Computer Vision

Authors: Dor Oppenheim, Yael Edan, Guy Shani

Abstract:

This paper presents an image analysis algorithm to detect and count yellow tomato flowers in a greenhouse with uneven illumination conditions, complex growth conditions and different flower sizes. The algorithm is designed to be employed on a drone that flies in greenhouses to accomplish several tasks such as pollination and yield estimation. Detecting the flowers can provide useful information for the farmer, such as the number of flowers in a row, and the number of flowers that were pollinated since the last visit to the row. The developed algorithm is designed to handle the real-world difficulties in a greenhouse, which include varying lighting conditions, shadowing, and occlusion, while considering the computational limitations of the simple processor in the drone. The algorithm identifies flowers using an adaptive global threshold, segmentation over the HSV color space, and morphological cues. The adaptive threshold divides the images into darker and lighter images. Then, segmentation on the hue, saturation and value is performed accordingly, and classification is done according to size and location of the flowers. 1069 images of greenhouse tomato flowers were acquired in a commercial greenhouse in Israel, using two different RGB cameras: an LG G4 smartphone and a Canon PowerShot A590. The images were acquired from multiple angles and distances and were sampled manually at various periods along the day to obtain varying lighting conditions. Ground truth was created by manually tagging approximately 25,000 individual flowers in the images. Sensitivity analyses on the acquisition angle of the images, periods throughout the day, different cameras and thresholding types were performed. Precision, recall and their derived F1 score were calculated. Results indicate better performance for the view angle facing the flowers than any other angle. Acquiring images in the afternoon resulted in the best precision and recall results. 
Applying a global adaptive threshold improved the median F1 score by 3%. Results showed no difference between the two cameras used. Using hue values of 0.12-0.18 in the segmentation process provided the best results in precision and recall, and the best F1 score. The precision and recall average for all the images when using these values was 74% and 75% respectively with an F1 score of 0.73. Further analysis showed a 5% increase in precision and recall when analyzing images acquired in the afternoon and from the front viewpoint.
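The hue-based segmentation step can be sketched as a simple per-pixel test in HSV space. Only the hue range of 0.12-0.18 comes from the abstract; the assumption that hue is normalized to [0, 1], the saturation and value cut-offs, and all function names are illustrative choices, not the authors' implementation.

```python
import colorsys

# Hue range reported in the abstract, assumed here to be on a 0..1 scale.
HUE_MIN, HUE_MAX = 0.12, 0.18
# Hypothetical cut-offs to reject grey or dark pixels; not taken from the abstract.
MIN_SATURATION = 0.4
MIN_VALUE = 0.4


def is_flower_pixel(r, g, b):
    """Return True if an 8-bit RGB pixel falls inside the yellow hue band."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return HUE_MIN <= h <= HUE_MAX and s >= MIN_SATURATION and v >= MIN_VALUE


def flower_mask(image):
    """Build a boolean mask for an image given as rows of (r, g, b) tuples."""
    return [[is_flower_pixel(*pixel) for pixel in row] for row in image]


# Hypothetical 1x4 image: pure yellow, leaf green, white, orange-yellow.
sample = [[(255, 255, 0), (0, 128, 0), (255, 255, 255), (255, 200, 0)]]
mask = flower_mask(sample)
```

In a full pipeline the mask would still need the morphological and size-based filtering that the abstract describes before flowers are counted.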

Keywords: agricultural engineering, image processing, computer vision, flower detection

Procedia PDF Downloads 329
616 Video Object Segmentation for Automatic Image Annotation of Ethernet Connectors with Environment Mapping and 3D Projection

Authors: Marrone Silverio Melo Dantas, Pedro Henrique Dreyer, Gabriel Fonseca Reis de Souza, Daniel Bezerra, Ricardo Souza, Silvia Lins, Judith Kelner, Djamel Fawzi Hadj Sadok

Abstract:

The creation of a dataset is time-consuming and often discourages researchers from pursuing their goals. To overcome this problem, we present and discuss two solutions adopted for the automation of this process. Both optimize valuable user time and resources and support video object segmentation with object tracking and 3D projection. In our scenario, we acquire images from a moving robotic arm and, for each approach, generate distinct annotated datasets. We evaluated the precision of the annotations by comparing these with a manually annotated dataset, as well as the efficiency in the context of detection and classification problems. For detection support, we used YOLO and obtained for the projection dataset F1-Score, accuracy, and mAP values of 0.846, 0.924, and 0.875, respectively. For the tracking dataset, we achieved an F1-Score of 0.861 and an accuracy of 0.932, whereas mAP reached 0.894. In order to evaluate the quality of the annotated images used for classification problems, we employed deep learning architectures. We adopted the accuracy and F1-Score metrics for VGG, DenseNet, MobileNet, Inception, and ResNet. The VGG architecture outperformed the others for both projection and tracking datasets. For the projection dataset, it reached an accuracy and F1-Score of 0.997 and 0.993, respectively. Similarly, for the tracking dataset, it achieved an accuracy of 0.991 and an F1-Score of 0.981.
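For readers unfamiliar with the reported metrics, the following minimal sketch shows how precision, recall, and F1-Score are derived from detection counts. The function name and the example counts are hypothetical and are not taken from the paper's data.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Compute precision, recall and F1 from detection counts.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical counts: 8 correct detections, 2 spurious, 2 missed.
precision, recall, f1 = precision_recall_f1(true_pos=8, false_pos=2, false_neg=2)
```

F1 is the harmonic mean of precision and recall, so it stays low whenever either of the two is low.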

Keywords: RJ45, automatic annotation, object tracking, 3D projection

Procedia PDF Downloads 167
615 Participation in Decision Making and Work Outcomes: The Moderating Role of Ethical Climate

Authors: Ali Muhammad

Abstract:

The study examines the consequences of participation in decision making in Kuwaiti work organizations. The framework used in this study proposes that participation in decision making improves organizational ethical climate, which in turn increases employees' trust in their supervisor and trust in the organization. Furthermore, the model suggests that allowing employees to voice their opinions positively affects their perceptions of organizational justice. Providing employees with the opportunity to participate in decision making (voice) enhances their perceptions of the fairness of those decisions. Allowing employees to express their opinions and feelings about decisions being made shows that the organization respects and appreciates their views. This feeling of respect and appreciation reflects positively on employees' perception of justice. Survey data were collected from a sample of 292 employees working in Kuwaiti work organizations. Pearson correlation, non-parametric tests, and structural equation models were used to analyze the data. Results of the analysis show that participation in decision making enhances employee perception of ethical climate, which in turn increases perception of organizational justice and organizational trust. Implications of the findings and directions for future research are discussed.

Keywords: participation in decision making, organizational trust, trust in supervisor, organizational justice, ethical climate

Procedia PDF Downloads 113
614 Current Applications of Artificial Intelligence (AI) in Chest Radiology

Authors: Angelis P. Barlampas

Abstract:

Learning Objectives: The purpose of this study is to briefly inform the reader about the applications of AI in chest radiology. Background: Currently, there are 190 FDA-approved radiology AI applications, with 42 (22%) pertaining specifically to thoracic radiology. Imaging Findings or Procedure Details: Aids of AI in chest radiology: Detects and segments pulmonary nodules. Subtracts bone to provide an unobstructed view of the underlying lung parenchyma and provides further information on nodule characteristics, such as nodule location, nodule two-dimensional size or three-dimensional (3D) volume, change in nodule size over time, attenuation data (i.e., mean, minimum, and/or maximum Hounsfield units [HU]), morphological assessments, or combinations of the above. Reclassifies indeterminate pulmonary nodules into low or high risk with higher accuracy than conventional risk models. Detects pleural effusion. Differentiates tension pneumothorax from nontension pneumothorax. Detects cardiomegaly, calcification, consolidation, mediastinal widening, atelectasis, fibrosis and pneumoperitoneum. Automatically localizes vertebral segments, labels ribs and detects rib fractures. Measures the distance from the tube tip to the carina and localizes both endotracheal tubes and central vascular lines. Detects consolidation and progression of parenchymal diseases such as pulmonary fibrosis or chronic obstructive pulmonary disease (COPD). Can evaluate lobar volumes. Identifies and labels pulmonary bronchi and vasculature and quantifies air-trapping. Offers emphysema evaluation. Provides functional respiratory imaging, whereby high-resolution CT images are post-processed to quantify airflow by lung region and may be used to quantify key biomarkers such as airway resistance, air-trapping, ventilation mapping, lung and lobar volume, and blood vessel and airway volume. Assesses the lung parenchyma by way of density evaluation. 
Provides percentages of tissues within defined attenuation (HU) ranges, besides furnishing automated lung segmentation and lung volume information. Improves image quality for noisy images with a built-in denoising function. Detects emphysema, a common condition seen in patients with a history of smoking, and hyperdense or opacified regions, thereby aiding in the diagnosis of certain pathologies, such as COVID-19 pneumonia. It aids in cardiac segmentation and calcium detection, aorta segmentation and diameter measurements, and vertebral body segmentation and density measurements. Conclusion: The future is yet to come, but AI is already a helpful tool for daily practice in radiology. It is assumed that the continuing progress of computerized systems and the improvements in software algorithms will render AI the second hand of the radiologist.

Keywords: artificial intelligence, chest imaging, nodule detection, automated diagnoses

Procedia PDF Downloads 72
613 ‘A Ghost of One’s Own’: Spectral Intrusions and Trauma in the Poetry of Joanna Baillie and Anne Bannerman

Authors: Elli Karampela

Abstract:

In Specters of Marx (1993), Jacques Derrida refers to the ghost as an Other presence that occupies the space of the self and emanates from there, haunting in its shadowy pastness and threatening/striving to break free. In times of change, ghosts both reflect the dissolution of set principles and voice traumas of the past that create a sense of fear and instability. This paper observes the way female ghosts create connections with the living in the poetry of Joanna Baillie and Anne Bannerman, both integral, albeit under-researched in different ways, writers of the English Romantic period working in the aftermath of the French Revolution. Especially at the beginning of the nineteenth century, when ghost narratives were devoured by readers and enjoyed as stories that re-awakened sensation in times of revolution, there was at the same time fear of intrusion by terror’s unruly forces that threatened to turn the readers restless. The ghost was particularly dangerous because it was associated with memory and the intrusion of past trauma in the here and now. As will be seen, both Baillie and Bannerman explore the idea of the female ghost’s ‘return’ (a Freudian term that will be approached), which breaks both time and space boundaries to raise the suppressed female voice, threaten stability, and correct wrongs. As a result, the varied manifestations of female ghosts render Baillie and Bannerman active in the contemporary discourse about human rights and the reclamation of agency.

Keywords: poetry, romanticism, spectrality, trauma, women

Procedia PDF Downloads 211
612 Perceiving Casual Speech: A Gating Experiment with French Listeners of L2 English

Authors: Naouel Zoghlami

Abstract:

Spoken-word recognition involves the simultaneous activation of potential word candidates which compete with each other for final correct recognition. In continuous speech, the activation-competition process gets more complicated due to speech reductions existing at word boundaries. Lexical processing is more difficult in L2 than in L1 because L2 listeners often lack phonetic, lexico-semantic, syntactic, and prosodic knowledge in the target language. In this study, we investigate the on-line lexical segmentation hypotheses that French listeners of L2 English form and then revise as subsequent perceptual evidence is revealed. Our purpose is to shed further light on the processes of L2 spoken-word recognition in context and better understand L2 listening difficulties through a comparison of skilled and unskilled listeners' reactions at the point where their working hypothesis is rejected. We use a variant of the gating experiment in which subjects transcribe an English sentence presented in increments of progressively greater duration. The spoken sentence was “And this amazing athlete has just broken another world record”, chosen mainly because it included common reductions and phonetic features in English, such as elision and assimilation. Our preliminary results show that there is an important difference in the manner in which proficient and less-proficient L2 listeners handle connected speech. Less-proficient listeners delay recognition of words as they wait for lexical and syntactic evidence to appear in the gates. Further statistical analyses are currently being undertaken.

Keywords: gating paradigm, spoken word recognition, online lexical segmentation, L2 listening

Procedia PDF Downloads 464
611 DenseNet and Autoencoder Architecture for COVID-19 Chest X-Ray Image Classification and Improved U-Net Lung X-Ray Segmentation

Authors: Jonathan Gong

Abstract:

Purpose: AI-driven solutions are at the forefront of many pathology and medical imaging methods. Using algorithms designed to better the experience of medical professionals within their respective fields, the efficiency and accuracy of diagnosis can improve. In particular, X-rays are a fast and relatively inexpensive test that can diagnose diseases. In recent years, X-rays have not been widely used to detect and diagnose COVID-19. The underuse of X-rays is mainly due to the low diagnostic accuracy and confounding with pneumonia, another respiratory disease. However, research in this field has expressed a possibility that artificial neural networks can successfully diagnose COVID-19 with high accuracy. Models and Data: The dataset used is the COVID-19 Radiography Database. This dataset includes images and masks of chest X-rays under the labels of COVID-19, normal, and pneumonia. The classification model developed uses an autoencoder and a pre-trained convolutional neural network (DenseNet201) to provide transfer learning to the model. The model then uses a deep neural network to finalize the feature extraction and predict the diagnosis for the input image. This model was trained on 4035 images and validated on 807 images separate from those used for training. The images used to train the classification model include an important feature: the pictures are cropped beforehand to eliminate distractions when training the model. The image segmentation model uses an improved U-Net architecture. This model is used to extract the lung mask from the chest X-ray image. The model is trained on 8577 images and validated on a validation split of 20%. These models are evaluated using the external dataset for validation. The models’ accuracy, precision, recall, F1-score, IOU, and loss are calculated. Results: The classification model achieved an accuracy of 97.65% and a loss of 0.1234 when differentiating COVID-19-infected, pneumonia-infected, and normal lung X-rays. 
The segmentation model achieved an accuracy of 97.31% and an IOU of 0.928. Conclusion: The models proposed can detect COVID-19, pneumonia, and normal lungs with high accuracy and derive the lung mask from a chest X-ray with similarly high accuracy. The hope is for these models to elevate the experience of medical professionals and provide insight into the future of the methods used.
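The IOU (intersection over union) reported for the lung masks measures the overlap between a predicted mask and a reference mask. The sketch below illustrates the metric itself on plain nested lists. The function name, the mask representation, and the convention of returning 1.0 for two empty masks are assumptions for illustration, not details from the paper.

```python
def intersection_over_union(predicted, reference):
    """Compute IoU between two equally sized binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, ref_row in zip(predicted, reference):
        for p, r in zip(pred_row, ref_row):
            if p and r:
                intersection += 1   # pixel is in both masks
            if p or r:
                union += 1          # pixel is in at least one mask
    # Two empty masks agree perfectly; treating that case as 1.0 is a convention.
    return intersection / union if union else 1.0


# Hypothetical 2x2 masks sharing one of three marked pixels.
predicted_mask = [[1, 1], [0, 0]]
reference_mask = [[1, 0], [1, 0]]
score = intersection_over_union(predicted_mask, reference_mask)
```

Unlike pixel accuracy, IoU ignores background pixels that both masks agree on, which is why it is often reported alongside accuracy for segmentation.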

Keywords: artificial intelligence, convolutional neural networks, deep learning, image processing, machine learning

Procedia PDF Downloads 130
610 An Investigation into Computer Vision Methods to Identify Material Other Than Grapes in Harvested Wine Grape Loads

Authors: Riaan Kleyn

Abstract:

Mass wine production companies across the globe are provided with grapes from winegrowers that predominantly utilize mechanical harvesting machines to harvest wine grapes. Mechanical harvesting accelerates the rate at which grapes are harvested, allowing grapes to be delivered faster to meet the demands of wine cellars. The disadvantage of the mechanical harvesting method is the inclusion of material-other-than-grapes (MOG) in the harvested wine grape loads arriving at the cellar which degrades the quality of wine that can be produced. Currently, wine cellars do not have a method to determine the amount of MOG present within wine grape loads. This paper seeks to find an optimal computer vision method capable of detecting the amount of MOG within a wine grape load. A MOG detection method will encourage winegrowers to deliver MOG-free wine grape loads to avoid penalties which will indirectly enhance the quality of the wine to be produced. Traditional image segmentation methods were compared to deep learning segmentation methods based on images of wine grape loads that were captured at a wine cellar. The Mask R-CNN model with a ResNet-50 convolutional neural network backbone emerged as the optimal method for this study to determine the amount of MOG in an image of a wine grape load. Furthermore, a statistical analysis was conducted to determine how the MOG on the surface of a grape load relates to the mass of MOG within the corresponding grape load.

Keywords: computer vision, wine grapes, machine learning, machine harvested grapes

Procedia PDF Downloads 94
609 Critical Discourse Analysis of Xenophobia in UK Political Party Blogs

Authors: Nourah Almulhim

Abstract:

This paper takes a critical discourse analysis (CDA) approach to investigate discourse and ideology in political blogs, focusing in particular on the Conservative Home blog from the UK’s current governing party. The Conservative party member’s discourse strategies as the blogger, alongside the discourse used by members of the public who reply to the blog in the below-the-line comments, will be examined. The blog discourse reflects the writer's political identity and authorial voice. The analysis of the below-the-line comments enables members of the public to engage in creating adversative positions, introducing different language users who bring their own individual and collective identities. These language users can play the role of news reporters, political analysts, protesters or supporters of a specific agenda and current socio-political topics or events. This study takes a qualitative approach to analyze the discriminatory context towards Islam/Muslims in 'The Conservative Home' blog. A cognitive approach is adopted and an analysis of dominant discourses in the blog text and the below-the-line comments is used. The focus of the study is, firstly, on the construction of self/collective national identity in comparison to Muslim identity, highlighting the in-group and out-group construction. Second, the type of attitudes, whether feelings or judgments, related to these social actors as they are explicated to draw on the social values. Third, the role of discursive strategies in justifying and legitimizing those Islamophobic discriminatory practices. Therefore, the analysis is based on the systematic analysis of social actors drawing on actors, actions, and arguments to explicate identity construction and its development in the different discourses. A socio-semantic categorization of social actors is implemented to draw on the discursive strategies in addition to using literature to understand these strategies. 
An appraisal analysis is further used to classify attitudes and elaborate on core values in both genres. Finally, the grammar of othering is applied to explain how discriminatory dichotomies of 'Us' vs. 'Them' actions are carried in discourse. Some of the key findings of the analysis can be summarized in two main points. First, the discursive practices used to represent Muslims/Islam as different from ‘Us’ differ between the two genres, as the blogger uses a covert voice while the commenters generally use an overt voice. This is to say that the blogger uses a mitigated strategy to represent the Muslim identity, for example, using the noun phrase ‘British Muslim’ but then representing them as ‘radical’ and ‘terrorists’. In contrast, the below-the-line comments use a direct strategy with an active declarative voice to negatively represent the Muslim identity as ‘oppressors’ and ‘terrorists’, with no inclusion of the noun phrase ‘British Muslims’. Second, the negotiation of the ‘British’ identity and values, such as culture and democracy, is prominent in the comment section, where they are presented as unique and under threat by Muslims, while in the article, these standpoints are not represented.

Keywords: xenophobia, blogs, identity, critical discourse analysis

Procedia PDF Downloads 92
608 Web Page Design Optimisation Based on Segment Analytics

Authors: Varsha V. Rohini, P. R. Shreya, B. Renukadevi

Abstract:

In web analytics, information delivery and web usage are optimized through the analysis of data. Web analytics is the measurement, collection and analysis of webpage data. Page statistics and user metrics are the important factors in most web analytics tools. A limitation of the existing tools is that they do not provide design inputs for the optimization of information. This paper aims at extending the scope of web analytics to provide analysis and statistics of each segment of a webpage. The click count is calculated and the concentration of links in a web page is obtained. These user metrics are used to help in the proper design of the displayed content in a webpage by means of the Vision Based Page Segmentation (VIPS) algorithm. When the algorithm is applied to a web page, it divides the entire web page into a visual block tree. The visual block tree generated further divides the web page into visual blocks or segments, which help us to understand the usage of each segment in a page and its content. Dynamic web pages and deep web pages are used to extend the scope of web page segment analytics. A space optimization concept is applied with the help of the output obtained from the VIPS algorithm. This technique gives us visibility into user interaction with the web pages and helps us to place the important links in the appropriate segments of the webpage and to effectively manage space in a page and the concentration of links.
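The idea of per-segment click statistics can be sketched as follows, assuming the page has already been divided into rectangular segments. The VIPS algorithm itself is not implemented here; the `Segment` type, its field names, and the sample coordinates are all hypothetical.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """A rectangular page segment in pixel coordinates (right and bottom exclusive)."""
    name: str
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x, y):
        return self.left <= x < self.right and self.top <= y < self.bottom


def clicks_per_segment(segments, clicks):
    """Count how many (x, y) clicks fall inside each segment.

    Clicks outside every segment are ignored; a click is credited to the
    first matching segment only.
    """
    counts = {segment.name: 0 for segment in segments}
    for x, y in clicks:
        for segment in segments:
            if segment.contains(x, y):
                counts[segment.name] += 1
                break
    return counts


# Hypothetical layout: a header strip above a body block.
layout = [Segment("header", 0, 0, 100, 20), Segment("body", 0, 20, 100, 100)]
recorded_clicks = [(10, 5), (50, 50), (60, 70), (200, 200)]
usage = clicks_per_segment(layout, recorded_clicks)
```

Counts like these could then indicate which segments attract the most interaction and where important links might be placed.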

Keywords: analytics, design optimization, visual block trees, vision based technology

Procedia PDF Downloads 266
607 Decoding Gender Disparities in AI: An Experimental Exploration Within the Realm of AI and Trust Building

Authors: Alexander Scott English, Yilin Ma, Xiaoying Liu

Abstract:

The widespread use of artificial intelligence in everyday life has triggered a fervent discussion covering a wide range of areas. However, to date, research on the influence of gender in various segments and factors from a social science perspective is still limited. This study aims to explore whether there are gender differences in human trust in AI for its application in basic everyday life and how trust correlates with human perceived similarity, perceived emotions (including competence and warmth), and attractiveness. We conducted a study involving 321 participants using a 2 (masculinized vs. feminized voice of the AI) × 2 (pitch level of the AI's voice) between-subjects experimental design. Four contexts were created for the study and randomly assigned. The results of the study showed significant gender differences in perceived similarity, trust, and perceived emotion of the AIs, with females rating them significantly higher than males. Trust was higher in relation to AIs presenting the same gender (e.g., human female to female AI, human male to male AI). Mediation modeling tests indicated that emotion perception and similarity played a mediating role in trust. Notably, although trust in AIs was strongly correlated with human gender, there was no significant effect of the gender of the AI. In addition, the study discusses the effects of subjects' age, job search experience, and job type on the findings.

Keywords: artificial intelligence, gender differences, human-robot trust, mediation modeling

Procedia PDF Downloads 45
606 Reconceptualizing the Place of Empire in European Women’s Travel Writing Through the Lens of Iberian Texts

Authors: Gayle Nunley

Abstract:

Between the mid-nineteenth and early twentieth century, a number of Western European women broke with gender norms of their time and undertook to write and publish accounts of their own international journeys. In addition to contributing to their contemporaries’ progressive reimagining of the space and place of female experience within the public sphere, these often orientalism-tinged texts have come to provide key source material for the analysis of gendered voice in the narration of Empire, particularly with regard to works associated with Europe’s then-ascendant imperial powers, Britain and France. Incorporation of contemporaneous writings from the once-dominant Empires of Iberian Europe introduces an important additional lens onto this process. By bringing to bear geographic notions of placedness together with discourse analysis, the examination of works by Iberian Europe’s female travelers in conjunction with those of their more celebrated Northern European peers reveals a pervasive pattern of conjoined belonging and displacement traceable throughout the broader corpus, while also underscoring the insufficiency of binary paradigms of gendered voice. The re-situating of women travelers’ participation in the European imperial project to include voices from the Iberian south creates a more robust understanding of these writers’ complex, and often unexpectedly modern, engagement with notions of gender, mobility, ‘otherness’ and contact-zone encounter acted out both within and against the imperial paradigm.

Keywords: colonialism, orientalism, Spain, travel writing, women travelers

Procedia PDF Downloads 112
605 Iterative Method for Lung Tumor Localization in 4D CT

Authors: Sarah K. Hagi, Majdi Alnowaimi

Abstract:

In the last decade, there have been immense advancements in medical imaging modalities. These modalities can scan the whole volume of the lung in high-resolution images within a short time. As a result, physicians can clearly identify the complicated anatomical and pathological structures of the lung. These advancements therefore offer great opportunities to advance all available types of lung cancer treatment and to increase the survival rate. However, lung cancer is still one of the major causes of death, with around 19% of all cancer patients. Several factors may affect the survival rate. One of the most serious is the breathing process, which can affect the accuracy of diagnosis and of the lung tumor treatment plan. We have therefore developed a semi-automated algorithm to localize the 3D lung tumor positions across all respiratory data during respiratory motion. The algorithm can be divided into two stages. First, the lung tumor is segmented in the first phase of the 4D computed tomography (CT) using an active contours method. Then, the 3D position of the tumor is localized across all subsequent phases using an affine transformation with 12 degrees of freedom. Two data sets were used in this study: a computer-simulated 4D CT using the extended cardiac-torso (XCAT) phantom and 4D CT clinical data sets. The results and error calculations are presented as root mean square error (RMSE). The average error across the data sets is 0.94 ± 0.36 mm. Finally, an evaluation and quantitative comparison of the results with a state-of-the-art registration algorithm is presented. The results obtained from the proposed localization algorithm are promising for localizing a lung tumor in 4D CT data.
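A 12-degree-of-freedom affine transformation in 3D consists of a 3×3 matrix (9 parameters) plus a translation vector (3 parameters). The sketch below applies such a transform to a point and computes an RMSE between predicted and reference positions. It illustrates the two concepts only; the function names and sample values are hypothetical, and the estimation of the transform parameters, which the abstract does not detail, is not shown.

```python
import math


def apply_affine(matrix, translation, point):
    """Map a 3D point through x' = A x + t, where A is 3x3 and t has 3 entries."""
    return tuple(
        sum(matrix[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def rmse(predicted, reference):
    """Root mean square of the Euclidean distances between paired 3D points."""
    squared_distances = [
        sum((p - r) ** 2 for p, r in zip(pred_point, ref_point))
        for pred_point, ref_point in zip(predicted, reference)
    ]
    return math.sqrt(sum(squared_distances) / len(squared_distances))


# Hypothetical transform: no rotation or scaling, a pure shift of (1, 2, 3).
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
moved = apply_affine(identity, (1, 2, 3), (0, 0, 0))

# Hypothetical tumor centroids: one exact match, one 5 units off.
error = rmse([(0, 0, 0), (3, 4, 0)], [(0, 0, 0), (0, 0, 0)])
```

With real data, the predicted points would be tumor positions propagated to each respiratory phase and the reference points would come from manual or ground-truth annotations.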

Keywords: automated algorithm, computed tomography, lung tumor, tumor localization

Procedia PDF Downloads 602