Search results for: audio segmentation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 809

539 Exploring Teacher Verbal Feedback on Postgraduate Students' Performances in Presentations in English

Authors: Nattawadee Sinpattanawong, Yaowaret Tharawoot

Abstract:

This is an analytic and descriptive classroom-centered study whose purpose is to explore teacher verbal feedback on postgraduate students’ performances in presentations in English in an English for Specific Purposes (ESP) postgraduate classroom. The participants are a Thai female teacher, two Thai female postgraduate students, and two foreign male postgraduate students. The study draws on both classroom observation and interview data. The class, which focused on the students’ presentations and the teacher’s verbal feedback on them, was observed nine times with audio recording and note-taking. For the interviews, the teacher was interviewed about linkages between her verbal feedback and each student’s presentation skills in English. For the data analysis, the audio files from the observations were transcribed and analyzed both quantitatively and qualitatively. The quantitative approach addressed the frequencies and percentages of the content of the teacher’s verbal feedback on each student’s performances based on eight presentation factors (content, structure, grammar, coherence, vocabulary, speaking skills, involving the audience, and self-presentation). Based on the quantitative data and the interview data, a qualitative analysis of the transcripts was made to describe the occurrence of different types of verbal feedback content for each student’s presentation performances. The study’s findings may help teachers reflect on how they provide verbal feedback on various students’ presentation performances in English. They may also help students with characteristics similar to those of the participants improve their performance when giving presentations in English by applying the teacher’s verbal feedback.

Keywords: teacher verbal feedback, presentation factors, presentation in English, presentation performances

Procedia PDF Downloads 131
538 Tool for Maxillary Sinus Quantification in Computed Tomography Exams

Authors: Guilherme Giacomini, Ana Luiza Menegatti Pavan, Allan Felipe Fattori Alves, Marcela de Oliveira, Fernando Antonio Bacchim Neto, José Ricardo de Arruda Miranda, Seizo Yamashita, Diana Rodrigues de Pina

Abstract:

The maxillary sinus (MS), part of the paranasal sinus complex, is one of the most enigmatic structures in modern humans. The literature has suggested that MSs function as olfaction accessories, heat or humidify inspired air, aid thermoregulation, impart resonance to the voice, among other roles. Thus, the real function of the MS is still uncertain. Furthermore, MS anatomy is complex and varies from person to person. Many diseases may affect the development process of the sinuses. The incidence of rhinosinusitis and other pathoses in the MS is comparatively high, so volume analysis has clinical value. Providing volume values for the MS could be helpful in evaluating the presence of any abnormality and could be used for treatment planning and evaluation of the outcome. Computed tomography (CT) has allowed a more exact assessment of this structure, which enables a quantitative analysis. However, this is not always possible in the clinical routine, and if possible, it involves much effort and/or time. Therefore, it is necessary to have a convenient, robust, and practical tool correlated with the MS volume, allowing clinical applicability. Nowadays, the available methods for MS segmentation are manual or semi-automatic, and manual methods present inter- and intraindividual variability. Thus, the aim of this study was to develop an automatic tool to quantify the MS volume in CT scans of the paranasal sinuses. This study was developed with ethical approval from the authors’ institutions and national review panels. The research involved 30 retrospective exams from the University Hospital, Botucatu Medical School, São Paulo State University, Brazil. The tool for automatic MS quantification, developed in Matlab®, uses a hybrid method combining different image processing techniques. For MS detection, the algorithm uses a Support Vector Machine (SVM) with features such as pixel value, spatial distribution, and shape. The detected pixels are used as seed points for a region growing (RG) segmentation. Then, morphological operators are applied to reduce false-positive pixels, improving the segmentation accuracy. These steps are applied to all slices of the CT exam, yielding the MS volume. To evaluate the accuracy of the developed tool, the automatic method was compared with manual segmentation performed by an experienced radiologist. For comparison, we used Bland-Altman statistics, linear regression, and the Jaccard similarity coefficient. The linear regression showed a strong association and low dispersion between variables, the Bland-Altman analyses showed no significant differences between the methods, and the Jaccard similarity coefficient was > 0.90 in all exams. In conclusion, the developed tool to automatically quantify MS volume proved to be robust, fast, and efficient when compared with manual segmentation. Furthermore, it avoids the intra- and inter-observer variations caused by manual and semi-automatic methods. As future work, the tool will be applied in clinical practice, where it may be useful in the diagnosis and treatment of MS diseases.
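
As a concrete illustration of the pipeline just described (SVM-based detection seeding a region-growing segmentation, cleaned by morphological operators), the following minimal Python sketch runs the same three steps on a synthetic slice. The authors' Matlab® implementation is not public, so the features, labels, and parameters here are stand-in assumptions, not their code.

```python
# Hedged sketch: SVM detection -> region growing -> morphological cleanup
# on one synthetic "CT slice". All data and parameters are toy assumptions.
import numpy as np
from sklearn.svm import SVC
from skimage.segmentation import flood
from skimage.morphology import binary_opening, disk

rng = np.random.default_rng(0)
slice_hu = rng.normal(-400, 200, size=(128, 128))  # stand-in CT slice (HU-like)
slice_hu[40:80, 40:80] = -950                      # air-filled "sinus" region

# 1) Per-pixel features: intensity plus normalized (row, col) position.
rows, cols = np.indices(slice_hu.shape)
feats = np.column_stack([slice_hu.ravel(), rows.ravel() / 128, cols.ravel() / 128])

# 2) SVM detection; these labels mimic expert annotations for the demo only.
labels = (slice_hu < -900).ravel()
svm = SVC(kernel="rbf").fit(feats[::50], labels[::50])
detected = svm.predict(feats).reshape(slice_hu.shape)

# 3) A central detected pixel seeds the region-growing step.
det_px = np.argwhere(detected)
seed = tuple(det_px[np.argmin(np.abs(det_px - det_px.mean(axis=0)).sum(axis=1))])
grown = flood(slice_hu, seed, tolerance=100)

# 4) Morphological opening removes false-positive specks.
mask = binary_opening(grown, disk(2))
print("sinus area in this slice (px):", int(mask.sum()))  # sum over slices -> volume
```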

Keywords: maxillary sinus, support vector machine, region growing, volume quantification

Procedia PDF Downloads 483
537 Current Applications of Artificial Intelligence (AI) in Chest Radiology

Authors: Angelis P. Barlampas

Abstract:

Learning Objectives: The purpose of this study is to briefly inform the reader about the applications of AI in chest radiology. Background: Currently, there are 190 FDA-approved radiology AI applications, with 42 (22%) pertaining specifically to thoracic radiology. Imaging Findings or Procedure Details: The aids of AI in chest radiology include the following. It detects and segments pulmonary nodules. It subtracts bone to provide an unobstructed view of the underlying lung parenchyma and provides further information on nodule characteristics, such as nodule location, two-dimensional size or three-dimensional (3D) volume, change in nodule size over time, attenuation data (i.e., mean, minimum, and/or maximum Hounsfield units [HU]), morphological assessments, or combinations of the above. It reclassifies indeterminate pulmonary nodules into low or high risk with higher accuracy than conventional risk models. It detects pleural effusion and differentiates tension pneumothorax from nontension pneumothorax. It detects cardiomegaly, calcification, consolidation, mediastinal widening, atelectasis, fibrosis, and pneumoperitoneum. It automatically localizes vertebral segments, labels ribs, and detects rib fractures. It measures the distance from the tube tip to the carina and localizes both endotracheal tubes and central vascular lines. It detects consolidation and progression of parenchymal diseases such as pulmonary fibrosis or chronic obstructive pulmonary disease (COPD), and can evaluate lobar volumes. It identifies and labels pulmonary bronchi and vasculature, quantifies air-trapping, and offers emphysema evaluation. It provides functional respiratory imaging, whereby high-resolution CT images are post-processed to quantify airflow by lung region, and may be used to quantify key biomarkers such as airway resistance, air-trapping, ventilation mapping, lung and lobar volume, and blood vessel and airway volume. It assesses the lung parenchyma by way of density evaluation, providing percentages of tissues within defined attenuation (HU) ranges in addition to automated lung segmentation and lung volume information. It improves image quality for noisy images with a built-in denoising function. It detects emphysema, a common condition seen in patients with a history of smoking, and hyperdense or opacified regions, thereby aiding in the diagnosis of certain pathologies, such as COVID-19 pneumonia. It aids in cardiac segmentation and calcium detection, aorta segmentation and diameter measurements, and vertebral body segmentation and density measurements. Conclusion: The future is yet to come, but AI is already a helpful tool for daily practice in radiology. It is assumed that the continuing progression of computerized systems and improvements in software algorithms will render AI the radiologist's second pair of hands.
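
Several of the capabilities listed above reduce to simple mask arithmetic once a segmentation exists. As a minimal sketch (not tied to any particular FDA-approved product), the following computes a nodule's 3D volume and its mean/minimum/maximum HU from a hypothetical CT array and mask; the arrays and voxel spacing are assumptions.

```python
# Hedged sketch: nodule volume and attenuation statistics from a toy CT
# volume and a toy AI-derived mask (both invented for illustration).
import numpy as np

ct = np.full((64, 64, 64), -800.0)            # background lung parenchyma (HU)
ct[30:34, 30:34, 30:34] = 40.0                # solid "nodule" voxels
mask = ct > -300                              # stand-in for an AI nodule mask

voxels = ct[mask]
spacing_mm = (0.7, 0.7, 1.0)                  # assumed voxel size (x, y, z)
volume_mm3 = voxels.size * np.prod(spacing_mm)
print(f"volume = {volume_mm3:.1f} mm^3, HU mean/min/max = "
      f"{voxels.mean():.0f}/{voxels.min():.0f}/{voxels.max():.0f}")
```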

Keywords: artificial intelligence, chest imaging, nodule detection, automated diagnoses

Procedia PDF Downloads 46
536 Perceiving Casual Speech: A Gating Experiment with French Listeners of L2 English

Authors: Naouel Zoghlami

Abstract:

Spoken-word recognition involves the simultaneous activation of potential word candidates, which compete with each other for final correct recognition. In continuous speech, the activation-competition process gets more complicated due to speech reductions existing at word boundaries. Lexical processing is more difficult in L2 than in L1 because L2 listeners often lack phonetic, lexico-semantic, syntactic, and prosodic knowledge in the target language. In this study, we investigate the on-line lexical segmentation hypotheses that French listeners of L2 English form and then revise as subsequent perceptual evidence is revealed. Our purpose is to shed further light on the processes of L2 spoken-word recognition in context and to better understand L2 listening difficulties through a comparison of skilled and unskilled reactions at the point where the working hypothesis is rejected. We use a variant of the gating experiment in which subjects transcribe an English sentence presented in increments of progressively greater duration. The spoken sentence was “And this amazing athlete has just broken another world record”, chosen mainly because it includes reductions and phonetic features common in English, such as elision and assimilation. Our preliminary results show an important difference in the manner in which proficient and less-proficient L2 listeners handle connected speech: less-proficient listeners delay recognition of words as they wait for lexical and syntactic evidence to appear in the gates. Further statistical analyses are currently being undertaken.
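
A gating stimulus set like the one described can be produced mechanically by writing the recording out in increments of growing duration. The sketch below synthesizes a stand-in signal; the 200 ms gate size and the file names are assumptions, not the study's actual materials.

```python
# Hedged sketch of gate preparation for a gating experiment.
import numpy as np
import soundfile as sf  # pip install soundfile

sr = 16000
audio = 0.1 * np.random.default_rng(0).standard_normal(sr * 4)  # 4 s stand-in "sentence"

gate_ms = 200                                       # assumed increment duration
step = int(sr * gate_ms / 1000)
for i, end in enumerate(range(step, len(audio) + 1, step), start=1):
    sf.write(f"gate_{i:02d}.wav", audio[:end], sr)  # gate 1, 2, ..., full sentence
```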

Keywords: gating paradigm, spoken word recognition, online lexical segmentation, L2 listening

Procedia PDF Downloads 443
535 DenseNet and Autoencoder Architecture for COVID-19 Chest X-Ray Image Classification and Improved U-Net Lung X-Ray Segmentation

Authors: Jonathan Gong

Abstract:

Purpose: AI-driven solutions are at the forefront of many pathology and medical imaging methods. Using algorithms designed to better the experience of medical professionals within their respective fields, the efficiency and accuracy of diagnosis can improve. In particular, X-rays are a fast and relatively inexpensive test that can diagnose diseases. In recent years, however, X-rays have not been widely used to detect and diagnose COVID-19. The underuse of X-rays is mainly due to low diagnostic accuracy and confounding with pneumonia, another respiratory disease. However, research in this field has suggested that artificial neural networks can successfully diagnose COVID-19 with high accuracy. Models and Data: The dataset used is the COVID-19 Radiography Database. This dataset includes images and masks of chest X-rays under the labels of COVID-19, normal, and pneumonia. The classification model developed uses an autoencoder and a pre-trained convolutional neural network (DenseNet201) to provide transfer learning to the model. The model then uses a deep neural network to finalize the feature extraction and predict the diagnosis for the input image. This model was trained on 4035 images and validated on 807 images separate from those used for training. The images used to train the classification model include an important feature: the pictures are cropped beforehand to eliminate distractions when training the model. The image segmentation model uses an improved U-Net architecture and is used to extract the lung mask from the chest X-ray image. This model is trained on 8577 images and validated on a validation split of 20%. Both models are evaluated using the external validation dataset, and their accuracy, precision, recall, F1-score, IoU, and loss are calculated. Results: The classification model achieved an accuracy of 97.65% and a loss of 0.1234 when differentiating COVID-19-infected, pneumonia-infected, and normal lung X-rays. The segmentation model achieved an accuracy of 97.31% and an IoU of 0.928. Conclusion: The proposed models can detect COVID-19, pneumonia, and normal lungs with high accuracy and derive the lung mask from a chest X-ray with similarly high accuracy. The hope is for these models to elevate the experience of medical professionals and provide insight into the future of the methods used.
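
The transfer-learning backbone described (a pre-trained DenseNet201 feeding a small classification head for the three labels) can be sketched in Keras as follows. This is a stand-in, not the authors' full model: the autoencoder branch is omitted, and the input size and head layout are assumptions.

```python
# Hedged sketch: frozen DenseNet201 backbone + dense head for 3 classes.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import DenseNet201

base = DenseNet201(weights="imagenet", include_top=False,
                   input_shape=(224, 224, 3))    # assumed input size
base.trainable = False                           # transfer learning: freeze backbone

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),        # assumed head width
    layers.Dropout(0.3),
    layers.Dense(3, activation="softmax"),       # COVID-19 / normal / pneumonia
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```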

Keywords: artificial intelligence, convolutional neural networks, deep learning, image processing, machine learning

Procedia PDF Downloads 103
534 An Investigation into Computer Vision Methods to Identify Material Other Than Grapes in Harvested Wine Grape Loads

Authors: Riaan Kleyn

Abstract:

Mass wine production companies across the globe are supplied with grapes by winegrowers who predominantly utilize mechanical harvesting machines to harvest wine grapes. Mechanical harvesting accelerates the rate at which grapes are harvested, allowing grapes to be delivered faster to meet the demands of wine cellars. The disadvantage of the mechanical harvesting method is the inclusion of material-other-than-grapes (MOG) in the harvested wine grape loads arriving at the cellar, which degrades the quality of the wine that can be produced. Currently, wine cellars do not have a method to determine the amount of MOG present within wine grape loads. This paper seeks to find an optimal computer vision method capable of detecting the amount of MOG within a wine grape load. A MOG detection method would encourage winegrowers to deliver MOG-free wine grape loads to avoid penalties, which would indirectly enhance the quality of the wine produced. Traditional image segmentation methods were compared to deep learning segmentation methods based on images of wine grape loads captured at a wine cellar. The Mask R-CNN model with a ResNet-50 convolutional neural network backbone emerged as the optimal method in this study for determining the amount of MOG in an image of a wine grape load. Furthermore, a statistical analysis was conducted to determine how the MOG on the surface of a grape load relates to the mass of MOG within the corresponding grape load.
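
For reference, the model family the study selected is available off the shelf in torchvision. The sketch below runs inference with generic COCO weights rather than the authors' MOG-trained weights, so the pixel-area proxy at the end is purely illustrative.

```python
# Hedged sketch: Mask R-CNN (ResNet-50 FPN backbone) instance segmentation.
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()  # COCO weights, not MOG
image = torch.rand(3, 480, 640)                          # stand-in grape-load photo

with torch.no_grad():
    out = model([image])[0]                 # dict: boxes, labels, scores, masks

keep = out["scores"] > 0.5                  # assumed confidence cut-off
masks = out["masks"][keep] > 0.5            # binary instance masks
mog_area_px = int(masks.sum().item())       # toy proxy for MOG surface coverage
print("detected instance pixels:", mog_area_px)
```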

Keywords: computer vision, wine grapes, machine learning, machine harvested grapes

Procedia PDF Downloads 64
533 Web Page Design Optimisation Based on Segment Analytics

Authors: Varsha V. Rohini, P. R. Shreya, B. Renukadevi

Abstract:

In web analytics, information delivery and web usage are optimized and data are analyzed. Analytics is the measurement, collection, and analysis of webpage data. Page statistics and user metrics are the important factors in most web analytics tools, and this is the limitation of the existing tools: they do not provide design inputs for the optimization of information. This paper aims at extending the scope of web analytics to provide analysis and statistics for each segment of a webpage. The click count is calculated and the concentration of links in a web page is obtained. These user metrics are used to help in the proper design of the displayed content of a webpage via the Vision Based Page Segmentation (VIPS) algorithm. When the algorithm is applied to a web page, it divides the entire page into a visual block tree. The generated visual block tree further divides the web page into visual blocks, or segments, which help us understand the usage of each segment in a page and its content. Dynamic web pages and deep web pages are used to extend the scope of web page segment analytics. A space optimization concept is applied with the help of the output obtained from the VIPS algorithm. This technique provides visibility into user interaction with web pages, helps place important links in the appropriate segments of a webpage, and effectively manages the space in a page and the concentration of links.
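
VIPS proper works on rendered visual cues, which is beyond a short example. As a simplified stand-in for the idea, the sketch below splits a page into top-level DOM blocks and counts the links in each, the "concentration of links" statistic discussed above; the HTML and the blocking rule are toy assumptions.

```python
# Hedged sketch: coarse page "segments" and per-segment link counts.
from bs4 import BeautifulSoup  # pip install beautifulsoup4

html = """
<body>
  <div id="nav"><a href="/a">A</a><a href="/b">B</a></div>
  <div id="content"><p>Article text</p><a href="/more">more</a></div>
  <div id="footer"><a href="/c">C</a></div>
</body>"""

soup = BeautifulSoup(html, "html.parser")
for block in soup.body.find_all("div", recursive=False):  # top-level segments
    links = block.find_all("a")
    print(block.get("id"), "link count:", len(links))
```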

Keywords: analytics, design optimization, visual block trees, vision based technology

Procedia PDF Downloads 244
532 Iterative Method for Lung Tumor Localization in 4D CT

Authors: Sarah K. Hagi, Majdi Alnowaimi

Abstract:

In the last decade, there have been immense advancements in medical imaging modalities, which can scan the whole volume of the lung in high-resolution images within a short time. With this performance, physicians can clearly identify the complicated anatomical and pathological structures of the lung. These advancements therefore create large opportunities to advance all available types of lung cancer treatment and will increase the survival rate. However, lung cancer is still one of the major causes of death, involving around 19% of all cancer patients. Several factors may affect the survival rate. One of the serious effects is the breathing process, which can affect the accuracy of diagnosis and the lung tumor treatment plan. We have therefore developed a semi-automated algorithm to localize the 3D lung tumor positions across all respiratory data during respiratory motion. The algorithm can be divided into two stages. First, the lung tumor is segmented in the first phase of the four-dimensional computed tomography (4D CT) using an active contours method. Then, the tumor's 3D position is localized across all subsequent phases using an affine transformation with 12 degrees of freedom. Two data sets were used in this study: a computer-simulated 4D CT using the extended cardiac-torso (XCAT) phantom and clinical 4D CT data sets. The error is reported as the root mean square error (RMSE); the average error across the data sets is 0.94 ± 0.36 mm. Finally, an evaluation and quantitative comparison of the results with a state-of-the-art registration algorithm is presented. The results obtained from the proposed localization algorithm show promise for localizing a lung tumor in 4D CT data.
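
The second stage reduces to applying a 12-degree-of-freedom affine map (a 3x3 linear part plus a 3-vector translation) to the phase-1 tumor position and scoring the result with RMSE. The coordinates and transform below are toy stand-ins, not registration output from the study.

```python
# Hedged sketch: 12-DOF affine propagation of a tumor centroid + RMSE.
import numpy as np

centroid_phase1 = np.array([120.0, 88.0, 42.0])   # mm, from phase-1 segmentation

# 12 DOF = 9 linear coefficients + 3 translations, estimated per phase.
A = np.eye(3) + 0.01 * np.random.default_rng(1).normal(size=(3, 3))
t = np.array([1.5, -0.8, 0.3])                    # breathing-induced shift (toy)
predicted = A @ centroid_phase1 + t

ground_truth = np.array([122.1, 87.0, 42.5])      # manual localization (toy)
rmse = np.sqrt(np.mean((predicted - ground_truth) ** 2))
print(f"RMSE = {rmse:.2f} mm")
```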

Keywords: automated algorithm, computed tomography, lung tumor, tumor localization

Procedia PDF Downloads 580
531 Governance of Social Media Using the Principles of Community Radio

Authors: Ken Zakreski

Abstract:

This paper considers regulating Canadian Facebook Groups of a certain size and type once they reach a threshold of audio and video content, in light of the evolution of the Streaming Act, Parl GC Bill C-11 (44-1), and the regulations that will certainly follow. The Canadian Heritage Minister's office stipulates that "the Broadcasting Act only applies to audio and audiovisual content, not written journalism." Governance: after 10 years, a community radio station for Gabriola Island, BC, approved by the Canadian Radio-television and Telecommunications Commission ("CRTC") but never started, became the Facebook Group "Community Bulletin Board - Life on Gabriola," referred to as CBBlog. After CBBlog started and began to gather real traction, a member of the Group cloned the membership and ran a competing Facebook group under the banner of "free speech." Here we see an inflection point (a change of cultural stewardship) with two different measurable results (engagement and membership growth). Canada's telecommunication history of "portability" and "interoperability" made the Facebook Group CBBlog the better option, over broadcast FM radio, for a community pandemic information-sharing service for Gabriola Island, BC. A culture of ignorance flourishes in social media. Often people do not understand their own experience, or the experience of others, because they do not have the concepts needed for understanding; it is thus important they are not denied the concepts required for their full understanding. For example, legislators need to know something about gay culture before they can make any decisions about it. Community media policies and CRTC regulations are known, and regulators can use that history to forge forward with regulations for internet platforms of a size and content type that reach a threshold of audio/video content. Mostly volunteer-run media services provide costs an order of magnitude lower than commercial media. Should Facebook Groups be treated as new media? Cathy Edwards, executive director of the Canadian Association of Community Television Users and Stations ("CACTUS"), calls them new media, in that the distribution platform is not the issue. What does make community groups community media? Cathy responded, "... it's bylaws, articles of incorporation that state they are community media, they have accessibility, commitments to skills training, any member of the community can be a member, and there is accountability to a board of directors." Eligibility for funding through CACTUS requires these same commitments. It is risky for a community to invest in a platform whose ownership has not been litigated: is a Facebook Group an asset of a not-for-profit society? A memo from law student Jared Hubbard summarizes: "Rights and interests in a Facebook group could, in theory, be transferred as property... This theory is currently unconfirmed by Canadian courts."

Keywords: social media, governance, community media, Canadian radio

Procedia PDF Downloads 42
530 Developing a Virtual Reality System to Assist in Anatomy Teaching and Evaluating the Effectiveness of That System

Authors: Tarek Abdelkader, Suresh Selvaraj, Prasad Iyer, Yong Mun Hin, Hajmath Begum, P. Gopalakrishnakone

Abstract:

Nowadays, more and more educational institutes, as well as students, rely on 3D anatomy programs as an important tool that helps students correlate the actual locations of anatomical structures in three dimensions. Lately, virtual reality (VR) has been gaining favor among the younger generations due to its highly interactive mode. As a result, using virtual reality as a gamified learning platform for anatomy became the current goal. We present a model in which a Virtual Human Anatomy Program (VHAP) was developed to assist with the anatomy learning experience of students. The anatomy module has been built mostly from real patient CT scans. Segmentation and surface rendering were used to create the 3D model, by directly segmenting the CT scans for each organ individually and exporting the result as a 3D file. After acquiring the 3D files for all the needed organs, all the files were imported into a virtual reality environment as a complete body anatomy model. In this ongoing experiment, students from different Allied Health orientations are testing the VHAP. Specifically, the cardiovascular system has been selected as the focus system of study, since all of our students finished learning about it in the first trimester. The initial results suggest that the VHAP system adds value to the learning process of our students, encouraging them to get more involved and to ask more questions. Participating students' comments show that they are excited about the VHAP system, remarking on its interactivity as well as the ability to use it on their own as a self-learning aid in combination with the lectures. Some students also experienced minor side effects like dizziness.
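
The per-organ pipeline described (segment a CT volume, then export the surface as a 3D file) is commonly implemented with marching cubes. The sketch below, with a synthetic mask and an assumed OBJ output format, is a stand-in for the authors' unspecified toolchain.

```python
# Hedged sketch: binary organ mask -> triangle mesh -> OBJ file for a VR scene.
import numpy as np
from skimage import measure

mask = np.zeros((64, 64, 64), dtype=float)
mask[20:44, 20:44, 20:44] = 1.0                   # stand-in "organ" voxels

verts, faces, normals, _ = measure.marching_cubes(mask, level=0.5)

with open("organ.obj", "w") as f:                 # minimal OBJ export
    for v in verts:
        f.write(f"v {v[0]} {v[1]} {v[2]}\n")
    for tri in faces + 1:                         # OBJ indices are 1-based
        f.write(f"f {tri[0]} {tri[1]} {tri[2]}\n")
```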

Keywords: 3D construction, health sciences, teaching pedagogy, virtual reality

Procedia PDF Downloads 136
529 Emotion-Convolutional Neural Network for Perceiving Stress from Audio Signals: A Brain Chemistry Approach

Authors: Anup Anand Deshmukh, Catherine Soladie, Renaud Seguier

Abstract:

Emotion plays a key role in many applications, such as healthcare, where patients' emotional behavior is gathered. Unlike typical ASR (Automated Speech Recognition) problems, which focus on 'what was said', it is equally important to understand 'how it was said.' Certain emotions are given more importance due to their effectiveness in understanding human feelings. In this paper, we propose an approach that models human stress from audio signals. The research challenge in speech emotion detection is finding the appropriate set of acoustic features corresponding to an emotion. Another difficulty lies in defining the very meaning of emotion and being able to categorize it in a precise manner. Supervised machine learning models, including state-of-the-art deep learning classification methods, rely on the availability of clean and labelled data. One of the problems in affective computing is the limited amount of annotated data, and the existing labelled emotion datasets are highly subjective to the perception of the annotator. We address the first issue, feature selection, by exploiting traditional MFCC (Mel-Frequency Cepstral Coefficients) features in a Convolutional Neural Network. Our proposed Emo-CNN (Emotion-CNN) architecture treats speech representations in a manner similar to how CNNs treat images in a vision problem. Our experiments show that Emo-CNN consistently and significantly outperforms popular existing methods over multiple datasets, achieving 90.2% categorical accuracy on the Emo-DB dataset. We claim that Emo-CNN is robust to speaker variations and environmental distortions. The proposed approach achieves 85.5% speaker-dependent categorical accuracy on the SAVEE (Surrey Audio-Visual Expressed Emotion) dataset, beating the existing CNN-based approach by 10.2%. To tackle the second problem, the subjectivity of stress labels, we use Lovheim's cube, a 3-dimensional projection of emotions. Monoamine neurotransmitters are a type of chemical messenger in the brain that transmits signals on perceiving emotions, and the cube aims at explaining the relationship between these neurotransmitters and the positions of emotions in 3D space. The learnt emotion representations from the Emo-CNN are mapped to the cube using three-component PCA (Principal Component Analysis), which is then used to model human stress. This proposed approach not only circumvents the need for labelled stress data but also complies with the psychological theory of emotions given by Lovheim's cube. We believe that this work is the first step towards creating a connection between Artificial Intelligence and the chemistry of human emotions.
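
The two ingredients named above, MFCC "images" as CNN input and a three-component PCA projecting learnt embeddings into the cube's 3D space, can be sketched as follows. The embeddings are random stand-ins, not Emo-CNN outputs.

```python
# Hedged sketch: MFCC feature extraction + 3-component PCA projection.
import numpy as np
import librosa                         # pip install librosa
from sklearn.decomposition import PCA

sr = 22050
y = 0.5 * np.sin(2 * np.pi * 220 * np.arange(sr * 2) / sr)  # stand-in utterance
mfcc = librosa.feature.mfcc(y=y.astype(np.float32), sr=sr, n_mfcc=40)  # CNN input

embeddings = np.random.default_rng(0).normal(size=(500, 128))  # toy Emo-CNN features
cube_coords = PCA(n_components=3).fit_transform(embeddings)    # 3D "emotion space"
print(mfcc.shape, cube_coords.shape)
```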

Keywords: deep learning, brain chemistry, emotion perception, Lovheim's cube

Procedia PDF Downloads 126
528 Innovation Outcomes and Competing Agendas in Higher Education: Experimenting with Audio-Video Feedback

Authors: Adina Dudau, Georgios Kominis, Melinda Szocs

Abstract:

This paper links distinct bodies of literature around innovation and public services by examining a case of perceived innovation failure. Through a mixed methodology investigating student attitudes to, and behaviour around, technological innovation in higher education, the paper contributes to the public service innovation literature by focusing on the duality of innovation outcomes, suggestive of an innovation typology in public services. The study was conducted in a UK Russell Group university and focused on a technological process innovation. The innovation consisted of the provision of feedback to students in the form of a digital video (mp4), tailored to each individual submission, with extended voice-over commentary from the course coordinator and visual cues intended to help students see the relevance of comments to their submissions. The sample of the study consisted of a class of 79 undergraduate students. To investigate student attainment, we designed a field (also known as quasi or natural) experiment, essentially a manipulation of a social setting (in this case, the form of feedback given to students) as part of a naturally occurring social arrangement (a real course which students attend and in which they are assessed). A two-group control design was utilised to examine the effectiveness of the feedback innovation (video feedback, VF). Two outcome variables of the service innovation were measured: student satisfaction and student attainment. In other words, the study examined not only students' perceptions of whether VF was beneficial for their subsequent assignments, but also evidence of actual incremental benefits in students' performance from one assignment to the next after VF was provided. The results were baffling, indicating competing agendas in higher education.
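
Under the two-group control design, the attainment outcome reduces to comparing the marks of the video-feedback group with those of the control group, for instance with an independent-samples t-test. The scores below are invented placeholders, not the study's data.

```python
# Hedged sketch: two-group comparison of attainment scores.
import numpy as np
from scipy import stats

video_feedback = np.array([62, 58, 71, 66, 69, 64, 73, 60])  # toy marks
control_group = np.array([61, 57, 70, 65, 68, 63, 72, 59])   # toy marks

t, p = stats.ttest_ind(video_feedback, control_group)
print(f"t = {t:.2f}, p = {p:.3f}")  # satisfaction and attainment may diverge
```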

Keywords: higher education, audio-video, feedback, innovation

Procedia PDF Downloads 338
527 Another Beautiful Sounds: Building the Memory of Sound of Peddling in Beijing with Digital Technology

Authors: Dan Wang, Qing Ma, Xiaodan Wang, Tianjiao Qi

Abstract:

The sound of peddling in Beijing, also called "yo-heave-ho" or the "cry of one's wares", is a unique folk culture usually found in Beijing's hutongs. For the civilians of Beijing, the sound of peddling is part of their childhood, and for those who love the traditional culture of Beijing, it is an old song singing the local conditions and customs of the ancient city. For example, the British poet Osbert Stewart, who greatly appreciated it, once presented the sound of peddling he had heard in Beijing as a street orchestra performance in an article named "Beijing's sound and color". This research aims to collect and integrate the voice and photo resources and historical materials concerning the sound of peddling in Beijing using digital technology, in order to protect this intangible cultural heritage and pass on the city's memory. With this goal in mind, the next stage is to collect and record all the materials and resources, based on the study of historical documents and interviews with civilians and performers. A metadata scheme (which refers to domestic and international standards such as the "Audio Data Processing Standards in the National Library", DC, VRA, and CDWA) is then set up to describe, process, and organize the sound of peddling into a database. In order to fully present the traditional culture of the sound of peddling in Beijing, web design and GIS technology are utilized to establish a website, and offline exhibitions and events are planned so that people can simulate and learn the sound of peddling using VR/AR technology. All resources are open to the public, and civilians can share the digital memory through both offline experiential activities and online interaction. Through these attempts, a multimedia narrative platform has been established to record the sound of peddling in old Beijing multi-dimensionally, with text, images, audio, video, and so on.
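
A catalogue record under such a Dublin Core-style scheme might look like the sketch below; the field choices and values are illustrative assumptions, not the project's actual metadata profile.

```python
# Hedged sketch: one Dublin Core-style record for a digitized street cry.
record = {
    "dc:title": "Sound of peddling: knife sharpener's cry",
    "dc:creator": "Unknown street vendor",
    "dc:date": "1994",
    "dc:type": "Sound",
    "dc:format": "audio/wav (digitized from cassette)",
    "dc:coverage": "Beijing hutong",  # place name enables the GIS link
    "dc:subject": ["yo-heave-ho", "street cries", "intangible heritage"],
    "dc:rights": "Open to the public",
}
for field, value in record.items():
    print(f"{field}: {value}")
```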

Keywords: sound of peddling, GIS, metadata scheme, VR/AR technology

Procedia PDF Downloads 279
526 Creative Radio Advertising in Turkey

Authors: Mehmet Sinan Erguven

Abstract:

A number of authorities argue that radio is an outdated medium for advertising and does not have the same impact on consumers as it did in the past. This grim outlook on the future of radio has its basis in the audio-visual world that consumers now live in and the popularity of Internet-based marketing tools among advertising professionals. Nonetheless, consumers still appear to overwhelmingly prefer radio as an entertainment tool. Today, in Canada, 90% of all adults (18+) tune into the radio on a weekly basis and listen for 17 hours a week. Teens are the most challenging audience for radio to capture, but still, almost 75% tune in weekly. One online radio station reaches more than 250 million registered listeners worldwide, and revenues from radio advertising in Australia are expected to grow at an annual rate of 3% for the foreseeable future. Radio is also starting to become popular again in Turkey, with a 5% increase in listening rates compared to 2014. A major matter of concern always affecting radio advertising is creativity. As radio generally serves as a background medium for listeners, the creativity of radio commercials is important in terms of attracting the listener's attention and directing their focus to the advertising message. This cannot be done simply by using audio tools like sound effects and jingles. This study aims to identify the creative elements (execution formats, appeals, and approaches) and creativity factors of radio commercials in Turkey. As part of the study, all of the award-winning radio commercials produced throughout the history of the Kristal Elma Advertising Festival were analyzed using the content analysis technique. Two judges (an advertising agency copywriter and an academic) coded the commercials, and reliability was measured according to proportional agreement. The results showed that sound effects, jingles, testimonials, slices of life, and announcements were the most common execution formats in creative Turkish radio ads. Humor and excitement were the most commonly used creative appeals, while award-winning ads featured various approaches, such as surprise musical performances, audio wallpaper, product voice, and theater of the mind. Some ads, however, were found to contain no creativity factors. In order to be accepted as creative, an ad must have at least one divergence factor, such as originality, flexibility, an unusual/empathic perspective, or provocative questions. These findings, as well as others from the study, hold great value for the history of creative radio advertising in Turkey. Today, the nature of radio and its listeners is changing. As more and more people tune into online radio channels, brands will need to focus more on this relatively cheap advertising medium in the very near future. This new development will require advertising agencies to focus their attention on creativity in order to produce radio commercials that differentiate their customers from competitors.
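
The reliability measure mentioned above, proportional agreement, is simply the share of items on which the two judges assigned the same code. A toy sketch with invented codes:

```python
# Hedged sketch: inter-coder proportional agreement on invented codes.
coder_a = ["humor", "jingle", "testimonial", "humor", "slice_of_life", "announcement"]
coder_b = ["humor", "jingle", "slice_of_life", "humor", "slice_of_life", "announcement"]

agreement = sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)
print(f"proportional agreement = {agreement:.2f}")  # 5/6 here
```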

Keywords: advertising, creativity, radio, Turkey

Procedia PDF Downloads 367
525 Automatic Identification of Pectoral Muscle

Authors: Ana L. M. Pavan, Guilherme Giacomini, Allan F. F. Alves, Marcela De Oliveira, Fernando A. B. Neto, Maria E. D. Rosa, Andre P. Trindade, Diana R. De Pina

Abstract:

Mammography is a worldwide imaging modality used to diagnose breast cancer, even in asymptomatic women. Due to its wide availability, mammograms can be used to measure breast density and to predict cancer development. Women with increased mammographic density have a four- to sixfold increase in their risk of developing breast cancer. Therefore, studies have sought to quantify mammographic breast density accurately. In clinical routine, radiologists perform image evaluations through BIRADS (Breast Imaging Reporting and Data System) assessment. However, this method has inter- and intraindividual variability. An automatic, objective method to measure breast density could relieve the radiologists' workload by providing a first-aid opinion. However, the pectoral muscle is a high-density tissue with characteristics similar to those of fibroglandular tissue, which makes it hard to automatically quantify mammographic breast density. Therefore, pre-processing is needed to segment the pectoral muscle, which may otherwise be erroneously quantified as fibroglandular tissue. The aim of this work was to develop an automatic algorithm to segment and extract the pectoral muscle in digital mammograms. The database consisted of thirty medio-lateral oblique digital mammograms from São Paulo Medical School. This study was developed with ethical approval from the authors' institutions and national review panels under protocol number 3720-2010. An algorithm was developed on the Matlab® platform for the pre-processing of the images; it uses image processing tools to automatically segment and extract the pectoral muscle from mammograms. First, a thresholding technique was applied to remove non-biological information from the image. Then, the Hough transform is applied to find the boundary of the pectoral muscle, followed by an active contour method whose seed is placed on the boundary found by the Hough transform. An experienced radiologist also manually performed the pectoral muscle segmentation. Both methods, manual and automatic, were compared using the Jaccard index and Bland-Altman statistics. The comparison between the manual and the developed automatic method presented a Jaccard similarity coefficient greater than 90% for all analyzed images, showing the efficiency and accuracy of the proposed segmentation method. The Bland-Altman statistics compared both methods in relation to the area (mm²) of the segmented pectoral muscle and showed data within the 95% confidence interval, supporting the accuracy of the segmentation compared to the manual method. Thus, the method proved to be accurate and robust, segmenting rapidly and free of intra- and inter-observer variability. It is concluded that the proposed method may be used reliably to segment the pectoral muscle in digital mammography in clinical routine. The segmentation of the pectoral muscle is very important for further quantification of the fibroglandular tissue volume present in the breast.
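
The threshold-then-Hough chain can be sketched on a synthetic mammogram-like image: threshold away the background, detect edge candidates, and take the strongest Hough line as the approximate pectoral muscle boundary, which would then seed the active contour. All data and parameters below are stand-ins.

```python
# Hedged sketch: threshold -> edges -> Hough line as the muscle boundary.
import numpy as np
from skimage.feature import canny
from skimage.transform import hough_line, hough_line_peaks

img = np.zeros((256, 256))
for r in range(256):                         # synthetic muscle: bright triangle
    img[r, : max(0, 100 - r)] = 0.9
img += 0.05 * np.random.default_rng(0).random(img.shape)

binary = img > 0.5                           # thresholding step
edges = canny(binary.astype(float))          # boundary candidates
h, angles, dists = hough_line(edges)
_, best_angle, best_dist = hough_line_peaks(h, angles, dists, num_peaks=1)
print(f"muscle boundary: angle = {best_angle[0]:.2f} rad, dist = {best_dist[0]:.1f} px")
```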

Keywords: active contour, fibroglandular tissue, hough transform, pectoral muscle

Procedia PDF Downloads 324
524 Calculation of the Normalized Difference Vegetation Index and the Spectral Signature of Coffee Crops: Benefits of Image Filtering on Mixed Crops

Authors: Catalina Albornoz, Giacomo Barbieri

Abstract:

Crop monitoring has been shown to reduce vulnerability to spreading plagues and pathologies in crops. Remote sensing with Unmanned Aerial Vehicles (UAVs) has made crop monitoring more precise, cost-efficient, and accessible. Nowadays, remote monitoring involves calculating maps of vegetation indices using software that takes either true-color (RGB) or multispectral images as input. These maps are then used to segment the crop into management zones. Finally, the spectral signature of a crop (the reflected radiation as a function of wavelength) can be used as an input for decision-making and crop characterization. The calculation of vegetation indices using software such as Pix4D has high precision for monoculture plantations. However, this paper shows that using such software on mixed crops may lead to errors resulting in an incorrect segmentation of the field. Within this work, the authors propose to filter out all elements different from the main crop before calculating vegetation indices and the spectral signature. A filter based on the Sobel method for border detection is used for filtering a coffee crop. Results show that segmentation into management zones changes with respect to the traditional situation in which no filter is applied. In particular, it is shown that the values of the spectral signature change by up to 17% per spectral band. Future work will quantify the benefits of filtering through a comparison between in situ measurements and the vegetation indices obtained through remote sensing.
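
For concreteness, NDVI is computed per pixel as (NIR - Red) / (NIR + Red), and the proposed Sobel-based filter masks out high-gradient border pixels before statistics are taken. The reflectance bands and the percentile rule below are synthetic assumptions.

```python
# Hedged sketch: NDVI map + Sobel border filter on synthetic bands.
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
red = rng.uniform(0.05, 0.2, (100, 100))       # red reflectance (toy)
nir = rng.uniform(0.4, 0.6, (100, 100))        # near-infrared reflectance (toy)

ndvi = (nir - red) / (nir + red + 1e-9)        # NDVI in [-1, 1]

edges = np.hypot(ndimage.sobel(ndvi, axis=0), ndimage.sobel(ndvi, axis=1))
crop_mask = edges < np.percentile(edges, 90)   # assumed rule: drop border pixels
print("mean NDVI (filtered):", round(float(ndvi[crop_mask].mean()), 3))
```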

Keywords: coffee, filtering, mixed crop, precision agriculture, remote sensing, spectral signature

Procedia PDF Downloads 366
523 A BIM-Based Approach to Assess COVID-19 Risk Management Regarding Indoor Air Ventilation and Pedestrian Dynamics

Authors: T. Delval, C. Sauvage, Q. Jullien, R. Viano, T. Diallo, B. Collignan, G. Picinbono

Abstract:

In the context of the international spread of COVID-19, the Centre Scientifique et Technique du Bâtiment (CSTB) has led joint research with the French Hauts-de-Seine departmental authorities to analyse the risk in school spaces according to their configuration, ventilation system, and spatial segmentation strategy. This paper describes the main results of this joint research. A multidisciplinary team involving experts in indoor air quality/ventilation, pedestrian movement, and IT domains was established to develop a COVID risk analysis tool based on a Building Information Model. The work started with a specific analysis of two pilot schools in order to provide the local administration with specifications to minimize the spread of the virus. Various recommendations were published to optimize and validate the use of ventilation systems and the strategy of student occupancy and student flow segmentation within the building. This COVID expertise was then digitized in order to enable a quick risk analysis of an entire building, usable by the public administration through an easy user interface implemented in free BIM management software. One of the most interesting results is the ability to dynamically compare different ventilation system scenarios and space occupation strategies inside the BIM model. This concurrent engineering approach provides users with the optimal solution according to both ventilation and pedestrian flow expertise.
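
The abstract does not disclose CSTB's risk model. As an illustrative stand-in for comparing ventilation scenarios, the sketch below uses the standard Wells-Riley steady-state model; every parameter value is an assumption.

```python
# Hedged sketch: Wells-Riley comparison of ventilation scenarios (stand-in
# model, not CSTB's actual tool; all parameter values assumed).
import math

def wells_riley(quanta_per_h, breathing_m3_h, hours, room_m3, ach):
    """Infection probability in a well-mixed room at steady state."""
    clean_air_m3_h = ach * room_m3            # clean-air delivery rate
    conc = quanta_per_h / clean_air_m3_h      # steady-state quanta concentration
    dose = conc * breathing_m3_h * hours
    return 1 - math.exp(-dose)

room_m3 = 180.0                               # assumed classroom volume
for ach in (0.5, 2.0, 4.0):                   # air-changes-per-hour scenarios
    p = wells_riley(quanta_per_h=10, breathing_m3_h=0.5, hours=6,
                    room_m3=room_m3, ach=ach)
    print(f"ACH = {ach}: infection risk = {p:.1%}")
```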

Keywords: BIM, knowledge management, system expert, risk management, indoor ventilation, pedestrian movement, integrated design

Procedia PDF Downloads 84
522 A Review of Blog Assisted Language Learning Research: Based on Bibliometric Analysis

Authors: Bo Ning Lyu

Abstract:

Blog-assisted language learning (BALL) has been trialed by educators in language teaching with the development of Web 2.0 technology. Understanding the development trends of related research helps grasp the whole picture of the use of blogs in language education. This paper reviews current research related to blog-enhanced language learning based on bibliometric analysis, aiming at (1) identifying the most frequently used keywords and their co-occurrence, (2) clustering research topics based on co-citation analysis, (3) finding the most frequently cited studies and authors, and (4) constructing the co-authorship network. 330 articles were retrieved from Web of Science, and 225 peer-reviewed journal papers were finally collected according to the selection criteria. Bibexcel and VOSviewer were used to visualize the results. The studies reviewed were published between 2005 and 2016, with the most published in 2014 and 2015 (35 papers each). The top 10 most frequently appearing keywords are learning, language, blog, teaching, writing, social, web 2.0, technology, English, and communication. Eight research themes could be clustered by co-citation analysis: blogging for collaborative learning, blogging for writing skills, blogging in higher education, feedback via blogs, blogging for self-regulated learning, implementation of blogs in the classroom, comparative studies, and audio/video blogs. Early studies focused on introducing classroom implementation, while recent studies have moved from traditional blog usage to audio/video blogs. By reviewing the research related to BALL quantitatively and objectively, this paper reveals the evolution and development trends, as well as identifies influential research, helping researchers and educators quickly grasp this field overall and conduct further studies.
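
Keyword co-occurrence, the basis of the map mentioned above, is a pair count across each paper's keyword list. A toy sketch:

```python
# Hedged sketch: keyword co-occurrence counting over toy keyword lists.
from collections import Counter
from itertools import combinations

papers = [
    ["blog", "writing", "feedback"],
    ["blog", "web 2.0", "writing"],
    ["blog", "feedback", "self-regulated learning"],
]
cooc = Counter()
for kws in papers:
    cooc.update(combinations(sorted(kws), 2))
print(cooc.most_common(3))  # e.g. ('blog', 'feedback') and ('blog', 'writing') twice
```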

Keywords: blog, bibliometric analysis, language learning, literature review

Procedia PDF Downloads 187
521 Market Segmentation of Cruise Ship Passengers: Implications for Marketing of Local Products and Services at Destination Points

Authors: Gunnar Oskarsson, Irena Georgsdottir

Abstract:

Tourism has been growing incredibly fast during the past years, including the cruise industry, which is gaining increasing popularity among various groups of travelers. It is a challenging task for companies serving cruise ship passengers with local products and services at the point of destination to reach them in due time with information about their offerings, as well as to learn how to adapt their offerings and messages to the type of customers arriving on each particular occasion. Although some research has been conducted in this sphere, there is still limited knowledge about many specifics within this sector of the tourist industry. The objective of this research is to examine one of these, with the main goal of studying the segmentation of cruise passengers and learning about marketing practices directed towards them. A qualitative research method based on in-depth interviews was used, as this provides an opportunity to gain insight into the participants' perspectives. Interviews were conducted with 10 respondents from different companies in the tourist industry in Iceland who interact with cruise passengers on a regular basis in their work environment. The main objective was to gain an understanding of what distinguishes different customer groups, or segments, in this industry, and of the marketing approaches directed towards them. The main findings reveal that participants note the strongest differences between cruise passengers of different nationalities, passengers coming on different ships (size and type), and passengers arriving at different times of the year. A drastic difference was noticed between nationalities, yielding four main segments, American, British, other European, and Asian customers, although some of these segments could be divided into further sub-segments. Other important differentiating factors were the size and type of ship, the quality or number of stars of the ship, and the time of year of travel. Companies serving cruise ship passengers, as well as the customers themselves, could benefit if services were designed specifically for particular segments within the industry. Concerning marketing towards cruise passengers, the results indicate that it is carried out almost exclusively through the Internet, using a reliable website and search engine optimization, as well as by word of mouth. This research can assist practitioners by offering a deeper understanding of the approaches that may be effective in marketing local products and services to cruise ship passengers, based on their segmentation, and by identifying effective ways to reach them. The research, furthermore, provides a valuable contribution to marketing knowledge for the benefit of an increasingly important market segment in a fast-growing tourist industry.

Keywords: capabilities, global integration, internationalisation, SMEs

Procedia PDF Downloads 383
520 Investigations of Effective Marketing Metric Strategies: The Case of St. George Brewery Factory, Ethiopia

Authors: Mekdes Getu Chekol, Biniam Tedros Kahsay, Rahwa Berihu Haile

Abstract:

The main objective of this study is to investigate marketing strategy practice in the case of the St. George Brewery Factory in Addis Ababa. Having a well-developed marketing strategy is one of the core requirements for a company to stay in business. The study assessed how marketing strategies were practiced in the company to achieve its goals, in line with segmentation, target market, positioning, and the marketing mix elements, to satisfy customer requirements. Using primary and secondary data, the study employed both qualitative and quantitative approaches. The primary data were collected through open- and closed-ended questionnaires. Because the population was small, the respondents were selected by census. The findings show that the company used all 4 Ps of the marketing mix in its marketing strategies and provided quality products at affordable prices, promoting its products through effective advertising mechanisms. Product availability and accessibility are admirable, with both direct and indirect distribution channels in use. The company has identified its target customers, and its market segmentation practice is based on geographical location. Communication between the marketing department and other departments is very effective. The adjusted R² model explains 61.6% of the variance in marketing strategy practice by product, price, promotion, and place; the remaining 38.4% of the variation in the dependent variable is explained by other factors not included in this study. The results reveal that all four independent variables, product, price, promotion, and place, have positive beta signs, showing that the predictor variables have a positive effect on the dependent variable, marketing strategy practice. Even though the company's marketing strategies are effectively practiced, the company faces some problems while implementing them: infrastructure problems, economic problems, intensive competition in the market, shortage of raw materials, seasonality of consumption, socio-cultural problems, and the time and cost of creating awareness among customers. Finally, the authors suggest that the company develop a long-range view and try to implement a more structured approach to obtaining information about potential customers, competitors' actions, and market intelligence within the industry. In addition, we recommend extending the study by increasing the sample size and including different marketing factors.
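
The reported regression (marketing strategy practice on the four marketing-mix elements, adjusted R² = 0.616, all betas positive) has the form sketched below; the survey responses are fabricated placeholders used only to show the model structure.

```python
# Hedged sketch: OLS of strategy practice on product, price, promotion, place.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.uniform(1, 5, size=(60, 4))          # toy Likert-style predictor scores
y = X @ np.array([0.3, 0.2, 0.25, 0.15]) + rng.normal(0, 0.5, 60)  # toy outcome

model = sm.OLS(y, sm.add_constant(X)).fit()
print(round(model.rsquared_adj, 3))          # analogous to the study's 0.616
print(model.params.round(2))                 # betas expected positive
```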

Keywords: marketing strategy, market segmentation, target marketing, market positioning, marketing mix

Procedia PDF Downloads 26
519 A Deep Learning Approach to Calculate Cardiothoracic Ratio From Chest Radiographs

Authors: Pranav Ajmera, Amit Kharat, Tanveer Gupte, Richa Pant, Viraj Kulkarni, Vinay Duddalwar, Purnachandra Lamghare

Abstract:

The cardiothoracic ratio (CTR) is the ratio of the diameter of the heart to the diameter of the thorax. An abnormal CTR, that is, a value greater than 0.55, is often an indicator of an underlying pathological condition. The accurate prediction of an abnormal CTR from chest X-rays (CXRs) aids in the early diagnosis of clinical conditions. We propose a deep learning-based model for automatic CTR calculation that can assist the radiologist with the diagnosis of cardiomegaly and optimize the radiology workflow. The study population included 1012 posteroanterior (PA) CXRs from a single institution. The Attention U-Net deep learning (DL) architecture was used for the automatic calculation of the CTR. A CTR of 0.55 was used as a cut-off to categorize the condition as cardiomegaly present or absent. An observer performance test was conducted to assess the radiologist's performance in diagnosing cardiomegaly with and without artificial intelligence (AI) assistance. The Attention U-Net model was highly specific in calculating the CTR, exhibiting a sensitivity of 0.80 [95% CI: 0.75, 0.85], a precision of 0.99 [95% CI: 0.98, 1], and an F1 score of 0.88 [95% CI: 0.85, 0.91]. During the analysis, we observed that 51 out of 1012 samples were misclassified by the model when compared to annotations made by the expert radiologist. We further observed that the sensitivity of the reviewing radiologist in identifying cardiomegaly increased from 40.50% to 88.4% when aided by the AI-generated CTR. Our segmentation-based AI model demonstrated high specificity and sensitivity for CTR calculation, and the performance of the radiologist on the observer performance test improved significantly with AI assistance. A DL-based segmentation model for rapid quantification of the CTR therefore has significant potential for use in clinical workflows.
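
Once heart and thorax masks are available from the segmentation model, the CTR itself is a one-line measurement: the maximal horizontal extent of the heart divided by that of the thorax, with 0.55 as the cut-off. The masks below are toy arrays standing in for Attention U-Net output.

```python
# Hedged sketch: CTR from toy heart/thorax masks (not real model output).
import numpy as np

heart = np.zeros((512, 512), bool); heart[260:380, 180:340] = True
thorax = np.zeros((512, 512), bool); thorax[80:470, 60:460] = True

def max_width(mask):
    cols = np.where(mask.any(axis=0))[0]   # columns containing the structure
    return cols.max() - cols.min() + 1

ctr = max_width(heart) / max_width(thorax)
print(f"CTR = {ctr:.2f} ->", "cardiomegaly" if ctr > 0.55 else "normal")
```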

Keywords: cardiomegaly, deep learning, chest radiograph, artificial intelligence, cardiothoracic ratio

Procedia PDF Downloads 70
518 Building a Comprehensive Repository for Montreal Gamelan Archives

Authors: Laurent Bellemare

Abstract:

After the showcase of traditional Indonesian performing arts at the Vancouver Expo in 1986, Canadian universities inherited sets of Indonesian gamelan orchestras and soon began offering courses for music students interested in learning these diverse traditions. Among them, Université de Montréal was offered two sets of Balinese orchestras, a novelty that allowed a community of Montreal gamelan enthusiasts to form and engage with this music. A few generations later, a large body of archives has amassed, framing the history of this niche community's achievements. This data, scattered across public and private archive collections, comes in various formats: Digital Audio Tape, audio cassettes, Video Home System videotape, digital files, photos, reel-to-reel audiotape, posters, concert programs, letters, TV shows, reports, and more. Attempting to study these documents in order to unearth a chronology of gamelan in Montreal has proven challenging, since no suitable platform for preservation, storage, and research currently exists. These files are therefore hard to find due to their decentralized locations. Additionally, most of the documents in older formats have yet to be digitized. In the case of recent digital files, such as pictures or rehearsal recordings, their locations can be even messier and their quantity overwhelming. Aside from the basic issue of choosing a suitable repository platform, questions of legal rights and methodology arise. For posterity, these documents should nonetheless be digitized, organized, and stored in an easily accessible online repository. This paper aims to underline the various challenges encountered in the early stages of such a project, as well as to suggest ways of overcoming the obstacles to a thorough archival investigation.

Keywords: archival work, archives, Balinese gamelan, Canada, gamelan, Indonesia, Javanese gamelan, Montreal

Procedia PDF Downloads 94
517 Effect of Threshold Configuration on Accuracy in Upper Airway Analysis Using Cone Beam Computed Tomography

Authors: Saba Fahham, Supak Ngamsom, Suchaya Damrongsri

Abstract:

Objective: To determine the optimal threshold of the Romexis software for airway volume and minimum cross-sectional area (MCA) analysis, using ImageJ as a gold standard. Materials and Methods: A total of ten cone-beam computed tomography (CBCT) images were collected. The airway volume and MCA of each patient were analyzed using the automatic airway segmentation function in the CBCT DICOM viewer (Romexis). Airway volume and MCA measurements were conducted on each CBCT sagittal view with fifteen different threshold values in the Romexis software, ranging from 300 to 1000. Duplicate DICOM files, in axial view, were imported into ImageJ for concurrent airway volume and MCA analysis as the gold standard. The airway volume and MCA measured with Romexis and ImageJ were compared using a t-test with Bonferroni correction, and statistical significance was set at p < 0.003. Results: Concerning airway volume, thresholds of 600 to 850, as well as 1000, exhibited results that were not significantly distinct from those obtained through ImageJ. Regarding MCA, employing thresholds from 400 to 850 within Romexis Viewer showed no variance from ImageJ. Notably, within the threshold range of 600 to 850, there were no statistically significant differences in either the airway volume or the MCA analyses in comparison to ImageJ. Conclusion: This study demonstrated that the use of Planmeca Romexis Viewer 6.4.3.3 within the threshold range of 600 to 850 yields airway volume and MCA measurements that exhibit no statistically significant variance in comparison to measurements obtained through ImageJ. This outcome holds implications for diagnosing upper airway obstructions and for post-orthodontic surgical monitoring.
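
The sweep itself is mechanical: segment at each of the fifteen threshold values and record the total volume and the minimum axial cross-sectional area. The sketch below uses a synthetic volume; the mapping of Romexis threshold values to gray values, the voxel size, and the intensities are all assumptions.

```python
# Hedged sketch: threshold sweep -> airway volume and MCA per threshold.
import numpy as np

rng = np.random.default_rng(0)
cbct = rng.normal(1200, 100, (80, 64, 64))                 # toy soft-tissue gray values
cbct[:, 25:40, 25:40] = rng.normal(400, 50, (80, 15, 15))  # darker airway lumen

voxel_mm3, pixel_mm2 = 0.4 ** 3, 0.4 ** 2                # assumed voxel size
for threshold in range(300, 1001, 50):                   # fifteen values, 300..1000
    airway = cbct < threshold                            # assumed threshold rule
    volume_mm3 = airway.sum() * voxel_mm3
    mca_mm2 = airway.sum(axis=(1, 2)).min() * pixel_mm2  # smallest axial slice area
    print(threshold, round(float(volume_mm3), 1), round(float(mca_mm2), 2))
```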

Keywords: airway analysis, airway segmentation, cone beam computed tomography, threshold

Procedia PDF Downloads 15
516 Colloquialism in Audiovisual Translation: English Subtitling of the Lebanese Film Capernaum as a Case Study

Authors: Fatima Saab

Abstract:

This paper attempts to study colloquialism in audiovisual translation, with particular emphasis on the difficulties and challenges encountered by subtitlers in translating Lebanese colloquial Arabic into English. To achieve the main objectives of this study, a thorough cultural and translational analysis of examples drawn from the subtitled movie Capernaum is presented in order to identify the strategies used to overcome cultural barriers and differences and to show the translator's decision-making process. Special attention is also given to explaining the technicalities of translating subtitles and how they affect the translation process. The research is a descriptive analytical study in which the writer sets out empirical observations on the difficulties and problems associated with translating Arabic colloquialisms, specifically Lebanese, into English in the subtitled film Capernaum. The methodology takes a qualitative approach, grouping the selected data into the subtitling strategies presented by Gottlieb and classifying them as domesticating or foreignizing according to Venuti's model. It is shown that conveying the same meanings to a foreign audience is not an easy task. The cultural elements and stories that make up the history and mindset of the Lebanese and Arab peoples lead to the use of the transfer and paraphrase strategies most of the time (81% of the sample used for analysis). The research shows that translating and subtitling colloquialism requires special skills from translators to overcome the challenges imposed by the limited presentation space as well as by cultural differences. Translation of colloquial Arabic/Lebanese can be achieved to a certain extent, and the meaning and effect of the source-language culture are delivered insofar as the translator investigates and relates to the target culture.

Keywords: Lebanese colloquial, audio-visual translation, subtitling, Capernaum

Procedia PDF Downloads 124
515 Preserving Urban Cultural Heritage with Deep Learning: Color Planning for Japanese Merchant Towns

Authors: Dongqi Li, Yunjia Huang, Tomo Inoue, Kohei Inoue

Abstract:

With urbanization, urban cultural heritage faces the impact and destruction of modernization. Many historical areas are losing their historical information and regional cultural characteristics, so systematic color planning is necessary for the conservation of historical areas. Japan, an early adopter of urban color planning, has developed a systematic approach to it. Hence, this study selects five merchant towns from Japan's category of important traditional building preservation areas as its subject, to explore the color structure and emotion of this type of historic area. First, an image semantic segmentation method identifies buildings, roads, and landscape elements, and their color data are extracted for color composition and emotion analysis to summarize their common features. Second, keywords are extracted from collected Internet evaluations using natural language processing. Correlation analysis of the color structure and the keywords provides a valuable reference for conservation decisions in these historic town areas. This paper also combines the color structure and Internet evaluation results with generative adversarial networks to generate predicted images of color structure improvements and color improvement schemes. The methods and conclusions of this paper can provide new ideas for the digital management of environmental colors in historic districts and a valuable reference for the inheritance of local traditional culture.
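To illustrate the color-extraction step described above, here is a minimal sketch that clusters the pixels of one semantic class into a small dominant-color palette. The class ids, palette size, and stand-in arrays are illustrative assumptions, not the authors' actual configuration.

```python
# Sketch of per-class dominant-color extraction from a segmented street image.
# The class ids (0=building, 1=road, 2=landscape) and the 5-color palette size
# are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

def dominant_colors(image, mask, class_id, n_colors=5):
    """Cluster the RGB pixels of one semantic class into a small palette."""
    pixels = image[mask == class_id].reshape(-1, 3).astype(float)
    km = KMeans(n_clusters=n_colors, n_init=10, random_state=0).fit(pixels)
    # Return palette colors ordered by how much of the class they cover.
    counts = np.bincount(km.labels_, minlength=n_colors)
    order = np.argsort(counts)[::-1]
    return km.cluster_centers_[order], counts[order] / counts.sum()

image = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)  # stand-in
mask = np.random.randint(0, 3, (480, 640))                        # stand-in
palette, share = dominant_colors(image, mask, class_id=0)
for rgb, frac in zip(palette, share):
    print(f"RGB {rgb.round(0)} covers {frac:.0%} of building pixels")
```

The resulting palettes per class can then feed the color composition and emotion analysis the abstract describes.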

Keywords: historic districts, color planning, semantic segmentation, natural language processing

Procedia PDF Downloads 57
514 Linguistic Accessibility and Audiovisual Translation: Corpus Linguistics as a Tool for Analysis

Authors: Juan-Pedro Rica-Peromingo

Abstract:

The important changes taking place in the media and the audiovisual world in Europe need to benefit all populations, in particular those with special needs, such as the deaf and hard-of-hearing, served by subtitling for the deaf and hard-of-hearing (SDH), and the blind and partially sighted, served by audio description (AD). This recent interest in the field of audiovisual translation (AVT) can be observed in the teaching and learning of the different modes of AVT in degree and post-degree courses at Spanish universities, which expands the interest in and practice of AVT linguistic accessibility. We present a research project led at the UCM that compiles AVT activities for teaching purposes and analyzes the creation and reception of SDH and AD: the AVLA Project (Audiovisual Learning Archive), which includes audiovisual materials produced by university students on different AVT modes along with evaluations from blind and deaf informants. In this study, we present the materials created by the students. A group of deaf and blind informants has been in charge of testing the students' SDH and AD corpus of audiovisual materials through questionnaires used to evaluate the students' production. These questionnaires gather information about the reception of the subtitles and the audio descriptions from linguistic and technical points of view. From all the materials compiled in the research project, a corpus containing both the students' production and the recipients' evaluations is being built: the CALING (Corpus de Accesibilidad Lingüística) corpus. Preliminary results are presented on the difficulties and deficiencies found in the SDH and AD included in the corpus, specifically the length of subtitles, the on-screen position of contextual information, the wording of the audio descriptions, and the tone of voice used. These results may suggest changes and improvements in the quality of the SDH and AD analyzed. Finally, the study argues for the teaching and learning of AVT and linguistic accessibility at the university level and suggests some important changes to the norms that regulate SDH and AD nationally and internationally.

Keywords: audiovisual translation, corpus linguistics, linguistic accessibility, teaching

Procedia PDF Downloads 55
513 A Methodology Based on Image Processing and Deep Learning for Automatic Characterization of Graphene Oxide

Authors: Rafael do Amaral Teodoro, Leandro Augusto da Silva

Abstract:

Originated from graphite, graphene is a two-dimensional (2D) material that promises to revolutionize technology in many different areas, such as energy, telecommunications, civil construction, aviation, textiles, and medicine. This is possible because its structure, formed by carbon bonds, provides desirable optical, thermal, and mechanical characteristics that are of interest to multiple areas of the market. Thus, several research and development centers are studying different manufacturing methods and material applications of graphene, efforts that are often compromised by the scarcity of agile and accurate methodologies to characterize the material, that is, to determine its composition, shape, size, and the number of layers and crystals. To address this need, this study proposes a computational methodology that applies deep learning to identify graphene oxide crystals in order to characterize samples by crystal size. To achieve this, a fully convolutional neural network called U-net was trained to segment scanning electron microscopy (SEM) images of graphene oxide. The segmentation generated by the U-net is refined with a per-class standard deviation technique, which allows crystals to be distinguished with different labels through an object delimitation algorithm. Next, the position, area, perimeter, and lateral measurements of each detected crystal are extracted from the images. This information generates a database with the dimensions of the crystals that compose the samples. Finally, graphs are automatically created showing the frequency distributions of crystal area and perimeter. This methodological process achieved highly accurate segmentation of graphene oxide crystals, with an accuracy of 95% and an F-score of 94% on the test set. Such performance demonstrates a high generalization capacity in crystal segmentation, since it holds under significant variation in image acquisition quality. The measurement of non-overlapping crystals presented an average error of 6% across the different measurement metrics, suggesting that the model provides high-quality measurements for non-overlapping segmentations. For overlapping crystals, however, a limitation of the model was identified. To overcome this limitation, it is important to ensure that the samples to be analyzed are properly prepared; this minimizes crystal overlap during SEM image acquisition and keeps measurement error low without extra data-handling effort. All in all, the developed method is a substantial time saver with high measurement value, as it can measure hundreds of graphene oxide crystals in seconds, saving weeks of manual work.
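As a sketch of the measurement stage described above (labeling the segmentation and extracting position, area, and perimeter per crystal), the following assumes a binary U-net output mask and a hypothetical pixel-to-nanometre scale; it is an illustration, not the authors' implementation.

```python
# Sketch of the crystal-measurement stage: label a binary segmentation mask,
# extract per-crystal area and perimeter, and tabulate the size distribution.
# The pixel-to-nanometre scale and the mask itself are illustrative assumptions.
import numpy as np
from skimage import measure

def crystal_measurements(binary_mask, nm_per_pixel=2.0):
    """Return a list of dicts with position, area and perimeter per crystal."""
    labels = measure.label(binary_mask)          # one label per crystal
    records = []
    for region in measure.regionprops(labels):
        records.append({
            "centroid": region.centroid,
            "area_nm2": region.area * nm_per_pixel ** 2,
            "perimeter_nm": region.perimeter * nm_per_pixel,
        })
    return records

mask = np.zeros((200, 200), dtype=bool)          # stand-in U-net output
mask[20:60, 30:80] = True
mask[100:150, 120:170] = True
for r in crystal_measurements(mask):
    print(f"crystal at {tuple(round(c) for c in r['centroid'])}: "
          f"{r['area_nm2']:.0f} nm^2, perimeter {r['perimeter_nm']:.0f} nm")
```

Collecting these records across all images yields the database of crystal dimensions from which the frequency-distribution graphs are plotted.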

Keywords: characterization, graphene oxide, nanomaterials, U-net, deep learning

Procedia PDF Downloads 137
512 The Relationship between Spindle Sound and Tool Performance in Turning

Authors: N. Seemuang, T. McLeay, T. Slatter

Abstract:

Worn tools have a direct effect on surface finish and part accuracy. Tool condition monitoring systems have been developed over a long period and are used to avoid the loss of productivity that results from using a worn tool. However, the majority of tool monitoring research has applied expensive sensing systems not suitable for production. In this work, the cutting sound of a turning machine was studied using a microphone. Machining trials under seven cutting conditions were conducted until the observable flank wear width (FWW) on the main cutting edge exceeded 0.4 mm. The cutting inserts were then removed from the tool holder and the flank wear width was measured optically. A microphone with a built-in preamplifier was used to record the machining sound of EN24 steel being face turned by a CNC lathe in a wet cutting condition using constant surface speed control. The sound was sampled at 50 kS/s, and all signals recorded from the microphone were transformed into the frequency domain by fast Fourier transform (FFT) in order to establish the frequency content of the audio signature that could then be used for tool condition monitoring. The feature extracted from the audio signal was compared to the flank wear progression on the cutting inserts. The spectrogram reveals a promising feature, termed 'spindle noise', which is emitted by the main spindle motor of the turning machine. The spindle noise frequency was detected at 5.86 kHz regardless of the cutting conditions used on this particular CNC lathe. Varying the cutting speed and feed rate influences the magnitude of the spindle noise power spectrum. The magnitude at the spindle noise frequency changes in conjunction with tool wear progression, increasing significantly in the transition between steady-state wear and severe wear. This could be used as a warning signal to prepare for tool replacement or to adapt cutting parameters to extend tool life.
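The band-tracking idea can be sketched as follows: estimate the power spectrum of an audio frame sampled at 50 kS/s and average the magnitude in a narrow band around the reported 5.86 kHz spindle-noise frequency. The band half-width and the synthetic test signal are illustrative assumptions.

```python
# Sketch of the spindle-noise feature extraction: estimate the power spectrum
# of a recorded audio frame and track the magnitude in a narrow band around
# 5.86 kHz. The band half-width (50 Hz) is an illustrative assumption.
import numpy as np
from scipy.signal import welch

FS = 50_000          # 50 kS/s sampling rate, as in the machining trials
TARGET_HZ = 5_860    # spindle-noise frequency reported for this lathe

def spindle_band_power(audio_frame, half_width_hz=50):
    """Mean PSD magnitude in the band TARGET_HZ +/- half_width_hz."""
    freqs, psd = welch(audio_frame, fs=FS, nperseg=4096)
    band = (freqs > TARGET_HZ - half_width_hz) & (freqs < TARGET_HZ + half_width_hz)
    return psd[band].mean()

# Stand-in signal: a 5.86 kHz tone buried in noise, one second long.
t = np.arange(FS) / FS
frame = 0.1 * np.sin(2 * np.pi * TARGET_HZ * t) + np.random.randn(FS)
print(f"band power near {TARGET_HZ} Hz: {spindle_band_power(frame):.2e}")
```

Tracking this band power over successive recordings would expose the magnitude jump at the transition from steady-state to severe wear.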

Keywords: tool wear, flank wear, condition monitoring, spindle noise

Procedia PDF Downloads 309
511 Digi-Buddy: A Smart Cane with Artificial Intelligence and Real-Time Assistance

Authors: Amaladhithyan Krishnamoorthy, Ruvaitha Banu

Abstract:

Vision is considered the most important human sense, without which leading a normal life can often be difficult. Many existing smart canes for the visually impaired use an ultrasonic transducer for obstacle detection to help users navigate. Though the basic smart cane increases the safety of its user, it does not help fill the void of visual loss. This paper introduces the concept of Digi-Buddy, an evolved smart cane for the visually impaired. The cane consists of several modules: apart from the basic obstacle detection features, Digi-Buddy assists the user by capturing video/images through a wide-angled camera and streaming them to a server, which detects objects using a deep convolutional neural network. In addition to determining what a particular image/object is, the distance to the object is assessed by the ultrasonic transducer. A sound generation application, modelled with the help of natural language processing, converts the processed image/object information into audio. The detected object is identified by name, which is transmitted to the user through Bluetooth earphones. Object detection is extended to facial recognition, which matches the faces of people the user meets against a database of face images and alerts the user about the person. Another crucial function is an automatic intimation alarm, which is triggered when the user is in an emergency. If the user recovers within a set time, a button provided on the cane stops the alarm; otherwise, an automatic intimation with the user's whereabouts is sent to friends and family using GPS. Beyond the safety and security offered by existing smart canes, the proposed concept, to be implemented as a prototype, helps the visually impaired visualize their surroundings through audio in a more amicable way.
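A minimal sketch of the announcement step follows, fusing a detection label with the ultrasonic distance reading and speaking the result. The detection and sensor inputs are hypothetical stand-ins, and the pyttsx3 text-to-speech engine is an assumption standing in for the paper's NLP-based sound generation module.

```python
# Sketch of the announcement step: fuse a detected object label with the
# ultrasonic distance reading and speak the result. The label and distance
# would come from the camera/server detection module and the cane's sensor.
import pyttsx3

def announce(label: str, distance_cm: float, engine) -> None:
    """Convert one detection into a short spoken alert."""
    metres = distance_cm / 100
    engine.say(f"{label} ahead, about {metres:.1f} metres")
    engine.runAndWait()

engine = pyttsx3.init()
announce("chair", 180.0, engine)   # e.g. "chair ahead, about 1.8 metres"
```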

Keywords: artificial intelligence, facial recognition, natural language processing, internet of things

Procedia PDF Downloads 323
510 Audio-Visual Co-Data Processing Pipeline

Authors: Rita Chattopadhyay, Vivek Anand Thoutam

Abstract:

Speech is the most natural means of communication, allowing us to exchange feelings and thoughts quickly. Quite often, people can communicate orally but cannot interact or work with computers or devices. It is quicker and easier to give speech commands to computers than to type them, and likewise easier to listen to audio played by a device than to read output from a screen. With robotics being an emerging market with applications in warehouses, the hospitality industry, consumer electronics, assistive technology, etc., speech-based human-machine interaction is emerging as a lucrative feature for robot manufacturers. Considering this, the objective of this paper is to design the “Audio-Visual Co-Data Processing Pipeline.” This pipeline integrates automatic speech recognition, a natural language model for text understanding, object detection, and text-to-speech modules. There are many deep learning models for each of these modules, but OpenVINO Model Zoo models are used because the OpenVINO toolkit covers both computer vision and non-computer-vision workloads across Intel hardware, maximizes performance, and accelerates application development. The input is a speech command containing the target objects to be detected and the start and end times of the interval to extract from the video. Speech is converted to text using the QuartzNet automatic speech recognition model. A summary is extracted from the text using the Generative Pre-trained Transformer 3 (GPT-3) language model. Based on the summary, the relevant frames are extracted from the video, and the You Only Look Once (YOLO) object detection model is run on these extracted frames. Frame numbers that contain target objects (the objects specified in the speech command) are saved as text. Finally, this text (the frame numbers) is converted to speech using a text-to-speech model and played from the device. The project is developed for the 80 YOLO labels, and the user can extract frames based on one or two target labels; the pipeline can easily be extended to more target labels by making appropriate changes in the object detection module. Four different speech command formats are supported by including sample examples in the prompt used by the GPT-3 model; based on user preference, a new speech command format can be added by including examples of that format in the prompt. This pipeline can be used in many projects, such as human-machine interfaces, human-robot interaction, and surveillance through speech commands. Any object detection project can be upgraded with this pipeline so that users can give speech commands and have the output played from the device.
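The frame-selection stage can be sketched as follows, assuming the start/end times and target labels have already been parsed from the speech command; detect_labels is a hypothetical stand-in for the OpenVINO YOLO inference call, and clip.mp4 is an illustrative file name.

```python
# Sketch of the frame-selection stage: sample frames between the start and end
# times named in the speech command, run a detector on each, and record the
# frame numbers that contain a target label.
import cv2

def frames_with_targets(video_path, start_s, end_s, targets, detect_labels):
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0   # fall back if metadata is missing
    start, end = int(start_s * fps), int(end_s * fps)
    cap.set(cv2.CAP_PROP_POS_FRAMES, start)   # seek to the requested interval
    hits = []
    for frame_no in range(start, end):
        ok, frame = cap.read()                # frames now arrive sequentially
        if not ok:
            break
        if targets & set(detect_labels(frame)):   # any target label present?
            hits.append(frame_no)
    cap.release()
    return hits

# e.g. hits = frames_with_targets("clip.mp4", 10, 25, {"dog", "person"}, model)
```

The resulting list of frame numbers is what the pipeline's final text-to-speech module would read out to the user.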

Keywords: OpenVINO, automatic speech recognition, natural language processing, object detection, text-to-speech

Procedia PDF Downloads 55