Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 590

Search results for: speaker segmentation

320 Effect of Threshold Configuration on Accuracy in Upper Airway Analysis Using Cone Beam Computed Tomography

Authors: Saba Fahham, Supak Ngamsom, Suchaya Damrongsri

Abstract:

Objective: To determine the optimal threshold of the Romexis software for airway volume and minimum cross-sectional area (MCA) analysis, using Image J as the gold standard. Materials and Methods: A total of ten cone-beam computed tomography (CBCT) images were collected. The airway volume and MCA of each patient were analyzed using the automatic airway segmentation function in the CBCT DICOM viewer (Romexis). Airway volume and MCA measurements were conducted on each CBCT sagittal view with fifteen different threshold values from the Romexis software, ranging from 300 to 1000. Duplicate DICOM files, in axial view, were imported into Image J for concurrent airway volume and MCA analysis as the gold standard. The airway volume and MCA measured with Romexis and Image J were compared using a t-test with Bonferroni correction, and statistical significance was set at p<0.003. Results: For airway volume, thresholds of 600 to 850, as well as 1000, gave results that were not significantly different from those obtained with Image J. For MCA, thresholds from 400 to 850 in Romexis Viewer showed no significant difference from Image J. Notably, within the threshold range of 600 to 850, no statistically significant differences from Image J were observed in either airway volume or MCA. Conclusion: This study demonstrated that Planmeca Romexis Viewer 6.4.3.3, used within a threshold range of 600 to 850, yields airway volume and MCA measurements that show no statistically significant difference from measurements obtained with Image J. This outcome has implications for diagnosing upper airway obstructions and for post-orthodontic surgical monitoring.
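
The significance level quoted in this abstract is consistent with a Bonferroni correction over the fifteen thresholds. The sketch below shows the arithmetic under two assumptions that the abstract does not state: that the thresholds are evenly spaced (300, 350, ..., 1000) and that the family-wise alpha is the conventional 0.05. The helper name `agreeing_thresholds` and the p-values in `example_p` are hypothetical and are not results from the study.

```python
# Assumption: the fifteen thresholds are evenly spaced from 300 to 1000.
thresholds = list(range(300, 1001, 50))

# Assumption: a conventional family-wise alpha of 0.05 (not stated in the abstract).
family_alpha = 0.05
per_test_alpha = family_alpha / len(thresholds)


def agreeing_thresholds(p_values, alpha):
    """Return the thresholds whose p-value is not below the corrected alpha."""
    return sorted(t for t, p in p_values.items() if p >= alpha)


# Hypothetical p-values, invented for illustration only.
example_p = {300: 0.0004, 600: 0.21, 850: 0.047}

print(len(thresholds), round(per_test_alpha, 4))
print(agreeing_thresholds(example_p, per_test_alpha))
```

Under these assumptions the corrected level is 0.05 / 15, roughly 0.0033, which is close to the p<0.003 reported above.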

Keywords: airway analysis, airway segmentation, cone beam computed tomography, threshold

Procedia PDF Downloads 44

319 English Complex Aspectuality: A Functional Approach

Authors: Cunyu Zhang

Abstract:

Based on Systemic Functional Linguistics, this paper explores the complex aspectuality system of English. The study shows that complex aspectuality divides into two types: complex viewpoint aspect, which refers to the homogeneous or heterogeneous ways in which the speaker continuously views the same situation, and complex situation aspect, which is the combined configuration of the internal time schemata of a situation. Complex viewpoint aspect is formed in two ways, through viewpoint shifting and viewpoint repeating. Complex situation aspect is formed through the hypotactic verbal complex and through the limitation of participant and circumstance in a clause.

Keywords: aspect series, complex situation aspect, complex viewpoint aspect, systemic functional linguistics

Procedia PDF Downloads 356

318 Preserving Urban Cultural Heritage with Deep Learning: Color Planning for Japanese Merchant Towns

Authors: Dongqi Li, Yunjia Huang, Tomo Inoue, Kohei Inoue

Abstract:

Under urbanization, urban cultural heritage faces damage and destruction from modernization. Many historical areas are losing their historical information and regional cultural characteristics, so systematic color planning is needed for historical areas under conservation. Japan focused on urban color planning early and has a systematic approach to it. Hence, this paper selects five merchant towns from the category of important traditional building preservation areas in Japan as the subject of this study, to explore the color structure and emotion of this type of historic area. First, an image semantic segmentation method identifies the buildings, roads, and landscape environments. Their color data are extracted for color composition and emotion analysis to summarize their common features. Second, Internet evaluations are collected and processed with natural language processing for keyword extraction. The correlation analysis of the color structure and keywords provides a valuable reference for conservation decisions for this type of historic area. This paper also combines the color structure and Internet evaluation results with generative adversarial networks to generate predicted images of color structure improvements and color improvement schemes. The methods and conclusions of this paper can provide new ideas for the digital management of environmental colors in historic districts and a valuable reference for the inheritance of local traditional culture.

Keywords: historic districts, color planning, semantic segmentation, natural language processing

Procedia PDF Downloads 88

317 A Methodology Based on Image Processing and Deep Learning for Automatic Characterization of Graphene Oxide

Authors: Rafael do Amaral Teodoro, Leandro Augusto da Silva

Abstract:

Derived from graphite, graphene is a two-dimensional (2D) material that promises to revolutionize technology in many different areas, such as energy, telecommunications, civil construction, aviation, textiles, and medicine. This is possible because its structure, formed by carbon bonds, provides desirable optical, thermal, and mechanical characteristics that are of interest to multiple market sectors. Thus, several research and development centers are studying different manufacturing methods and material applications of graphene, which are often hampered by the scarcity of agile and accurate methodologies to characterize the material, that is, to determine its composition, shape, size, and the number of layers and crystals. To address this need, this study proposes a computational methodology that applies deep learning to identify graphene oxide crystals in order to characterize samples by crystal size. To achieve this, a fully convolutional neural network called U-net has been trained to segment SEM graphene oxide images. The segmentation generated by the U-net is fine-tuned with a standard deviation technique by classes, which allows crystals to be distinguished with different labels through an object delimitation algorithm. As a next step, the position, area, perimeter, and lateral measures of each detected crystal are extracted from the images. This information generates a database with the dimensions of the crystals that compose the samples. Finally, graphs are automatically created showing the frequency distributions by area and perimeter of the crystals. This methodological process resulted in a high capacity for segmentation of graphene oxide crystals, with accuracy and F-score equal to 95% and 94%, respectively, over the test set. 
Such performance demonstrates a high generalization capacity of the method in crystal segmentation, since its performance considers significant changes in image extraction quality. The measurement of non-overlapping crystals presented an average error of 6% for the different measurement metrics, thus suggesting that the model provides a high-performance measurement for non-overlapping segmentations. For overlapping crystals, however, a limitation of the model was identified. To overcome this limitation, it is important to ensure that the samples to be analyzed are properly prepared. This will minimize crystal overlap in the SEM image acquisition and guarantee a lower error in the measurements without greater efforts for data handling. All in all, the method developed is a time optimizer with a high measurement value, considering that it is capable of measuring hundreds of graphene oxide crystals in seconds, saving weeks of manual work.
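
The labeling and measurement steps described in this abstract can be illustrated with a small sketch. The abstract does not name its object delimitation algorithm, so the code below substitutes a plain 4-connected flood fill on a binary mask; the function names `label_crystals` and `measure`, the `pixel_size` parameter, and the toy mask are all hypothetical.

```python
from collections import deque


def label_crystals(mask):
    """Group foreground pixels of a binary mask into 4-connected regions.

    Returns a list of regions, each a list of (row, col) pixel coordinates.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def measure(region, pixel_size=1.0):
    """Area and bounding-box side lengths of one region, in physical units."""
    ys = [y for y, _ in region]
    xs = [x for _, x in region]
    return {
        "area": len(region) * pixel_size ** 2,
        "height": (max(ys) - min(ys) + 1) * pixel_size,
        "width": (max(xs) - min(xs) + 1) * pixel_size,
    }


# Toy mask standing in for a segmented SEM image (1 = crystal pixel).
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 1],
]
regions = label_crystals(mask)
measurements = [measure(region) for region in regions]
print(measurements)
```

Because touching pixels are merged into one region, a sketch like this cannot separate overlapping crystals, which matches the limitation the abstract reports for overlapping segmentations.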

Keywords: characterization, graphene oxide, nanomaterials, U-net, deep learning

Procedia PDF Downloads 160

316 Performance Evaluation of Acoustic-Spectrographic Voice Identification Method in Native and Non-Native Speech

Authors: E. Krasnova, E. Bulgakova, V. Shchemelinin

Abstract:

The paper deals with the acoustic-spectrographic voice identification method and its performance on non-native speech. Performance is evaluated by comparing the results of analyzing recordings that contain native-language speech with those of recordings that contain foreign-language speech. Our research is based on the Tajik and Russian speech of Tajik native speakers, owing to the nature of the criminal situation with drug trafficking. We propose a pilot experiment that represents a first attempt to enter the field.

Keywords: speaker identification, acoustic-spectrographic method, non-native speech, performance evaluation

Procedia PDF Downloads 446

315 The Effects of Culture and Language on Social Impression Formation from Voice Pleasantness: A Study with French and Iranian People

Authors: L. Bruckert, A. Mansourzadeh

Abstract:

The voice has a major influence on interpersonal communication in everyday life via the perception of pleasantness. The evolutionary perspective postulates that the mechanisms underlying pleasantness judgments are universal adaptations that have evolved in the service of choosing a mate (through the process of sexual selection). From this point of view, the preferred voices would be those with more marked sexually dimorphic characteristics, for example a lower pitch in men, and pitch would be the main criterion. On the other hand, one can postulate that the mechanisms involved are gradually established from childhood through exposure to the environment, and thus the prosodic elements could take precedence in everyday communication, as they convey information about the speaker's attitude (willingness to communicate, interest toward the interlocutors). Our study focuses on voice pleasantness and its relationship with social impression formation, exploring both the spectral aspects (pitch, timbre) and the prosodic ones. We recorded the voices of 25 French males speaking French and 25 Iranian males speaking Farsi in two vocal corpora (five vowels and a reading text). French listeners (40 male/40 female) listened to the French voices and made a judgment either on the voice's pleasantness or on the speaker (his intelligence, honesty, sociability). The regression analyses of our acoustic measures showed that the prosodic elements (for example, intonation and speech rate) are the most important criteria for pleasantness, whatever the corpus or the listener's gender. Moreover, the correlation analyses showed that the speakers whose voices were judged the most pleasant were considered the most intelligent, sociable, and honest. 
The voices in Farsi were judged by 80 other French listeners (40 male/40 female), and we found the same effect of intonation on the judgment of pleasantness with the «vowel» corpus, whereas with the «text» corpus pitch was more important than prosody. This may suggest that voice perception contains some elements that are invariant across cultures/languages, whereas others are influenced by the cultural/linguistic background of the listener. In the near future, Iranian listeners will be asked to listen either to the French voices (half of them) or to the Farsi voices (the other half) and to make the same judgments as the French listeners. This experimental design could make it possible to distinguish what is linked to culture from what is linked to language when differences in voice perception arise.

Keywords: cross-cultural psychology, impression formation, pleasantness, voice perception

Procedia PDF Downloads 69

314 Skull Extraction for Quantification of Brain Volume in Magnetic Resonance Imaging of Multiple Sclerosis Patients

Authors: Marcela De Oliveira, Marina P. Da Silva, Fernando C. G. Da Rocha, Jorge M. Santos, Jaime S. Cardoso, Paulo N. Lisboa-Filho

Abstract:

Multiple Sclerosis (MS) is an immune-mediated disease of the central nervous system characterized by neurodegeneration, inflammation, demyelination, and axonal loss. Magnetic resonance imaging (MRI), owing to the richness of detail it provides, is the gold standard exam for diagnosis and follow-up of neurodegenerative diseases such as MS. Brain atrophy, the gradual loss of brain volume, is quite extensive in multiple sclerosis, about 0.5-1.35% per year, well beyond the limits of normal aging. Brain volume quantification is therefore an essential task for future analysis of the occurrence of atrophy. The analysis of MRI has become a tedious and complex task for clinicians, who have to extract important information manually. This manual analysis is prone to errors and is time consuming because of intra- and inter-operator variability. Nowadays, computerized methods for MRI segmentation are used extensively to assist doctors in quantitative analyses for disease diagnosis and monitoring. Thus, the purpose of this work was to evaluate the brain volume in MRI of MS patients. We used MRI scans with 30 slices from five patients diagnosed with multiple sclerosis according to the McDonald criteria. The computational analysis of the images was carried out in two steps: segmentation of the brain and brain volume quantification. The first image processing step was to perform brain extraction by skull stripping from the original image. In the skull stripper for MRI images of the brain, the algorithm registers a grayscale atlas image to the grayscale patient image. The associated brain mask is propagated using the registration transformation. This mask is then eroded and used for a refined brain extraction based on level-sets (edge of the brain-skull border with dedicated expansion, curvature, and advection terms). 
In the second step, brain volume quantification was performed by counting the voxels belonging to the segmentation mask and converting the count to cubic centimeters (cc). We observed an average brain volume of 1469.5 cc. We concluded that the automatic method applied in this work can be used for brain extraction and brain volume quantification in MRI. The development and use of computer programs can help health professionals in the diagnosis and monitoring of patients with neurodegenerative diseases. In future work, we expect to implement more automated methods for the assessment of cerebral atrophy and the quantification of brain lesions, including machine-learning approaches. Acknowledgements: This work was supported by a grant from the Brazilian agency Fundação de Amparo à Pesquisa do Estado de São Paulo (number 2019/16362-5).
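
The second step described in this abstract, counting mask voxels and converting to cc, comes down to a single multiplication. The following is a minimal sketch, assuming the mask is a nested list indexed as [slice][row][column] and that the voxel spacing is known in millimeters; the function name `brain_volume_cc` and the toy values are hypothetical, since the abstract does not report voxel dimensions.

```python
def brain_volume_cc(mask, voxel_size_mm):
    """Volume of a binary segmentation mask in cubic centimeters.

    mask: nested list [slice][row][col] with 1 for brain voxels, 0 otherwise.
    voxel_size_mm: (dx, dy, dz) voxel spacing in millimeters.
    """
    n_voxels = sum(value for slc in mask for row in slc for value in row)
    dx, dy, dz = voxel_size_mm
    voxel_volume_mm3 = dx * dy * dz
    return n_voxels * voxel_volume_mm3 / 1000.0  # 1 cc = 1000 mm^3


# Toy mask: 2 slices of 2 x 2 voxels, all inside the brain (8 voxels).
toy_mask = [[[1, 1], [1, 1]], [[1, 1], [1, 1]]]
toy_volume = brain_volume_cc(toy_mask, (5.0, 5.0, 5.0))
print(toy_volume)
```

With 8 voxels of 125 mm^3 each, the toy mask covers 1000 mm^3, which is 1 cc.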

Keywords: brain volume, magnetic resonance imaging, multiple sclerosis, skull stripper

Procedia PDF Downloads 146

313 Code Switching: A Case Study Of Lebanon

Authors: Wassim Bekai

Abstract:

Code switching, as its name suggests, is alternating between two or more languages in one sentence. Speakers tend to use code switching to make their message clearer to the receiver. It is commonly used in multicultural countries such as Lebanon because of the various cultures that have crossed its lands through history, considering that Lebanon is geographically located in the heart of the world, and hence between many cultures and languages. In addition, Lebanon was occupied by Turkish authorities for about 400 years and later came under the French mandate; both powers imposed their languages on official papers and on the Lebanese educational system. This paper examines the importance of code switching in the Lebanese workplace, stressing the efficiency and amount of production resulting from code switching in the workplace (factories and universities, among other places), and explores the social, educational, religious and cultural factors behind this phenomenon in Lebanon.

Keywords: code switching, Lebanon, cultural, factors

Procedia PDF Downloads 287

312 Quantitative Evaluation of Supported Catalysts Key Properties from Electron Tomography Studies: Assessing Accuracy Using Material-Realistic 3D-Models

Authors: Ainouna Bouziane

Abstract:

The ability of Electron Tomography to recover the 3D structure of catalysts, with spatial resolution at the subnanometer scale, has been widely explored and reviewed in recent decades. A variety of experimental techniques, based either on Transmission Electron Microscopy (TEM) or Scanning Transmission Electron Microscopy (STEM), have been used to reveal different features of nanostructured catalysts in 3D, but High Angle Annular Dark Field imaging in STEM mode (HAADF-STEM) stands out as the most frequently used, given its chemical sensitivity and its avoidance of imaging artifacts related to diffraction phenomena when dealing with crystalline materials. In this regard, our group has developed a methodology that combines image denoising by undecimated wavelet transforms (UWT) with automated, advanced segmentation procedures and parameter selection methods using CS-TVM (Compressed Sensing-total variation minimization) algorithms to obtain more reliable quantitative information from 3D characterization studies. However, evaluating the accuracy of the magnitudes estimated from the segmented volumes is also an important issue that has not yet been properly addressed, because a perfectly known reference is needed. The problem is particularly complicated in the case of multicomponent material systems. To tackle this key question, we have developed a methodology that incorporates volume reconstruction/segmentation methods. In particular, we have established an approach to evaluate, in quantitative terms, the accuracy of TVM reconstructions, which considers the influence of relevant experimental parameters such as the range of tilt angles, image noise level or object orientation. The approach is based on the analysis of material-realistic 3D phantoms, which include the most relevant features of the system under analysis.

Keywords: electron tomography, supported catalysts, nanometrology, error assessment

Procedia PDF Downloads 88

311 The Markers -mm and dämmo in Amharic: Developmental Approach

Authors: Hayat Omar

Abstract:

Languages provide speakers with a wide range of linguistic units to organize and deliver information. There are several ways to verbally express the mental representations of events. According to the linguistic tools they have acquired, speakers select the one that brings out the most communicative effect to convey their message. Our study focuses on two markers, -mm and dämmo, in Amharic (Ethiopian Semitic language). Our aim is to examine, from a developmental perspective, how they are used by speakers. We seek to distinguish the communicative and pragmatic functions indicated by means of these markers. To do so, we created a corpus of sixty narrative productions of children from 5-6, 7-8 to 10-12 years old and adult Amharic speakers. The experimental material we used to collect our data is a series of pictures without text 'Frog, Where are you?'. Although -mm and dämmo are each used in specific contexts, they are sometimes analyzed as being interchangeable. The suffix -mm is complex and multifunctional. It marks the end of the negative verbal structure, it is found in the relative structure of the imperfect, it creates new words such as adverbials or pronouns, it also serves to coordinate words, sentences and to mark the link between macro-propositions within a larger textual unit. -mm was analyzed as marker of insistence, topic shift marker, element of concatenation, contrastive focus marker, 'bisyndetic' coordinator. On the other hand, dämmo has limited function and did not attract the attention of many authors. The only approach we could find analyzes it in terms of 'monosyndetic' coordinator. The paralleling of these two elements made it possible to understand their distinctive functions and refine their description. When it comes to marking a referent, the choice of -mm or dämmo is not neutral, depending on whether the tagged argument is newly introduced, maintained, promoted or reintroduced. The presence of these morphemes explains the inter-phrastic link. 
The information is captured by anaphora or presupposition: -mm points upstream while dämmo points downstream, the latter requiring new information. The speaker uses -mm or dämmo according to what he assumes to be known to his interlocutors. The results show that although all the speakers use both -mm and dämmo, the markers do not always have the same scope from one speaker to another, and their use varies with age. dämmo is mainly used to mark a contrastive topic to signal the concomitance of events. It is more commonly used in young children’s narratives (F(3,56) = 3.82, p < .01). Some values of -mm (additive) are acquired very early, while others are acquired rather late and increase with age (F(3,56) = 3.2, p < .03). The difficulty of -mm is due not only to its synthetic structure but primarily to the fact that it is multi-purpose and requires memory work. It highlights the constituent on which it operates to clarify how the message should be interpreted.

Keywords: acquisition, cohesion, connection, contrastive topic, contrastive focus, discourse marker, pragmatics

Procedia PDF Downloads 134

310 Developing Speaking Confidence of Students through Communicative Activities

Authors: Yadab Giri

Abstract:

Confidence is considered a strength of a good speaker, and it can also be taken as a tool for speaking. The paper, entitled ‘Developing Speaking Confidence of Students through Communicative Activities’, was written with the purpose of developing the speaking confidence of seventh-grade students in our context. The research is designed under the interpretive paradigm of action research. Thirteen students from class seven were chosen for the study. By the end of the eighth week, a marked improvement was seen in their confidence while communicating with other speakers. Although the intervention produced a positive result, some students still did not reach the level of confidence they could have developed to give a satisfactory response. Overall, the outcome of my action research is positive: students are eager and interested in speaking daily at the start of their English class, and their speaking has improved.

Keywords: confidence, speaking skills, action research, reflection with feedback and observation, finally endeavour

Procedia PDF Downloads 76

309 Stop Consonants in Chinese and Slovak: Contrastive Analysis by Using Praat

Authors: Maria Istvanova

Abstract:

The acquisition of the correct pronunciation in Chinese is closely linked to the initial phase of the study. Based on the contrastive analysis, we determine the differences in the pronunciation of stop consonants in Chinese and Slovak taking into consideration the place and manner of articulation to gain a better understanding of the students' main difficulties in the process of acquiring correct pronunciation of Chinese stop consonants. We employ the software Praat for the analysis of the recorded samples with an emphasis on the pronunciation of the students with a varying command of Chinese. The comparison of the VOT length for the individual consonants in the students' pronunciation and the pronunciation of the native speaker exposes the differences between the correct pronunciation and the deviant pronunciation of the students.

Keywords: Chinese, contrastive analysis, Praat, pronunciation, Slovak

Procedia PDF Downloads 137

308 A Robust Visual Simultaneous Localization and Mapping for Indoor Dynamic Environment

Authors: Xiang Zhang, Daohong Yang, Ziyuan Wu, Lei Li, Wanting Zhou

Abstract:

Visual Simultaneous Localization and Mapping (VSLAM) uses cameras to collect information in unknown environments in order to perform localization and environment map construction simultaneously, and it has a wide range of applications in autonomous driving, virtual reality and other related fields. At present, VSLAM research maintains high accuracy in static environments. In dynamic environments, however, moving objects in the scene reduce the stability of the VSLAM system, resulting in inaccurate localization and mapping, or even failure. In this paper, a robust VSLAM method is proposed to deal effectively with dynamic environments. We propose a dynamic region removal scheme based on semantic segmentation neural networks and geometric constraints. First, a semantic extraction neural network is used to extract the prior active motion region, prior static region and prior passive motion region in the environment. Then, the lightweight frame tracking module initializes the transform pose between the previous frame and the current frame on the prior static region. A motion consistency detection module based on multi-view geometry and scene flow then divides the environment into a static region and a dynamic region, so that the dynamic object region is eliminated. Finally, only the static region is used by the tracking thread. Our research is based on the ORBSLAM3 system, which is one of the most effective VSLAM systems available. We evaluated our method on the TUM RGB-D benchmark, and the results demonstrate that the proposed VSLAM method improves the accuracy of the original ORBSLAM3 by 70% to 98.5% in highly dynamic environments.
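
The final step of this pipeline, tracking on the static region only, can be pictured as discarding feature points that fall inside the detected dynamic region. The sketch below assumes the dynamic region is already available as a boolean image mask and that keypoints are (x, y) pixel coordinates; the function name `keep_static_keypoints` and the toy data are hypothetical, and nothing here reproduces the authors' segmentation or motion consistency modules.

```python
def keep_static_keypoints(keypoints, dynamic_mask):
    """Keep only keypoints that lie inside the image and outside dynamic regions.

    keypoints: list of (x, y) pixel coordinates.
    dynamic_mask: nested list [row][col], True where the pixel is dynamic.
    """
    height, width = len(dynamic_mask), len(dynamic_mask[0])
    kept = []
    for x, y in keypoints:
        col, row = int(round(x)), int(round(y))
        inside = 0 <= row < height and 0 <= col < width
        if inside and not dynamic_mask[row][col]:
            kept.append((x, y))
    return kept


# Toy 3 x 4 mask: the top-right 2 x 2 block is marked as dynamic.
toy_dynamic_mask = [
    [False, False, True, True],
    [False, False, True, True],
    [False, False, False, False],
]
toy_keypoints = [(0, 0), (2, 1), (3, 2), (5, 5)]
static_keypoints = keep_static_keypoints(toy_keypoints, toy_dynamic_mask)
print(static_keypoints)
```

In the toy case, the point at (2, 1) is dropped because it lies on the dynamic block, and (5, 5) is dropped because it lies outside the image.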

Keywords: dynamic scene, dynamic visual SLAM, semantic segmentation, scene flow, VSLAM

Procedia PDF Downloads 116

307 Uplift Segmentation Approach for Targeting Customers in a Churn Prediction Model

Authors: Shivahari Revathi Venkateswaran

Abstract:

Segmenting customers plays a significant role in churn prediction. It helps the marketing team with proactive and reactive customer retention. For reactive retention, the retention team reaches out with special offers to customers who have already shown intent to disconnect. For proactive retention, the marketing team uses a churn prediction model, which assigns each customer a rank from 1 to 100, where rank 1 indicates the highest risk of churning/disconnecting (top-ranked customers have a high propensity to churn). The churn prediction model is built using an XGBoost model. However, with the churn rank alone, the marketing team can only reach out to customers based on their individual ranks. Profiling different groups of customers and framing different marketing strategies for targeted groups is not possible with the churn ranks. For this, the customers must be grouped into segments based on their profiles, such as demographics and other non-controllable attributes. This helps the marketing team to frame different offer groups for the targeted audience and prevent them from disconnecting (proactive retention). For segmentation, machine learning approaches like k-means clustering do not form unique customer segments whose customers share the same attributes. This paper presents an alternative approach that finds all the combinations of unique segments that can be formed from the user attributes and then finds the segments that have uplift (a churn rate higher than the baseline churn rate). For this, search algorithms like fast search and recursive search are used. Further, within each segment, customers can be targeted using their individual churn ranks from the churn prediction model. Finally, a UI (User Interface) is developed for the marketing team to search interactively for the meaningful segments that are formed and to target the right audience in future marketing campaigns and prevent them from disconnecting.
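
The segment search described in this abstract can be sketched as follows. The abstract does not describe its fast search or recursive search algorithms, so this example substitutes a plain exhaustive enumeration over attribute combinations; the function name `uplift_segments`, the `min_size` parameter, the attribute names, and the customer records are all hypothetical. A segment counts as having uplift when its churn rate exceeds the baseline churn rate, as defined in the abstract.

```python
from itertools import combinations


def uplift_segments(customers, attributes, min_size=1):
    """Enumerate attribute-value segments whose churn rate exceeds the baseline.

    customers: list of dicts with attribute values and a 0/1 "churned" flag.
    attributes: attribute names used to form segments.
    Returns (baseline_rate, [(segment_definition, size, churn_rate), ...]),
    sorted by churn rate in descending order.
    """
    baseline = sum(c["churned"] for c in customers) / len(customers)
    found = []
    for k in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, k):
            # Group customers by their values on this attribute combination.
            groups = {}
            for c in customers:
                key = tuple(c[a] for a in attrs)
                groups.setdefault(key, []).append(c["churned"])
            for key, flags in groups.items():
                rate = sum(flags) / len(flags)
                if len(flags) >= min_size and rate > baseline:
                    found.append((dict(zip(attrs, key)), len(flags), rate))
    found.sort(key=lambda segment: segment[2], reverse=True)
    return baseline, found


# Hypothetical customer records, invented for illustration only.
customers = [
    {"region": "north", "plan": "basic", "churned": 1},
    {"region": "north", "plan": "basic", "churned": 1},
    {"region": "north", "plan": "premium", "churned": 0},
    {"region": "south", "plan": "basic", "churned": 0},
    {"region": "south", "plan": "premium", "churned": 0},
    {"region": "south", "plan": "premium", "churned": 0},
]
baseline, segments = uplift_segments(customers, ["region", "plan"])
for definition, size, rate in segments:
    print(definition, size, round(rate, 2))
```

Exhaustive enumeration grows quickly with the number of attributes, which is presumably why the paper turns to dedicated search algorithms; this sketch only shows what an uplift segment is.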

Keywords: churn prediction modeling, XGBoost model, uplift segments, proactive marketing, search algorithms, retention, k-means clustering

Procedia PDF Downloads 71

306 Deep Learning Approach for Colorectal Cancer’s Automatic Tumor Grading on Whole Slide Images

Authors: Shenlun Chen, Leonard Wee

Abstract:

Tumor grading is an essential reference for colorectal cancer (CRC) staging and survival prognostication. The widely used World Health Organization (WHO) grading system defines the histological grade of CRC adenocarcinoma based on the density of glandular formation on whole slide images (WSI). Tumors are classified as well-, moderately-, poorly- or un-differentiated depending on the percentage of the tumor that is gland forming: >95%, 50-95%, 5-50% and <5%, respectively. However, manually grading WSIs is a time-consuming process and can lead to observer error due to subjective judgment and unnoticed regions. Furthermore, pathologists’ grading is usually coarse, while a finer, continuous differentiation grade may help stratify CRC patients better. In this study, a deep learning based automatic differentiation grading algorithm was developed and evaluated by survival analysis. First, a gland segmentation model was developed for segmenting gland structures. Gland regions of WSIs were delineated and used for differentiation annotation. Tumor regions were annotated by experienced pathologists as high-, medium-, or low-differentiation or as normal tissue, which correspond to tumor with clear, unclear, or no gland structure and to non-tumor, respectively. Then a differentiation prediction model was developed on these human annotations. Finally, all enrolled WSIs were processed by the gland segmentation model and the differentiation prediction model. The differentiation grade can be calculated from the deep learning models’ predictions of tumor regions and tumor differentiation status according to the WHO definitions. If a patient had multiple WSIs, the highest differentiation grade was chosen. Additionally, the differentiation grade was normalized to a scale between 0 and 1. The Cancer Genome Atlas COAD project (TCGA-COAD) was enrolled in this study. 
For the gland segmentation model, the area under the receiver operating characteristic (ROC) curve reached 0.981 and accuracy reached 0.932 in the validation set. For the differentiation prediction model, the area under the ROC curve reached 0.983, 0.963, 0.963, 0.981 and accuracy reached 0.880, 0.923, 0.668, 0.881 for the low-, medium-, and high-differentiation and normal tissue groups in the validation set. Four hundred and one patients were selected after removing WSIs without gland regions and patients without follow-up data. The concordance index reached 0.609. An optimized cut-off point of 51% was found by the “Maxstat” method, which was almost the same as the WHO system’s cut-off point of 50%. Both the WHO cut-off point and the optimized cut-off point performed impressively in Kaplan-Meier curves, and the p-values of both log-rank tests were below 0.005. In this study, the gland structure of WSIs and the differentiation status of tumor regions were shown to be predictable through deep learning. A finer, continuous differentiation grade can also be calculated automatically through the above models. The differentiation grade was shown to stratify CRC patients well in survival analysis, with an optimized cut-off point almost the same as that of the WHO tumor grading system. The tool for automatically calculating the differentiation grade may show potential in therapy decision making and personalized treatment.
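
The WHO percentage bands quoted in this abstract map directly onto a small lookup. In the sketch below, the function names `who_grade` and `normalized_grade` are hypothetical, the handling of the exact boundary values 5%, 50% and 95% is an assumption (the abstract gives ranges that touch at those values), and the division by 100 is only one plausible reading of the normalization to a 0 to 1 scale.

```python
def who_grade(gland_percent):
    """Map the gland-forming percentage of a tumor to a WHO differentiation grade.

    Bands follow the abstract: >95%, 50-95%, 5-50%, <5%.
    Assumption: values of exactly 95, 50 and 5 fall into the band that lists
    them as its upper bound (95) or lower bound (50 and 5).
    """
    if not 0 <= gland_percent <= 100:
        raise ValueError("gland_percent must be between 0 and 100")
    if gland_percent > 95:
        return "well differentiated"
    if gland_percent >= 50:
        return "moderately differentiated"
    if gland_percent >= 5:
        return "poorly differentiated"
    return "undifferentiated"


def normalized_grade(gland_percent):
    """Continuous grade on a 0 to 1 scale (assumed to be a simple rescaling)."""
    return gland_percent / 100.0


for percent in (98, 95, 51, 20, 3):
    print(percent, who_grade(percent), normalized_grade(percent))
```

On this reading, the optimized 51% cut-off and the WHO 50% cut-off both separate the poorly differentiated band from the moderately differentiated one.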

Keywords: colorectal cancer, differentiation, survival analysis, tumor grading

Procedia PDF Downloads 134

305 Arabic Light Word Analyser: Roles with Deep Learning Approach

Authors: Mohammed Abu Shquier

Abstract:

This paper introduces a word segmentation method using the novel BP-LSTM-CRF architecture for processing semantic output training. The objective of web morphological analysis tools is to link a formal morpho-syntactic description to a lemma, along with morpho-syntactic information, a vocalized form, a vocalized analysis with morpho-syntactic information, and a list of paradigms. A key objective is to continuously enhance the proposed system through an inductive learning approach that considers semantic influences. The system is currently under construction and development based on data-driven learning. To evaluate the tool, an experiment on homograph analysis was conducted. The tool also encompasses the assumption of deep binary segmentation hypotheses, the arbitrary choice of trigram or n-gram continuation probabilities, language limitations, and morphology for both Modern Standard Arabic (MSA) and Dialectal Arabic (DA), which provide justification for updating this system. Most Arabic word analysis systems are based on the phonotactic morpho-syntactic analysis of a word transmitted using lexical rules, which are mainly used in MENA language technology tools, without taking into account contextual or semantic morphological implications. Therefore, it is necessary to have an automatic analysis tool taking into account the word sense and not only the morpho-syntactic category. Moreover, they are also based on statistical/stochastic models. These stochastic models, such as HMMs, have shown their effectiveness in different NLP applications: part-of-speech tagging, machine translation, speech recognition, etc. 
As an extension, we focus on language modeling using a Recurrent Neural Network (RNN). Given that morphological analysis coverage was very low in dialectal Arabic, it is important to investigate in depth how dialect data influence the accuracy of these approaches by developing dialectal morphological processing tools, to show that dialectal variability can help improve analysis.

Keywords: NLP, DL, ML, analyser, MSA, RNN, CNN

Procedia PDF Downloads 42

304 Acoustic Room Impulse Response Computation with Image Sources and Frequency Dependent Boundary Reflection Coefficients

Authors: Pratik Gandhi, Kavitha Chandra, Charles Thompson

Abstract:

A computational model of the acoustic room impulse response between transmitters and receivers located in an enclosed cavity under the influence of frequency-dependent reflection coefficients of the walls is presented. The characteristic features of the impulse responses that differentiate these results from frequency-independent reflecting surfaces are discussed. The image-source model is derived from the first-principles solution to Green's function of the acoustic wave equation. Post-processing of the computed impulse response with a band-pass filter to better represent the response of a loudspeaker is demonstrated.
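To make the image-source idea concrete, here is a minimal sketch restricted to a single axis between two parallel walls, with one frequency-independent reflection coefficient. That is the simpler case the abstract contrasts its results against, not the authors' frequency-dependent model. The function name, the parameter values, and the free-field 1/(4πd) spreading term are assumptions for this illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def image_source_arrivals(room_length, source_x, receiver_x, beta, max_order):
    """List (delay_s, amplitude, n_reflections) for images along one axis.

    Walls sit at x = 0 and x = room_length; beta is a single,
    frequency-independent reflection coefficient (a simplification).
    """
    arrivals = []
    for m in range(-max_order, max_order + 1):
        # Each m yields two images: 2mL + xs and 2mL - xs.
        for sign, n_reflections in ((1, abs(2 * m)), (-1, abs(2 * m - 1))):
            if n_reflections > max_order:
                continue
            image_x = 2 * m * room_length + sign * source_x
            distance = abs(image_x - receiver_x)
            if distance == 0:
                raise ValueError("source and receiver coincide")
            delay = distance / SPEED_OF_SOUND
            # Each wall bounce scales the amplitude by beta;
            # 1/(4*pi*d) is free-field spherical spreading.
            amplitude = beta ** n_reflections / (4 * math.pi * distance)
            arrivals.append((delay, amplitude, n_reflections))
    arrivals.sort()
    return arrivals


for delay, amp, order in image_source_arrivals(5.0, 1.0, 4.0, 0.8, 2):
    print(f"{delay * 1000:7.3f} ms  amplitude {amp:.5f}  reflections {order}")
```

In a frequency-dependent treatment, the scalar `beta` would become a function of frequency applied per reflection, which is where the approach described in the abstract departs from this sketch.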

Keywords: acoustic room impulse response, frequency dependent reflection coefficients, Green's function, image model

Procedia PDF Downloads 232

303 Emotion-Convolutional Neural Network for Perceiving Stress from Audio Signals: A Brain Chemistry Approach

Authors: Anup Anand Deshmukh, Catherine Soladie, Renaud Seguier

Abstract:

Emotion plays a key role in many applications like healthcare, to gather patients’ emotional behavior. Unlike typical ASR (Automated Speech Recognition) problems which focus on 'what was said', it is equally important to understand 'how it was said.' Certain emotions are given more importance due to their effectiveness in understanding human feelings. In this paper, we propose an approach that models human stress from audio signals. The research challenge in speech emotion detection is finding the appropriate set of acoustic features corresponding to an emotion. Another difficulty lies in defining the very meaning of emotion and being able to categorize it in a precise manner. Supervised Machine Learning models, including state-of-the-art Deep Learning classification methods, rely on the availability of clean and labelled data. One of the problems in affective computation is the limited amount of annotated data. The existing labelled emotion datasets are highly dependent on the perception of the annotator. We address the first issue of feature selection by exploiting the use of traditional MFCC (Mel-Frequency Cepstral Coefficients) features in a Convolutional Neural Network. Our proposed Emo-CNN (Emotion-CNN) architecture treats speech representations in a manner similar to how CNNs treat images in a vision problem. Our experiments show that Emo-CNN consistently and significantly outperforms the popular existing methods over multiple datasets. It achieves 90.2% categorical accuracy on the Emo-DB dataset. We claim that Emo-CNN is robust to speaker variations and environmental distortions. The proposed approach achieves 85.5% speaker-dependent categorical accuracy on the SAVEE (Surrey Audio-Visual Expressed Emotion) dataset, beating the existing CNN-based approach by 10.2%. To tackle the second problem of subjectivity in stress labels, we use Lovheim’s cube, which is a 3-dimensional projection of emotions. 
Monoamine neurotransmitters are a type of chemical messenger in the brain that transmit signals on perceiving emotions. The cube aims at explaining the relationship between these neurotransmitters and the positions of emotions in 3D space. The learnt emotion representations from the Emo-CNN are mapped to the cube using three-component PCA (Principal Component Analysis), and this mapping is then used to model human stress. The proposed approach not only circumvents the need for labelled stress data but also complies with the psychological theory of emotions given by Lovheim’s cube. We believe that this work is the first step towards creating a connection between Artificial Intelligence and the chemistry of human emotions.

Keywords: deep learning, brain chemistry, emotion perception, Lovheim's cube

Procedia PDF Downloads 154

302 Encephalon-An Implementation of a Handwritten Mathematical Expression Solver

Authors: Shreeyam, Ranjan Kumar Sah, Shivangi

Abstract:

Recognizing and solving handwritten mathematical expressions can be a challenging task, particularly when certain characters are segmented and classified. This project proposes a solution that uses Convolutional Neural Network (CNN) and image processing techniques to accurately solve various types of equations, including arithmetic, quadratic, and trigonometric equations, as well as logical operations like logical AND, OR, NOT, NAND, XOR, and NOR. The proposed solution also provides a graphical solution, allowing users to visualize equations and their solutions. In addition to equation solving, the platform, called CNNCalc, offers a comprehensive learning experience for students. It provides educational content, a quiz platform, and a coding platform for practicing programming skills in different languages like C, Python, and Java. This all-in-one solution makes the learning process engaging and enjoyable for students. The proposed methodology includes horizontal compact projection analysis and survey for segmentation and binarization, as well as connected component analysis and integrated connected component analysis for character classification. The compact projection algorithm compresses the horizontal projections to remove noise and obtain a clearer image, contributing to the accuracy of character segmentation. Experimental results demonstrate the effectiveness of the proposed solution in solving a wide range of mathematical equations. CNNCalc provides a powerful and user-friendly platform for solving equations, learning, and practicing programming skills. With its comprehensive features and accurate results, CNNCalc is poised to revolutionize the way students learn and solve mathematical equations. The platform utilizes a custom-designed Convolutional Neural Network (CNN) with image processing techniques to accurately recognize and classify symbols within handwritten equations. 
The compact projection algorithm effectively removes noise from horizontal projections, leading to clearer images and improved character segmentation. Experimental results demonstrate the accuracy and effectiveness of the proposed solution in solving a wide range of equations, including arithmetic, quadratic, trigonometric, and logical operations. CNNCalc features a user-friendly interface with a graphical representation of equations being solved, making it an interactive and engaging learning experience for users. The platform also includes tutorials, testing capabilities, and programming features in languages such as C, Python, and Java. Users can track their progress and work towards improving their skills. CNNCalc is poised to revolutionize the way students learn and solve mathematical equations with its comprehensive features and accurate results.
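The abstract does not spell out the "compact projection" algorithm, so the sketch below shows only the plain projection-profile idea it appears to build on: count ink pixels along one axis of a binarized image and cut wherever the count drops to zero. Projecting onto the horizontal axis (column sums) to separate side-by-side symbols, the function names, and the tiny test image are assumptions for illustration.

```python
def column_projection(image):
    """Count ink pixels (value 1) in every column of a binarized image."""
    return [sum(column) for column in zip(*image)]


def split_on_gaps(profile):
    """Return (start, end) index ranges where the profile is non-zero.

    'end' is exclusive, so image columns start..end-1 hold one symbol.
    """
    segments = []
    start = None
    for index, value in enumerate(profile):
        if value > 0 and start is None:
            start = index  # a run of ink begins
        elif value == 0 and start is not None:
            segments.append((start, index))  # a blank column ends the run
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments


# Hypothetical 2-row binary image holding two separate strokes.
image = [
    [0, 1, 1, 0, 0, 1],
    [0, 1, 0, 0, 0, 1],
]
profile = column_projection(image)
print(profile)
print(split_on_gaps(profile))
```

Each resulting column range could then be cropped and passed to a classifier. Touching or overlapping symbols would need the connected component analysis the abstract also mentions.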

Keywords: AI, ML, handwritten equation solver, maths, computer, CNNCalc, convolutional neural networks

Procedia PDF Downloads 122

301 Blind Speech Separation Using SRP-PHAT Localization and Optimal Beamformer in Two-Speaker Environments

Authors: Hai Quang Hong Dam, Hai Ho, Minh Hoang Le Ngo

Abstract:

This paper investigates the problem of blind speech separation from the speech mixture of two speakers. A voice activity detector employing the Steered Response Power - Phase Transform (SRP-PHAT) is presented for detecting the activity information of the speech sources, and the desired speech signals are then extracted from the speech mixture by using an optimal beamformer. To evaluate the effectiveness of the algorithm, a simulation using real speech recordings was performed in a double-talk situation where two speakers are active all the time. The evaluation shows that the proposed blind speech separation algorithm offers a good interference suppression level whilst maintaining a low distortion level of the desired signal.
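SRP-PHAT accumulates phase-transform-weighted cross-correlations over microphone pairs at candidate steering delays. The sketch below shows only that pairwise building block, a GCC-PHAT delay estimate between two channels, using a naive DFT so that it needs nothing beyond the standard library. It is not the authors' detector or beamformer, and the function names and test signals are assumptions.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (fine for short frames)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def gcc_phat_delay(sig, ref):
    """Estimate the circular delay (in samples) of sig relative to ref."""
    n = len(sig)
    cross = [s * r.conjugate() for s, r in zip(dft(sig), dft(ref))]
    # PHAT weighting: keep only the phase of the cross-spectrum.
    whitened = [c / abs(c) if abs(c) > 1e-12 else 0 for c in cross]
    correlation = [v.real for v in dft(whitened, inverse=True)]
    peak = max(range(n), key=lambda t: correlation[t])
    # Map lags in the upper half of the frame to negative delays.
    return peak if peak <= n // 2 else peak - n


# Hypothetical two-channel frame: the same click, 3 samples later on 'sig'.
ref = [0.0] * 16
sig = [0.0] * 16
ref[2] = 1.0
sig[5] = 1.0
print(gcc_phat_delay(sig, ref))
```

In a full SRP-PHAT search, the whitened correlation of every microphone pair would be evaluated at the delay implied by each candidate source position and summed, with the maximum indicating the active source.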

Keywords: blind speech separation, voice activity detector, SRP-PHAT, optimal beamformer

Procedia PDF Downloads 283

300 An Adaptive Decomposition for the Variability Analysis of Observation Time Series in Geophysics

Authors: Olivier Delage, Thierry Portafaix, Hassan Bencherif, Guillaume Guimbretiere

Abstract:

Most observation data sequences in geophysics can be interpreted as resulting from the interaction of several physical processes at several time and space scales. As a consequence, measurement time series in geophysics often have characteristics of non-linearity and non-stationarity; they thereby exhibit strong fluctuations at all time scales and require a time-frequency representation to analyze their variability. Empirical Mode Decomposition (EMD) is a relatively new technique that forms part of a more general signal processing method called the Hilbert-Huang transform. This analysis method turns out to be particularly suitable for non-linear and non-stationary signals. It consists in decomposing a signal, in a self-adaptive way, into a sum of oscillating components named IMFs (Intrinsic Mode Functions), and thereby acts as a bank of bandpass filters. The advantages of the EMD technique are that it is entirely data driven and provides the principal variability modes of the dynamics represented by the original time series. However, the main limiting factor is the frequency resolution, which may give rise to the mode-mixing phenomenon, where the spectral contents of some IMFs overlap each other. To overcome this problem, J. Gilles proposed an alternative entitled “Empirical Wavelet Transform” (EWT), which consists in building a bank of filters from the segmentation of the Fourier spectrum of the original signal. The method is based on the idea used in the construction of both Littlewood-Paley and Meyer’s wavelets. The heart of the method lies in segmenting the Fourier spectrum based on local maxima detection in order to obtain a set of non-overlapping segments. Because it is linked to the Fourier spectrum, the frequency resolution provided by EWT is higher than that provided by EMD and therefore makes it possible to overcome the mode-mixing problem. 
On the other hand, although the EWT technique is able to detect the frequencies involved in the fluctuations of the original time series, it does not make it possible to associate the detected frequencies with a specific mode of variability, as the EMD technique does. Because EMD is closer to the observation of physical phenomena than EWT, we propose here a new technique called EAWD (Empirical Adaptive Wavelet Decomposition), based on coupling the EMD and EWT techniques by using the spectral density content of the IMFs to optimize the segmentation of the Fourier spectrum required by EWT. In this study, the EMD and EWT techniques are described, and then the EAWD technique is presented. Results obtained by the EMD, EWT and EAWD techniques on time series of ozone total columns recorded at Reunion Island over the 1978-2019 period are compared and discussed. This study was carried out as part of the SOLSTYCE project, dedicated to the characterization and modeling of the underlying dynamics of time series issued from complex systems in atmospheric sciences.
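The spectrum segmentation step described for EWT (detect local maxima, then cut the spectrum into non-overlapping segments) can be sketched as follows. Placing each boundary midway between neighbouring retained maxima is one simple rule, assumed here for illustration. The function names and the toy magnitude spectrum are also assumptions, and the EAWD refinement based on IMF spectral content is not reproduced.

```python
def local_maxima(spectrum):
    """Indices of interior bins that rise above their left neighbour
    and are not lower than their right neighbour."""
    return [
        i
        for i in range(1, len(spectrum) - 1)
        if spectrum[i - 1] < spectrum[i] >= spectrum[i + 1]
    ]


def segment_spectrum(spectrum, n_bands):
    """Return boundary positions (in bins) separating the n_bands strongest peaks."""
    peaks = local_maxima(spectrum)
    # Keep the n_bands largest maxima, then restore frequency order.
    strongest = sorted(peaks, key=lambda i: spectrum[i], reverse=True)[:n_bands]
    strongest.sort()
    # One boundary midway between each pair of neighbouring retained peaks.
    return [(a + b) / 2 for a, b in zip(strongest, strongest[1:])]


# Hypothetical magnitude spectrum with three peaks (bins 2, 6 and 10).
spectrum = [0, 1, 5, 1, 0, 2, 8, 2, 0, 1, 3, 1, 0]
print(segment_spectrum(spectrum, 3))
print(segment_spectrum(spectrum, 2))
```

Each pair of consecutive boundaries then defines the support of one band-pass filter in the bank.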

Keywords: adaptive filtering, empirical mode decomposition, empirical wavelet transform, filter banks, mode-mixing, non-linear and non-stationary time series, wavelet

Procedia PDF Downloads 137

299 Carrier Communication through Power Lines

Authors: Pavuluri Gopikrishna, B. Neelima

Abstract:

Power line carrier communication means transmitting audio power over a power line and receiving the amplified audio power at the receiver as a speaker output signal, using the power line as the channel medium. The main objective of this work is to transmit a message signal after frequency modulation with the FM modulator IC LM565, whose output is proportional to the voltage of the input message signal. The audio power is received from the power line through an isolation circuit and demodulated by an IC LM565, which uses the PLL concept and delivers the FM-demodulated signal to the listener. The message signal is transmitted on the carrier signal generated by the FM modulator IC LM565. With this approach the message signal is not damaged, because it has no direct contact with the power line, but noise can still disturb the information.
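The frequency modulation step can be illustrated in software. This is a numerical sketch of the FM principle, not a model of the LM565 circuit or of the power-line channel, and the function name, carrier frequency, deviation and sample rate are all assumed values.

```python
import math


def fm_modulate(message, carrier_hz, deviation_hz, sample_rate):
    """Frequency-modulate a carrier with a message normalised to [-1, 1].

    The instantaneous frequency is carrier_hz + deviation_hz * m[n];
    the phase is the running sum (discrete integral) of that frequency.
    """
    phase = 0.0
    samples = []
    for m in message:
        instantaneous_hz = carrier_hz + deviation_hz * m
        phase += 2 * math.pi * instantaneous_hz / sample_rate
        samples.append(math.cos(phase))
    return samples


# Hypothetical 50 Hz test tone on a 1 kHz carrier sampled at 8 kHz.
rate = 8000
tone = [math.sin(2 * math.pi * 50 * n / rate) for n in range(rate // 10)]
modulated = fm_modulate(tone, carrier_hz=1000, deviation_hz=200, sample_rate=rate)
print(len(modulated), max(modulated) <= 1.0)
```

A PLL-based demodulator, as described in the abstract, recovers the message by tracking this instantaneous frequency; the control voltage of the loop follows the original signal.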

Keywords: amplification, fm demodulator ic 565, fm modulator ic 565, phase locked loop, power isolation

Procedia PDF Downloads 552

298 3D Microscopy, Image Processing, and Analysis of Lymphangiogenesis in Biological Models

Authors: Thomas Louis, Irina Primac, Florent Morfoisse, Tania Durre, Silvia Blacher, Agnes Noel

Abstract:

In vitro and in vivo lymphangiogenesis assays are essential for the identification of potential lymphangiogenic agents and the screening of pharmacological inhibitors. In the present study, we analyse three biological models: in vitro lymphatic endothelial cell spheroids, in vivo ear sponge assay, and in vivo lymph node colonisation by tumour cells. These assays provide suitable 3D models to test pro- and anti-lymphangiogenic factors or drugs. 3D images were acquired by confocal laser scanning and light sheet fluorescence microscopy. Virtual scan microscopy followed by 3D reconstruction by image aligning methods was also used to obtain 3D images of whole large sponge and ganglion samples. 3D reconstruction, image segmentation, skeletonisation, and other image processing algorithms are described. Fixed and time-lapse imaging techniques are used to analyse the behaviour of lymphatic endothelial cell spheroids. The study of cell spatial distribution in spheroid models makes it possible to detect interactions between cells and to identify invasion hierarchy and guidance patterns. Global properties such as volume, length, and density of lymphatic vessels are measured in both in vivo models. Branching density and tortuosity evaluation are also proposed to determine structure complexity. Those properties, combined with vessel spatial distribution, are evaluated in order to determine the extent of lymphangiogenesis. Lymphatic endothelial cell invasion and lymphangiogenesis were evaluated under various experimental conditions. The comparison of these conditions makes it possible to identify lymphangiogenic agents and to better comprehend their roles in the lymphangiogenesis process. The proposed methodology is validated by its application to the three presented models.
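The abstract does not say which tortuosity definition is used. One common choice, assumed here purely for illustration, is the arc-chord ratio: the length of a skeletonised vessel branch divided by the straight-line distance between its end points. The function names and coordinates are invented, and `math.dist` requires Python 3.8 or later.

```python
import math


def path_length(points):
    """Sum of straight segment lengths along an ordered 3D polyline."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def tortuosity(points):
    """Arc-chord ratio: 1.0 for a straight branch, larger when it winds."""
    chord = math.dist(points[0], points[-1])
    if chord == 0:
        raise ValueError("closed path: chord length is zero")
    return path_length(points) / chord


# Hypothetical skeleton branch given as ordered voxel coordinates.
branch = [(0, 0, 0), (1, 0, 0), (1, 1, 0)]
print(round(tortuosity(branch), 4))
```

Applied per branch of a 3D skeleton, the ratio gives a simple per-vessel complexity score that can be averaged alongside branching density.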

Keywords: 3D image segmentation, 3D image skeletonisation, cell invasion, confocal microscopy, ear sponges, light sheet microscopy, lymph nodes, lymphangiogenesis, spheroids

Procedia PDF Downloads 378

297 Comparative Study of Affricate Initial Consonants in Chinese and Slovak

Authors: Maria Istvanova

Abstract:

The purpose of the comparative study of the affricate consonants in Chinese and Slovak is to increase awareness of the main distinguishing features between these two languages with respect to this particular group of consonants. This study determines the main difficulties of Slovak learners in acquiring the correct pronunciation of affricate initial consonants in Chinese, based on an understanding of the distinguishing features of Chinese and Slovak affricates in combination with experimental measurement of VOT values. The software tool Praat is used for the analysis of the recorded language samples. The language samples contain recordings of a Chinese native speaker and Slovak students of Chinese with different language proficiency levels. Based on the results of the analysis in Praat, the study identifies erroneous pronunciation and provides clarification of its cause.

Keywords: Chinese, comparative study, initial consonants, pronunciation, Slovak

Procedia PDF Downloads 159

296 Electronic Marketing Applied to Tourism Case Study

Authors: Ahcene Boucied

Abstract:

In this paper, a case study is conducted to analyze the effectiveness of web pages designed in Barbados for the tourism and hospitality industry. The assessment is made from two perspectives: to understand how the Barbados’ tourism industry is using the web, and to identify the effect of information technology on economic issues. In return, this is used: (a) to provide interested parties with accurate information and marketing insight necessary for decision making for electronic commerce/e-commerce, and (b) to demonstrate pragmatic difficulties in searching and designing web pages.

Keywords: segmentation, tourism stakeholders, destination marketing, case study

Procedia PDF Downloads 421

295 A Comparative Study on Compliment Response between Indonesian EFL Students and English Native Speakers

Authors: Maria F. Seran

Abstract:

In second language interaction, an EFL student always carries his knowledge of the target language and is sometimes influenced by his first-language culture, which leads him to transfer utterances from the first language to the second. The influence of L1 culture can lead to a face-threatening act when responding to a speech act such as a compliment. A speaker pays a compliment to show gratitude and, in return, expects a compliment response from the hearer. While Western people make more use of the acceptance continuum in compliment responses, Indonesians make more use of the denial continuum, which can put the speakers in a face-threatening situation and cause offense. This study investigated the compliment responses employed by EFL students and English native speakers. The study is distinct in that no previous compliment response study had compared English native speakers with two Indonesian EFL groups of different proficiency; this research sought to meet that need. The study is significant for EFL teachers because it gives insight into cross-cultural understanding and has pedagogical implications for explicit pragmatic instruction. Two research questions were set: 1. How do Indonesian EFL students and English native speakers respond to compliments? 2. Is there any correlation between Indonesian EFL students’ proficiency and their use of compliment responses in English? The study involved three groups of participants: 5 English native speakers, 10 high-proficiency and 10 low-proficiency Indonesian EFL university students. The research instruments were an online TOEFL prediction test focusing on grammar skills, modified from the Barron TOEFL exercise test, and a discourse completion task (DCT) consisting of 10 compliment response items. 
Following the research invitation, 20 second-year university students majoring in English education at Widya Mandira Catholic University, Kupang, East Nusa Tenggara, Indonesia, who willingly participated in the research took the TOEFL prediction test online via the link provided. Students who achieved a score of 75-100 on the test were categorized as high-proficiency students, while students who attained a score below 74 were considered low-proficiency students. Then, the DCT survey was administered to these EFL groups and the native speaker group. Participants’ responses were coded and analyzed using the categories of the compliment response framework proposed by Tran. The study found that the 5 native speakers used more compliment upgrades and appreciation tokens in their compliment responses, whereas the Indonesian EFL students combined several compliment response strategies in their utterances, such as appreciation token, return and compliment downgrade. There was no correlation between students’ proficiency level and their compliment responses (CR), as most EFL students in both groups produced less varied compliment responses, and only 4 Indonesian high-proficiency students produced more varied responses similar to those of the native speakers. The combination strategies used by EFL students can be explained as the influence of pragmatic transfer from L1 to L2; therefore, EFL teachers should explicitly teach more compliment response strategies to raise students’ awareness of English culture and develop their speaking so that it is as close to native-speaker competence as possible.

Keywords: compliment response, English native speakers, Indonesian EFL students, speech acts

Procedia PDF Downloads 148

294 Discourse Markers in Chinese University Students and Native English Speakers: A Corpus-Based Study

Authors: Dan Xie

Abstract:

The use of discourse markers (DMs) can play a crucial role in representing discourse interaction and pragmatic competence. Learners’ use of DMs and differences between native speakers (NSs) and non-native speakers (NNSs) in the use of various DMs have been the focus of considerable research attention. However, some commonly used DMs, such as you know, have not received as much attention in comparative studies, especially in the Chinese context. This study analyses data in two corpora (COLSEC and Spoken BNC 2014 (14-25)) to investigate how Chinese learners differ from NSs in their use of the DM you know and its functions in speech. The results show that there is a significant difference between the two corpora in terms of the frequency of use of you know. In terms of the functions of you know, the study shows that six functions can all be present in both corpora, although there are significant differences between the five functional dimensions, especially in introducing a claim linked to the prior discourse and highlighting particular points in the discourse. It is hoped to show empirically how Chinese learners and NSs use DMs differently.

Keywords: you know, discourse marker, native speaker, Chinese learner

Procedia PDF Downloads 81

293 Cross-Language Variation and the ‘Fused’ Zone in Bilingual Mental Lexicon: An Experimental Research

Authors: Yuliya E. Leshchenko, Tatyana S. Ostapenko

Abstract:

Language variation is a widespread linguistic phenomenon which can affect different levels of a language system: phonological, morphological, lexical, syntactic, etc. It is obvious that the scope of possible standard alternations within a particular language is limited by a variety of its norms and regulations which set more or less clear boundaries for what is possible and what is not possible for the speakers. The possibility of lexical variation (alternate usage of lexical items within the same contexts) is based on the fact that the meanings of words are not clearly and rigidly defined in the consciousness of the speakers. Therefore, lexical variation is usually connected with unstable relationship between words and their referents: a case when a particular lexical item refers to different types of referents, or when a particular referent can be named by various lexical items. We assume that the scope of lexical variation in bilingual speech is generally wider than that observed in monolingual speech due to the fact that, besides ‘lexical item – referent’ relations it involves the possibility of cross-language variation of L1 and L2 lexical items. We use the term ‘cross-language variation’ to denote a case when two equivalent words of different languages are treated by a bilingual speaker as freely interchangeable within the common linguistic context. As distinct from code-switching which is traditionally defined as the conscious use of more than one language within one communicative act, in case of cross-language lexical variation the speaker does not perceive the alternate lexical items as belonging to different languages and, therefore, does not realize the change of language code. In the paper, the authors present research of lexical variation of adult Komi-Permyak – Russian bilingual speakers. 
The two languages co-exist on the territory of the Komi-Permyak District in Russia (Komi-Permyak as the ethnic language and Russian as the official state language), are usually acquired from birth in natural linguistic environment and, according to the data of sociolinguistic surveys, are both identified by the speakers as coordinate mother tongues. The experimental research demonstrated that alternation of Komi-Permyak and Russian words within one utterance/phrase is highly frequent both in speech perception and production. Moreover, our participants estimated cross-language word combinations like ‘маленькая /Russian/ нывка /Komi-Permyak/’ (‘a little girl’) or ‘мунны /Komi-Permyak/ домой /Russian/’ (‘go home’) as regular/habitual, containing no violation of any linguistic rules and being equally possible in speech as the equivalent intra-language word combinations (‘учöтик нывка’ /Komi-Permyak/ or ‘идти домой’ /Russian/). All the facts considered, we claim that constant concurrent use of the two languages results in the fact that a large number of their words tend to be intuitively interpreted by the speakers as lexical variants not only related to the same referent, but also referring to both languages or, more precisely, to none of them in particular. Consequently, we can suppose that bilingual mental lexicon includes an extensive ‘fused’ zone of lexical representations that provide the basis for cross-language variation in bilingual speech.

Keywords: bilingualism, bilingual mental lexicon, code-switching, lexical variation

Procedia PDF Downloads 148

292 Subtitling in the Classroom: Combining Language Mediation, ICT and Audiovisual Material

Authors: Rossella Resi

Abstract:

This paper describes a project carried out in an Italian school with pupils learning English, combining three didactic tools that are attested to be relevant to the success of young learners’ language curriculum: the use of technology, intralingual and interlingual mediation (according to the CEFR), and the cultural dimension. The aim of this project was to test a technological hands-on translation activity like subtitling in a formal teaching context and to exploit its potential as a motivational tool for developing listening, writing, translation and cross-cultural skills among language learners. The activities proposed involved the use of professional subtitling software called Aegisub and culture-specific films. The workshop was optional, so motivation was entirely based on the pleasure of engaging in the use of a realistic subtitling program and on the challenge of meeting the constraints that a real life/work situation might involve. Twelve pupils aged between 16 and 18 attended the afternoon workshop. The workshop was organized in three parts: (i) An introduction in which the learners were introduced to the concept and constraints of subtitling and provided with a few basic rules on spotting and segmentation. During this session learners also had time to familiarize themselves with the main software features. (ii) The second part involved three subtitling activities in plenum or in groups. In the first activity the learners experienced the technical dimensions of subtitling. They were provided with a short video segment together with its transcription to be segmented and time-spotted. The second activity also involved oral comprehension. Learners had to understand and transcribe a video segment before subtitling it. The third activity embedded the translation of a provided transcription, including segmentation and spotting of subtitles. (iii) The workshop ended with a small final project. 
At this point learners were able to master a short subtitling assignment (transcription, translation, segmenting and spotting) on their own with a similar video interview. The results of these assignments were above expectations since the learners were highly motivated by the authentic and original nature of the assignment. The subtitled videos were evaluated and watched in the regular classroom together with other students who did not take part in the workshop.

Keywords: ICT, L2, language learning, language mediation, subtitling

Procedia PDF Downloads 416

291 An Analysis of Conversation Structure of Oprah Winfrey and Justin Bieber Utterances on The Oprah Winfrey Show

Authors: Najib Khumaidillah

Abstract:

Holding a good conversation requires skill in keeping it flowing. A host like Oprah Winfrey and an artist like Justin Bieber also need to pay attention to these skills. This study aims to describe the turn-taking strategies and adjacency pairs used by the speakers. The data come from one segment of the transcript of The Oprah Winfrey Show with Justin Bieber. They are analyzed using Stenström’s turn-taking theory and adjacency pair theory. The analysis found that both speakers use various turn-taking strategies and adjacency pairs. These findings are hoped to serve as an example for non-native English speakers holding conversations in English and to advance people’s understanding of how to organize a good conversation structure.

Keywords: adjacency pairs, conversation structure, the Oprah Winfrey show, turn taking

Procedia PDF Downloads 195