Search results for: percussive sounds
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 173

173 A Combined Feature Extraction and Thresholding Technique for Silence Removal in Percussive Sounds

Authors: B. Kishore Kumar, Pogula Rakesh, T. Kishore Kumar

Abstract:

Music analysis is a branch of audio content analysis that examines music using different features of the audio signal. The first step in music analysis is to divide the music signal into sections based on its feature profiles. In this paper, we present a technique that segments the signal and applies thresholding to remove silence from the sounds produced by percussive instruments. It uses two features of the music signal, namely signal energy and spectral centroid. The proposed method imposes thresholds on both features, and these thresholds vary depending on the music signal. Based on the thresholds, the silent parts are removed and the signal is segmented. The effectiveness of the proposed method is analyzed using MATLAB.
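The two-feature thresholding idea in this abstract can be illustrated with a minimal sketch. It is not the authors' MATLAB implementation: the frame length, the naive DFT, and the rule that sets each threshold to a fraction (`weight`) of the feature's mean over all frames are assumptions made only for illustration, and the names `frame_features` and `remove_silence` are hypothetical.

```python
import cmath
import math


def frame_features(frame, sample_rate):
    """Return (short-time energy, spectral centroid in Hz) for one frame."""
    n = len(frame)
    energy = sum(x * x for x in frame) / n
    # Naive DFT magnitudes for the non-negative frequency bins.
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    total = sum(mags)
    if total == 0:
        return energy, 0.0  # an all-zero frame has no meaningful centroid
    centroid = sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total
    return energy, centroid


def remove_silence(signal, sample_rate, frame_len=64, weight=0.5):
    """Keep only frames whose energy AND centroid exceed signal-dependent thresholds."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    if not frames:
        return []
    feats = [frame_features(f, sample_rate) for f in frames]
    # Assumed threshold rule: a fraction of each feature's mean over all frames.
    energy_thr = weight * sum(e for e, _ in feats) / len(feats)
    centroid_thr = weight * sum(c for _, c in feats) / len(feats)
    kept = []
    for frame, (e, c) in zip(frames, feats):
        if e > energy_thr and c > centroid_thr:
            kept.extend(frame)
    return kept
```

Because both thresholds are derived from the signal itself, they adapt to the recording level, which matches the abstract's remark that the thresholds vary with the music signal. A production version would normally use an FFT and overlapping frames.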

Keywords: percussive sounds, spectral centroid, spectral energy, silence removal, feature extraction

Procedia PDF Downloads 554
172 A Novel Method for Silence Removal in Sounds Produced by Percussive Instruments

Authors: B. Kishore Kumar, Rakesh Pogula, T. Kishore Kumar

Abstract:

The steepness of an audio signal produced by musical instruments, specifically percussive instruments, is the perception of how high or low a tone is, and it can be considered a frequency closely related to the fundamental frequency. This paper presents a novel method for silence removal and segmentation of music signals produced by percussive instruments, and the performance of the proposed method is studied with the help of MATLAB simulations. The method is based on two simple features, namely the signal energy and the spectral centroid. Once the feature sequences are extracted, a simple thresholding criterion is applied to remove the silent areas in the sound signal. The simulations were carried out on various instruments such as drum, flute, and guitar, and the results of the proposed method were analyzed.

Keywords: percussive instruments, spectral energy, spectral centroid, silence removal

Procedia PDF Downloads 369
171 Difficulties in Pronouncing the English Bilabial Plosive Sounds among EFL Students

Authors: Ali Mohammed Saleh Al-Hamzi

Abstract:

This study aims to identify the most difficult position for pronouncing the bilabial plosive sounds among fourth-level English as a foreign language (EFL) students of the Faculty of Education, Mahweet, Sana’a University in Yemen. The subjects of this study were 50 EFL students aged 22-25. In describing sounds according to their place of articulation, sounds are classified as bilabial, labiodental, dental, alveolar, post-alveolar, palato-alveolar, retroflex, palatal, velar, uvular, and glottal. In much the same way, sounds can be described by their manner of articulation as plosives, nasals, affricates, flaps, taps, rolls, fricatives, laterals, frictionless continuants, and semi-vowels. For EFL students in Yemen, some sounds are difficult to pronounce. In this study, the researcher focuses on difficulties in pronouncing the English bilabial plosive sounds among EFL students. These sounds can occur in initial, medial, and final positions. The problem discussed in this study was: which position is the most difficult for pronouncing the English bilabial plosive sounds? To answer this question, a descriptive qualitative method was used. The data were collected from each English bilabial plosive sound produced by the students. Finally, the researcher concluded that the most difficult position for pronouncing the English bilabial plosive sounds is when /p/ and /b/ occur word-finally, where both are voiceless.

Keywords: difficulty, EFL students’ pronunciation, bilabial sounds, plosive sounds

Procedia PDF Downloads 116
170 Heart Murmurs and Heart Sounds Extraction Using an Algorithm Process Separation

Authors: Fatima Mokeddem

Abstract:

The phonocardiogram (PCG) signal is a physiological signal that reflects the mechanical activity of the heart. It is a promising tool for researchers in this field because it is full of indications and useful information for medical diagnosis. PCG segmentation is a basic step in benefiting from this signal. This paper therefore presents an algorithm that separates heart sounds from heart murmurs, where murmurs exist, so that they can be used in several applications and in heart sound analysis. The separation process presented here is founded on three essential steps: filtering, envelope detection, and heart sound segmentation. The algorithm separates the PCG signal into S1 and S2 and extracts cardiac murmurs.
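The three steps named in this abstract (filtering, envelope detection, segmentation) can be sketched generically as follows. The abstract does not specify the filter, the envelope detector, or the segmentation rule, so the moving-average smoother, the rectify-and-smooth envelope, and the fixed threshold below are assumptions, and all function names are hypothetical.

```python
def moving_average(values, width):
    """Smooth a sequence with a centered moving average (a crude low-pass filter)."""
    half = width // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def envelope(signal, width=5):
    """Rectify the signal, then smooth it to obtain an amplitude envelope."""
    return moving_average([abs(x) for x in signal], width)


def segment(env, threshold):
    """Return (start, end) index pairs where the envelope exceeds the threshold."""
    segments, start = [], None
    for i, value in enumerate(env):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(env)))
    return segments
```

The sketch only locates bursts of activity. Deciding which burst is S1, which is S2, and what counts as a murmur requires timing and amplitude rules that the abstract does not describe.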

Keywords: phonocardiogram signal, filtering, envelope detection, murmurs, heart sounds

Procedia PDF Downloads 108
169 Parameter Selection and Monitoring for Water-Powered Percussive Drilling in Green-Fields Mineral Exploration

Authors: S. J. Addinell, T. Richard, B. Evans

Abstract:

The Deep Exploration Technologies Cooperative Research Centre (DET CRC) is researching and developing a new coiled tubing based greenfields mineral exploration drilling system utilising downhole water powered percussive drill tooling. This new drilling system is aimed at significantly reducing the costs associated with identifying mineral resource deposits beneath deep, barren cover. This system has shown superior rates of penetration in water-rich hard rock formations at depths exceeding 500 meters. Several key challenges exist regarding the deployment and use of these bottom hole assemblies for mineral exploration, and this paper discusses some of the key technical challenges. This paper presents experimental results obtained from the research program during laboratory and field testing of the prototype drilling system. A study of the morphological aspects of the cuttings generated during the percussive drilling process is presented and shows a strong power law relationship for particle size distributions. Several percussive drilling parameters such as RPM, applied fluid pressure and weight on bit have been shown to influence the particle size distributions of the cuttings generated. This has direct influence on other drilling parameters such as flow loop performance, cuttings dewatering, and solids control. Real-time, accurate knowledge of percussive system operating parameters will assist the driller in maximising the efficiency of the drilling process. The applied fluid flow, fluid pressure, and rock properties are known to influence the natural oscillating frequency of the percussive hammer, but this paper also shows that drill bit design, drill bit wear and the applied weight on bit can also influence the oscillation frequency. Due to the changing drilling conditions and therefore changing operating parameters, real-time understanding of the natural operating frequency is paramount to achieving system optimisation.
Several techniques to understand the oscillating frequency have been investigated and presented. With a conventional top drive drilling rig, spectral analysis of applied fluid pressure, hydraulic feed force pressure, hold back pressure and drill string vibrations has shown the presence of the operating frequency of the bottom hole tooling. Unfortunately, with a coiled tubing drilling rig, which uses a positive displacement downhole motor to provide drill bit rotation, these signals are not available for interrogation at the surface and therefore another method must be considered. The investigation and analysis of ground vibrations using geophone sensors, similar to seismic-while-drilling techniques, has indicated the presence of the natural oscillating frequency of the percussive hammer. This method is shown to provide a robust technique for the determination of the downhole percussive oscillation frequency when used with a coiled tubing drill rig.
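As a rough illustration of how spectral analysis can reveal an operating frequency in a recorded pressure or vibration trace, the sketch below picks the strongest non-DC bin of a naive discrete Fourier transform. It is not the authors' processing chain: the function name `dominant_frequency` is hypothetical, and real geophone data would normally need windowing, averaging, and an FFT.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest non-DC spectral component."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]  # remove the static offset
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        acc = sum(centered[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * sample_rate / n
```

The frequency resolution of this estimate is `sample_rate / n`, so a longer record gives a finer estimate at the cost of slower response to changing drilling conditions.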

Keywords: cuttings characterization, drilling optimization, oscillation frequency, percussive drilling, spectral analysis

Procedia PDF Downloads 198
168 Investigating Underground Explosion-Like Sounds in Sarableh City and Its Possible Connection with Geological Hazards

Authors: Hosein Almasikia

Abstract:

Sarableh City is located in the west of Iran, in the Zagros seismic zone. After the Azgole-Sarpol Zahab earthquake with a magnitude of 3.7 Richter on November 21, 2016, people in some parts of Sarableh City heard horrible sounds. Some residents also reported a sound similar to the grinding of a mill. Vibration studies and field investigations showed that these sounds have a geological origin, are emitted from the ground to the surface, and may be related to geological hazards such as landslides and the collapse of karstic zones. This study attempts to investigate the possible relationship between these abnormal sounds and geological hazards.

Keywords: Sarableh, Zagros, landslide, karstic zone

Procedia PDF Downloads 24
167 The Influence of Music Education and the Order of Sounds on the Grouping of Sounds into Sequences of Six Tones

Authors: Adam Rosiński

Abstract:

This paper discusses an experiment conducted with two groups of participants, composed of musicians and non-musicians, in order to investigate the impact of the speed of a sound sequence and the order of sounds on the grouping of sounds into sequences of six tones. Significant differences were observed between musicians and non-musicians with respect to the threshold sequence speed at which the sequence was split into two streams. The differences in the results for the two groups suggest that the musical education of the participating listeners may be a vital factor. The criterion of musical education should be taken into account during experiments so that the results obtained are reliable, uniform, and free from interpretive errors.

Keywords: auditory scene analysis, education, hearing, psychoacoustics

Procedia PDF Downloads 60
166 Development of Sound Tactile Interface by Use of Human Sensation of Stiffness

Authors: K. Doi, T. Nishimura, M. Umeda

Abstract:

There are very few sound interfaces that both healthy people and hearing handicapped people can use to play together. In this study, we developed a sound tactile interface that makes use of the human sensation of stiffness. The interface comprises eight elastic objects having varying degrees of stiffness. Each elastic object is shaped like a column. When people with and without hearing disabilities press each elastic object, different sounds are produced depending on the stiffness of the elastic object. The types of sounds used were “Do Re Mi sounds.” The interface has a major advantage in that people with or without hearing disabilities can play with it. We found that users were able to recognize the hardness sensation and relate it to the corresponding Do Re Mi sounds.

Keywords: tactile sense, sound interface, stiffness perception, elastic object

Procedia PDF Downloads 253
165 Design of a Real Time Heart Sounds Recognition System

Authors: Omer Abdalla Ishag, Magdi Baker Amien

Abstract:

Physicians use the stethoscope to listen to a patient's heart sounds in order to make a diagnosis. However, determining heart conditions with an acoustic stethoscope is a difficult task that requires special training of medical staff. This study developed an accurate model for analyzing the phonocardiograph signal based on a PC and a DSP processor. The system was realized in two phases: an offline phase and a real-time phase. In the offline phase, 30 heart sound files were collected from medical students and doctor's world website. In the experimental (real-time) phase, an electronic stethoscope was designed and implemented, and signals were recorded from 30 volunteers, of whom 17 were normal cases and 13 had various pathologies. These 30 acquired signals were preprocessed using an adaptive filter to remove lung sounds. Background noise was removed from both the offline and real data using the wavelet transform, graphical and statistical feature vector elements were then extracted, and finally a look-up table was used to classify the heart sound cases. The implemented system showed accuracies of 90% and 80% and sensitivities of 87.5% and 82.4% for the offline and real data, respectively. The whole system was designed on the TMS320VC5509a DSP platform.

Keywords: code composer studio, heart sounds, phonocardiograph, wavelet transform

Procedia PDF Downloads 404
164 Optimizing Solids Control and Cuttings Dewatering for Water-Powered Percussive Drilling in Mineral Exploration

Authors: S. J. Addinell, A. F. Grabsch, P. D. Fawell, B. Evans

Abstract:

The Deep Exploration Technologies Cooperative Research Centre (DET CRC) is researching and developing a new coiled tubing based greenfields mineral exploration drilling system utilising down-hole water-powered percussive drill tooling. This new drilling system is aimed at significantly reducing the costs associated with identifying mineral resource deposits beneath deep, barren cover. This system has shown superior rates of penetration in water-rich, hard rock formations at depths exceeding 500 metres. With fluid flow rates of up to 120 litres per minute at 200 bar operating pressure to energise the bottom hole tooling, excessive quantities of high quality drilling fluid (water) would be required for a prolonged drilling campaign. As a result, drilling fluid recovery and recycling has been identified as a necessary option to minimise costs and logistical effort. While the majority of the cuttings report as coarse particles, a significant fines fraction will typically also be present. To maximise tool life longevity, the percussive bottom hole assembly requires high quality fluid with minimal solids loading and any recycled fluid needs to have a solids cut point below 40 microns and a concentration less than 400 ppm before it can be used to reenergise the system. This paper presents experimental results obtained from the research program during laboratory and field testing of the prototype drilling system. A study of the morphological aspects of the cuttings generated during the percussive drilling process shows a strong power law relationship for particle size distributions. This data is critical in optimising solids control strategies and cuttings dewatering techniques. Optimisation of deployable solids control equipment is discussed and how the required centrate clarity was achieved in the presence of pyrite-rich metasediment cuttings. 
Key results were the successful pre-aggregation of fines through the selection and use of high molecular weight anionic polyacrylamide flocculants and the techniques developed for optimal dosing prior to scroll decanter centrifugation, thus keeping sub 40 micron solids loading within prescribed limits. Experiments on maximising fines capture in the presence of thixotropic drilling fluid additives (e.g. Xanthan gum and other biopolymers) are also discussed. As no core is produced during the drilling process, it is intended that the particle laden returned drilling fluid is used for top-of-hole geochemical and mineralogical assessment. A discussion is therefore presented on the biasing and latency of cuttings representivity by dewatering techniques, as well as the resulting detrimental effects on depth fidelity and accuracy. Data pertaining to the sample biasing with respect to geochemical signatures due to particle size distributions is presented and shows that, depending on the solids control and dewatering techniques used, it can have unwanted influence on top-of-hole analysis. Strategies are proposed to overcome these effects, improving sample quality. Successful solids control and cuttings dewatering for water-powered percussive drilling is presented, contributing towards the successful advancement of coiled tubing based greenfields mineral exploration.
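A power law relationship of the kind reported for the cuttings particle size distribution can be checked with a least-squares line fit in log-log space. The sketch below assumes the model y = a * x^b and uses hypothetical names; the abstract does not state which quantity is plotted against particle size, and no data from the paper are reproduced here.

```python
import math


def fit_power_law(sizes, values):
    """Fit values ~ a * sizes**b by linear regression on the logarithms.

    Returns (a, b). All inputs must be strictly positive.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), slope
```

A straight line in log-log space with small residuals is what "a strong power law relationship" would look like. The fitted exponent `b` is the quantity one would expect to shift with RPM, fluid pressure, or weight on bit.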

Keywords: cuttings, dewatering, flocculation, percussive drilling, solids control

Procedia PDF Downloads 215
163 Robust Heart Sounds Segmentation Based on the Variation of the Phonocardiogram Curve Length

Authors: Mecheri Zeid Belmecheri, Maamar Ahfir, Izzet Kale

Abstract:

Automatic cardiac auscultation is still a subject of research aimed at establishing an objective diagnosis. Heart sounds recorded as phonocardiogram (PCG) signals can be used for automatic segmentation into components that have clinical meaning. These are the first sound, S1, the second sound, S2, and the systolic and diastolic components, respectively. In this paper, an automatic method is proposed for the robust segmentation of heart sounds. The method calculates an intermediate sawtooth-shaped signal from the length variation of the recorded PCG signal in the time domain and uses its positive derivative function, which is a binary signal, to train a Recurrent Neural Network (RNN). Results obtained on a large database of recorded PCGs with simultaneously recorded electrocardiograms (ECGs) from different patients in clinical settings, including normal and abnormal subjects, show an average segmentation testing performance of 76% sensitivity and 94% specificity.
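The curve length of a sampled signal, the quantity this method is built on, is the summed Euclidean distance between consecutive points of the waveform. The sketch below computes it over non-overlapping windows. How the authors turn this into a sawtooth-shaped signal and a binary training target is not described in the abstract, so this covers only the underlying feature, with hypothetical function names and an assumed windowing scheme.

```python
import math


def curve_length(samples, dt):
    """Length of the polyline through the points (i * dt, samples[i])."""
    return sum(math.hypot(dt, b - a) for a, b in zip(samples, samples[1:]))


def windowed_curve_length(samples, dt, window):
    """Curve length of each consecutive, non-overlapping window of samples."""
    return [curve_length(samples[i:i + window], dt)
            for i in range(0, len(samples) - window + 1, window)]
```

Quiet intervals yield a curve length close to the window duration, while S1 and S2 bursts lengthen the curve, which is why variation in this quantity can mark sound boundaries. The result depends on the relative scaling of the time and amplitude axes, so both would need to be normalized consistently.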

Keywords: heart sounds, PCG segmentation, event detection, recurrent neural networks, PCG curve length

Procedia PDF Downloads 143
162 Automatic Classification of Periodic Heart Sounds Using Convolutional Neural Network

Authors: Jia Xin Low, Keng Wah Choo

Abstract:

This paper presents an automatic classification model for normal and abnormal heart sounds based on a deep learning algorithm. MITHSDB heart sound datasets obtained from the 2016 PhysioNet/Computing in Cardiology Challenge database were used in this research, with the assumption that the electrocardiograms (ECG) were recorded simultaneously with the heart sounds (phonocardiogram, PCG). The PCG time series are segmented per heartbeat, and each sub-segment is converted into a square intensity matrix and classified using convolutional neural network (CNN) models. This approach removes the need to provide classification features for the supervised machine learning algorithm. Instead, the features are determined automatically through training, from the time series provided. The results show that the prediction model is able to provide reasonable and comparable classification accuracy despite its simple implementation. This approach can be used for real-time classification of heart sounds in the Internet of Medical Things (IoMT), e.g. remote monitoring applications of the PCG signal.
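One plausible way to turn a per-beat segment into a square intensity matrix is to resample it to side * side values, scale them to the range 0 to 1, and reshape row by row. The abstract does not say how the conversion is done, so the nearest-index resampling, the min-max scaling, and the name `to_square_matrix` are all assumptions made for illustration.

```python
def to_square_matrix(segment, side):
    """Resample a 1-D segment to side*side values in [0, 1] and reshape to rows."""
    target = side * side
    # Nearest-index resampling so beats of different lengths map to one size.
    resampled = [segment[min(len(segment) - 1, i * len(segment) // target)]
                 for i in range(target)]
    lo, hi = min(resampled), max(resampled)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant segment
    scaled = [(v - lo) / span for v in resampled]
    return [scaled[r * side:(r + 1) * side] for r in range(side)]
```

A fixed matrix size matters because a CNN expects inputs of one shape, while heartbeats vary in duration.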

Keywords: convolutional neural network, discrete wavelet transform, deep learning, heart sound classification

Procedia PDF Downloads 312
161 Optimum Drilling States in Down-the-Hole Percussive Drilling: An Experimental Investigation

Authors: Joao Victor Borges Dos Santos, Thomas Richard, Yevhen Kovalyshen

Abstract:

Down-the-hole (DTH) percussive drilling is an excavation method that is widely used in the mining industry due to its high efficiency in fragmenting hard rock formations. A DTH hammer system consists of a fluid driven (air or water) piston and a drill bit; the reciprocating movement of the piston transmits its kinetic energy to the drill bit by means of stress waves that propagate through the drill bit towards the rock formation. In the literature of percussive drilling, the existence of an optimum drilling state (Sweet Spot) is reported in some laboratory and field experimental studies. An optimum rate of penetration is achieved for a specific range of axial thrust (or weight-on-bit) beyond which the rate of penetration decreases. Several authors advance different explanations as possible root causes of the Sweet Spot, but a universal explanation or consensus does not exist yet. The experimental investigation in this work was initiated with drilling experiments conducted at a mining site. A full-scale drilling rig (equipped with a DTH hammer system) was instrumented with high precision sensors sampled at a very high sampling rate (kHz). Data was collected while two boreholes were being excavated, and an in-depth analysis of the recorded data confirmed that an optimum performance can be achieved for specific ranges of input thrust (weight-on-bit). The high sampling rate made it possible to identify the bit penetration at each single impact (of the piston on the drill bit) as well as the impact frequency. These measurements provide a direct method to identify when the hammer does not fire, so that drilling occurs without percussion and the bit propagates the borehole by shearing the rock. The second stage of the experimental investigation was conducted in a laboratory environment with custom-built equipment dubbed Woody. Woody allows the drilling of shallow holes a few centimetres deep by successive discrete impacts from a piston.
After each individual impact, the bit angular position is incremented by a fixed amount, the piston is moved back to its initial position at the top of the barrel, and the air pressure and thrust are set back to their pre-set values. The goal is to explore whether the observed optimum drilling state stems from the interaction between the drill bit and the rock (during impact) or is governed by the overall system dynamics (between impacts). The experiments were conducted on samples of Calca Red, with a drill bit of 74 millimetres (outside diameter) and with weight-on-bit ranging from 0.3 kN to 3.7 kN. Results show that under the same piston impact energy and a constant angular displacement of 15 degrees between impacts, the average drill bit rate of penetration is independent of the weight-on-bit, which suggests that the sweet spot is not caused by intrinsic properties of the bit-rock interface.

Keywords: optimum drilling state, experimental investigation, field experiments, laboratory experiments, down-the-hole percussive drilling

Procedia PDF Downloads 49
160 From the “Movement Language” to Communication Language

Authors: Mahmudjon Kuchkarov, Marufjon Kuchkarov

Abstract:

The origin of ‘Human Language’ is still a mystery and the most interesting subject of historical linguistics. The core element is the nature of labeling or coding things or processes with symbols and sounds. In this paper, we investigate humans’ involuntary Paired Sounds and Shape Production (PSSP) and its contribution to the development of early human communication. Working with twenty-six volunteers who performed many physical movements of varying difficulty, the research team investigated the natural, repeatable, and paired sound and shape productions during human activities. The paper claims the involvement of PSSP in the phonetic origin of some modern words and the existence of similarities between elements of PSSP and characters of the classic Latin alphabet. The results may be used not only to support existing theories but also to take a closer look at some of the fundamental nature of the origin of languages.

Keywords: body shape, body language, coding, Latin alphabet, merging method, movement language, movement sound, natural sound, origin of language, pairing, phonetics, sound and shape production, word origin, word semantic

Procedia PDF Downloads 138
159 The Effect of the Pronunciation of Emphatic Sounds on Perceived Masculinity/Femininity

Authors: M. Sayyour, M. Abdulkareem, O. Osman, S. Salmeh

Abstract:

Emphatic sounds in Arabic are /tˤ/, /sˤ/, /dˤ/, and /ðˤ/. They involve a secondary articulation in the pharynx area, as opposed to their counterparts /t/, /s/, /d/, and /ð/. Although they are present in most Arabic dialects, some dialects, such as Maltese Arabic, have lost this class as a historical development. Previous studies have found a difference in the pronunciation of these emphatic sounds between the two genders, arguing that males tend to produce more evident emphasis than females. This study builds on these studies by investigating whether listeners perceive fully emphatic sounds as more masculine and less emphatic sounds as more feminine. Furthermore, the study aims to find out which is more important in this perception process: the emphatic consonant itself or the vowel following it. To test this, natural and manipulated tokens of two male and two female speakers were used. The natural tokens include words that have an emphatic consonant and an emphatic vowel and tokens that have a plain consonant and a plain vowel. The manipulated tokens include words that have an emphatic consonant but a central vowel and a plain consonant followed by the same central vowel. These manipulated tokens allow us to see whether the consonant will still affect the perception even if the vowel is controlled. Another group of words that contained no emphatic sounds was used as a control group. The total number of tokens (natural, manipulated, and control) is 160. After that, 60 university students (30 males and 30 females) listened to these tokens and responded by choosing a specific character that they think is likely to produce each token. The characters’ descriptions are carefully written with two degrees of femininity and two degrees of masculinity. The preliminary results for the femininity level showed that the highest degree of femininity was for tokens that contain a plain consonant and a plain vowel.
The lowest level of femininity was given for tokens that have fully emphatic consonant and vowel. For the manipulated tokens that contained plain consonant and central vowel, the femininity degree was high which indicates that the consonant is more important than the vowel, while for the manipulated tokens that contain emphatic consonant and a central vowel, the femininity level was higher than that for the tokens that have emphatic consonant and emphatic vowel, which indicates that the vowel is more important for the perception of emphatic consonants. These results are interpreted in light of feminist linguistic theories, linguistic expectations, performed gender and linguistic change theories.

Keywords: Emphatic sounds, gender studies, perception, sociophonetics

Procedia PDF Downloads 339
158 Problems of Learning English Vowels Pronunciation in Nigeria

Authors: Wasila Lawan Gadanya

Abstract:

This paper examines the problems of learning English vowel pronunciation. The objective is to identify some of the factors that affect the learning of English vowel sounds and their proper realization in words. The theoretical framework adopted is based on both error analysis and contrastive analysis. The data collection instruments used in the study are a questionnaire and a word list for the respondents (students) and observation of some of their lecturers. All the data collected were analyzed using simple percentages. The findings show that it is not a single factor that affects the learning of English vowel pronunciation; rather, many factors do so concurrently. Among the factors examined, the lack of correlation between English orthography and its pronunciation, not the mother tongue (which most people consider a factor affecting the learning of second-language pronunciation), was found to have the greatest influence on students’ learning and realization of English vowel sounds, since the respondents in this study are from different ethnic groups of Nigeria and thus speak different languages but have the same or almost the same problem when pronouncing the English vowel sounds.

Keywords: English vowels, learning, Nigeria, pronunciation

Procedia PDF Downloads 396
157 A Case Study Using Sounds Write and The Writing Revolution to Support Students with Literacy Difficulties

Authors: Emilie Zimet

Abstract:

During our department meetings for teachers of children with learning disabilities and difficulties, we often discuss the best practices for supporting students who come to school with literacy difficulties. After completing Sounds Write and Writing Revolution courses, it seems there is a possibility to link approaches and still maintain fidelity to a program and provide individualised instruction to support students with such difficulties and disabilities. In this case study, the researcher has been focussing on how best to use the knowledge acquired to provide quality intervention that targets the varied areas of challenge that students require support in. Students present to school with a variety of co-occurring reading and writing deficits and with complementary approaches, such as The Writing Revolution and Sounds Write, it is possible to support students to improve their fundamental skills in these key areas. Over the next twelve weeks, the researcher will collect data on current students with whom this approach will be trialled and then compare growth with students from last year who received support using Sounds-Write only. Maintaining fidelity may be a potential challenge as each approach has been tested in a specific format for best results. The aim of this study is to determine if approaches can be combined, so the implementation will need to incorporate elements of both reading (from Sounds Write) and writing (from The Writing Revolution). A further challenge is the time length of each session (25 minutes), so the researcher will need to be creative in the use of time to ensure both writing and reading are targeted while ensuring the programs are implemented. The implementation will be documented using student work samples and planning documents. This work will include a display of findings using student learning samples to demonstrate the importance of co-targeting the reading and writing challenges students come to school with.

Keywords: literacy difficulties, intervention, individual differences, methods of provision

Procedia PDF Downloads 14
156 Sound Instance: Art, Perception and Composition through Soundscapes

Authors: Ricardo Mestre

Abstract:

The soundscape stands out as an agglomeration of sounds available in the world, associated with different contexts and origins, being a theme studied by various areas of knowledge, seeking to guide their benefits and their consequences, contributing to the welfare of society and other ecosystems. Murray Schafer, the author who originally developed this concept, highlights the need for a greater recognition of sound reality, through the selection and differentiation of sounds, contributing to a tuning of the world and to the balance and well-being of humanity. According to some authors, the sound environment, produced and created in various ways, provides various sources of information, contributing to the orientation of human beings, alerting and manipulating them during their daily journey, like the small notifications received on a cell phone or other device with these features. In this way, it becomes possible to give sound its due importance in relation to the processes of individual representation, in matters of social, professional and emotional life. Ensuring an individual representation means providing human beings with new tools for the long process of reflection by recognizing their environment, the sounds that represent them, and their perspective on their respective function in it. In order to provide more information about the importance of the sound environment inherent to the individual reality, the term sound instance is introduced to refer to the whole sound field existing in the individual's life, which is divided into four distinct subfields that are essential to the process of individual representation, called sound matrix, sound cycles, sound traces and sound interference.

Keywords: sound instance, soundscape, sound art, perception, composition

Procedia PDF Downloads 106
155 Slice Bispectrogram Analysis-Based Classification of Environmental Sounds Using Convolutional Neural Network

Authors: Katsumi Hirata

Abstract:

Certain systems can function well only if they recognize the sound environment as humans do. In this research, we focus on sound classification by adopting a convolutional neural network and aim to develop a method that automatically classifies various environmental sounds. Although the neural network is a powerful technique, the performance depends on the type of input data. Therefore, we propose an approach via a slice bispectrogram, which is a third-order spectrogram and is a slice version of the amplitude for the short-time bispectrum. This paper explains the slice bispectrogram and discusses the effectiveness of the derived method by evaluating the experimental results using the ESC‑50 sound dataset. As a result, the proposed scheme gives high accuracy and stability. Furthermore, some relationship between the accuracy and non-Gaussianity of sound signals was confirmed.
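As a rough illustration of the idea, the sketch below computes the amplitude of one slice of the short-time bispectrum, frame by frame. The abstract does not say which slice is used, so the diagonal slice f1 = f2 is an assumption here, as are the function names, the frame length, the hop size and the absence of a window function. The bispectrum definition B(f1, f2) = X(f1) X(f2) X*(f1 + f2) is the standard one.

```python
import cmath
import math

def dft(frame):
    """Plain discrete Fourier transform (O(n^2), fine for short frames)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def diagonal_slice(frame):
    """Amplitude |B(f, f)| = |X(f)^2 * conj(X(2f))| for f below a quarter of the frame length.

    Choosing the diagonal f1 = f2 is an assumption, not taken from the paper.
    """
    spectrum = dft(frame)
    n = len(frame)
    return [abs(spectrum[f] ** 2 * spectrum[2 * f].conjugate()) for f in range(n // 4)]

def slice_bispectrogram(signal, frame_len=32, hop=16):
    """One diagonal slice per frame, giving a time-frequency image."""
    return [diagonal_slice(signal[start:start + frame_len])
            for start in range(0, len(signal) - frame_len + 1, hop)]
```

A signal containing components at a frequency and at its double produces a peak in the slice at the lower frequency, which is the kind of phase-coupled structure an ordinary spectrogram does not isolate.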

Keywords: environmental sound, bispectrum, spectrogram, slice bispectrogram, convolutional neural network

Procedia PDF Downloads 89
154 Investigating the Pronunciation of '-S' and '-Ed' Suffixes in Yemeni English

Authors: Saif Bareq, Vivek Mirgane

Abstract:

The present paper seeks to explicate the pronunciation of the ‘-s’ and ‘-ed’ suffixes when applied in their relative places in word endings. It attempts to investigate the problems faced by Yemenis in the pronunciation of these suffixes in all occurrences and realizations. It discusses the realization of ‘-s’ in the four areas of plural, 3rd person singular and genitive markers, and contraction of ‘has’ and ‘is’, as in he’s, it’s, etc., and shows how they are differently represented by three different sounds, /s/, /z/ and /ɪz/, based on the phonological structure of the words in which they occur. Similarly, it explains the realization of the ‘-ed’ suffix of the past and past participle marker and how it is realized differently by three sounds governed by the phonological structure of these words. Besides, it tries to shed some light on the English morphophonemic and phonological rules that govern the pronunciation of such troublesome endings. It is hypothesized that the absence of such a phenomenon in the mother tongue affects the pronunciation of these suffixes.
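The rule the abstract alludes to can be sketched in a few lines. This is the standard textbook rule for English, not code from the paper. The phoneme sets below are simplified assumptions, and the function names are hypothetical. Stems are given as lists of phoneme symbols.

```python
# Simplified phoneme classes (assumptions for illustration only).
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ", "h"}

def s_suffix(stem):
    """Pick the '-s' allomorph from the final sound of the stem."""
    last = stem[-1]
    if last in SIBILANTS:
        return "ɪz"   # buses, judges
    if last in VOICELESS:
        return "s"    # cats, books
    return "z"        # dogs, plays (voiced consonants and vowels)

def ed_suffix(stem):
    """Pick the '-ed' allomorph from the final sound of the stem."""
    last = stem[-1]
    if last in {"t", "d"}:
        return "ɪd"   # wanted, needed
    if last in VOICELESS:
        return "t"    # walked, laughed
    return "d"        # played, begged
```

In both functions the special case (sibilants for ‘-s’, /t/ and /d/ for ‘-ed’) has to be checked before voicing, because /s/ and /t/ are themselves voiceless.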

Keywords: Suffixes' Pronunciation, Phonological Structure, Phonological Rules, Morpho-Phonemics, Yemeni English

Procedia PDF Downloads 254
153 Sound Analysis of Young Broilers Reared under Different Stocking Densities in Intensive Poultry Farming

Authors: Xiaoyang Zhao, Kaiying Wang

Abstract:

The choice of stocking density in poultry farming is a potential way of determining the welfare level of poultry. However, it is difficult to measure stocking densities in poultry farming because many variables, such as species, age and weight, feeding method, house structure and geographical location, differ between broiler houses. This paper proposes a method to measure the differences between young broilers reared under different stocking densities by sound analysis. Vocalisations of broilers were recorded and analysed under different stocking densities to identify the relationship between sounds and stocking densities. Recordings were made continuously for three-week-old chickens in order to evaluate the variation of sounds emitted by the animals at the beginning. The experimental trial was carried out in an indoor-reared broiler farm; the audio recording procedures lasted for 5 days. Broilers were divided into 5 groups with stocking density treatments of 8/m², 10/m², 12/m² (96 birds/pen), 14/m² and 16/m²; all conditions, including ventilation and feed, were kept the same in every group except for stocking density. The recordings and analysis of the chickens' sounds were made noninvasively. Sound recordings were manually analysed and labelled using the sound analysis software GoldWave Digital Audio Editor. After sound acquisition, Mel Frequency Cepstrum Coefficients (MFCC) were extracted from the sound data, and a Support Vector Machine (SVM) was used as an early detector and classifier. This preliminary study, conducted in an indoor-reared broiler farm, shows that this method can be used to classify sounds of chickens under different densities economically (only a cheap microphone and recorder are needed), with a classification accuracy of 85.7%. This method can predict the optimum stocking density of broilers when complemented by animal welfare indicators, animal productive indicators and so on.

Keywords: broiler, stocking density, poultry farming, sound monitoring, Mel Frequency Cepstrum Coefficients (MFCC), Support Vector Machine (SVM)

Procedia PDF Downloads 114
152 Challenges of Teaching and Learning English Speech Sounds in Five Selected Secondary Schools in Bauchi, Bauchi State, Nigeria

Authors: Mairo Musa Galadima, Phoebe Mshelia

Abstract:

In Nigeria, the national policy of education stipulates that the kindergarten primary schools and the legislature are to use the three popular Nigerian Languages namely: Hausa, Igbo and Yoruba. However, the English language seems to be preferred and this calls for this paper. Attempts were made to draw out the challenges faced by learners in understanding English speech sounds and using them to communicate effectively in English, using five selected secondary schools in Bauchi. It was discovered that challenges abound in the wrong use of stress and intonation and the transfer of phonetic features from their first language. Others are an inadequate supply of qualified teachers and relevant materials, including textbooks. It is recommended that teachers of English should lay more emphasis on the teaching of supra-segmental features and should be encouraged to go for further studies, seminars and refresher courses.

Keywords: kindergarten, stress, phonetic and intonation, Nigeria

Procedia PDF Downloads 265
151 Velocity Profiles of Vowel Perception by Javanese and Sundanese English Language Learners

Authors: Arum Perwitasari

Abstract:

Learning L2 sounds is influenced by the first language (L1) sound system. This study examines how listeners with a different L1 vowel system perceive L2 sounds. The fact that English has a larger vowel inventory than Javanese and Sundanese, two of Indonesia's local languages, might cause problems for Javanese and Sundanese learners of English when perceiving English sounds. To reveal L2 sound perception over time, we measured the mouse trajectories related to the hand movements made by Javanese and Sundanese language learners. Do the Javanese and Sundanese listeners show higher velocity than the English listeners when they perceive English vowels which are similar and new to their L1 system? The study aims to map the patterns of real-time processing through compatible hand movements to reveal any uncertainties when making selections. The results showed that the Javanese listeners exhibited significantly slower velocity values than the English listeners for similar vowels /ɪ, ɛ, ʊ/ in the 826-1200 ms post-stimulus window. Unlike the Javanese, the Sundanese listeners showed slow velocity values except for the similar vowel /ʊ/. For the perception of new vowels /i:, æ, ɜ:, ʌ, ɑː, u:, ɔ:/, the Javanese listeners showed slower velocity in making the lexical decision. In contrast, the Sundanese listeners showed slow velocity only for vowels /ɜ:, ɔ:, æ, ɪ/, indicating that these vowels are hard to perceive. Our results fit well with the second language model representing how the L1 vowel system influences the L2 sound perception.

Keywords: velocity profiles, EFL learners, speech perception, experimental linguistics

Procedia PDF Downloads 188
150 Challenges of Teaching and Learning English Speech Sounds in Five Selected Secondary Schools in Bauchi, Bauchi State, Nigeria

Authors: Mairo Musa Galadima, Phoebe Mshelia

Abstract:

In Nigeria, the national policy of education stipulates that the kindergarten-primary schools and the legislature are to use the three popular Nigerian Languages namely: Hausa, Igbo, and Yoruba. However, the English language seems to be preferred and this calls for this paper. Attempts were made to draw out the challenges faced by learners in understanding English speech sounds and using them to communicate effectively in English, using five selected secondary schools in Bauchi. It was discovered that challenges abound in the wrong use of stress and intonation and the transfer of phonetic features from their first language. Others are inadequately qualified teachers and relevant materials including textbooks. It is recommended that teachers of English should lay more emphasis on the teaching of supra-segmental features and should be encouraged to go for further studies, seminars and refresher courses.

Keywords: stress and intonation, phonetic and challenges, teaching and learning English, secondary schools

Procedia PDF Downloads 321
149 Categorical Metadata Encoding Schemes for Arteriovenous Fistula Blood Flow Sound Classification: Scaling Numerical Representations Leads to Improved Performance

Authors: George Zhou, Yunchan Chen, Candace Chien

Abstract:

Kidney replacement therapy is the current standard of care for end-stage renal diseases. In-center or home hemodialysis remains an integral component of the therapeutic regimen. Arteriovenous fistulas (AVF) make up the vascular circuit through which blood is filtered and returned. Naturally, AVF patency determines whether adequate clearance and filtration can be achieved and directly influences clinical outcomes. Our aim was to build a deep learning model for automated AVF stenosis screening based on the sound of blood flow through the AVF. A total of 311 patients with AVF were enrolled in this study. Blood flow sounds were collected using a digital stethoscope. For each patient, blood flow sounds were collected at 6 different locations along the patient’s AVF. The 6 locations are artery, anastomosis, distal vein, middle vein, proximal vein, and venous arch. A total of 1866 sounds were collected. The blood flow sounds are labeled as “patent” (normal) or “stenotic” (abnormal). The labels are validated from concurrent ultrasound. Our dataset included 1527 “patent” and 339 “stenotic” sounds. We show that blood flow sounds vary significantly along the AVF. For example, the blood flow sound is loudest at the anastomosis site and softest at the cephalic arch. Contextualizing the sound with location metadata significantly improves classification performance. How to encode and incorporate categorical metadata is an active area of research. Herein, we study ordinal (i.e., integer) encoding schemes. The numerical representation is concatenated to the flattened feature vector. We train a vision transformer (ViT) on spectrogram image representations of the sound and demonstrate that using scalar multiples of our integer encodings improves classification performance. Models are evaluated using a 10-fold cross-validation procedure. The baseline performance of our ViT without any location metadata achieves an AuROC and AuPRC of 0.68 ± 0.05 and 0.28 ± 0.09, respectively.
Using the following encodings of Artery: 0; Arch: 1; Proximal: 2; Middle: 3; Distal: 4; Anastomosis: 5, the ViT achieves an AuROC and AuPRC of 0.69 ± 0.06 and 0.30 ± 0.10, respectively. Using the following encodings of Artery: 0; Arch: 10; Proximal: 20; Middle: 30; Distal: 40; Anastomosis: 50, the ViT achieves an AuROC and AuPRC of 0.74 ± 0.06 and 0.38 ± 0.10, respectively. Using the following encodings of Artery: 0; Arch: 100; Proximal: 200; Middle: 300; Distal: 400; Anastomosis: 500, the ViT achieves an AuROC and AuPRC of 0.78 ± 0.06 and 0.43 ± 0.11, respectively. Interestingly, we see that using increasing scalar multiples of our integer encoding scheme (i.e., encoding “venous arch” as 1, 10, 100) results in progressively improved performance. In theory, the integer values do not matter since we are optimizing the same loss function; the model can learn to increase or decrease the weights associated with location encodings and converge on the same solution. However, in the setting of limited data and computation resources, increasing the importance at initialization either leads to faster convergence or helps the model escape a local minimum.
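The encoding step described above is small enough to sketch directly. The integer codes and the scale factors 1, 10 and 100 come from the abstract. The function names, and the use of a plain Python list for the flattened feature vector, are assumptions for illustration. How the ViT consumes the resulting vector is not described in the abstract and is not modelled here.

```python
# Ordinal location codes as listed in the abstract (scale factor 1).
BASE_CODES = {
    "artery": 0,
    "arch": 1,
    "proximal": 2,
    "middle": 3,
    "distal": 4,
    "anastomosis": 5,
}

def encode_location(location, scale=1):
    """Scaled ordinal code: scale=1, 10 or 100 reproduces the three schemes in the abstract."""
    return float(BASE_CODES[location] * scale)

def append_metadata(flat_features, location, scale=1):
    """Concatenate the scaled location code to a flattened feature vector."""
    return list(flat_features) + [encode_location(location, scale)]
```

For example, `append_metadata(features, "arch", scale=100)` appends 100.0 to the feature vector, which makes the location term numerically larger relative to the other features before any weights are learned.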

Keywords: arteriovenous fistula, blood flow sounds, metadata encoding, deep learning

Procedia PDF Downloads 48
148 A Mixing Matrix Estimation Algorithm for Speech Signals under the Under-Determined Blind Source Separation Model

Authors: Jing Wu, Wei Lv, Yibing Li, Yuanfan You

Abstract:

The separation of speech signals has become a research hotspot in the field of signal processing in recent years. It has many applications and influences in teleconferencing, hearing aids, speech recognition of machines and so on. The sounds received are usually noisy. The issue of identifying the sounds of interest and obtaining clear sounds in such an environment becomes a problem worth exploring, that is, the problem of blind source separation. This paper focuses on the under-determined blind source separation (UBSS). Sparse component analysis is generally used for the problem of under-determined blind source separation. The method is mainly divided into two parts. Firstly, the clustering algorithm is used to estimate the mixing matrix according to the observed signals. Then the signal is separated based on the known mixing matrix. In this paper, the problem of mixing matrix estimation is studied. This paper proposes an improved algorithm to estimate the mixing matrix for speech signals in the UBSS model. The traditional potential algorithm is not accurate for the mixing matrix estimation, especially for low signal-to-noise ratio (SNR). In response to this problem, this paper considers the idea of an improved potential function method to estimate the mixing matrix. The algorithm not only avoids the influence of insufficient prior information in traditional clustering algorithm, but also improves the estimation accuracy of mixing matrix. This paper takes the mixing of four speech signals into two channels as an example. The results of simulations show that the approach in this paper not only improves the accuracy of estimation, but also applies to any mixing matrix.
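For orientation, the sketch below shows a basic potential function estimator for the two-channel case. It is the kind of baseline the abstract calls traditional, not the improved algorithm the paper proposes. It assumes sparse sources, so that most observations lie along a single column direction of the mixing matrix. The triangular kernel, the grid size, the sharpness value and all function names are assumptions for illustration.

```python
import math

def fold_angle(x1, x2):
    """Direction of an observation, folded into [0, pi) so that x and -x coincide."""
    return math.atan2(x2, x1) % math.pi

def potential(observations, grid_size=180, sharpness=20.0):
    """Length-weighted angular density on a grid, smoothed by a triangular kernel."""
    grid = [k * math.pi / grid_size for k in range(grid_size)]
    values = []
    for theta in grid:
        total = 0.0
        for x1, x2 in observations:
            diff = abs(theta - fold_angle(x1, x2))
            diff = min(diff, math.pi - diff)  # angular distance on the half circle
            weight = max(0.0, 1.0 - sharpness * diff / (math.pi / 4))
            total += math.hypot(x1, x2) * weight
        values.append(total)
    return grid, values

def estimate_directions(observations, n_sources, grid_size=180):
    """Angles of the n_sources strongest local maxima of the potential function."""
    grid, values = potential(observations, grid_size)
    peaks = [k for k in range(grid_size)
             if values[k] > values[k - 1] and values[k] >= values[(k + 1) % grid_size]]
    peaks.sort(key=lambda k: values[k], reverse=True)
    return sorted(grid[k] for k in peaks[:n_sources])
```

Each estimated angle θ corresponds to a unit-norm column (cos θ, sin θ) of the mixing matrix. With noisy data the peaks broaden and merge, which is the weakness at low SNR that the abstract refers to.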

Keywords: DBSCAN, potential function, speech signal, the UBSS model

Procedia PDF Downloads 100
147 Still Pictures for Learning Foreign Language Sounds

Authors: Kaoru Tomita

Abstract:

This study explores how visual information helps us to learn foreign language pronunciation. Visual assistance and its effect on foreign language learning have been discussed widely. For example, simplified illustrations in textbooks are used to show learners which articulatory organs are used for pronouncing sounds. Vowels are put into a chart that depicts a vowel space. Consonants are put into a table that contains two axes of place and manner of articulation. When comparing a still picture and a moving picture for visualizing learners’ pronunciation, it becomes clear that the former works better than the latter. The visualization of vowels was applied to class activities in which native and non-native speakers’ English was compared and the learners’ feedback was collected: the positions of six vowels did not scatter as much as expected. Specifically, two vowels were not discriminated and were arranged very close together in the vowel space. It was surprising for the author to find that learners liked analyzing their own pronunciation by linking first and second formants (F1 and F2) on a sheet of paper with a pencil. Even a simple method works well if it leads learners to think about their pronunciation analytically.

Keywords: feedback, pronunciation, visualization, vowel

Procedia PDF Downloads 210
146 Development of the New York Misophonia Scale: Implications for Diagnostic Criteria

Authors: Usha Barahmand, Maria Stalias, Abdul Haq, Esther Rotlevi, Ying Xiang

Abstract:

Misophonia is a condition in which specific repetitive oral, nasal, or other sounds and movements made by humans trigger impulsive aversive reactions of irritation or disgust that instantly become anger. A few measures exist for the assessment of misophonia, but each has some limitations, and evidence for a formal diagnosis is still lacking. The objective of this study was to develop a reliable and valid measure of misophonia for use in the general population. Adopting a purely descriptive approach, this study focused on developing a self-report measure using all triggers and reactions identified in previous studies on misophonia. A measure with two subscales, one assessing the aversive quality of various triggers and the other assessing reactions of individuals, was developed. Data were gathered from a large sample of both men and women ranging in age from 18 to 65 years. Exploratory factor analysis revealed three main triggers: oral/nasal sounds, hand and leg movements, and environmental sounds. Two clusters of reactions also emerged: nonangry attempts to avoid the impact of the aversive stimuli and angry attempts to stop the aversive stimuli. The examination of the psychometric properties of the scale revealed its internal consistency and test-retest reliability to be excellent. The scale was also found to have very good concurrent and convergent validity. Significant annoyance and disgust in response to the triggers were reported by 12% of the sample, although for some specific triggers, rates as high as 31% were also reported. These findings have implications for the delineation of the criteria for identifying misophonia as a clinical condition.

Keywords: adults, factor analysis, misophonia, psychometric properties, scale

Procedia PDF Downloads 157
145 Role of Speech Articulation in English Language Learning

Authors: Khadija Rafi, Neha Jamil, Laiba Khalid, Meerub Nawaz, Mahwish Farooq

Abstract:

Speech articulation is the complex process of producing intelligible sounds through precise movements of various structures within the vocal tract. These structures are called articulators and comprise the lips, teeth, tongue, and palate. The articulators work together to produce a range of distinct phonemes, which are the basis of language. Articulation starts with the airstream from the lungs passing through the trachea and into the oral and nasal cavities. As the air passes through the mouth, the tongue and the muscles around it coordinate to create particular sounds. This can be seen when the tongue is placed in different positions, sometimes near the alveolar ridge, the soft palate, the roof of the mouth or the back of the teeth, which gives each phoneme its unique quality. Vowels are articulated with an open vocal tract, with the height and position of the tongue differing for each vowel, while consonants are pronounced by creating obstructions in the airflow. For instance, the sound /b/ is a plosive and can be produced only by briefly closing the lips. Articulation disorders can not only affect communication but can also be a hurdle in speech production. To improve articulation skills for such individuals, doctors often recommend speech therapy, which involves various kinds of exercises like jaw exercises and tongue twisters. Such disorders are more common in children, who may have developmental articulation issues from birth, but in adults they can be caused by injury, neurological conditions, or other speech-related disorders. In short, speech articulation is an essential aspect of productive communication, requiring coordination of specific articulators to produce the different intelligible sounds that are a vital part of spoken language.

Keywords: linguistics, speech articulation, speech therapy, language learning

Procedia PDF Downloads 26
144 The Voiceless Dental-Alveolar Common Augment in Arabic and Other Semitic Languages, a Morphophonemic Comparison

Authors: Tarek Soliman Mostafa Soliman Al-Nana'i

Abstract:

There are non-steady voiced augments in the Semitic languages, and in the morphological and structural augmentation, two sounds were augments in all Semitic languages at the level of the spoken language and two letters at the level of the written language, which are the hamza and the ta’. This research studies only the second of them; therefore, we defined it as “The Voiceless Dental-alveolar common augment” (VDACA) to distinguish it from the glottal sound “Hamza”, first, middle, or last, in a noun or in a verb, in Arabic and its equivalent in the Semitic languages. What is meant by “VDACA” is the ta’ that is in addition to the root of the word at the morphological level: the word “voiceless” takes out the voiced sounds that we studied before, and the “dental-alveolar common augment” takes out the laryngeal sound of them, which is the “Hamza”; and the word “common” takes out the uncommon voiceless sounds, which are sīn, shīn, and hā’. The study is limited to the ta’ alone among the Arabic sounds, and this title faced a problem in identifying it with the ta’, because the designation of the ta’ is not the same in most Semitic languages. Hebrew, for example, has “tav” and is pronounced with the voiced fa (v), which is not in Arabic. It is called different names in other Semitic languages, such as “taw” or “tAu” in old Syriac, and so on. This goes hand in hand with the insistence on distance from the written level and the reference to the phonetic aspect in this study that is closely linked to the morphological level. Therefore, the study is “morphophonemic”. What is meant by Semitic languages in this study are the following: Akkadian, Ugaritic, Hebrew, Syriac, Mandaean, Ge'ez, and Amharic. The problem of the study is the agreement or difference between these languages in the position of that augment, first, middle, or last, and in determining the distinguishing characteristics of each language from the other.
As for the study methodology, it is determined by the comparative approach in Semitic languages, which is based on the descriptive approach for each language. The study is divided into an introduction, four sections, and a conclusion: Introduction: It included the subject of the study, its importance, motives, problem, methodology, and division. The first section: VDACA as a non-common phoneme. The second: VDACA as a common phoneme. The third: VDACA as a functional morpheme. The fourth section: Commentary and conclusion with the most important results. The positions of VDACA in Arabic and other Semitic languages, and in nouns and verbs, were limited to first, middle, and last. The research identified the individual addition, which is common with other augments, and the research proved that this augmentation is constant in all Semitic languages, but there are characteristics that distinguish each language from the other.

Keywords: voiceless, dental-alveolar, augment, Arabic, Semitic languages

Procedia PDF Downloads 31