Search results for: speech recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2257

1477 Image Recognition Performance Benchmarking for Edge Computing Using Small Visual Processing Unit

Authors: Kasidis Chomrat, Nopasit Chakpitak, Anukul Tamprasirt, Annop Thananchana

Abstract:

Internet of Things (IoT) devices and edge computing have become some of the most prominent innovations and among the most discussed for their potential to improve and disrupt traditional business and industry alike. New challenges such as the COVID-19 pandemic have endangered the workforce and the business processes of the system. The business landscape has changed drastically in the aftermath of the global COVID-19 pandemic, while the threats of a global energy crisis, global warming, and increasingly heated global politics that could become a new Cold War are looming. Emerging technologies such as edge computing and specially designed visual processing units present great opportunities for business. The literature review covers how the Internet of Things and this disruptive wave will affect business, explaining how these new events affect current business and how businesses will need to adapt to changes in the market and the world. It is accompanied by an example benchmarking test for newer consumer-marketed devices, such as Internet of Things devices equipped with new edge computing hardware, which are expected to increase efficiency and reduce the risk posed by current and looming crises. Throughout the paper, we explain the technologies that led to the present ones and why, in the current situation, these technologies will be innovations that change traditional practice, through brief introductions to cloud computing, edge computing, and the Internet of Things and how they will lead into the future.

Keywords: internet of things, edge computing, machine learning, pattern recognition, image classification

Procedia PDF Downloads 143
1476 Statistical Feature Extraction Method for Wood Species Recognition System

Authors: Mohd Iz'aan Paiz Bin Zamri, Anis Salwa Mohd Khairuddin, Norrima Mokhtar, Rubiyah Yusof

Abstract:

Effective statistical feature extraction and classification are important in image-based automatic inspection and analysis. An automatic wood species recognition system is designed to perform wood inspection at customs checkpoints to avoid mislabeling of timber, which results in a loss of income for the timber industry. The system focuses on analyzing the statistical properties of pores in the wood images. This paper proposes a fuzzy-based feature extractor which mimics the experts’ knowledge of wood texture to extract the properties of pore distribution from the wood surface texture. The proposed feature extractor consists of two steps, namely pores extraction and fuzzy pores management. A total of 38 statistical features are extracted from each wood image. Then, a backpropagation neural network is used to classify the wood species based on the statistical features. A comprehensive set of experiments on a database composed of 5200 macroscopic images from 52 tropical wood species was used to evaluate the performance of the proposed feature extractor. The advantage of the proposed feature extraction technique is that it mimics the experts’ interpretation of wood texture, which allows human involvement when analyzing the wood texture. Experimental results show the efficiency of the proposed method.

Keywords: classification, feature extraction, fuzzy, inspection system, image analysis, macroscopic images

Procedia PDF Downloads 411
1475 Supernatural Beliefs Impact Pattern Perception

Authors: Silvia Boschetti, Jakub Binter, Robin Kopecký, Lenka PříPlatová, Jaroslav Flegr

Abstract:

A strict dichotomy used to be drawn between religion and science, but recently cognitive science has focused on the impact of supernatural beliefs on cognitive processes such as pattern recognition. It has been hypothesized that cognitive and perceptual processes have been under evolutionary pressures that ensured amplified perception of patterns, especially in stressful and harsh conditions. Pattern detection in religious and non-religious individuals after induction of a negative, anxious mood shall constitute a cornerstone for understanding the general role of anxiety and cognitive bias, providing evidence for or against the by-product hypothesis, one of the main theories in the evolutionary study of religion. Apophenia (the tendency to perceive connection and meaning in unrelated events) and the perception of visual patterns (or pareidolia) are of utmost interest. To capture the impact of culture and upbringing, a comparative study of two European countries, the Czech Republic (low organized religion participation, high esoteric belief) and Italy (high organized religion participation, low esoteric belief), is currently in the data collection phase. Outcomes will be presented at the conference. A battery of standardized questionnaires followed by pattern recognition tasks (the patterns involve color and shape and are of artificial and natural origin), using an experimental method involving the conditioning of (controlled, laboratory-induced) stress, is being administered. We expect to find a difference between organized religious belief and personal (esoteric) belief that will be alike in both cultural environments.

Keywords: culture, esoteric belief, pattern perception, religiosity

Procedia PDF Downloads 171
1474 A Genre-Based Approach to the Teaching of Pronunciation

Authors: Marden Silva, Danielle Guerra

Abstract:

Some studies have indicated that teachers have not paid enough attention to pronunciation teaching in EFL contexts. In particular, teaching segmental and suprasegmental features through a genre-based approach may be an opportunity to integrate pronunciation into a more meaningful learning practice. Therefore, the aim of this project was to carry out a survey on some aspects related to English pronunciation that Brazilian students consider more difficult to learn, thus enabling the discussion of strategies that can facilitate the development of oral skills in English classes by integrating the teaching of phonetic-phonological aspects into the genre-based approach. Notions of intelligibility, fluency and accuracy were proposed by some authors as an ideal didactic sequence. According to their proposals, basic learners should be exposed to activities focused on the notion of intelligibility, intermediate students to the notion of fluency, and more advanced ones to accuracy practices. In order to test this hypothesis, data collection was conducted during three high school English classes at the Federal Center for Technological Education of Minas Gerais (CEFET-MG), in Brazil, through questionnaires and didactic activities, which were recorded and transcribed for further analysis. The debate genre was chosen to let participants express themselves orally more freely, having them answer questions and give their opinion about a previously selected topic. The findings indicated that basic students demonstrated more difficulty with aspects of English pronunciation than the others. Many of the intelligibility aspects analyzed had to be listened to more than once for a better understanding. 
For intermediate students, the recorded speech was considerably easier to understand, but they nevertheless found it more difficult to pronounce the words fluently, often interrupting their speech to think about what they were going to say and how they would say it. Lastly, more advanced learners seemed to express their ideas more fluently, but subtle errors related to accuracy were still perceptible in their speech, thereby confirming the proposed hypothesis. It was also seen that using a genre-based approach to promote oral communication in English classes might be a relevant method, considering the socio-communicative function inherent in the suggested approach.

Keywords: EFL, genre-based approach, oral skills, pronunciation

Procedia PDF Downloads 122
1473 Communicating Meaning through Translanguaging: The Case of Multilingual Interactions of Algerians on Facebook

Authors: F. Abdelhamid

Abstract:

Algeria is a multilingual speech community where individuals constantly mix codes in spoken discourse. Code is used as a cover term to refer to the existing languages and language varieties, which include, among others, Algerian Arabic (the mother tongue of the majority), Modern Standard Arabic (the official language), and the foreign languages French and English. The present study explores whether Algerians mix these codes in online communication as well. Facebook is the platform selected for data collection because it is the preferred and most used social media site for most Algerians. Adopting the notion of translanguaging, this study attempts to explain how users of Facebook use multilingual messages to communicate meaning. Accordingly, multilingual interactions are not approached from a pejorative perspective but rather as a creative linguistic behavior that multilinguals utilize to achieve intended meanings. The study is intended as a contribution to the research on multilingualism online because, although an extensive literature has investigated multilingualism in spoken discourse, limited research has investigated it online. Its aim is two-fold. First, it aims at ensuring that the selected platform for analysis, namely Facebook, could be a source of multilingual data to enable the qualitative analysis. This is done by measuring frequency rates of multilingual instances. Second, when enough multilingual instances are encountered, it aims at describing and interpreting some selected ones. 120 posts and 16335 comments were collected from two Facebook pages. Analysis revealed that a third of the collected data are multilingual messages. Users of Facebook mixed the four mentioned codes in writing their messages. The most frequent cases are mixing Algerian Arabic with French and Algerian Arabic with Modern Standard Arabic. 
A focused qualitative analysis followed where some examples are interpreted and explained. It seems that Algerians mix between codes when communicating online despite the fact that it is a conscious type of communication. This suggests that such behavior is not a random and corrupted way of communicating but rather an intentional and natural one.

Keywords: Algerian speech community, computer mediated communication, languages in contact, multilingualism, translanguaging

Procedia PDF Downloads 121
1472 The Feminine Disruption of Speech and Refounding of Discourse: Kristeva’s Semiotic Chora and Psychoanalysis

Authors: Kevin Klein-Cardeña

Abstract:

For Julia Kristeva, contra Lacan, the instinctive body refuses to go away within discourse. Neither is the pre-Oedipal stage of maternal fusion vanquished by the emergence of language and with it, the law of the father. On the contrary, Kristeva argues, the pre-symbolic ambivalently haunts the society of speech, simultaneously animating and threatening the very foundations of signification. Kristeva invents the term “the semiotic” to refer to this continual breaking-through of the material unconscious onto the scene of meaning. This presentation examines Kristeva’s semiotic as a theoretical gesture that itself is a disruption of discourse, re-presenting the ‘return of the repressed’ body in theory: the breaking-through of the unconscious onto the science of meaning. Faced with linguistic theories concerned with abstract sign-systems as well as Lacanian doctrine privileging the linguistic sign unequivocally over the bodily drive, Kristeva’s theoretical corpus issues the message of a psychic remainder that disrupts with a view toward replenishing theoretical accounts of language and sense. Reviewing the Semiotic challenge across these two levels (the sense and science of language), the presentation suggests that Kristeva’s offerings constitute a coherent gestalt, providing an account of the feminist nature of her dual intervention. In contrast to other feminist critiques, Kristeva’s gesture hinges on its restoration of the maternal contribution to subjectivity. Against the backdrop of ‘phallogocentric’ and ‘necrophilic’ theories that strip language of a subject and strip the subject of a body, Kristeva recasts linguistic study through a metaphor of life and birthing. Yet the semiotic fragments the subject it produces, dialoguing with an unconscious curtailed by but also exceeding the symbolic order of signification. Linguistics, too, becomes fragmented in the same measure as it is more meaningfully renewed by its confrontation with the semiotic body. 
It is Kristeva’s own body that issues this challenge, on both sides of the boundary between the theory and the theorized. The Semiotic becomes comprehensible as a project unified by its concern to disrupt and rehabilitate language, the subject, and the scholarly discourses that treat them.

Keywords: Julia Kristeva, the Semiotic, French feminism, psychoanalytic theory, linguistics

Procedia PDF Downloads 59
1471 IT-Based Global Healthcare Delivery System: An Alternative Global Healthcare Delivery System

Authors: Arvind Aggarwal

Abstract:

We have developed a comprehensive global healthcare delivery system based on information technology. It includes a medical consultation system in which a virtual consultant can give medical consultations to patients and doctors at the digital medical centre after reviewing the patient’s EMR file, which consists of the patient’s history and investigations in voice, image and data formats. The system also includes a surgical operation system, in which a remote robotic consultant can conduct surgery at the robotic surgical centre. Instant speech and text translation is incorporated in the software, so that the patient’s speech and text can be translated into the consultant’s language and vice versa. A consultant of any specialty (surgeon or physician) based in any country can provide instant healthcare consultation to any patient in any country without loss of time. Robotic surgeons based in a tertiary care hospital in any country can perform remote robotic surgery through patient-friendly telemedicine and tele-surgical centres. The patient EMR, financial data and data of all the consultants and robotic surgeons shall be stored in the cloud. It is a complete, comprehensive business model with a medical and surgical healthcare delivery system. The whole system is self-financing and can be implemented in any country. The entire system uses paperless, filmless techniques. This eliminates the use of all consumables and thereby substantially reduces the cost they incur. The consultants receive virtual patients in the form of EMRs, so the consultant saves the time and expense of travelling to the hospital to see patients. The consultant gets an electronic file ready for reporting and diagnosis. Since the time spent on the physical examination of the patient is saved, the consultant can spend quality time studying the EMR/virtual patient and give instant advice. 
The time consumed per patient is reduced, so the consultant can see more patients, and the cost of consultation per patient is therefore reduced. The additional productivity of the consultants can be channelled to serve rural patients who lack access to doctors.

Keywords: e-health, telemedicine, telecare, IT-based healthcare

Procedia PDF Downloads 169
1470 EU Innovative Economic Priorities, Contemporary Problems and Challenges of Its Formation

Authors: Gechbaia Badri

Abstract:

The paper argues that, in today's world of economic globalization, the development of innovative economic integration is one of the pressing issues of the day. The article analyzes trends in the development of the innovation economy in the EU and presents the main problems and results of forming an innovation economy, as well as the development of the economy's innovative potential. The author reckons that the European economy will contribute to the development of an innovative economic space following the financial and economic crisis that developed in recent years.

Keywords: European Union, innovative system, innovative development, innovations

Procedia PDF Downloads 296
1469 Responsive Integrative Therapeutic Method: Paradigm for Addressing Core Deficits in Autism by Balkibekova

Authors: Balkibekova Venera Serikpaevna

Abstract:

Background: Autism Spectrum Disorder (ASD) poses significant challenges in both diagnosis and treatment. Existing therapeutic interventions often target specific symptoms, necessitating the exploration of alternative approaches. This study investigates the RITM (Rhythm Integration Tapping Music) method developed by Balkibekova, which aims to foster imitation, social engagement and a wide range of emotions through brain development. Methods: A randomized controlled trial was conducted with 100 participants diagnosed with ASD, aged 1 to 4 years. Participants were randomly assigned to either the RITM therapy group or a control group receiving standard care. The RITM therapy is rooted in tapping rhythm to music, such as a march on the drums, a waltz on bells, a lullaby on the musical triangle, dancing on the tambourine, and a polka on wooden spoons. Therapy sessions were conducted over a 3-year period, with assessments at baseline, midpoint, and post-intervention. Results: Preliminary analyses reveal promising outcomes in the RITM therapy group. Participants demonstrated significant improvements in social interactions, speech understanding, emergence of speech, and adaptive behaviors compared to the control group. Careful examination of subgroup analyses provides insights into the differential effectiveness of the RITM approach across various ASD profiles. Conclusions: The findings suggest that RITM therapy, as developed by Balkibekova, holds promise as an intervention for ASD. The integrative nature of the approach, addressing multiple domains simultaneously, may contribute to its efficacy. Further research is warranted to validate these preliminary results and explore the long-term impact of RITM therapy on individuals with ASD. This abstract presents a snapshot of the research, emphasizing the significance, methodology, key findings, and implications of the RITM therapy method for consideration in an autism conference.

Keywords: RITM therapy, tapping rhythm, autism, mirror neurons, bright emotions, social interactions, communications

Procedia PDF Downloads 54
1468 A Systematic Review of the Psychometric Properties of Augmentative and Alternative Communication Assessment Tools in Adolescents with Complex Communication Needs

Authors: Nadwah Onwi, Puspa Maniam, Azmawanie A. Aziz, Fairus Mukhtar, Nor Azrita Mohamed Zin, Nurul Haslina Mohd Zin, Nurul Fatehah Ismail, Mohamad Safwan Yusoff, Susilidianamanalu Abd Rahman, Siti Munirah Harris, Maryam Aizuddin

Abstract:

Objective: Malaysia has a growing number of individuals with complex communication needs (CCN). The initiation of augmentative and alternative communication (AAC) intervention may help individuals with CCN to understand and express themselves optimally and to participate actively in activities in their daily life. AAC is defined as the multimodal use of communication abilities that allows individuals to use every mode possible to communicate with others, using a set of symbols or systems that may include symbols, aids, techniques, and strategies. It is consequently critical to evaluate the deficits to inform treatment for AAC intervention. However, no known measurement tools for evaluating users with CCN are available locally. Design: A systematic review (SR) is designed to analyze the psychometric properties of AAC assessments for adolescents with CCN published in peer-reviewed journals. Tools are rated by the methodological quality of studies and the psychometric measurement qualities of each tool. Method: A literature search identifying AAC assessment tools with psychometrically robust properties and a conceptual framework was conducted. Two independent reviewers screened the abstracts and full-text articles and reviewed bibliographies for further references. Data were extracted using standardized forms and study risk of bias was assessed. Result: The review highlights the psychometric properties of AAC assessment tools that speech-language therapists can use in the Malaysian context. The work outlines how systematic review methods may be applied to the consideration of published material that provides valuable data to initiate the development of Malay-language AAC assessment tools. Conclusion: The synthesis of evidence has provided a framework for Malaysian speech-language therapists in making an informed decision for AAC intervention in our standard operating procedure in the Ministry of Health, Malaysia.

Keywords: augmentative and alternative communication, assessment, adolescents, complex communication needs

Procedia PDF Downloads 140
1467 Carl Wernicke and the Origin of Neurolinguistics in Breslau: A Case Study in the Domain of the History of Linguistics

Authors: Aneta Daniel

Abstract:

The subject of the study is the exploration of the origins and dynamics of the development of language studies that have been labelled as neurolinguistics. It is worth mentioning that the origins of neurolinguistics are to be found in the research conducted by German scientists before the Second World War at the university in Breslau (presently Wroclaw). The dominant figure in these studies was professor Carl Wernicke, whose students continued and creatively developed their master's projects within this area. Professor Carl Wernicke, a German physician, anatomist, psychiatrist, and neuropathologist, is primarily known for his influential research on aphasia. His research, as well as that conducted by professor Paul Broca, has led to breakthroughs in the localization of brain functions, particularly speech. Years later the theses of the pioneers of cognitive neurology (Carl Wernicke and Paul Broca) were developed by other neurolinguists. The main objective of the investigation is the reconstruction of the group of scientists, the students of Carl Wernicke, who contributed to the development of neurolinguistics. The scholars were mainly neurologists and psychiatrists and dealt with a branch of science that had not been named neurolinguistics at that time. The profiles of the scholars will be analysed and presented as members of the group of researchers who have contributed to the breakthroughs in psychology and neuroscience. The research material consists of archival records documenting the research of professor Carl Wernicke and the researchers from Breslau (presently Wroclaw), which is one of the fastest growing cities in Europe. In 1870, when Carl Wernicke became a medical doctor, Breslau was full of cultural events: festivals and circus shows were held in the city center. Today we can revisit these events thanks to the 'Breslauer Zeitung (1870)', which precisely describes all the events that took place on particular days. 
It is worth noting that those were the beginnings of antisemitism in Breslau. Many theses and articles that have survived in the libraries in Wroclaw and all over the world contribute to the development of neuroscience. The history of research on the brain and speech analysis, including the history of psychology and neuroscience, areas from which neurolinguistics is derived, will be presented.

Keywords: Aphasia, brain injury, Carl Wernicke, language, neurolinguistics

Procedia PDF Downloads 375
1466 Sociology of Vis and Ramin

Authors: Farzane Yusef Ghanbari

Abstract:

A sociological analysis of the ancient poem Vis and Ramin reveals important points about the political, cultural, and social conditions of ancient Iranian history. The reciprocal relationship between the effect and structure of society helps the understanding and interpretation of the work. Therefore, informed by Goldmann's genetic structuralism and through a glance at social epistemology, this study attempts to explain the role of spells in shaping the social knowledge of ancient people. The results suggest that, due to the lack of a central government, secularism in politics, and freedom of speech and opinion, such romantic stories as Vis and Ramin, with a focal female character, have emerged.

Keywords: Persian literature, Vis and Ramin, sociology, developmental structuralism

Procedia PDF Downloads 416
1465 Second Language Perception of Japanese /Cju/ and /Cjo/ Sequences by Mandarin-Speaking Learners of Japanese

Authors: Yili Liu, Honghao Ren, Mariko Kondo

Abstract:

In the field of second language (L2) speech learning, it is well known that the phonetic and phonological characteristics of a learner’s first language (L1) are transferred into their L2 production and perception, which leads to a foreign accent. For L1 Mandarin learners of Japanese, confusion of /u/ and /o/ in /CjV/ sequences has frequently been observed in their utterances. L1 transfer is considered to be the cause of this issue; however, other factors which influence the identification of /Cju/ and /Cjo/ sequences are still under investigation. This study investigates the perception of Japanese /Cju/ and /Cjo/ units by L1 Mandarin learners of Japanese. It further examines whether learners’ proficiency, syllable position, phonetic features of preceding consonants and background noise affect learners’ performance in perception. Fifty-two Mandarin-speaking learners of Japanese and nine native Japanese speakers were recruited to participate in an identification task. Learners were divided into beginner, intermediate and advanced levels according to their Japanese proficiency. The average correct rate was used to evaluate learners’ perceptual performance. Furthermore, the correct rates of the learner groups and the control group were compared to examine learners’ nativelikeness. First, results showed that background noise tends to have an adverse effect on distinguishing /u/ and /o/ in /CjV/ sequences. Secondly, Japanese proficiency has no influence on learners’ perceptual performance either in quiet or in background noise. Third, learners did not reach a native-like level in the absence of noise. Beginner level learners performed less native-like, although higher level learners appeared to have achieved nativelikeness in the multi-talker babble noise. Finally, syllable position tends to affect distinguishing /Cju/ and /Cjo/ only under the noisy condition. 
Phonetic features of preceding consonants did not impact learners’ perception in any listening condition. The findings of this study offer insight into Japanese vowel acquisition by L1 Mandarin learners of Japanese. In addition, this study indicates that L1 transfer is not the only explanation for the confusion of /u/ and /o/ in /CjV/ sequences; factors such as listening condition and syllable position also need to be taken into consideration in future research. It also suggests that perceiving speech in a noisy environment, which is close to actual conversation, deserves more attention in pedagogy.

Keywords: background noise, Chinese learners of Japanese, /Cju/ and /Cjo/ sequences, second language perception

Procedia PDF Downloads 151
1464 Communicative Strategies in Colombian Political Speech: On the Example of the Speeches of Francia Marquez

Authors: Danila Arbuzov

Abstract:

In this article the author examines the communicative strategies used in Colombian political discourse, taking as an example the speeches of the Vice President of Colombia Francia Marquez, who took office in 2022 and marked a new development vector for the Colombian nation. The lexical and syntactic means used to achieve the communicative objectives are analyzed. The material presented may be useful for those who are interested in investigating various aspects of discursive linguistics, particularly political discourse, as well as the implementation of communicative strategies in certain types of discourse.

Keywords: political discourse, communication strategies, Colombian political discourse, Colombia, manipulation

Procedia PDF Downloads 97
1463 Classroom Discourse and English Language Teaching: Issues, Importance, and Implications

Authors: Rabi Abdullahi Danjuma, Fatima Binta Attahir

Abstract:

Classroom discourse is important, and it is worth examining what the phenomenon is and how it helps both the teacher and students in a classroom situation. This paper looks at the classroom as a traditional social setting which has its own norms and values. The paper also explains what discourse is: extended communication in speech or writing, often interactively dealing with some particular topics. It also discusses classroom discourse as the language which teachers and students use to communicate with each other in a classroom situation. The paper also looks at some strategies for effective classroom discourse. Finally, implications and recommendations are drawn.

Keywords: classroom, discourse, learning, student, strategies, communication

Procedia PDF Downloads 585
1462 Reading and Teaching Poetry as Communicative Discourse: A Pragma-Linguistic Approach

Authors: Omnia Elkommos

Abstract:

Language is communication on several discourse levels. The target of teaching a language and the literature of a foreign language is to communicate a message. Reading, appreciating, analysing, and interpreting poetry as a sophisticated rhetorical expression of human thoughts, emotions, and philosophical messages is more feasible through the use of linguistic pragmatic tools from a communicative discourse perspective. The poet's intention, speech act, illocutionary act, and perlocutionary goal can be better understood when communicative situational context as well as linguistic discourse structure theories are employed. The use of linguistic theories in the teaching of poetry is, therefore, intrinsic to students' comprehension, interpretation, and appreciation of poetry of the different ages. It is the purpose of this study to show how both teachers and students can apply these linguistic theories and tools to dramatic poetic texts for an engaging, enlightening, and effective interpretation and appreciation of the language. Theories drawn from areas of pragmatics, discourse analysis, embedded discourse level, communicative situational context, and other linguistic approaches were applied to selected poetry texts from different centuries. Further, in a simple statistical count, the number of poems with dialogic dramatic discourse with two or three embedded levels of discourse in different anthologies outweighs the number of descriptive poems with one level of discourse, between the poet and the reader. Poetry is thus discourse on one, two, or three levels. It is, therefore, recommended that teachers and students in the area of ESL/EFL use linguistic theories for a better understanding of poetry as communicative discourse. The practice of applying these linguistic theories in classrooms and in research will allow them to perceive the language and its linguistic, social, and cultural aspects. 
Texts will become live illocutionary acts with a perlocutionary goal rather than mere literary texts in anthologies.

Keywords: coda, commissives, communicative situation, context of culture, context of reference, context of utterance, dialogue, directives, discourse analysis, dramatic discourse interaction, duologue, embedded discourse levels, language for communication, linguistic structures, literary texts, poetry, pragmatic theories, reader response, speech acts (macro/micro), stylistics, teaching literature, TEFL, terms of address, turn-taking

Procedia PDF Downloads 315
1461 Omni-Modeler: Dynamic Learning for Pedestrian Redetection

Authors: Michael Karnes, Alper Yilmaz

Abstract:

This paper presents the application of the Omni-Modeler to pedestrian redetection. The pedestrian redetection task creates several challenges when applying deep neural networks (DNNs) due to the variation of pedestrian appearance with camera position, the variety of environmental conditions, and the specificity required to distinguish one pedestrian from another. DNNs require significant training sets and are not easily adapted to changes in class appearances or changes in the set of classes held in their knowledge domain. Pedestrian redetection requires an algorithm that can actively manage its knowledge domain as individuals move in and out of the scene, as well as learn individual appearances from a few frames of a video. The Omni-Modeler is a dynamically learning few-shot visual recognition algorithm developed for tasks with limited training data availability. The Omni-Modeler adapts the knowledge domain of pre-trained deep neural networks to novel concepts with a calculated localized language encoder. The Omni-Modeler knowledge domain is generated by creating a dynamic dictionary of concept definitions, which are directly updatable as new information becomes available. Query images are identified through nearest neighbor comparison to the learned object definitions. The study presented in this paper evaluates its performance in re-identifying individuals as they move through a scene in both single-camera and multi-camera tracking applications. The results demonstrate that the Omni-Modeler shows potential for across-camera view pedestrian redetection and is highly effective for single-camera redetection, with 93% accuracy across 30 individuals using 64 example images for each individual.
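The idea of an updatable dictionary of concept definitions queried by nearest neighbor comparison can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `ConceptDictionary`, the use of a running mean as the "definition" of each individual, and Euclidean distance as the comparison. In the paper the feature vectors would come from a pre-trained network; here they are plain lists of numbers.

```python
import math


class ConceptDictionary:
    """Hypothetical updatable dictionary: one mean feature vector per label."""

    def __init__(self):
        self._sums = {}    # label -> element-wise sum of feature vectors seen
        self._counts = {}  # label -> number of examples seen

    def add_example(self, label, features):
        """Add one example; new labels are created on first sight."""
        if label not in self._sums:
            self._sums[label] = [0.0] * len(features)
            self._counts[label] = 0
        self._sums[label] = [s + f for s, f in zip(self._sums[label], features)]
        self._counts[label] += 1

    def remove(self, label):
        """Drop a label, e.g. when an individual leaves the scene."""
        self._sums.pop(label, None)
        self._counts.pop(label, None)

    def identify(self, features):
        """Return the label whose mean vector is nearest, or None if empty."""
        best_label, best_dist = None, math.inf
        for label, total in self._sums.items():
            n = self._counts[label]
            prototype = [s / n for s in total]
            dist = math.dist(prototype, features)
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label
```

Because each definition is only a sum and a count, adding a new example or removing an individual does not require retraining, which is the property the abstract emphasizes.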

Keywords: dynamic learning, few-shot learning, pedestrian redetection, visual recognition

Procedia PDF Downloads 65
1460 Input and Interaction as Training for Cognitive Learning: Variation Sets Influence the Sudden Acquisition of Periphrastic estar 'to be' + verb + -ndo*

Authors: Mary Rosa Espinosa-Ochoa

Abstract:

Some constructions appear suddenly in children’s speech and are productive from the beginning. These constructions are supported by others, previously acquired, with which they share semantic and pragmatic features. Thus, for example, the acquisition of the passive voice in German is supported by other constructions with which it shares the lexical verb sein (“to be”). This also occurs in Spanish, in the acquisition of the progressive aspectual periphrasis estar (“to be”) + verb root + -ndo (present participle), supported by locative constructions acquired earlier with the same verb. The periphrasis shares with the locative constructions not only the lexical verb estar, but also pragmatic relations. Both constructions can be used to answer the question ¿Dónde está? (“Where is he/she/it?”), whose answer could be either Está aquí (“He/she/it is here”) or Se está bañando (“He/she/it is taking a bath”). This study is a corpus-based analysis of two children (1;08-2;08) and the input directed to them: it proposes that the pragmatic and semantic support from previously-acquired constructions comes from the input, during interaction with others. This hypothesis is based on analysis of constructions with estar, whose use to express temporal change (which differentiates it from its counterpart ser [“to be”]), is given in variation sets, similar to those described by Küntay and Slobin (2002), that allow the child to perceive the change of place experienced by nouns that function as its grammatical subject. For example, at different points during a bath, the mother says: El jabón está aquí “The soap is here” (beginning of bath); five minutes later, the soap has moved, and the mother says el jabón está ahí “the soap is there”; the soap moves again later on and she says: el jabón está abajo de ti “the soap is under you”. “The soap” is the grammatical subject of all of these utterances.
The Spanish verb + -ndo is a progressive phase aspect encoder of a dynamic state that generates a token. The verb + -ndo is also combined with the verb estar to encode. It is proposed here that the phases experienced in interaction with the adult, in events related to the verb estar, allow a child to generate this dynamicity and token reading of the verb + -ndo. In this way, children begin to produce the periphrasis suddenly and productively, even though neither the periphrasis nor the verb + -ndo itself is frequent in adult speech.

Keywords: child language acquisition, input, variation sets, Spanish language

Procedia PDF Downloads 135
1459 Dysphagia Tele Assessment Challenges Faced by Speech and Swallow Pathologists in India: Questionnaire Study

Authors: B. S. Premalatha, Mereen Rose Babu, Vaishali Prabhu

Abstract:

Background: Dysphagia must be assessed, either subjectively or objectively, in order to properly address the swallowing difficulty. Providing therapeutic care to patients with dysphagia via tele mode was one approach to providing clinical services during the COVID-19 epidemic. As a result, the teleassessment of dysphagia has increased in India. Aim: This study aimed to identify challenges faced by Indian SLPs while providing teleassessment to individuals with dysphagia during the outbreak of COVID-19 from 2020 to 2021. Method: The study was carried out after receiving approval from the institute's institutional review board and ethics committee. The study was cross-sectional in nature and lasted from 2020 to 2021. The study enrolled participants who met its inclusion and exclusion criteria. Based on the sample size calculations, it was decided to recruit roughly 246 people. The research was done in three stages: questionnaire development, content validation, and questionnaire administration. Five speech and hearing professionals validated the content of the questionnaire for errors and clarity. The questionnaires, written in Microsoft Word and then converted to Google Forms, were sent to participants via platforms such as e-mail and WhatsApp. SPSS software was used to examine the data. Results: The study's findings were examined in light of the obstacles that Indian SLPs encounter. Only 135 people responded. During the COVID-19 lockdowns, 38% of participants said they did not deal with dysphagia patients. After the lockdown, 70.4% of SLPs kept working with dysphagia patients, while 29.6% did not. The main problems in completing tele-evaluation of dysphagia were apparent from the beginning of the oromotor examination.
Around 37.5% of SLPs said they do not undertake the OPME online because of difficulties in doing the evaluation, such as the need for repeated instructions from patients and family members and trouble visualizing structures in various positions. The majority of SLPs found online assessments inefficient and time-consuming. A larger percentage of SLPs stated that they would not recommend tele-evaluation in dysphagia to their colleagues. SLPs' use of dysphagia assessment has decreased as a result of the epidemic. As for the amount of food, the majority proposed a small amount. Apart from positioning the patient for assessment and gaining less cooperation from the family, most SLPs found that Internet speed was a source of concern and a barrier. Hearing impairment and the presence of a tracheostomy in patients with dysphagia proved to be the most difficult conditions to treat online. For patients with NPO, the majority of SLPs did not advise tele-evaluation. Oral meal residue was more visible in the anterior region of the oral cavity. The majority of SLPs reported more anterior than posterior leakage. Even though the majority of SLPs could detect aspiration by coughing, many found it difficult to discern the gurgling tone of speech after swallowing. Conclusion: The current study sheds light on the difficulties that Indian SLPs experience when assessing dysphagia via tele mode, indicating that tele-assessment of dysphagia has yet to gain importance in India.

Keywords: dysphagia, teleassessment, challenges, Indian SLP

Procedia PDF Downloads 122
1458 Recognition of Arrest Patients and Application of Basic Life Support by Bystanders in the Field

Authors: Behcet Al, Mehmet Murat Oktay, Suat Zengin, Mustafa Sabak, Cuma Yildirim

Abstract:

Objective: The recognition of arrest patients and the application of basic life support (BLS) by bystanders in the field, and the activation of emergency services, were evaluated in the present study. Methodology: The present study was carried out prospectively by the Emergency Department of the Medicine Faculty of Gaziantep University at 33 emergency health centers in Gaziantep between December 2012 and April 2014. Of 539 arrested patients, 171 were included in the study. Results: A total of 171 patients, 118 (69%) male and 53 (31%) female, were included in this study. Of the patients, 32.2% had syncope and 24% had shortness of breath just before arrest. The majority of arrests occurred at home (61.4%), followed by rural areas (11.7%). Of the calls for help, 48.5% were made by family members. Only 15.2% of the calls were made within the first minute of arrest. BLS was applied by bystanders in 22.2% of cases. Of the bystanders, 47.4% had attended a BLS course. The emergency service reached the field in a mean of 8.43 min. Of the cases, 55% (n=94) were initially evaluated as exitus by emergency staff. The most frequently observed rhythm was asystole (73.1%). BLS and advanced life support (ALS) were applied in the field to 98.8% and 60% of cases, respectively. Of the cases, 10.5% (n=18) were defibrillated and 26.3% (n=45) were intubated endotracheally. The largest group (48.5%) of staff who applied BLS and ALS in the field were emergency medicine technicians. CPR was performed on 86.5% (n=148) of cases in the ambulance during transport. The mean arrival time at the emergency department (ED) was 9.13 min. On arrival at the ED, 15.2% of patients needed defibrillation. In the ED, 91.2% (n=156) of patients resulted in exitus, and 15 (8.8%) patients were discharged (nine with recovery, six with damage). Conclusion: The rate of bystander intervention for arrest patients is still low. To obtain a high survival rate, BLS training should be extended among the public, especially among caregivers.

Keywords: arrest patients, cardiopulmonary resuscitation, bystanders, chest compressions, prehospital

Procedia PDF Downloads 381
1457 Improvement of Microscopic Detection of Acid-Fast Bacilli for Tuberculosis by Artificial Intelligence-Assisted Microscopic Platform and Medical Image Recognition System

Authors: Hsiao-Chuan Huang, King-Lung Kuo, Mei-Hsin Lo, Hsiao-Yun Chou, Yusen Lin

Abstract:

The most robust and economical method for laboratory diagnosis of TB is to identify mycobacterial bacilli (AFB) by acid-fast staining, despite its disadvantages of low sensitivity and labor intensiveness. Though digital pathology has become popular in medicine, an automated microscopic system for microbiology is still not available. A new AI-assisted automated microscopic system, consisting of a microscopic scanner and a recognition program powered by big data and deep learning, may significantly increase the sensitivity of TB smear microscopy. Thus, the objective is to evaluate such an automatic system for the identification of AFB. A total of 5,930 smears were enrolled in this study. An intelligent microscope system (TB-Scan, Wellgen Medical, Taiwan) was used for microscopic image scanning and AFB detection. 272 AFB smears were used for transfer learning to increase the accuracy. Referee medical technicians served as the gold standard for resolving discrepant results. Results showed that, across a total of 1,726 AFB smears, the automated system's accuracy, sensitivity and specificity were 95.6% (1,650/1,726), 87.7% (57/65), and 95.9% (1,593/1,661), respectively. Compared to culture, the sensitivity for human technicians was only 33.8% (38/142); however, the automated system achieved 74.6% (106/142), which is significantly higher than human technicians. This is the first such automated microscope system for TB smear testing evaluated in a controlled trial. This automated system could achieve higher TB smear sensitivity and laboratory efficiency and may complement molecular methods (e.g., GeneXpert) to reduce the total cost of TB control. Furthermore, such an automated system can be accessed remotely over the internet and can be deployed in areas with limited medical resources.
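The accuracy, sensitivity and specificity quoted above follow from the reported fractions. As a worked check, the sketch below derives the four confusion counts from those fractions (57 of 65 positive smears detected, 1,593 of 1,661 negative smears correctly rejected) and recomputes the three figures. The function name `screening_metrics` is illustrative and not taken from the paper.

```python
def screening_metrics(tp, fn, tn, fp):
    """Standard screening metrics from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # detected positives / all positives
        "specificity": tn / (tn + fp),   # rejected negatives / all negatives
    }


# Counts implied by the reported fractions 57/65 and 1,593/1,661.
metrics = screening_metrics(tp=57, fn=65 - 57, tn=1593, fp=1661 - 1593)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

The recomputed values round to 95.6%, 87.7% and 95.9%, matching the abstract, and the 57 + 1,593 correct results give the 1,650/1,726 accuracy fraction.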

Keywords: TB smears, automated microscope, artificial intelligence, medical imaging

Procedia PDF Downloads 209
1456 The Significance of Islamic Concept of Good Faith to Cure Flaws in Public International Law

Authors: M. A. H. Barry

Abstract:

The concepts of good faith (husn al-niyyah) and fair dealing (Nadl) are the fundamental guiding elements in all contracts and other agreements under Islamic law. The teachings of the Al-Quran and of Prophet Muhammad (Peace Be upon Him) firmly command people to act in good faith in all dealings. There are several Quranic verses and sayings of the Prophet that stress the significance of dealing honestly and fairly in all transactions. Under English law, good faith is not considered a fundamental requirement for the formation of a legal contract. However, the concept of good faith in private contracts is recognized by the civil law system and in Article 7(1) of the Convention on International Sale of Goods (CISG-Vienna Convention-1980). It took several centuries for the international trading community to recognize the significance of the concept of good faith for international sale of goods transactions. Nevertheless, the recognition of good faith in civil law is confined to commercial contracts. Subsequent to the CISG, this concept has made inroads into private international law. There are submissions in favour of applying the good faith concept to public international law based on tacit recognition by international conventions and international tribunals. However, under public international law the concept of good faith is not recognized as a source of rights or obligations. This weakens the spirit of the good faith concept, particularly when determining international disputes. This also creates a fundamental flaw, because the absence of good faith application means that breaches tainted by bad faith are tolerated. The objective of this research is to evaluate, examine and analyze the application of the concept of good faith in modern laws and to identify its limitations, in comparison with the Islamic concept of good faith.
This paper also identifies the problems and issues connected with the non-application of this concept to public international law. This research consists of three key components: (1) the preliminary inquiry, (2) subject analysis and discovery of research results, and (3) examination of the challenging problems, concluding with proposals. The preliminary inquiry is based on both primary and secondary sources. The same sources are used for the subject analysis. This research also has both inductive and deductive features. The Islamic concept of good faith covers all situations and circumstances where bad faith causes unfairness to the affected parties, especially the weak parties. Under Islamic law, the concept of good faith is a source of rights and obligations, as Islam prohibits any person from committing wrongful or delinquent acts in any dealing, whether in private or public life. This rule is applicable not only to individuals but also to institutions, states, and international organizations. This paper explains how unfairness is caused by non-recognition of the good faith concept as a source of rights or obligations under public international law and provides legal and non-legal reasons to show why the Islamic formulation is important.

Keywords: good faith, the civil law system, the Islamic concept, public international law

Procedia PDF Downloads 130
1455 The Process of Irony Comprehension in Young Children: Evidence from Monolingual and Bilingual Preschoolers

Authors: Natalia Banasik

Abstract:

Comprehension of verbal irony is an example of pragmatic competence in understanding figurative language. The knowledge of how it develops may shed new light on the understanding of social and communicative competence that is crucial for one's effective functioning in society. Researchers agree it is a competence that develops late in a child’s development. One of the abilities that seems crucial for irony comprehension is theory of mind (ToM), that is, the ability to understand that others may have beliefs, desires and intentions different from one’s own. Although both theory of mind and irony comprehension require the ability to understand the figurative use of a false description of reality, the exact relationship between them is still unknown. Also, even though irony comprehension in children has been studied for over thirty years, the results of the studies are inconsistent as to the age when this competence is acquired. The presented study aimed to answer questions about the developmental trajectories of irony comprehension and ascribing function to ironic utterances by preschool children. Specifically, we were interested in how it is related to the development of ToM and how comprehension of the function of irony changes with age. Data were collected from over 150 monolingual, Polish-speaking children and (so far) thirty bilingual children speaking Polish and English who live in the US. Four-, five- and six-year-olds were presented with a story comprehension task in the form of audio and visual stimuli programmed in the E-prime software (pre-recorded narrated stories, some of which included ironic utterances, and pictures accompanying the stories displayed on a touch screen). Following the presentation, the children were asked to answer a series of questions. The questions checked the children’s understanding of the intended utterance meaning, evaluation of the degree to which it was funny and evaluation of how nice the speaker was.
The children responded by touching the screen, which made it possible to measure reaction times. Additionally, the children were asked to explain why the speaker had uttered the ironic statement. Both quantitative and qualitative analyses were applied. The results of our study indicate that for irony recognition there is a significant difference among the three age groups, but what is new is that children as young as four do understand the real meaning behind the ironic statement as long as the utterance is not grammatically or lexically complex. Also, there is a clear correlation between ToM and irony comprehension. Although four-year-olds and six-year-olds understand the real meaning of the ironic utterance, it is not until the age of six that children start to explain the reason for using this marked form of expression. They talk about the speaker's intention to tell a joke, be funny, or protect the listener's emotions. There are also some metalinguistic references, such as "mommy sometimes says things that don't make sense and this is called a metaphor".

Keywords: child's pragmatics, figurative speech, irony comprehension in children, theory of mind and irony

Procedia PDF Downloads 300
1454 Engendered Noises: The Gender Politics of Sensorial Pleasure in Neoliberal Korean Food Commercials

Authors: Eunyup Yeom

Abstract:

The roles of male and female in the context of cuisine have developed into stereotypes throughout history. However, with Korea’s fast advancement in politics, technology, society and social standards, gender stereotypes have become blurred. This is not to say that such stereotypes no longer exist, for they remain present in media and advertisements, embedding ‘idealistic’ ideas into the unconscious minds of viewers. Many media outlets, especially commercials, portray males expressing pleasure in the food they are advertising through audible qualities generally considered ‘rude’ and ‘unmannered’ in Korean society. Females, on the other hand, express such pleasures only verbally. This stereotype is displayed bluntly in instant noodle, namely ramen, commercials. This research explores the cultural significance of a type of audible gesture found in Korean speech, termed the Fricative Voice Gesture (FVG). There are two forms of FVGs: the reactive and the prosodic. The reactive FVG is a legitimate form of expression, while the prosodic FVG works as a speech intensifier. In order to understand this stereotype of who is authorized to express sensorial pleasure as a reactive FVG as opposed to a prosodic FVG, information was extracted from interviews, and numerous ramen/instant noodle commercials and their appearances in other media were dissected. The commercials were meticulously analyzed in all aspects of dialogue, featured contents, background music, actors and/or actresses selling the product, body language, and voice gestures. To effectively understand the exact impact these commercials have on the audience, each commercial was viewed with an interviewee. In this research, there were three main informants, all of whom were Korean students residing in South Korea. All three interviewees were able to attend interview and commercial viewing sessions via Skype.
This research, overall, focuses on, and concludes with, Harkness’s statement that the reactive FVG is a recognizable index of the privileging of males in Korean cultural norms and that, in parallel, food commercials still conform to male ideals and fantasies.

Keywords: advertisement, food politics, fricative voice gestures, gender politics

Procedia PDF Downloads 215
1453 BERT-Based Chinese Coreference Resolution

Authors: Li Xiaoge, Wang Chaodong

Abstract:

We introduce the first Chinese Coreference Resolution Model based on BERT (CCRM-BERT) and show that it significantly outperforms all previous work. The key idea is to consider the features of the mention, such as part of speech, width of spans, and distance between spans, and the influence of each feature on the model is analyzed. The model computes mention embeddings that combine BERT with these features. Compared to the existing state-of-the-art span-ranking approach, our model significantly improves accuracy on the Chinese OntoNotes benchmark.

Keywords: BERT, coreference resolution, deep learning, natural language processing

Procedia PDF Downloads 198
1452 Protective Effect of the Histamine H3 Receptor Antagonist DL77 in Behavioral Cognitive Deficits Associated with Schizophrenia

Authors: B. Sadek, N. Khan, D. Łażewska, K. Kieć-Kononowicz

Abstract:

The effects of the non-imidazole histamine H3 receptor (H3R) antagonist DL77 in passive avoidance paradigm (PAP) and novel object recognition (NOR) task in MK801-induced cognitive deficits associated with schizophrenia (CDS) in adult male rats, and applying donepezil (DOZ) as a reference drug were investigated. The results show that acute systemic administration of DL77 (2.5, 5, and 10 mg/kg, i.p.) significantly improved MK801-induced (0.1 mg/kg, i.p.) memory deficits in PAP. The ameliorating activity of DL77 (5 mg/kg, i.p.) in MK801-induced deficits was partly reversed when rats were pretreated with the centrally-acting H2R antagonist zolantidine (ZOL, 10 mg/kg, i.p.) or with the antimuscarinic antagonist scopolamine (SCO, 0.1 mg/kg, i.p.), but not with the CNS penetrant H1R antagonist pyrilamine (PYR, 10 mg/kg, i.p.). Moreover, the memory enhancing effect of DL77 (5 mg/kg, i.p.) in MK801-induced memory deficits in PAP was strongly reversed when rats were pretreated with a combination of ZOL (10 mg/kg, i.p.) and SCO (1.0 mg/kg, i.p.). Furthermore, the significant ameliorative effect of DL77 (5 mg/kg, i.p.) on MK801-induced long-term memory (LTM) impairment in NOR test was comparable to the DOZ-provided memory-enhancing effect, and was abrogated when animals were pretreated with the histamine H3R agonist R-(α)-methylhistamine (RAMH, 10 mg/kg, i.p.). However, DL77 (5 mg/kg, i.p.) failed to provide procognitive effect on MK801-induced short-term memory (STM) impairment in NOR test. In addition, DL77 (5 mg/kg) did not alter anxiety levels and locomotor activity of animals naive to elevated-plus maze (EPM), demonstrating that improved performances with DL77 (5 mg/kg) in PAP or NOR are unrelated to changes in emotional responding or spontaneous locomotor activity. These results provide evidence for the potential of H3Rs for the treatment of neurodegenerative disorders related to impaired memory function, e.g. CDS.

Keywords: histamine H3 receptor, antagonist, learning, memory impairment, passive avoidance paradigm, novel object recognition

Procedia PDF Downloads 189
1451 The Application of a Neural Network in the Reworking of Accu-Chek to Wrist Bands to Monitor Blood Glucose in the Human Body

Authors: J. K Adedeji, O. H Olowomofe, C. O Alo, S.T Ijatuyi

Abstract:

The issue of high blood sugar level, the effects of which might end up as diabetes mellitus, is now becoming a rampant cardiovascular disorder in our community. In recent times, a lack of awareness among most people makes this disease a silent killer. The situation calls for urgency, hence the need to design a device that serves as a monitoring tool, such as a wrist watch, to alert those living with high blood glucose to the danger ahead of time, as well as to introduce a mechanism for checks and balances. The neural network architecture assumed an 8-15-10 configuration, with eight neurons at the input stage including a bias, 15 neurons in the hidden layer at the processing stage, and 10 neurons at the output stage indicating likely symptom cases. The inputs are formed using the exclusive OR (XOR), with the expectation of getting an XOR output as the threshold value for diabetic symptom cases. The neural algorithm is coded in the Java language with 1000 epoch runs to bring the errors to the barest minimum. The internal circuitry of the device comprises the compatible hardware requirement that matches the nature of each of the input neurons. Light emitting diodes (LEDs) of red, green, and yellow colors are used as the output of the neural network to show pattern recognition for severe cases, pre-hypertensive cases and normal cases without traces of diabetes mellitus. The research concluded that the neural network is a more efficient Accu-Chek design tool for the proper monitoring of high glucose levels than the conventional methods of carrying out blood tests.

Keywords: Accu-Chek, diabetes, neural network, pattern recognition

Procedia PDF Downloads 136
1450 3D Human Face Reconstruction in Unstable Conditions

Authors: Xiaoyuan Suo

Abstract:

3D object reconstruction is a broad research area within the computer vision field involving many stages and still-open problems. One of the existing challenges in this field lies with micromotion, such as the effect of facial expressions on the appearance of the human or animal face. Similar literature in this field focuses on 3D reconstruction in stable conditions, such as an existing image or photos taken in a rather static environment, while the purpose of this work is to discuss a flexible scan system using multiple cameras that can correctly reconstruct stable and moving 3D objects, the human face with expression in particular. Further, a mathematical model is proposed at the end of this paper to automate the 3D object reconstruction process. The reconstruction process takes several stages. Firstly, a set of simple 2D lines is projected onto the object, yielding a set of uneven curvy lines that represent the 3D numerical data of the surface. The lines and their shapes help to identify the object’s 3D construction in pixels. With the two recorded angles and their distance from the camera, a simple mathematical calculation gives the resulting coordinate of each projected line in an absolute 3D space. This proposed research will benefit many practical areas, including but not limited to biometric identification, authentication, cybersecurity, preservation of cultural heritage, and drama acting, especially with rapid and complex facial gestures. Specifically, this work will (I) provide a brief survey of comparable techniques existing in this field; (II) discuss a set of specialized methodologies or algorithms for effective reconstruction of 3D objects; (III) implement and test the developed methodologies; (IV) verify findings with data collected from experiments; and (V) conclude with lessons learned and final thoughts.
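The abstract does not spell out the "simple mathematical calculation" that turns two recorded angles and a distance into a coordinate. One common way to do this is planar triangulation, sketched below under assumptions that are not taken from the paper: two viewpoints sit a known `baseline` apart on the x-axis, and each angle is measured between the baseline and the line of sight to the same surface point. The function name `triangulate` is illustrative.

```python
import math


def triangulate(baseline, angle_left, angle_right):
    """Locate a point seen from two viewpoints on a common baseline.

    Assumed setup: the left viewpoint is at x = 0 and the right one at
    x = baseline. Angles are in radians, measured between the baseline
    and each line of sight. Returns (x, z), with z the depth.
    """
    # From tan(angle_left) = z / x and tan(angle_right) = z / (baseline - x):
    #   baseline = z * (cot(angle_left) + cot(angle_right))
    cot_sum = 1.0 / math.tan(angle_left) + 1.0 / math.tan(angle_right)
    z = baseline / cot_sum
    x = z / math.tan(angle_left)
    return x, z


# A point seen at 45 degrees from both ends of a 2-unit baseline
# lies midway between the viewpoints, one unit away.
x, z = triangulate(2.0, math.radians(45), math.radians(45))
```

Repeating this for every sampled point along each projected line would give the per-line coordinates the abstract refers to; the third coordinate would come from the image row, which this two-dimensional sketch leaves out.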

Keywords: 3D photogrammetry, 3D object reconstruction, facial expression recognition, facial recognition

Procedia PDF Downloads 139
1449 Faster Pedestrian Recognition Using Deformable Part Models

Authors: Alessandro Preziosi, Antonio Prioletti, Luca Castangia

Abstract:

Deformable part models achieve high precision in pedestrian recognition, but all publicly available implementations are too slow for real-time applications. We implemented a deformable part model algorithm fast enough for real-time use by exploiting information about the camera position and orientation. This implementation is both faster and more precise than alternative DPM implementations. These results are obtained by computing convolutions in the frequency domain and using lookup tables to speed up feature computation. This approach is almost an order of magnitude faster than the reference DPM implementation, with no loss in precision. Knowing the position of the camera with respect to the horizon, it is also possible to prune many hypotheses based on their size and location. The range of acceptable sizes and positions is set by looking at the statistical distribution of bounding boxes in labelled images. With this approach it is not necessary to compute the entire feature pyramid: for example, higher-resolution features are only needed near the horizon. This results in an increase in mean average precision of 5% and an increase in speed by a factor of two. Furthermore, to reduce misdetections involving small pedestrians near the horizon, input images are supersampled near the horizon. Supersampling the image at 1.5 times the original scale results in an increase in precision of about 4%. The implementation was tested against the public KITTI dataset, obtaining an 8% improvement in mean average precision over the best-performing DPM-based method. By allowing for a small loss in precision, computational time can easily be brought down to our target of 100 ms per image, reaching a solution that is faster and still more precise than all publicly available DPM implementations.
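The pruning step can be illustrated with a small sketch. It rests on assumptions that are not stated in the abstract: boxes are given as `(top_y, bottom_y)` in image rows (rows grow downward), the horizon is a single known row, and the "acceptable size" is expressed as the ratio of box height to the distance of the box bottom below the horizon. The function names and the use of a plain min/max range are illustrative only.

```python
def fit_size_range(labelled_boxes, horizon_y):
    """Learn the range of height / (bottom - horizon) from labelled boxes."""
    ratios = [
        (bottom - top) / (bottom - horizon_y)
        for top, bottom in labelled_boxes
        if bottom > horizon_y  # only boxes standing below the horizon row
    ]
    if not ratios:
        raise ValueError("no labelled box lies below the horizon")
    return min(ratios), max(ratios)


def keep_hypothesis(box, horizon_y, size_range):
    """Return True if a candidate box has a plausible size for its position."""
    top, bottom = box
    if bottom <= horizon_y:
        return False  # feet at or above the horizon: not on the ground plane
    ratio = (bottom - top) / (bottom - horizon_y)
    lo, hi = size_range
    return lo <= ratio <= hi


# Illustrative numbers only: horizon at row 200, two labelled pedestrians.
size_range = fit_size_range([(110, 300), (145, 250)], horizon_y=200)
candidates = [(160, 240), (100, 240), (100, 190)]
kept = [box for box in candidates if keep_hypothesis(box, 200, size_range)]
```

Here only the first candidate survives: the second is far too tall for where its feet are, and the third ends above the horizon. A real system would likely use robust percentiles rather than the raw min and max, and would apply the same test before computing features so that the skipped pyramid levels are never evaluated.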

Keywords: autonomous vehicles, deformable part model, dpm, pedestrian detection, real time

Procedia PDF Downloads 267
1448 Non-Invasive Data Extraction from Machine Display Units Using Video Analytics

Authors: Ravneet Kaur, Joydeep Acharya, Sudhanshu Gaur

Abstract:

Artificial Intelligence (AI) has the potential to transform manufacturing by improving shop floor processes such as production, maintenance and quality. However, industrial datasets are notoriously difficult to extract in a real-time, streaming fashion, thus negating potential AI benefits. A main example is specialized industrial controllers operated by custom software, which complicates the process of connecting them to an Information Technology (IT) based data acquisition network. Security concerns may also limit direct physical access to these controllers for data acquisition. To connect the Operational Technology (OT) data stored in these controllers to an AI application in a secure, reliable and available way, we propose a novel Industrial IoT (IIoT) solution in this paper. In this solution, we demonstrate how video cameras can be installed on a factory shop floor to continuously obtain images of the controllers' human machine interfaces (HMIs). We propose image pre-processing to segment the HMI into regions of streaming data and regions of fixed meta-data. We then evaluate the performance of multiple Optical Character Recognition (OCR) technologies, such as Tesseract and Google Vision, in recognizing the streaming data and test them on typical factory HMIs under realistic lighting conditions. Finally, we use the meta-data to match the OCR output with the temporal, domain-dependent context of the data to improve the accuracy of the output. Our IIoT solution enables reliable and efficient data extraction, which will improve the performance of subsequent AI applications.
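The abstract does not describe how the OCR output is matched against temporal and domain context, so the following is only a sketch of one plausible rule, with every name and threshold assumed for illustration. It treats a field as numeric with a valid range (domain context, as might be read from the fixed meta-data region) and a maximum change per frame (temporal context), repairs a few common character confusions, and falls back to the previous reading when the new one is implausible.

```python
# Hypothetical table of characters that OCR engines often confuse with digits.
CONFUSIONS = str.maketrans(
    {"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"}
)


def clean_reading(raw_text, previous, lo, hi, max_step):
    """Turn raw OCR text into a number, or keep `previous` if implausible.

    lo, hi    -- valid range for this field (domain context)
    max_step  -- largest believable change between frames (temporal context)
    """
    candidate = raw_text.strip().translate(CONFUSIONS)
    try:
        value = float(candidate)
    except ValueError:
        return previous  # not a number even after repairs
    if not lo <= value <= hi:
        return previous  # outside the field's valid range
    if previous is not None and abs(value - previous) > max_step:
        return previous  # jump too large for one frame
    return value
```

For example, with a previous reading of 104.0, a range of 0 to 200 and a maximum step of 5, the raw text "1O5" is repaired to 105.0, while "7O5" is rejected as out of range and the previous value is kept.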

Keywords: human machine interface, industrial internet of things, internet of things, optical character recognition, video analytics

Procedia PDF Downloads 98