Search results for: data recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25660

Search results for: data recognition

25270 Using Deep Learning Real-Time Object Detection Convolution Neural Networks for Fast Fruit Recognition in the Tree

Authors: K. Bresilla, L. Manfrini, B. Morandi, A. Boini, G. Perulli, L. C. Grappadelli

Abstract:

Image/video processing for fruit in the tree using hard-coded feature extraction algorithms have shown high accuracy during recent years. While accurate, these approaches even with high-end hardware are computationally intensive and too slow for real-time systems. This paper details the use of deep convolution neural networks (CNNs), specifically an algorithm (YOLO - You Only Look Once) with 24+2 convolution layers. Using deep-learning techniques eliminated the need for hard-code specific features for specific fruit shapes, color and/or other attributes. This CNN is trained on more than 5000 images of apple and pear fruits on 960 cores GPU (Graphical Processing Unit). Testing set showed an accuracy of 90%. After this, trained data were transferred to an embedded device (Raspberry Pi gen.3) with camera for more portability. Based on correlation between number of visible fruits or detected fruits on one frame and the real number of fruits on one tree, a model was created to accommodate this error rate. Speed of processing and detection of the whole platform was higher than 40 frames per second. This speed is fast enough for any grasping/harvesting robotic arm or other real-time applications.

Keywords: artificial intelligence, computer vision, deep learning, fruit recognition, harvesting robot, precision agriculture

Procedia PDF Downloads 405
25269 Fruit Identification System in Sweet Orange Citrus (L.) Osbeck Using Thermal Imaging and Fuzzy

Authors: Ingrid Argote, John Archila, Marcelo Becker

Abstract:

In agriculture, intelligent systems applications have generated great advances in automating some of the processes in the production chain. In order to improve the efficiency of those systems is proposed a vision system to estimate the amount of fruits in sweet orange trees. This work presents a system proposal using capture of thermal images and fuzzy logic. A bibliographical review has been done to analyze the state-of-the-art of the different systems used in fruit recognition, and also the different applications of thermography in agricultural systems. The algorithm developed for this project uses the metrics of the fuzzines parameter to the contrast improvement and segmentation of the image, for the counting algorith m was used the Hough transform. In order to validate the proposed algorithm was created a bank of images of sweet orange Citrus (L.) Osbeck acquired in the Maringá Farm. The tests with the algorithm Indicated that the variation of the tree branch temperature and the fruit is not very high, Which makes the process of image segmentation using this differentiates, This Increases the amount of false positives in the fruit counting algorithm. Recognition of fruits isolated with the proposed algorithm present an overall accuracy of 90.5 % and grouped fruits. The accuracy was 81.3 %. The experiments show the need for a more suitable hardware to have a better recognition of small temperature changes in the image.

Keywords: Agricultural systems, Citrus, Fuzzy logic, Thermal images.

Procedia PDF Downloads 221
25268 Neuron Efficiency in Fluid Dynamics and Prediction of Groundwater Reservoirs'' Properties Using Pattern Recognition

Authors: J. K. Adedeji, S. T. Ijatuyi

Abstract:

The application of neural network using pattern recognition to study the fluid dynamics and predict the groundwater reservoirs properties has been used in this research. The essential of geophysical survey using the manual methods has failed in basement environment, hence the need for an intelligent computing such as predicted from neural network is inevitable. A non-linear neural network with an XOR (exclusive OR) output of 8-bits configuration has been used in this research to predict the nature of groundwater reservoirs and fluid dynamics of a typical basement crystalline rock. The control variables are the apparent resistivity of weathered layer (p1), fractured layer (p2), and the depth (h), while the dependent variable is the flow parameter (F=λ). The algorithm that was used in training the neural network is the back-propagation coded in C++ language with 300 epoch runs. The neural network was very intelligent to map out the flow channels and detect how they behave to form viable storage within the strata. The neural network model showed that an important variable gr (gravitational resistance) can be deduced from the elevation and apparent resistivity pa. The model results from SPSS showed that the coefficients, a, b and c are statistically significant with reduced standard error at 5%.

Keywords: gravitational resistance, neural network, non-linear, pattern recognition

Procedia PDF Downloads 203
25267 Social Network Analysis, Social Power in Water Co-Management (Case Study: Iran, Shemiranat, Jirood Village)

Authors: Fariba Ebrahimi, Mehdi Ghorbani, Ali Salajegheh

Abstract:

Comprehensively water management considers economic, environmental, technical and social and also sustainability of water resources for future generations. Grassland management implies cooperative approach and involves all stakeholders and also introduces issues to managers, decision and policy makers. Solving these issues needs integrated and system approach. According to the recognition of actors or key persons in necessary to apply cooperative management of Water. Therefore, based on stakeholder analysis and social network analysis can be used to demonstrate the most effective actors for environmental decisions. In this research, social powers according are specified to social network approach at Water utilizers’ level of Natural in Jirood catchment of Latian basin. In this paper, utilizers of water resources were recognized using field trips and then, trust and collaboration matrix produced using questionnaires. In the next step, degree centrality index were Examined. Finally, geometric position of each actor was illustrated in the network. The results of the research based on centrality index have a key role in recognition of cooperative management of Water in Jirood and also will help managers and planners of water in the case of recognition of social powers in order to organization and implementation of sustainable management of Water.

Keywords: social network analysis, water co-management, social power, centrality index, local stakeholders network, Jirood catchment

Procedia PDF Downloads 360
25266 Transcultural Study on Social Intelligence

Authors: Martha Serrano-Arias, Martha Frías-Armenta

Abstract:

Significant results have been found both supporting universality of emotion recognition and cultural background influence. Thus, the aim of this research was to test a Mexican version of the MTSI in different cultures to find differences in their performance. The MTSI-Mx assesses through a scenario approach were subjects must evaluate real persons. Two target persons were used for the construction, a man (FS) and a woman (AD). The items were grouped in four variables: Picture, Video, and FS and AD scenarios. The test was applied to 201 students from Mexico and Germany. T-test for picture and FS scenario show no significance. Video and AD had a significance at the 5% level. Results show slight differences between cultures, although a more comprehensive research is needed to conclude which culture can perform better in this kind of assessments.

Keywords: emotion recognition, MTSI, social intelligence, transcultural study

Procedia PDF Downloads 315
25265 Design and Development of Novel Anion Selective Chemosensors Derived from Vitamin B6 Cofactors

Authors: Darshna Sharma, Suban K. Sahoo

Abstract:

The detection of intracellular fluoride in human cancer cell HeLa was achieved by chemosensors derived from vitamin B6 cofactors using fluorescence imaging technique. These sensors were first synthesized by condensation of pyridoxal/pyridoxal phosphate with 2-amino(thio)phenol. The anion recognition ability was explored by experimental (UV-VIS, fluorescence and 1H NMR) and theoretical DFT [(B3LYP/6-31G(d,p)] methods in DMSO and mixed DMSO-H2O system. All the developed sensors showed both naked-eye detectable color change and remarkable fluorescence enhancement in the presence of F- and AcO-. The anion recognition was occurred through the formation of hydrogen bonded complexes between these anions and sensor, followed by the partial deprotonation of sensor. The detection limit of these sensors were down to micro(nano) molar level of F- and AcO-.

Keywords: chemosensors, fluoride, acetate, turn-on, live cells imaging, DFT

Procedia PDF Downloads 390
25264 A Review on Predictive Sound Recognition System

Authors: Ajay Kadam, Ramesh Kagalkar

Abstract:

The proposed research objective is to add to a framework for programmed recognition of sound. In this framework the real errand is to distinguish any information sound stream investigate it & anticipate the likelihood of diverse sounds show up in it. To create and industrially conveyed an adaptable sound web crawler a flexible sound search engine. The calculation is clamor and contortion safe, computationally productive, and hugely adaptable, equipped for rapidly recognizing a short portion of sound stream caught through a phone microphone in the presence of frontal area voices and other predominant commotion, and through voice codec pressure, out of a database of over accessible tracks. The algorithm utilizes a combinatorial hashed time-recurrence group of stars examination of the sound, yielding ordinary properties, for example, transparency, in which numerous tracks combined may each be distinguished.

Keywords: fingerprinting, pure tone, white noise, hash function

Procedia PDF Downloads 312
25263 Dynamic Gabor Filter Facial Features-Based Recognition of Emotion in Video Sequences

Authors: T. Hari Prasath, P. Ithaya Rani

Abstract:

In the world of visual technology, recognizing emotions from the face images is a challenging task. Several related methods have not utilized the dynamic facial features effectively for high performance. This paper proposes a method for emotions recognition using dynamic facial features with high performance. Initially, local features are captured by Gabor filter with different scale and orientations in each frame for finding the position and scale of face part from different backgrounds. The Gabor features are sent to the ensemble classifier for detecting Gabor facial features. The region of dynamic features is captured from the Gabor facial features in the consecutive frames which represent the dynamic variations of facial appearances. In each region of dynamic features is normalized using Z-score normalization method which is further encoded into binary pattern features with the help of threshold values. The binary features are passed to Multi-class AdaBoost classifier algorithm with the well-trained database contain happiness, sadness, surprise, fear, anger, disgust, and neutral expressions to classify the discriminative dynamic features for emotions recognition. The developed method is deployed on the Ryerson Multimedia Research Lab and Cohn-Kanade databases and they show significant performance improvement owing to their dynamic features when compared with the existing methods.

Keywords: detecting face, Gabor filter, multi-class AdaBoost classifier, Z-score normalization

Procedia PDF Downloads 262
25262 Image Rotation Using an Augmented 2-Step Shear Transform

Authors: Hee-Choul Kwon, Heeyong Kwon

Abstract:

Image rotation is one of main pre-processing steps for image processing or image pattern recognition. It is implemented with a rotation matrix multiplication. It requires a lot of floating point arithmetic operations and trigonometric calculations, so it takes a long time to execute. Therefore, there has been a need for a high speed image rotation algorithm without two major time-consuming operations. However, the rotated image has a drawback, i.e. distortions. We solved the problem using an augmented two-step shear transform. We compare the presented algorithm with the conventional rotation with images of various sizes. Experimental results show that the presented algorithm is superior to the conventional rotation one.

Keywords: high-speed rotation operation, image rotation, transform matrix, image processing, pattern recognition

Procedia PDF Downloads 266
25261 Musical Instrument Recognition in Polyphonic Audio Through Convolutional Neural Networks and Spectrograms

Authors: Rujia Chen, Akbar Ghobakhlou, Ajit Narayanan

Abstract:

This study investigates the task of identifying musical instruments in polyphonic compositions using Convolutional Neural Networks (CNNs) from spectrogram inputs, focusing on binary classification. The model showed promising results, with an accuracy of 97% on solo instrument recognition. When applied to polyphonic combinations of 1 to 10 instruments, the overall accuracy was 64%, reflecting the increasing challenge with larger ensembles. These findings contribute to the field of Music Information Retrieval (MIR) by highlighting the potential and limitations of current approaches in handling complex musical arrangements. Future work aims to include a broader range of musical sounds, including electronic and synthetic sounds, to improve the model's robustness and applicability in real-time MIR systems.

Keywords: binary classifier, CNN, spectrogram, instrument

Procedia PDF Downloads 49
25260 Data Analytics of Electronic Medical Records Shows an Age-Related Differences in Diagnosis of Coronary Artery Disease

Authors: Maryam Panahiazar, Andrew M. Bishara, Yorick Chern, Roohallah Alizadehsani, Dexter Hadleye, Ramin E. Beygui

Abstract:

Early detection plays a crucial role in enhancing the outcome for a patient with coronary artery disease (CAD). We utilized a big data analytics platform on ~23,000 patients with CAD from a total of 960,129 UCSF patients in 8 years. We traced the patients from their first encounter with a physician to diagnose and treat CAD. Characteristics such as demographic information, comorbidities, vital, lab tests, medications, and procedures are included. There are statistically significant gender-based differences in patients younger than 60 years old from the time of the first physician encounter to coronary artery bypass grafting (CABG) with a p-value=0.03. There are no significant differences between the patients between 60 and 80 years old (p-value=0.8) and older than 80 (p-value=0.4) with a 95% confidence interval. This recognition would affect significant changes in the guideline for referral of the patients for diagnostic tests expeditiously to improve the outcome by avoiding the delay in treatment.

Keywords: electronic medical records, coronary artery disease, data analytics, young women

Procedia PDF Downloads 137
25259 Diabetes Diagnosis Model Using Rough Set and K- Nearest Neighbor Classifier

Authors: Usiobaifo Agharese Rosemary, Osaseri Roseline Oghogho

Abstract:

Diabetes is a complex group of disease with a variety of causes; it is a disorder of the body metabolism in the digestion of carbohydrates food. The application of machine learning in the field of medical diagnosis has been the focus of many researchers and the use of recognition and classification model as a decision support tools has help the medical expert in diagnosis of diseases. Considering the large volume of medical data which require special techniques, experience, and high diagnostic skill in the diagnosis of diseases, the application of an artificial intelligent system to assist medical personnel in order to enhance their efficiency and accuracy in diagnosis will be an invaluable tool. In this study will propose a diabetes diagnosis model using rough set and K-nearest Neighbor classifier algorithm. The system consists of two modules: the feature extraction module and predictor module, rough data set is used to preprocess the attributes while K-nearest neighbor classifier is used to classify the given data. The dataset used for this model was taken for University of Benin Teaching Hospital (UBTH) database. Half of the data was used in the training while the other half was used in testing the system. The proposed model was able to achieve over 80% accuracy.

Keywords: classifier algorithm, diabetes, diagnostic model, machine learning

Procedia PDF Downloads 324
25258 Auto Classification of Multiple ECG Arrhythmic Detection via Machine Learning Techniques: A Review

Authors: Ng Liang Shen, Hau Yuan Wen

Abstract:

Arrhythmia analysis of ECG signal plays a major role in diagnosing most of the cardiac diseases. Therefore, a single arrhythmia detection of an electrocardiographic (ECG) record can determine multiple pattern of various algorithms and match accordingly each ECG beats based on Machine Learning supervised learning. These researchers used different features and classification methods to classify different arrhythmia types. A major problem in these studies is the fact that the symptoms of the disease do not show all the time in the ECG record. Hence, a successful diagnosis might require the manual investigation of several hours of ECG records. The point of this paper presents investigations cardiovascular ailment in Electrocardiogram (ECG) Signals for Cardiac Arrhythmia utilizing examination of ECG irregular wave frames via heart beat as correspond arrhythmia which with Machine Learning Pattern Recognition.

Keywords: electrocardiogram, ECG, classification, machine learning, pattern recognition, detection, QRS

Procedia PDF Downloads 358
25257 Effective Nutrition Label Use on Smartphones

Authors: Vladimir Kulyukin, Tanwir Zaman, Sarat Kiran Andhavarapu

Abstract:

Research on nutrition label use identifies four factors that impede comprehension and retention of nutrition information by consumers: label’s location on the package, presentation of information within the label, label’s surface size, and surrounding visual clutter. In this paper, a system is presented that makes nutrition label use more effective for nutrition information comprehension and retention. The system’s front end is a smartphone application. The system’s back end is a four node Linux cluster for image recognition and data storage. Image frames captured on the smartphone are sent to the back end for skewed or aligned barcode recognition. When barcodes are recognized, corresponding nutrition labels are retrieved from a cloud database and presented to the user on the smartphone’s touchscreen. Each displayed nutrition label is positioned centrally on the touchscreen with no surrounding visual clutter. Wikipedia links to important nutrition terms are embedded to improve comprehension and retention of nutrition information. Standard touch gestures (e.g., zoom in/out) available on mainstream smartphones are used to manipulate the label’s surface size. The nutrition label database currently includes 200,000 nutrition labels compiled from public web sites by a custom crawler. Stress test experiments with the node cluster are presented. Implications for proactive nutrition management and food policy are discussed.

Keywords: mobile computing, cloud computing, nutrition label use, nutrition management, barcode scanning

Procedia PDF Downloads 357
25256 Distributed Processing for Content Based Lecture Video Retrieval on Hadoop Framework

Authors: U. S. N. Raju, Kothuri Sai Kiran, Meena G. Kamal, Vinay Nikhil Pabba, Suresh Kanaparthi

Abstract:

There is huge amount of lecture video data available for public use, and many more lecture videos are being created and uploaded every day. Searching for videos on required topics from this huge database is a challenging task. Therefore, an efficient method for video retrieval is needed. An approach for automated video indexing and video search in large lecture video archives is presented. As the amount of video lecture data is huge, it is very inefficient to do the processing in a centralized computation framework. Hence, Hadoop Framework for distributed computing for Big Video Data is used. First, step in the process is automatic video segmentation and key-frame detection to offer a visual guideline for the video content navigation. In the next step, we extract textual metadata by applying video Optical Character Recognition (OCR) technology on key-frames. The OCR and detected slide text line types are adopted for keyword extraction, by which both video- and segment-level keywords are extracted for content-based video browsing and search. The performance of the indexing process can be improved for a large database by using distributed computing on Hadoop framework.

Keywords: video lectures, big video data, video retrieval, hadoop

Procedia PDF Downloads 516
25255 Facial Recognition of University Entrance Exam Candidates using FaceMatch Software in Iran

Authors: Mahshid Arabi

Abstract:

In recent years, remarkable advancements in the fields of artificial intelligence and machine learning have led to the development of facial recognition technologies. These technologies are now employed in a wide range of applications, including security, surveillance, healthcare, and education. In the field of education, the identification of university entrance exam candidates has been one of the fundamental challenges. Traditional methods such as using ID cards and handwritten signatures are not only inefficient and prone to fraud but also susceptible to errors. In this context, utilizing advanced technologies like facial recognition can be an effective and efficient solution to increase the accuracy and reliability of identity verification in entrance exams. This article examines the use of FaceMatch software for recognizing the faces of university entrance exam candidates in Iran. The main objective of this research is to evaluate the efficiency and accuracy of FaceMatch software in identifying university entrance exam candidates to prevent fraud and ensure the authenticity of individuals' identities. Additionally, this research investigates the advantages and challenges of using this technology in Iran's educational systems. This research was conducted using an experimental method and random sampling. In this study, 1000 university entrance exam candidates in Iran were selected as samples. The facial images of these candidates were processed and analyzed using FaceMatch software. The software's accuracy and efficiency were evaluated using various metrics, including accuracy rate, error rate, and processing time. The research results indicated that FaceMatch software could accurately identify candidates with a precision of 98.5%. The software's error rate was less than 1.5%, demonstrating its high efficiency in facial recognition. Additionally, the average processing time for each candidate's image was less than 2 seconds, indicating the software's high efficiency. Statistical evaluation of the results using precise statistical tests, including analysis of variance (ANOVA) and t-test, showed that the observed differences were significant, and the software's accuracy in identity verification is high. The findings of this research suggest that FaceMatch software can be effectively used as a tool for identifying university entrance exam candidates in Iran. This technology not only enhances security and prevents fraud but also simplifies and streamlines the exam administration process. However, challenges such as preserving candidates' privacy and the costs of implementation must also be considered. The use of facial recognition technology with FaceMatch software in Iran's educational systems can be an effective solution for preventing fraud and ensuring the authenticity of university entrance exam candidates' identities. Given the promising results of this research, it is recommended that this technology be more widely implemented and utilized in the country's educational systems.

Keywords: facial recognition, FaceMatch software, Iran, university entrance exam

Procedia PDF Downloads 29
25254 Machine Learning and Deep Learning Approach for People Recognition and Tracking in Crowd for Safety Monitoring

Authors: A. Degale Desta, Cheng Jian

Abstract:

Deep learning application in computer vision is rapidly advancing, giving it the ability to monitor the public and quickly identify potentially anomalous behaviour from crowd scenes. Therefore, the purpose of the current work is to improve the performance of safety of people in crowd events from panic behaviour through introducing the innovative idea of Aggregation of Ensembles (AOE), which makes use of the pre-trained ConvNets and a pool of classifiers to find anomalies in video data with packed scenes. According to the theory of algorithms that applied K-means, KNN, CNN, SVD, and Faster-CNN, YOLOv5 architectures learn different levels of semantic representation from crowd videos; the proposed approach leverages an ensemble of various fine-tuned convolutional neural networks (CNN), allowing for the extraction of enriched feature sets. In addition to the above algorithms, a long short-term memory neural network to forecast future feature values and a handmade feature that takes into consideration the peculiarities of the crowd to understand human behavior. On well-known datasets of panic situations, experiments are run to assess the effectiveness and precision of the suggested method. Results reveal that, compared to state-of-the-art methodologies, the system produces better and more promising results in terms of accuracy and processing speed.

Keywords: action recognition, computer vision, crowd detecting and tracking, deep learning

Procedia PDF Downloads 146
25253 Bedouin Dispersion in Israel: Between Sustainable Development and Social Non-Recognition

Authors: Tamir Michal

Abstract:

The subject of Bedouin dispersion has accompanied the State of Israel from the day of its establishment. From a legal point of view, this subject has offered a launchpad for creative judicial decisions. Thus, for example, the first court decision in Israel to recognize affirmative action (Avitan), dealt with a petition submitted by a Jew appealing the refusal of the State to recognize the Petitioner’s entitlement to the long-term lease of a plot designated for Bedouins. The Supreme Court dismissed the petition, holding that there existed a public interest in assisting Bedouin to establish permanent urban settlements, an interest which justifies giving them preference by selling them plots at subsidized prices. In another case (The Forum for Coexistence in the Negev) the Supreme Court extended equitable relief for the purpose of constructing a bridge, even though the construction infringed the Law, in order to allow the children of dispersed Bedouin to reach school. Against this background, the recent verdict, delivered during the Protective Edge military campaign, which dismissed a petition aimed at forcing the State to spread out Protective Structures in Bedouin villages in the Negev against the risk of being hit from missiles launched from Gaza (Abu Afash) is disappointing. Even if, in arguendo, no selective discrimination was involved in the State’s decision not to provide such protection, the decision, and its affirmation by the Court, is problematic when examined through the prism of the Theory of Recognition. The article analyses the issue by tools of theory of Recognition, according to which people develop their identities through mutual relations of recognition in different fields. In the social context, the path to recognition is cognitive respect, which is provided by means of legal rights. By seeing other participants in Society as bearers of rights and obligations, the individual develops an understanding of his legal condition as reflected in the attitude to others. Consequently, even if the Court’s decision may be justified on strict legal grounds, the fact that Jewish settlements were protected during the military operation, whereas Bedouin villages were not, is a setback in the struggle to make the Bedouin citizens with equal rights in Israeli society. As the Court held, ‘Beyond their protective function, the Migunit [Protective Structures] may make a moral and psychological contribution that should not be undervalued’. This contribution is one that the Bedouin did not receive in the Abu Afash verdict. The basic thesis is that the Court’s verdict analyzed above clearly demonstrates that the reliance on classical liberal instruments (e.g., equality) cannot secure full appreciation of all aspects of Bedouin life, and hence it can in fact prejudice them. Therefore, elements of the recognition theory should be added, in order to find the channel for cognitive dignity, thereby advancing the Bedouins’ ability to perceive themselves as equal human beings in the Israeli society.

Keywords: bedouin dispersion, cognitive respect, recognition theory, sustainable development

Procedia PDF Downloads 338
25252 Autism Spectrum Disorder Classification Algorithm Using Multimodal Data Based on Graph Convolutional Network

Authors: Yuntao Liu, Lei Wang, Haoran Xia

Abstract:

Machine learning has shown extensive applications in the development of classification models for autism spectrum disorder (ASD) using neural image data. This paper proposes a fusion multi-modal classification network based on a graph neural network. First, the brain is segmented into 116 regions of interest using a medical segmentation template (AAL, Anatomical Automatic Labeling). The image features of sMRI and the signal features of fMRI are extracted, which build the node and edge embedding representations of the brain map. Then, we construct a dynamically updated brain map neural network and propose a method based on a dynamic brain map adjacency matrix update mechanism and learnable graph to further improve the accuracy of autism diagnosis and recognition results. Based on the Autism Brain Imaging Data Exchange I dataset(ABIDE I), we reached a prediction accuracy of 74% between ASD and TD subjects. Besides, to study the biomarkers that can help doctors analyze diseases and interpretability, we used the features by extracting the top five maximum and minimum ROI weights. This work provides a meaningful way for brain disorder identification.

Keywords: autism spectrum disorder, brain map, supervised machine learning, graph network, multimodal data, model interpretability

Procedia PDF Downloads 45
25251 American Sign Language Recognition System

Authors: Rishabh Nagpal, Riya Uchagaonkar, Venkata Naga Narasimha Ashish Mernedi, Ahmed Hambaba

Abstract:

The rapid evolution of technology in the communication sector continually seeks to bridge the gap between different communities, notably between the deaf community and the hearing world. This project develops a comprehensive American Sign Language (ASL) recognition system, leveraging the advanced capabilities of convolutional neural networks (CNNs) and vision transformers (ViTs) to interpret and translate ASL in real-time. The primary objective of this system is to provide an effective communication tool that enables seamless interaction through accurate sign language interpretation. The architecture of the proposed system integrates dual networks -VGG16 for precise spatial feature extraction and vision transformers for contextual understanding of the sign language gestures. The system processes live input, extracting critical features through these sophisticated neural network models, and combines them to enhance gesture recognition accuracy. This integration facilitates a robust understanding of ASL by capturing detailed nuances and broader gesture dynamics. The system is evaluated through a series of tests that measure its efficiency and accuracy in real-world scenarios. Results indicate a high level of precision in recognizing diverse ASL signs, substantiating the potential of this technology in practical applications. Challenges such as enhancing the system’s ability to operate in varied environmental conditions and further expanding the dataset for training were identified and discussed. Future work will refine the model’s adaptability and incorporate haptic feedback to enhance the interactivity and richness of the user experience. This project demonstrates the feasibility of an advanced ASL recognition system and lays the groundwork for future innovations in assistive communication technologies.

Keywords: sign language, computer vision, vision transformer, VGG16, CNN

Procedia PDF Downloads 23
25250 Myanmar Consonants Recognition System Based on Lip Movements Using Active Contour Model

Authors: T. Thein, S. Kalyar Myo

Abstract:

Human uses visual information for understanding the speech contents in noisy conditions or in situations where the audio signal is not available. The primary advantage of visual information is that it is not affected by the acoustic noise and cross talk among speakers. Using visual information from the lip movements can improve the accuracy and robustness of automatic speech recognition. However, a major challenge with most automatic lip reading system is to find a robust and efficient method for extracting the linguistically relevant speech information from a lip image sequence. This is a difficult task due to variation caused by different speakers, illumination, camera setting and the inherent low luminance and chrominance contrast between lip and non-lip region. Several researchers have been developing methods to overcome these problems; the one is lip reading. Moreover, it is well known that visual information about speech through lip reading is very useful for human speech recognition system. Lip reading is the technique of a comprehensive understanding of underlying speech by processing on the movement of lips. Therefore, lip reading system is one of the different supportive technologies for hearing impaired or elderly people, and it is an active research area. The need for lip reading system is ever increasing for every language. This research aims to develop a visual teaching method system for the hearing impaired persons in Myanmar, how to pronounce words precisely by identifying the features of lip movement. The proposed research will work a lip reading system for Myanmar Consonants, one syllable consonants (င (Nga)၊ ည (Nya)၊ မ (Ma)၊ လ (La)၊ ၀ (Wa)၊ သ (Tha)၊ ဟ (Ha)၊ အ (Ah) ) and two syllable consonants ( က(Ka Gyi)၊ ခ (Kha Gway)၊ ဂ (Ga Nge)၊ ဃ (Ga Gyi)၊ စ (Sa Lone)၊ ဆ (Sa Lain)၊ ဇ (Za Gwe) ၊ ဒ (Da Dway)၊ ဏ (Na Gyi)၊ န (Na Nge)၊ ပ (Pa Saug)၊ ဘ (Ba Gone)၊ ရ (Ya Gaug)၊ ဠ (La Gyi) ). In the proposed system, there are three subsystems, the first one is the lip localization system, which localizes the lips in the digital inputs. The next one is the feature extraction system, which extracts features of lip movement suitable for visual speech recognition. And the final one is the classification system. In the proposed research, Two Dimensional Discrete Cosine Transform (2D-DCT) and Linear Discriminant Analysis (LDA) with Active Contour Model (ACM) will be used for lip movement features extraction. Support Vector Machine (SVM) classifier is used for finding class parameter and class number in training set and testing set. Then, experiments will be carried out for the recognition accuracy of Myanmar consonants using the only visual information on lip movements which are useful for visual speech of Myanmar languages. The result will show the effectiveness of the lip movement recognition for Myanmar Consonants. This system will help the hearing impaired persons to use as the language learning application. This system can also be useful for normal hearing persons in noisy environments or conditions where they can find out what was said by other people without hearing voice.

Keywords: feature extraction, lip reading, lip localization, Active Contour Model (ACM), Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), Two Dimensional Discrete Cosine Transform (2D-DCT)

Procedia PDF Downloads 272
25249 Interoperability Standard for Data Exchange in Educational Documents in Professional and Technological Education: A Comparative Study and Feasibility Analysis for the Brazilian Context

Authors: Giovana Nunes Inocêncio

Abstract:

The professional and technological education (EPT) plays a pivotal role in equipping students for specialized careers, and it is imperative to establish a framework for efficient data exchange among educational institutions. The primary focus of this article is to address the pressing need for document interoperability within the context of EPT. The challenges, motivations, and benefits of implementing interoperability standards for digital educational documents are thoroughly explored. These documents include EPT completion certificates, academic records, and curricula. In conjunction with the prior abstract, it is evident that the intersection of IT governance and interoperability standards holds the key to transforming the landscape of technical education in Brazil. IT governance provides the strategic framework for effective data management, aligning with educational objectives, ensuring compliance, and managing risks. By adopting interoperability standards, the technical education sector in Brazil can facilitate data exchange, enhance data security, and promote international recognition of qualifications. The utilization of the XML (Extensible Markup Language) standard further strengthens the foundation for structured data exchange, fostering efficient communication, standardization of curricula, and enhancing educational materials. The IT governance, interoperability standards, and data management critical role in driving the quality, efficiency, and security of technical education. The adoption of these standards fosters transparency, stakeholder coordination, and regulatory compliance, ultimately empowering the technical education sector to meet the dynamic demands of the 21st century.

Keywords: interoperability, education, standards, governance

Procedia PDF Downloads 58
25248 Understanding Europe’s Role in the Area of Liberty, Security, and Justice as an International Actor

Authors: Barrere Sarah

Abstract:

The area of liberty, security, and justice within the European Union is still a work in progress. No one can deny that the EU struggles between a monistic and a dualist approach. The aim of our essay is to first review how the European law is perceived by the rest of the international scene. It will then discuss two main mechanisms at play: the interpretation of larger international treaties and the penal mechanisms of European law. Finally, it will help us understand the role of a penal Europe on the international scene with concrete examples. Special attention will be paid to cases that deal with fundamental rights as they represent an interesting case study in Europe and in the rest of the World. It could illustrate the aforementioned duality currently present in the Union’s interpretation of international public law. On the other hand, it will explore some specific European penal mechanism through mutual recognition and the European arrest warrant in the transnational criminality frame. Concerning the interpretation of the treaties, it will first, underline the ambiguity and the general nature of some treaties that leave the EU exposed to tension and misunderstanding then it will review the validity of an EU act (whether or not it is compatible with the rules of International law). Finally, it will focus on the most complete manifestation of liberty, security and justice through the principle of mutual recognition. Used initially in commercial matters, it has become “the cornerstone” of European construction. It will see how it is applied in judicial decisions (its main event and achieving success is via the European arrest warrant) and how European member states have managed to develop this cooperation.

Keywords: European penal law, international scene, liberty security and justice area, mutual recognition

Procedia PDF Downloads 397
25247 Design and Implementation of Generative Models for Odor Classification Using Electronic Nose

Authors: Kumar Shashvat, Amol P. Bhondekar

Abstract:

In the midst of the five senses, odor is the most reminiscent and least understood. Odor testing has been mysterious and odor data fabled to most practitioners. The delinquent of recognition and classification of odor is important to achieve. The facility to smell and predict whether the artifact is of further use or it has become undesirable for consumption; the imitation of this problem hooked on a model is of consideration. The general industrial standard for this classification is color based anyhow; odor can be improved classifier than color based classification and if incorporated in machine will be awfully constructive. For cataloging of odor for peas, trees and cashews various discriminative approaches have been used Discriminative approaches offer good prognostic performance and have been widely used in many applications but are incapable to make effectual use of the unlabeled information. In such scenarios, generative approaches have better applicability, as they are able to knob glitches, such as in set-ups where variability in the series of possible input vectors is enormous. Generative models are integrated in machine learning for either modeling data directly or as a transitional step to form an indeterminate probability density function. The algorithms or models Linear Discriminant Analysis and Naive Bayes Classifier have been used for classification of the odor of cashews. Linear Discriminant Analysis is a method used in data classification, pattern recognition, and machine learning to discover a linear combination of features that typifies or divides two or more classes of objects or procedures. The Naive Bayes algorithm is a classification approach base on Bayes rule and a set of qualified independence theory. Naive Bayes classifiers are highly scalable, requiring a number of restraints linear in the number of variables (features/predictors) in a learning predicament. The main recompenses of using the generative models are generally a Generative Models make stronger assumptions about the data, specifically, about the distribution of predictors given the response variables. The Electronic instrument which is used for artificial odor sensing and classification is an electronic nose. This device is designed to imitate the anthropological sense of odor by providing an analysis of individual chemicals or chemical mixtures. The experimental results have been evaluated in the form of the performance measures i.e. are accuracy, precision and recall. The investigational results have proven that the overall performance of the Linear Discriminant Analysis was better in assessment to the Naive Bayes Classifier on cashew dataset.

Keywords: odor classification, generative models, naive bayes, linear discriminant analysis

Procedia PDF Downloads 369
25246 An Approach for Ensuring Data Flow in Freight Delivery and Management Systems

Authors: Aurelija Burinskienė, Dalė Dzemydienė, Arūnas Miliauskas

Abstract:

This research aims at developing the approach for more effective freight delivery and transportation process management. The road congestions and the identification of causes are important, as well as the context information recognition and management. The measure of many parameters during the transportation period and proper control of driver work became the problem. The number of vehicles per time unit passing at a given time and point for drivers can be evaluated in some situations. The collection of data is mainly used to establish new trips. The flow of the data is more complex in urban areas. Herein, the movement of freight is reported in detail, including the information on street level. When traffic density is extremely high in congestion cases, and the traffic speed is incredibly low, data transmission reaches the peak. Different data sets are generated, which depend on the type of freight delivery network. There are three types of networks: long-distance delivery networks, last-mile delivery networks and mode-based delivery networks; the last one includes different modes, in particular, railways and other networks. When freight delivery is switched from one type of the above-stated network to another, more data could be included for reporting purposes and vice versa. In this case, a significant amount of these data is used for control operations, and the problem requires an integrated methodological approach. The paper presents an approach for providing e-services for drivers by including the assessment of the multi-component infrastructure needed for delivery of freights following the network type. The construction of such a methodology is required to evaluate data flow conditions and overloads, and to minimize the time gaps in data reporting. The results obtained show the possibilities of the proposing methodological approach to support the management and decision-making processes with functionality of incorporating networking specifics, by helping to minimize the overloads in data reporting.

Keywords: transportation networks, freight delivery, data flow, monitoring, e-services

Procedia PDF Downloads 114
25245 Comparison of Various Classification Techniques Using WEKA for Colon Cancer Detection

Authors: Beema Akbar, Varun P. Gopi, V. Suresh Babu

Abstract:

Colon cancer causes the deaths of about half a million people every year. The common method of its detection is histopathological tissue analysis, it leads to tiredness and workload to the pathologist. A novel method is proposed that combines both structural and statistical pattern recognition used for the detection of colon cancer. This paper presents a comparison among the different classifiers such as Multilayer Perception (MLP), Sequential Minimal Optimization (SMO), Bayesian Logistic Regression (BLR) and k-star by using classification accuracy and error rate based on the percentage split method. The result shows that the best algorithm in WEKA is MLP classifier with an accuracy of 83.333% and kappa statistics is 0.625. The MLP classifier which has a lower error rate, will be preferred as more powerful classification capability.

Keywords: colon cancer, histopathological image, structural and statistical pattern recognition, multilayer perception

Procedia PDF Downloads 564
25244 Training a Neural Network to Segment, Detect and Recognize Numbers

Authors: Abhisek Dash

Abstract:

This study had three neural networks, one for number segmentation, one for number detection and one for number recognition all of which are coupled to one another. All networks were trained on the MNIST dataset and were convolutional. It was assumed that the images had lighter background and darker foreground. The segmentation network took 28x28 images as input and had sixteen outputs. Segmentation training starts when a dark pixel is encountered. Taking a window(7x7) over that pixel as focus, the eight neighborhood of the focus was checked for further dark pixels. The segmentation network was then trained to move in those directions which had dark pixels. To this end the segmentation network had 16 outputs. They were arranged as “go east”, ”don’t go east ”, “go south east”, “don’t go south east”, “go south”, “don’t go south” and so on w.r.t focus window. The focus window was resized into a 28x28 image and the network was trained to consider those neighborhoods which had dark pixels. The neighborhoods which had dark pixels were pushed into a queue in a particular order. The neighborhoods were then popped one at a time stitched to the existing partial image of the number one at a time and trained on which neighborhoods to consider when the new partial image was presented. The above process was repeated until the image was fully covered by the 7x7 neighborhoods and there were no more uncovered black pixels. During testing the network scans and looks for the first dark pixel. From here on the network predicts which neighborhoods to consider and segments the image. After this step the group of neighborhoods are passed into the detection network. The detection network took 28x28 images as input and had two outputs denoting whether a number was detected or not. Since the ground truth of the bounds of a number was known during training the detection network outputted in favor of number not found until the bounds were not met and vice versa. The recognition network was a standard CNN that also took 28x28 images and had 10 outputs for recognition of numbers from 0 to 9. This network was activated only when the detection network votes in favor of number detected. The above methodology could segment connected and overlapping numbers. Additionally the recognition unit was only invoked when a number was detected which minimized false positives. It also eliminated the need for rules of thumb as segmentation is learned. The strategy can also be extended to other characters as well.

Keywords: convolutional neural networks, OCR, text detection, text segmentation

Procedia PDF Downloads 144
25243 Automatic Target Recognition in SAR Images Based on Sparse Representation Technique

Authors: Ahmet Karagoz, Irfan Karagoz

Abstract:

Synthetic Aperture Radar (SAR) is a radar mechanism that can be integrated into manned and unmanned aerial vehicles to create high-resolution images in all weather conditions, regardless of day and night. In this study, SAR images of military vehicles with different azimuth and descent angles are pre-processed at the first stage. The main purpose here is to reduce the high speckle noise found in SAR images. For this, the Wiener adaptive filter, the mean filter, and the median filters are used to reduce the amount of speckle noise in the images without causing loss of data. During the image segmentation phase, pixel values are ordered so that the target vehicle region is separated from other regions containing unnecessary information. The target image is parsed with the brightest 20% pixel value of 255 and the other pixel values of 0. In addition, by using appropriate parameters of statistical region merging algorithm, segmentation comparison is performed. In the step of feature extraction, the feature vectors belonging to the vehicles are obtained by using Gabor filters with different orientation, frequency and angle values. A number of Gabor filters are created by changing the orientation, frequency and angle parameters of the Gabor filters to extract important features of the images that form the distinctive parts. Finally, images are classified by sparse representation method. In the study, l₁ norm analysis of sparse representation is used. A joint database of the feature vectors generated by the target images of military vehicle types is obtained side by side and this database is transformed into the matrix form. In order to classify the vehicles in a similar way, the test images of each vehicle is converted to the vector form and l₁ norm analysis of the sparse representation method is applied through the existing database matrix form. As a result, correct recognition has been performed by matching the target images of military vehicles with the test images by means of the sparse representation method. 97% classification success of SAR images of different military vehicle types is obtained.

Keywords: automatic target recognition, sparse representation, image classification, SAR images

Procedia PDF Downloads 350
25242 A Fast Version of the Generalized Multi-Directional Radon Transform

Authors: Ines Elouedi, Atef Hammouda

Abstract:

This paper presents a new fast version of the generalized Multi-Directional Radon Transform method. The new method uses the inverse Fast Fourier Transform to lead to a faster Generalized Radon projections. We prove in this paper that the fast algorithm leads to almost the same results of the eldest one but with a considerable lower time computation cost. The projection end result of the fast method is a parameterized Radon space where a high valued pixel allows the detection of a curve from the original image. The proposed fast inversion algorithm leads to an exact reconstruction of the initial image from the Radon space. We show examples of the impact of this algorithm on the pattern recognition domain.

Keywords: fast generalized multi-directional Radon transform, curve, exact reconstruction, pattern recognition

Procedia PDF Downloads 268
25241 Robust Features for Impulsive Noisy Speech Recognition Using Relative Spectral Analysis

Authors: Hajer Rahali, Zied Hajaiej, Noureddine Ellouze

Abstract:

The goal of speech parameterization is to extract the relevant information about what is being spoken from the audio signal. In speech recognition systems Mel-Frequency Cepstral Coefficients (MFCC) and Relative Spectral Mel-Frequency Cepstral Coefficients (RASTA-MFCC) are the two main techniques used. It will be shown in this paper that it presents some modifications to the original MFCC method. In our work the effectiveness of proposed changes to MFCC called Modified Function Cepstral Coefficients (MODFCC) were tested and compared against the original MFCC and RASTA-MFCC features. The prosodic features such as jitter and shimmer are added to baseline spectral features. The above-mentioned techniques were tested with impulsive signals under various noisy conditions within AURORA databases.

Keywords: auditory filter, impulsive noise, MFCC, prosodic features, RASTA filter

Procedia PDF Downloads 413