Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 1500

Search results for: low vision aids

1290 Efficient Passenger Counting in Public Transport Based on Machine Learning

Authors: Chonlakorn Wiboonsiriruk, Ekachai Phaisangittisagul, Chadchai Srisurangkul, Itsuo Kumazawa

Abstract:

Public transportation is a crucial aspect of passenger transportation, with buses playing a vital role in the transportation service. Passenger counting is an essential tool for organizing and managing transportation services. However, manual counting is a tedious and time-consuming task, which is why computer vision algorithms are being utilized to make the process more efficient. In this study, different object detection algorithms combined with passenger tracking are investigated to compare passenger counting performance. The system employs the EfficientDet algorithm, which has demonstrated superior performance in terms of speed and accuracy. Our results show that the proposed system can accurately count passengers in varying conditions with an accuracy of 94%.

Keywords: computer vision, object detection, passenger counting, public transportation

Procedia PDF Downloads 126

1289 Control of Belts for Classification of Geometric Figures by Artificial Vision

Authors: Juan Sebastian Huertas Piedrahita, Jaime Arturo Lopez Duque, Eduardo Luis Perez Londoño, Julián S. Rodríguez

Abstract:

The process of generating computer vision is called artificial vision. The artificial vision is a branch of artificial intelligence that allows the obtaining, processing, and analysis of any type of information especially the ones obtained through digital images. Actually the artificial vision is used in manufacturing areas for quality control and production, as these processes can be realized through counting algorithms, positioning, and recognition of objects that can be measured by a single camera (or more). On the other hand, the companies use assembly lines formed by conveyor systems with actuators on them for moving pieces from one location to another in their production. These devices must be previously programmed for their good performance and must have a programmed logic routine. Nowadays the production is the main target of every industry, quality, and the fast elaboration of the different stages and processes in the chain of production of any product or service being offered. The principal base of this project is to program a computer that recognizes geometric figures (circle, square, and triangle) through a camera, each one with a different color and link it with a group of conveyor systems to organize the mentioned figures in cubicles, which differ from one another also by having different colors. This project bases on artificial vision, therefore the methodology needed to develop this project must be strict, this one is detailed below: 1. Methodology: 1.1 The software used in this project is QT Creator which is linked with Open CV libraries. Together, these tools perform to realize the respective program to identify colors and forms directly from the camera to the computer. 1.2 Imagery acquisition: To start using the libraries of Open CV is necessary to acquire images, which can be captured by a computer’s web camera or a different specialized camera. 1.3 The recognition of RGB colors is realized by code, crossing the matrices of the captured images and comparing pixels, identifying the primary colors which are red, green, and blue. 1.4 To detect forms it is necessary to realize the segmentation of the images, so the first step is converting the image from RGB to grayscale, to work with the dark tones of the image, then the image is binarized which means having the figure of the image in a white tone with a black background. Finally, we find the contours of the figure in the image to detect the quantity of edges to identify which figure it is. 1.5 After the color and figure have been identified, the program links with the conveyor systems, which through the actuators will classify the figures in their respective cubicles. Conclusions: The Open CV library is a useful tool for projects in which an interface between a computer and the environment is required since the camera obtains external characteristics and realizes any process. With the program for this project any type of assembly line can be optimized because images from the environment can be obtained and the process would be more accurate.

Keywords: artificial intelligence, artificial vision, binarized, grayscale, images, RGB

Procedia PDF Downloads 358

1288 Football Smart Coach: Analyzing Corner Kicks Using Computer Vision

Authors: Arth Bohra, Marwa Mahmoud

Abstract:

In this paper, we utilize computer vision to develop a tool for youth coaches to formulate set-piece tactics for their players. We used the Soccernet database to extract the ResNet features and camera calibration data for over 3000 corner kick across 500 professional matches in the top 6 European leagues (English Premier League, UEFA Champions League, Ligue 1, La Liga, Serie A, Bundesliga). Leveraging the provided homography matrix, we construct a feature vector representing the formation of players on these corner kicks. Additionally, labeling the videos manually, we obtained the pass-trajectory of each of the 3000+ corner kicks by segmenting the field into four zones. Next, after determining the localization of the players and ball, we used event data to give the corner kicks a rating on a 1-4 scale. By employing a Convolutional Neural Network, our model managed to predict the success of a corner kick given the formations of players. This suggests that with the right formations, teams can optimize the way they approach corner kicks. By understanding this, we can help coaches formulate set-piece tactics for their own teams in order to maximize the success of their play. The proposed model can be easily extended; our method could be applied to even more game situations, from free kicks to counterattacks. This research project also gives insight into the myriad of possibilities that artificial intelligence possesses in transforming the domain of sports.

Keywords: soccer, corner kicks, AI, computer vision

Procedia PDF Downloads 150

1287 Meningeal Hemangiopericytoma in an HIV-Positive Patient: A Case Report and Review of Literature

Authors: Roland Benedict Reyes, Marc Edsel Ayes, Regina Berba, Cybele Lara Abad

Abstract:

Background: Three AIDS-defining malignancies have been associated with the human immunodeficiency virus (HIV): Kaposi’s sarcoma, non-Hodgkin’s lymphoma, and cervical carcinoma. However, new cases of non-AIDS defining malignancies also have been increasingly associated with HIV. One of these is a rare intracranial malignancy, meningeal hemangiopericyotma. Case Description: A 32-year old HIV-positive male, not on highly active antiretroviral therapy, was admitted to our hospital due to generalized weakness and sudden onset hearing loss. Cranial MRI was done, which revealed a temporal nodule with the following considerations: granuloma, meningioma or metastases. A craniotomy was performed and the mass excised. Results from the biopsy showed meningeal hemangiopericytoma. The patient was then started on antiretroviral therapy (Lamivudine, Tenofovir, and Efavirenz) and was discharged for radiation therapy and metastatic work-up as an outpatient. On follow-up seven months later, metastatic work up revealed multiple hepatic foci not previously documented suggestive of metastasis short of biopsy sampling. Conclusions: This case of an intracranial hemangiopericytoma in an HIV-positive patient is the second case thus far presented, based on our systematic and extensive search of the literature.

Keywords: Hemangiopericytoma, Human Immunodeficiency Virus, Meningeal hemangiopericytoma, Neoplasm

Procedia PDF Downloads 440

1286 Vision-Based Daily Routine Recognition for Healthcare with Transfer Learning

Authors: Bruce X. B. Yu, Yan Liu, Keith C. C. Chan

Abstract:

We propose to record Activities of Daily Living (ADLs) of elderly people using a vision-based system so as to provide better assistive and personalization technologies. Current ADL-related research is based on data collected with help from non-elderly subjects in laboratory environments and the activities performed are predetermined for the sole purpose of data collection. To obtain more realistic datasets for the application, we recorded ADLs for the elderly with data collected from real-world environment involving real elderly subjects. Motivated by the need to collect data for more effective research related to elderly care, we chose to collect data in the room of an elderly person. Specifically, we installed Kinect, a vision-based sensor on the ceiling, to capture the activities that the elderly subject performs in the morning every day. Based on the data, we identified 12 morning activities that the elderly person performs daily. To recognize these activities, we created a HARELCARE framework to investigate into the effectiveness of existing Human Activity Recognition (HAR) algorithms and propose the use of a transfer learning algorithm for HAR. We compared the performance, in terms of accuracy, and training progress. Although the collected dataset is relatively small, the proposed algorithm has a good potential to be applied to all daily routine activities for healthcare purposes such as evidence-based diagnosis and treatment.

Keywords: daily activity recognition, healthcare, IoT sensors, transfer learning

Procedia PDF Downloads 111

1285 Elimination of Mother to Child Transmission of HIV/AIDS: A Study of the Knowledge, Attitudes and Perceptions of Healthcare Workers in Abuja Nigeria

Authors: Ezinne K. Okoro, Takahiko Katoh, Yoko Kawamura, Stanley C. Meribe

Abstract:

HIV infection in children is largely as a result of vertical transmission (mother to child transmission [MTCT]). Thus, elimination of mother to child transmission of HIV/AIDS is critical in eliminating HIV infection in children. In Nigeria, drawbacks such as; limited pediatric screening, limited human capital, insufficient advocacy and poor understanding of ART guidelines, have impacted efforts at combating the disease, even as treatment services are free. Prevention of Mother to Child Transmission (PMTCT) program relies on health workers who not only counsel pregnant women on first contact but can competently provide HIV-positive pregnant women with accurate information about the PMTCT program such as feeding techniques and drug adherence. In developing regions like Nigeria where health care delivery faces a lot of drawbacks, it becomes paramount to address these issues of poor PMTCT coverage by conducting a baseline assessment of the knowledge, practices and perceptions related to HIV prevention amongst healthcare workers in Nigeria. A descriptive cross-sectional study was conducted amongst 250 health workers currently employed in health facilities in Abuja, Nigeria where PMTCT services were offered with the capacity to carry out early infant diagnosis testing (EID). Data was collected using a self-administered, pretested, structured questionnaire. This study showed that the knowledge of PMTCT of HIV was poor (30%) among healthcare workers who offer this service day-to-day to pregnant women. When PMTCT practices were analyzed in keeping with National PMTCT guidelines, over 61% of the respondents reported observing standard practices and the majority (58%) had good attitudes towards caring for patients with HIV/AIDS. Although 61% of the respondents reported being satisfied with the quality of service being rendered, 63% reported not being satisfied with their level of knowledge. Predictors of good knowledge were job designation and level of educational attainment. Health workers who were more satisfied with their working conditions and those who had worked for a longer time in the PMTCT service were more likely to observe standard PMTCT practices. With over 62% of the healthcare workers suggesting that more training would improve the quality of service being rendered, this is a strong pointer to stakeholders to consider a ‘healthcare worker-oriented approach’ when planning and conducting PMTCT training for healthcare workers. This in turn will increase pediatric ARV coverage, the knowledge and effectiveness of the healthcare workers in carrying out appropriate PMTCT interventions and culminating in the reduction/elimination of HIV transmission to newborns.

Keywords: attitudes, HIV/AIDS, healthcare workers, knowledge, mother to child transmission, Nigeria, perceptions

Procedia PDF Downloads 180

1284 Complications of Contact Lens-Associated Keratitis: A Refresher for Emergency Departments

Authors: S. Selman, T. Gout

Abstract:

Microbial keratitis is a serious complication of contact lens wear that can be vision and eye-threatening. Diverse presentations relating to contact lens wear include dry corneal surface, corneal infiltrate, ulceration, scarring, and complete corneal melt leading to perforation. Contact lens wear is a major risk factor and, as such, is an important consideration in any patient presenting with a red eye in the primary care setting. This paper aims to provide an overview of the risk factors, common organisms, and spectrum of contact lens-associated keratitis (CLAK) complications. It will highlight some of the salient points relevant to the assessment and workup of patients suspected of CLAK in the emergency department based on the recent literature and therapeutic guidelines. An overview of the management principles will also be provided.

Keywords: microbial keratitis, corneal pathology, contact lens-associated complications, painful vision loss

Procedia PDF Downloads 82

1283 Fitness Action Recognition Based on MediaPipe

Authors: Zixuan Xu, Yichun Lou, Yang Song, Zihuai Lin

Abstract:

MediaPipe is an open-source machine learning computer vision framework that can be ported into a multi-platform environment, which makes it easier to use it to recognize the human activity. Based on this framework, many human recognition systems have been created, but the fundamental issue is the recognition of human behavior and posture. In this paper, two methods are proposed to recognize human gestures based on MediaPipe, the first one uses the Adaptive Boosting algorithm to recognize a series of fitness gestures, and the second one uses the Fast Dynamic Time Warping algorithm to recognize 413 continuous fitness actions. These two methods are also applicable to any human posture movement recognition.

Keywords: computer vision, MediaPipe, adaptive boosting, fast dynamic time warping

Procedia PDF Downloads 82

1282 Optimizing the Use of Google Translate in Translation Teaching: A Case Study at Prince Sultan University

Authors: Saadia Elamin

Abstract:

The quasi-universal use of smart phones with internet connection available all the time makes it a reflex action for translation undergraduates, once they encounter the least translation problem, to turn to the freely available web resource: Google Translate. Like for other translator resources and aids, the use of Google Translate needs to be moderated in such a way that it contributes to developing translation competence. Here, instead of interfering with students’ learning by providing ready-made solutions which might not always fit into the contexts of use, it can help to consolidate the skills of analysis and transfer which students have already acquired. One way to do so is by training students to adhere to the basic principles of translation work. The most important of these is that analyzing the source text for comprehension comes first and foremost before jumping into the search for target language equivalents. Another basic principle is that certain translator aids and tools can be used for comprehension, while others are to be confined to the phase of re-expressing the meaning into the target language. The present paper reports on the experience of making a measured and reasonable use of Google Translate in translation teaching at Prince Sultan University (PSU), Riyadh. First, it traces the development that has taken place in the field of translation in this age of information technology, be it in translation teaching and translator training, or in the real-world practice of the profession. Second, it describes how, with the aim of reflecting this development onto the way translation is taught, senior students, after being trained on post-editing machine translation output, are authorized to use Google Translate in classwork and assignments. Third, the paper elaborates on the findings of this case study which has demonstrated that Google Translate, if used at the appropriate levels of training, can help to enhance students’ ability to perform different translation tasks. This help extends from the search for terms and expressions, to the tasks of drafting the target text, revising its content and finally editing it. In addition, using Google Translate in this way fosters a reflexive and critical attitude towards web resources in general, maximizing thus the benefit gained from them in preparing students to meet the requirements of the modern translation job market.

Keywords: Google Translate, post-editing machine translation output, principles of translation work, translation competence, translation teaching, translator aids and tools

Procedia PDF Downloads 445

1281 Developing Effective Strategies to Reduce Hiv, Aids and Sexually Transmitted Infections, Nakuru, Kenya

Authors: Brian Bacia, Esther Githaiga, Teresia Kabucho, Paul Moses Ndegwa, Lucy Gichohi

Abstract:

Purpose: The aim of the study is to ensure an appropriate mix of evidence-based prevention strategies geared towards the reduction of new HIV infections and the incidence of Sexually transmitted Illnesses Background: In Nakuru County, more than 90% of all HIV-infected patients are adults and on a single-dose medication-one pill that contains a combination of several different HIV drugs. Nakuru town has been identified as the hardest hit by HIV/Aids in the County according to the latest statistics from the County Aids and STI group, with a prevalence rate of 5.7 percent attributed to the high population and an active urban center. Method: 2 key studies were carried out to provide evidence for the effectiveness of antiretroviral therapy (ART) when used optimally on preventing sexual transmission of HIV. Discussions based on an examination, assessments of successes in planning, program implementation, and ultimate impact of prevention and treatment were undertaken involving health managers, health workers, community health workers, and people living with HIV/AIDS between February -August 2021. Questionnaires were carried out by a trained duo on ethical procedures at 15 HIV treatment clinics targeting patients on ARVs and caregivers on ARV prevention and treatment of pediatric HIV infection. Findings: Levels of AIDS awareness are extremely high. Advances in HIV treatment have led to an enhanced understanding of the virus, improved care of patients, and control of the spread of drug-resistant HIV. There has been a tremendous increase in the number of people living with HIV having access to life-long antiretroviral drugs (ARV), mostly on generic medicines. Healthcare facilities providing treatment are stressed challenging the administration of the drugs, which require a clinical setting. Women find it difficult to take a daily pill which reduces the effectiveness of the medicine. ART adherence can be strengthened largely through the use of innovative digital technology. The case management approach is useful in resource-limited settings. The county has made tremendous progress in mother-to-child transmission reduction through enhanced early antenatal care (ANC) attendance and mapping of pregnant women Recommendations: Treatment reduces the risk of transmission to the child during pregnancy, labor, and delivery. Promote research of medicines through patients and community engagement. Reduce the risk of transmission through breastfeeding. Enhance testing strategies and strengthen health systems for sustainable HIV service delivery. Need exists for improved antenatal care and delivery by skilled birth attendants. Develop a comprehensive maternal reproductive health policy covering equitability, efficient and effective delivery of services. Put in place referral systems.

Keywords: evidence-based prevention strategies, service delivery, human management, integrated approach

Procedia PDF Downloads 68

1280 Flashsonar or Echolocation Education: Expanding the Function of Hearing and Changing the Meaning of Blindness

Authors: Thomas, Daniel Tajo, Kish

Abstract:

Sight is primarily associated with the function of gathering and processing near and extended spatial information which is largely used to support self-determined interaction with the environment through self-directed movement and navigation. By contrast, hearing is primarily associated with the function of gathering and processing sequential information which may typically be used to support self-determined communication through the self-directed use of music and language. Blindness or the lack of vision is traditionally characterized by a lack of capacity to access spatial information which, in turn, is presumed to result in a lack of capacity for self-determined interaction with the environment due to limitations in self-directed movement and navigation. However, through a specific protocol of FlashSonar education developed by World Access for the Blind, the function of hearing can be expanded in blind people to carry out some of the functions normally associated with sight, that is to access and process near and extended spatial information to construct three-dimensional acoustic images of the environment. This perceptual education protocol results in a significant restoration in blind people of self-determined environmental interaction, movement, and navigational capacities normally attributed to vision - a new way to see. Thus, by expanding the function of hearing to process spatial information to restore self-determined movement, we are not only changing the meaning of blindness, and what it means to be blind, but we are also recasting the meaning of vision and what it is to see.

Keywords: echolocation, changing, sensory, function

Procedia PDF Downloads 130

1279 Multilevel Modeling of the Progression of HIV/AIDS Disease among Patients under HAART Treatment

Authors: Awol Seid Ebrie

Abstract:

HIV results as an incurable disease, AIDS. After a person is infected with virus, the virus gradually destroys all the infection fighting cells called CD4 cells and makes the individual susceptible to opportunistic infections which cause severe or fatal health problems. Several studies show that the CD4 cells count is the most determinant indicator of the effectiveness of the treatment or progression of the disease. The objective of this paper is to investigate the progression of the disease over time among patient under HAART treatment. Two main approaches of the generalized multilevel ordinal models; namely the proportional odds model and the nonproportional odds model have been applied to the HAART data. Also, the multilevel part of both models includes random intercepts and random coefficients. In general, four models are explored in the analysis and then the models are compared using the deviance information criteria. Of these models, the random coefficients nonproportional odds model is selected as the best model for the HAART data used as it has the smallest DIC value. The selected model shows that the progression of the disease increases as the time under the treatment increases. In addition, it reveals that gender, baseline clinical stage and functional status of the patient have a significant association with the progression of the disease.

Keywords: nonproportional odds model, proportional odds model, random coefficients model, random intercepts model

Procedia PDF Downloads 395

1278 American Sign Language Recognition System

Authors: Rishabh Nagpal, Riya Uchagaonkar, Venkata Naga Narasimha Ashish Mernedi, Ahmed Hambaba

Abstract:

The rapid evolution of technology in the communication sector continually seeks to bridge the gap between different communities, notably between the deaf community and the hearing world. This project develops a comprehensive American Sign Language (ASL) recognition system, leveraging the advanced capabilities of convolutional neural networks (CNNs) and vision transformers (ViTs) to interpret and translate ASL in real-time. The primary objective of this system is to provide an effective communication tool that enables seamless interaction through accurate sign language interpretation. The architecture of the proposed system integrates dual networks -VGG16 for precise spatial feature extraction and vision transformers for contextual understanding of the sign language gestures. The system processes live input, extracting critical features through these sophisticated neural network models, and combines them to enhance gesture recognition accuracy. This integration facilitates a robust understanding of ASL by capturing detailed nuances and broader gesture dynamics. The system is evaluated through a series of tests that measure its efficiency and accuracy in real-world scenarios. Results indicate a high level of precision in recognizing diverse ASL signs, substantiating the potential of this technology in practical applications. Challenges such as enhancing the system’s ability to operate in varied environmental conditions and further expanding the dataset for training were identified and discussed. Future work will refine the model’s adaptability and incorporate haptic feedback to enhance the interactivity and richness of the user experience. This project demonstrates the feasibility of an advanced ASL recognition system and lays the groundwork for future innovations in assistive communication technologies.

Keywords: sign language, computer vision, vision transformer, VGG16, CNN

Procedia PDF Downloads 7

1277 A Comparison of YOLO Family for Apple Detection and Counting in Orchards

Authors: Yuanqing Li, Changyi Lei, Zhaopeng Xue, Zhuo Zheng, Yanbo Long

Abstract:

In agricultural production and breeding, implementing automatic picking robot in orchard farming to reduce human labour and error is challenging. The core function of it is automatic identification based on machine vision. This paper focuses on apple detection and counting in orchards and implements several deep learning methods. Extensive datasets are used and a semi-automatic annotation method is proposed. The proposed deep learning models are in state-of-the-art YOLO family. In view of the essence of the models with various backbones, a multi-dimensional comparison in details is made in terms of counting accuracy, mAP and model memory, laying the foundation for realising automatic precision agriculture.

Keywords: agricultural object detection, deep learning, machine vision, YOLO family

Procedia PDF Downloads 170

1276 Multimodal Deep Learning for Human Activity Recognition

Authors: Ons Slimene, Aroua Taamallah, Maha Khemaja

Abstract:

In recent years, human activity recognition (HAR) has been a key area of research due to its diverse applications. It has garnered increasing attention in the field of computer vision. HAR plays an important role in people’s daily lives as it has the ability to learn advanced knowledge about human activities from data. In HAR, activities are usually represented by exploiting different types of sensors, such as embedded sensors or visual sensors. However, these sensors have limitations, such as local obstacles, image-related obstacles, sensor unreliability, and consumer concerns. Recently, several deep learning-based approaches have been proposed for HAR and these approaches are classified into two categories based on the type of data used: vision-based approaches and sensor-based approaches. This research paper highlights the importance of multimodal data fusion from skeleton data obtained from videos and data generated by embedded sensors using deep neural networks for achieving HAR. We propose a deep multimodal fusion network based on a twostream architecture. These two streams use the Convolutional Neural Network combined with the Bidirectional LSTM (CNN BILSTM) to process skeleton data and data generated by embedded sensors and the fusion at the feature level is considered. The proposed model was evaluated on a public OPPORTUNITY++ dataset and produced a accuracy of 96.77%.

Keywords: human activity recognition, action recognition, sensors, vision, human-centric sensing, deep learning, context-awareness

Procedia PDF Downloads 71

1275 Caring and Sustainable Government: An Examination of Political Vision of Jeong Do-Jeon

Authors: Hyeon Sop Baek

Abstract:

This paper will briefly investigate Jeong Do-jeon’s political philosophy. Jeong Do-jeon was a Korean Confucian philosopher and politician during the turbulent 14th Century who revolted against the old order, founded Joseon Dynasty, and significantly impacted the development of Korean culture. Jeong’s vision of an ideal state involved a polity that has its roots in the people -that is, an ideal government prioritizes caring for the welfare of the people, respecting and attending to the diverse opinions and concerns of the people, and relies on the genuine, voluntary support of the people. With the neo-Confucian worldview in mind -that every human being has the equal potential to become a moral person- Jeong sought to create a world suitable for everybody to contribute to the decision-making procedure and be able to realize their potential fully. This paper will first examine his works and present a quick overview of his vision of the ideal government. Then, it will examine the Confucian virtues of ren (仁) and yi (義) and how they formulate the basis of his philosophy, and then discuss the central features of his vision of government: popular mandate, equity of wealth, promoting freedom of expression and political participation, and elevating caring disposition as the paramount quality of the political leaders. Furthermore, this paper aims to analyze the element of care inherent within his political philosophy, namely his view on the dynamics of power, nurturing the people, and noncoercive justice. Finally, a discussion on why his philosophy is still relevant in the contemporary context will be provided. Jeong’s view aimed at building a sustainable model of government, by proposing that the people should be the foundation of a state and that they need to be carefully nurtured so they can realize their inborn potential and continue to contribute to the sustenance of the world, is the focal point of Jeong’s philosophy. Just as he sought to rebuild his world following the turmoils of the 14th Century, his philosophy still has a substantial implication on how we should strive to rebuild our society today.

Keywords: Korea, Confucianism, Jeong Do-jeon, Joseon, Korean philosophy, political philosophy

Procedia PDF Downloads 53

1274 Autonomous Ground Vehicle Navigation Based on a Single Camera and Image Processing Methods

Authors: Auday Al-Mayyahi, Phil Birch, William Wang

Abstract:

A vision system-based navigation for autonomous ground vehicle (AGV) equipped with a single camera in an indoor environment is presented. A proposed navigation algorithm has been utilized to detect obstacles represented by coloured mini- cones placed in different positions inside a corridor. For the recognition of the relative position and orientation of the AGV to the coloured mini cones, the features of the corridor structure are extracted using a single camera vision system. The relative position, the offset distance and steering angle of the AGV from the coloured mini-cones are derived from the simple corridor geometry to obtain a mapped environment in real world coordinates. The corridor is first captured as an image using the single camera. Hence, image processing functions are then performed to identify the existence of the cones within the environment. Using a bounding box surrounding each cone allows to identify the locations of cones in a pixel coordinate system. Thus, by matching the mapped and pixel coordinates using a projection transformation matrix, the real offset distances between the camera and obstacles are obtained. Real time experiments in an indoor environment are carried out with a wheeled AGV in order to demonstrate the validity and the effectiveness of the proposed algorithm.

Keywords: autonomous ground vehicle, navigation, obstacle avoidance, vision system, single camera, image processing, ultrasonic sensor

Procedia PDF Downloads 281

1273 An Investigation into Computer Vision Methods to Identify Material Other Than Grapes in Harvested Wine Grape Loads

Authors: Riaan Kleyn

Abstract:

Mass wine production companies across the globe are provided with grapes from winegrowers that predominantly utilize mechanical harvesting machines to harvest wine grapes. Mechanical harvesting accelerates the rate at which grapes are harvested, allowing grapes to be delivered faster to meet the demands of wine cellars. The disadvantage of the mechanical harvesting method is the inclusion of material-other-than-grapes (MOG) in the harvested wine grape loads arriving at the cellar which degrades the quality of wine that can be produced. Currently, wine cellars do not have a method to determine the amount of MOG present within wine grape loads. This paper seeks to find an optimal computer vision method capable of detecting the amount of MOG within a wine grape load. A MOG detection method will encourage winegrowers to deliver MOG-free wine grape loads to avoid penalties which will indirectly enhance the quality of the wine to be produced. Traditional image segmentation methods were compared to deep learning segmentation methods based on images of wine grape loads that were captured at a wine cellar. The Mask R-CNN model with a ResNet-50 convolutional neural network backbone emerged as the optimal method for this study to determine the amount of MOG in an image of a wine grape load. Furthermore, a statistical analysis was conducted to determine how the MOG on the surface of a grape load relates to the mass of MOG within the corresponding grape load.

Keywords: computer vision, wine grapes, machine learning, machine harvested grapes

Procedia PDF Downloads 63

1272 Monocular Depth Estimation Benchmarking with Thermal Dataset

Authors: Ali Akyar, Osman Serdar Gedik

Abstract:

Depth estimation is a challenging computer vision task that involves estimating the distance between objects in a scene and the camera. It predicts how far each pixel in the 2D image is from the capturing point. There are some important Monocular Depth Estimation (MDE) studies that are based on Vision Transformers (ViT). We benchmark three major studies. The first work aims to build a simple and powerful foundation model that deals with any images under any condition. The second work proposes a method by mixing multiple datasets during training and a robust training objective. The third work combines generalization performance and state-of-the-art results on specific datasets. Although there are studies with thermal images too, we wanted to benchmark these three non-thermal, state-of-the-art studies with a hybrid image dataset which is taken by Multi-Spectral Dynamic Imaging (MSX) technology. MSX technology produces detailed thermal images by bringing together the thermal and visual spectrums. Using this technology, our dataset images are not blur and poorly detailed as the normal thermal images. On the other hand, they are not taken at the perfect light conditions as RGB images. We compared three methods under test with our thermal dataset which was not done before. Additionally, we propose an image enhancement deep learning model for thermal data. This model helps extract the features required for monocular depth estimation. The experimental results demonstrate that, after using our proposed model, the performance of these three methods under test increased significantly for thermal image depth prediction.

Keywords: monocular depth estimation, thermal dataset, benchmarking, vision transformers

Procedia PDF Downloads 15

1271 Social Perspectives on Population of People Living Postively; An Indian Scenario, Evidence from Tiruchirappalli

Authors: Uwonkunda Jeanne, J. Godwin Prem Singh, Anjaneyalu Subbiah

Abstract:

HIV/AIDS is known to affect an individual not only physically but also mentally, socially, and financially. It is a syndrome that builds a vacuum in a person affecting his/her life as a whole.

Keywords: People living with HIV, social dysfunction, stigma, and Social support.

Procedia PDF Downloads 471

1270 Advanced Concrete Crack Detection Using Light-Weight MobileNetV2 Neural Network

Authors: Li Hui, Riyadh Hindi

Abstract:

Concrete structures frequently suffer from crack formation, a critical issue that can significantly reduce their lifespan by allowing damaging agents to enter. Traditional methods of crack detection depend on manual visual inspections, which heavily relies on the experience and expertise of inspectors using tools. In this study, a more efficient, computer vision-based approach is introduced by using the lightweight MobileNetV2 neural network. A dataset of 40,000 images was used to develop a specialized crack evaluation algorithm. The analysis indicates that MobileNetV2 matches the accuracy of traditional CNN methods but is more efficient due to its smaller size, making it well-suited for mobile device applications. The effectiveness and reliability of this new method were validated through experimental testing, highlighting its potential as an automated solution for crack detection in concrete structures.

Keywords: Concrete crack, computer vision, deep learning, MobileNetV2 neural network

Procedia PDF Downloads 39

1269 Web Page Design Optimisation Based on Segment Analytics

Authors: Varsha V. Rohini, P. R. Shreya, B. Renukadevi

Abstract:

In the web analytics the information delivery and the web usage is optimized and the analysis of data is done. The analytics is the measurement, collection and analysis of webpage data. Page statistics and user metrics are the important factor in most of the web analytics tool. This is the limitation of the existing tools. It does not provide design inputs for the optimization of information. This paper aims at providing an extension for the scope of web analytics to provide analysis and statistics of each segment of a webpage. The number of click count is calculated and the concentration of links in a web page is obtained. Its user metrics are used to help in proper design of the displayed content in a webpage by Vision Based Page Segmentation (VIPS) algorithm. When the algorithm is applied on the web page it divides the entire web page into the visual block tree. The visual block tree generated will further divide the web page into visual blocks or segments which help us to understand the usage of each segment in a page and its content. The dynamic web pages and deep web pages are used to extend the scope of web page segment analytics. Space optimization concept is used with the help of the output obtained from the Vision Based Page Segmentation (VIPS) algorithm. This technique provides us the visibility of the user interaction with the WebPages and helps us to place the important links in the appropriate segments of the webpage and effectively manage space in a page and the concentration of links.

Keywords: analytics, design optimization, visual block trees, vision based technology

Procedia PDF Downloads 242

1268 Statistical Analysis of Natural Images after Applying ICA and ISA

Authors: Peyman Sheikholharam Mashhadi

Abstract:

Difficulties in analyzing real world images in classical image processing and machine vision framework have motivated researchers towards considering the biology-based vision. It is a common belief that mammalian visual cortex has been adapted to the statistics of the real world images through the evolution process. There are two well-known successful models of mammalian visual cortical cells: Independent Component Analysis (ICA) and Independent Subspace Analysis (ISA). In this paper, we statistically analyze the dependencies which remain in the components after applying these models to the natural images. Also, we investigate the response of feature detectors to gratings with various parameters in order to find optimal parameters of the feature detectors. Finally, the selectiveness of feature detectors to phase, in both models is considered.

Keywords: statistics, independent component analysis, independent subspace analysis, phase, natural images

Procedia PDF Downloads 323

1267 Enhancer: An Effective Transformer Architecture for Single Image Super Resolution

Authors: Pitigalage Chamath Chandira Peiris

Abstract:

A widely researched domain in the field of image processing in recent times has been single image super-resolution, which tries to restore a high-resolution image from a single low-resolution image. Many more single image super-resolution efforts have been completed utilizing equally traditional and deep learning methodologies, as well as a variety of other methodologies. Deep learning-based super-resolution methods, in particular, have received significant interest. As of now, the most advanced image restoration approaches are based on convolutional neural networks; nevertheless, only a few efforts have been performed using Transformers, which have demonstrated excellent performance on high-level vision tasks. The effectiveness of CNN-based algorithms in image super-resolution has been impressive. However, these methods cannot completely capture the non-local features of the data. Enhancer is a simple yet powerful Transformer-based approach for enhancing the resolution of images. A method for single image super-resolution was developed in this study, which utilized an efficient and effective transformer design. This proposed architecture makes use of a locally enhanced window transformer block to alleviate the enormous computational load associated with non-overlapping window-based self-attention. Additionally, it incorporates depth-wise convolution in the feed-forward network to enhance its ability to capture local context. This study is assessed by comparing the results obtained for popular datasets to those obtained by other techniques in the domain.

Keywords: single image super resolution, computer vision, vision transformers, image restoration

Procedia PDF Downloads 80

1266 Hand Detection and Recognition for Malay Sign Language

Authors: Mohd Noah A. Rahman, Afzaal H. Seyal, Norhafilah Bara

Abstract:

Developing a software application using an interface with computers and peripheral devices using gestures of human body such as hand movements keeps growing in interest. A review on this hand gesture detection and recognition based on computer vision technique remains a very challenging task. This is to provide more natural, innovative and sophisticated way of non-verbal communication, such as sign language, in human computer interaction. Nevertheless, this paper explores hand detection and hand gesture recognition applying a vision based approach. The hand detection and recognition used skin color spaces such as HSV and YCrCb are applied. However, there are limitations that are needed to be considered. Almost all of skin color space models are sensitive to quickly changing or mixed lighting circumstances. There are certain restrictions in order for the hand recognition to give better results such as the distance of user’s hand to the webcam and the posture and size of the hand.

Keywords: hand detection, hand gesture, hand recognition, sign language

Procedia PDF Downloads 280

1265 Enhancing Fall Detection Accuracy with a Transfer Learning-Aided Transformer Model Using Computer Vision

Authors: Sheldon McCall, Miao Yu, Liyun Gong, Shigang Yue, Stefanos Kollias

Abstract:

Falls are a significant health concern for older adults globally, and prompt identification is critical to providing necessary healthcare support. Our study proposes a new fall detection method using computer vision based on modern deep learning techniques. Our approach involves training a trans- former model on a large 2D pose dataset for general action recognition, followed by transfer learning. Specifically, we freeze the first few layers of the trained transformer model and train only the last two layers for fall detection. Our experimental results demonstrate that our proposed method outperforms both classical machine learning and deep learning approaches in fall/non-fall classification. Overall, our study suggests that our proposed methodology could be a valuable tool for identifying falls.

Keywords: healthcare, fall detection, transformer, transfer learning

Procedia PDF Downloads 103

1264 Neural Style Transfer Using Deep Learning

Authors: Shaik Jilani Basha, Inavolu Avinash, Alla Venu Sai Reddy, Bitragunta Taraka Ramu

Abstract:

We can use the neural style transfer technique to build a picture with the same "content" as the beginning image but the "style" of the picture we've chosen. Neural style transfer is a technique for merging the style of one image into another while retaining its original information. The only change is how the image is formatted to give it an additional artistic sense. The content image depicts the plan or drawing, as well as the colors of the drawing or paintings used to portray the style. It is a computer vision programme that learns and processes images through deep convolutional neural networks. To implement software, we used to train deep learning models with the train data, and whenever a user takes an image and a styled image, the output will be as the style gets transferred to the original image, and it will be shown as the output.

Keywords: neural networks, computer vision, deep learning, convolutional neural networks

Procedia PDF Downloads 59

1263 The Dual Catastrophe of Behçet’s Disease Visual Loss Followed by Acute Spinal Shock After Lumbar Drain Removal

Authors: Naim Izet Kajtazi

Abstract:

Context: Increased intracranial pressure and associated symptoms such as headache, papilledema, motor or sensory deficits, seizures, and conscious disturbance are well-known in acute CVT. However, visual loss is not commonly associated with this disease, except in the case of secondary IIH associated with it. Process: We report a case of a 40-year-old male with Behçet’s disease and cerebral venous thrombosis, and other multiple comorbidities admitted with a four-day history of increasing headache and rapidly progressive visual loss bilaterally. The neurological examination was positive for bilateral papilledema of grade 3 with light perception on the left eye and counting fingers on the right eye. Brain imaging showed old findings of cerebral venous thrombosis without any intraparenchymal lesions to suggest a flare-up of Behçet’s disease. The lumbar puncture, followed by the lumbar drain insertion, gave no benefit in headache or vision. However, he completely lost sight. The right optic nerve sheath fenestration did not result in vision improvement. The acute spinal shock complicated the lumbar drain removal due to epidural hematoma. An urgent lumbar laminectomy with hematoma evacuation undertook. Intra-operatively, the neurosurgeon noted suspicious abnormal vessels at conus medullaris with the possibility of an arteriovenous malformation. Outcome: In a few days following the spinal surgery, the patient vision started to improve. Further improvement was achieved after plasma exchange sessions followed by cyclophosphamide. In the recent follow-up in the clinic, he reported better vision, drove, and completed his Ph.D. studies. Relevance: Visual loss in patients with Behçet’s disease should always be anticipated and taken reasonable care of, ensuring that they receive well-combined immunosuppression with anticoagulation and agents to reduce intracranial pressure. This patient’s story is significant for a high disease burden and complicated hospital course by acute spinal shock due to spinal lumbar drain removal with a possible underlying spinal arteriovenous malformation.

Keywords: Behcet disease, optic neuritis, IIH, CVT

Procedia PDF Downloads 49

1262 Improving the Performance of Deep Learning in Facial Emotion Recognition with Image Sharpening

Authors: Ksheeraj Sai Vepuri, Nada Attar

Abstract:

We as humans use words with accompanying visual and facial cues to communicate effectively. Classifying facial emotion using computer vision methodologies has been an active research area in the computer vision field. In this paper, we propose a simple method for facial expression recognition that enhances accuracy. We tested our method on the FER-2013 dataset that contains static images. Instead of using Histogram equalization to preprocess the dataset, we used Unsharp Mask to emphasize texture and details and sharpened the edges. We also used ImageDataGenerator from Keras library for data augmentation. Then we used Convolutional Neural Networks (CNN) model to classify the images into 7 different facial expressions, yielding an accuracy of 69.46% on the test set. Our results show that using image preprocessing such as the sharpening technique for a CNN model can improve the performance, even when the CNN model is relatively simple.

Keywords: facial expression recognittion, image preprocessing, deep learning, CNN

Procedia PDF Downloads 110

1261 Prevalence of Near Visual Impairment and Associated Factors among School Teachers in Gondar City, North West Ethiopia, 2022

Authors: Bersufekad Wubie

Abstract:

Introduction: Near visual impairment is presenting near visual acuity of the eye worse than N6 at a 40 cm distance. Teachers' regular duties, such as reading books, writing on the blackboard, and recognizing students' faces, need good near vision. If a teacher has near-visual impairment, the work output is unsatisfactory. Objective: The study was aimed to assess the prevalence and associated factors near vision impairment among school teachers at Gondar city Northwest Ethiopia, August 2022. Methods: To select 567 teachers in Gondar city schools, an institutional-based cross-sectional study design with a multistage sampling technique were used. The study was conducted in selected schools from May 1 to May 30, 2022. Trained data collectors used well-structured Amharic and English language questionnaires and ophthalmic instruments for examination. The collected data were checked for completeness and entered into Epi data version 4.6, then exported to SPSS version 26 for further analysis. A binary and multivariate logistic regression model was fitted. And associated factors of the outcome variable. Result: The prevalence of near visual impairment was 64.6%, with a confidence interval of 60.3%–68.4%. Near visual impairment was significantly associated with age >= 35 years (AOR: 4.90 at 95% CI: 3.15, 7.65), having prolonged years of teaching experience (AOR: 3.29 at 95% CI: 1.70, 4.62), having a history of ocular surgery (AOR: 1.96 at 95% CI: 1.10, 4.62), smokers (AOR: 2.21 at 95% CI: 1.22, 4.07), history of ocular trauma (AOR : 1.80 at 95%CI:1.11,3.18 and uncorrected refractive error (AOR:2.01 at 95%CI:1.13,4.03). Conclusion and recommendations: This study showed the prevalence of near vision impairment among school teachers was high, and it is not a problem of the presbyopia age group alone; it also happens at a young age. So teachers' ocular health should be well accommodated in the school's eye health.

Keywords: Gondar, near visual impairment, school, teachers

Procedia PDF Downloads 104