Search results for: vision picking
1031 Enhanced Traffic Light Detection Method Using Geometry Information
Authors: Changhwan Choi, Yongwan Park
Abstract:
In this paper, we propose a method that allows faster and more accurate detection of traffic lights by a vision sensor during driving. DGPS is used to obtain the physical location of a traffic light; only the traffic light area at that location is extracted from the vision sensor's image, and the method then ascertains whether the signal is in operation and determines its form. This method can solve the problem in existing research where low visibility at night or reflection under bright light makes it difficult to recognize the form of a traffic light, thus making driving unstable. We compared our success rate of traffic light recognition in day and night road environments. Compared to previous research, it showed similar performance during the day but a 50% improvement at night.
Keywords: traffic light, intelligent vehicle, night, detection, DGPS
Procedia PDF Downloads 325
1030 Promoting Diversity in Leadership: Exploring Women's Roles in Corporate Governance, with a Focus on Saudi Arabia
Authors: Norah Salem Al Mosa
Abstract:
This paper critically examines the ethical position of academic scholarship concerning "women in leadership" in Saudi Arabia, focusing on the context of the Saudi Vision 2030 initiative. While this vision places a strong emphasis on empowering women and increasing their presence in the workforce, women still face significant cultural, organisational, and personal barriers to leadership roles. The existing literature highlights the challenges Saudi women encounter, including the male guardianship system, and international perspectives add complexity to the issue. The debate among scholars about considering cultural context versus highlighting ongoing challenges is explored. The paper underscores that despite efforts to enhance women's representation in leadership positions, progress has been slow due to cultural norms, the absence of legal quotas, and limited access to education and professional development. It raises questions about the seriousness of research efforts and the government's commitment to gender equality in leadership roles, emphasising the need for increased academic scrutiny in this area. Ultimately, the paper aims to enhance understanding of the challenges and opportunities for women in leadership roles, their contributions to corporate governance in Saudi Arabia, and potential implications beyond its borders.
Keywords: female directors, gender diversity, women on executive positions, Saudi vision 2030
Procedia PDF Downloads 60
1029 Using Computer Vision to Detect and Localize Fractures in Wrist X-ray Images
Authors: John Paul Q. Tomas, Mark Wilson L. de los Reyes, Kirsten Joyce P. Vasquez
Abstract:
The most frequent type of fracture is a wrist fracture, which is often difficult for medical professionals to find and locate. In this study, fractures in wrist x-ray pictures were located and identified using deep learning and computer vision. The researchers used image filtering, masking, morphological operations, and data augmentation for the image preprocessing and trained the RetinaNet and Faster R-CNN models with ResNet50 backbones and Adam optimizers separately for each image filtering technique and projection. The RetinaNet model with the Anisotropic Diffusion Smoothing filter, trained for 50 epochs, obtained the highest accuracy of 99.14%, precision of 100%, sensitivity/recall of 98.41%, specificity of 100%, and an IoU score of 56.44% for the Posteroanterior projection using augmented data. For the Lateral projection using augmented data, the RetinaNet model with an Anisotropic Diffusion filter, trained for 50 epochs, produced the highest accuracy of 98.40%, precision of 98.36%, sensitivity/recall of 98.36%, specificity of 98.43%, and an IoU score of 58.69%. When comparing the test results of the different individual projections, models, and image filtering techniques, the Anisotropic Diffusion filter trained for 50 epochs produced the best classification and regression scores for both projections.
Keywords: Artificial Intelligence, Computer Vision, Wrist Fracture, Deep Learning
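The metrics quoted in this abstract (accuracy, precision, sensitivity/recall, specificity, IoU) follow standard definitions. The sketch below is a minimal illustration of those definitions only; the function names and the box format `(x_min, y_min, x_max, y_max)` are assumptions made for this example and are not taken from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard metrics from confusion-matrix counts.

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives, and false negatives (hypothetical inputs).
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,       # sensitivity
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


def box_iou(a, b):
    """Intersection over Union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this format is an assumption.
    """
    # Corners of the overlapping rectangle (may be empty).
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Under these definitions, a model with no false positives has both precision and specificity equal to 1, which is consistent with the paired 100% figures reported for the Posteroanterior projection.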
Procedia PDF Downloads 73
1028 Injury Prediction for Soccer Players Using Machine Learning
Authors: Amiel Satvedi, Richard Pyne
Abstract:
Injuries in professional sports occur on a regular basis. Some may be minor, while others can have a huge impact on a player's career and earning potential. In soccer, there is a high risk of players picking up injuries during game time. This research work seeks to help soccer players reduce the risk of getting injured by predicting the likelihood of injury while playing in the near future and then providing recommendations for intervention. The injury prediction tool will use a soccer player's number of minutes played on the field, number of appearances, distance covered, and performance data for the current and previous seasons as variables to conduct statistical analysis and provide injury predictive results using a machine learning linear regression model.
Keywords: injury predictor, soccer injury prevention, machine learning in soccer, big data in soccer
Procedia PDF Downloads 182
1027 Location Tracking of Human Using Mobile Robot and Wireless Sensor Networks
Authors: Muazzam A. Khan
Abstract:
In order to avoid dangerous environmental disasters, robots are being recognized as good candidates to step in as human rescuers. Robots, especially those furnished with advanced sensors, have been gaining the interest of many researchers in rescue matters. In a distributed wireless robot system, the main objective of a rescue system is to track the location of the object continuously. This paper provides a novel idea to track and locate humans in a disaster area using a stereo vision system and ZigBee technology. The system recursively predicts and updates the 3D coordinates of a human in the robot's camera coordinate system, which makes the system cost-effective. The system comprises a ZigBee network, which has many advantages such as low power consumption, self-healing, low data rates, and low cost.
Keywords: stereo vision, segmentation, classification, human tracking, ZigBee module
Procedia PDF Downloads 494
1026 Efficient Passenger Counting in Public Transport Based on Machine Learning
Authors: Chonlakorn Wiboonsiriruk, Ekachai Phaisangittisagul, Chadchai Srisurangkul, Itsuo Kumazawa
Abstract:
Public transportation is a crucial aspect of passenger transportation, with buses playing a vital role in the transportation service. Passenger counting is an essential tool for organizing and managing transportation services. However, manual counting is a tedious and time-consuming task, which is why computer vision algorithms are being utilized to make the process more efficient. In this study, different object detection algorithms combined with passenger tracking are investigated to compare passenger counting performance. The system employs the EfficientDet algorithm, which has demonstrated superior performance in terms of speed and accuracy. Our results show that the proposed system can accurately count passengers in varying conditions with an accuracy of 94%.
Keywords: computer vision, object detection, passenger counting, public transportation
Procedia PDF Downloads 155
1025 Control of Belts for Classification of Geometric Figures by Artificial Vision
Authors: Juan Sebastian Huertas Piedrahita, Jaime Arturo Lopez Duque, Eduardo Luis Perez Londoño, Julián S. Rodríguez
Abstract:
The process of generating computer vision is called artificial vision. Artificial vision is a branch of artificial intelligence that allows the obtaining, processing, and analysis of any type of information, especially that obtained through digital images. Currently, artificial vision is used in manufacturing for quality control and production, as these processes can be carried out through counting, positioning, and object recognition algorithms for objects that can be measured by one or more cameras. Companies also use assembly lines formed by conveyor systems with actuators for moving pieces from one location to another during production. These devices must be programmed in advance with a logic routine to perform well. Nowadays, production is the main target of every industry, along with quality and the fast completion of the different stages and processes in the production chain of any product or service being offered. The main goal of this project is to program a computer that recognizes geometric figures (circle, square, and triangle) through a camera, each one with a different color, and to link it with a group of conveyor systems that sort the figures into cubicles, which are also distinguished from one another by color. Because this project is based on artificial vision, the methodology needed to develop it must be strict; it is detailed below.
1. Methodology:
1.1 The software used in this project is Qt Creator, which is linked with the OpenCV libraries. Together, these tools are used to build the program that identifies colors and forms directly from the camera feed on the computer.
1.2 Image acquisition: To start using the OpenCV libraries, it is necessary to acquire images, which can be captured by a computer’s web camera or a different specialized camera.
1.3 The recognition of RGB colors is done in code by traversing the matrices of the captured images and comparing pixels, identifying the primary colors, which are red, green, and blue.
1.4 To detect forms, it is necessary to segment the images. The first step is converting the image from RGB to grayscale to work with the dark tones of the image; then the image is binarized, which means rendering the figure in white on a black background. Finally, we find the contours of the figure in the image and count its edges to identify which figure it is.
1.5 After the color and figure have been identified, the program links with the conveyor systems, which, through the actuators, classify the figures into their respective cubicles.
Conclusions: The OpenCV library is a useful tool for projects in which an interface between a computer and the environment is required, since the camera captures external characteristics that the program can then process. With the program for this project, any type of assembly line can be optimized because images from the environment can be obtained and the process would be more accurate.
Keywords: artificial intelligence, artificial vision, binarized, grayscale, images, RGB
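Steps 1.3 and 1.4 of the methodology in this abstract (pixel-wise color comparison, grayscale conversion, binarization) can be illustrated with a small sketch. The abstract describes a Qt Creator and OpenCV implementation; the version below is not that program. It is a dependency-free Python illustration in which the image is a nested list of `(r, g, b)` tuples, and the function names, the darkness cutoff of 50, the luminance weights, and the threshold of 20 are all assumptions chosen for the example. Contour extraction and edge counting are left out.

```python
def dominant_primary(pixels, dark_cutoff=50):
    """Vote for red, green, or blue over all non-dark pixels.

    pixels: rows of (r, g, b) tuples with values 0-255.
    dark_cutoff: pixels darker than this are treated as background
    (an assumed value, not from the abstract).
    """
    votes = {"red": 0, "green": 0, "blue": 0}
    for row in pixels:
        for r, g, b in row:
            if max(r, g, b) < dark_cutoff:
                continue  # skip the dark background
            name = max((("red", r), ("green", g), ("blue", b)),
                       key=lambda pair: pair[1])[0]
            votes[name] += 1
    if not any(votes.values()):
        return None  # no colored pixels found
    return max(votes, key=votes.get)


def to_grayscale(pixels):
    """Convert RGB to grayscale with the common luminance weights."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in pixels]


def binarize(gray, threshold=20):
    """White figure (255) on black background (0).

    threshold is an assumed value, low enough that a pure blue figure
    (gray level about 29) still ends up white.
    """
    return [[255 if value > threshold else 0 for value in row]
            for row in gray]
```

In an OpenCV-based program, the remaining step (finding contours and counting edges) is commonly done with `findContours` followed by a polygon approximation such as `approxPolyDP`; whether the authors used those particular calls is not stated in the abstract.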
Procedia PDF Downloads 378
1024 Football Smart Coach: Analyzing Corner Kicks Using Computer Vision
Authors: Arth Bohra, Marwa Mahmoud
Abstract:
In this paper, we utilize computer vision to develop a tool for youth coaches to formulate set-piece tactics for their players. We used the SoccerNet database to extract the ResNet features and camera calibration data for over 3000 corner kicks across 500 professional matches in the top 6 European leagues (English Premier League, UEFA Champions League, Ligue 1, La Liga, Serie A, Bundesliga). Leveraging the provided homography matrix, we construct a feature vector representing the formation of players on these corner kicks. Additionally, labeling the videos manually, we obtained the pass trajectory of each of the 3000+ corner kicks by segmenting the field into four zones. Next, after determining the localization of the players and ball, we used event data to give the corner kicks a rating on a 1-4 scale. By employing a Convolutional Neural Network, our model managed to predict the success of a corner kick given the formations of players. This suggests that with the right formations, teams can optimize the way they approach corner kicks. By understanding this, we can help coaches formulate set-piece tactics for their own teams in order to maximize the success of their play. The proposed model can be easily extended; our method could be applied to even more game situations, from free kicks to counterattacks. This research project also gives insight into the myriad possibilities that artificial intelligence offers for transforming the domain of sports.
Keywords: soccer, corner kicks, AI, computer vision
Procedia PDF Downloads 174
1023 Vision-Based Daily Routine Recognition for Healthcare with Transfer Learning
Authors: Bruce X. B. Yu, Yan Liu, Keith C. C. Chan
Abstract:
We propose to record Activities of Daily Living (ADLs) of elderly people using a vision-based system so as to provide better assistive and personalization technologies. Current ADL-related research is based on data collected with help from non-elderly subjects in laboratory environments, and the activities performed are predetermined for the sole purpose of data collection. To obtain more realistic datasets for the application, we recorded ADLs for the elderly with data collected from a real-world environment involving real elderly subjects. Motivated by the need to collect data for more effective research related to elderly care, we chose to collect data in the room of an elderly person. Specifically, we installed Kinect, a vision-based sensor, on the ceiling to capture the activities that the elderly subject performs in the morning every day. Based on the data, we identified 12 morning activities that the elderly person performs daily. To recognize these activities, we created a HARELCARE framework to investigate the effectiveness of existing Human Activity Recognition (HAR) algorithms and propose the use of a transfer learning algorithm for HAR. We compared performance in terms of accuracy and training progress. Although the collected dataset is relatively small, the proposed algorithm has good potential to be applied to all daily routine activities for healthcare purposes such as evidence-based diagnosis and treatment.
Keywords: daily activity recognition, healthcare, IoT sensors, transfer learning
Procedia PDF Downloads 132
1022 Complications of Contact Lens-Associated Keratitis: A Refresher for Emergency Departments
Abstract:
Microbial keratitis is a serious complication of contact lens wear that can be vision- and eye-threatening. Diverse presentations relating to contact lens wear include dry corneal surface, corneal infiltrate, ulceration, scarring, and complete corneal melt leading to perforation. Contact lens wear is a major risk factor and, as such, is an important consideration in any patient presenting with a red eye in the primary care setting. This paper aims to provide an overview of the risk factors, common organisms, and spectrum of contact lens-associated keratitis (CLAK) complications. It will highlight some of the salient points relevant to the assessment and workup of patients suspected of CLAK in the emergency department based on the recent literature and therapeutic guidelines. An overview of the management principles will also be provided.
Keywords: microbial keratitis, corneal pathology, contact lens-associated complications, painful vision loss
Procedia PDF Downloads 110
1021 Fitness Action Recognition Based on MediaPipe
Authors: Zixuan Xu, Yichun Lou, Yang Song, Zihuai Lin
Abstract:
MediaPipe is an open-source machine learning computer vision framework that can be ported into a multi-platform environment, which makes it easier to use for recognizing human activity. Based on this framework, many human recognition systems have been created, but the fundamental issue is the recognition of human behavior and posture. In this paper, two methods are proposed to recognize human gestures based on MediaPipe: the first uses the Adaptive Boosting algorithm to recognize a series of fitness gestures, and the second uses the Fast Dynamic Time Warping algorithm to recognize 413 continuous fitness actions. These two methods are also applicable to any human posture movement recognition.
Keywords: computer vision, MediaPipe, adaptive boosting, fast dynamic time warping
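To make the dynamic time warping idea in this abstract concrete, here is a minimal sketch of the classic quadratic-time DTW distance between two one-dimensional sequences. It is not the Fast Dynamic Time Warping approximation the authors use, and it is not their implementation; the function name `dtw_distance` and the use of absolute difference as the local cost are assumptions for illustration. In a fitness setting, the sequences might be a joint angle sampled over time.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two sequences.

    Uses |x - y| as the local cost and allows match, insertion, and
    deletion steps. Runs in O(len(a) * len(b)) time and memory.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances alone
                                     cost[i][j - 1],      # b advances alone
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

Because the alignment may stretch one sequence against the other, the same movement performed at a different speed (for example `[1, 2, 3]` versus `[1, 2, 2, 3]`) yields a distance of zero, which is the property that makes DTW attractive for comparing repetitions of an exercise.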
Procedia PDF Downloads 119
1020 Flashsonar or Echolocation Education: Expanding the Function of Hearing and Changing the Meaning of Blindness
Authors: Thomas Tajo, Daniel Kish
Abstract:
Sight is primarily associated with the function of gathering and processing near and extended spatial information which is largely used to support self-determined interaction with the environment through self-directed movement and navigation. By contrast, hearing is primarily associated with the function of gathering and processing sequential information which may typically be used to support self-determined communication through the self-directed use of music and language. Blindness or the lack of vision is traditionally characterized by a lack of capacity to access spatial information which, in turn, is presumed to result in a lack of capacity for self-determined interaction with the environment due to limitations in self-directed movement and navigation. However, through a specific protocol of FlashSonar education developed by World Access for the Blind, the function of hearing can be expanded in blind people to carry out some of the functions normally associated with sight, that is to access and process near and extended spatial information to construct three-dimensional acoustic images of the environment. This perceptual education protocol results in a significant restoration in blind people of self-determined environmental interaction, movement, and navigational capacities normally attributed to vision - a new way to see. Thus, by expanding the function of hearing to process spatial information to restore self-determined movement, we are not only changing the meaning of blindness, and what it means to be blind, but we are also recasting the meaning of vision and what it is to see.
Keywords: echolocation, changing, sensory, function
Procedia PDF Downloads 154
1019 American Sign Language Recognition System
Authors: Rishabh Nagpal, Riya Uchagaonkar, Venkata Naga Narasimha Ashish Mernedi, Ahmed Hambaba
Abstract:
The rapid evolution of technology in the communication sector continually seeks to bridge the gap between different communities, notably between the deaf community and the hearing world. This project develops a comprehensive American Sign Language (ASL) recognition system, leveraging the advanced capabilities of convolutional neural networks (CNNs) and vision transformers (ViTs) to interpret and translate ASL in real-time. The primary objective of this system is to provide an effective communication tool that enables seamless interaction through accurate sign language interpretation. The architecture of the proposed system integrates dual networks: VGG16 for precise spatial feature extraction and vision transformers for contextual understanding of the sign language gestures. The system processes live input, extracting critical features through these sophisticated neural network models, and combines them to enhance gesture recognition accuracy. This integration facilitates a robust understanding of ASL by capturing detailed nuances and broader gesture dynamics. The system is evaluated through a series of tests that measure its efficiency and accuracy in real-world scenarios. Results indicate a high level of precision in recognizing diverse ASL signs, substantiating the potential of this technology in practical applications. Challenges such as enhancing the system’s ability to operate in varied environmental conditions and further expanding the dataset for training were identified and discussed. Future work will refine the model’s adaptability and incorporate haptic feedback to enhance the interactivity and richness of the user experience. This project demonstrates the feasibility of an advanced ASL recognition system and lays the groundwork for future innovations in assistive communication technologies.
Keywords: sign language, computer vision, vision transformer, VGG16, CNN
Procedia PDF Downloads 43
1018 Multimodal Deep Learning for Human Activity Recognition
Authors: Ons Slimene, Aroua Taamallah, Maha Khemaja
Abstract:
In recent years, human activity recognition (HAR) has been a key area of research due to its diverse applications. It has garnered increasing attention in the field of computer vision. HAR plays an important role in people’s daily lives as it has the ability to learn advanced knowledge about human activities from data. In HAR, activities are usually represented by exploiting different types of sensors, such as embedded sensors or visual sensors. However, these sensors have limitations, such as local obstacles, image-related obstacles, sensor unreliability, and consumer concerns. Recently, several deep learning-based approaches have been proposed for HAR, and these approaches are classified into two categories based on the type of data used: vision-based approaches and sensor-based approaches. This research paper highlights the importance of multimodal data fusion from skeleton data obtained from videos and data generated by embedded sensors using deep neural networks for achieving HAR. We propose a deep multimodal fusion network based on a two-stream architecture. These two streams use a Convolutional Neural Network combined with a Bidirectional LSTM (CNN-BiLSTM) to process skeleton data and data generated by embedded sensors, and fusion at the feature level is considered. The proposed model was evaluated on the public OPPORTUNITY++ dataset and produced an accuracy of 96.77%.
Keywords: human activity recognition, action recognition, sensors, vision, human-centric sensing, deep learning, context-awareness
Procedia PDF Downloads 101
1017 Caring and Sustainable Government: An Examination of Political Vision of Jeong Do-Jeon
Authors: Hyeon Sop Baek
Abstract:
This paper will briefly investigate Jeong Do-jeon’s political philosophy. Jeong Do-jeon was a Korean Confucian philosopher and politician during the turbulent 14th Century who revolted against the old order, founded the Joseon Dynasty, and significantly impacted the development of Korean culture. Jeong’s vision of an ideal state involved a polity that has its roots in the people; that is, an ideal government prioritizes caring for the welfare of the people, respects and attends to the diverse opinions and concerns of the people, and relies on the genuine, voluntary support of the people. With the neo-Confucian worldview in mind (that every human being has the equal potential to become a moral person), Jeong sought to create a world suitable for everybody to contribute to the decision-making procedure and be able to realize their potential fully. This paper will first examine his works and present a quick overview of his vision of the ideal government. Then, it will examine the Confucian virtues of ren (仁) and yi (義) and how they formulate the basis of his philosophy, and then discuss the central features of his vision of government: popular mandate, equity of wealth, promoting freedom of expression and political participation, and elevating caring disposition as the paramount quality of the political leaders. Furthermore, this paper aims to analyze the element of care inherent within his political philosophy, namely his view on the dynamics of power, nurturing the people, and noncoercive justice. Finally, a discussion on why his philosophy is still relevant in the contemporary context will be provided. The focal point of Jeong’s philosophy is a sustainable model of government: the people should be the foundation of a state, and they need to be carefully nurtured so that they can realize their inborn potential and continue to contribute to the sustenance of the world. Just as he sought to rebuild his world following the turmoil of the 14th Century, his philosophy still has substantial implications for how we should strive to rebuild our society today.
Keywords: Korea, Confucianism, Jeong Do-jeon, Joseon, Korean philosophy, political philosophy
Procedia PDF Downloads 80
1016 Autonomous Ground Vehicle Navigation Based on a Single Camera and Image Processing Methods
Authors: Auday Al-Mayyahi, Phil Birch, William Wang
Abstract:
A vision system-based navigation method for an autonomous ground vehicle (AGV) equipped with a single camera in an indoor environment is presented. A proposed navigation algorithm has been utilized to detect obstacles represented by coloured mini-cones placed in different positions inside a corridor. For the recognition of the relative position and orientation of the AGV to the coloured mini-cones, the features of the corridor structure are extracted using a single camera vision system. The relative position, the offset distance, and the steering angle of the AGV from the coloured mini-cones are derived from the simple corridor geometry to obtain a mapped environment in real-world coordinates. The corridor is first captured as an image using the single camera. Image processing functions are then performed to identify the existence of the cones within the environment. Using a bounding box surrounding each cone allows the locations of the cones to be identified in a pixel coordinate system. Thus, by matching the mapped and pixel coordinates using a projection transformation matrix, the real offset distances between the camera and obstacles are obtained. Real-time experiments in an indoor environment are carried out with a wheeled AGV in order to demonstrate the validity and the effectiveness of the proposed algorithm.
Keywords: autonomous ground vehicle, navigation, obstacle avoidance, vision system, single camera, image processing, ultrasonic sensor
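The step of matching pixel and mapped coordinates through a projection transformation matrix can be sketched as applying a 3x3 planar homography to a point taken from each cone's bounding box. This is only an illustration under stated assumptions: the matrix `H` is taken as already known (the abstract does not say how it is estimated), the bottom-center of the bounding box is assumed to be the point where the cone meets the floor, and both function names are hypothetical.

```python
def bbox_ground_point(box):
    """Bottom-center of a bounding box (x_min, y_min, x_max, y_max).

    Assumed here to approximate where the cone touches the floor,
    with image y increasing downward.
    """
    x_min, _y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, y_max)


def apply_homography(H, u, v):
    """Map pixel (u, v) to floor-plane coordinates using a 3x3 matrix H.

    H is a list of three rows of three numbers. The result is divided by
    the homogeneous coordinate w, which is what makes this a projective
    (not just affine) transformation.
    """
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (x / w, y / w)
```

A usage pattern would be `apply_homography(H, *bbox_ground_point(box))` for each detected cone, giving a floor-plane position from which an offset distance to the camera could then be computed.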
Procedia PDF Downloads 302
1015 An Investigation into Computer Vision Methods to Identify Material Other Than Grapes in Harvested Wine Grape Loads
Authors: Riaan Kleyn
Abstract:
Mass wine production companies across the globe are provided with grapes from winegrowers that predominantly utilize mechanical harvesting machines to harvest wine grapes. Mechanical harvesting accelerates the rate at which grapes are harvested, allowing grapes to be delivered faster to meet the demands of wine cellars. The disadvantage of the mechanical harvesting method is the inclusion of material-other-than-grapes (MOG) in the harvested wine grape loads arriving at the cellar, which degrades the quality of wine that can be produced. Currently, wine cellars do not have a method to determine the amount of MOG present within wine grape loads. This paper seeks to find an optimal computer vision method capable of detecting the amount of MOG within a wine grape load. A MOG detection method will encourage winegrowers to deliver MOG-free wine grape loads to avoid penalties, which will indirectly enhance the quality of the wine to be produced. Traditional image segmentation methods were compared to deep learning segmentation methods based on images of wine grape loads that were captured at a wine cellar. The Mask R-CNN model with a ResNet-50 convolutional neural network backbone emerged as the optimal method for this study to determine the amount of MOG in an image of a wine grape load. Furthermore, a statistical analysis was conducted to determine how the MOG on the surface of a grape load relates to the mass of MOG within the corresponding grape load.
Keywords: computer vision, wine grapes, machine learning, machine harvested grapes
Procedia PDF Downloads 96
1014 Monocular Depth Estimation Benchmarking with Thermal Dataset
Authors: Ali Akyar, Osman Serdar Gedik
Abstract:
Depth estimation is a challenging computer vision task that involves estimating the distance between objects in a scene and the camera. It predicts how far each pixel in the 2D image is from the capturing point. There are some important Monocular Depth Estimation (MDE) studies that are based on Vision Transformers (ViT). We benchmark three major studies. The first work aims to build a simple and powerful foundation model that deals with any images under any condition. The second work proposes a method by mixing multiple datasets during training and a robust training objective. The third work combines generalization performance and state-of-the-art results on specific datasets. Although there are studies with thermal images too, we wanted to benchmark these three non-thermal, state-of-the-art studies with a hybrid image dataset taken with Multi-Spectral Dynamic Imaging (MSX) technology. MSX technology produces detailed thermal images by bringing together the thermal and visual spectrums. With this technology, our dataset images are not as blurry and poorly detailed as normal thermal images. On the other hand, they are not taken under ideal lighting conditions, as RGB images are. We compared the three methods under test on our thermal dataset, which had not been done before. Additionally, we propose an image enhancement deep learning model for thermal data. This model helps extract the features required for monocular depth estimation. The experimental results demonstrate that, after using our proposed model, the performance of the three methods under test increased significantly for thermal image depth prediction.
Keywords: monocular depth estimation, thermal dataset, benchmarking, vision transformers
Procedia PDF Downloads 32
1013 Advanced Concrete Crack Detection Using Light-Weight MobileNetV2 Neural Network
Authors: Li Hui, Riyadh Hindi
Abstract:
Concrete structures frequently suffer from crack formation, a critical issue that can significantly reduce their lifespan by allowing damaging agents to enter. Traditional methods of crack detection depend on manual visual inspections, which rely heavily on the experience and expertise of inspectors using tools. In this study, a more efficient, computer vision-based approach is introduced by using the lightweight MobileNetV2 neural network. A dataset of 40,000 images was used to develop a specialized crack evaluation algorithm. The analysis indicates that MobileNetV2 matches the accuracy of traditional CNN methods but is more efficient due to its smaller size, making it well-suited for mobile device applications. The effectiveness and reliability of this new method were validated through experimental testing, highlighting its potential as an automated solution for crack detection in concrete structures.
Keywords: Concrete crack, computer vision, deep learning, MobileNetV2 neural network
Procedia PDF Downloads 66
1012 Web Page Design Optimisation Based on Segment Analytics
Authors: Varsha V. Rohini, P. R. Shreya, B. Renukadevi
Abstract:
In the web analytics the information delivery and the web usage is optimized and the analysis of data is done. The analytics is the measurement, collection and analysis of webpage data. Page statistics and user metrics are the important factor in most of the web analytics tool. This is the limitation of the existing tools. It does not provide design inputs for the optimization of information. This paper aims at providing an extension for the scope of web analytics to provide analysis and statistics of each segment of a webpage. The number of click count is calculated and the concentration of links in a web page is obtained. Its user metrics are used to help in proper design of the displayed content in a webpage by Vision Based Page Segmentation (VIPS) algorithm. When the algorithm is applied on the web page it divides the entire web page into the visual block tree. The visual block tree generated will further divide the web page into visual blocks or segments which help us to understand the usage of each segment in a page and its content. The dynamic web pages and deep web pages are used to extend the scope of web page segment analytics. Space optimization concept is used with the help of the output obtained from the Vision Based Page Segmentation (VIPS) algorithm. This technique provides us the visibility of the user interaction with the WebPages and helps us to place the important links in the appropriate segments of the webpage and effectively manage space in a page and the concentration of links.Keywords: analytics, design optimization, visual block trees, vision based technology
Procedia PDF Downloads 266
1011 Statistical Analysis of Natural Images after Applying ICA and ISA
Authors: Peyman Sheikholharam Mashhadi
Abstract:
Difficulties in analyzing real-world images within the classical image processing and machine vision framework have motivated researchers to consider biology-based vision. It is commonly believed that the mammalian visual cortex has adapted to the statistics of real-world images through evolution. There are two well-known, successful models of mammalian visual cortical cells: Independent Component Analysis (ICA) and Independent Subspace Analysis (ISA). In this paper, we statistically analyze the dependencies that remain in the components after applying these models to natural images. We also investigate the response of the feature detectors to gratings with various parameters in order to find the detectors' optimal parameters. Finally, the selectivity of the feature detectors to phase is considered for both models.
Keywords: statistics, independent component analysis, independent subspace analysis, phase, natural images
Procedia PDF Downloads 339
1010 Prevalence of Hemorrhagic Septicemia in Dromedary Camel (Camelus Dromedarius) for Some Selected Farms in Benadir Region, Somalia
Authors: Abdirahman Barre, Abdihamid Salad Hassan, Iftin Abdi Mohamud, Abdirahman Mohamed Mohamud, Ahmed Adan Mohamed, Mukhtaar Mohamed Idow
Abstract:
Pasteurellosis (hemorrhagic septicemia) is a common, acutely fatal respiratory disease of camels caused by Pasteurella multocida type A or several serotypes of Mannheimia haemolytica, which also affect other animals. The disease has been shown to spread between animals, across herds, and to humans, meaning that it is a zoonosis. The study aimed to establish the sero-prevalence of pasteurellosis in selected camel-rearing districts of the Benadir Region. It was a cross-sectional study in which the study population was purposively chosen to consist of animals taken from three sub-districts of the Benadir Region, namely Daynile Township, Yaaqshid, and Kaxda. These were chosen because they normally handle many camels in a day, making it easy for the investigator to access the required number conveniently; it was also assumed that data collected from these for-slaughter camels was representative of the situation in the sub-district/county. A total of one hundred and sixty camels were tested using four serological tests, including the Rose Bengal Plate Test (RBPT) and the Complement Fixation Test (CFT). The serological tests were purposively chosen to increase the chances of picking positive cases and also to compare their sensitivities with respect to camel serum, since they were originally meant for use on bovine serum. Blood samples (15 ml) were collected for serum harvesting from the jugular veins of the animals as they were waiting to be examined. The RBPT and CFT were run at a laboratory within the Department of Veterinary Medicine, University of Horsed, 21 October campus, the serum samples having been transported in a cool box. On average, out of an overall total of 300 serum samples tested, 180 samples were selected as sample procedures and gave eleven (11) positive results, amounting to a prevalence of 6.67%.
For the three districts, the respective prevalences (averaged from the two (2) serological tests run) were: 7% (3/50) for Yaqshiid; 8% (3/60) for Deyniile; and 10% (3/70) for Kaxda. When the sensitivities of the two (2) serological tests were compared, there was no significant difference between them with respect to picking positive cases (p=0.05). The study has demonstrated the presence of pasteurellosis in camels in the Benadir Region, and the authors recommend the use of RBPT and CFT as screening tests, since they are cheap, quick, and easy to carry out. Any of the other three, more involved tests can then be used if one wants to establish the respective titers. Further detailed investigation is therefore needed to identify the specific etiological agents causing pasteurellosis in camels, so that measures can be instituted to optimize the benefit obtained from the camel sector.
Keywords: hemorrhagic septicemia, camel, prevalence, Benadir region, Somalia
Procedia PDF Downloads 72
1009 Enhancer: An Effective Transformer Architecture for Single Image Super Resolution
Authors: Pitigalage Chamath Chandira Peiris
Abstract:
Single image super-resolution, which tries to restore a high-resolution image from a single low-resolution image, has been a widely researched domain in image processing in recent times. Many single image super-resolution efforts have been completed using both traditional and deep learning methodologies, as well as a variety of other methodologies. Deep learning-based super-resolution methods, in particular, have received significant interest. As of now, the most advanced image restoration approaches are based on convolutional neural networks; only a few efforts have used Transformers, which have demonstrated excellent performance on high-level vision tasks. The effectiveness of CNN-based algorithms in image super-resolution has been impressive. However, these methods cannot completely capture the non-local features of the data. Enhancer is a simple yet powerful Transformer-based approach for enhancing the resolution of images. In this study, a method for single image super-resolution was developed using an efficient and effective transformer design. The proposed architecture makes use of a locally enhanced window transformer block to alleviate the enormous computational load associated with non-overlapping window-based self-attention. Additionally, it incorporates depth-wise convolution in the feed-forward network to enhance its ability to capture local context. The method is assessed by comparing its results on popular datasets with those obtained by other techniques in the domain.
Keywords: single image super resolution, computer vision, vision transformers, image restoration
Procedia PDF Downloads 105
1008 Hand Detection and Recognition for Malay Sign Language
Authors: Mohd Noah A. Rahman, Afzaal H. Seyal, Norhafilah Bara
Abstract:
Interest keeps growing in software applications that interface with computers and peripheral devices through gestures of the human body, such as hand movements. Hand gesture detection and recognition based on computer vision techniques remains a very challenging task. The aim is to provide a more natural, innovative, and sophisticated way of non-verbal communication, such as sign language, in human-computer interaction. This paper explores hand detection and hand gesture recognition using a vision-based approach. Skin color spaces such as HSV and YCrCb are applied for hand detection and recognition. However, there are limitations that need to be considered. Almost all skin color space models are sensitive to quickly changing or mixed lighting conditions. Certain restrictions also apply for hand recognition to give better results, such as the distance of the user’s hand from the webcam and the posture and size of the hand.
Keywords: hand detection, hand gesture, hand recognition, sign language
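Skin detection in the YCrCb color space can be sketched as a per-pixel threshold on the two chroma channels. The example below is an illustration under stated assumptions, not the paper's method: the conversion uses the BT.601 luma weights, and the Cr and Cb ranges are commonly quoted rule-of-thumb values that the abstract does not specify. The function names are hypothetical.

```python
def rgb_to_ycrcb(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cr, Cb) using BT.601 luma weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = (r - y) * 0.713 + 128.0
    cb = (b - y) * 0.564 + 128.0
    return y, cr, cb


def is_skin(r, g, b, cr_range=(133, 173), cb_range=(77, 127)):
    """Return True if the pixel's chroma falls inside the assumed skin ranges.

    The default ranges are an assumption (a widely used heuristic), and in
    practice they need tuning for lighting, which the abstract notes is a
    weakness of skin color models.
    """
    _, cr, cb = rgb_to_ycrcb(r, g, b)
    return cr_range[0] <= cr <= cr_range[1] and cb_range[0] <= cb <= cb_range[1]


def skin_mask(image):
    """Build a boolean mask for an image given as rows of (r, g, b) tuples."""
    return [[is_skin(*pixel) for pixel in row] for row in image]


# A tiny hypothetical image: one skin-toned pixel, one blue, one white.
image = [[(224, 172, 105), (0, 0, 255), (255, 255, 255)]]
print(skin_mask(image))
```

Because the luma channel Y is ignored, the threshold tolerates moderate brightness changes, but mixed or fast-changing lighting shifts the chroma values and can still break the fixed ranges.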
Procedia PDF Downloads 307
1007 Enhancing Fall Detection Accuracy with a Transfer Learning-Aided Transformer Model Using Computer Vision
Authors: Sheldon McCall, Miao Yu, Liyun Gong, Shigang Yue, Stefanos Kollias
Abstract:
Falls are a significant health concern for older adults globally, and prompt identification is critical to providing necessary healthcare support. Our study proposes a new fall detection method using computer vision based on modern deep learning techniques. Our approach involves training a transformer model on a large 2D pose dataset for general action recognition, followed by transfer learning. Specifically, we freeze the first few layers of the trained transformer model and train only the last two layers for fall detection. Our experimental results demonstrate that our proposed method outperforms both classical machine learning and deep learning approaches in fall/non-fall classification. Overall, our study suggests that our proposed methodology could be a valuable tool for identifying falls.
Keywords: healthcare, fall detection, transformer, transfer learning
Procedia PDF Downloads 148
1006 Neural Style Transfer Using Deep Learning
Authors: Shaik Jilani Basha, Inavolu Avinash, Alla Venu Sai Reddy, Bitragunta Taraka Ramu
Abstract:
The neural style transfer technique can be used to build a picture with the same "content" as the starting image but the "style" of a picture we have chosen. Neural style transfer merges the style of one image into another while retaining the latter's original information. The only change is how the image is rendered, to give it an additional artistic sense. The content image supplies the plan or drawing, while the colors of the chosen drawing or painting are used to portray the style. It is a computer vision application that learns and processes images through deep convolutional neural networks. To implement the software, we trained deep learning models on the training data; whenever a user supplies an image and a style image, the style is transferred to the original image and the result is shown as the output.
Keywords: neural networks, computer vision, deep learning, convolutional neural networks
Procedia PDF Downloads 95
1005 The Dual Catastrophe of Behçet’s Disease Visual Loss Followed by Acute Spinal Shock After Lumbar Drain Removal
Authors: Naim Izet Kajtazi
Abstract:
Context: Increased intracranial pressure and associated symptoms such as headache, papilledema, motor or sensory deficits, seizures, and disturbed consciousness are well known in acute CVT. However, visual loss is not commonly associated with this disease, except in the case of secondary IIH. Process: We report the case of a 40-year-old male with Behçet’s disease, cerebral venous thrombosis, and multiple other comorbidities, admitted with a four-day history of increasing headache and rapidly progressive bilateral visual loss. The neurological examination was positive for bilateral grade 3 papilledema, with light perception in the left eye and counting fingers in the right eye. Brain imaging showed old findings of cerebral venous thrombosis without any intraparenchymal lesions to suggest a flare-up of Behçet’s disease. Lumbar puncture, followed by lumbar drain insertion, gave no benefit in headache or vision; instead, he completely lost his sight. Right optic nerve sheath fenestration did not result in vision improvement. Lumbar drain removal was complicated by acute spinal shock due to an epidural hematoma. An urgent lumbar laminectomy with hematoma evacuation was undertaken. Intra-operatively, the neurosurgeon noted suspicious abnormal vessels at the conus medullaris, with the possibility of an arteriovenous malformation. Outcome: Within a few days of the spinal surgery, the patient’s vision started to improve. Further improvement was achieved after plasma exchange sessions followed by cyclophosphamide. At a recent clinic follow-up, he reported better vision, was driving, and had completed his Ph.D. studies. Relevance: Visual loss in patients with Behçet’s disease should always be anticipated and managed with reasonable care, ensuring that patients receive well-combined immunosuppression with anticoagulation and agents to reduce intracranial pressure.
This patient’s story is significant for a high disease burden and a hospital course complicated by acute spinal shock following lumbar drain removal, with a possible underlying spinal arteriovenous malformation.
Keywords: Behcet disease, optic neuritis, IIH, CVT
Procedia PDF Downloads 73
1004 Improving the Performance of Deep Learning in Facial Emotion Recognition with Image Sharpening
Authors: Ksheeraj Sai Vepuri, Nada Attar
Abstract:
We as humans use words with accompanying visual and facial cues to communicate effectively. Classifying facial emotion using computer vision methodologies has been an active research area in the computer vision field. In this paper, we propose a simple method for facial expression recognition that enhances accuracy. We tested our method on the FER-2013 dataset, which contains static images. Instead of using histogram equalization to preprocess the dataset, we used an Unsharp Mask to emphasize texture and details and to sharpen the edges. We also used ImageDataGenerator from the Keras library for data augmentation. Then we used a Convolutional Neural Network (CNN) model to classify the images into 7 different facial expressions, yielding an accuracy of 69.46% on the test set. Our results show that image preprocessing such as the sharpening technique can improve the performance of a CNN model, even when the model is relatively simple.
Keywords: facial expression recognition, image preprocessing, deep learning, CNN
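The unsharp mask idea can be sketched in a few lines: blur the image, take the difference between the original and the blur as the detail layer, and add a scaled copy of that detail back. The example below is a minimal pure-Python illustration on a grayscale grid; it assumes a 3x3 box blur with edge replication and an `amount` of 1.0, whereas the paper does not state its blur kernel or parameters, and practical implementations usually use a Gaussian blur.

```python
def box_blur(img):
    """Blur a grayscale image (list of rows) with a 3x3 mean filter.

    Border pixels reuse the nearest valid pixel (edge replication).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out


def unsharp_mask(img, amount=1.0):
    """Sharpen by adding amount * (original - blurred), clipped to 0..255."""
    blurred = box_blur(img)
    return [
        [min(255.0, max(0.0, p + amount * (p - b))) for p, b in zip(row, blur_row)]
        for row, blur_row in zip(img, blurred)
    ]


# A hypothetical vertical step edge: dark on the left, bright on the right.
edge = [[50, 50, 200, 200] for _ in range(3)]
print(unsharp_mask(edge)[1])
```

Flat regions are left unchanged because the original and the blur agree there, while pixels on either side of an edge are pushed further apart, which is what emphasizes texture and detail before the images reach the classifier.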
Procedia PDF Downloads 143
1003 Prevalence of Near Visual Impairment and Associated Factors among School Teachers in Gondar City, North West Ethiopia, 2022
Authors: Bersufekad Wubie
Abstract:
Introduction: Near visual impairment is a presenting near visual acuity worse than N6 at a distance of 40 cm. Teachers' regular duties, such as reading books, writing on the blackboard, and recognizing students' faces, need good near vision. If a teacher has near visual impairment, the work output is unsatisfactory. Objective: The study aimed to assess the prevalence of near visual impairment and its associated factors among school teachers in Gondar city, Northwest Ethiopia, August 2022. Methods: An institutional-based cross-sectional study design with a multistage sampling technique was used to select 567 teachers in Gondar city schools. The study was conducted in selected schools from May 1 to May 30, 2022. Trained data collectors used well-structured Amharic and English language questionnaires and ophthalmic instruments for examination. The collected data were checked for completeness and entered into Epi data version 4.6, then exported to SPSS version 26 for further analysis. Binary and multivariate logistic regression models were fitted to identify factors associated with the outcome variable. Result: The prevalence of near visual impairment was 64.6%, with a confidence interval of 60.3%–68.4%. Near visual impairment was significantly associated with age >= 35 years (AOR: 4.90 at 95% CI: 3.15, 7.65), prolonged years of teaching experience (AOR: 3.29 at 95% CI: 1.70, 4.62), a history of ocular surgery (AOR: 1.96 at 95% CI: 1.10, 4.62), smoking (AOR: 2.21 at 95% CI: 1.22, 4.07), a history of ocular trauma (AOR: 1.80 at 95% CI: 1.11, 3.18), and uncorrected refractive error (AOR: 2.01 at 95% CI: 1.13, 4.03). Conclusion and recommendations: This study showed that the prevalence of near visual impairment among school teachers was high, and that it is not a problem of the presbyopic age group alone; it also occurs at a young age. Teachers' ocular health should therefore be well accommodated in school eye health programmes.
Keywords: Gondar, near visual impairment, school, teachers
Procedia PDF Downloads 138
1002 Improving Sales through Inventory Reduction: A Retail Chain Case Study
Authors: M. G. Mattos, J. E. Pécora Jr, T. A. Briso
Abstract:
Today's challenging business environment, with unpredictable demand and volatility, requires a supply chain strategy that handles uncertainty and risk in the right way. Although inventory models have been explored previously, this paper seeks to apply these concepts to a practical situation. This study addresses the inventory replenishment problem, applying techniques that are mainly based on mathematical assumptions and modeling. The primary goal is to improve the retailer’s supply chain processes by taking store differences into account when setting the various target stock levels. Through an inventory review policy, picking piece implementation, and minimum exposure definition, we were able not only to reduce inventory but also to improve sales results. The inventory management theory from the literature review was then tested in a single case study of a particular department in one of the largest Latam retail chains.
Keywords: inventory, distribution, retail, risk, safety stock, sales, uncertainty
Procedia PDF Downloads 268