Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 3047

Search results for: computer vision

2957 An Exponential Field Path Planning Method for Mobile Robots Integrated with Visual Perception

Authors: Magdy Roman, Mostafa Shoeib, Mostafa Rostom

Abstract:

Global vision, whether provided by overhead fixed cameras, on-board aerial vehicle cameras, or satellite images can always provide detailed information on the environment around mobile robots. In this paper, an intelligent vision-based method of path planning and obstacle avoidance for mobile robots is presented. The method integrates visual perception with a new proposed field-based path-planning method to overcome common path-planning problems such as local minima, unreachable destination and unnecessary lengthy paths around obstacles. The method proposes an exponential angle deviation field around each obstacle that affects the orientation of a close robot. As the robot directs toward, the goal point obstacles are classified into right and left groups, and a deviation angle is exponentially added or subtracted to the orientation of the robot. Exponential field parameters are chosen based on Lyapunov stability criterion to guarantee robot convergence to the destination. The proposed method uses obstacles' shape and location, extracted from global vision system, through a collision prediction mechanism to decide whether to activate or deactivate obstacles field. In addition, a search mechanism is developed in case of robot or goal point is trapped among obstacles to find suitable exit or entrance. The proposed algorithm is validated both in simulation and through experiments. The algorithm shows effectiveness in obstacles' avoidance and destination convergence, overcoming common path planning problems found in classical methods.

Keywords: path planning, collision avoidance, convergence, computer vision, mobile robots

Procedia PDF Downloads 152

2956 Automatic Identification and Monitoring of Wildlife via Computer Vision and IoT

Authors: Bilal Arshad, Johan Barthelemy, Elliott Pilton, Pascal Perez

Abstract:

Getting reliable, informative, and up-to-date information about the location, mobility, and behavioural patterns of animals will enhance our ability to research and preserve biodiversity. The fusion of infra-red sensors and camera traps offers an inexpensive way to collect wildlife data in the form of images. However, extracting useful data from these images, such as the identification and counting of animals remains a manual, time-consuming, and costly process. In this paper, we demonstrate that such information can be automatically retrieved by using state-of-the-art deep learning methods. Another major challenge that ecologists are facing is the recounting of one single animal multiple times due to that animal reappearing in other images taken by the same or other camera traps. Nonetheless, such information can be extremely useful for tracking wildlife and understanding its behaviour. To tackle the multiple count problem, we have designed a meshed network of camera traps, so they can share the captured images along with timestamps, cumulative counts, and dimensions of the animal. The proposed method takes leverage of edge computing to support real-time tracking and monitoring of wildlife. This method has been validated in the field and can be easily extended to other applications focusing on wildlife monitoring and management, where the traditional way of monitoring is expensive and time-consuming.

Keywords: computer vision, ecology, internet of things, invasive species management, wildlife management

Procedia PDF Downloads 106

2955 How Envisioning Process Is Constructed: An Exploratory Research Comparing Three International Public Televisions

Authors: Alexandre Bedard, Johane Brunet, Wendellyn Reid

Abstract:

Public Television is constantly trying to maintain and develop its audience. And to achieve those goals, it needs a strong and clear vision. Vision or envision is a multidimensional process; it is simultaneously a conduit that orients and fixes the future, an idea that comes before the strategy and a mean by which action is accomplished, from a business perspective. Also, vision is often studied from a prescriptive and instrumental manner. Based on our understanding of the literature, we were able to explain how envisioning, as a process, is a creative one; it takes place in the mind and uses wisdom and intelligence through a process of evaluation, analysis and creation. Through an aggregation of the literature, we build a model of the envisioning process, based on past experiences, perceptions and knowledge and influenced by the context, being the individual, the organization and the environment. With exploratory research in which vision was deciphered through the discourse, through a qualitative and abductive approach and a grounded theory perspective, we explored three extreme cases, with eighteen interviews with experts, leaders, politicians, actors of the industry, etc. and more than twenty hours of interviews in three different countries. We compared the strategy, the business model, and the political and legal forces. We also looked at the history of each industry from an inertial point of view. Our analysis of the data revealed that a legitimacy effect due to the audience, the innovation and the creativity of the institutions was at the cornerstone of what would influence the envisioning process. This allowed us to identify how different the process was for Canadian, French and UK public broadcasters, although we concluded that the three of them had a socially constructed vision for their future, based on stakeholder management and an emerging role for the managers: ideas brokers.

Keywords: envisioning process, international comparison, television, vision

Procedia PDF Downloads 97

2954 Variation of Refractive Errors among Right and Left Eyes in Jos, Plateau State, Nigeria

Authors: F. B. Masok, S. S Songdeg, R. R. Dawam

Abstract:

Vision is an important process for learning and communication as man depends greatly on vision to sense his environment. Prevalence and variation of refractive errors conducted between December 2010 and May 2011 in Jos, revealed that 735 (77.50%) out 950 subjects examined for refractive error had various refractive errors. Myopia was observed in 373 (49.79%) of the subjects, the error in the right eyes was 263 (55.60%) while the error in the left was 210(44.39%). The mean myopic error was found to be -1.54± 3.32. Hyperopia was observed in 385 (40.53%) of the sampled population comprising 203(52.73%) of the right eyes and 182(47.27%). The mean hyperopic error was found to be +1.74± 3.13. Astigmatism accounted for 359 (38.84%) of the subjects, out of which 193(53.76%) were in the right eyes while 168(46.79%) were in the left eyes. Presbyopia was found in 404(42.53%) of the subjects, of this figure, 164(40.59%) were in the right eyes while 240(59.41%) were in left eyes. The number of right eyes and left eyes with refractive errors was observed in some age groups to increase with age and later had its peak within 60 – 69 age groups. This pattern of refractive errors could be attributed to exposure to various forms of light particularly the ultraviolet rays (e.g rays from television and computer screen). There was no remarkable differences between the mean Myopic error and mean Hyperopic error in the right eyes and in the left eyes which suggest the right eye and the left eye are similar.

Keywords: left eye, refractive errors, right eye, variation

Procedia PDF Downloads 400

2953 Deprivation of Visual Information Affects Differently the Gait Cycle in Children with Different Level of Motor Competence

Authors: Miriam Palomo-Nieto, Adrian Agricola, Rudolf Psotta, Reza Abdollahipour, Ludvik Valtr

Abstract:

The importance of vision and the visual control of movement have been labeled in the literature related to motor control and many studies have demonstrated that children with low motor competence may rely more heavily on vision to perform movements than their typically developing peers. The aim of the study was to highlight the effects of different visual conditions on motor performance during walking in children with different levels of motor coordination. Participants (n = 32, mean age = 8.5 years sd. ± 0.5) were divided into two groups: typical development (TD) and low motor coordination (LMC) based on the scores of the Movement Assessment Battery for Children (MABC-2). They were asked to walk along a 10 meters walkway where the Optojump-Next instrument was installed in a portable laboratory (15 x 3 m), which allows that all participants had the same visual information. They walked in self-selected speed under four visual conditions: full vision (FV), limited vision 100 ms (LV-100), limited vision 150 ms (LV-150) and non-vision (NV). For visual occlusion participants were equipped with Plato Goggles that shut for 100 and 150 ms, respectively, within each 2 sec. Data were analyzed in a two-way mixed-effect ANOVA including 2 (TD vs. LMC) x 4 (FV, LV-100, LV-150 & NV) with repeated-measures on the last factor (p ≤.05). Results indicated that TD children walked faster and with longer normalized steps length and strides than LMC children. For TD children the percentage of the single support and swing time were higher than for low motor competence children. However, the percentage of load response and pre swing was higher in the low motor competence children rather than the TD children. These findings indicated that through walking we could be able to identify different levels of motor coordination in children. Likewise, LMC children showed shorter percentages in those parameters regarding only one leg support, supporting the idea of balance problems.

Keywords: visual information, motor performance, walking pattern, optojump

Procedia PDF Downloads 542

2952 Amorphous Silicon-Based PINIP Structure for Human-Like Photosensor

Authors: Sheng-Chuan Hsu

Abstract:

Because the existing structure of ambient light sensor is most silicon photodiode device, it is extremely sensitive in the red and infrared regions. Even though the IR-Cut filter had added, it still cannot completely eliminate the influence of infrared light, and the spectral response of infrared light was stronger than that of the human eyes. Therefore, it is not able to present the vision spectrum of the human eye reacts with the ambient light. Then it needs to consider that the human eye feels the spectra that show significant differences between light and dark place. Consequently, in practical applications, we must create and develop advanced device of human-like photosensor which can solve these problems of ambient light sensor and let cognitive lighting system to provide suitable light to achieve the goals of vision spectrum of human eye and save energy.

Keywords: ambient light sensor, vision spectrum, cognitive lighting system, human eye

Procedia PDF Downloads 305

2951 Causes of Blindness and Low Vision among Visually Impaired Population Supported by Welfare Organization in Ardabil Province in Iran

Authors: Mohammad Maeiyat, Ali Maeiyat Ivatlou, Rasul Fani Khiavi, Abouzar Maeiyat Ivatlou, Parya Maeiyat

Abstract:

Purpose: Considering the fact that visual impairment is still one of the countries health problem, this study was conducted to determine the causes of blindness and low vision in visually impaired membership of Ardabil Province welfare organization. Methods: The present study which was based on descriptive and national-census, that carried out in visually impaired population supported by welfare organization in all urban and rural areas of Ardabil Province in 2013 and Collection of samples lasted for 7 months. The subjects were inspected by optometrist to determine their visual status (blindness or low vision) and then referred to ophthalmologist in order to discover the main causes of visual impairment based on the international classification of diseases version 10. Statistical analysis of collected data was performed using SPSS software version 18. Results: Overall, 403 subjects with mean age of years participated in this study. 73.2% were blind, 26.8 % were low vision and according gender grouping 60.50 % of them were male, 39.50 % were female that divided into three groups with the age level of lower than 15 (11.2%) 15 to 49 (76.7%), and 50 and higher (12.1%). The age range was 1 to 78 years. The causes of blindness and low vision were in descending order: optic atrophy (18.4%), retinitis pigmentosa (16.8%), corneal diseases (12.4%), chorioretinal diseases (9.4%), cataract (8.9%), glaucoma (8.2%), phthisis bulbi (7.2%), degenerative myopia (6.9%), microphtalmos ( 4%), amblyopia (3.2%), albinism (2.5%) and nistagmus (2%). Conclusion: in this study the main causes of visual impairments were optic atrophy and retinitis pigmentosa, thus specific prevention plans can be effective in reducing the incidence of visual disabilities.

Keywords: blindness, low vision, welfare, ardabil

Procedia PDF Downloads 403

2950 Multimodal Deep Learning for Human Activity Recognition

Authors: Ons Slimene, Aroua Taamallah, Maha Khemaja

Abstract:

In recent years, human activity recognition (HAR) has been a key area of research due to its diverse applications. It has garnered increasing attention in the field of computer vision. HAR plays an important role in people’s daily lives as it has the ability to learn advanced knowledge about human activities from data. In HAR, activities are usually represented by exploiting different types of sensors, such as embedded sensors or visual sensors. However, these sensors have limitations, such as local obstacles, image-related obstacles, sensor unreliability, and consumer concerns. Recently, several deep learning-based approaches have been proposed for HAR and these approaches are classified into two categories based on the type of data used: vision-based approaches and sensor-based approaches. This research paper highlights the importance of multimodal data fusion from skeleton data obtained from videos and data generated by embedded sensors using deep neural networks for achieving HAR. We propose a deep multimodal fusion network based on a twostream architecture. These two streams use the Convolutional Neural Network combined with the Bidirectional LSTM (CNN BILSTM) to process skeleton data and data generated by embedded sensors and the fusion at the feature level is considered. The proposed model was evaluated on a public OPPORTUNITY++ dataset and produced a accuracy of 96.77%.

Keywords: human activity recognition, action recognition, sensors, vision, human-centric sensing, deep learning, context-awareness

Procedia PDF Downloads 62

2949 Floodnet: Classification for Post Flood Scene with a High-Resolution Aerial Imaginary Dataset

Authors: Molakala Mourya Vardhan Reddy, Kandimala Revanth, Koduru Sumanth, Beena B. M.

Abstract:

Emergency response and recovery operations are severely hampered by natural catastrophes, especially floods. Understanding post-flood scenarios is essential to disaster management because it facilitates quick evaluation and decision-making. To this end, we introduce FloodNet, a brand-new high-resolution aerial picture collection created especially for comprehending post-flood scenes. A varied collection of excellent aerial photos taken during and after flood occurrences make up FloodNet, which offers comprehensive representations of flooded landscapes, damaged infrastructure, and changed topographies. The dataset provides a thorough resource for training and assessing computer vision models designed to handle the complexity of post-flood scenarios, including a variety of environmental conditions and geographic regions. Pixel-level semantic segmentation masks are used to label the pictures in FloodNet, allowing for a more detailed examination of flood-related characteristics, including debris, water bodies, and damaged structures. Furthermore, temporal and positional metadata improve the dataset's usefulness for longitudinal research and spatiotemporal analysis. For activities like flood extent mapping, damage assessment, and infrastructure recovery projection, we provide baseline standards and evaluation metrics to promote research and development in the field of post-flood scene comprehension. By integrating FloodNet into machine learning pipelines, it will be easier to create reliable algorithms that will help politicians, urban planners, and first responders make choices both before and after floods. The goal of the FloodNet dataset is to support advances in computer vision, remote sensing, and disaster response technologies by providing a useful resource for researchers. FloodNet helps to create creative solutions for boosting communities' resilience in the face of natural catastrophes by tackling the particular problems presented by post-flood situations.

Keywords: image classification, segmentation, computer vision, nature disaster, unmanned arial vehicle(UAV), machine learning.

Procedia PDF Downloads 27

2948 Resisting Adversarial Assaults: A Model-Agnostic Autoencoder Solution

Authors: Massimo Miccoli, Luca Marangoni, Alberto Aniello Scaringi, Alessandro Marceddu, Alessandro Amicone

Abstract:

The susceptibility of deep neural networks (DNNs) to adversarial manipulations is a recognized challenge within the computer vision domain. Adversarial examples, crafted by adding subtle yet malicious alterations to benign images, exploit this vulnerability. Various defense strategies have been proposed to safeguard DNNs against such attacks, stemming from diverse research hypotheses. Building upon prior work, our approach involves the utilization of autoencoder models. Autoencoders, a type of neural network, are trained to learn representations of training data and reconstruct inputs from these representations, typically minimizing reconstruction errors like mean squared error (MSE). Our autoencoder was trained on a dataset of benign examples; learning features specific to them. Consequently, when presented with significantly perturbed adversarial examples, the autoencoder exhibited high reconstruction errors. The architecture of the autoencoder was tailored to the dimensions of the images under evaluation. We considered various image sizes, constructing models differently for 256x256 and 512x512 images. Moreover, the choice of the computer vision model is crucial, as most adversarial attacks are designed with specific AI structures in mind. To mitigate this, we proposed a method to replace image-specific dimensions with a structure independent of both dimensions and neural network models, thereby enhancing robustness. Our multi-modal autoencoder reconstructs the spectral representation of images across the red-green-blue (RGB) color channels. To validate our approach, we conducted experiments using diverse datasets and subjected them to adversarial attacks using models such as ResNet50 and ViT_L_16 from the torch vision library. The autoencoder extracted features used in a classification model, resulting in an MSE (RGB) of 0.014, a classification accuracy of 97.33%, and a precision of 99%.

Keywords: adversarial attacks, malicious images detector, binary classifier, multimodal transformer autoencoder

Procedia PDF Downloads 33

2947 Machine Learning and Deep Learning Approach for People Recognition and Tracking in Crowd for Safety Monitoring

Authors: A. Degale Desta, Cheng Jian

Abstract:

Deep learning application in computer vision is rapidly advancing, giving it the ability to monitor the public and quickly identify potentially anomalous behaviour from crowd scenes. Therefore, the purpose of the current work is to improve the performance of safety of people in crowd events from panic behaviour through introducing the innovative idea of Aggregation of Ensembles (AOE), which makes use of the pre-trained ConvNets and a pool of classifiers to find anomalies in video data with packed scenes. According to the theory of algorithms that applied K-means, KNN, CNN, SVD, and Faster-CNN, YOLOv5 architectures learn different levels of semantic representation from crowd videos; the proposed approach leverages an ensemble of various fine-tuned convolutional neural networks (CNN), allowing for the extraction of enriched feature sets. In addition to the above algorithms, a long short-term memory neural network to forecast future feature values and a handmade feature that takes into consideration the peculiarities of the crowd to understand human behavior. On well-known datasets of panic situations, experiments are run to assess the effectiveness and precision of the suggested method. Results reveal that, compared to state-of-the-art methodologies, the system produces better and more promising results in terms of accuracy and processing speed.

Keywords: action recognition, computer vision, crowd detecting and tracking, deep learning

Procedia PDF Downloads 119

2946 Exploring Dynamics of Regional Creative Economy

Authors: Ari Lindeman, Melina Maunula, Jani Kiviranta, Ronja Pölkki

Abstract:

The aim of this paper is to build a vision of the utilization of creative industry competences in industrial and services firms connected to Kymenlaakso region, Finland, smart specialization focus areas. Research indicates that creativity and the use of creative industry’s inputs can enhance innovation and competitiveness. Currently creative methods and services are underutilized in regional businesses and the added value they provide is not well grasped. Methodologically, the research adopts a qualitative exploratory approach. Data is collected in multiple ways including a survey, focus groups, and interviews. Theoretically, the paper contributes to the discussion about the use creative industry competences in regional development, and argues for building regional creative economy ecosystems in close co-operation with regional strategies and traditional industries rather than as treating regional creative industry ecosystem initiatives separate from them. The practical contribution of the paper is the creative vision for the use of regional authorities in updating smart specialization strategy as well as boosting industrial and creative & cultural sectors’ competitiveness. The paper also illustrates a research-based model of vision building.

Keywords: business, cooperation, creative economy, regional development, vision

Procedia PDF Downloads 92

2945 Alternative Approach to the Machine Vision System Operating for Solving Industrial Control Issue

Authors: M. S. Nikitenko, S. A. Kizilov, D. Y. Khudonogov

Abstract:

The paper considers an approach to a machine vision operating system combined with using a grid of light markers. This approach is used to solve several scientific and technical problems, such as measuring the capability of an apron feeder delivering coal from a lining return port to a conveyor in the technology of mining high coal releasing to a conveyor and prototyping an autonomous vehicle obstacle detection system. Primary verification of a method of calculating bulk material volume using three-dimensional modeling and validation in laboratory conditions with relative errors calculation were carried out. A method of calculating the capability of an apron feeder based on a machine vision system and a simplifying technology of a three-dimensional modelled examined measuring area with machine vision was offered. The proposed method allows measuring the volume of rock mass moved by an apron feeder using machine vision. This approach solves the volume control issue of coal produced by a feeder while working off high coal by lava complexes with release to a conveyor with accuracy applied for practical application. The developed mathematical apparatus for measuring feeder productivity in kg/s uses only basic mathematical functions such as addition, subtraction, multiplication, and division. Thus, this fact simplifies software development, and this fact expands the variety of microcontrollers and microcomputers suitable for performing tasks of calculating feeder capability. A feature of an obstacle detection issue is to correct distortions of the laser grid, which simplifies their detection. The paper presents algorithms for video camera image processing and autonomous vehicle model control based on obstacle detection machine vision systems. A sample fragment of obstacle detection at the moment of distortion with the laser grid is demonstrated.

Keywords: machine vision, machine vision operating system, light markers, measuring capability, obstacle detection system, autonomous transport

Procedia PDF Downloads 77

2944 Drastic Improvement in Vision Following Surgical Excision of Juvenile Nasopharyngeal Angiofibroma with Compressive Optic Neuropathy

Authors: Sweta Das

Abstract:

This case report is a 15-year-old male who presented with painless unilateral vision loss from left optic nerve compression due to juvenile nasopharyngeal angiofibroma. JNA is a rare, benign neoplasm that causes intracranial and intraorbital bone destruction and extends aggressively into surrounding soft tissues. It accounts for <1% of all head and neck tumors, is predominantly found in pediatric males and tends to affect indigenous population disproportionately. The most common presenting symptom for JNA is epistaxis and nasal obstruction. However, it can invade orbit, chiasm and pituitary gland, causing loss of vision and field. Visual acuity and function near normalized following surgical excision. Optometry plays an important role in the diagnosis and co-management of JNA with optic nerve compression by closely monitoring afferent optic nerve function and structure, and extraocular motility. Visual function and acuity in patients with short-term compressive neuropathy may drastically improve following surgical resection as this case demonstrates.

Keywords: orbital mass, painless monocular vision loss, compressive optic neuropathy, pediatric tumor

Procedia PDF Downloads 28

2943 Classical Myths in Modern Drama: A Study of the Vision of Jean Anouilh in Antigone

Authors: Azza Taha Zaki

Abstract:

Modern drama was characterised by realism and naturalism as dominant literary movements that focused on contemporary people and their issues to reflect the status of modern man and his environment. However, some modern dramatists have often fallen on classical mythology in ancient Greek tragedies to create a sense of the universality of the human experience. The tragic overtones of classical myths have helped modern dramatists in their attempts to create an enduring piece by evoking the majestic grandeur of the ancient myths and the heroic struggle of man against forces he cannot fight. Myths have continued to appeal to modern playwrights not only for the plot and narrative material but also for the vision and insight into the human experience and human condition. This paper intends to study how the reworking of Sophocles’ Antigone by Jean Anouilh in his Antigone, written in 1942 at the height of the Second World War and during the German occupation of his country, France, fits his own purpose and his own time. The paper will also offer an analysis of the vision in both plays to show how Anouilh has used the classical Antigone freely to produce a modern vision of the dilemma of man when faced by personal and national conflicts.

Keywords: Anouilh, Antigone, drama, Greek tragedy, modern, myth, sophocles

Procedia PDF Downloads 152

2942 Glaucoma Detection in Retinal Tomography Using the Vision Transformer

Authors: Sushish Baral, Pratibha Joshi, Yaman Maharjan

Abstract:

Glaucoma is a chronic eye condition that causes vision loss that is irreversible. Early detection and treatment are critical to prevent vision loss because it can be asymptomatic. For the identification of glaucoma, multiple deep learning algorithms are used. Transformer-based architectures, which use the self-attention mechanism to encode long-range dependencies and acquire extremely expressive representations, have recently become popular. Convolutional architectures, on the other hand, lack knowledge of long-range dependencies in the image due to their intrinsic inductive biases. The aforementioned statements inspire this thesis to look at transformer-based solutions and investigate the viability of adopting transformer-based network designs for glaucoma detection. Using retinal fundus images of the optic nerve head to develop a viable algorithm to assess the severity of glaucoma necessitates a large number of well-curated images. Initially, data is generated by augmenting ocular pictures. After that, the ocular images are pre-processed to make them ready for further processing. The system is trained using pre-processed images, and it classifies the input images as normal or glaucoma based on the features retrieved during training. The Vision Transformer (ViT) architecture is well suited to this situation, as it allows the self-attention mechanism to utilise structural modeling. Extensive experiments are run on the common dataset, and the results are thoroughly validated and visualized.

Keywords: glaucoma, vision transformer, convolutional architectures, retinal fundus images, self-attention, deep learning

Procedia PDF Downloads 158

2941 Enhancer: An Effective Transformer Architecture for Single Image Super Resolution

Authors: Pitigalage Chamath Chandira Peiris

Abstract:

A widely researched domain in the field of image processing in recent times has been single image super-resolution, which tries to restore a high-resolution image from a single low-resolution image. Many more single image super-resolution efforts have been completed utilizing equally traditional and deep learning methodologies, as well as a variety of other methodologies. Deep learning-based super-resolution methods, in particular, have received significant interest. As of now, the most advanced image restoration approaches are based on convolutional neural networks; nevertheless, only a few efforts have been performed using Transformers, which have demonstrated excellent performance on high-level vision tasks. The effectiveness of CNN-based algorithms in image super-resolution has been impressive. However, these methods cannot completely capture the non-local features of the data. Enhancer is a simple yet powerful Transformer-based approach for enhancing the resolution of images. A method for single image super-resolution was developed in this study, which utilized an efficient and effective transformer design. This proposed architecture makes use of a locally enhanced window transformer block to alleviate the enormous computational load associated with non-overlapping window-based self-attention. Additionally, it incorporates depth-wise convolution in the feed-forward network to enhance its ability to capture local context. This study is assessed by comparing the results obtained for popular datasets to those obtained by other techniques in the domain.

Keywords: single image super resolution, computer vision, vision transformers, image restoration

Procedia PDF Downloads 72

2940 Refined Edge Detection Network

Authors: Omar Elharrouss, Youssef Hmamouche, Assia Kamal Idrissi, Btissam El Khamlichi, Amal El Fallah-Seghrouchni

Abstract:

Edge detection is represented as one of the most challenging tasks in computer vision, due to the complexity of detecting the edges or boundaries in real-world images that contains objects of different types and scales like trees, building as well as various backgrounds. Edge detection is represented also as a key task for many computer vision applications. Using a set of backbones as well as attention modules, deep-learning-based methods improved the detection of edges compared with the traditional methods like Sobel and Canny. However, images of complex scenes still represent a challenge for these methods. Also, the detected edges using the existing approaches suffer from non-refined results while the image output contains many erroneous edges. To overcome this, n this paper, by using the mechanism of residual learning, a refined edge detection network is proposed (RED-Net). By maintaining the high resolution of edges during the training process, and conserving the resolution of the edge image during the network stage, we make the pooling outputs at each stage connected with the output of the previous layer. Also, after each layer, we use an affined batch normalization layer as an erosion operation for the homogeneous region in the image. The proposed methods are evaluated using the most challenging datasets including BSDS500, NYUD, and Multicue. The obtained results outperform the designed edge detection networks in terms of performance metrics and quality of output images.

Keywords: edge detection, convolutional neural networks, deep learning, scale-representation, backbone

Procedia PDF Downloads 64

2939 Machine Learning Strategies for Data Extraction from Unstructured Documents in Financial Services

Authors: Delphine Vendryes, Dushyanth Sekhar, Baojia Tong, Matthew Theisen, Chester Curme

Abstract:

Much of the data that inform the decisions of governments, corporations and individuals are harvested from unstructured documents. Data extraction is defined here as a process that turns non-machine-readable information into a machine-readable format that can be stored, for instance, in a database. In financial services, introducing more automation in data extraction pipelines is a major challenge. Information sought by financial data consumers is often buried within vast bodies of unstructured documents, which have historically required thorough manual extraction. Automated solutions provide faster access to non-machine-readable datasets, in a context where untimely information quickly becomes irrelevant. Data quality standards cannot be compromised, so automation requires high data integrity. This multifaceted task is broken down into smaller steps: ingestion, table parsing (detection and structure recognition), text analysis (entity detection and disambiguation), schema-based record extraction, user feedback incorporation. Selected intermediary steps are phrased as machine learning problems. Solutions leveraging cutting-edge approaches from the fields of computer vision (e.g. table detection) and natural language processing (e.g. entity detection and disambiguation) are proposed.

Keywords: computer vision, entity recognition, finance, information retrieval, machine learning, natural language processing

Procedia PDF Downloads 85

2938 Nurturing Green Creativity in Women Intrapreneurs through Green HRM: Testing Moderated Mediation Model: A Step Towards Saudi Vision 2030

Authors: Tahira Iram, Ahmad Raza Bilal

Abstract:

In 2016, the Kingdom of Saudi Arabia (KSA) initiated Saudi Vision 2030, an ambitious plan to lessen the country's dependency on fossil fuels and increase economic diversification. The Vision 2030 framework strives to establish a thriving economy, a vibrant society, and an ambitious nation. This study aims to investigate the role of green service innovation (SI) and green work engagement (WE) in mediating the nexus between green HRM and green creativity (GC) under the conditional role of spiritual leadership (SL). A survey was done of 300 female intrepreneurs working in the organization within Saudi Arabia. This study has collected data via a stratified random sampling technique. The framework was tested using PLS-SEM software. The findings reveal that WE fully intervenes in the nexus between green HRM and GC. Moreover, SL positively moderates the nexus between green HRM and SI. Thus based on findings, it is recommended that female intrapreneurs prioritize environmentally responsible operations to gain and sustain a competitive edge over rivals in the Saudi competitive market.

Keywords: green HRM, spiritual leadership, Vision 2030, women intrapreneurs, green service innovation behavior, green creativity

Procedia PDF Downloads 37

2937 An Empirical Study on the Effect of Physical Exercise and Outdoor Lighting on Pupils’ Eyesight

Authors: Zhang Jun Xiong

Abstract:

Objective: To explore the effect of physical exercise and outdoor lighting on the improvement of pupils' eyesight. Methods: A total of 208 first grade students in a primary school in Chengdu were enrolled in the study, 104 of whom were nearsighted and 104 had normal vision. They were randomly divided into indoor exercise group, outdoor exercise group, indoor control group and outdoor control group. Indoor and outdoor exercise groups performed moderate and high-intensity aerobic exercise three times a week, 60 minutes each time; The indoor and outdoor control groups had normal study and life during the experiment, without exercise intervention. The experiment lasted for one academic year, and the visual indicators of the subjects were tested before and after the experiment. Results: After the experiment, the visual fatigue index of the subjects with normal vision in the outdoor exercise group, indoor exercise group and outdoor control group decreased by 1.5 ± 2.89, 1.4 ± 3.05, 2.12 ± 2.66 respectively, and the diopter index decreased by 0.30D ± 0.09, 0.41D ± 0.16, 0.40D ± 0.19 respectively, while the visual fatigue score of the subjects with normal vision in the indoor control group increased by 2.3 ± 2.15, and the diopter decreased by 0.53D ± 0.22, There were significant differences in visual fatigue and diopter among the subjects with normal vision in each group (P<0.001). After the experiment, the visual fatigue index of the myopic subjects in the outdoor exercise group, indoor exercise group and outdoor control group decreased by 1.8 ± 1.95, 0.8 ± 1.81, 1.1 ± 1.85 respectively, and the diopter index decreased by 0.35D ± 0.21, 0.52D ± 0.24, 0.52D ± 0.15 respectively, while the visual fatigue score of the myopic subjects in the indoor control group increased by 1.3 ± 2.66, and the diopter decreased by 0.62D ± 0.29. There were significant differences between groups in visual fatigue and diopter (P<0.001). Conclusion: Both physical exercise and outdoor lighting can have a beneficial effect on children's vision, and the superposition effect of the two is better. It is suggested that outdoor physical exercise should be carried out more in primary school.

Keywords: physical exercise, outdoor lighting, pupil, vision, myopia

Procedia PDF Downloads 57

2936 Using Computer Vision and Machine Learning to Improve Facility Design for Healthcare Facility Worker Safety

Authors: Hengameh Hosseini

Abstract:

Design of large healthcare facilities – such as hospitals, multi-service line clinics, and nursing facilities - that can accommodate patients with wide-ranging disabilities is a challenging endeavor and one that is poorly understood among healthcare facility managers, administrators, and executives. An even less-understood extension of this problem is the implications of weakly or insufficiently accommodative design of facilities for healthcare workers in physically-intensive jobs who may also suffer from a range of disabilities and who are therefore at increased risk of workplace accident and injury. Combine this reality with the vast range of facility types, ages, and designs, and the problem of universal accommodation becomes even more daunting and complex. In this study, we focus on the implication of facility design for healthcare workers suffering with low vision who also have physically active jobs. The points of difficulty are myriad and could span health service infrastructure, the equipment used in health facilities, and transport to and from appointments and other services can all pose a barrier to health care if they are inaccessible, less accessible, or even simply less comfortable for people with various disabilities. We conduct a series of surveys and interviews with employees and administrators of 7 facilities of a range of sizes and ownership models in the Northeastern United States and combine that corpus with in-facility observations and data collection to identify five major points of failure common to all the facilities that we concluded could pose safety threats to employees with vision impairments, ranging from very minor to severe. We determine that lack of design empathy is a major commonality among facility management and ownership. We subsequently propose three methods for remedying this lack of empathy-informed design, to remedy the dangers posed to employees: the use of an existing open-sourced Augmented Reality application to simulate the low-vision experience for designers and managers; the use of a machine learning model we develop to automatically infer facility shortcomings from large datasets of recorded patient and employee reviews and feedback; and the use of a computer vision model fine tuned on images of each facility to infer and predict facility features, locations, and workflows, that could again pose meaningful dangers to visually impaired employees of each facility. After conducting a series of real-world comparative experiments with each of these approaches, we conclude that each of these are viable solutions under particular sets of conditions, and finally characterize the range of facility types, workforce composition profiles, and work conditions under which each of these methods would be most apt and successful.

Keywords: artificial intelligence, healthcare workers, facility design, disability, visually impaired, workplace safety

Procedia PDF Downloads 66

2935 The Role of Middle Managers SBU's in Context of Change: Sense-Making Approach

Authors: Hala Alioua, Alberic Tellier

Abstract:

This paper is designed to spotlight the research on corporate strategic planning, by emphasizing the role of middle manager of SBU’s and related issues such as the context of vision change. Previous research on strategic vision has been focused principally at the SME, with relatively limited consideration given to the role of middle managers SBU’s in the context of change. This project of research has been done by using a single case study. We formulated through our immersion for 2.5 years on the ground and by a qualitative method and abduction approach. This entity that we analyze is a subsidiary of multinational companies headquartered in Germany, specialized in manufacturing automotive equipment. The "Delta Company" is a French manufacturing plant that has undergone numerous changes over the past three years. The two major strategic changes that have a significant impact on the Delta plant are the strengths of its core business through « lead plant strategy» in 2011 and the implementation of a new strategic vision in 2014. These consecutive changes impact the purpose of the mission of the middle managers. The plant managers ask the following questions: How the middle managers make sense of the corporate strategic planning imposed by the parent company? How they appropriate the new vision and decline it into actions on the ground? We chose the individual interview technique through open-ended questions as the source of data collection. We first of all carried out an exploratory approach by interviewing 8 members of the Management committee’s decision and 19 heads of services. The first findings and results show that exist a divergence of opinion and interpretations of the corporate strategic planning among organization members and there are difficulties to make sense and interpretations of the signals of the environment. The lead plant strategy enables new projects which insure the workload of Delta Company. Nevertheless, it creates a tension and stress among the middle managers because its provoke lack of resources to the detriment of their main jobs as manufacturer plant. The middle managers does not have a clear vision and they are wondering if the new strategic vision means more autonomy and less support from the group.

Keywords: change, middle managers, vision, sensemaking

Procedia PDF Downloads 371

2934 Soueif’s 'The Returning' and 'The Nativity': A Portrait of the Other as Others

Authors: Samira Brahimi

Abstract:

Throughout Aisha, her first collection of short stories, Ahdaf Soueif draws a multilayered picture of the Other as others, picturing a series of encounters of her protagonist with this very Other as a set of binary elements. The current essay includes a comparative study between two narratives, namely The Returning and The Nativity. The Other is portrayed as a male/female binary in The Returning and as 'The Foreigner' in an exotic land vs. the local in The Nativity. The analysis is to focus on Aisha, the main female character, who figures as conforming to the portrait of the stereotyped Arab Muslim woman as a sex-subject, submissive, and maudlin character, confining her vision of the Other to the boundaries of her cocooned self, epitomizing a self-centered vision of the world. This reduced vision results in the possibility of viewing the Other as a hindrance to her attaining a clarified and centrifugal representation of the latter, herself, and the outside world. The encounters could also be considered as the character's opportunity for a less stigmatized perception of the elements set forth. The main queries to be probed are: what are the different perceptions of the Other by the author in the narratives set forth? How does the protagonist's encounter with the Other(s) impede her ability to understand the Other, herself, and the world around her? Or how does this encounter allow her an enlightened vision of the aforementioned elements to forge a new start? The possibility of imagining a dialogic relation between different perceptions of the Other opens up new perspectives for adopting magnified representations of the later, oneself, and the world, dilating one's imagination.

Keywords: dialogic, female, foreigner, local, male, other, others

Procedia PDF Downloads 95

2933 Facial Expression Recognition Using Sparse Gaussian Conditional Random Field

Authors: Mohammadamin Abbasnejad

Abstract:

The analysis of expression and facial Action Units (AUs) detection are very important tasks in fields of computer vision and Human Computer Interaction (HCI) due to the wide range of applications in human life. Many works have been done during the past few years which has their own advantages and disadvantages. In this work, we present a new model based on Gaussian Conditional Random Field. We solve our objective problem using ADMM and we show how well the proposed model works. We train and test our work on two facial expression datasets, CK+, and RU-FACS. Experimental evaluation shows that our proposed approach outperform state of the art expression recognition.

Keywords: Gaussian Conditional Random Field, ADMM, convergence, gradient descent

Procedia PDF Downloads 319

2932 Enhancing Plant Throughput in Mineral Processing Through Multimodal Artificial Intelligence

Authors: Muhammad Bilal Shaikh

Abstract:

Mineral processing plants play a pivotal role in extracting valuable minerals from raw ores, contributing significantly to various industries. However, the optimization of plant throughput remains a complex challenge, necessitating innovative approaches for increased efficiency and productivity. This research paper investigates the application of Multimodal Artificial Intelligence (MAI) techniques to address this challenge, aiming to improve overall plant throughput in mineral processing operations. The integration of multimodal AI leverages a combination of diverse data sources, including sensor data, images, and textual information, to provide a holistic understanding of the complex processes involved in mineral extraction. The paper explores the synergies between various AI modalities, such as machine learning, computer vision, and natural language processing, to create a comprehensive and adaptive system for optimizing mineral processing plants. The primary focus of the research is on developing advanced predictive models that can accurately forecast various parameters affecting plant throughput. Utilizing historical process data, machine learning algorithms are trained to identify patterns, correlations, and dependencies within the intricate network of mineral processing operations. This enables real-time decision-making and process optimization, ultimately leading to enhanced plant throughput. Incorporating computer vision into the multimodal AI framework allows for the analysis of visual data from sensors and cameras positioned throughout the plant. This visual input aids in monitoring equipment conditions, identifying anomalies, and optimizing the flow of raw materials. The combination of machine learning and computer vision enables the creation of predictive maintenance strategies, reducing downtime and improving the overall reliability of mineral processing plants. Furthermore, the integration of natural language processing facilitates the extraction of valuable insights from unstructured textual data, such as maintenance logs, research papers, and operator reports. By understanding and analyzing this textual information, the multimodal AI system can identify trends, potential bottlenecks, and areas for improvement in plant operations. This comprehensive approach enables a more nuanced understanding of the factors influencing throughput and allows for targeted interventions. The research also explores the challenges associated with implementing multimodal AI in mineral processing plants, including data integration, model interpretability, and scalability. Addressing these challenges is crucial for the successful deployment of AI solutions in real-world industrial settings. To validate the effectiveness of the proposed multimodal AI framework, the research conducts case studies in collaboration with mineral processing plants. The results demonstrate tangible improvements in plant throughput, efficiency, and cost-effectiveness. The paper concludes with insights into the broader implications of implementing multimodal AI in mineral processing and its potential to revolutionize the industry by providing a robust, adaptive, and data-driven approach to optimizing plant operations. In summary, this research contributes to the evolving field of mineral processing by showcasing the transformative potential of multimodal artificial intelligence in enhancing plant throughput. The proposed framework offers a holistic solution that integrates machine learning, computer vision, and natural language processing to address the intricacies of mineral extraction processes, paving the way for a more efficient and sustainable future in the mineral processing industry.

Keywords: multimodal AI, computer vision, NLP, mineral processing, mining

Procedia PDF Downloads 31

2931 The Prediction of Evolutionary Process of Coloured Vision in Mammals: A System Biology Approach

Authors: Shivani Sharma, Prashant Saxena, Inamul Hasan Madar

Abstract:

Since the time of Darwin, it has been considered that genetic change is the direct indicator of variation in phenotype. But a few studies in system biology in the past years have proposed that epigenetic developmental processes also affect the phenotype thus shifting the focus from a linear genotype-phenotype map to a non-linear G-P map. In this paper, we attempt at explaining the evolution of colour vision in mammals by taking LWS/ Long-wave sensitive gene under consideration.

Keywords: evolution, phenotypes, epigenetics, LWS gene, G-P map

Procedia PDF Downloads 478

2930 Improvement of Visual Acuity in Patient Undergoing Occlusion Therapy

Authors: Rajib Husain, Mezbah Uddin, Mohammad Shamsal Islam, Rabeya Siddiquee

Abstract:

Purpose: To determine the improvement of visual acuity in patients undergoing occlusion therapy. Methods: This was a prospective hospital-based study of newly diagnosed of amblyopia seen at the pediatric clinic of Chittagong Eye Infirmary & Training Complex. There were 32 refractive amblyopia subjects were examined & questionnaire was piloted. Included were all patients diagnosed with refractive amblyopia between 5 to 8 years, without previous amblyopia treatment, and whose parents were interested to participate in the study. Patients diagnosed with strabismic amblyopia were excluded. Patients were first corrected with the best correction for a month. When the VA in the amblyopic eye did not improve over a month, then occlusion treatment was started. Occlusion was done daily for 6-8 h together with vision therapy. The occlusion was carried out for three months. Results: Out of study 32 children, 31 of them have a good compliance of amblyopic treatment whereas one child has poor compliance. About 6% Children have amblyopia from Myopia, 7% Hyperopia, 32% from myopic astigmatism, 42% from hyperopic astigmatism and 13% have mixed astigmatism. The mean and Standard deviation of present average VA was 0.452±0.275 Log MAR and after an intervention of amblyopia therapy with vision therapy mean and Standard deviation VA was 0.155±0.157 Log MAR. Out of total respondent 21.85% have BCVA in range from (0-.2) log MAR, 37.5% have BCVA in range from (0.22-0.5) log MAR, 35.95% have in range from (0.52-0.8) log MAR, 4.7% have in range from (0.82-1) log MAR and after intervention of occlusion therapy with vision therapy 76.6% have VA in range from (0-.2) log MAR, 21.85% have VA in range from (0.22-0.5) log MAR, 1.5% have in range from (0.52-0.8) log MAR. Conclusion: Amblyopia is a most important factor in pediatric age group because it can lead to visual impairment. Thus, this study concludes that occlusion therapy with vision therapy is probably one of the best treatment methods for amblyopic patients (age 5-8 years), and compliance and age were the most critical factor predicting a successful outcome.

Keywords: amblyopia, occlusion therapy, vision therapy, eccentric fixation, visuoscopy

Procedia PDF Downloads 478

2929 Electronic and Computer-Assisted Refreshable Braille Display Developed for Visually Impaired Individuals

Authors: Ayşe Eldem, Fatih Başçiftçi

Abstract:

Braille alphabet is an important tool that enables visually impaired individuals to have a comfortable life like those who have normal vision. For this reason, new applications related to the Braille alphabet are being developed. In this study, a new Refreshable Braille Display was developed to help visually impaired individuals learn the Braille alphabet easier. By means of this system, any text downloaded on a computer can be read by the visually impaired individual at that moment by feeling it by his/her hands. Through this electronic device, it was aimed to make learning the Braille alphabet easier for visually impaired individuals with whom the necessary tests were conducted.

Keywords: visually impaired individual, Braille, Braille display, refreshable Braille display, USB

Procedia PDF Downloads 318

2928 Research of Control System for Space Intelligent Robot Based on Vision Servo

Authors: Changchun Liang, Xiaodong Zhang, Xin Liu, Pengfei Sun

Abstract:

Space intelligent robotic systems are expected to play an increasingly important role in the future. The robotic on-orbital service, whose key is the tracking and capturing technology, becomes research hot in recent years. In this paper, the authors propose a vision servo control system for target capturing. Robotic manipulator will be an intelligent robotic system with large-scale movement, functional agility, and autonomous ability, and it can be operated by astronauts in the space station or be controlled by the ground operator in the remote operation mode. To realize the autonomous movement and capture mission of SRM, a kind of autonomous programming strategy based on multi-camera vision fusion is designed and the selection principle of object visual position and orientation measurement information is defined for the better precision. Distributed control system hierarchy is designed and reliability is considering to guarantee the abilities of control system. At last, a ground experiment system is set up based on the concept of robotic control system. With that, the autonomous target capturing experiments are conducted. The experiment results validate the proposed algorithm, and demonstrates that the control system can fulfill the needs of function, real-time and reliability.

Keywords: control system, on-orbital service, space robot, vision servo

Procedia PDF Downloads 387