Search results for: robot vision

123 DocPro: A Framework for Processing Semantic and Layout Information in Business Documents

Authors: Ming-Jen Huang, Chun-Fang Huang, Chiching Wei

Abstract:

With the recent advance of the deep neural network, we observe new applications of NLP (natural language processing) and CV (computer vision) powered by deep neural networks for processing business documents. However, creating a real-world document processing system needs to integrate several NLP and CV tasks, rather than treating them separately. There is a need to have a unified approach for processing documents containing textual and graphical elements with rich formats, diverse layout arrangement, and distinct semantics. In this paper, a framework that fulfills this unified approach is presented. The framework includes a representation model definition for holding the information generated by various tasks and specifications defining the coordination between these tasks. The framework is a blueprint for building a system that can process documents with rich formats, styles, and multiple types of elements. The flexible and lightweight design of the framework can help build a system for diverse business scenarios, such as contract monitoring and reviewing.

Keywords: Document processing, framework, formal definition, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 584

122 Investigation into the Role of Leadership in the Management of Digital Transformation for Small and Medium Enterprises

Authors: Francesco Coraci, Abdul-Hadi G. Abulrub

Abstract:

Digital technology is transforming the landscape of the industrial sector at a precedential level by connecting people, processes, and machines in real-time. It represents the means for a new pathway to achieve innovative, dynamic competitive advantages, deliver unique customers’ values, and sustain critical relationships. Thus, success in a constantly changing environment is governed by the ability of an organization to revolutionize their business models, deliver innovative solutions, and capture values from big data analytics and insights. Businesses need to re-strategize operations and develop extra capabilities to cope with the necessity for additional flexibility and agility. The traditional “command and control” leadership style is structurally and operationally incompatible with the digital era. In this paper, the authors discuss how transformational leaders can act as a glue in the social, organizational context, which is crucial to enable the workforce and develop a psychological attachment to the digital vision.

Keywords: Internet of things, strategy, change leadership, dynamic competitive advantage, digital transformation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 593

121 Urban Regeneration of Historic Paths: A Case Study of Kom El Dekka Historic Path

Authors: Ahmed R. Ismail, Hatem A. El Tawil, Nevin G. Rezk

Abstract:

Historic paths in today's cities are facing the pressure of the urban development due to the rapid urban growth. Every new development is tearing the old urban fabric and the socio-economic character of the historic paths. Furthermore, in some cases historic paths suffer from negligence and decay. Kom El Dekka historic path was one of those deteriorated paths in the city of Alexandria, Egypt, in spite of its high heritage and socio-economic value. Therefore, there was a need to develop urban regeneration strategies as a part of a wider sustainable development vision, to handle the situation and revitalize the path as a livable space in the heart of the city. This study aims to develop a comprehensive assessment methodology to evaluate the different values of the path and to create community-oriented and economic-based analysis methodology for its socio-economic values. These analysis and assessments provide strategies for any regeneration action plan for Kom El Dekka historic path.

Keywords: Community-oriented, economic-based, syntactical analysis, urban regeneration.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2011

120 The Use of Lane-Centering to Assure the Visible Light Communication Connectivity for a Platoon of Autonomous Vehicles

Authors: Mohammad Y. Abualhoul, Edgar Talavera Munoz, Fawzi Nashashibi

Abstract:

The new emerging Visible Light Communication (VLC) technology has been subjected to intensive investigation, evaluation, and lately, deployed in the context of convoy-based applications for Intelligent Transportations Systems (ITS). The technology limitations were defined and supported by different solutions proposals to enhance the crucial alignment and mobility limitations. In this paper, we propose the incorporation of VLC technology and Lane-Centering (LC) technique to assure the VLC-connectivity by keeping the autonomous vehicle aligned to the lane center using vision-based lane detection in a convoy-based formation. Such combination can ensure the optical communication connectivity with a lateral error less than 30 cm. As soon as the road lanes are detectable, the evaluated system showed stable behavior independently from the inter-vehicle distances and without the need for any exchanged information of the remote vehicles. The evaluation of the proposed system is verified using VLC prototype and an empirical result of LC running application over 60 km in Madrid M40 highway.

Keywords: VLC, lane-centering, platoon, ITS, road safety applications.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 731

119 Performance Improvement of Moving Object Recognition and Tracking Algorithm using Parallel Processing of SURF and Optical Flow

Authors: Jungho Choi, Youngwan Cho

Abstract:

The paper proposes a way of parallel processing of SURF and Optical Flow for moving object recognition and tracking. The object recognition and tracking is one of the most important task in computer vision, however disadvantage are many operations cause processing speed slower so that it can-t do real-time object recognition and tracking. The proposed method uses a typical way of feature extraction SURF and moving object Optical Flow for reduce disadvantage and real-time moving object recognition and tracking, and parallel processing techniques for speed improvement. First analyse that an image from DB and acquired through the camera using SURF for compared to the same object recognition then set ROI (Region of Interest) for tracking movement of feature points using Optical Flow. Secondly, using Multi-Thread is for improved processing speed and recognition by parallel processing. Finally, performance is evaluated and verified efficiency of algorithm throughout the experiment.

Keywords: moving object recognition, moving object tracking, SURF, Optical Flow, Multi-Thread.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2598

118 Strategies for Connectivity Configuration to Access e-Learning Resources: Case of Rural Secondary Schools in Tanzania

Authors: F. Simba, L. Trojer, N.H. Mvungi, B.M. Mwinyiwiwa, E.M. Mjema

Abstract:

In response to address different development challenges, Tanzania is striving to achieve its fourth attribute of the National Development Vision, i.e. to have a well educated and learned society by the year 2025. One of the most cost effective methods that can reach a large part of the society in a short time is to integrate ICT in education through e-learning initiatives. However, elearning initiatives are challenged by limited or lack of connectivity to majority of secondary schools, especially those in rural and remote areas. This paper has explores the possibility for rural secondary school to access online e-Learning resources from a centralized e- Learning Management System (e-LMS). The scope of this paper is limited to schools that have computers irrespective of internet connectivity, resulting in two categories schools; those with internet access and those without. Different connectivity configurations have been proposed according to the ICT infrastructure status of the respective schools. However, majority of rural secondary schools in Tanzania have neither computers nor internet connection. Therefore this is a challenge to be addressed for the disadvantaged schools to benefit from e-Learning initiatives.

Keywords: connectivity, configuration, e-Learning, replication, rural.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1920

117 Triangle Issues of Sustainability at the University Level within a Vision of Knowledge Economy and Society

Authors: Ashiquer Rahman

Abstract:

The paper focuses on the importance of the knowledge economy and society, emphasizing the significance of the triangle issues (Innovation, Sustainability, and Higher Education) for building a sustainable campus at the university level and preparing students to face the upcoming sustainability challenges in the competitive and sustainable world. Within a framework of the knowledge economy and society, the paper discusses the significance of sustainable campus, triangle issues and potential action plan for the university level. It makes mention of the emergence of a knowledge-based economy and society as well as the necessity of combining innovation, sustainability, and education to create a sustainable campus at the university level. The paper outlines nine significant issues or challenges related to a sustainable campus that have been emphasized, and cross-linked with each other. Optimistically, it will be a milestone in higher education, a pathway to meet the imminent sustainable challenges of the completive world and be able to manage the knowledge economy and societal system

Keywords: Triangle issues, sustainable campus, higher education, knowledge economy, knowledge society.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 172

116 A Universal Model for Content-Based Image Retrieval

Authors: S. Nandagopalan, Dr. B. S. Adiga, N. Deepak

Abstract:

In this paper a novel approach for generalized image retrieval based on semantic contents is presented. A combination of three feature extraction methods namely color, texture, and edge histogram descriptor. There is a provision to add new features in future for better retrieval efficiency. Any combination of these methods, which is more appropriate for the application, can be used for retrieval. This is provided through User Interface (UI) in the form of relevance feedback. The image properties analyzed in this work are by using computer vision and image processing algorithms. For color the histogram of images are computed, for texture cooccurrence matrix based entropy, energy, etc, are calculated and for edge density it is Edge Histogram Descriptor (EHD) that is found. For retrieval of images, a novel idea is developed based on greedy strategy to reduce the computational complexity. The entire system was developed using AForge.Imaging (an open source product), MATLAB .NET Builder, C#, and Oracle 10g. The system was tested with Coral Image database containing 1000 natural images and achieved better results.

Keywords: Content Based Image Retrieval (CBIR), Cooccurrencematrix, Feature vector, Edge Histogram Descriptor(EHD), Greedy strategy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2895

115 View-Point Insensitive Human Pose Recognition using Neural Network and CUDA

Authors: Sanghyeok Oh, Keechul Jung

Abstract:

Although lots of research work has been done for human pose recognition, the view-point of cameras is still critical problem of overall recognition system. In this paper, view-point insensitive human pose recognition is proposed. The aims of the proposed system are view-point insensitivity and real-time processing. Recognition system consists of feature extraction module, neural network and real-time feed forward calculation. First, histogram-based method is used to extract feature from silhouette image and it is suitable for represent the shape of human pose. To reduce the dimension of feature vector, Principle Component Analysis(PCA) is used. Second, real-time processing is implemented by using Compute Unified Device Architecture(CUDA) and this architecture improves the speed of feed-forward calculation of neural network. We demonstrate the effectiveness of our approach with experiments on real environment.

Keywords: computer vision, neural network, pose recognition, view-point insensitive, PCA, CUDA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1307

114 The Role of Leadership and Innovation in Ecotourism Services Activity in Candirejo Village, Borobudur, Central Java, Indonesia

Authors: Iwan Nugroho, Purnawan D. Negara

Abstract:

This paper is aimed to study the roles of leadership and innovation in the development of local people based ecotourism services. The survey is conducted in Candirejo village, Borobudur District, Magelang Regency. The study of a descriptive approach is employed to identify people's behavior in ecotourism services. The results showed that ecotourism services have developed and provided benefits to the people. The roles of leadership and innovation interact positively with a cooperative to organize an ecotourism services management. The leadership is able to identify substances, to do the vision and missions of environmental and cultural conservation. The innovation provides alternative development efforts and increases the added value of ecotourism. The cooperative management was able to support a process to realize the goals of ecotourism, to build participation and communication, and to perform organizational learning. The phenomenon of the leadership in the Candirejo ecotourism enriches the studies of the ecotourism management. During this time, the ecotourism management is always associated with the standard management of national park. The ecotourism management of Candirejo is considered successful even outside the national park management.

Keywords: Borobudur, Candirejo, ecotourism, inovation, Leadership.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2931

113 Urban Reforms of Tanzimat: Early Urbanization and Transportation Practices in The Formation Process of Turkish Reconstruction System(1839-1908) in Bursa The First Capital City of Ottoman Empire

Authors: M.Bilal Bagbanci, Ozlem Koprulu Bagbanci

Abstract:

Bursa, since the establishment of the Ottoman Empire, being on the important trade roads and having a capital accumulation as a result of silk production, was one of the first cities of modernization activities applied. Bursa maintained its importance even during the Republican Period and became one of the most important cities of the country and today is the fourth biggest and the industrialized city in Turkey. Social, political, economical and cultural changes occured with the reforms starting with the 1839 Edict of Tanzimat that aimed at modernizing the society and the government and centralizing the political power began in the Ottoman Empire. After the Tanzimat Reforms transformation of the city changed and planning processes began in Bursa according to the vision of Governors. The theresholds of the city are very important data for a sustainable planning for the city planners. Main aim of this study is to investigate the changes and transformations of the city according to the changes in the socio-economical and cultural properties for the city planners.

Keywords: Transportation, urbanization, Tanzimat reforms, modernization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2035

112 Interdisciplinary Principles of Field-Like Coordination in the Case of Self-Organized Social Systems1

Authors: D. Plikynas, S. Masteika, A. Budrionis

Abstract:

This interdisciplinary research aims to distinguish universal scale-free and field-like fundamental principles of selforganization observable across many disciplines like computer science, neuroscience, microbiology, social science, etc. Based on these universal principles we provide basic premises and postulates for designing holistic social simulation models. We also introduce pervasive information field (PIF) concept, which serves as a simulation media for contextual information storage, dynamic distribution and organization in social complex networks. PIF concept specifically is targeted for field-like uncoupled and indirect interactions among social agents capable of affecting and perceiving broadcasted contextual information. Proposed approach is expressive enough to represent contextual broadcasted information in a form locally accessible and immediately usable by network agents. This paper gives some prospective vision how system-s resources (tangible and intangible) could be simulated as oscillating processes immersed in the all pervasive information field.

Keywords: field-based coordination, multi-agent systems, information-rich social networks, pervasive information field

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1532

111 Practical Applications and Connectivity Algorithms in Future Wireless Sensor Networks

Authors: Mohamed K. Watfa

Abstract:

Like any sentient organism, a smart environment relies first and foremost on sensory data captured from the real world. The sensory data come from sensor nodes of different modalities deployed on different locations forming a Wireless Sensor Network (WSN). Embedding smart sensors in humans has been a research challenge due to the limitations imposed by these sensors from computational capabilities to limited power. In this paper, we first propose a practical WSN application that will enable blind people to see what their neighboring partners can see. The challenge is that the actual mapping between the input images to brain pattern is too complex and not well understood. We also study the connectivity problem in 3D/2D wireless sensor networks and propose distributed efficient algorithms to accomplish the required connectivity of the system. We provide a new connectivity algorithm CDCA to connect disconnected parts of a network using cooperative diversity. Through simulations, we analyze the connectivity gains and energy savings provided by this novel form of cooperative diversity in WSNs.

Keywords: Wireless Sensor Networks, Pervasive Computing, Eye Vision Application, 3D Connectivity, Clusters, Energy Efficient, Cooperative diversity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1592

110 A Proposal on the Educational Transactional Analysis as a Dialogical Vision of Culture: Conceptual Signposts and Practical Tools for Educators

Authors: Marina Sartor Hoffer

Abstract:

The multicultural composition of today's societies poses new challenges to educational contexts. Schools are therefore called first to develop dialogic aptitudes and communicative skills adapted to the complex reality of post-modern societies. It is indispensable for educators and for young people to learn theoretical and practical tools during their scholastic path, in order to allow the knowledge of themselves and of the others with the aim of recognizing the value of the others regardless of their culture. Dialogic Skills help to understand and manage individual differences by allowing the solution of problems and preventing conflicts. The Educational Sector of Eric Berne’s Transactional Analysis offers a range of methods and techniques for this purpose. Educational Transactional Analysis is firmly anchored in the Personalist Philosophy and deserves to be promoted as a theoretical frame suitable to face the challenges of contemporary education. The goal of this paper is therefore to outline some conceptual and methodological signposts for the education to dialogue by drawing concepts and methodologies from educational transactional analysis.

Keywords: Dialogic process, education to dialogue, educational transactional analysis, personalism, the good of the relationship.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 864

109 A Motion Dictionary to Real-Time Recognition of Sign Language Alphabet Using Dynamic Time Warping and Artificial Neural Network

Authors: Marcio Leal, Marta Villamil

Abstract:

Computacional recognition of sign languages aims to allow a greater social and digital inclusion of deaf people through interpretation of their language by computer. This article presents a model of recognition of two of global parameters from sign languages; hand configurations and hand movements. Hand motion is captured through an infrared technology and its joints are built into a virtual three-dimensional space. A Multilayer Perceptron Neural Network (MLP) was used to classify hand configurations and Dynamic Time Warping (DWT) recognizes hand motion. Beyond of the method of sign recognition, we provide a dataset of hand configurations and motion capture built with help of fluent professionals in sign languages. Despite this technology can be used to translate any sign from any signs dictionary, Brazilian Sign Language (Libras) was used as case study. Finally, the model presented in this paper achieved a recognition rate of 80.4%.

Keywords: Sign language recognition, computer vision, infrared, artificial neural network, dynamic time warping.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 831

108 2D Spherical Spaces for Face Relighting under Harsh Illumination

Authors: Amr Almaddah, Sadi Vural, Yasushi Mae, Kenichi Ohara, Tatsuo Arai

Abstract:

In this paper, we propose a robust face relighting technique by using spherical space properties. The proposed method is done for reducing the illumination effects on face recognition. Given a single 2D face image, we relight the face object by extracting the nine spherical harmonic bases and the face spherical illumination coefficients. First, an internal training illumination database is generated by computing face albedo and face normal from 2D images under different lighting conditions. Based on the generated database, we analyze the target face pixels and compare them with the training bootstrap by using pre-generated tiles. In this work, practical real time processing speed and small image size were considered when designing the framework. In contrast to other works, our technique requires no 3D face models for the training process and takes a single 2D image as an input. Experimental results on publicly available databases show that the proposed technique works well under severe lighting conditions with significant improvements on the face recognition rates.

Keywords: Face synthesis and recognition, Face illumination recovery, 2D spherical spaces, Vision for graphics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1725

107 Tape-Shaped Multiscale Fiducial Marker: A Design Prototype for Indoor Localization

Authors: Marcell S. A. Martins, Benedito S. R. Neto, Gerson L. Serejo, Carlos G. R. Santos

Abstract:

Indoor positioning systems use sensors such as Bluetooth, ZigBee, and Wi-Fi, as well as cameras for image capture, which can be fixed or mobile. These computer vision-based positioning approaches are low-cost to implement, mainly when it uses a mobile camera. The present study aims to create a design of a fiducial marker for a low-cost indoor localization system. The marker is tape-shaped to perform a continuous reading employing two detection algorithms, one for greater distances and another for smaller distances. Therefore, the location service is always operational, even with variations in capture distance. A minimal localization and reading algorithm was implemented for the proposed marker design, aiming to validate it. The accuracy tests consider readings varying the capture distance between [0.5, 10] meters, comparing the proposed marker with others. The tests showed that the proposed marker has a broader capture range than the ArUco and QRCode, maintaining the same size. Therefore, reducing the visual pollution and maximizing the tracking since the ambient can be covered entirely.

Keywords: Multiscale recognition, indoor localization, tape-shaped marker, Fiducial Marker.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 72

106 Enhancement of Stereo Video Pairs Using SDNs To Aid In 3D Reconstruction

Authors: Lewis E. Hibell, Honghai Liu, David J. Brown

Abstract:

This paper presents the results of enhancing images from a left and right stereo pair in order to increase the resolution of a 3D representation of a scene generated from that same pair. A new neural network structure known as a Self Delaying Dynamic Network (SDN) has been used to perform the enhancement. The advantage of SDNs over existing techniques such as bicubic interpolation is their ability to cope with motion and noise effects. SDNs are used to generate two high resolution images, one based on frames taken from the left view of the subject, and one based on the frames from the right. This new high resolution stereo pair is then processed by a disparity map generator. The disparity map generated is compared to two other disparity maps generated from the same scene. The first is a map generated from an original high resolution stereo pair and the second is a map generated using a stereo pair which has been enhanced using bicubic interpolation. The maps generated using the SDN enhanced pairs match more closely the target maps. The addition of extra noise into the input images is less problematic for the SDN system which is still able to out perform bicubic interpolation.

Keywords: Genetic Evolution, Image Enhancement, Neuron Networks, Stereo Vision

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1378

105 Face Detection in Color Images using Color Features of Skin

Authors: Fattah Alizadeh, Saeed Nalousi, Chiman Savari

Abstract:

Because of increasing demands for security in today-s society and also due to paying much more attention to machine vision, biometric researches, pattern recognition and data retrieval in color images, face detection has got more application. In this article we present a scientific approach for modeling human skin color, and also offer an algorithm that tries to detect faces within color images by combination of skin features and determined threshold in the model. Proposed model is based on statistical data in different color spaces. Offered algorithm, using some specified color threshold, first, divides image pixels into two groups: skin pixel group and non-skin pixel group and then based on some geometric features of face decides which area belongs to face. Two main results that we received from this research are as follow: first, proposed model can be applied easily on different databases and color spaces to establish proper threshold. Second, our algorithm can adapt itself with runtime condition and its results demonstrate desirable progress in comparison with similar cases.

Keywords: face detection, skin color modeling, color, colorfulimages, face recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2262

104 Vision Based Hand Gesture Recognition Using Generative and Discriminative Stochastic Models

Authors: Mahmoud Elmezain, Samar El-shinawy

Abstract:

Many approaches to pattern recognition are founded on probability theory, and can be broadly characterized as either generative or discriminative according to whether or not the distribution of the image features. Generative and discriminative models have very different characteristics, as well as complementary strengths and weaknesses. In this paper, we study these models to recognize the patterns of alphabet characters (A-Z) and numbers (0-9). To handle isolated pattern, generative model as Hidden Markov Model (HMM) and discriminative models like Conditional Random Field (CRF), Hidden Conditional Random Field (HCRF) and Latent-Dynamic Conditional Random Field (LDCRF) with different number of window size are applied on extracted pattern features. The gesture recognition rate is improved initially as the window size increase, but degrades as window size increase further. Experimental results show that the LDCRF is the best in terms of results than CRF, HCRF and HMM at window size equal 4. Additionally, our results show that; an overall recognition rates are 91.52%, 95.28%, 96.94% and 98.05% for CRF, HCRF, HMM and LDCRF respectively.

Keywords: Statistical Pattern Recognition, Generative Model, Discriminative Model, Human Computer Interaction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2874

103 A Differential Calculus Based Image Steganography with Crossover

Authors: Srilekha Mukherjee, Subha Ash, Goutam Sanyal

Abstract:

Information security plays a major role in uplifting the standard of secured communications via global media. In this paper, we have suggested a technique of encryption followed by insertion before transmission. Here, we have implemented two different concepts to carry out the above-specified tasks. We have used a two-point crossover technique of the genetic algorithm to facilitate the encryption process. For each of the uniquely identified rows of pixels, different mathematical methodologies are applied for several conditions checking, in order to figure out all the parent pixels on which we perform the crossover operation. This is done by selecting two crossover points within the pixels thereby producing the newly encrypted child pixels, and hence the encrypted cover image. In the next lap, the first and second order derivative operators are evaluated to increase the security and robustness. The last lap further ensures reapplication of the crossover procedure to form the final stego-image. The complexity of this system as a whole is huge, thereby dissuading the third party interferences. Also, the embedding capacity is very high. Therefore, a larger amount of secret image information can be hidden. The imperceptible vision of the obtained stego-image clearly proves the proficiency of this approach.

Keywords: Steganography, Crossover, Differential Calculus, Peak Signal to Noise Ratio, Cross-correlation Coefficient.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1351

102 Shifted Window Based Self-Attention via Swin Transformer for Zero-Shot Learning

Authors: Yasaswi Palagummi, Sareh Rowlands

Abstract:

Generalised Zero-Shot Learning, often known as GZSL, is an advanced variant of zero-shot learning in which the samples in the unseen category may be either seen or unseen. GZSL methods typically have a bias towards the seen classes because they learn a model to perform recognition for both the seen and unseen classes using data samples from the seen classes. This frequently leads to the misclassification of data from the unseen classes into the seen classes, making the task of GZSL more challenging. In this work, we propose an approach leveraging the Shifted Window based Self-Attention in the Swin Transformer (Swin-GZSL) to work in the inductive GZSL problem setting. We run experiments on three popular benchmark datasets: CUB, SUN, and AWA2, which are specifically used for ZSL and its other variants. The results show that our model based on Swin Transformer has achieved state-of-the-art harmonic mean for two datasets - AWA2 and SUN and near-state-of-the-art for the other dataset - CUB. More importantly, this technique has a linear computational complexity, which reduces training time significantly. We have also observed less bias than most of the existing GZSL models.

Keywords: Generalised Zero-shot Learning, Inductive Learning, Shifted-Window Attention, Swin Transformer, Vision Transformer.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 158

101 A Structural Support Vector Machine Approach for Biometric Recognition

Authors: Vishal Awasthi, Atul Kumar Agnihotri

Abstract:

Face is a non-intrusive strong biometrics for identification of original and dummy facial by different artificial means. Face recognition is extremely important in the contexts of computer vision, psychology, surveillance, pattern recognition, neural network, content based video processing. The availability of a widespread face database is crucial to test the performance of these face recognition algorithms. The openly available face databases include face images with a wide range of poses, illumination, gestures and face occlusions but there is no dummy face database accessible in public domain. This paper presents a face detection algorithm based on the image segmentation in terms of distance from a fixed point and template matching methods. This proposed work is having the most appropriate number of nodal points resulting in most appropriate outcomes in terms of face recognition and detection. The time taken to identify and extract distinctive facial features is improved in the range of 90 to 110 sec. with the increment of efficiency by 3%.

Keywords: Face recognition, Principal Component Analysis, PCA, Linear Discriminant Analysis, LDA, Improved Support Vector Machine, iSVM, elastic bunch mapping technique.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 418

100 A Hidden Markov Model-Based Isolated and Meaningful Hand Gesture Recognition

Authors: Mahmoud Elmezain, Ayoub Al-Hamadi, Jörg Appenrodt, Bernd Michaelis

Abstract:

Gesture recognition is a challenging task for extracting meaningful gesture from continuous hand motion. In this paper, we propose an automatic system that recognizes isolated gesture, in addition meaningful gesture from continuous hand motion for Arabic numbers from 0 to 9 in real-time based on Hidden Markov Models (HMM). In order to handle isolated gesture, HMM using Ergodic, Left-Right (LR) and Left-Right Banded (LRB) topologies is applied over the discrete vector feature that is extracted from stereo color image sequences. These topologies are considered to different number of states ranging from 3 to 10. A new system is developed to recognize the meaningful gesture based on zero-codeword detection with static velocity motion for continuous gesture. Therefore, the LRB topology in conjunction with Baum-Welch (BW) algorithm for training and forward algorithm with Viterbi path for testing presents the best performance. Experimental results show that the proposed system can successfully recognize isolated and meaningful gesture and achieve average rate recognition 98.6% and 94.29% respectively.

Keywords: Computer Vision & Image Processing, Gesture Recognition, Pattern Recognition, Application

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2204

99 Deep Learning Based Fall Detection Using Simplified Human Posture

Authors: Kripesh Adhikari, Hamid Bouchachia, Hammadi Nait-Charif

Abstract:

Falls are one of the major causes of injury and death among elderly people aged 65 and above. A support system to identify such kind of abnormal activities have become extremely important with the increase in ageing population. Pose estimation is a challenging task and to add more to this, it is even more challenging when pose estimations are performed on challenging poses that may occur during fall. Location of the body provides a clue where the person is at the time of fall. This paper presents a vision-based tracking strategy where available joints are grouped into three different feature points depending upon the section they are located in the body. The three feature points derived from different joints combinations represents the upper region or head region, mid-region or torso and lower region or leg region. Tracking is always challenging when a motion is involved. Hence the idea is to locate the regions in the body in every frame and consider it as the tracking strategy. Grouping these joints can be beneficial to achieve a stable region for tracking. The location of the body parts provides a crucial information to distinguish normal activities from falls.

Keywords: Fall detection, machine learning, deep learning, pose estimation, tracking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2074

98 The Roles of Community Based Telecenters in Bridging the Digital Divide in Rural Malaysia

Authors: Zulkefli bin Ibrahim, Ainin Sulaiman, Tengku M. Faziharudean

Abstract:

Malaysia is aggressive in promoting the usage of ICT to its mass population through the support by the government policies and programs targeting the general population. However, with the uneven distribution of the basic telecommunication infrastructure between the urban and rural area, cost for being “interconnected" that is considered high among the poorer rural population and the lack of local contents that suit the rural population needs or lifestyles, it is still a challenge for Malaysia to achieve its Vision 2020 Agenda moving the nation towards an information society by the year 2020. Among the existing programs that have been carried out by the government to encourage the usage of ICT by the rural population is “Kedaikom", a community telecenter with the general aim is to engage the community to get exposed and to use the ICT, encouraging the diffusion of the ICT technology to the rural population. The research investigated by using a questionnaire survey of how Kedaikom, as a community telecenter could play a role in encouraging the rural or underserved community to use the ICT. The result from the survey has proven that the community telecenter could bridge the digital divide between the underserved rural population and the well-accessed urban population in Malaysia. More of the rural population, especially from the younger generation and those with higher educational background are using the community telecenter to be connected to the ICT.

Keywords: Digital divide, ICT, telecenters.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2104

97 Health Assessment and Disorders of External Respiration Function among Physicians

Authors: A. G. Margaryan

Abstract:

Aims and Objectives: Assessment of health status and detection disorders of external respiration functions (ERF) during preventative medical examination among physicians of Armenia. Subjects and Methods: Overall, fifty-nine physicians (17 men and 42 women) were examined and spirometry was carried out. The average age of the physicians was 50 years old. The studies were conducted on the Micromedical MicroLab 3500 Spirometer. Results: 25.4% among 59 examined physicians are overweight; 22.0% of them suffer from obesity. Two physicians are currently smokers. About half of the examined physicians (50.8%) at the time of examination were diagnosed with some diseases and had different health-related problems (excluding the problems related to vision and hearing). FVC was 2.94±0.1, FEV1 – 2.64±0.1, PEF – 329.7±19.9, and FEV₁%/FVC – 89.7±1.3. Pathological changes of ERF are identified in 23 (39.0%) cases. 28.8% of physicians had first degree of restrictive disorders, 3.4% – first degree of combined obstructive/ restrictive disorders, 6.8% – second degree of combined obstructive/ restrictive disorders. Only three physicians with disorders of the ERF were diagnosed with chronic bronchitis and bronchial asthma. There were no statistically significant changes in ERF depending on the severity of obesity (P> 0.05). Conclusion: The study showed the prevalence of ERF among physicians, observing mainly mild and moderate changes in ERF parameters.

Keywords: Armenia, external respiration function, health status, physicians.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 997

96 End-to-End Pyramid Based Method for MRI Reconstruction

Authors: Omer Cahana, Maya Herman, Ofer Levi

Abstract:

Magnetic Resonance Imaging (MRI) is a lengthy medical scan that stems from a long acquisition time. Its length is mainly due to the traditional sampling theorem, which defines a lower boundary for sampling. However, it is still possible to accelerate the scan by using a different approach such as Compress Sensing (CS) or Parallel Imaging (PI). These two complementary methods can be combined to achieve a faster scan with high-fidelity imaging. To achieve that, two conditions must be satisfied: i) the signal must be sparse under a known transform domain, and ii) the sampling method must be incoherent. In addition, a nonlinear reconstruction algorithm must be applied to recover the signal. While the rapid advances in Deep Learning (DL) have had tremendous successes in various computer vision tasks, the field of MRI reconstruction is still in its early stages. In this paper, we present an end-to-end method for MRI reconstruction from k-space to image. Our method contains two parts. The first is sensitivity map estimation (SME), which is a small yet effective network that can easily be extended to a variable number of coils. The second is reconstruction, which is a top-down architecture with lateral connections developed for building high-level refinement at all scales. Our method holds the state-of-art fastMRI benchmark, which is the largest, most diverse benchmark for MRI reconstruction.

Keywords: Accelerate MRI scans, image reconstruction, pyramid network, deep learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 266

95 Image Processing Approach for Detection of Three-Dimensional Tree-Rings from X-Ray Computed Tomography

Authors: Jorge Martinez-Garcia, Ingrid Stelzner, Joerg Stelzner, Damian Gwerder, Philipp Schuetz

Abstract:

Tree-ring analysis is an important part of the quality assessment and the dating of (archaeological) wood samples. It provides quantitative data about the whole anatomical ring structure, which can be used, for example, to measure the impact of the fluctuating environment on the tree growth, for the dendrochronological analysis of archaeological wooden artefacts and to estimate the wood mechanical properties. Despite advances in computer vision and edge recognition algorithms, detection and counting of annual rings are still limited to 2D datasets and performed in most cases manually, which is a time consuming, tedious task and depends strongly on the operator’s experience. This work presents an image processing approach to detect the whole 3D tree-ring structure directly from X-ray computed tomography imaging data. The approach relies on a modified Canny edge detection algorithm, which captures fully connected tree-ring edges throughout the measured image stack and is validated on X-ray computed tomography data taken from six wood species.

Keywords: Ring recognition, edge detection, X-ray computed tomography, dendrochronology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 750

94 Metaverse as a Form of Reality and the Impact of Metaverse in Higher Education

Authors: Josefina Bengoechea, Alex Bell

Abstract:

In the metaverse, the characters were avatars working in a 3-dimensional virtual reality. This virtual reality existed beyond reality. The metaverse is a “the post-reality universe”; a perpetual and persistent multiuser environment in which physical reality and digital virtuality are merged. The virtual infrastructure needed to build a metaverse (which is in the process of being created), are: web3 technologies, non-fungible tokens (NFTs), blockchain, smart contracts, and cryptocurrencies. Web3 refers to a new iteration of the actual web2. The actual web2 is dominated by powerful providers like Google, Apple, Amazon, and other corporate tech companies. The vision for web3 is a decentralized, and thus more equitable version of the web. The aim of this paper is, first, to present the Metaverse as a form of reality in which physical reality and digital virtuality combined to provide new experiences to users; second, to discuss the implications for education, specifically for higher education, and how programs will have to be modified so that the skills obtained by graduates match those demanded by the virtual labour market. This paper builds upon a constructivist approach, combining a literature review and research on key publications.

Keywords: Ethics in technology, cross realities, cryptocurrencies, labour market, metaverse, technology in higher education.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 636