Search results for: Vision Transformer.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 521

101 End-to-End Pyramid Based Method for MRI Reconstruction

Authors: Omer Cahana, Maya Herman, Ofer Levi

Abstract:

Magnetic Resonance Imaging (MRI) is a lengthy medical scan because of its long acquisition time. Its length is mainly due to the traditional sampling theorem, which defines a lower bound on the sampling rate. However, it is still possible to accelerate the scan by using a different approach such as Compressed Sensing (CS) or Parallel Imaging (PI). These two complementary methods can be combined to achieve a faster scan with high-fidelity imaging. To achieve that, two conditions must be satisfied: i) the signal must be sparse under a known transform domain, and ii) the sampling method must be incoherent. In addition, a nonlinear reconstruction algorithm must be applied to recover the signal. While the rapid advances in Deep Learning (DL) have had tremendous successes in various computer vision tasks, the field of MRI reconstruction is still in its early stages. In this paper, we present an end-to-end method for MRI reconstruction from k-space to image. Our method contains two parts. The first is sensitivity map estimation (SME), which is a small yet effective network that can easily be extended to a variable number of coils. The second is reconstruction, which is a top-down architecture with lateral connections developed for building high-level refinement at all scales. Our method holds the state-of-the-art result on the fastMRI benchmark, which is the largest, most diverse benchmark for MRI reconstruction.

Keywords: Accelerate MRI scans, image reconstruction, pyramid network, deep learning.

Downloads: 266
100 Image Processing Approach for Detection of Three-Dimensional Tree-Rings from X-Ray Computed Tomography

Authors: Jorge Martinez-Garcia, Ingrid Stelzner, Joerg Stelzner, Damian Gwerder, Philipp Schuetz

Abstract:

Tree-ring analysis is an important part of the quality assessment and the dating of (archaeological) wood samples. It provides quantitative data about the whole anatomical ring structure, which can be used, for example, to measure the impact of the fluctuating environment on the tree growth, for the dendrochronological analysis of archaeological wooden artefacts and to estimate the wood mechanical properties. Despite advances in computer vision and edge recognition algorithms, detection and counting of annual rings are still limited to 2D datasets and performed in most cases manually, which is a time consuming, tedious task and depends strongly on the operator’s experience. This work presents an image processing approach to detect the whole 3D tree-ring structure directly from X-ray computed tomography imaging data. The approach relies on a modified Canny edge detection algorithm, which captures fully connected tree-ring edges throughout the measured image stack and is validated on X-ray computed tomography data taken from six wood species.

Keywords: Ring recognition, edge detection, X-ray computed tomography, dendrochronology.

Downloads: 749
99 Metaverse as a Form of Reality and the Impact of Metaverse in Higher Education

Authors: Josefina Bengoechea, Alex Bell

Abstract:

In the metaverse, the characters were avatars working in a 3-dimensional virtual reality. This virtual reality existed beyond reality. The metaverse is a “post-reality universe”: a perpetual and persistent multiuser environment in which physical reality and digital virtuality are merged. The virtual infrastructure needed to build a metaverse (which is in the process of being created) comprises web3 technologies, non-fungible tokens (NFTs), blockchain, smart contracts, and cryptocurrencies. Web3 refers to a new iteration of the current web2. The current web2 is dominated by powerful providers like Google, Apple, Amazon, and other corporate tech companies. The vision for web3 is a decentralized, and thus more equitable, version of the web. The aim of this paper is, first, to present the Metaverse as a form of reality in which physical reality and digital virtuality combine to provide new experiences to users; second, to discuss the implications for education, specifically for higher education, and how programs will have to be modified so that the skills obtained by graduates match those demanded by the virtual labour market. This paper builds upon a constructivist approach, combining a literature review and research on key publications.

Keywords: Ethics in technology, cross realities, cryptocurrencies, labour market, metaverse, technology in higher education.

Downloads: 636
98 Latency-Based Motion Detection in Spiking Neural Networks

Authors: Mohammad Saleh Vahdatpour, Yanqing Zhang

Abstract:

Understanding the neural mechanisms underlying motion detection in the human visual system has long been a fascinating challenge in neuroscience and artificial intelligence. This paper presents a spiking neural network model inspired by the processing of motion information in the primate visual system, particularly focusing on the Middle Temporal (MT) area. In our study, we propose a multi-layer spiking neural network model to perform motion detection tasks, leveraging the idea that synaptic delays in neuronal communication are pivotal in motion perception. Synaptic delay, determined by factors like axon length and myelin insulation, affects the temporal order of input spikes, thereby encoding motion direction and speed. Overall, our spiking neural network model demonstrates the feasibility of capturing motion detection principles observed in the primate visual system. The combination of synaptic delays, learning mechanisms, and shared weights and delays in SMD provides a promising framework for motion perception in artificial systems, with potential applications in computer vision and robotics.

Keywords: Neural networks, motion detection, signature detection, convolutional neural network.

Downloads: 84
97 Classification Algorithms in Human Activity Recognition using Smartphones

Authors: Mohd Fikri Azli bin Abdullah, Ali Fahmi Perwira Negara, Md. Shohel Sayeed, Deok-Jai Choi, Kalaiarasi Sonai Muthu

Abstract:

Rapid advancement in computing technology will bring computers and humans to be seamlessly integrated in the future. The emergence of the smartphone has driven the computing era towards ubiquitous and pervasive computing. Recognizing human activity has garnered a lot of interest and has raised significant research concerns in identifying contextual information useful to human activity recognition. Not only are smartphones unobtrusive to users in daily life, they also have built-in sensors capable of sensing contextual information about their users, supported by a wide range of network connections. In this paper, we discuss the classification algorithms used in smartphone-based human activity recognition. Existing technologies pertaining to smartphone-based research in human activity recognition are highlighted and discussed. Our paper also presents our findings and opinions to formulate improvement ideas for current research trends. Understanding research trends will give researchers a clearer research direction and a common vision of the latest smartphone-based human activity recognition area.

Keywords: Classification algorithms, Human Activity Recognition (HAR), Smartphones

Downloads: 6254
96 Fast 3D Collision Detection Algorithm using 2D Intersection Area

Authors: Taehyun Yoon, Keechul Jung

Abstract:

There is much research on detecting collisions between real and virtual objects in 3D space. In general, these techniques need huge computing power, so many studies rely on cloud computing, network computing, and distributed computing. For this reason, this paper proposes a novel fast 3D collision detection algorithm between real and virtual objects using the 2D intersection area. The proposed algorithm uses four cameras and a coarse-and-fine method to improve the accuracy and speed of collision detection. In the coarse step, the system examines the intersection area between real and virtual object silhouettes from all camera views. The result of this step is the index of the virtual sensors that have a possibility of collision in 3D space. To decide on a collision accurately, in the fine step the system examines the collision in 3D space by using the visual hull algorithm. The performance of the algorithm is verified by comparison with an existing algorithm. We believe the proposed algorithm can help many other research, study and application fields such as HCI, augmented reality, intelligent space, and so on.

Keywords: Collision Detection, Computer Vision, Human Computer Interaction, Visual Hull

Downloads: 2373
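The coarse step described in the abstract of entry 96 can be illustrated with a small sketch. It assumes silhouettes are stored as sets of foreground pixels per camera view and that a virtual sensor stays a collision candidate only if its silhouette overlaps the real object's silhouette in every view. The function name, the `min_area` threshold and the sample data are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple

Pixel = Tuple[int, int]      # (row, col) in one camera image
Silhouette = Set[Pixel]      # foreground pixels of an object in one view


def coarse_collision_candidates(
    real_views: List[Silhouette],
    sensor_views: Dict[str, List[Silhouette]],
    min_area: int = 1,
) -> List[str]:
    """Return the virtual sensors whose silhouette overlaps the real object's
    silhouette by at least `min_area` pixels in every camera view."""
    candidates = []
    for name, views in sensor_views.items():
        # 2D intersection area (pixel count) in each camera view.
        overlaps = [len(real & virt) for real, virt in zip(real_views, views)]
        # Objects that are disjoint in any single view cannot touch in 3D.
        if all(area >= min_area for area in overlaps):
            candidates.append(name)
    return candidates


# Two hypothetical camera views (the abstract mentions four cameras).
real = [{(0, 0), (0, 1)}, {(2, 2), (2, 3)}]
sensors = {
    "sensor_a": [{(0, 1), (0, 2)}, {(2, 3)}],  # overlaps in both views
    "sensor_b": [{(0, 1)}, {(5, 5)}],          # separated in the second view
}
print(coarse_collision_candidates(real, sensors))
```

Only the surviving candidates would then need the more expensive 3D check in the fine step.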
95 AI-Based Techniques for Online Social Media Network Sentiment Analysis: A Methodical Review

Authors: A. M. John-Otumu, M. M. Rahman, O. C. Nwokonkwo, M. C. Onuoha

Abstract:

Online social media networks have long served as a primary arena for group conversations, gossip, text-based information sharing and distribution. The use of natural language processing techniques for text classification and unbiased decision making has not been far-fetched. Proper classification of this textual information in a given context has also been very difficult. As a result, a systematic review was conducted from previous literature on sentiment classification and AI-based techniques. The study was done in order to gain a better understanding of the process of designing and developing a robust and more accurate sentiment classifier that could correctly classify social media textual information of a given context between hate speech and inverted compliments with a high level of accuracy, using the knowledge gained from the evaluation of the different artificial intelligence techniques reviewed. The study evaluated over 250 articles from digital sources like ACM digital library, Google Scholar, and IEEE Xplore, and whittled down the number of studies to 52 articles. Findings revealed that deep learning approaches such as Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Bidirectional Encoder Representations from Transformers (BERT), and Long Short-Term Memory (LSTM) outperformed various machine learning techniques in terms of performance accuracy. A large dataset is also required to develop a robust sentiment classifier. Results also revealed that data can be obtained from places like Twitter, movie reviews, Kaggle, Stanford Sentiment Treebank (SST), and SemEval Task4 based on the required domain. Hybrid deep learning techniques like CNN+LSTM, CNN+Gated Recurrent Unit (GRU), and CNN+BERT outperformed single deep learning techniques and machine learning techniques. The Python programming language outperformed the Java programming language in terms of development simplicity and AI-based library functionalities. Finally, the study recommended the findings obtained for building a robust sentiment classifier in the future.

Keywords: Artificial Intelligence, Natural Language Processing, Sentiment Analysis, Social Network, Text.

Downloads: 503
94 Motion-Based Detection and Tracking of Multiple Pedestrians

Authors: A. Harras, A. Tsuji, K. Terada

Abstract:

Tracking of moving people has gained great importance due to rapid technological advancements in the field of computer vision. The objective of this study is to design a motion-based method for detecting and tracking multiple pedestrians walking randomly in different directions. In our proposed method, a Gaussian mixture model (GMM) is used to detect moving persons in image sequences. It adapts to changes that take place in the scene, such as varying illumination and moving objects that start and stop often. Background noise in the scene is eliminated by applying morphological operations, and the motion of tracked people is determined by using the Kalman filter. The Kalman filter is applied to predict the tracked location in each frame and to determine the likelihood of each detection. We used a benchmark data set for the evaluation based on a side wall stationary camera. The actual scenes from the data set are taken on a street with up to eight people in front of the camera in two different scenes, with durations of 53 and 35 seconds, respectively. When pedestrians walk in close proximity, the proposed method achieves a detection ratio of 87% and a tracking ratio of 77%. When they are apart from each other, the detection ratio increases to 90% and the tracking ratio to 79%.

Keywords: Automatic detection, tracking, pedestrians.

Downloads: 801
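The predict and update cycle that the abstract of entry 94 attributes to the Kalman filter can be sketched for a single image coordinate. This is a minimal constant-velocity filter written in plain Python; the class name, the noise values `q` and `r`, and the synthetic detections are assumptions for illustration and do not reproduce the authors' tracker.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Kalman1D:
    """Constant-velocity Kalman filter for one image coordinate."""
    pos: float = 0.0
    vel: float = 0.0
    # 2x2 state covariance, large at the start because the state is unknown.
    P: List[List[float]] = field(
        default_factory=lambda: [[100.0, 0.0], [0.0, 100.0]]
    )
    q: float = 0.01  # process noise added to each diagonal term (assumed)
    r: float = 1.0   # measurement noise variance (assumed)

    def predict(self, dt: float = 1.0) -> float:
        """Advance the state by one frame and return the predicted position."""
        (a, b), (c, d) = self.P
        self.pos += self.vel * dt
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]].
        self.P = [
            [a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
            [c + dt * d, d + self.q],
        ]
        return self.pos

    def update(self, z: float) -> None:
        """Correct the state with a detected position z."""
        (a, b), (c, d) = self.P
        s = a + self.r              # innovation variance
        k0, k1 = a / s, c / s       # Kalman gain for position and velocity
        y = z - self.pos            # innovation (detection minus prediction)
        self.pos += k0 * y
        self.vel += k1 * y
        # P <- (I - K H) P with H = [1, 0].
        self.P = [[(1 - k0) * a, (1 - k0) * b], [c - k1 * a, d - k1 * b]]


# Hypothetical pedestrian moving one pixel per frame along x.
track = Kalman1D()
for frame in range(1, 21):
    track.predict()
    track.update(float(frame))
print(round(track.pos, 2), round(track.vel, 2))
```

In a multi-pedestrian tracker, one such filter per coordinate and per track would supply the predicted location against which each new detection is compared.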
93 A Cooperative Multi-Robot Control Using Ad Hoc Wireless Network

Authors: Amira Elsonbaty, Rawya Rizk, Mohamed Elksas, Mofreh Salem

Abstract:

In this paper, a Cooperative Multi-robot for Carrying Targets (CMCT) algorithm is proposed. The multi-robot team consists of three robots: one is a supervisor and the others are workers that carry boxes in a store of 100×100 m². Each robot has a self-recharging mechanism. The CMCT minimizes the robots' working time for carrying many boxes during the day by working in parallel. That is, the supervisor detects the required variables at the same time as the other robots work with the previous variables. It works with straightforward mechanical models by using simple cosine laws. It finds the robot's shortest path to the target position while avoiding obstacles by using a proposed CMCT path planning (CMCT-PP) algorithm. It prevents collisions between robots while they move. The robots interact in an ad hoc wireless network. Simulation results show that the proposed system, which consists of the CMCT algorithm and its accompanying CMCT-PP algorithm, achieves a high improvement in time and distance over existing algorithms while performing the required tasks.

Keywords: Ad hoc network, Computer vision based positioning, Dynamic collision avoidance, Multi-robot, Path planning algorithms, Self recharging.

Downloads: 1752
92 New Vision of 'Social Europe': Renationalising the Integration Process in the Internal Market of the European Union

Authors: Robert Grzeszczak, Magdalena Gniadzik

Abstract:

The article deals with one of the most significant issues concerning the functioning of the internal market of the European Union – the free movement of workers and free movement of persons. The purpose is to identify the political and legal effects of the “renationalisation process” on the EU and its Member States. The concept of renationalisation is expressed through Member States’ aim to verify the relationship with the EU. The tendency is more visible in the public opinion of several MS’s of the ‘EU core’ and may be confirmed by the changes applied by the regulatory body. The thesis for the article is the return of renationalisation tendencies in the area of the Single Market, which is supported by, among others, an open criticism of the foundations of EU integration or considerations on withdrawal from the EU by some MS. This analysis will focus primarily on the effects that renationalisation may have on the free movement of persons. The free movement of persons is one of the key issues for the development of the European integration. It is still subject to theoretical reflections, new doubts and practical issues. The latest developments in politics, law and jurisprudence demonstrate the need to reflect on the attempts to redefine certain principles regarding migrant EU workers and their protection against nationality-based discrimination.

Keywords: European law, European Union, common market, free movement of workers, posting of workers, case law.

Downloads: 1033
91 Role of Facade in Sustainability Enhancement of Contemporary Iranian Buildings

Authors: H. Nejadriahi

Abstract:

A growing demand for sustainability makes it one of the significant debates of today. Energy saving is one of the main criteria to be considered in the context of sustainability. Reducing energy use in buildings is one of the most important ways to reduce humans’ overall environmental impact. Taking this into consideration, the study of different design strategies that can assist in reducing energy use, and subsequently improve the sustainability level of today's buildings, is an essential task. The sustainability level of a building is highly affected by the sustainability performance of its components. One of the main building components that can have a great impact on the energy saving and sustainability level of the building is its facade. The aim of this study is to investigate the role of the facade in the sustainability enhancement of the contemporary buildings of Iran. In this study, the concept of sustainability in architecture, building facades, and their relationship to sustainability are explained briefly. Following that, a number of contemporary Iranian buildings are discussed and analyzed in terms of the different design strategies used in their facades in accordance with sustainability concepts. The methods used in this study are descriptive and analytic. The results of this paper would assist in generating a wider vision and a source of inspiration for current designers to design and create environmental and sustainable buildings for the future.

Keywords: Building facade, contemporary buildings, Iran, sustainability.

Downloads: 944
90 Multi-VSS Scheme by Shifting Random Grids

Authors: Joy Jo-Yi Chang, Justie Su-Tzu Juan

Abstract:

Visual secret sharing (VSS) was proposed by Naor and Shamir in 1995. Visual secret sharing schemes encode a secret image into two or more share images, and a single share image cannot reveal any information about the secret image. When the shares are superimposed, the secret can be restored by human vision. Traditional VSS has some problems, like pixel expansion and the cost of sophisticated, and it can encode only one secret image. Schemes for encrypting more secret images by random grids into two shares were proposed by Chen et al. in 2008. But when the restored secret images have much distortion, those schemes are almost limited in decoding. In other words, if there is too much distortion, we cannot encrypt too much information. So, if we can reduce the distortion to a very small level, we can encrypt more secret images. In this paper, four new algorithms based on the scheme proposed by Chang et al. in 2010 are presented. The first algorithm reduces the distortion to a very small level. The second algorithm distributes the distortion between two restored secret images. The third algorithm achieves no distortion for special secret images. The fourth algorithm encrypts three secret images, which not only retains the advantage of VSS but also improves on the problems of decoding.

Keywords: Visual cryptography, visual secret sharing, random grids, multiple, secret image sharing

Downloads: 1489
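To make the random-grid idea in entry 90 concrete, the following sketch shows the basic two-share construction for a single binary secret image. It is a simplified illustration of the general technique, not the multi-secret shifting algorithms proposed in the paper; the function names and the sample image are hypothetical.

```python
import random
from typing import List, Tuple

Grid = List[List[int]]  # 0 = transparent (white), 1 = opaque (black)


def encode_random_grid(secret: Grid, rng: random.Random) -> Tuple[Grid, Grid]:
    """Split a binary secret into two shares of the same size."""
    # The first share is pure noise, so on its own it carries no information.
    share1 = [[rng.randint(0, 1) for _ in row] for row in secret]
    # The second share copies share1 under white secret pixels and
    # complements it under black secret pixels.
    share2 = [
        [s1 if pixel == 0 else 1 - s1 for s1, pixel in zip(row1, row)]
        for row1, row in zip(share1, secret)
    ]
    return share1, share2


def stack(share1: Grid, share2: Grid) -> Grid:
    """Superimpose two transparencies: a pixel is black if either is black."""
    return [[a | b for a, b in zip(r1, r2)] for r1, r2 in zip(share1, share2)]


secret = [[0, 1, 1, 0],
          [1, 0, 0, 1]]
s1, s2 = encode_random_grid(secret, random.Random(0))
for row in stack(s1, s2):
    print(row)
```

After stacking, every black secret pixel is black, while white secret pixels show the random pattern of the first share (about half black), which is the contrast the eye picks up. The shares have the same size as the secret, so there is no pixel expansion.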
89 On-line Recognition of Isolated Gestures of Flight Deck Officers (FDO)

Authors: Deniz T. Sodiri, Venkat V S S Sastry

Abstract:

The paper presents an on-line recognition machine (RM) for continuous/isolated, dynamic and static gestures that arise in Flight Deck Officer (FDO) training. RM is based on a generic pattern recognition framework. Gestures are represented as templates using summary statistics. The proposed recognition algorithm exploits temporal and spatial characteristics of gestures via dynamic programming and a Markovian process. The algorithm predicts the corresponding index of incremental input data in the templates in an on-line mode. Accumulated consistency in the sequence of predictions provides a similarity measurement (Score) between input data and the templates. The algorithm provides an intuitive mechanism for automatic detection of start/end frames of continuous gestures. In the present paper, we consider isolated gestures. The performance of RM is evaluated using four datasets: artificial (W TTest), hand motion (Yang) and FDO (tracker, vision-based). RM achieves comparable results which are in agreement with other on-line and off-line algorithms such as hidden Markov model (HMM) and dynamic time warping (DTW). The proposed algorithm has the additional advantage of providing timely feedback for training purposes.

Keywords: On-line Recognition Algorithm, Isolated Dynamic/Static Gesture Recognition, On-line Markovian/Dynamic Programming, Training in Virtual Environments.

Downloads: 1294
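The abstract of entry 89 names dynamic time warping (DTW) as one of the off-line baselines it is compared with. For readers unfamiliar with it, here is a minimal sketch of DTW-based template matching on one-dimensional sequences. It illustrates the baseline only, not the authors' recognition machine; the template names and values are made up.

```python
from typing import Dict, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic-programming DTW with absolute difference as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify(sample: Sequence[float], templates: Dict[str, Sequence[float]]) -> str:
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))


# Hypothetical gesture templates (e.g. one joint angle over time).
templates = {"raise": [0.0, 1.0, 2.0, 3.0], "wave": [0.0, 1.0, 0.0, 1.0]}
print(classify([0.0, 0.0, 1.0, 2.0, 2.0, 3.0], templates))
```

Because DTW needs the whole input sequence before it can score it, it is an off-line method, which is the contrast the abstract draws with the proposed on-line algorithm.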
88 Real-time 3D Feature Extraction without Explicit 3D Object Reconstruction

Authors: Kwangjin Hong, Chulhan Lee, Keechul Jung, Kyoungsu Oh

Abstract:

For communication between human and computer in an interactive computing environment, gesture recognition is studied vigorously. Therefore, many studies have proposed efficient recognition algorithms using images captured by 2D cameras. However, these methods have a limitation: the extracted features cannot fully represent the object in the real world. Although many studies have used 3D features instead of 2D features for more accurate gesture recognition, problems such as the processing time needed to generate 3D objects remain unsolved in related research. Therefore we propose a method to extract the 3D features combined with the 3D object reconstruction. This method uses a modified GPU-based visual hull generation algorithm, which disables unnecessary processes such as texture calculation, to generate three kinds of 3D projection maps as the 3D feature: a nearest boundary, a farthest boundary, and a thickness of the object projected on the base-plane. In the experimental results section, we present results of the proposed method on eight human postures: T shape, both hands up, right hand up, left hand up, hands front, stand, sit and bend, and compare the computational time of the proposed method with that of previous methods.

Keywords: Fast 3D Feature Extraction, Gesture Recognition, Computer Vision.

Downloads: 1601
87 The Governance of Islamic Banks in Morocco: Meaning, Strategic Vision and Purposes Attributed to the Governance System

Authors: Lalla Nezha Lakmiti, Abdelkahar Zahid

Abstract:

Due to the setbacks on the international scene and the wave of cacophonic financial scandals affecting large international groups, the new Islamic finance industry is not immune despite its initial resistance. The purpose of this paper is to understand and analyze the meaning of the Corporate Governance (CG) concept in Moroccan Islamic banking systems with specific reference to their institutions. The research objective is also to identify the path taken and adopted by these banks recently set up in Morocco. The foundation is rooted in shari'a; in particular, no stakeholder (the shareholding approach) must be harmed, and the ethical value is reflected in these parties’ behavior. We chose a qualitative method, semi-structured interviews, in which six managers provided answers about their banking systems. Since these respondents held a senior position (directors) within their organizations, it is felt that they are well placed and have the necessary knowledge to provide us with information to answer the questions asked. The results identified the orientation of participating banks and assessed how governance works, while determining which party is favoured: shareholders, stakeholders or both. This study discusses the conditions favorable to the harmonization of the regulations and therefore a better integration between Islamic finance and conventional finance in the economic context of Morocco.

Keywords: Corporate governance, participating banks, stakeholders, shareholders, and interests.

Downloads: 839
86 A Medical Images Based Retrieval System using Soft Computing Techniques

Authors: Pardeep Singh, Sanjay Sharma

Abstract:

Content-Based Image Retrieval (CBIR) has been one of the most vivid research areas in the field of computer vision over the last 10 years. Many programs and tools have been developed to formulate and execute queries based on visual or audio content and to help in browsing large multimedia repositories. Still, no general breakthrough has been achieved with respect to large varied databases with documents of differing sorts and with varying characteristics. Many questions with respect to speed, semantic descriptors or objective image interpretations are still unanswered. In the medical field, images, and especially digital images, are produced in ever increasing quantities and used for diagnostics and therapy. In several articles, content-based access to medical images for supporting clinical decision making has been proposed that would ease the management of clinical data, and scenarios for the integration of content-based access methods into Picture Archiving and Communication Systems (PACS) have been created. This paper gives an overview of soft computing techniques. New research directions are being defined that can prove to be useful. Still, there are very few systems that seem to be used in clinical practice. It needs to be stated as well that the goal is not, in general, to replace text-based retrieval methods as they exist at the moment.

Keywords: CBIR, GA, Rough sets, CBMIR

Downloads: 2578
85 Study on the Impact of Size and Position of the Shear Field in Determining the Shear Modulus of Glulam Beam Using Photogrammetry Approach

Authors: Niaz Gharavi, Hexin Zhang

Abstract:

The shear modulus of a timber beam can be determined using the torsion test or the shear field test method. The shear field test method is based on shear distortion measurement of the beam at the zone with constant transverse load in the standardized four-point bending test. The current code of practice advises using two metallic arms acting as an instrument to measure the diagonal displacement of the constructing square. The size and the position of the constructing square might influence the shear modulus determination. This study aimed to investigate the size and the position effect of the square in the shear field test method. A binocular stereo vision system has been employed to determine the 3D displacement of a grid of target points. Six glue laminated beams were produced and tested. Analysis of Variance (ANOVA) was performed on the acquired data to evaluate the significance of the size effect and the position effect of the square. The results have shown that the size of the square has a noticeable influence on the value of shear modulus, while the position of the square within the area with constant shear force does not affect the measured mean shear modulus.

Keywords: Shear field test method, structural-sized test, shear modulus of Glulam beam, photogrammetry approach.

Downloads: 961
84 The Role of Local Government Authorities in Managing the Pre-Hospital Emergency Medical Service (EMS) Systems in Thailand

Authors: Chanisada Choosuk, Napisporn Memongkol, Runchana Sinthavalai, Fareeda Lambensah

Abstract:

The objective of this research is to explore the role of actors at the local level in managing the Pre-hospital Emergency Medical Service (EMS) system in Thailand. The research was carried out through documentary research, individual interviews, and one forum conducted in each province. This paper uses the case of three provinces located in three regions of Thailand: Ubon Ratchathani (North-eastern region), Lampang (Northern Region), and Songkhla (Southern Region). The results show that, recently, the role of local government as the service provider for local people has been of increasing concern. The key success factors for the EMS system are: (i) the local executives' vision and influence, in that the decisions made by them, for both the PAO (Provincial Administration Organisation) and the TAO (Tambon Administration Organisation), are vital to address the overall challenges in EMS development; (ii) the administrative system, where reforming the working style creates flexibility in running the EMS task; (iii) network-based management among different agencies at the local level, which leads to better EMS practices; and (iv) human resource development, which is vital in delivering effective services.

Keywords: Local governments, Management, Emergency Medical Services (EMS), Thailand

Downloads: 1570
83 A Robust Diverged Localization and Recognition of License Registration Characters

Authors: M. Sankari, R. Bremananth, C. Meena

Abstract:

Localization and recognition of license registration characters from a moving vehicle is a computationally complex task in the field of machine vision and is of substantial interest because of its diverse applications such as cross-border security, law enforcement and various other intelligent transportation applications. Previous research used plate-specific details such as aspect ratio, character style, color or dimensions of the plate in the complex task of plate localization. In this paper, license registration characters are localized by the Enhanced Weight based density map (EWBDM) method, which is independent of such constraints. Building on our previous method, this paper proposes a method that relaxes constraints on lighting conditions, different character fonts occurring in the plate and plates with hand-drawn characters in various aspect quotients. The robustness of this method makes it well suited for applications where the appearance of plates varies widely. Experimental results show that this approach is suited for recognizing license plates in different external environments.

Keywords: Character segmentation, Connectivity checking, Edge detection, Image analysis, license plate localization, license number recognition, Quality frame selection

Downloads: 1857
82 MITOS-RCNN: Mitotic Figure Detection in Breast Cancer Histopathology Images Using Region Based Convolutional Neural Networks

Authors: Siddhant Rao

Abstract:

Studies estimate that there will be 266,120 new cases of invasive breast cancer and 40,920 breast cancer induced deaths in the year 2018 alone. Despite the pervasiveness of this affliction, the current process to obtain an accurate breast cancer prognosis is tedious and time consuming. It usually requires a trained pathologist to manually examine histopathological images and identify the features that characterize various cancer severity levels. We propose MITOS-RCNN: a region based convolutional neural network (RCNN) geared for small object detection to accurately grade one of the three factors that characterize tumor belligerence described by the Nottingham Grading System: mitotic count. Other computational approaches to mitotic figure counting and detection do not demonstrate ample recall or precision to be clinically viable. Our models outperformed all previous participants in the ICPR 2012 challenge, the AMIDA 2013 challenge and the MITOS-ATYPIA-14 challenge along with recently published works. Our model achieved an F-measure score of 0.955, a 6.11% improvement in accuracy from the most accurate of the previously proposed models.

Keywords: Object detection, histopathology, breast cancer, mitotic count, deep learning, computer vision.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1340
81 Recognition Machine (RM) for On-line and Isolated Flight Deck Officer (FDO) Gestures

Authors: Deniz T. Sodiri, Venkat V S S Sastry

Abstract:

The paper presents an on-line recognition machine (RM) for continuous/isolated, dynamic and static gestures that arise in Flight Deck Officer (FDO) training. RM is based on a generic pattern recognition framework. Gestures are represented as templates using summary statistics. The proposed recognition algorithm exploits temporal and spatial characteristics of gestures via dynamic programming and a Markovian process. The algorithm predicts the corresponding index of incremental input data in the templates in an on-line mode. Accumulated consistency in the sequence of predictions provides a similarity measurement (Score) between input data and the templates. The algorithm provides an intuitive mechanism for automatic detection of start/end frames of continuous gestures. In the present paper, we consider isolated gestures. The performance of RM is evaluated using four datasets: artificial (W TTest), hand motion (Yang) and FDO (tracker, vision-based). RM achieves comparable results, which are in agreement with other on-line and off-line algorithms such as the hidden Markov model (HMM) and dynamic time warping (DTW). The proposed algorithm has the additional advantage of providing timely feedback for training purposes.
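
As an illustration of the dynamic-programming idea behind one of the baselines named above, the following sketch computes a classic dynamic time warping (DTW) distance between two one-dimensional sequences. It is an off-line textbook formulation, not the RM algorithm itself; the function name and the use of scalar samples are assumptions made for brevity.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two sequences of numbers."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local distance between samples
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

A gesture performed more slowly than its template still yields a small distance, because repeated samples can be matched to a single template sample.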

Keywords: On-line Recognition Algorithm, Isolated Dynamic/Static Gesture Recognition, On-line Markovian/Dynamic Programming, Training in Virtual Environments.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1427
80 An Empirical Study on Switching Activation Functions in Shallow and Deep Neural Networks

Authors: Apoorva Vinod, Archana Mathur, Snehanshu Saha

Abstract:

Though there exists a plethora of Activation Functions (AFs) used in single and multiple hidden layer Neural Networks (NN), their behavior has always raised curiosity, whether they are used in combination or singly. The popular AFs (Sigmoid, ReLU, and Tanh) have performed prominently well for shallow and deep architectures. Most of the time, AFs are used singly in multi-layered NNs and, to the best of our knowledge, their performance has never been studied and analyzed in depth when used in combination. In this manuscript, we experiment on multi-layered NN architectures (both shallow and deep; Convolutional NN and VGG16) and investigate how well the network responds to using two different AFs (Sigmoid-Tanh, Tanh-ReLU, ReLU-Sigmoid) alternately, compared with a traditional, single-AF combination (Sigmoid-Sigmoid, Tanh-Tanh, ReLU-ReLU). Our results show that, when using two different AFs, the network achieves better accuracy, substantially lower loss, and faster convergence on 4 computer vision (CV) and 15 non-CV (NCV) datasets. When using different AFs, not only was the accuracy greater by 6-7%, but convergence was also twice as fast. We present a case study to investigate the probability of networks suffering vanishing and exploding gradients when using two different AFs. Additionally, we show theoretically that a composition of two or more AFs satisfies the Universal Approximation Theorem (UAT).
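
The idea of alternating two AFs across layers can be sketched in a few lines. The following is a minimal pure-Python forward pass, assuming a fully connected network whose layers cycle through a given pattern of activation names; the `forward` helper, the layer format, and the toy weights are hypothetical and do not reproduce the authors' Convolutional NN or VGG16 experiments.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    return max(0.0, x)


ACTIVATIONS = {"sigmoid": sigmoid, "tanh": math.tanh, "relu": relu}


def forward(x, layers, pattern):
    """Forward pass through dense layers.

    layers:  list of (weights, biases), weights being a list of rows.
    pattern: tuple of activation names, cycled over the layers, e.g.
             ("tanh", "relu") alternates, ("relu", "relu") is the
             traditional single-AF setting.
    """
    for idx, (weights, biases) in enumerate(layers):
        act = ACTIVATIONS[pattern[idx % len(pattern)]]
        x = [act(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x


# Toy two-layer network (hypothetical weights).
toy_layers = [([[1.0, -1.0]], [0.0]), ([[2.0]], [0.0])]
alternating = forward([3.0, 1.0], toy_layers, ("tanh", "relu"))
single = forward([3.0, 1.0], toy_layers, ("relu", "relu"))
```

Keeping the pattern as data makes it straightforward to compare the alternating and single-AF settings on the same weights.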

Keywords: Activation Function, Universal Approximation function, Neural Networks, convergence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 99
79 Fuzzy Sequential Algorithm for Discrimination and Decision Maker in Sporting Events

Authors: Mourad Moussa, Ali Douik, Hassani Messaoud

Abstract:

Event discrimination and decision making in the sports field are the subject of many interesting studies in computer vision and artificial intelligence. A large volume of research has been conducted on automatic semantic event detection and summarization of sports videos. The results of this research make a significant contribution both to television broadcasters and to football teams, since the result of a sporting event can have economic repercussions. In this paper, we propose a novel fuzzy sequential technique that discriminates events and specifies the technico-tactics in play during the game. Neither the fuzzy system nor the sequential one is able, on its own, to answer the question posed: the fuzzy process is not sufficient because it does not respect the chronological order of the various events; similarly, the sequential process needs flexibility in the parameters used in this study. The proposed technique assigns a membership degree to each parameter on the one hand and respects the sequencing of events for each frame on the other. It describes special events such as dribbling, headings, short sprints, rapid acceleration or deceleration, turning, jumping, kicking, ball occupation, and tackling according to the velocity vectors of the two players and the ball direction.
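
To make the combination of fuzzy membership and chronological ordering concrete, the following sketch fuzzifies a player's speed per frame and keeps the resulting labels in frame order. It is only an illustration under stated assumptions: the triangular membership shape, the three speed classes, and their breakpoints (in m/s) are hypothetical and are not taken from the paper.

```python
import math


def triangular(x, a, b, c):
    """Triangular membership: 0 at or outside a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets over player speed in m/s: (left, peak, right).
SPEED_SETS = {
    "walk": (0.0, 1.5, 3.0),
    "run": (2.0, 4.5, 7.0),
    "sprint": (6.0, 8.5, 11.0),
}


def fuzzify_speed(vx, vy):
    """Membership degree of each speed class for one velocity vector."""
    speed = math.hypot(vx, vy)
    return {name: triangular(speed, *abc) for name, abc in SPEED_SETS.items()}


def label_sequence(velocities):
    """Per-frame label with the highest membership, in chronological order."""
    labels = []
    for vx, vy in velocities:
        memberships = fuzzify_speed(vx, vy)
        labels.append(max(memberships, key=memberships.get))
    return labels
```

A sequential rule could then be written over the ordered labels, for example treating a "walk" followed by a "sprint" within a few frames as a rapid acceleration.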

Keywords: Sequential process, Event detection, Soccer videos analysis, Fuzzy process, Spatio-temporal parameters.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1840
78 Efficient Boosting-Based Active Learning for Specific Object Detection Problems

Authors: Thuy Thi Nguyen, Nguyen Dang Binh, Horst Bischof

Abstract:

In this work, we present a novel active learning approach for learning a visual object detection system. Our system is composed of an active learning mechanism acting as a wrapper around a sub-algorithm that implements an online boosting-based object detector. At its core is a combination of a bootstrap procedure and a semi-automatic learning process based on the online boosting procedure. The idea is to exploit the availability of the classifier during learning to automatically label training samples and incrementally improve the classifier. This reduces the labeling effort while obtaining better performance. In addition, we propose a verification process for further improvement of the classifier. The idea is to allow re-updates on previously seen data during learning in order to stabilize the detector. The main contribution of this empirical study is a demonstration that active learning based on an online boosting approach trained in this manner can achieve results comparable to, or even better than, a framework trained in a conventional manner using much more labeling effort. Empirical experiments on challenging data sets for specific object detection problems show the effectiveness of our approach.

Keywords: Computer vision, object detection, online boosting, active learning, labeling complexity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1749
77 Modeling Directional Thermal Radiance Anisotropy for Urban Canopy

Authors: Limin Zhao, Xingfa Gu, C. Tao Yu

Abstract:

One of the significant factors for improving the accuracy of Land Surface Temperature (LST) retrieval is a correct understanding of the directional anisotropy of thermal radiance. In this paper, the multiple scattering effect between heterogeneous non-isothermal surfaces is described rigorously according to the concept of the configuration factor, based on which a directional thermal radiance model is built, and the directional radiant character of the urban canopy is analyzed. The model is applied to a simple urban canopy with a row structure to simulate the change of Directional Brightness Temperature (DBT). The results show that the DBT is increased by the multiple scattering effects, whereas its range of variation is smoothed. The temperature difference, spatial distribution, and emissivity of the components can all lead to a change of DBT. The "hot spot" phenomenon occurs when the proportion of the high-temperature component in the field of view reaches its maximum. Conversely, the "cool spot" phenomenon occurs when the proportion of the low-temperature component reaches its maximum. The "spot" effect disappears only when the proportion of every component remains constant. The model built in this paper can be used for the study of the directional effect on emissivity, LST retrieval over urban areas, and the adjacency effect of thermal remote sensing pixels.

Keywords: Directional thermal radiance, multiple scattering, configuration factor, urban canopy, hot spot effect

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1564
76 Intelligent Video-Based Monitoring of Freeway Traffic

Authors: Saad M. Al-Garni, Adel A. Abdennour

Abstract:

Freeways are originally designed to provide high mobility to road users. However, the increase in population and vehicle numbers has led to increasing congestion around the world. Daily recurrent congestion substantially reduces freeway capacity when it is most needed. Building new highways and expanding the existing ones is an expensive solution and impractical in many situations. Intelligent and vision-based techniques can, however, be efficient tools for monitoring highways and increasing the capacity of the existing infrastructure. The crucial step for highway monitoring is vehicle detection. In this paper, we propose one such technique. The approach is based on artificial neural networks (ANN) for vehicle detection and counting. The detection process uses the freeway video images and starts by automatically extracting the image background from successive video frames. Once the background is identified, subsequent frames are used to detect moving objects through image subtraction. The result is segmented using the Sobel operator for edge detection. The ANN is then used in the detection and counting phase. Applying this technique to the busiest freeway in Riyadh (King Fahd Road) achieved higher than 98% detection accuracy despite light intensity changes, occlusion, and shadows.
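
The pre-processing steps described above (background extraction, image subtraction, Sobel edge detection) can be sketched on small grayscale images stored as nested lists. This is an illustration only: the per-pixel median used to estimate the background, the difference threshold, and all function names are assumptions, since the abstract does not specify how the background is extracted, and the ANN stage is not shown.

```python
from statistics import median


def estimate_background(frames):
    """Per-pixel median over successive frames (an assumed estimator)."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[median(f[r][c] for f in frames) for c in range(cols)]
            for r in range(rows)]


def subtract_background(frame, background, threshold=30):
    """Binary mask of pixels that differ from the background."""
    return [[1 if abs(p - b) > threshold else 0
             for p, b in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]


def sobel_magnitude(image):
    """Gradient magnitude with 3x3 Sobel kernels; border pixels stay 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = ((image[r - 1][c + 1] + 2 * image[r][c + 1] + image[r + 1][c + 1])
                  - (image[r - 1][c - 1] + 2 * image[r][c - 1] + image[r + 1][c - 1]))
            gy = ((image[r + 1][c - 1] + 2 * image[r + 1][c] + image[r + 1][c + 1])
                  - (image[r - 1][c - 1] + 2 * image[r - 1][c] + image[r - 1][c + 1]))
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out
```

In a full pipeline, the edge map of the foreground mask would be passed on to the detection and counting stage.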

Keywords: Background Extraction, Neural Networks, Vehicle Detection, Freeway Traffic.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1874
75 Providing a Secure, Reliable and Decentralized Document Management Solution Using Blockchain by a Virtual Identity Card

Authors: Meet Shah, Ankita Aditya, Dhruv Bindra, V. S. Omkar, Aashruti Seervi

Abstract:

In today's world, we need documents everywhere for a smooth workflow in the identification process or other security-related procedures. The current systems and techniques used for identification need one thing, namely "proof of existence", which involves valid documents, for example, educational or financial ones. The main issue with the current identity access management system and digital identification process is that the system is centralized in its network, which makes it inefficient. This paper presents a system that resolves these issues. It is based on "blockchain" technology, which is a "decentralized system". It allows transactions in a decentralized and immutable manner. The primary notion of the model is to "have everything with nothing". It involves inter-linking the required documents of a person with a single identity card so that the person can go anywhere without carrying the required documents. The person just needs to be physically present at the place where the documents are necessary, and the rest of the verification proceeds using a fingerprint impression and an iris scan. Furthermore, some technical overheads and advancements are listed. This paper also aims to lay out a long-term vision of blockchain and its impact on future trends.
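
The immutability property mentioned above comes from linking records by cryptographic hashes. The sketch below shows this on a single machine with a simple hash chain of document digests; it is not the authors' system and omits everything that makes a blockchain decentralized (peers, consensus, signatures, biometric matching). The block layout and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def make_block(document_digest, previous_hash):
    """Create a block whose hash covers the document digest and its parent."""
    payload = json.dumps({"doc": document_digest, "prev": previous_hash},
                         sort_keys=True)
    return {"doc": document_digest, "prev": previous_hash,
            "hash": hashlib.sha256(payload.encode()).hexdigest()}


def build_chain(documents):
    """Link the SHA-256 digests of raw documents (bytes) into a chain."""
    chain, prev = [], GENESIS
    for doc in documents:
        block = make_block(hashlib.sha256(doc).hexdigest(), prev)
        chain.append(block)
        prev = block["hash"]
    return chain


def is_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    prev = GENESIS
    for block in chain:
        recomputed = make_block(block["doc"], block["prev"])["hash"]
        if block["prev"] != prev or recomputed != block["hash"]:
            return False
        prev = block["hash"]
    return True
```

Changing any stored digest invalidates the hash of its block and, through the links, every later block, which is what makes tampering detectable.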

Keywords: Blockchain, decentralized system, fingerprint impression, identity management, iris scan.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1225
74 SIFT Accordion: A Space-Time Descriptor Applied to Human Action Recognition

Authors: Olfa Ben Ahmed, Mahmoud Mejdoub, Chokri Ben Amar

Abstract:

Recognizing human action from videos is an active field of research in computer vision and pattern recognition. Human activity recognition has many potential applications, such as video surveillance, human-machine interaction, sports video retrieval, and robot navigation. Currently, local descriptors and bag of visual words models achieve state-of-the-art performance for human action recognition. The main challenge in feature description is how to represent the local motion information efficiently. Most previous works focus on extending 2D local descriptors to 3D ones in order to describe local information around every interest point. In this paper, we propose a new spatio-temporal descriptor based on a space-time description of moving points. Our description relies on an Accordion representation of video, which is well suited to recognizing human action from 2D local descriptors without the need for 3D extensions. We use the bag of words approach to represent videos. We quantize 2D local descriptors describing both temporal and spatial features, with a good compromise between computational complexity and action recognition rates. We have reached impressive results on publicly available action data sets.
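
The bag of words representation mentioned above can be sketched as follows: each local descriptor is assigned to its nearest codebook entry, and the video is summarized by the normalized histogram of assignments. This is a generic illustration rather than the authors' pipeline; the toy 2-D descriptors and the fixed codebook are hypothetical (real SIFT descriptors are 128-dimensional, and a codebook is typically learned by clustering).

```python
def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (squared L2)."""
    return min(range(len(codebook)),
               key=lambda k: sum((d - c) ** 2
                                 for d, c in zip(descriptor, codebook[k])))


def bag_of_words(descriptors, codebook):
    """Normalized histogram of visual-word assignments for one video."""
    hist = [0] * len(codebook)
    for descriptor in descriptors:
        hist[nearest_word(descriptor, codebook)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

The resulting fixed-length histogram can be fed to any standard classifier, regardless of how many descriptors each video produced.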

Keywords: Accordion, Bag of Features, Human action, Motion, Moving point, Space-Time Descriptor, SIFT, Video.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2070
73 View-Point Insensitive Human Pose Recognition using Neural Network

Authors: Sanghyeok Oh, Yunli Lee, Kwangjin Hong, Kirak Kim, Keechul Jung

Abstract:

This paper proposes a view-point insensitive human pose recognition system using a neural network. The recognition system consists of a silhouette image capturing module, a data-driven database, and a neural network. The advantages of our system are as follows. First, it is possible to capture multiple view-point silhouette images of a 3D human model automatically. This automatic capture module helps to reduce the time-consuming task of database construction. Second, we develop a large feature database to offer view-point insensitivity in pose recognition. Third, we use a neural network to recognize human pose from multiple views because every pose from each model has similar feature patterns, even though each model has a different appearance and view-point. To construct the database, we need to create a 3D human model using 3D manipulation tools. Contour shape is used to convert a silhouette image to a 12-dimensional feature vector. This extraction task is processed semi-automatically, with the benefit that capturing images in a real environment and converting them to silhouette images is unnecessary. We demonstrate the effectiveness of our approach with experiments in a virtual environment.

Keywords: Computer vision, neural network, pose recognition, view-point insensitive.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1291
72 3D Star Skeleton for Fast Human Posture Representation

Authors: Sungkuk Chun, Kwangjin Hong, Keechul Jung

Abstract:

In this paper, we propose an improved 3D star skeleton technique, which is a skeletonization suitable for human posture representation that reflects the 3D information of human posture. Moreover, the proposed technique is simple and can therefore be performed in real time. The existing skeleton construction techniques, such as distance transformation, Voronoi diagram, and thinning, focus on the precision of skeleton information. Therefore, those techniques are not applicable to real-time posture recognition, since they are computationally expensive and highly susceptible to boundary noise. Although a 2D star skeleton was proposed to address these problems, it is limited in its ability to describe the 3D information of the posture. To represent human posture effectively, the constructed skeleton should consider the 3D information of the posture. The proposed 3D star skeleton contains 3D data of the human and focuses on human action and posture recognition. Our 3D star skeleton uses 8 projection maps, which hold 2D silhouette information and depth data of the human surface. The extremal points can be extracted as the features of the 3D star skeleton without searching the whole boundary of the object. Therefore, in terms of execution time, our 3D star skeleton is faster than the "greedy" 3D star skeleton, which uses all the boundary points on the surface. Moreover, our method can offer a more accurate skeleton of the posture than the existing star skeleton, since the 3D data of the object is taken into account. Additionally, we build a codebook, a collection of representative 3D star skeletons for 7 postures, to recognize which posture a constructed skeleton represents.
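
As background for the extremal points mentioned above, the following sketch shows the basic 2D star skeleton idea: measure the distance from the silhouette centroid to each boundary point and keep the local maxima of that distance as extremities. It is a simplified 2D illustration, not the proposed 3D method; it assumes an ordered, closed contour, uses the mean of the boundary points as the centroid, and omits the smoothing of the distance signal that is normally applied to noisy contours.

```python
import math


def star_skeleton(boundary):
    """Return (centroid, extremal_points) for an ordered closed contour.

    boundary: list of (x, y) points listed in order around the silhouette.
    """
    n = len(boundary)
    cx = sum(x for x, _ in boundary) / n
    cy = sum(y for _, y in boundary) / n
    # Distance from the centroid to every boundary point, in contour order.
    dist = [math.hypot(x - cx, y - cy) for x, y in boundary]
    # Local maxima of the cyclic distance signal are the extremal points.
    extremal = [boundary[i] for i in range(n)
                if dist[i] > dist[i - 1] and dist[i] >= dist[(i + 1) % n]]
    return (cx, cy), extremal
```

Connecting the centroid to each extremal point gives the "star" that serves as a compact posture feature.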

Keywords: computer vision, gesture recognition, skeletonization, human posture representation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2072