Search results for: pose normalization
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 662

Search results for: pose normalization

662 Pose Normalization Network for Object Classification

Authors: Bingquan Shen

Abstract:

Convolutional Neural Networks (CNN) have demonstrated their effectiveness in synthesizing 3D views of object instances at various viewpoints. Given the problem where one have limited viewpoints of a particular object for classification, we present a pose normalization architecture to transform the object to existing viewpoints in the training dataset before classification to yield better classification performance. We have demonstrated that this Pose Normalization Network (PNN) can capture the style of the target object and is able to re-render it to a desired viewpoint. Moreover, we have shown that the PNN improves the classification result for the 3D chairs dataset and ShapeNet airplanes dataset when given only images at limited viewpoint, as compared to a CNN baseline.

Keywords: convolutional neural networks, object classification, pose normalization, viewpoint invariant

Procedia PDF Downloads 301
661 Applying Spanning Tree Graph Theory for Automatic Database Normalization

Authors: Chetneti Srisa-an

Abstract:

In Knowledge and Data Engineering field, relational database is the best repository to store data in a real world. It has been using around the world more than eight decades. Normalization is the most important process for the analysis and design of relational databases. It aims at creating a set of relational tables with minimum data redundancy that preserve consistency and facilitate correct insertion, deletion, and modification. Normalization is a major task in the design of relational databases. Despite its importance, very few algorithms have been developed to be used in the design of commercial automatic normalization tools. It is also rare technique to do it automatically rather manually. Moreover, for a large and complex database as of now, it make even harder to do it manually. This paper presents a new complete automated relational database normalization method. It produces the directed graph and spanning tree, first. It then proceeds with generating the 2NF, 3NF and also BCNF normal forms. The benefit of this new algorithm is that it can cope with a large set of complex function dependencies.

Keywords: relational database, functional dependency, automatic normalization, primary key, spanning tree

Procedia PDF Downloads 323
660 Online Pose Estimation and Tracking Approach with Siamese Region Proposal Network

Authors: Cheng Fang, Lingwei Quan, Cunyue Lu

Abstract:

Human pose estimation and tracking are to accurately identify and locate the positions of human joints in the video. It is a computer vision task which is of great significance for human motion recognition, behavior understanding and scene analysis. There has been remarkable progress on human pose estimation in recent years. However, more researches are needed for human pose tracking especially for online tracking. In this paper, a framework, called PoseSRPN, is proposed for online single-person pose estimation and tracking. We use Siamese network attaching a pose estimation branch to incorporate Single-person Pose Tracking (SPT) and Visual Object Tracking (VOT) into one framework. The pose estimation branch has a simple network structure that replaces the complex upsampling and convolution network structure with deconvolution. By augmenting the loss of fully convolutional Siamese network with the pose estimation task, pose estimation and tracking can be trained in one stage. Once trained, PoseSRPN only relies on a single bounding box initialization and producing human joints location. The experimental results show that while maintaining the good accuracy of pose estimation on COCO and PoseTrack datasets, the proposed method achieves a speed of 59 frame/s, which is superior to other pose tracking frameworks.

Keywords: computer vision, pose estimation, pose tracking, Siamese network

Procedia PDF Downloads 117
659 Deep Learning Based 6D Pose Estimation for Bin-Picking Using 3D Point Clouds

Authors: Hesheng Wang, Haoyu Wang, Chungang Zhuang

Abstract:

Estimating the 6D pose of objects is a core step for robot bin-picking tasks. The problem is that various objects are usually randomly stacked with heavy occlusion in real applications. In this work, we propose a method to regress 6D poses by predicting three points for each object in the 3D point cloud through deep learning. To solve the ambiguity of symmetric pose, we propose a labeling method to help the network converge better. Based on the predicted pose, an iterative method is employed for pose optimization. In real-world experiments, our method outperforms the classical approach in both precision and recall.

Keywords: pose estimation, deep learning, point cloud, bin-picking, 3D computer vision

Procedia PDF Downloads 125
658 Basic Calibration and Normalization Techniques for Time Domain Reflectometry Measurements

Authors: Shagufta Tabassum

Abstract:

The study of dielectric properties in a binary mixture of liquids is very useful to understand the liquid structure, molecular interaction, dynamics, and kinematics of the mixture. Time-domain reflectometry (TDR) is a powerful tool for studying the cooperation and molecular dynamics of the H-bonded system. In this paper, we discuss the basic calibration and normalization procedure for time-domain reflectometry measurements. Our approach is to explain the different types of error occur during TDR measurements and how these errors can be eliminated or minimized.

Keywords: time domain reflectometry measurement techinque, cable and connector loss, oscilloscope loss, and normalization technique

Procedia PDF Downloads 169
657 Normalizing Scientometric Indicators of Individual Publications Using Local Cluster Detection Methods on Citation Networks

Authors: Levente Varga, Dávid Deritei, Mária Ercsey-Ravasz, Răzvan Florian, Zsolt I. Lázár, István Papp, Ferenc Járai-Szabó

Abstract:

One of the major shortcomings of widely used scientometric indicators is that different disciplines cannot be compared with each other. The issue of cross-disciplinary normalization has been long discussed, but even the classification of publications into scientific domains poses problems. Structural properties of citation networks offer new possibilities, however, the large size and constant growth of these networks asks for precaution. Here we present a new tool that in order to perform cross-field normalization of scientometric indicators of individual publications relays on the structural properties of citation networks. Due to the large size of the networks, a systematic procedure for identifying scientific domains based on a local community detection algorithm is proposed. The algorithm is tested with different benchmark and real-world networks. Then, by the use of this algorithm, the mechanism of the scientometric indicator normalization process is shown for a few indicators like the citation number, P-index and a local version of the PageRank indicator. The fat-tail trend of the article indicator distribution enables us to successfully perform the indicator normalization process.

Keywords: citation networks, cross-field normalization, local cluster detection, scientometric indicators

Procedia PDF Downloads 169
656 Investigating Data Normalization Techniques in Swarm Intelligence Forecasting for Energy Commodity Spot Price

Authors: Yuhanis Yusof, Zuriani Mustaffa, Siti Sakira Kamaruddin

Abstract:

Data mining is a fundamental technique in identifying patterns from large data sets. The extracted facts and patterns contribute in various domains such as marketing, forecasting, and medical. Prior to that, data are consolidated so that the resulting mining process may be more efficient. This study investigates the effect of different data normalization techniques, which are Min-max, Z-score, and decimal scaling, on Swarm-based forecasting models. Recent swarm intelligence algorithms employed includes the Grey Wolf Optimizer (GWO) and Artificial Bee Colony (ABC). Forecasting models are later developed to predict the daily spot price of crude oil and gasoline. Results showed that GWO works better with Z-score normalization technique while ABC produces better accuracy with the Min-Max. Nevertheless, the GWO is more superior that ABC as its model generates the highest accuracy for both crude oil and gasoline price. Such a result indicates that GWO is a promising competitor in the family of swarm intelligence algorithms.

Keywords: artificial bee colony, data normalization, forecasting, Grey Wolf optimizer

Procedia PDF Downloads 442
655 Normalizing Logarithms of Realized Volatility in an ARFIMA Model

Authors: G. L. C. Yap

Abstract:

Modelling realized volatility with high-frequency returns is popular as it is an unbiased and efficient estimator of return volatility. A computationally simple model is fitting the logarithms of the realized volatilities with a fractionally integrated long-memory Gaussian process. The Gaussianity assumption simplifies the parameter estimation using the Whittle approximation. Nonetheless, this assumption may not be met in the finite samples and there may be a need to normalize the financial series. Based on the empirical indices S&P500 and DAX, this paper examines the performance of the linear volatility model pre-treated with normalization compared to its existing counterpart. The empirical results show that by including normalization as a pre-treatment procedure, the forecast performance outperforms the existing model in terms of statistical and economic evaluations.

Keywords: Gaussian process, long-memory, normalization, value-at-risk, volatility, Whittle estimator

Procedia PDF Downloads 322
654 Facial Pose Classification Using Hilbert Space Filling Curve and Multidimensional Scaling

Authors: Mekamı Hayet, Bounoua Nacer, Benabderrahmane Sidahmed, Taleb Ahmed

Abstract:

Pose estimation is an important task in computer vision. Though the majority of the existing solutions provide good accuracy results, they are often overly complex and computationally expensive. In this perspective, we propose the use of dimensionality reduction techniques to address the problem of facial pose estimation. Firstly, a face image is converted into one-dimensional time series using Hilbert space filling curve, then the approach converts these time series data to a symbolic representation. Furthermore, a distance matrix is calculated between symbolic series of an input learning dataset of images, to generate classifiers of frontal vs. profile face pose. The proposed method is evaluated with three public datasets. Experimental results have shown that our approach is able to achieve a correct classification rate exceeding 97% with K-NN algorithm.

Keywords: machine learning, pattern recognition, facial pose classification, time series

Procedia PDF Downloads 317
653 A New Criterion Using Pose and Shape of Objects for Collision Risk Estimation

Authors: DoHyeung Kim, DaeHee Seo, ByungDoo Kim, ByungGil Lee

Abstract:

As many recent researches being implemented in aviation and maritime aspects, strong doubts have been raised concerning the reliability of the estimation of collision risk. It is shown that using position and velocity of objects can lead to imprecise results. In this paper, therefore, a new approach to the estimation of collision risks using pose and shape of objects is proposed. Simulation results are presented validating the accuracy of the new criterion to adapt to collision risk algorithm based on fuzzy logic.

Keywords: collision risk, pose, shape, fuzzy logic

Procedia PDF Downloads 480
652 Adversarial Disentanglement Using Latent Classifier for Pose-Independent Representation

Authors: Hamed Alqahtani, Manolya Kavakli-Thorne

Abstract:

The large pose discrepancy is one of the critical challenges in face recognition during video surveillance. Due to the entanglement of pose attributes with identity information, the conventional approaches for pose-independent representation lack in providing quality results in recognizing largely posed faces. In this paper, we propose a practical approach to disentangle the pose attribute from the identity information followed by synthesis of a face using a classifier network in latent space. The proposed approach employs a modified generative adversarial network framework consisting of an encoder-decoder structure embedded with a classifier in manifold space for carrying out factorization on the latent encoding. It can be further generalized to other face and non-face attributes for real-life video frames containing faces with significant attribute variations. Experimental results and comparison with state of the art in the field prove that the learned representation of the proposed approach synthesizes more compelling perceptual images through a combination of adversarial and classification losses.

Keywords: disentanglement, face detection, generative adversarial networks, video surveillance

Procedia PDF Downloads 85
651 Comparison of Bioelectric and Biomechanical Electromyography Normalization Techniques in Disparate Populations

Authors: Drew Commandeur, Ryan Brodie, Sandra Hundza, Marc Klimstra

Abstract:

The amplitude of raw electromyography (EMG) is affected by recording conditions and often requires normalization to make meaningful comparisons. Bioelectric methods normalize with an EMG signal recorded during a standardized task or from the experimental protocol itself, while biomechanical methods often involve measurements with an additional sensor such as a force transducer. Common bioelectric normalization techniques for treadmill walking include maximum voluntary isometric contraction (MVIC), dynamic EMG peak (EMGPeak) or dynamic EMG mean (EMGMean). There are several concerns with using MVICs to normalize EMG, including poor reliability and potential discomfort. A limitation of bioelectric normalization techniques is that they could result in a misrepresentation of the absolute magnitude of force generated by the muscle and impact the interpretation of EMG between functionally disparate groups. Additionally, methods that normalize to EMG recorded during the task may eliminate some real inter-individual variability due to biological variation. This study compared biomechanical and bioelectric EMG normalization techniques during treadmill walking to assess the impact of the normalization method on the functional interpretation of EMG data. For the biomechanical method, we normalized EMG to a target torque (EMGTS) and the bioelectric methods used were normalization to the mean and peak of the signal during the walking task (EMGMean and EMGPeak). The effect of normalization on muscle activation pattern, EMG amplitude, and inter-individual variability were compared between disparate cohorts of OLD (76.6 yrs N=11) and YOUNG (26.6 yrs N=11) adults. Participants walked on a treadmill at a self-selected pace while EMG was recorded from the right lower limb. EMG data from the soleus (SOL), medial gastrocnemius (MG), tibialis anterior (TA), vastus lateralis (VL), and biceps femoris (BF) were phase averaged into 16 bins (phases) representing the gait cycle with bins 1-10 associated with right stance and bins 11-16 with right swing. Pearson’s correlations showed that activation patterns across the gait cycle were similar between all methods, ranging from r =0.86 to r=1.00 with p<0.05. This indicates that each method can characterize the muscle activation pattern during walking. Repeated measures ANOVA showed a main effect for age in MG for EMGPeak but no other main effects were observed. Interactions between age*phase of EMG amplitude between YOUNG and OLD with each method resulted in different statistical interpretation between methods. EMGTS normalization characterized the fewest differences (four phases across all 5 muscles) while EMGMean (11 phases) and EMGPeak (19 phases) showed considerably more differences between cohorts. The second notable finding was that coefficient of variation, the representation of inter-individual variability, was greatest for EMGTS and lowest for EMGMean while EMGPeak was slightly higher than EMGMean for all muscles. This finding supports our expectation that EMGTS normalization would retain inter-individual variability which may be desirable, however, it also suggests that even when large differences are expected, a larger sample size may be required to observe the differences. Our findings clearly indicate that interpretation of EMG is highly dependent on the normalization method used, and it is essential to consider the strengths and limitations of each method when drawing conclusions.

Keywords: electromyography, EMG normalization, functional EMG, older adults

Procedia PDF Downloads 58
650 Acceleration-Based Motion Model for Visual Simultaneous Localization and Mapping

Authors: Daohong Yang, Xiang Zhang, Lei Li, Wanting Zhou

Abstract:

Visual Simultaneous Localization and Mapping (VSLAM) is a technology that obtains information in the environment for self-positioning and mapping. It is widely used in computer vision, robotics and other fields. Many visual SLAM systems, such as OBSLAM3, employ a constant-speed motion model that provides the initial pose of the current frame to improve the speed and accuracy of feature matching. However, in actual situations, the constant velocity motion model is often difficult to be satisfied, which may lead to a large deviation between the obtained initial pose and the real value, and may lead to errors in nonlinear optimization results. Therefore, this paper proposed a motion model based on acceleration, which can be applied on most SLAM systems. In order to better describe the acceleration of the camera pose, we decoupled the pose transformation matrix, and calculated the rotation matrix and the translation vector respectively, where the rotation matrix is represented by rotation vector. We assume that, in a short period of time, the changes of rotating angular velocity and translation vector remain the same. Based on this assumption, the initial pose of the current frame is estimated. In addition, the error of constant velocity model was analyzed theoretically. Finally, we applied our proposed approach to the ORBSLAM3 system and evaluated two sets of sequences on the TUM dataset. The results showed that our proposed method had a more accurate initial pose estimation and the accuracy of ORBSLAM3 system is improved by 6.61% and 6.46% respectively on the two test sequences.

Keywords: error estimation, constant acceleration motion model, pose estimation, visual SLAM

Procedia PDF Downloads 55
649 A New Scheme for Chain Code Normalization in Arabic and Farsi Scripts

Authors: Reza Shakoori

Abstract:

This paper presents a structural correction of Arabic and Persian strokes using manipulation of their chain codes in order to improve the rate and performance of Persian and Arabic handwritten word recognition systems. It collects pure and effective features to represent a character with one consolidated feature vector and reduces variations in order to decrease the number of training samples and increase the chance of successful classification. Our results also show that how the proposed approaches can simplify classification and consequently recognition by reducing variations and possible noises on the chain code by keeping orientation of characters and their backbone structures.

Keywords: Arabic, chain code normalization, OCR systems, image processing

Procedia PDF Downloads 365
648 Extensions of Schwarz Lemma in the Half-Plane

Authors: Nicolae Pascu

Abstract:

Aside from being a fundamental tool in Complex analysis, Schwarz Lemma-which was finalized in its most complete form at the beginning of the last century-generated an important area of research in various fields of mathematics, which continues to advance even today. We present some properties of analytic functions in the half-plane which satisfy the conditions of the classical Schwarz Lemma (Carathéodory functions) and obtain a generalization of the well-known Aleksandrov-Sobolev Lemma for analytic functions in the half-plane (the correspondent of Schwarz-Pick Lemma from the unit disk). Using this Schwarz-type lemma, we obtain a characterization for the entire class of Carathéodory functions, which might be of independent interest. We prove two monotonicity properties for Carathéodory functions that do not depend upon their normalization at infinity (the hydrodynamic normalization). The method is based on conformal mapping arguments for analytic functions in the half-plane satisfying appropriate conditions, in the spirit of Schwarz lemma. According to the research findings in this paper, our main results give estimates for the modulus and the argument for the entire class of Carathéodory functions. As applications, we give several extensions of Julia-Wolf-Carathéodory Lemma in a half-strip and show that our results are sharp.

Keywords: schwarz lemma, Julia-wolf-caratéodory lemma, analytic function, normalization condition, caratéodory function

Procedia PDF Downloads 152
647 Spatiotemporal Neural Network for Video-Based Pose Estimation

Authors: Bin Ji, Kai Xu, Shunyu Yao, Jingjing Liu, Ye Pan

Abstract:

Human pose estimation is a popular research area in computer vision for its important application in human-machine interface. In recent years, 2D human pose estimation based on convolution neural network has got great progress and development. However, in more and more practical applications, people often need to deal with tasks based on video. It’s not far-fetched for us to consider how to combine the spatial and temporal information together to achieve a balance between computing cost and accuracy. To address this issue, this study proposes a new spatiotemporal model, namely Spatiotemporal Net (STNet) to combine both temporal and spatial information more rationally. As a result, the predicted keypoints heatmap is potentially more accurate and spatially more precise. Under the condition of ensuring the recognition accuracy, the algorithm deal with spatiotemporal series in a decoupled way, which greatly reduces the computation of the model, thus reducing the resource consumption. This study demonstrate the effectiveness of our network over the Penn Action Dataset, and the results indicate superior performance of our network over the existing methods.

Keywords: convolutional long short-term memory, deep learning, human pose estimation, spatiotemporal series

Procedia PDF Downloads 112
646 Pose-Dependency of Machine Tool Structures: Appearance, Consequences, and Challenges for Lightweight Large-Scale Machines

Authors: S. Apprich, F. Wulle, A. Lechler, A. Pott, A. Verl

Abstract:

Large-scale machine tools for the manufacturing of large work pieces, e.g. blades, casings or gears for wind turbines, feature pose-dependent dynamic behavior. Small structural damping coefficients lead to long decay times for structural vibrations that have negative impacts on the production process. Typically, these vibrations are handled by increasing the stiffness of the structure by adding mass. That is counterproductive to the needs of sustainable manufacturing as it leads to higher resource consumption both in material and in energy. Recent research activities have led to higher resource efficiency by radical mass reduction that rely on control-integrated active vibration avoidance and damping methods. These control methods depend on information describing the dynamic behavior of the controlled machine tools in order to tune the avoidance or reduction method parameters according to the current state of the machine. The paper presents the appearance, consequences and challenges of the pose-dependent dynamic behavior of lightweight large-scale machine tool structures in production. The paper starts with the theoretical introduction of the challenges of lightweight machine tool structures resulting from reduced stiffness. The statement of the pose-dependent dynamic behavior is corroborated by the results of the experimental modal analysis of a lightweight test structure. Afterwards, the consequences of the pose-dependent dynamic behavior of lightweight machine tool structures for the use of active control and vibration reduction methods are explained. Based on the state of the art on pose-dependent dynamic machine tool models and the modal investigation of an FE-model of the lightweight test structure, the criteria for a pose-dependent model for use in vibration reduction are derived. The description of the approach for a general pose-dependent model of the dynamic behavior of large lightweight machine tools that provides the necessary input to the aforementioned vibration avoidance and reduction methods to properly tackle machine vibrations is the outlook of the paper.

Keywords: dynamic behavior, lightweight, machine tool, pose-dependency

Procedia PDF Downloads 422
645 Intelligent Human Pose Recognition Based on EMG Signal Analysis and Machine 3D Model

Authors: Si Chen, Quanhong Jiang

Abstract:

In the increasingly mature posture recognition technology, human movement information is widely used in sports rehabilitation, human-computer interaction, medical health, human posture assessment, and other fields today; this project uses the most original ideas; it is proposed to use the collection equipment for the collection of myoelectric data, reflect the muscle posture change on a degree of freedom through data processing, carry out data-muscle three-dimensional model joint adjustment, and realize basic pose recognition. Based on this, bionic aids or medical rehabilitation equipment can be further developed with the help of robotic arms and cutting-edge technology, which has a bright future and unlimited development space.

Keywords: pose recognition, 3D animation, electromyography, machine learning, bionics

Procedia PDF Downloads 42
644 A Transformer-Based Approach for Multi-Human 3D Pose Estimation Using Color and Depth Images

Authors: Qiang Wang, Hongyang Yu

Abstract:

Multi-human 3D pose estimation is a challenging task in computer vision, which aims to recover the 3D joint locations of multiple people from multi-view images. In contrast to traditional methods, which typically only use color (RGB) images as input, our approach utilizes both color and depth (D) information contained in RGB-D images. We also employ a transformer-based model as the backbone of our approach, which is able to capture long-range dependencies and has been shown to perform well on various sequence modeling tasks. Our method is trained and tested on the Carnegie Mellon University (CMU) Panoptic dataset, which contains a diverse set of indoor and outdoor scenes with multiple people in varying poses and clothing. We evaluate the performance of our model on the standard 3D pose estimation metrics of mean per-joint position error (MPJPE). Our results show that the transformer-based approach outperforms traditional methods and achieves competitive results on the CMU Panoptic dataset. We also perform an ablation study to understand the impact of different design choices on the overall performance of the model. In summary, our work demonstrates the effectiveness of using a transformer-based approach with RGB-D images for multi-human 3D pose estimation and has potential applications in real-world scenarios such as human-computer interaction, robotics, and augmented reality.

Keywords: multi-human 3D pose estimation, RGB-D images, transformer, 3D joint locations

Procedia PDF Downloads 40
643 A Unified Deep Framework for Joint 3d Pose Estimation and Action Recognition from a Single Color Camera

Authors: Huy Hieu Pham, Houssam Salmane, Louahdi Khoudour, Alain Crouzil, Pablo Zegers, Sergio Velastin

Abstract:

We present a deep learning-based multitask framework for joint 3D human pose estimation and action recognition from color video sequences. Our approach proceeds along two stages. In the first, we run a real-time 2D pose detector to determine the precise pixel location of important key points of the body. A two-stream neural network is then designed and trained to map detected 2D keypoints into 3D poses. In the second, we deploy the Efficient Neural Architecture Search (ENAS) algorithm to find an optimal network architecture that is used for modeling the Spatio-temporal evolution of the estimated 3D poses via an image-based intermediate representation and performing action recognition. Experiments on Human3.6M, Microsoft Research Redmond (MSR) Action3D, and Stony Brook University (SBU) Kinect Interaction datasets verify the effectiveness of the proposed method on the targeted tasks. Moreover, we show that our method requires a low computational budget for training and inference.

Keywords: human action recognition, pose estimation, D-CNN, deep learning

Procedia PDF Downloads 107
642 Real Time Multi Person Action Recognition Using Pose Estimates

Authors: Aishrith Rao

Abstract:

Human activity recognition is an important aspect of video analytics, and many approaches have been recommended to enable action recognition. In this approach, the model is used to identify the action of the multiple people in the frame and classify them accordingly. A few approaches use RNNs and 3D CNNs, which are computationally expensive and cannot be trained with the small datasets which are currently available. Multi-person action recognition has been performed in order to understand the positions and action of people present in the video frame. The size of the video frame can be adjusted as a hyper-parameter depending on the hardware resources available. OpenPose has been used to calculate pose estimate using CNN to produce heap-maps, one of which provides skeleton features, which are basically joint features. The features are then extracted, and a classification algorithm can be applied to classify the action.

Keywords: human activity recognition, computer vision, pose estimates, convolutional neural networks

Procedia PDF Downloads 104
641 The Relationship between Human Pose and Intention to Fire a Handgun

Authors: Joshua van Staden, Dane Brown, Karen Bradshaw

Abstract:

Gun violence is a significant problem in modern-day society. Early detection of carried handguns through closed-circuit television (CCTV) can aid in preventing potential gun violence. However, CCTV operators have a limited attention span. Machine learning approaches to automating the detection of dangerous gun carriers provide a way to aid CCTV operators in identifying these individuals. This study provides insight into the relationship between human key points extracted using human pose estimation (HPE) and their intention to fire a weapon. We examine the feature importance of each keypoint and their correlations. We use principal component analysis (PCA) to reduce the feature space and optimize detection. Finally, we run a set of classifiers to determine what form of classifier performs well on this data. We find that hips, shoulders, and knees tend to be crucial aspects of the human pose when making these predictions. Furthermore, the horizontal position plays a larger role than the vertical position. Of the 66 key points, nine principal components could be used to make nonlinear classifications with 86% accuracy. Furthermore, linear classifications could be done with 85% accuracy, showing that there is a degree of linearity in the data.

Keywords: feature engineering, human pose, machine learning, security

Procedia PDF Downloads 60
640 Evaluating the Performance of Color Constancy Algorithm

Authors: Damanjit Kaur, Avani Bhatia

Abstract:

Color constancy is significant for human vision since color is a pictorial cue that helps in solving different visions tasks such as tracking, object recognition, or categorization. Therefore, several computational methods have tried to simulate human color constancy abilities to stabilize machine color representations. Two different kinds of methods have been used, i.e., normalization and constancy. While color normalization creates a new representation of the image by canceling illuminant effects, color constancy directly estimates the color of the illuminant in order to map the image colors to a canonical version. Color constancy is the capability to determine colors of objects independent of the color of the light source. This research work studies the most of the well-known color constancy algorithms like white point and gray world.

Keywords: color constancy, gray world, white patch, modified white patch

Procedia PDF Downloads 276
639 Light-Weight Network for Real-Time Pose Estimation

Authors: Jianghao Hu, Hongyu Wang

Abstract:

The effective and efficient human pose estimation algorithm is an important task for real-time human pose estimation on mobile devices. This paper proposes a light-weight human key points detection algorithm, Light-Weight Network for Real-Time Pose Estimation (LWPE). LWPE uses light-weight backbone network and depthwise separable convolutions to reduce parameters and lower latency. LWPE uses the feature pyramid network (FPN) to fuse the high-resolution, semantically weak features with the low-resolution, semantically strong features. In the meantime, with multi-scale prediction, the predicted result by the low-resolution feature map is stacked to the adjacent higher-resolution feature map to intermediately monitor the network and continuously refine the results. At the last step, the key point coordinates predicted in the highest-resolution are used as the final output of the network. For the key-points that are difficult to predict, LWPE adopts the online hard key points mining strategy to focus on the key points that hard predicting. The proposed algorithm achieves excellent performance in the single-person dataset selected in the AI (artificial intelligence) challenge dataset. The algorithm maintains high-precision performance even though the model only contains 3.9M parameters, and it can run at 225 frames per second (FPS) on the generic graphics processing unit (GPU).

Keywords: depthwise separable convolutions, feature pyramid network, human pose estimation, light-weight backbone

Procedia PDF Downloads 115
638 Author Name Disambiguation for Biomedical Literature

Authors: Parthiban Srinivasan

Abstract:

PubMed provides online access to the National Library of Medicine database (MEDLINE) and other publications, which contain close to 25 million scientific citations from 1865 to the present. There are close to 80 million author name instances in those close to 25 million citations. For any work of literature, a fundamental issue is to identify the individual(s) who wrote it, and conversely, to identify all of the works that belong to a given individual. Due to the lack of universal standards for name information, there are two aspects of name ambiguity: name synonymy (a single author with multiple name representations), and name homonymy (multiple authors sharing the same name representation). In this talk, we present some results from our extensive work in author name disambiguation for PubMed citations. Information will be presented on the effectiveness and shortcomings of different aspects of successful name disambiguation such as parsing, validation, standardization and normalization.

Keywords: disambiguation, normalization, parsing, PubMed

Procedia PDF Downloads 268
637 Single-Camera Basketball Tracker through Pose and Semantic Feature Fusion

Authors: Adrià Arbués-Sangüesa, Coloma Ballester, Gloria Haro

Abstract:

Tracking sports players is a widely challenging scenario, specially in single-feed videos recorded in tight courts, where cluttering and occlusions cannot be avoided. This paper presents an analysis of several geometric and semantic visual features to detect and track basketball players. An ablation study is carried out and then used to remark that a robust tracker can be built with Deep Learning features, without the need of extracting contextual ones, such as proximity or color similarity, nor applying camera stabilization techniques. The presented tracker consists of: (1) a detection step, which uses a pretrained deep learning model to estimate the players pose, followed by (2) a tracking step, which leverages pose and semantic information from the output of a convolutional layer in a VGG network. Its performance is analyzed in terms of MOTA over a basketball dataset with more than 10k instances.

Keywords: basketball, deep learning, feature extraction, single-camera, tracking

Procedia PDF Downloads 106
636 Indoor Real-Time Positioning and Mapping Based on Manhattan Hypothesis Optimization

Authors: Linhang Zhu, Hongyu Zhu, Jiahe Liu

Abstract:

This paper investigated a method of indoor real-time positioning and mapping based on the Manhattan world assumption. In indoor environments, relying solely on feature matching techniques or other geometric algorithms for sensor pose estimation inevitably resulted in cumulative errors, posing a significant challenge to indoor positioning. To address this issue, we adopt the Manhattan world hypothesis to optimize the camera pose algorithm based on feature matching, which improves the accuracy of camera pose estimation. A special processing method was applied to image data frames that conformed to the Manhattan world assumption. When similar data frames appeared subsequently, this could be used to eliminate drift in sensor pose estimation, thereby reducing cumulative errors in estimation and optimizing mapping and positioning. Through experimental verification, it is found that our method achieves high-precision real-time positioning in indoor environments and successfully generates maps of indoor environments. This provides effective technical support for applications such as indoor navigation and robot control.

Keywords: Manhattan world hypothesis, real-time positioning and mapping, feature matching, loopback detection

Procedia PDF Downloads 26
635 Assessment of Pre-Processing Influence on Near-Infrared Spectra for Predicting the Mechanical Properties of Wood

Authors: Aasheesh Raturi, Vimal Kothiyal, P. D. Semalty

Abstract:

We studied mechanical properties of Eucalyptus tereticornis using FT-NIR spectroscopy. Firstly, spectra were pre-processed to eliminate useless information. Then, prediction model was constructed by partial least squares regression. To study the influence of pre-processing on prediction of mechanical properties for NIR analysis of wood samples, we applied various pretreatment methods like straight line subtraction, constant offset elimination, vector-normalization, min-max normalization, multiple scattering. Correction, first derivative, second derivatives and their combination with other treatment such as First derivative + straight line subtraction, First derivative+ vector normalization and First derivative+ multiplicative scattering correction. The data processing methods in combination of preprocessing with different NIR regions, RMSECV, RMSEP and optimum factors/rank were obtained by optimization process of model development. More than 350 combinations were obtained during optimization process. More than one pre-processing method gave good calibration/cross-validation and prediction/test models, but only the best calibration/cross-validation and prediction/test models are reported here. The results show that one can safely use NIR region between 4000 to 7500 cm-1 with straight line subtraction, constant offset elimination, first derivative and second derivative preprocessing method which were found to be most appropriate for models development.

Keywords: FT-NIR, mechanical properties, pre-processing, PLS

Procedia PDF Downloads 306
634 Enhancement of Underwater Haze Image with Edge Reveal Using Pixel Normalization

Authors: M. Dhana Lakshmi, S. Sakthivel Murugan

Abstract:

As light passes from source to observer in the water medium, it is scattered by the suspended particulate matter. This scattering effect will plague the captured images with non-uniform illumination, blurring details, halo artefacts, weak edges, etc. To overcome this, pixel normalization with an Amended Unsharp Mask (AUM) filter is proposed to enhance the degraded image. To validate the robustness of the proposed technique irrespective of atmospheric light, the considered datasets are collected on dual locations. For those images, the maxima and minima pixel intensity value is computed and normalized; then the AUM filter is applied to strengthen the blurred edges. Finally, the enhanced image is obtained with good illumination and contrast. Thus, the proposed technique removes the effect of scattering called de-hazing and restores the perceptual information with enhanced edge detail. Both qualitative and quantitative analyses are done on considering the standard non-reference metric called underwater image sharpness measure (UISM), and underwater image quality measure (UIQM) is used to measure color, sharpness, and contrast for both of the location images. It is observed that the proposed technique has shown overwhelming performance compared to other deep-based enhancement networks and traditional techniques in an adaptive manner.

Keywords: underwater drone imagery, pixel normalization, thresholding, masking, unsharp mask filter

Procedia PDF Downloads 157
633 A Normalized Non-Stationary Wavelet Based Analysis Approach for a Computer Assisted Classification of Laryngoscopic High-Speed Video Recordings

Authors: Mona K. Fehling, Jakob Unger, Dietmar J. Hecker, Bernhard Schick, Joerg Lohscheller

Abstract:

Voice disorders origin from disturbances of the vibration patterns of the two vocal folds located within the human larynx. Consequently, the visual examination of vocal fold vibrations is an integral part within the clinical diagnostic process. For an objective analysis of the vocal fold vibration patterns, the two-dimensional vocal fold dynamics are captured during sustained phonation using an endoscopic high-speed camera. In this work, we present an approach allowing a fully automatic analysis of the high-speed video data including a computerized classification of healthy and pathological voices. The approach bases on a wavelet-based analysis of so-called phonovibrograms (PVG), which are extracted from the high-speed videos and comprise the entire two-dimensional vibration pattern of each vocal fold individually. Using a principal component analysis (PCA) strategy a low-dimensional feature set is computed from each phonovibrogram. From the PCA-space clinically relevant measures can be derived that quantify objectively vibration abnormalities. In the first part of the work it will be shown that, using a machine learning approach, the derived measures are suitable to distinguish automatically between healthy and pathological voices. Within the approach the formation of the PCA-space and consequently the extracted quantitative measures depend on the clinical data, which were used to compute the principle components. Therefore, in the second part of the work we proposed a strategy to achieve a normalization of the PCA-space by registering the PCA-space to a coordinate system using a set of synthetically generated vibration patterns. The results show that owing to the normalization step potential ambiguousness of the parameter space can be eliminated. The normalization further allows a direct comparison of research results, which bases on PCA-spaces obtained from different clinical subjects.

Keywords: Wavelet-based analysis, Multiscale product, normalization, computer assisted classification, high-speed laryngoscopy, vocal fold analysis, phonovibrogram

Procedia PDF Downloads 230