Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 25

Search results for: encoders

25 Two Wheels Differential Type Odometry for Robot

Abstract:

This paper proposes a new type of two wheels differential type odometry to estimate the next position and orientation of mobile robots. The proposed odometry is composed for two independent wheels with respective encoders. The two wheels rotate independently, and the change is determined by the difference in the velocity of the two wheels. Angular velocities of the two wheels are measured by rotary encoders. A mathematical model is proposed for the mobile robots to precisely move towards the goal. Using measured values of the two encoders, the current displacement vector of a mobile robot is calculated by kinematics of the mathematical model. Using the displacement vector, the next position and orientation of the mobile robot are estimated by proposed odometry. Result of simulator experiment by the developed odometry is shown.

Keywords: mobile robot, odometry, unicycle, differential type, encoders, infrared range sensors, kinematic model

Procedia PDF Downloads 443

24 Cross Attention Fusion for Dual-Stream Speech Emotion Recognition

Authors: Shaode Yu, Jiajian Meng, Bing Zhu, Hang Yu, Qiurui Sun

Abstract:

Speech emotion recognition (SER) is for recognizing human subjective emotions through audio data in-depth analysis. From speech audios, how to comprehensively extract emotional information and how to effectively fuse extracted features remain challenging. This paper presents a dual-stream SER framework that embraces both full training and transfer learning of different networks for thorough feature encoding. Besides, a plug-and-play cross-attention fusion (CAF) module is implemented for the valid integration of the dual-stream encoder output. The effectiveness of the proposed CAF module is compared to the other three fusion modules (feature summation, feature concatenation, and feature-wise linear modulation) on two databases (RAVDESS and IEMO-CAP) using different dual-stream encoders (full training network, DPCNN or TextRCNN; transfer learning network, HuBERT or Wav2Vec2). Experimental results suggest that the CAF module can effectively reconcile conflicts between features from different encoders and outperform the other three feature fusion modules on the SER task. In the future, the plug-and-play CAF module can be extended for multi-branch feature fusion, and the dual-stream SER framework can be widened for multi-stream data representation to improve the recognition performance and generalization capacity.

Keywords: speech emotion recognition, cross-attention fusion, dual-stream, pre-trained

Procedia PDF Downloads 66

23 PatchMix: Learning Transferable Semi-Supervised Representation by Predicting Patches

Authors: Arpit Rai

Abstract:

In this work, we propose PatchMix, a semi-supervised method for pre-training visual representations. PatchMix mixes patches of two images and then solves an auxiliary task of predicting the label of each patch in the mixed image. Our experiments on the CIFAR-10, 100 and the SVHN dataset show that the representations learned by this method encodes useful information for transfer to new tasks and outperform the baseline Residual Network encoders by on CIFAR 10 by 12% on ResNet 101 and 2% on ResNet-56, by 4% on CIFAR-100 on ResNet101 and by 6% on SVHN dataset on the ResNet-101 baseline model.

Keywords: self-supervised learning, representation learning, computer vision, generalization

Procedia PDF Downloads 84

22 Wireless Communication in Sunlight

Authors: Karmveer Sheoran

Abstract:

To make wireless communication a vast success is to use sunlight for wireless communication. We can use sunlight in upper atmosphere to encode messages to efficiently use sunlight. This use of sunlight for wireless communication will need encoders which will encode sunlight according to our message and then resultant will be spread in all atmospheres wherever sunlight goes, it will take our messages with it. With minimum requirement of cost in equipment used at the edge of atmosphere is where sunlight is being encoded. In this way a very high efficient wireless communication system can be designed. On receiver side we will need light detectors which will detect sunlight variations and will finally give the information contained i it. Sunlight can be encoded at a very high speed that nobody will be annoyed by flickering. It will be most sophisticated and efficient wireless communication ever designed. There are far more possibilities in this sunlight communication. Let us call it “Sunlight Communication".

Keywords: sunlight communication, emerging trends, wireless communication, wifi

Procedia PDF Downloads 396

21 Partial M-Sequence Code Families Applied in Spectral Amplitude Coding Fiber-Optic Code-Division Multiple-Access Networks

Authors: Shin-Pin Tseng

Abstract:

Nowadays, numerous spectral amplitude coding (SAC) fiber-optic code-division-multiple-access (FO-CDMA) techniques were appealing due to their capable of providing moderate security and relieving the effects of multiuser interference (MUI). Nonetheless, the performance of the previous network is degraded due to fixed in-phase cross-correlation (IPCC) value. Based on the above problems, a new SAC FO-CDMA network using partial M-sequence (PMS) code is presented in this study. Because the proposed PMS code is originated from M-sequence code, the system using the PMS code could effectively suppress the effects of MUI. In addition, two-code keying (TCK) scheme can applied in the proposed SAC FO-CDMA network and enhance the whole network performance. According to the consideration of system flexibility, simple optical encoders/decoders (codecs) using fiber Bragg gratings (FBGs) were also developed. First, we constructed a diagram of the SAC FO-CDMA network, including (N/2-1) optical transmitters, (N/2-1) optical receivers, and one N×N star coupler for broadcasting transmitted optical signals to arrive at the input port of each optical receiver. Note that the parameter N for the PMS code was the code length. In addition, the proposed SAC network was using superluminescent diodes (SLDs) as light sources, which then can save a lot of system cost compared with the other FO-CDMA methods. For the design of each optical transmitter, it is composed of an SLD, one optical switch, and two optical encoders according to assigned PMS codewords. On the other hand, each optical receivers includes a 1 × 2 splitter, two optical decoders, and one balanced photodiode for mitigating the effect of MUI. In order to simplify the next analysis, the some assumptions were used. First, the unipolarized SLD has flat power spectral density (PSD). Second, the received optical power at the input port of each optical receiver is the same. Third, all photodiodes in the proposed network have the same electrical properties. Fourth, transmitting '1' and '0' has an equal probability. Subsequently, by taking the factors of phase‐induced intensity noise (PIIN) and thermal noise, the corresponding performance was displayed and compared with the performance of the previous SAC FO-CDMA networks. From the numerical result, it shows that the proposed network improved about 25% performance than that using other codes at BER=10-9. This is because the effect of PIIN was effectively mitigated and the received power was enhanced by two times. As a result, the SAC FO-CDMA network using PMS codes has an opportunity to apply in applications of the next-generation optical network.

Keywords: spectral amplitude coding, SAC, fiber-optic code-division multiple-access, FO-CDMA, partial M-sequence, PMS code, fiber Bragg grating, FBG

Procedia PDF Downloads 381

20 Performance Evaluation of the Classic seq2seq Model versus a Proposed Semi-supervised Long Short-Term Memory Autoencoder for Time Series Data Forecasting

Authors: Aswathi Thrivikraman, S. Advaith

Abstract:

The study is aimed at designing encoders for deciphering intricacies in time series data by redescribing the dynamics operating on a lower-dimensional manifold. A semi-supervised LSTM autoencoder is devised and investigated to see if the latent representation of the time series data can better forecast the data. End-to-end training of the LSTM autoencoder, together with another LSTM network that is connected to the latent space, forces the hidden states of the encoder to represent the most meaningful latent variables relevant for forecasting. Furthermore, the study compares the predictions with those of a traditional seq2seq model.

Keywords: LSTM, autoencoder, forecasting, seq2seq model

Procedia PDF Downloads 148

19 Extended Constraint Mask Based One-Bit Transform for Low-Complexity Fast Motion Estimation

Authors: Oğuzhan Urhan

Abstract:

In this paper, an improved motion estimation (ME) approach based on weighted constrained one-bit transform is proposed for block-based ME employed in video encoders. Binary ME approaches utilize low bit-depth representation of the original image frames with a Boolean exclusive-OR based hardware efficient matching criterion to decrease computational burden of the ME stage. Weighted constrained one-bit transform (WC‑1BT) based approach improves the performance of conventional C-1BT based ME employing 2-bit depth constraint mask instead of a 1-bit depth mask. In this work, the range of constraint mask is further extended to increase ME performance of WC-1BT approach. Experiments reveal that the proposed method provides better ME accuracy compared existing similar ME methods in the literature.

Keywords: fast motion estimation; low-complexity motion estimation, video coding

Procedia PDF Downloads 311

18 Image Captioning with Vision-Language Models

Authors: Promise Ekpo Osaine, Daniel Melesse

Abstract:

Image captioning is an active area of research in the multi-modal artificial intelligence (AI) community as it connects vision and language understanding, especially in settings where it is required that a model understands the content shown in an image and generates semantically and grammatically correct descriptions. In this project, we followed a standard approach to a deep learning-based image captioning model, injecting architecture for the encoder-decoder setup, where the encoder extracts image features, and the decoder generates a sequence of words that represents the image content. As such, we investigated image encoders, which are ResNet101, InceptionResNetV2, EfficientNetB7, EfficientNetV2M, and CLIP. As a caption generation structure, we explored long short-term memory (LSTM). The CLIP-LSTM model demonstrated superior performance compared to the encoder-decoder models, achieving a BLEU-1 score of 0.904 and a BLEU-4 score of 0.640. Additionally, among the CNN-LSTM models, EfficientNetV2M-LSTM exhibited the highest performance with a BLEU-1 score of 0.896 and a BLEU-4 score of 0.586 while using a single-layer LSTM.

Keywords: multi-modal AI systems, image captioning, encoder, decoder, BLUE score

Procedia PDF Downloads 69

17 Deep Learning Based Unsupervised Sport Scene Recognition and Highlights Generation

Authors: Ksenia Meshkova

Abstract:

With increasing amount of multimedia data, it is very important to automate and speed up the process of obtaining meta. This process means not just recognition of some object or its movement, but recognition of the entire scene versus separate frames and having timeline segmentation as a final result. Labeling datasets is time consuming, besides, attributing characteristics to particular scenes is clearly difficult due to their nature. In this article, we will consider autoencoders application to unsupervised scene recognition and clusterization based on interpretable features. Further, we will focus on particular types of auto encoders that relevant to our study. We will take a look at the specificity of deep learning related to information theory and rate-distortion theory and describe the solutions empowering poor interpretability of deep learning in media content processing. As a conclusion, we will present the results of the work of custom framework, based on autoencoders, capable of scene recognition as was deeply studied above, with highlights generation resulted out of this recognition. We will not describe in detail the mathematical description of neural networks work but will clarify the necessary concepts and pay attention to important nuances.

Keywords: neural networks, computer vision, representation learning, autoencoders

Procedia PDF Downloads 118

16 Deep Learning Based on Image Decomposition for Restoration of Intrinsic Representation

Authors: Hyohun Kim, Dongwha Shin, Yeonseok Kim, Ji-Su Ahn, Kensuke Nakamura, Dongeun Choi, Byung-Woo Hong

Abstract:

Artefacts are commonly encountered in the imaging process of clinical computed tomography (CT) where the artefact refers to any systematic discrepancy between the reconstructed observation and the true attenuation coefficient of the object. It is known that CT images are inherently more prone to artefacts due to its image formation process where a large number of independent detectors are involved, and they are assumed to yield consistent measurements. There are a number of different artefact types including noise, beam hardening, scatter, pseudo-enhancement, motion, helical, ring, and metal artefacts, which cause serious difficulties in reading images. Thus, it is desired to remove nuisance factors from the degraded image leaving the fundamental intrinsic information that can provide better interpretation of the anatomical and pathological characteristics. However, it is considered as a difficult task due to the high dimensionality and variability of data to be recovered, which naturally motivates the use of machine learning techniques. We propose an image restoration algorithm based on the deep neural network framework where the denoising auto-encoders are stacked building multiple layers. The denoising auto-encoder is a variant of a classical auto-encoder that takes an input data and maps it to a hidden representation through a deterministic mapping using a non-linear activation function. The latent representation is then mapped back into a reconstruction the size of which is the same as the size of the input data. The reconstruction error can be measured by the traditional squared error assuming the residual follows a normal distribution. In addition to the designed loss function, an effective regularization scheme using residual-driven dropout determined based on the gradient at each layer. The optimal weights are computed by the classical stochastic gradient descent algorithm combined with the back-propagation algorithm. In our algorithm, we initially decompose an input image into its intrinsic representation and the nuisance factors including artefacts based on the classical Total Variation problem that can be efficiently optimized by the convex optimization algorithm such as primal-dual method. The intrinsic forms of the input images are provided to the deep denosing auto-encoders with their original forms in the training phase. In the testing phase, a given image is first decomposed into the intrinsic form and then provided to the trained network to obtain its reconstruction. We apply our algorithm to the restoration of the corrupted CT images by the artefacts. It is shown that our algorithm improves the readability and enhances the anatomical and pathological properties of the object. The quantitative evaluation is performed in terms of the PSNR, and the qualitative evaluation provides significant improvement in reading images despite degrading artefacts. The experimental results indicate the potential of our algorithm as a prior solution to the image interpretation tasks in a variety of medical imaging applications. This work was supported by the MISP(Ministry of Science and ICT), Korea, under the National Program for Excellence in SW (20170001000011001) supervised by the IITP(Institute for Information and Communications Technology Promotion).

Keywords: auto-encoder neural network, CT image artefact, deep learning, intrinsic image representation, noise reduction, total variation

Procedia PDF Downloads 187

15 Low Light Image Enhancement with Multi-Stage Interconnected Autoencoders Integration in Pix to Pix GAN

Authors: Muhammad Atif, Cang Yan

Abstract:

The enhancement of low-light images is a significant area of study aimed at enhancing the quality of captured images in challenging lighting environments. Recently, methods based on convolutional neural networks (CNN) have gained prominence as they offer state-of-the-art performance. However, many approaches based on CNN rely on increasing the size and complexity of the neural network. In this study, we propose an alternative method for improving low-light images using an autoencoder-based multiscale knowledge transfer model. Our method leverages the power of three autoencoders, where the encoders of the first two autoencoders are directly connected to the decoder of the third autoencoder. Additionally, the decoder of the first two autoencoders is connected to the encoder of the third autoencoder. This architecture enables effective knowledge transfer, allowing the third autoencoder to learn and benefit from the enhanced knowledge extracted by the first two autoencoders. We further integrate the proposed model into the PIX to PIX GAN framework. By integrating our proposed model as the generator in the GAN framework, we aim to produce enhanced images that not only exhibit improved visual quality but also possess a more authentic and realistic appearance. These experimental results, both qualitative and quantitative, show that our method is better than the state-of-the-art methodologies.

Keywords: low light image enhancement, deep learning, convolutional neural network, image processing

Procedia PDF Downloads 60

14 Anomaly Detection in a Data Center with a Reconstruction Method Using a Multi-Autoencoders Model

Authors: Victor Breux, Jérôme Boutet, Alain Goret, Viviane Cattin

Abstract:

Early detection of anomalies in data centers is important to reduce downtimes and the costs of periodic maintenance. However, there is little research on this topic and even fewer on the fusion of sensor data for the detection of abnormal events. The goal of this paper is to propose a method for anomaly detection in data centers by combining sensor data (temperature, humidity, power) and deep learning models. The model described in the paper uses one autoencoder per sensor to reconstruct the inputs. The auto-encoders contain Long-Short Term Memory (LSTM) layers and are trained using the normal samples of the relevant sensors selected by correlation analysis. The difference signal between the input and its reconstruction is then used to classify the samples using feature extraction and a random forest classifier. The data measured by the sensors of a data center between January 2019 and May 2020 are used to train the model, while the data between June 2020 and May 2021 are used to assess it. Performances of the model are assessed a posteriori through F1-score by comparing detected anomalies with the data center’s history. The proposed model outperforms the state-of-the-art reconstruction method, which uses only one autoencoder taking multivariate sequences and detects an anomaly with a threshold on the reconstruction error, with an F1-score of 83.60% compared to 24.16%.

Keywords: anomaly detection, autoencoder, data centers, deep learning

Procedia PDF Downloads 187

13 3D Printing Perceptual Models of Preference Using a Fuzzy Extreme Learning Machine Approach

Authors: Xinyi Le

Abstract:

In this paper, 3D printing orientations were determined through our perceptual model. Some FDM (Fused Deposition Modeling) 3D printers, which are widely used in universities and industries, often require support structures during the additive manufacturing. After removing the residual material, some surface artifacts remain at the contact points. These artifacts will damage the function and visual effect of the model. To prevent the impact of these artifacts, we present a fuzzy extreme learning machine approach to find printing directions that avoid placing supports in perceptually significant regions. The proposed approach is able to solve the evaluation problem by combing both the subjective knowledge and objective information. Our method combines the advantages of fuzzy theory, auto-encoders, and extreme learning machine. Fuzzy set theory is applied for dealing with subjective preference information, and auto-encoder step is used to extract good features without supervised labels before extreme learning machine. An extreme learning machine method is then developed successfully for training and learning perceptual models. The performance of this perceptual model will be demonstrated on both natural and man-made objects. It is a good human-computer interaction practice which draws from supporting knowledge on both the machine side and the human side.

Keywords: 3d printing, perceptual model, fuzzy evaluation, data-driven approach

Procedia PDF Downloads 435

12 Low-Cost Mechatronic Design of an Omnidirectional Mobile Robot

Authors: S. Cobos-Guzman

Abstract:

This paper presents the results of a mechatronic design based on a 4-wheel omnidirectional mobile robot that can be used in indoor logistic applications. The low-level control has been selected using two open-source hardware (Raspberry Pi 3 Model B+ and Arduino Mega 2560) that control four industrial motors, four ultrasound sensors, four optical encoders, a vision system of two cameras, and a Hokuyo URG-04LX-UG01 laser scanner. Moreover, the system is powered with a lithium battery that can supply 24 V DC and a maximum current-hour of 20Ah.The Robot Operating System (ROS) has been implemented in the Raspberry Pi and the performance is evaluated with the selection of the sensors and hardware selected. The mechatronic system is evaluated and proposed safe modes of power distribution for controlling all the electronic devices based on different tests. Therefore, based on different performance results, some recommendations are indicated for using the Raspberry Pi and Arduino in terms of power, communication, and distribution of control for different devices. According to these recommendations, the selection of sensors is distributed in both real-time controllers (Arduino and Raspberry Pi). On the other hand, the drivers of the cameras have been implemented in Linux and a python program has been implemented to access the cameras. These cameras will be used for implementing a deep learning algorithm to recognize people and objects. In this way, the level of intelligence can be increased in combination with the maps that can be obtained from the laser scanner.

Keywords: autonomous, indoor robot, mechatronic, omnidirectional robot

Procedia PDF Downloads 168

11 Using Autoencoder as Feature Extractor for Malware Detection

Authors: Umm-E-Hani, Faiza Babar, Hanif Durad

Abstract:

Malware-detecting approaches suffer many limitations, due to which all anti-malware solutions have failed to be reliable enough for detecting zero-day malware. Signature-based solutions depend upon the signatures that can be generated only when malware surfaces at least once in the cyber world. Another approach that works by detecting the anomalies caused in the environment can easily be defeated by diligently and intelligently written malware. Solutions that have been trained to observe the behavior for detecting malicious files have failed to cater to the malware capable of detecting the sandboxed or protected environment. Machine learning and deep learning-based approaches greatly suffer in training their models with either an imbalanced dataset or an inadequate number of samples. AI-based anti-malware solutions that have been trained with enough samples targeted a selected feature vector, thus ignoring the input of leftover features in the maliciousness of malware just to cope with the lack of underlying hardware processing power. Our research focuses on producing an anti-malware solution for detecting malicious PE files by circumventing the earlier-mentioned shortcomings. Our proposed framework, which is based on automated feature engineering through autoencoders, trains the model over a fairly large dataset. It focuses on the visual patterns of malware samples to automatically extract the meaningful part of the visual pattern. Our experiment has successfully produced a state-of-the-art accuracy of 99.54 % over test data.

Keywords: malware, auto encoders, automated feature engineering, classification

Procedia PDF Downloads 67

10 Design and Performance Improvement of Three-Dimensional Optical Code Division Multiple Access Networks with NAND Detection Technique

Authors: Satyasen Panda, Urmila Bhanja

Abstract:

In this paper, we have presented and analyzed three-dimensional (3-D) matrices of wavelength/time/space code for optical code division multiple access (OCDMA) networks with NAND subtraction detection technique. The 3-D codes are constructed by integrating a two-dimensional modified quadratic congruence (MQC) code with one-dimensional modified prime (MP) code. The respective encoders and decoders were designed using fiber Bragg gratings and optical delay lines to minimize the bit error rate (BER). The performance analysis of the 3D-OCDMA system is based on measurement of signal to noise ratio (SNR), BER and eye diagram for a different number of simultaneous users. Also, in the analysis, various types of noises and multiple access interference (MAI) effects were considered. The results obtained with NAND detection technique were compared with those obtained with OR and AND subtraction techniques. The comparison results proved that the NAND detection technique with 3-D MQC\MP code can accommodate more number of simultaneous users for longer distances of fiber with minimum BER as compared to OR and AND subtraction techniques. The received optical power is also measured at various levels of BER to analyze the effect of attenuation.

Keywords: Cross Correlation (CC), Three dimensional Optical Code Division Multiple Access (3-D OCDMA), Spectral Amplitude Coding Optical Code Division Multiple Access (SAC-OCDMA), Multiple Access Interference (MAI), Phase Induced Intensity Noise (PIIN), Three Dimensional Modified Quadratic Congruence/Modified Prime (3-D MQC/MP) code

Procedia PDF Downloads 409

9 Learning Dynamic Representations of Nodes in Temporally Variant Graphs

Authors: Sandra Mitrovic, Gaurav Singh

Abstract:

In many industries, including telecommunications, churn prediction has been a topic of active research. A lot of attention has been drawn on devising the most informative features, and this area of research has gained even more focus with spread of (social) network analytics. The call detail records (CDRs) have been used to construct customer networks and extract potentially useful features. However, to the best of our knowledge, no studies including network features have yet proposed a generic way of representing network information. Instead, ad-hoc and dataset dependent solutions have been suggested. In this work, we build upon a recently presented method (node2vec) to obtain representations for nodes in observed network. The proposed approach is generic and applicable to any network and domain. Unlike node2vec, which assumes a static network, we consider a dynamic and time-evolving network. To account for this, we propose an approach that constructs the feature representation of each node by generating its node2vec representations at different timestamps, concatenating them and finally compressing using an auto-encoder-like method in order to retain reasonably long and informative feature vectors. We test the proposed method on churn prediction task in telco domain. To predict churners at timestamp ts+1, we construct training and testing datasets consisting of feature vectors from time intervals [t1, ts-1] and [t2, ts] respectively, and use traditional supervised classification models like SVM and Logistic Regression. Observed results show the effectiveness of proposed approach as compared to ad-hoc feature selection based approaches and static node2vec.

Keywords: churn prediction, dynamic networks, node2vec, auto-encoders

Procedia PDF Downloads 310

8 Memory Based Reinforcement Learning with Transformers for Long Horizon Timescales and Continuous Action Spaces

Authors: Shweta Singh, Sudaman Katti

Abstract:

The most well-known sequence models make use of complex recurrent neural networks in an encoder-decoder configuration. The model used in this research makes use of a transformer, which is based purely on a self-attention mechanism, without relying on recurrence at all. More specifically, encoders and decoders which make use of self-attention and operate based on a memory, are used. In this research work, results for various 3D visual and non-visual reinforcement learning tasks designed in Unity software were obtained. Convolutional neural networks, more specifically, nature CNN architecture, are used for input processing in visual tasks, and comparison with standard long short-term memory (LSTM) architecture is performed for both visual tasks based on CNNs and non-visual tasks based on coordinate inputs. This research work combines the transformer architecture with the proximal policy optimization technique used popularly in reinforcement learning for stability and better policy updates while training, especially for continuous action spaces, which are used in this research work. Certain tasks in this paper are long horizon tasks that carry on for a longer duration and require extensive use of memory-based functionalities like storage of experiences and choosing appropriate actions based on recall. The transformer, which makes use of memory and self-attention mechanism in an encoder-decoder configuration proved to have better performance when compared to LSTM in terms of exploration and rewards achieved. Such memory based architectures can be used extensively in the field of cognitive robotics and reinforcement learning.

Keywords: convolutional neural networks, reinforcement learning, self-attention, transformers, unity

Procedia PDF Downloads 128

7 Self-Supervised Attributed Graph Clustering with Dual Contrastive Loss Constraints

Authors: Lijuan Zhou, Mengqi Wu, Changyong Niu

Abstract:

Attributed graph clustering can utilize the graph topology and node attributes to uncover hidden community structures and patterns in complex networks, aiding in the understanding and analysis of complex systems. Utilizing contrastive learning for attributed graph clustering can effectively exploit meaningful implicit relationships between data. However, existing attributed graph clustering methods based on contrastive learning suffer from the following drawbacks: 1) Complex data augmentation increases computational cost, and inappropriate data augmentation may lead to semantic drift. 2) The selection of positive and negative samples neglects the intrinsic cluster structure learned from graph topology and node attributes. Therefore, this paper proposes a method called self-supervised Attributed Graph Clustering with Dual Contrastive Loss constraints (AGC-DCL). Firstly, Siamese Multilayer Perceptron (MLP) encoders are employed to generate two views separately to avoid complex data augmentation. Secondly, the neighborhood contrastive loss is introduced to constrain node representation using local topological structure while effectively embedding attribute information through attribute reconstruction. Additionally, clustering-oriented contrastive loss is applied to fully utilize clustering information in global semantics for discriminative node representations, regarding the cluster centers from two views as negative samples to fully leverage effective clustering information from different views. Comparative clustering results with existing attributed graph clustering algorithms on six datasets demonstrate the superiority of the proposed method.

Keywords: attributed graph clustering, contrastive learning, clustering-oriented, self-supervised learning

Procedia PDF Downloads 43

6 Novel Hole-Bar Standard Design and Inter-Comparison for Geometric Errors Identification on Machine-Tool

Authors: F. Viprey, H. Nouira, S. Lavernhe, C. Tournier

Abstract:

Manufacturing of freeform parts may be achieved on 5-axis machine tools currently considered as a common means of production. In particular, the geometrical quality of the freeform parts depends on the accuracy of the multi-axis structural loop, which is composed of several component assemblies maintaining the relative positioning between the tool and the workpiece. Therefore, to reach high quality of the geometries of the freeform parts the geometric errors of the 5 axis machine should be evaluated and compensated, which leads one to master the deviations between the tool and the workpiece (volumetric accuracy). In this study, a novel hole-bar design was developed and used for the characterization of the geometric errors of a RRTTT 5-axis machine tool. The hole-bar standard design is made of Invar material, selected since it is less sensitive to thermal drift. The proposed design allows once to extract 3 intrinsic parameters: one linear positioning and two straightnesses. These parameters can be obtained by measuring the cylindricity of 12 holes (bores) and 11 cylinders located on a perpendicular plane. By mathematical analysis, twelve 3D points coordinates can be identified and correspond to the intersection of each hole axis with the least square plane passing through two perpendicular neighbour cylinders axes. The hole-bar was calibrated using a precision CMM at LNE traceable the SI meter definition. The reversal technique was applied in order to separate the error forms of the hole bar from the motion errors of the mechanical guiding systems. An inter-comparison was additionally conducted between four NMIs (National Metrology Institutes) within the EMRP IND62: JRP-TIM project. Afterwards, the hole-bar was integrated in RRTTT 5-axis machine tool to identify its volumetric errors. Measurements were carried out in real time and combine raw data acquired by the Renishaw RMP600 touch probe and the linear and rotary encoders. The geometric errors of the 5 axis machine were also evaluated by an accurate laser tracer interferometer system. The results were compared to those obtained with the hole bar.

Keywords: volumetric errors, CMM, 3D hole-bar, inter-comparison

Procedia PDF Downloads 380

5 An Absolute Femtosecond Rangefinder for Metrological Support in Coordinate Measurements

Authors: Denis A. Sokolov, Andrey V. Mazurkevich

Abstract:

In the modern world, there is an increasing demand for highly precise measurements in various fields, such as aircraft, shipbuilding, and rocket engineering. This has resulted in the development of appropriate measuring instruments that are capable of measuring the coordinates of objects within a range of up to 100 meters, with an accuracy of up to one micron. The calibration process for such optoelectronic measuring devices (trackers and total stations) involves comparing the measurement results from these devices to a reference measurement based on a linear or spatial basis. The reference used in such measurements could be a reference base or a reference range finder with the capability to measure angle increments (EDM). The base would serve as a set of reference points for this purpose. The concept of the EDM for replicating the unit of measurement has been implemented on a mobile platform, which allows for angular changes in the direction of laser radiation in two planes. To determine the distance to an object, a high-precision interferometer with its own design is employed. The laser radiation travels to the corner reflectors, which form a spatial reference with precisely known positions. When the femtosecond pulses from the reference arm and the measuring arm coincide, an interference signal is created, repeating at the frequency of the laser pulses. The distance between reference points determined by interference signals is calculated in accordance with recommendations from the International Bureau of Weights and Measures for the indirect measurement of time of light passage according to the definition of a meter. This distance is D/2 = c/2nF, approximately 2.5 meters, where c is the speed of light in a vacuum, n is the refractive index of a medium, and F is the frequency of femtosecond pulse repetition. The achieved uncertainty of type A measurement of the distance to reflectors 64 m (N•D/2, where N is an integer) away and spaced apart relative to each other at a distance of 1 m does not exceed 5 microns. The angular uncertainty is calculated theoretically since standard high-precision ring encoders will be used and are not a focus of research in this study. The Type B uncertainty components are not taken into account either, as the components that contribute most do not depend on the selected coordinate measuring method. This technology is being explored in the context of laboratory applications under controlled environmental conditions, where it is possible to achieve an advantage in terms of accuracy. In general, the EDM tests showed high accuracy, and theoretical calculations and experimental studies on an EDM prototype have shown that the uncertainty type A of distance measurements to reflectors can be less than 1 micrometer. The results of this research will be utilized to develop a highly accurate mobile absolute range finder designed for the calibration of high-precision laser trackers and laser rangefinders, as well as other equipment, using a 64 meter laboratory comparator as a reference.

Keywords: femtosecond laser, pulse correlation, interferometer, laser absolute range finder, coordinate measurement

Procedia PDF Downloads 52

4 Row Detection and Graph-Based Localization in Tree Nurseries Using a 3D LiDAR

Authors: Ionut Vintu, Stefan Laible, Ruth Schulz

Abstract:

Agricultural robotics has been developing steadily over recent years, with the goal of reducing and even eliminating pesticides used in crops and to increase productivity by taking over human labor. The majority of crops are arranged in rows. The first step towards autonomous robots, capable of driving in fields and performing crop-handling tasks, is for robots to robustly detect the rows of plants. Recent work done towards autonomous driving between plant rows offers big robotic platforms equipped with various expensive sensors as a solution to this problem. These platforms need to be driven over the rows of plants. This approach lacks flexibility and scalability when it comes to the height of plants or distance between rows. This paper proposes instead an algorithm that makes use of cheaper sensors and has a higher variability. The main application is in tree nurseries. Here, plant height can range from a few centimeters to a few meters. Moreover, trees are often removed, leading to gaps within the plant rows. The core idea is to combine row detection algorithms with graph-based localization methods as they are used in SLAM. Nodes in the graph represent the estimated pose of the robot, and the edges embed constraints between these poses or between the robot and certain landmarks. This setup aims to improve individual plant detection and deal with exception handling, like row gaps, which are falsely detected as an end of rows. Four methods were developed for detecting row structures in the fields, all using a point cloud acquired with a 3D LiDAR as an input. Comparing the field coverage and number of damaged plants, the method that uses a local map around the robot proved to perform the best, with 68% covered rows and 25% damaged plants. This method is further used and combined with a graph-based localization algorithm, which uses the local map features to estimate the robot’s position inside the greater field. Testing the upgraded algorithm in a variety of simulated fields shows that the additional information obtained from localization provides a boost in performance over methods that rely purely on perception to navigate. The final algorithm achieved a row coverage of 80% and an accuracy of 27% damaged plants. Future work would focus on achieving a perfect score of 100% covered rows and 0% damaged plants. The main challenges that the algorithm needs to overcome are fields where the height of the plants is too small for the plants to be detected and fields where it is hard to distinguish between individual plants when they are overlapping. The method was also tested on a real robot in a small field with artificial plants. The tests were performed using a small robot platform equipped with wheel encoders, an IMU and an FX10 3D LiDAR. Over ten runs, the system achieved 100% coverage and 0% damaged plants. The framework built within the scope of this work can be further used to integrate data from additional sensors, with the goal of achieving even better results.

Keywords: 3D LiDAR, agricultural robots, graph-based localization, row detection

Procedia PDF Downloads 135

3 High Speed Motion Tracking with Magnetometer in Nonuniform Magnetic Field

Authors: Jeronimo Cox, Tomonari Furukawa

Abstract:

Magnetometers have become more popular in inertial measurement units (IMU) for their ability to correct estimations using the earth's magnetic field. Accelerometer and gyroscope-based packages fail with dead-reckoning errors accumulated over time. Localization in robotic applications with magnetometer-inclusive IMUs has become popular as a way to track the odometry of slower-speed robots. With high-speed motions, the accumulated error increases over smaller periods of time, making them difficult to track with IMU. Tracking a high-speed motion is especially difficult with limited observability. Visual obstruction of motion leaves motion-tracking cameras unusable. When motions are too dynamic for estimation techniques reliant on the observability of the gravity vector, the use of magnetometers is further justified. As available magnetometer calibration methods are limited with the assumption that background magnetic fields are uniform, estimation in nonuniform magnetic fields is problematic. Hard iron distortion is a distortion of the magnetic field by other objects that produce magnetic fields. This kind of distortion is often observed as the offset from the origin of the center of data points when a magnetometer is rotated. The magnitude of hard iron distortion is dependent on proximity to distortion sources. Soft iron distortion is more related to the scaling of the axes of magnetometer sensors. Hard iron distortion is more of a contributor to the error of attitude estimation with magnetometers. Indoor environments or spaces inside ferrite-based structures, such as building reinforcements or a vehicle, often cause distortions with proximity. As positions correlate to areas of distortion, methods of magnetometer localization include the production of spatial mapping of magnetic field and collection of distortion signatures to better aid location tracking. The goal of this paper is to compare magnetometer methods that don't need pre-productions of magnetic field maps. Mapping the magnetic field in some spaces can be costly and inefficient. Dynamic measurement fusion is used to track the motion of a multi-link system with us. Conventional calibration by data collection of rotation at a static point, real-time estimation of calibration parameters each time step, and using two magnetometers for determining local hard iron distortion are compared to confirm the robustness and accuracy of each technique. With opposite-facing magnetometers, hard iron distortion can be accounted for regardless of position, Rather than assuming that hard iron distortion is constant regardless of positional change. The motion measured is a repeatable planar motion of a two-link system connected by revolute joints. The links are translated on a moving base to impulse rotation of the links. Equipping the joints with absolute encoders and recording the motion with cameras to enable ground truth comparison to each of the magnetometer methods. While the two-magnetometer method accounts for local hard iron distortion, the method fails where the magnetic field direction in space is inconsistent.

Keywords: motion tracking, sensor fusion, magnetometer, state estimation

Procedia PDF Downloads 78

2 The Integration of Apps for Communicative Competence in English Teaching

Authors: L. J. de Jager

Abstract:

In the South African English school curriculum, one of the aims is to achieve communicative competence, the knowledge of using language competently and appropriately in a speech community. Communicatively competent speakers should not only produce grammatically correct sentences but also produce contextually appropriate sentences for various purposes and in different situations. As most speakers of English are non-native speakers, achieving communicative competence remains a complex challenge. Moreover, the changing needs of society necessitate not merely language proficiency, but also technological proficiency. One of the burning issues in the South African educational landscape is the replacement of the standardised literacy model by the pedagogy of multiliteracies that incorporate, by default, the exploration of technological text forms that are part of learners’ everyday lives. It foresees learners as decoders, encoders, and manufacturers of their own futures by exploiting technological possibilities to constantly create and recreate meaning. As such, 21st century learners will feel comfortable working with multimodal texts that are intrinsically part of their lives and by doing so, become authors of their own learning experiences while teachers may become agents supporting learners to discover their capacity to acquire new digital skills for the century of multiliteracies. The aim is transformed practice where learners use their skills, ideas, and knowledge in new contexts. This paper reports on a research project on the integration of technology for language learning, based on the technological pedagogical content knowledge framework, conceptually founded in the theory of multiliteracies, and which aims to achieve communicative competence. The qualitative study uses the community of inquiry framework to answer the research question: How does the integration of technology transform language teaching of preservice teachers? Pre-service teachers in the Postgraduate Certificate of Education Programme with English as methodology were purposively selected to source and evaluate apps for teaching and learning English. The participants collaborated online in a dedicated Blackboard module, using discussion threads to sift through applicable apps and develop interactive lessons using the Apps. The selected apps were entered on to a predesigned Qualtrics form. Data from the online discussions, focus group interviews, and reflective journals were thematically and inductively analysed to determine the participants’ perceptions and experiences when integrating technology in lesson design and the extent to which communicative competence was achieved when using these apps. Findings indicate transformed practice among participants and research team members alike with a better than average technology acceptance and integration. Participants found value in online collaboration to develop and improve their own teaching practice by experiencing directly the benefits of integrating e-learning into the teaching of languages. It could not, however, be clearly determined whether communicative competence was improved. The findings of the project may potentially inform future e-learning activities, thus supporting student learning and development in follow-up cycles of the project.

Keywords: apps, communicative competence, English teaching, technology integration, technological pedagogical content knowledge

Procedia PDF Downloads 157

1 Black-Box-Optimization Approach for High Precision Multi-Axes Forward-Feed Design

Authors: Sebastian Kehne, Alexander Epple, Werner Herfs

Abstract:

A new method for optimal selection of components for multi-axes forward-feed drive systems is proposed in which the choice of motors, gear boxes and ball screw drives is optimized. Essential is here the synchronization of electrical and mechanical frequency behavior of all axes because even advanced controls (like H∞-controls) can only control a small part of the mechanical modes – namely only those of observable and controllable states whose value can be derived from the positions of extern linear length measurement systems and/or rotary encoders on the motor or gear box shafts. Further problems are the unknown processing forces like cutting forces in machine tools during normal operation which make the estimation and control via an observer even more difficult. To start with, the open source Modelica Feed Drive Library which was developed at the Laboratory for Machine Tools, and Production Engineering (WZL) is extended from one axis design to the multi axes design. It is capable to simulate the mechanical, electrical and thermal behavior of permanent magnet synchronous machines with inverters, different gear boxes and ball screw drives in a mechanical system. To keep the calculation time down analytical equations are used for field and torque producing equivalent circuit, heat dissipation and mechanical torque at the shaft. As a first step, a small machine tool with a working area of 635 x 315 x 420 mm is taken apart, and the mechanical transfer behavior is measured with an impulse hammer and acceleration sensors. With the frequency transfer functions, a mechanical finite element model is built up which is reduced with substructure coupling to a mass-damper system which models the most important modes of the axes. The model is modelled with Modelica Feed Drive Library and validated by further relative measurements between machine table and spindle holder with a piezo actor and acceleration sensors. In a next step, the choice of possible components in motor catalogues is limited by derived analytical formulas which are based on well-known metrics to gain effective power and torque of the components. The simulation in Modelica is run with different permanent magnet synchronous motors, gear boxes and ball screw drives from different suppliers. To speed up the optimization different black-box optimization methods (Surrogate-based, gradient-based and evolutionary) are tested on the case. The objective that was chosen is to minimize the integral of the deviations if a step is given on the position controls of the different axes. Small values are good measures for a high dynamic axes. In each iteration (evaluation of one set of components) the control variables are adjusted automatically to have an overshoot less than 1%. It is obtained that the order of the components in optimization problem has a deep impact on the speed of the black-box optimization. An approach to do efficient black-box optimization for multi-axes design is presented in the last part. The authors would like to thank the German Research Foundation DFG for financial support of the project “Optimierung des mechatronischen Entwurfs von mehrachsigen Antriebssystemen (HE 5386/14-1 | 6954/4-1)” (English: Optimization of the Mechatronic Design of Multi-Axes Drive Systems).

Keywords: ball screw drive design, discrete optimization, forward feed drives, gear box design, linear drives, machine tools, motor design, multi-axes design

Procedia PDF Downloads 281