Search results for: sesimic data processing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 26447

Search results for: sesimic data processing

26087 Text Analysis to Support Structuring and Modelling a Public Policy Problem-Outline of an Algorithm to Extract Inferences from Textual Data

Authors: Claudia Ehrentraut, Osama Ibrahim, Hercules Dalianis

Abstract:

Policy making situations are real-world problems that exhibit complexity in that they are composed of many interrelated problems and issues. To be effective, policies must holistically address the complexity of the situation rather than propose solutions to single problems. Formulating and understanding the situation and its complex dynamics, therefore, is a key to finding holistic solutions. Analysis of text based information on the policy problem, using Natural Language Processing (NLP) and Text analysis techniques, can support modelling of public policy problem situations in a more objective way based on domain experts knowledge and scientific evidence. The objective behind this study is to support modelling of public policy problem situations, using text analysis of verbal descriptions of the problem. We propose a formal methodology for analysis of qualitative data from multiple information sources on a policy problem to construct a causal diagram of the problem. The analysis process aims at identifying key variables, linking them by cause-effect relationships and mapping that structure into a graphical representation that is adequate for designing action alternatives, i.e., policy options. This study describes the outline of an algorithm used to automate the initial step of a larger methodological approach, which is so far done manually. In this initial step, inferences about key variables and their interrelationships are extracted from textual data to support a better problem structuring. A small prototype for this step is also presented.

Keywords: public policy, problem structuring, qualitative analysis, natural language processing, algorithm, inference extraction

Procedia PDF Downloads 562
26086 Prediction of Remaining Life of Industrial Cutting Tools with Deep Learning-Assisted Image Processing Techniques

Authors: Gizem Eser Erdek

Abstract:

This study is research on predicting the remaining life of industrial cutting tools used in the industrial production process with deep learning methods. When the life of cutting tools decreases, they cause destruction to the raw material they are processing. This study it is aimed to predict the remaining life of the cutting tool based on the damage caused by the cutting tools to the raw material. For this, hole photos were collected from the hole-drilling machine for 8 months. Photos were labeled in 5 classes according to hole quality. In this way, the problem was transformed into a classification problem. Using the prepared data set, a model was created with convolutional neural networks, which is a deep learning method. In addition, VGGNet and ResNet architectures, which have been successful in the literature, have been tested on the data set. A hybrid model using convolutional neural networks and support vector machines is also used for comparison. When all models are compared, it has been determined that the model in which convolutional neural networks are used gives successful results of a %74 accuracy rate. In the preliminary studies, the data set was arranged to include only the best and worst classes, and the study gave ~93% accuracy when the binary classification model was applied. The results of this study showed that the remaining life of the cutting tools could be predicted by deep learning methods based on the damage to the raw material. Experiments have proven that deep learning methods can be used as an alternative for cutting tool life estimation.

Keywords: classification, convolutional neural network, deep learning, remaining life of industrial cutting tools, ResNet, support vector machine, VggNet

Procedia PDF Downloads 48
26085 Food Processing Technology and Packaging: A Case Study of Indian Cashew-Nut Industry

Authors: Parashram Jakappa Patil

Abstract:

India is the global leader in world cashew business and cashew-nut industry is one of the important food processing industries in world. However India is the largest producer, processor, exporter and importer eschew in the world. India is providing cashew to the rest of the world. India is meeting world demand of cashew. India has a tremendous potential of cashew production and export to other countries. Every year India earns more than 2000 cores rupees through cashew trade. Cashew industry is one of the important small scale industries in the country which is playing significant role in rural development. It is generating more than 400000 jobs at remote area and 95% cashew worker are women, it is giving income to poor cashew farmers, majority cashew processing units are small and cottage, it is helping to stop migration from young farmers for employment opportunities, it is motivation rural entrepreneurship development and it is also helping to environment protection etc. Hence India cashew business is very important agribusiness in India which has potential make inclusive development. World Bank and IMF recognized cashew-nut industry is one the important tool for poverty eradication at global level. It shows important of cashew business and its strong existence in India. In spite of such huge potential cashew processing industry is facing different problems such as lack of infrastructure ability, lack of supply of raw cashew, lack of availability of finance, collection of raw cashew, unavailability of warehouse, marketing of cashew kernels, lack of technical knowledge and especially processing technology and packaging of finished products. This industry has great prospects such as scope for more cashew cultivation and cashew production, employment generation, formation of cashew processing units, alcohols production from cashew apple, shield oil production, rural development, poverty elimination, development of social and economic backward class and environment protection etc. This industry has domestic as well as foreign market; India has tremendous potential in this regard. The cashew is a poor men’s crop but rich men’s food. The cashew is a source of income and livelihood for poor farmers. Cashew-nut industry may play very important role in the development of hilly region. The objectives of this paper are to identify problems of cashew processing and use of processing technology, problems of cashew kernel packaging, evolving of cashew processing technology over the year and its impact on final product and impact of good processing by adopting appropriate technology packaging on international trade of cashew-nut. The most important problem of cashew processing industry is that is processing and packaging. Bad processing reduce the quality of cashew kernel at large extent especially broken of cashew kernel which has very less price in market compare to whole cashew kernel and not eligible for export. On the other hand if there is no good packaging of cashew kernel will get moisture which destroy test of it. International trade of cashew-nut is depend of two things one is cashew processing and other is packaging. This study has strong relevance because cashew-nut industry is the labour oriented, where processing technology is not playing important role because 95% processing work is manual. Hence processing work was depending on physical performance of worker which makes presence of large workforce inevitable. There are many cashew processing units closed because they are not getting sufficient work force. However due to advancement in technology slowly this picture is changing and processing work get improve. Therefore it is interesting to explore all the aspects in context of cashew processing and packaging of cashew business.

Keywords: cashew, processing technology, packaging, international trade, change

Procedia PDF Downloads 397
26084 Inversion of Electrical Resistivity Data: A Review

Authors: Shrey Sharma, Gunjan Kumar Verma

Abstract:

High density electrical prospecting has been widely used in groundwater investigation, civil engineering and environmental survey. For efficient inversion, the forward modeling routine, sensitivity calculation, and inversion algorithm must be efficient. This paper attempts to provide a brief summary of the past and ongoing developments of the method. It includes reviews of the procedures used for data acquisition, processing and inversion of electrical resistivity data based on compilation of academic literature. In recent times there had been a significant evolution in field survey designs and data inversion techniques for the resistivity method. In general 2-D inversion for resistivity data is carried out using the linearized least-square method with the local optimization technique .Multi-electrode and multi-channel systems have made it possible to conduct large 2-D, 3-D and even 4-D surveys efficiently to resolve complex geological structures that were not possible with traditional 1-D surveys. 3-D surveys play an increasingly important role in very complex areas where 2-D models suffer from artifacts due to off-line structures. Continued developments in computation technology, as well as fast data inversion techniques and software, have made it possible to use optimization techniques to obtain model parameters to a higher accuracy. A brief discussion on the limitations of the electrical resistivity method has also been presented.

Keywords: inversion, limitations, optimization, resistivity

Procedia PDF Downloads 339
26083 Building Atmospheric Moisture Diagnostics: Environmental Monitoring and Data Collection

Authors: Paula Lopez-Arce, Hector Altamirano, Dimitrios Rovas, James Berry, Bryan Hindle, Steven Hodgson

Abstract:

Efficient mould remediation and accurate moisture diagnostics leading to condensation and mould growth in dwellings are largely untapped. Number of factors are contributing to the rising trend of excessive moisture in homes mainly linked with modern living, increased levels of occupation and rising fuel costs, as well as making homes more energy efficient. Environmental monitoring by means of data collection though loggers sensors and survey forms has been performed in a range of buildings from different UK regions. Air and surface temperature and relative humidity values of residential areas affected by condensation and/or mould issues were recorded. Additional measurements were taken through different trials changing type, location, and position of loggers. In some instances, IR thermal images and ventilation rates have also been acquired. Results have been interpreted together with environmental key parameters by processing and connecting data from loggers and survey questionnaires, both in buildings with and without moisture issues. Monitoring exercises carried out during Winter and Spring time show the importance of developing and following accurate protocols for guidance to obtain consistent, repeatable and comparable results and to improve the performance of environmental monitoring. A model and a protocol are being developed to build a diagnostic tool with the goal of performing a simple but precise residential atmospheric moisture diagnostics to distinguish the cause entailing condensation and mould generation, i.e., ventilation, insulation or heating systems issue. This research shows the relevance of monitoring and processing environmental data to assign moisture risk levels and determine the origin of condensation or mould when dealing with a building atmospheric moisture excess.

Keywords: environmental monitoring, atmospheric moisture, protocols, mould

Procedia PDF Downloads 116
26082 Surveillance of Super-Extended Objects: Bimodal Approach

Authors: Andrey V. Timofeev, Dmitry Egorov

Abstract:

This paper describes an effective solution to the task of a remote monitoring of super-extended objects (oil and gas pipeline, railways, national frontier). The suggested solution is based on the principle of simultaneously monitoring of seismoacoustic and optical/infrared physical fields. The principle of simultaneous monitoring of those fields is not new but in contrast to the known solutions the suggested approach allows to control super-extended objects with very limited operational costs. So-called C-OTDR (Coherent Optical Time Domain Reflectometer) systems are used to monitor the seismoacoustic field. Far-CCTV systems are used to monitor the optical/infrared field. A simultaneous data processing provided by both systems allows effectively detecting and classifying target activities, which appear in the monitored objects vicinity. The results of practical usage had shown high effectiveness of the suggested approach.

Keywords: C-OTDR monitoring system, bimodal processing, LPboost, SVM

Procedia PDF Downloads 444
26081 VIAN-DH: Computational Multimodal Conversation Analysis Software and Infrastructure

Authors: Teodora Vukovic, Christoph Hottiger, Noah Bubenhofer

Abstract:

The development of VIAN-DH aims at bridging two linguistic approaches: conversation analysis/interactional linguistics (IL), so far a dominantly qualitative field, and computational/corpus linguistics and its quantitative and automated methods. Contemporary IL investigates the systematic organization of conversations and interactions composed of speech, gaze, gestures, and body positioning, among others. These highly integrated multimodal behaviour is analysed based on video data aimed at uncovering so called “multimodal gestalts”, patterns of linguistic and embodied conduct that reoccur in specific sequential positions employed for specific purposes. Multimodal analyses (and other disciplines using videos) are so far dependent on time and resource intensive processes of manual transcription of each component from video materials. Automating these tasks requires advanced programming skills, which is often not in the scope of IL. Moreover, the use of different tools makes the integration and analysis of different formats challenging. Consequently, IL research often deals with relatively small samples of annotated data which are suitable for qualitative analysis but not enough for making generalized empirical claims derived quantitatively. VIAN-DH aims to create a workspace where many annotation layers required for the multimodal analysis of videos can be created, processed, and correlated in one platform. VIAN-DH will provide a graphical interface that operates state-of-the-art tools for automating parts of the data processing. The integration of tools that already exist in computational linguistics and computer vision, facilitates data processing for researchers lacking programming skills, speeds up the overall research process, and enables the processing of large amounts of data. The main features to be introduced are automatic speech recognition for the transcription of language, automatic image recognition for extraction of gestures and other visual cues, as well as grammatical annotation for adding morphological and syntactic information to the verbal content. In the ongoing instance of VIAN-DH, we focus on gesture extraction (pointing gestures, in particular), making use of existing models created for sign language and adapting them for this specific purpose. In order to view and search the data, VIAN-DH will provide a unified format and enable the import of the main existing formats of annotated video data and the export to other formats used in the field, while integrating different data source formats in a way that they can be combined in research. VIAN-DH will adapt querying methods from corpus linguistics to enable parallel search of many annotation levels, combining token-level and chronological search for various types of data. VIAN-DH strives to bring crucial and potentially revolutionary innovation to the field of IL, (that can also extend to other fields using video materials). It will allow the processing of large amounts of data automatically and, the implementation of quantitative analyses, combining it with the qualitative approach. It will facilitate the investigation of correlations between linguistic patterns (lexical or grammatical) with conversational aspects (turn-taking or gestures). Users will be able to automatically transcribe and annotate visual, spoken and grammatical information from videos, and to correlate those different levels and perform queries and analyses.

Keywords: multimodal analysis, corpus linguistics, computational linguistics, image recognition, speech recognition

Procedia PDF Downloads 78
26080 Analysis of Big Data

Authors: Sandeep Sharma, Sarabjit Singh

Abstract:

As per the user demand and growth trends of large free data the storage solutions are now becoming more challenge-able to protect, store and to retrieve data. The days are not so far when the storage companies and organizations are start saying 'no' to store our valuable data or they will start charging a huge amount for its storage and protection. On the other hand as per the environmental conditions it becomes challenge-able to maintain and establish new data warehouses and data centers to protect global warming threats. A challenge of small data is over now, the challenges are big that how to manage the exponential growth of data. In this paper we have analyzed the growth trend of big data and its future implications. We have also focused on the impact of the unstructured data on various concerns and we have also suggested some possible remedies to streamline big data.

Keywords: big data, unstructured data, volume, variety, velocity

Procedia PDF Downloads 517
26079 Spontaneous Message Detection of Annoying Situation in Community Networks Using Mining Algorithm

Authors: P. Senthil Kumari

Abstract:

Main concerns in data mining investigation are social controls of data mining for handling ambiguity, noise, or incompleteness on text data. We describe an innovative approach for unplanned text data detection of community networks achieved by classification mechanism. In a tangible domain claim with humble secrecy backgrounds provided by community network for evading annoying content is presented on consumer message partition. To avoid this, mining methodology provides the capability to unswervingly switch the messages and similarly recover the superiority of ordering. Here we designated learning-centered mining approaches with pre-processing technique to complete this effort. Our involvement of work compact with rule-based personalization for automatic text categorization which was appropriate in many dissimilar frameworks and offers tolerance value for permits the background of comments conferring to a variety of conditions associated with the policy or rule arrangements processed by learning algorithm. Remarkably, we find that the choice of classifier has predicted the class labels for control of the inadequate documents on community network with great value of effect.

Keywords: text mining, data classification, community network, learning algorithm

Procedia PDF Downloads 479
26078 Constructing the Density of States from the Parallel Wang Landau Algorithm Overlapping Data

Authors: Arman S. Kussainov, Altynbek K. Beisekov

Abstract:

This work focuses on building an efficient universal procedure to construct a single density of states from the multiple pieces of data provided by the parallel implementation of the Wang Landau Monte Carlo based algorithm. The Ising and Pott models were used as the examples of the two-dimensional spin lattices to construct their densities of states. Sampled energy space was distributed between the individual walkers with certain overlaps. This was made to include the latest development of the algorithm as the density of states replica exchange technique. Several factors of immediate importance for the seamless stitching process have being considered. These include but not limited to the speed and universality of the initial parallel algorithm implementation as well as the data post-processing to produce the expected smooth density of states.

Keywords: density of states, Monte Carlo, parallel algorithm, Wang Landau algorithm

Procedia PDF Downloads 370
26077 Framework for Socio-Technical Issues in Requirements Engineering for Developing Resilient Machine Vision Systems Using Levels of Automation through the Lifecycle

Authors: Ryan Messina, Mehedi Hasan

Abstract:

This research is to examine the impacts of using data to generate performance requirements for automation in visual inspections using machine vision. These situations are intended for design and how projects can smooth the transfer of tacit knowledge to using an algorithm. We have proposed a framework when specifying machine vision systems. This framework utilizes varying levels of automation as contingency planning to reduce data processing complexity. Using data assists in extracting tacit knowledge from those who can perform the manual tasks to assist design the system; this means that real data from the system is always referenced and minimizes errors between participating parties. We propose using three indicators to know if the project has a high risk of failing to meet requirements related to accuracy and reliability. All systems tested achieved a better integration into operations after applying the framework.

Keywords: automation, contingency planning, continuous engineering, control theory, machine vision, system requirements, system thinking

Procedia PDF Downloads 173
26076 A Relative Entropy Regularization Approach for Fuzzy C-Means Clustering Problem

Authors: Ouafa Amira, Jiangshe Zhang

Abstract:

Clustering is an unsupervised machine learning technique; its aim is to extract the data structures, in which similar data objects are grouped in the same cluster, whereas dissimilar objects are grouped in different clusters. Clustering methods are widely utilized in different fields, such as: image processing, computer vision , and pattern recognition, etc. Fuzzy c-means clustering (fcm) is one of the most well known fuzzy clustering methods. It is based on solving an optimization problem, in which a minimization of a given cost function has been studied. This minimization aims to decrease the dissimilarity inside clusters, where the dissimilarity here is measured by the distances between data objects and cluster centers. The degree of belonging of a data point in a cluster is measured by a membership function which is included in the interval [0, 1]. In fcm clustering, the membership degree is constrained with the condition that the sum of a data object’s memberships in all clusters must be equal to one. This constraint can cause several problems, specially when our data objects are included in a noisy space. Regularization approach took a part in fuzzy c-means clustering technique. This process introduces an additional information in order to solve an ill-posed optimization problem. In this study, we focus on regularization by relative entropy approach, where in our optimization problem we aim to minimize the dissimilarity inside clusters. Finding an appropriate membership degree to each data object is our objective, because an appropriate membership degree leads to an accurate clustering result. Our clustering results in synthetic data sets, gaussian based data sets, and real world data sets show that our proposed model achieves a good accuracy.

Keywords: clustering, fuzzy c-means, regularization, relative entropy

Procedia PDF Downloads 242
26075 Instant Location Detection of Objects Moving at High Speed in C-OTDR Monitoring Systems

Authors: Andrey V. Timofeev

Abstract:

The practical efficient approach is suggested to estimate the high-speed objects instant bounds in C-OTDR monitoring systems. In case of super-dynamic objects (trains, cars) is difficult to obtain the adequate estimate of the instantaneous object localization because of estimation lag. In other words, reliable estimation coordinates of monitored object requires taking some time for data observation collection by means of C-OTDR system, and only if the required sample volume will be collected the final decision could be issued. But it is contrary to requirements of many real applications. For example, in rail traffic management systems we need to get data off the dynamic objects localization in real time. The way to solve this problem is to use the set of statistical independent parameters of C-OTDR signals for obtaining the most reliable solution in real time. The parameters of this type we can call as 'signaling parameters' (SP). There are several the SP’s which carry information about dynamic objects instant localization for each of C-OTDR channels. The problem is that some of these parameters are very sensitive to dynamics of seismoacoustic emission sources but are non-stable. On the other hand, in case the SP is very stable it becomes insensitive as a rule. This report contains describing the method for SP’s co-processing which is designed to get the most effective dynamic objects localization estimates in the C-OTDR monitoring system framework.

Keywords: C-OTDR-system, co-processing of signaling parameters, high-speed objects localization, multichannel monitoring systems

Procedia PDF Downloads 444
26074 A Novel Approach to Design of EDDR Architecture for High Speed Motion Estimation Testing Applications

Authors: T. Gangadhararao, K. Krishna Kishore

Abstract:

Motion Estimation (ME) plays a critical role in a video coder, testing such a module is of priority concern. While focusing on the testing of ME in a video coding system, this work presents an error detection and data recovery (EDDR) design, based on the residue-and-quotient (RQ) code, to embed into ME for video coding testing applications. An error in processing Elements (PEs), i.e. key components of a ME, can be detected and recovered effectively by using the proposed EDDR design. The proposed EDDR design for ME testing can detect errors and recover data with an acceptable area overhead and timing penalty.

Keywords: area overhead, data recovery, error detection, motion estimation, reliability, residue-and-quotient (RQ) code

Procedia PDF Downloads 404
26073 Biogas Control: Methane Production Monitoring Using Arduino

Authors: W. Ait Ahmed, M. Aggour, M. Naciri

Abstract:

Extracting energy from biomass is an important alternative to produce different types of energy (heat, electricity, or both) assuring low pollution and better efficiency. It is a new yet reliable approach to reduce green gas emission by extracting methane from industry effluents and use it to power machinery. We focused in our project on using paper and mill effluents, treated in a UASB reactor. The methane produced is used in the factory’s power supply. The aim of this work is to develop an electronic system using Arduino platform connected to a gas sensor, to measure and display the curve of daily methane production on processing. The sensor will send the gas values in ppm to the Arduino board so that the later sends the RS232 hardware protocol. The code developed with processing will transform the values into a curve and display it on the computer screen.

Keywords: biogas, Arduino, processing, code, methane, gas sensor, program

Procedia PDF Downloads 284
26072 Morphological Processing of Punjabi Text for Sentiment Analysis of Farmer Suicides

Authors: Jaspreet Singh, Gurvinder Singh, Prabhsimran Singh, Rajinder Singh, Prithvipal Singh, Karanjeet Singh Kahlon, Ravinder Singh Sawhney

Abstract:

Morphological evaluation of Indian languages is one of the burgeoning fields in the area of Natural Language Processing (NLP). The evaluation of a language is an eminent task in the era of information retrieval and text mining. The extraction and classification of knowledge from text can be exploited for sentiment analysis and morphological evaluation. This study coalesce morphological evaluation and sentiment analysis for the task of classification of farmer suicide cases reported in Punjab state of India. The pre-processing of Punjabi text involves morphological evaluation and normalization of Punjabi word tokens followed by the training of proposed model using deep learning classification on Punjabi language text extracted from online Punjabi news reports. The class-wise accuracies of sentiment prediction for four negatively oriented classes of farmer suicide cases are 93.85%, 88.53%, 83.3%, and 95.45% respectively. The overall accuracy of sentiment classification obtained using proposed framework on 275 Punjabi text documents is found to be 90.29%.

Keywords: deep neural network, farmer suicides, morphological processing, punjabi text, sentiment analysis

Procedia PDF Downloads 290
26071 Microarray Data Visualization and Preprocessing Using R and Bioconductor

Authors: Ruchi Yadav, Shivani Pandey, Prachi Srivastava

Abstract:

Microarrays provide a rich source of data on the molecular working of cells. Each microarray reports on the abundance of tens of thousands of mRNAs. Virtually every human disease is being studied using microarrays with the hope of finding the molecular mechanisms of disease. Bioinformatics analysis plays an important part of processing the information embedded in large-scale expression profiling studies and for laying the foundation for biological interpretation. A basic, yet challenging task in the analysis of microarray gene expression data is the identification of changes in gene expression that are associated with particular biological conditions. Careful statistical design and analysis are essential to improve the efficiency and reliability of microarray experiments throughout the data acquisition and analysis process. One of the most popular platforms for microarray analysis is Bioconductor, an open source and open development software project based on the R programming language. This paper describes specific procedures for conducting quality assessment, visualization and preprocessing of Affymetrix Gene Chip and also details the different bioconductor packages used to analyze affymetrix microarray data and describe the analysis and outcome of each plots.

Keywords: microarray analysis, R language, affymetrix visualization, bioconductor

Procedia PDF Downloads 453
26070 Image Rotation Using an Augmented 2-Step Shear Transform

Authors: Hee-Choul Kwon, Heeyong Kwon

Abstract:

Image rotation is one of main pre-processing steps for image processing or image pattern recognition. It is implemented with a rotation matrix multiplication. It requires a lot of floating point arithmetic operations and trigonometric calculations, so it takes a long time to execute. Therefore, there has been a need for a high speed image rotation algorithm without two major time-consuming operations. However, the rotated image has a drawback, i.e. distortions. We solved the problem using an augmented two-step shear transform. We compare the presented algorithm with the conventional rotation with images of various sizes. Experimental results show that the presented algorithm is superior to the conventional rotation one.

Keywords: high-speed rotation operation, image rotation, transform matrix, image processing, pattern recognition

Procedia PDF Downloads 244
26069 Cloud Shield: Model to Secure User Data While Using Content Delivery Network Services

Authors: Rachna Jain, Sushila Madan, Bindu Garg

Abstract:

Cloud computing is the key powerhouse in numerous organizations due to shifting of their data to the cloud environment. In recent years it has been observed that cloud-based-services are being used on large scale for content storage, distribution and processing. Various issues have been observed in cloud computing environment that need to be addressed. Security and privacy are found topmost concern area. In this paper, a novel security model is proposed to secure data by utilizing CDN services like image to icon conversion. CDN Service is a content delivery service which converts an image to icon, word to pdf & Latex to pdf etc. Presented model is used to convert an image into icon by keeping image secret. Here security of image is imparted so that image should be encrypted and decrypted by data owners only. It is also discussed in the paper that how server performs multiplication and selection on encrypted data without decryption. The data can be image file, word file, audio or video file. Moreover, the proposed model is capable enough to multiply images, encrypt them and send to a server application for conversion. Eventually, the prime objective is to encrypt an image and convert the encrypted image to image Icon by utilizing homomorphic encryption.

Keywords: cloud computing, user data security, homomorphic encryption, image multiplication, CDN service

Procedia PDF Downloads 316
26068 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, WangQun Lin

Abstract:

This paper introduces the concept and principle of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of typical cleaning process, puts forward a well scalability and versatility data cleaning framework, in view of data with attribute dependency relation, designs several of violation data discovery algorithms by formal formula, which can obtain inconsistent data to all target columns with condition attribute dependent no matter data is structured (SQL) or unstructured (NoSQL), and gives 6 data cleaning methods based on these algorithms.

Keywords: data cleaning, dependency rules, violation data discovery, data repair

Procedia PDF Downloads 536
26067 Local Image Features Emerging from Brain Inspired Multi-Layer Neural Network

Authors: Hui Wei, Zheng Dong

Abstract:

Object recognition has long been a challenging task in computer vision. Yet the human brain, with the ability to rapidly and accurately recognize visual stimuli, manages this task effortlessly. In the past decades, advances in neuroscience have revealed some neural mechanisms underlying visual processing. In this paper, we present a novel model inspired by the visual pathway in primate brains. This multi-layer neural network model imitates the hierarchical convergent processing mechanism in the visual pathway. We show that local image features generated by this model exhibit robust discrimination and even better generalization ability compared with some existing image descriptors. We also demonstrate the application of this model in an object recognition task on image data sets. The result provides strong support for the potential of this model.

Keywords: biological model, feature extraction, multi-layer neural network, object recognition

Procedia PDF Downloads 518
26066 Social-Cognitive Aspects of Interpretation: Didactic Approaches in Language Processing and English as a Second Language Difficulties in Dyslexia

Authors: Schnell Zsuzsanna

Abstract:

Background: The interpretation of written texts, language processing in the visual domain, in other words, atypical reading abilities, also known as dyslexia, is an ever-growing phenomenon in today’s societies and educational communities. The much-researched problem affects cognitive abilities and, coupled with normal intelligence normally manifests difficulties in the differentiation of sounds and orthography and in the holistic processing of written words. The factors of susceptibility are varied: social, cognitive psychological, and linguistic factors interact with each other. Methods: The research will explain the psycholinguistics of dyslexia on the basis of several empirical experiments and demonstrate how domain-general abilities of inhibition, retrieval from the mental lexicon, priming, phonological processing, and visual modality transfer affect successful language processing and interpretation. Interpretation of visual stimuli is hindered, and the problem seems to be embedded in a sociocultural, psycholinguistic, and cognitive background. This makes the picture even more complex, suggesting that the understanding and resolving of the issues of dyslexia has to be interdisciplinary, aided by several disciplines in the field of humanities and social sciences, and should be researched from an empirical approach, where the practical, educational corollaries can be analyzed on an applied basis. Aim and applicability: The lecture sheds light on the applied, cognitive aspects of interpretation, social cognitive traits of language processing, the mental underpinnings of cognitive interpretation strategies in different languages (namely, Hungarian and English), offering solutions with a few applied techniques for success in foreign language learning that can be useful advice for the developers of testing methodologies and measures across ESL teaching and testing platforms.

Keywords: dyslexia, social cognition, transparency, modalities

Procedia PDF Downloads 56
26065 Reading Comprehension in Profound Deaf Readers

Authors: S. Raghibdoust, E. Kamari

Abstract:

Research show that reduced functional hearing has a detrimental influence on the ability of an individual to establish proper phonological representations of words, since the phonological representations are claimed to mediate the conceptual processing of written words. Word processing efficiency is expected to decrease with a decrease in functional hearing. In other words, it is predicted that hearing individuals would be more capable of word processing than individuals with hearing loss, as their functional hearing works normally. Studies also demonstrate that the quality of the functional hearing affects reading comprehension via its effect on their word processing skills. In other words, better hearing facilitates the development of phonological knowledge, and can promote enhanced strategies for the recognition of written words, which in turn positively affect higher-order processes underlying reading comprehension. The aims of this study were to investigate and compare the effect of deafness on the participants’ abilities to process written words at the lexical and sentence levels through using two online and one offline reading comprehension tests. The performance of a group of 8 deaf male students (ages 8-12) was compared with that of a control group of normal hearing male students. All the participants had normal IQ and visual status, and came from an average socioeconomic background. None were diagnosed with a particular learning or motor disability. The language spoken in the homes of all participants was Persian. Two tests of word processing were developed and presented to the participants using OpenSesame software, in order to measure the speed and accuracy of their performance at the two perceptual and conceptual levels. In the third offline test of reading comprehension which comprised of semantically plausible and semantically implausible subject relative clauses, the participants had to select the correct answer out of two choices. The data derived from the statistical analysis using SPSS software indicated that hearing and deaf participants had a similar word processing performance both in terms of speed and accuracy of their responses. The results also showed that there was no significant difference between the performance of the deaf and hearing participants in comprehending semantically plausible sentences (p > 0/05). However, a significant difference between the performances of the two groups was observed with respect to their comprehension of semantically implausible sentences (p < 0/05). In sum, the findings revealed that the seriously impoverished sentence reading ability characterizing the profound deaf subjects of the present research, exhibited their reliance on reading strategies that are based on insufficient or deviant structural knowledge, in particular in processing semantically implausible sentences, rather than a failure to efficiently process written words at the lexical level. This conclusion, of course, does not mean to say that deaf individuals may never experience deficits at the word processing level, deficits that impede their understanding of written texts. However, as stated in previous researches, it sounds reasonable to assume that the more deaf individuals get familiar with written words, the better they can recognize them, despite having a profound phonological weakness.

Keywords: deafness, reading comprehension, reading strategy, word processing, subject and object relative sentences

Procedia PDF Downloads 308
26064 Ice Load Measurements on Known Structures Using Image Processing Methods

Authors: Azam Fazelpour, Saeed R. Dehghani, Vlastimil Masek, Yuri S. Muzychka

Abstract:

This study employs a method based on image analyses and structure information to detect accumulated ice on known structures. The icing of marine vessels and offshore structures causes significant reductions in their efficiency and creates unsafe working conditions. Image processing methods are used to measure ice loads automatically. Most image processing methods are developed based on captured image analyses. In this method, ice loads on structures are calculated by defining structure coordinates and processing captured images. A pyramidal structure is designed with nine cylindrical bars as the known structure of experimental setup. Unsymmetrical ice accumulated on the structure in a cold room represents the actual case of experiments. Camera intrinsic and extrinsic parameters are used to define structure coordinates in the image coordinate system according to the camera location and angle. The thresholding method is applied to capture images and detect iced structures in a binary image. The ice thickness of each element is calculated by combining the information from the binary image and the structure coordinate. Averaging ice diameters from different camera views obtains ice thicknesses of structure elements. Comparison between ice load measurements using this method and the actual ice loads shows positive correlations with an acceptable range of error. The method can be applied to complex structures defining structure and camera coordinates.

Keywords: camera calibration, ice detection, ice load measurements, image processing

Procedia PDF Downloads 345
26063 Integrated On-Board Diagnostic-II and Direct Controller Area Network Access for Vehicle Monitoring System

Authors: Kavian Khosravinia, Mohd Khair Hassan, Ribhan Zafira Abdul Rahman, Syed Abdul Rahman Al-Haddad

Abstract:

The CAN (controller area network) bus is introduced as a multi-master, message broadcast system. The messages sent on the CAN are used to communicate state information, referred as a signal between different ECUs, which provides data consistency in every node of the system. OBD-II Dongles that are based on request and response method is the wide-spread solution for extracting sensor data from cars among researchers. Unfortunately, most of the past researches do not consider resolution and quantity of their input data extracted through OBD-II technology. The maximum feasible scan rate is only 9 queries per second which provide 8 data points per second with using ELM327 as well-known OBD-II dongle. This study aims to develop and design a programmable, and latency-sensitive vehicle data acquisition system that improves the modularity and flexibility to extract exact, trustworthy, and fresh car sensor data with higher frequency rates. Furthermore, the researcher must break apart, thoroughly inspect, and observe the internal network of the vehicle, which may cause severe damages to the expensive ECUs of the vehicle due to intrinsic vulnerabilities of the CAN bus during initial research. Desired sensors data were collected from various vehicles utilizing Raspberry Pi3 as computing and processing unit with using OBD (request-response) and direct CAN method at the same time. Two types of data were collected for this study. The first, CAN bus frame data that illustrates data collected for each line of hex data sent from an ECU and the second type is the OBD data that represents some limited data that is requested from ECU under standard condition. The proposed system is reconfigurable, human-readable and multi-task telematics device that can be fitted into any vehicle with minimum effort and minimum time lag in the data extraction process. The standard operational procedure experimental vehicle network test bench is developed and can be used for future vehicle network testing experiment.

Keywords: CAN bus, OBD-II, vehicle data acquisition, connected cars, telemetry, Raspberry Pi3

Procedia PDF Downloads 174
26062 Glucose Monitoring System Using Machine Learning Algorithms

Authors: Sangeeta Palekar, Neeraj Rangwani, Akash Poddar, Jayu Kalambe

Abstract:

The bio-medical analysis is an indispensable procedure for identifying health-related diseases like diabetes. Monitoring the glucose level in our body regularly helps us identify hyperglycemia and hypoglycemia, which can cause severe medical problems like nerve damage or kidney diseases. This paper presents a method for predicting the glucose concentration in blood samples using image processing and machine learning algorithms. The glucose solution is prepared by the glucose oxidase (GOD) and peroxidase (POD) method. An experimental database is generated based on the colorimetric technique. The image of the glucose solution is captured by the raspberry pi camera and analyzed using image processing by extracting the RGB, HSV, LUX color space values. Regression algorithms like multiple linear regression, decision tree, RandomForest, and XGBoost were used to predict the unknown glucose concentration. The multiple linear regression algorithm predicts the results with 97% accuracy. The image processing and machine learning-based approach reduce the hardware complexities of existing platforms.

Keywords: artificial intelligence glucose detection, glucose oxidase, peroxidase, image processing, machine learning

Procedia PDF Downloads 171
26061 The Amount of Information Processing and Balance Performance in Children: The Dual-Task Paradigm

Authors: Chin-Chih Chiou, Tai-Yuan Su, Ti-Yu Chen, Wen-Yu Chiu, Chungyu Chen

Abstract:

The purpose of this study was to investigate the effect of reaction time (RT) or balance performance as the number of stimulus-response choices increases, the amount of information processing of 0-bit and 1-bit conditions based on Hick’s law, using the dual-task design. Eighteen children (age: 9.38 ± 0.27 years old) were recruited as the participants for this study, and asked to assess RT and balance performance separately and simultaneously as following five conditions: simple RT (0-bit decision), choice RT (1-bit decision), single balance control, balance control with simple RT, and balance control with choice RT. Biodex 950-300 balance system and You-Shang response timer were used to record and analyze the postural stability and information processing speed (RT) respectively for the participants. Repeated measures one-way ANOVA with HSD post-hoc test and 2 (balance) × 2 (amount of information processing) repeated measures two-way ANOVA were used to test the parameters of balance performance and RT (α = .05). The results showed the overall stability index in the 1-bit decision was lower than in 0-bit decision, and the mean deflection in the 1-bit decision was lower than in single balance performance. Simple RTs were faster than choice RTs both in single task condition and dual task condition. It indicated that the chronometric approach of RT could use to infer the attention requirement of the secondary task. However, this study did not find that the balance performance is interfered for children by the increasing of the amount of information processing.

Keywords: capacity theory, reaction time, Hick’s law, balance

Procedia PDF Downloads 425
26060 Data Analysis Tool for Predicting Water Scarcity in Industry

Authors: Tassadit Issaadi Hamitouche, Nicolas Gillard, Jean Petit, Valerie Lavaste, Celine Mayousse

Abstract:

Water is a fundamental resource for the industry. It is taken from the environment either from municipal distribution networks or from various natural water sources such as the sea, ocean, rivers, aquifers, etc. Once used, water is discharged into the environment, reprocessed at the plant or treatment plants. These withdrawals and discharges have a direct impact on natural water resources. These impacts can apply to the quantity of water available, the quality of the water used, or to impacts that are more complex to measure and less direct, such as the health of the population downstream from the watercourse, for example. Based on the analysis of data (meteorological, river characteristics, physicochemical substances), we wish to predict water stress episodes and anticipate prefectoral decrees, which can impact the performance of plants and propose improvement solutions, help industrialists in their choice of location for a new plant, visualize possible interactions between companies to optimize exchanges and encourage the pooling of water treatment solutions, and set up circular economies around the issue of water. The development of a system for the collection, processing, and use of data related to water resources requires the functional constraints specific to the latter to be made explicit. Thus the system will have to be able to store a large amount of data from sensors (which is the main type of data in plants and their environment). In addition, manufacturers need to have 'near-real-time' processing of information in order to be able to make the best decisions (to be rapidly notified of an event that would have a significant impact on water resources). Finally, the visualization of data must be adapted to its temporal and geographical dimensions. In this study, we set up an infrastructure centered on the TICK application stack (for Telegraf, InfluxDB, Chronograf, and Kapacitor), which is a set of loosely coupled but tightly integrated open source projects designed to manage huge amounts of time-stamped information. The software architecture is coupled with the cross-industry standard process for data mining (CRISP-DM) data mining methodology. The robust architecture and the methodology used have demonstrated their effectiveness on the study case of learning the level of a river with a 7-day horizon. The management of water and the activities within the plants -which depend on this resource- should be considerably improved thanks, on the one hand, to the learning that allows the anticipation of periods of water stress, and on the other hand, to the information system that is able to warn decision-makers with alerts created from the formalization of prefectoral decrees.

Keywords: data mining, industry, machine Learning, shortage, water resources

Procedia PDF Downloads 96
26059 Massively-Parallel Bit-Serial Neural Networks for Fast Epilepsy Diagnosis: A Feasibility Study

Authors: Si Mon Kueh, Tom J. Kazmierski

Abstract:

There are about 1% of the world population suffering from the hidden disability known as epilepsy and major developing countries are not fully equipped to counter this problem. In order to reduce the inconvenience and danger of epilepsy, different methods have been researched by using a artificial neural network (ANN) classification to distinguish epileptic waveforms from normal brain waveforms. This paper outlines the aim of achieving massive ANN parallelization through a dedicated hardware using bit-serial processing. The design of this bit-serial Neural Processing Element (NPE) is presented which implements the functionality of a complete neuron using variable accuracy. The proposed design has been tested taking into consideration non-idealities of a hardware ANN. The NPE consists of a bit-serial multiplier which uses only 16 logic elements on an Altera Cyclone IV FPGA and a bit-serial ALU as well as a look-up table. Arrays of NPEs can be driven by a single controller which executes the neural processing algorithm. In conclusion, the proposed compact NPE design allows the construction of complex hardware ANNs that can be implemented in a portable equipment that suits the needs of a single epileptic patient in his or her daily activities to predict the occurrences of impending tonic conic seizures.

Keywords: Artificial Neural Networks (ANN), bit-serial neural processor, FPGA, Neural Processing Element (NPE)

Procedia PDF Downloads 295
26058 Hierarchical Piecewise Linear Representation of Time Series Data

Authors: Vineetha Bettaiah, Heggere S. Ranganath

Abstract:

This paper presents a Hierarchical Piecewise Linear Approximation (HPLA) for the representation of time series data in which the time series is treated as a curve in the time-amplitude image space. The curve is partitioned into segments by choosing perceptually important points as break points. Each segment between adjacent break points is recursively partitioned into two segments at the best point or midpoint until the error between the approximating line and the original curve becomes less than a pre-specified threshold. The HPLA representation achieves dimensionality reduction while preserving prominent local features and general shape of time series. The representation permits course-fine processing at different levels of details, allows flexible definition of similarity based on mathematical measures or general time series shape, and supports time series data mining operations including query by content, clustering and classification based on whole or subsequence similarity.

Keywords: data mining, dimensionality reduction, piecewise linear representation, time series representation

Procedia PDF Downloads 249