Search results for: data recognition
25074 Timing and Noise Data Mining Algorithm and Software Tool in Very Large Scale Integration (VLSI) Design
Authors: Qing K. Zhu
Abstract:
Very Large Scale Integration (VLSI) design becomes very complex due to the continuous integration of millions of gates in one chip based on Moore’s law. Designers have encountered numerous report files during design iterations using timing and noise analysis tools. This paper presented our work using data mining techniques combined with HTML tables to extract and represent critical timing/noise data. When we apply this data-mining tool in real applications, the running speed is important. The software employs table look-up techniques in the programming for the reasonable running speed based on performance testing results. We added several advanced features for the application in one industry chip design.Keywords: VLSI design, data mining, big data, HTML forms, web, VLSI, EDA, timing, noise
Procedia PDF Downloads 25225073 Introduction of Electronic Health Records to Improve Data Quality in Emergency Department Operations
Authors: Anuruddha Jagoda, Samiddhi Samarakoon, Anil Jasinghe
Abstract:
In its simplest form, data quality can be defined as 'fitness for use' and it is a concept with multi-dimensions. Emergency Departments(ED) require information to treat patients and on the other hand it is the primary source of information regarding accidents, injuries, emergencies etc. Also, it is the starting point of various patient registries, databases and surveillance systems. This interventional study was carried out to improve data quality at the ED of the National Hospital of Sri Lanka (NHSL) by introducing an e health solution to improve data quality. The NHSL is the premier trauma care centre in Sri Lanka. The study consisted of three components. A research study was conducted to assess the quality of data in relation to selected five dimensions of data quality namely accuracy, completeness, timeliness, legibility and reliability. The intervention was to develop and deploy an electronic emergency department information system (eEDIS). Post assessment of the intervention confirmed that all five dimensions of data quality had improved. The most significant improvements are noticed in accuracy and timeliness dimensions.Keywords: electronic health records, electronic emergency department information system, emergency department, data quality
Procedia PDF Downloads 27225072 Data Presentation of Lane-Changing Events Trajectories Using HighD Dataset
Authors: Basma Khelfa, Antoine Tordeux, Ibrahima Ba
Abstract:
We present a descriptive analysis data of lane-changing events in multi-lane roads. The data are provided from The Highway Drone Dataset (HighD), which are microscopic trajectories in highway. This paper describes and analyses the role of the different parameters and their significance. Thanks to HighD data, we aim to find the most frequent reasons that motivate drivers to change lanes. We used the programming language R for the processing of these data. We analyze the involvement and relationship of different variables of each parameter of the ego vehicle and the four vehicles surrounding it, i.e., distance, speed difference, time gap, and acceleration. This was studied according to the class of the vehicle (car or truck), and according to the maneuver it undertook (overtaking or falling back).Keywords: autonomous driving, physical traffic model, prediction model, statistical learning process
Procedia PDF Downloads 26025071 Evaluation of Golden Beam Data for the Commissioning of 6 and 18 MV Photons Beams in Varian Linear Accelerator
Authors: Shoukat Ali, Abdul Qadir Jandga, Amjad Hussain
Abstract:
Objective: The main purpose of this study is to compare the Percent Depth dose (PDD) and In-plane and cross-plane profiles of Varian Golden beam data to the measured data of 6 and 18 MV photons for the commissioning of Eclipse treatment planning system. Introduction: Commissioning of treatment planning system requires an extensive acquisition of beam data for the clinical use of linear accelerators. Accurate dose delivery require to enter the PDDs, Profiles and dose rate tables for open and wedges fields into treatment planning system, enabling to calculate the MUs and dose distribution. Varian offers a generic set of beam data as a reference data, however not recommend for clinical use. In this study, we compared the generic beam data with the measured beam data to evaluate the reliability of generic beam data to be used for the clinical purpose. Methods and Material: PDDs and Profiles of Open and Wedge fields for different field sizes and at different depths measured as per Varian’s algorithm commissioning guideline. The measurement performed with PTW 3D-scanning water phantom with semi-flex ion chamber and MEPHYSTO software. The online available Varian Golden Beam Data compared with the measured data to evaluate the accuracy of the golden beam data to be used for the commissioning of Eclipse treatment planning system. Results: The deviation between measured vs. golden beam data was in the range of 2% max. In PDDs, the deviation increases more in the deeper depths than the shallower depths. Similarly, profiles have the same trend of increasing deviation at large field sizes and increasing depths. Conclusion: Study shows that the percentage deviation between measured and golden beam data is within the acceptable tolerance and therefore can be used for the commissioning process; however, verification of small subset of acquired data with the golden beam data should be mandatory before clinical use.Keywords: percent depth dose, flatness, symmetry, golden beam data
Procedia PDF Downloads 48825070 Variable-Fidelity Surrogate Modelling with Kriging
Authors: Selvakumar Ulaganathan, Ivo Couckuyt, Francesco Ferranti, Tom Dhaene, Eric Laermans
Abstract:
Variable-fidelity surrogate modelling offers an efficient way to approximate function data available in multiple degrees of accuracy each with varying computational cost. In this paper, a Kriging-based variable-fidelity surrogate modelling approach is introduced to approximate such deterministic data. Initially, individual Kriging surrogate models, which are enhanced with gradient data of different degrees of accuracy, are constructed. Then these Gradient enhanced Kriging surrogate models are strategically coupled using a recursive CoKriging formulation to provide an accurate surrogate model for the highest fidelity data. While, intuitively, gradient data is useful to enhance the accuracy of surrogate models, the primary motivation behind this work is to investigate if it is also worthwhile incorporating gradient data of varying degrees of accuracy.Keywords: Kriging, CoKriging, Surrogate modelling, Variable- fidelity modelling, Gradients
Procedia PDF Downloads 55625069 Robust Barcode Detection with Synthetic-to-Real Data Augmentation
Authors: Xiaoyan Dai, Hsieh Yisan
Abstract:
Barcode processing of captured images is a huge challenge, as different shooting conditions can result in different barcode appearances. This paper proposes a deep learning-based barcode detection using synthetic-to-real data augmentation. We first augment barcodes themselves; we then augment images containing the barcodes to generate a large variety of data that is close to the actual shooting environments. Comparisons with previous works and evaluations with our original data show that this approach achieves state-of-the-art performance in various real images. In addition, the system uses hybrid resolution for barcode “scan” and is applicable to real-time applications.Keywords: barcode detection, data augmentation, deep learning, image-based processing
Procedia PDF Downloads 16725068 Psychotherapeutic Narratives and the Importance of Truth
Authors: Spencer Jay Knafelc
Abstract:
Some mental health practitioners and theorists have suggested that we approach remedying psychological problems by centering and intervening upon patients’ narrations. Such theorists and their corresponding therapeutic approaches see persons as narrators of their lives, where the stories they tell constitute and reflect their sense-making of the world. Psychological problems, according to these approaches to therapy, are often the result of problematic narratives. The solution is the construction of more salubrious narratives through therapy. There is trouble lurking within the history of these narrative approaches. These thinkers tend to denigrate the importance of truth, insisting that narratives are not to be thought of as aiming at truth, and thus the truth of our self-narratives is not important. There are multiple motivations for the tendency to eschew truth’s importance within the tradition of narrative approaches to therapy. The most plausible and interesting motivation comes from the observation that, in general, all dominant approaches to therapy are equally effective. The theoretical commitments of each approach are quite different and are often ostensibly incompatible (psychodynamic therapists see psychological problems as resulting from unconscious conflict and repressed desires, Cognitive-Behavioral approaches see them as resulting from distorted cognitions). This strongly suggests that there must be some cases in which therapeutic efficacy does not depend on truth and that insisting that patient’s therapeutic narratives be true in all instances is a mistake. Lewis’ solution is to suggest that narratives are metaphors. Lewis’ account appreciates that there are many ways to tell a story and that many different approaches to mental health treatment can be appropriate without committing us to any contradictions, providing us with an ostensibly coherent way to treat narratives as non-literal, instead of seeing them as tools that can be more or less apt. Here, it is argued that Lewis’ metaphor approach fails. Narratives do not have the right kind of structure to be metaphors. Still, another way to understand Lewis’ view might be that self-narratives, especially when articulated in the language of any specific approach, should not be taken literally. This is an idea at the core of the narrative theorists’ tendency to eschew the importance of the ordinary understanding of truth. This very tendency will be critiqued. The view defended in this paper more accurately captures the nature of self-narratives. The truth of one’s self-narrative is important. Not only do people care about having the right conception of their abilities, who they are, and the way the world is, but self-narratives are composed of beliefs, and the nature of belief is to aim at truth. This view also allows the recognition of the importance of developing accurate representations of oneself and reality for one’s psychological well-being. It is also argued that in many cases, truth factors in as a mechanism of change over the course of therapy. Therapeutic benefit can be achieved by coming to have a better understanding of the nature of oneself and the world. Finally, the view defended here allows for the recognition of the nature of the tension between values: truth and efficacy. It is better to recognize this tension and develop strategies to navigate it as opposed to insisting that it doesn’t exist.Keywords: philosophy, narrative, psychotherapy, truth
Procedia PDF Downloads 10425067 Evaluating the Performance of Color Constancy Algorithm
Authors: Damanjit Kaur, Avani Bhatia
Abstract:
Color constancy is significant for human vision since color is a pictorial cue that helps in solving different visions tasks such as tracking, object recognition, or categorization. Therefore, several computational methods have tried to simulate human color constancy abilities to stabilize machine color representations. Two different kinds of methods have been used, i.e., normalization and constancy. While color normalization creates a new representation of the image by canceling illuminant effects, color constancy directly estimates the color of the illuminant in order to map the image colors to a canonical version. Color constancy is the capability to determine colors of objects independent of the color of the light source. This research work studies the most of the well-known color constancy algorithms like white point and gray world.Keywords: color constancy, gray world, white patch, modified white patch
Procedia PDF Downloads 31525066 Analysis of Delivery of Quad Play Services
Authors: Rahul Malhotra, Anurag Sharma
Abstract:
Fiber based access networks can deliver performance that can support the increasing demands for high speed connections. One of the new technologies that have emerged in recent years is Passive Optical Networks. This paper is targeted to show the simultaneous delivery of triple play service (data, voice, and video). The comparative investigation and suitability of various data rates is presented. It is demonstrated that as we increase the data rate, number of users to be accommodated decreases due to increase in bit error rate.Keywords: FTTH, quad play, play service, access networks, data rate
Procedia PDF Downloads 41225065 Classification of Manufacturing Data for Efficient Processing on an Edge-Cloud Network
Authors: Onyedikachi Ulelu, Andrew P. Longstaff, Simon Fletcher, Simon Parkinson
Abstract:
The widespread interest in 'Industry 4.0' or 'digital manufacturing' has led to significant research requiring the acquisition of data from sensors, instruments, and machine signals. In-depth research then identifies methods of analysis of the massive amounts of data generated before and during manufacture to solve a particular problem. The ultimate goal is for industrial Internet of Things (IIoT) data to be processed automatically to assist with either visualisation or autonomous system decision-making. However, the collection and processing of data in an industrial environment come with a cost. Little research has been undertaken on how to specify optimally what data to capture, transmit, process, and store at various levels of an edge-cloud network. The first step in this specification is to categorise IIoT data for efficient and effective use. This paper proposes the required attributes and classification to take manufacturing digital data from various sources to determine the most suitable location for data processing on the edge-cloud network. The proposed classification framework will minimise overhead in terms of network bandwidth/cost and processing time of machine tool data via efficient decision making on which dataset should be processed at the ‘edge’ and what to send to a remote server (cloud). A fast-and-frugal heuristic method is implemented for this decision-making. The framework is tested using case studies from industrial machine tools for machine productivity and maintenance.Keywords: data classification, decision making, edge computing, industrial IoT, industry 4.0
Procedia PDF Downloads 17625064 Denoising Transient Electromagnetic Data
Authors: Lingerew Nebere Kassie, Ping-Yu Chang, Hsin-Hua Huang, , Chaw-Son Chen
Abstract:
Transient electromagnetic (TEM) data plays a crucial role in hydrogeological and environmental applications, providing valuable insights into geological structures and resistivity variations. However, the presence of noise often hinders the interpretation and reliability of these data. Our study addresses this issue by utilizing a FASTSNAP system for the TEM survey, which operates at different modes (low, medium, and high) with continuous adjustments to discretization, gain, and current. We employ a denoising approach that processes the raw data obtained from each acquisition mode to improve signal quality and enhance data reliability. We use a signal-averaging technique for each mode, increasing the signal-to-noise ratio. Additionally, we utilize wavelet transform to suppress noise further while preserving the integrity of the underlying signals. This approach significantly improves the data quality, notably suppressing severe noise at late times. The resulting denoised data exhibits a substantially improved signal-to-noise ratio, leading to increased accuracy in parameter estimation. By effectively denoising TEM data, our study contributes to a more reliable interpretation and analysis of underground structures. Moreover, the proposed denoising approach can be seamlessly integrated into existing ground-based TEM data processing workflows, facilitating the extraction of meaningful information from noisy measurements and enhancing the overall quality and reliability of the acquired data.Keywords: data quality, signal averaging, transient electromagnetic, wavelet transform
Procedia PDF Downloads 8325063 Attribute Analysis of Quick Response Code Payment Users Using Discriminant Non-negative Matrix Factorization
Authors: Hironori Karachi, Haruka Yamashita
Abstract:
Recently, the system of quick response (QR) code is getting popular. Many companies introduce new QR code payment services and the services are competing with each other to increase the number of users. For increasing the number of users, we should grasp the difference of feature of the demographic information, usage information, and value of users between services. In this study, we conduct an analysis of real-world data provided by Nomura Research Institute including the demographic data of users and information of users’ usages of two services; LINE Pay, and PayPay. For analyzing such data and interpret the feature of them, Nonnegative Matrix Factorization (NMF) is widely used; however, in case of the target data, there is a problem of the missing data. EM-algorithm NMF (EMNMF) to complete unknown values for understanding the feature of the given data presented by matrix shape. Moreover, for comparing the result of the NMF analysis of two matrices, there is Discriminant NMF (DNMF) shows the difference of users features between two matrices. In this study, we combine EMNMF and DNMF and also analyze the target data. As the interpretation, we show the difference of the features of users between LINE Pay and Paypay.Keywords: data science, non-negative matrix factorization, missing data, quality of services
Procedia PDF Downloads 13025062 Developing Guidelines for Public Health Nurse Data Management and Use in Public Health Emergencies
Authors: Margaret S. Wright
Abstract:
Background/Significance: During many recent public health emergencies/disasters, public health nursing data has been missing or delayed, potentially impacting the decision-making and response. Data used as evidence for decision-making in response, planning, and mitigation has been erratic and slow, decreasing the ability to respond. Methodology: Applying best practices in data management and data use in public health settings, and guided by the concepts outlined in ‘Disaster Standards of Care’ models leads to the development of recommendations for a model of best practices in data management and use in public health disasters/emergencies by public health nurses. As the ‘patient’ in public health disasters/emergencies is the community (local, regional or national), guidelines for patient documentation are incorporated in the recommendations. Findings: Using model public health nurses could better plan how to prepare for, respond to, and mitigate disasters in their communities, and better participate in decision-making in all three phases bringing public health nursing data to the discussion as part of the evidence base for decision-making.Keywords: data management, decision making, disaster planning documentation, public health nursing
Procedia PDF Downloads 22125061 Proposal for a Web System for the Control of Fungal Diseases in Grapes in Fruits Markets
Authors: Carlos Tarmeño Noriega, Igor Aguilar Alonso
Abstract:
Fungal diseases are common in vineyards; they cause a decrease in the quality of the products that can be sold, generating distrust of the customer towards the seller when buying fruit. Currently, technology allows the classification of fruits according to their characteristics thanks to artificial intelligence. This study proposes the implementation of a control system that allows the identification of the main fungal diseases present in the Italia grape, making use of a convolutional neural network (CNN), OpenCV, and TensorFlow. The methodology used was based on a collection of 20 articles referring to the proposed research on quality control, classification, and recognition of fruits through artificial vision techniques.Keywords: computer vision, convolutional neural networks, quality control, fruit market, OpenCV, TensorFlow
Procedia PDF Downloads 8225060 An Embarrassingly Simple Semi-supervised Approach to Increase Recall in Online Shopping Domain to Match Structured Data with Unstructured Data
Authors: Sachin Nagargoje
Abstract:
Complete labeled data is often difficult to obtain in a practical scenario. Even if one manages to obtain the data, the quality of the data is always in question. In shopping vertical, offers are the input data, which is given by advertiser with or without a good quality of information. In this paper, an author investigated the possibility of using a very simple Semi-supervised learning approach to increase the recall of unhealthy offers (has badly written Offer Title or partial product details) in shopping vertical domain. The author found that the semisupervised learning method had improved the recall in the Smart Phone category by 30% on A=B testing on 10% traffic and increased the YoY (Year over Year) number of impressions per month by 33% at production. This also made a significant increase in Revenue, but that cannot be publicly disclosed.Keywords: semi-supervised learning, clustering, recall, coverage
Procedia PDF Downloads 12025059 Genodata: The Human Genome Variation Using BigData
Authors: Surabhi Maiti, Prajakta Tamhankar, Prachi Uttam Mehta
Abstract:
Since the accomplishment of the Human Genome Project, there has been an unparalled escalation in the sequencing of genomic data. This project has been the first major vault in the field of medical research, especially in genomics. This project won accolades by using a concept called Bigdata which was earlier, extensively used to gain value for business. Bigdata makes use of data sets which are generally in the form of files of size terabytes, petabytes, or exabytes and these data sets were traditionally used and managed using excel sheets and RDBMS. The voluminous data made the process tedious and time consuming and hence a stronger framework called Hadoop was introduced in the field of genetic sciences to make data processing faster and efficient. This paper focuses on using SPARK which is gaining momentum with the advancement of BigData technologies. Cloud Storage is an effective medium for storage of large data sets which is generated from the genetic research and the resultant sets produced from SPARK analysis.Keywords: human genome project, Bigdata, genomic data, SPARK, cloud storage, Hadoop
Procedia PDF Downloads 25825058 On Musical Information Geometry with Applications to Sonified Image Analysis
Authors: Shannon Steinmetz, Ellen Gethner
Abstract:
In this paper, a theoretical foundation is developed for patterned segmentation of audio using the geometry of music and statistical manifold. We demonstrate image content clustering using conic space sonification. The algorithm takes a geodesic curve as a model estimator of the three-parameter Gamma distribution. The random variable is parameterized by musical centricity and centric velocity. Model parameters predict audio segmentation in the form of duration and frame count based on the likelihood of musical geometry transition. We provide an example using a database of randomly selected images, resulting in statistically significant clusters of similar image content.Keywords: sonification, musical information geometry, image, content extraction, automated quantification, audio segmentation, pattern recognition
Procedia PDF Downloads 23525057 Prototyping a Portable, Affordable Sign Language Glove
Authors: Vidhi Jain
Abstract:
Communication between speakers and non-speakers of American Sign Language (ASL) can be problematic, inconvenient, and expensive. This project attempts to bridge the communication gap by designing a portable glove that captures the user’s ASL gestures and outputs the translated text on a smartphone. The glove is equipped with flex sensors, contact sensors, and a gyroscope to measure the flexion of the fingers, the contact between fingers, and the rotation of the hand. The glove’s Arduino UNO microcontroller analyzes the sensor readings to identify the gesture from a library of learned gestures. The Bluetooth module transmits the gesture to a smartphone. Using this device, one day speakers of ASL may be able to communicate with others in an affordable and convenient way.Keywords: sign language, morse code, convolutional neural network, American sign language, gesture recognition
Procedia PDF Downloads 6025056 Ontology for a Voice Transcription of OpenStreetMap Data: The Case of Space Apprehension by Visually Impaired Persons
Authors: Said Boularouk, Didier Josselin, Eitan Altman
Abstract:
In this paper, we present a vocal ontology of OpenStreetMap data for the apprehension of space by visually impaired people. Indeed, the platform based on produsage gives a freedom to data producers to choose the descriptors of geocoded locations. Unfortunately, this freedom, called also folksonomy leads to complicate subsequent searches of data. We try to solve this issue in a simple but usable method to extract data from OSM databases in order to send them to visually impaired people using Text To Speech technology. We focus on how to help people suffering from visual disability to plan their itinerary, to comprehend a map by querying computer and getting information about surrounding environment in a mono-modal human-computer dialogue.Keywords: TTS, ontology, open street map, visually impaired
Procedia PDF Downloads 29525055 Improved Particle Swarm Optimization with Cellular Automata and Fuzzy Cellular Automata
Authors: Ramin Javadzadeh
Abstract:
The particle swarm optimization are Meta heuristic optimization method, which are used for clustering and pattern recognition applications are abundantly. These algorithms in multimodal optimization problems are more efficient than genetic algorithms. A major drawback in these algorithms is their slow convergence to global optimum and their weak stability can be considered in various running of these algorithms. In this paper, improved Particle swarm optimization is introduced for the first time to overcome its problems. The fuzzy cellular automata is used for improving the algorithm efficiently. The credibility of the proposed approach is evaluated by simulations, and it is shown that the proposed approach achieves better results can be achieved compared to the Particle swarm optimization algorithms.Keywords: cellular automata, cellular learning automata, local search, optimization, particle swarm optimization
Procedia PDF Downloads 60525054 Design and Development of a Platform for Analyzing Spatio-Temporal Data from Wireless Sensor Networks
Authors: Walid Fantazi
Abstract:
The development of sensor technology (such as microelectromechanical systems (MEMS), wireless communications, embedded systems, distributed processing and wireless sensor applications) has contributed to a broad range of WSN applications which are capable of collecting a large amount of spatiotemporal data in real time. These systems require real-time data processing to manage storage in real time and query the data they process. In order to cover these needs, we propose in this paper a Snapshot spatiotemporal data model based on object-oriented concepts. This model allows saving storing and reducing data redundancy which makes it easier to execute spatiotemporal queries and save analyzes time. Further, to ensure the robustness of the system as well as the elimination of congestion from the main access memory we propose a spatiotemporal indexing technique in RAM called Captree *. As a result, we offer an RIA (Rich Internet Application) -based SOA application architecture which allows the remote monitoring and control.Keywords: WSN, indexing data, SOA, RIA, geographic information system
Procedia PDF Downloads 25225053 Prediction of Marine Ecosystem Changes Based on the Integrated Analysis of Multivariate Data Sets
Authors: Prozorkevitch D., Mishurov A., Sokolov K., Karsakov L., Pestrikova L.
Abstract:
The current body of knowledge about the marine environment and the dynamics of marine ecosystems includes a huge amount of heterogeneous data collected over decades. It generally includes a wide range of hydrological, biological and fishery data. Marine researchers collect these data and analyze how and why the ecosystem changes from past to present. Based on these historical records and linkages between the processes it is possible to predict future changes. Multivariate analysis of trends and their interconnection in the marine ecosystem may be used as an instrument for predicting further ecosystem evolution. A wide range of information about the components of the marine ecosystem for more than 50 years needs to be used to investigate how these arrays can help to predict the future.Keywords: barents sea ecosystem, abiotic, biotic, data sets, trends, prediction
Procedia PDF Downloads 11425052 Optical Fiber Data Throughput in a Quantum Communication System
Authors: Arash Kosari, Ali Araghi
Abstract:
A mathematical model for an optical-fiber communication channel is developed which results in an expression that calculates the throughput and loss of the corresponding link. The data are assumed to be transmitted by using of separate photons with different polarizations. The derived model also shows the dependency of data throughput with length of the channel and depolarization factor. It is observed that absorption of photons affects the throughput in a more intensive way in comparison with that of depolarization. Apart from that, the probability of depolarization and the absorption of radiated photons are obtained.Keywords: absorption, data throughput, depolarization, optical fiber
Procedia PDF Downloads 28425051 Event Driven Dynamic Clustering and Data Aggregation in Wireless Sensor Network
Authors: Ashok V. Sutagundar, Sunilkumar S. Manvi
Abstract:
Energy, delay and bandwidth are the prime issues of wireless sensor network (WSN). Energy usage optimization and efficient bandwidth utilization are important issues in WSN. Event triggered data aggregation facilitates such optimal tasks for event affected area in WSN. Reliable delivery of the critical information to sink node is also a major challenge of WSN. To tackle these issues, we propose an event driven dynamic clustering and data aggregation scheme for WSN that enhances the life time of the network by minimizing redundant data transmission. The proposed scheme operates as follows: (1) Whenever the event is triggered, event triggered node selects the cluster head. (2) Cluster head gathers data from sensor nodes within the cluster. (3) Cluster head node identifies and classifies the events out of the collected data using Bayesian classifier. (4) Aggregation of data is done using statistical method. (5) Cluster head discovers the paths to the sink node using residual energy, path distance and bandwidth. (6) If the aggregated data is critical, cluster head sends the aggregated data over the multipath for reliable data communication. (7) Otherwise aggregated data is transmitted towards sink node over the single path which is having the more bandwidth and residual energy. The performance of the scheme is validated for various WSN scenarios to evaluate the effectiveness of the proposed approach in terms of aggregation time, cluster formation time and energy consumed for aggregation.Keywords: wireless sensor network, dynamic clustering, data aggregation, wireless communication
Procedia PDF Downloads 44925050 Offshore Outsourcing: Global Data Privacy Controls and International Compliance Issues
Authors: Michelle J. Miller
Abstract:
In recent year, there has been a rise of two emerging issues that impact the global employment and business market that the legal community must review closer: offshore outsourcing and data privacy. These two issues intersect because employment opportunities are shifting due to offshore outsourcing and some States, like the United States, anti-outsourcing legislation has been passed or presented to retain jobs within the country. In addition, the legal requirements to retain the privacy of data as a global employer extends to employees and third party service provides, including services outsourced to offshore locations. For this reason, this paper will review the intersection of these two issues with a specific focus on data privacy.Keywords: outsourcing, data privacy, international compliance, multinational corporations
Procedia PDF Downloads 40925049 Weighted Data Replication Strategy for Data Grid Considering Economic Approach
Authors: N. Mansouri, A. Asadi
Abstract:
Data Grid is a geographically distributed environment that deals with data intensive application in scientific and enterprise computing. Data replication is a common method used to achieve efficient and fault-tolerant data access in Grids. In this paper, a dynamic data replication strategy, called Enhanced Latest Access Largest Weight (ELALW) is proposed. This strategy is an enhanced version of Latest Access Largest Weight strategy. However, replication should be used wisely because the storage capacity of each Grid site is limited. Thus, it is important to design an effective strategy for the replication replacement task. ELALW replaces replicas based on the number of requests in future, the size of the replica, and the number of copies of the file. It also improves access latency by selecting the best replica when various sites hold replicas. The proposed replica selection selects the best replica location from among the many replicas based on response time that can be determined by considering the data transfer time, the storage access latency, the replica requests that waiting in the storage queue and the distance between nodes. Simulation results utilizing the OptorSim show our replication strategy achieve better performance overall than other strategies in terms of job execution time, effective network usage and storage resource usage.Keywords: data grid, data replication, simulation, replica selection, replica placement
Procedia PDF Downloads 26025048 Evaluation of Satellite and Radar Rainfall Product over Seyhan Plain
Authors: Kazım Kaba, Erdem Erdi, M. Akif Erdoğan, H. Mustafa Kandırmaz
Abstract:
Rainfall is crucial data source for very different discipline such as agriculture, hydrology and climate. Therefore rain rate should be known well both spatial and temporal for any area. Rainfall is measured by using rain-gauge at meteorological ground stations traditionally for many years. At the present time, rainfall products are acquired from radar and satellite images with a temporal and spatial continuity. In this study, we investigated the accuracy of these rainfall data according to rain-gauge data. For this purpose, we used Adana-Hatay radar hourly total precipitation product (RN1) and Meteosat convective rainfall rate (CRR) product over Seyhan plain. We calculated daily rainfall values from RN1 and CRR hourly precipitation products. We used the data of rainy days of four stations located within range of the radar from October 2013 to November 2015. In the study, we examined two rainfall data over Seyhan plain and the correlation between the rain-gauge data and two raster rainfall data was observed lowly.Keywords: meteosat, radar, rainfall, rain-gauge, Turkey
Procedia PDF Downloads 32525047 Spatial Data Mining by Decision Trees
Authors: Sihem Oujdi, Hafida Belbachir
Abstract:
Existing methods of data mining cannot be applied on spatial data because they require spatial specificity consideration, as spatial relationships. This paper focuses on the classification with decision trees, which are one of the data mining techniques. We propose an extension of the C4.5 algorithm for spatial data, based on two different approaches Join materialization and Querying on the fly the different tables. Similar works have been done on these two main approaches, the first - Join materialization - favors the processing time in spite of memory space, whereas the second - Querying on the fly different tables- promotes memory space despite of the processing time. The modified C4.5 algorithm requires three entries tables: a target table, a neighbor table, and a spatial index join that contains the possible spatial relationship among the objects in the target table and those in the neighbor table. Thus, the proposed algorithms are applied to a spatial data pattern in the accidentology domain. A comparative study of our approach with other works of classification by spatial decision trees will be detailed.Keywords: C4.5 algorithm, decision trees, S-CART, spatial data mining
Procedia PDF Downloads 61125046 Features Dimensionality Reduction and Multi-Dimensional Voice-Processing Program to Parkinson Disease Discrimination
Authors: Djamila Meghraoui, Bachir Boudraa, Thouraya Meksen, M.Boudraa
Abstract:
Parkinson's disease is a pathology that involves characteristic perturbations in patients’ voices. This paper describes a proposed method that aims to diagnose persons with Parkinson (PWP) by analyzing on line their voices signals. First, Thresholds signals alterations are determined by the Multi-Dimensional Voice Program (MDVP). Principal Analysis (PCA) is exploited to select the main voice principal componentsthat are significantly affected in a patient. The decision phase is realized by a Mul-tinomial Bayes (MNB) Classifier that categorizes an analyzed voice in one of the two resulting classes: healthy or PWP. The prediction accuracy achieved reaching 98.8% is very promising.Keywords: Parkinson’s disease recognition, PCA, MDVP, multinomial Naive Bayes
Procedia PDF Downloads 27625045 Evaluation of Modern Natural Language Processing Techniques via Measuring a Company's Public Perception
Authors: Burak Oksuzoglu, Savas Yildirim, Ferhat Kutlu
Abstract:
Opinion mining (OM) is one of the natural language processing (NLP) problems to determine the polarity of opinions, mostly represented on a positive-neutral-negative axis. The data for OM is usually collected from various social media platforms. In an era where social media has considerable control over companies’ futures, it’s worth understanding social media and taking actions accordingly. OM comes to the fore here as the scale of the discussion about companies increases, and it becomes unfeasible to gauge opinion on individual levels. Thus, the companies opt to automize this process by applying machine learning (ML) approaches to their data. For the last two decades, OM or sentiment analysis (SA) has been mainly performed by applying ML classification algorithms such as support vector machines (SVM) and Naïve Bayes to a bag of n-gram representations of textual data. With the advent of deep learning and its apparent success in NLP, traditional methods have become obsolete. Transfer learning paradigm that has been commonly used in computer vision (CV) problems started to shape NLP approaches and language models (LM) lately. This gave a sudden rise to the usage of the pretrained language model (PTM), which contains language representations that are obtained by training it on the large datasets using self-supervised learning objectives. The PTMs are further fine-tuned by a specialized downstream task dataset to produce efficient models for various NLP tasks such as OM, NER (Named-Entity Recognition), Question Answering (QA), and so forth. In this study, the traditional and modern NLP approaches have been evaluated for OM by using a sizable corpus belonging to a large private company containing about 76,000 comments in Turkish: SVM with a bag of n-grams, and two chosen pre-trained models, multilingual universal sentence encoder (MUSE) and bidirectional encoder representations from transformers (BERT). The MUSE model is a multilingual model that supports 16 languages, including Turkish, and it is based on convolutional neural networks. The BERT is a monolingual model in our case and transformers-based neural networks. It uses a masked language model and next sentence prediction tasks that allow the bidirectional training of the transformers. During the training phase of the architecture, pre-processing operations such as morphological parsing, stemming, and spelling correction was not used since the experiments showed that their contribution to the model performance was found insignificant even though Turkish is a highly agglutinative and inflective language. The results show that usage of deep learning methods with pre-trained models and fine-tuning achieve about 11% improvement over SVM for OM. The BERT model achieved around 94% prediction accuracy while the MUSE model achieved around 88% and SVM did around 83%. The MUSE multilingual model shows better results than SVM, but it still performs worse than the monolingual BERT model.Keywords: BERT, MUSE, opinion mining, pretrained language model, SVM, Turkish
Procedia PDF Downloads 146