Search results for: Data science
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7715

Search results for: Data science

7325 Li-Fi Technology: Data Transmission through Visible Light

Authors: Shahzad Hassan, Kamran Saeed

Abstract:

People are always in search of Wi-Fi hotspots because Internet is a major demand nowadays. But like all other technologies, there is still room for improvement in the Wi-Fi technology with regards to the speed and quality of connectivity. In order to address these aspects, Harald Haas, a professor at the University of Edinburgh, proposed what we know as the Li-Fi (Light Fidelity). Li-Fi is a new technology in the field of wireless communication to provide connectivity within a network environment. It is a two-way mode of wireless communication using light. Basically, the data is transmitted through Light Emitting Diodes which can vary the intensity of light very fast, even faster than the blink of an eye. From the research and experiments conducted so far, it can be said that Li-Fi can increase the speed and reliability of the transfer of data. This paper pays particular attention on the assessment of the performance of this technology. In other words, it is a 5G technology which uses LED as the medium of data transfer. For coverage within the buildings, Wi-Fi is good but Li-Fi can be considered favorable in situations where large amounts of data are to be transferred in areas with electromagnetic interferences. It brings a lot of data related qualities such as efficiency, security as well as large throughputs to the table of wireless communication. All in all, it can be said that Li-Fi is going to be a future phenomenon where the presence of light will mean access to the Internet as well as speedy data transfer.

Keywords: Communication, LED, Li-Fi, Wi-Fi.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2110
7324 Business Rules for Data Warehouse

Authors: Rajeev Kaula

Abstract:

Business rules and data warehouse are concepts and technologies that impact a wide variety of organizational tasks. In general, each area has evolved independently, impacting application development and decision-making. Generating knowledge from data warehouse is a complex process. This paper outlines an approach to ease import of information and knowledge from a data warehouse star schema through an inference class of business rules. The paper utilizes the Oracle database for illustrating the working of the concepts. The star schema structure and the business rules are stored within a relational database. The approach is explained through a prototype in Oracle-s PL/SQL Server Pages.

Keywords: Business Rules, Data warehouse, PL/SQL ServerPages, Relational model, Web Application.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2946
7323 Authorization of Commercial Communication Satellite Grounds for Promoting Turkish Data Relay System

Authors: Celal Dudak, Aslı Utku, Burak Yağlioğlu

Abstract:

Uninterrupted and continuous satellite communication through the whole orbit time is becoming more indispensable every day. Data relay systems are developed and built for various high/low data rate information exchanges like TDRSS of USA and EDRSS of Europe. In these missions, a couple of task-dedicated communication satellites exist. In this regard, for Turkey a data relay system is attempted to be defined exchanging low data rate information (i.e. TTC) for Earth-observing LEO satellites appointing commercial GEO communication satellites all over the world. First, justification of this attempt is given, demonstrating duration enhancements in the link. Discussion of preference of RF communication is, also, given instead of laser communication. Then, preferred communication GEOs – including TURKSAT4A already belonging to Turkey- are given, together with the coverage enhancements through STK simulations and the corresponding link budget. Also, a block diagram of the communication system is given on the LEO satellite.

Keywords: Communication, satellite, data relay system, coverage.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1385
7322 An Efficient Approach to Mining Frequent Itemsets on Data Streams

Authors: Sara Ansari, Mohammad Hadi Sadreddini

Abstract:

The increasing importance of data stream arising in a wide range of advanced applications has led to the extensive study of mining frequent patterns. Mining data streams poses many new challenges amongst which are the one-scan nature, the unbounded memory requirement and the high arrival rate of data streams. In this paper, we propose a new approach for mining itemsets on data stream. Our approach SFIDS has been developed based on FIDS algorithm. The main attempts were to keep some advantages of the previous approach and resolve some of its drawbacks, and consequently to improve run time and memory consumption. Our approach has the following advantages: using a data structure similar to lattice for keeping frequent itemsets, separating regions from each other with deleting common nodes that results in a decrease in search space, memory consumption and run time; and Finally, considering CPU constraint, with increasing arrival rate of data that result in overloading system, SFIDS automatically detect this situation and discard some of unprocessing data. We guarantee that error of results is bounded to user pre-specified threshold, based on a probability technique. Final results show that SFIDS algorithm could attain about 50% run time improvement than FIDS approach.

Keywords: Data stream, frequent itemset, stream mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1387
7321 Anomaly Detection in a Data Center with a Reconstruction Method Using a Multi-Autoencoders Model

Authors: Victor Breux, Jérôme Boutet, Alain Goret, Viviane Cattin

Abstract:

Early detection of anomalies in data centers is important to reduce downtimes and the costs of periodic maintenance. However, there is little research on this topic and even fewer on the fusion of sensor data for the detection of abnormal events. The goal of this paper is to propose a method for anomaly detection in data centers by combining sensor data (temperature, humidity, power) and deep learning models. The model described in the paper uses one autoencoder per sensor to reconstruct the inputs. The auto-encoders contain Long-Short Term Memory (LSTM) layers and are trained using the normal samples of the relevant sensors selected by correlation analysis. The difference signal between the input and its reconstruction is then used to classify the samples using feature extraction and a random forest classifier. The data measured by the sensors of a data center between January 2019 and May 2020 are used to train the model, while the data between June 2020 and May 2021 are used to assess it. Performances of the model are assessed a posteriori through F1-score by comparing detected anomalies with the data center’s history. The proposed model outperforms the state-of-the-art reconstruction method, which uses only one autoencoder taking multivariate sequences and detects an anomaly with a threshold on the reconstruction error, with an F1-score of 83.60% compared to 24.16%.

Keywords: Anomaly detection, autoencoder, data centers, deep learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 682
7320 Paradigms Shift in Sport Sciences: Body's focus

Authors: Michele V. Carbinatto, Wagner Wey Moreira, Myrian Nunomura; Mariana H. C. Tsukamoto, VilmaLeni Nista-Piccolo

Abstract:

Sports Sciences has been historically supported by the positivism idea of science, especially by the mechanistic/reductionist and becomes a field that views experimentation and measurement as the mayor research domains. The disposition to simplify nature and the world by parts has fragmented and reduced the idea of bodyathletes as machine. In this paper we intent to re-think this perception lined by Complexity Theory. We come with the idea of athletes as a reflexive and active being (corporeity-body). Therefore, the construction of a training that considers the cultural, biological, psychological elements regarding the experience of the human corporal movements in a circumspect and responsible way could bring better chances of accomplishment. In the end, we hope to help coaches understand the intrinsic complexity of the body they are training, how better deal with it, and, in the field of a deep globalization among the different types of knowledge, to respect and accepted the peculiarities of knowledge that comprise this area.

Keywords: Sport science, body, complexity theory, corporeity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2085
7319 AnQL: A Query Language for Annotation Documents

Authors: Neerja Bhatnagar, Ben A. Juliano, Renee S. Renner

Abstract:

This paper presents data annotation models at five levels of granularity (database, relation, column, tuple, and cell) of relational data to address the problem of unsuitability of most relational databases to express annotations. These models do not require any structural and schematic changes to the underlying database. These models are also flexible, extensible, customizable, database-neutral, and platform-independent. This paper also presents an SQL-like query language, named Annotation Query Language (AnQL), to query annotation documents. AnQL is simple to understand and exploits the already-existent wide knowledge and skill set of SQL.

Keywords: Annotation query language, data annotations, data annotation models, semantic data annotations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1770
7318 Machine Learning-Enabled Classification of Climbing Using Small Data

Authors: Nicholas Milburn, Yu Liang, Dalei Wu

Abstract:

Athlete performance scoring within the climbing domain presents interesting challenges as the sport does not have an objective way to assign skill. Assessing skill levels within any sport is valuable as it can be used to mark progress while training, and it can help an athlete choose appropriate climbs to attempt. Machine learning-based methods are popular for complex problems like this. The dataset available was composed of dynamic force data recorded during climbing; however, this dataset came with challenges such as data scarcity, imbalance, and it was temporally heterogeneous. Investigated solutions to these challenges include data augmentation, temporal normalization, conversion of time series to the spectral domain, and cross validation strategies. The investigated solutions to the classification problem included light weight machine classifiers KNN and SVM as well as the deep learning with CNN. The best performing model had an 80% accuracy. In conclusion, there seems to be enough information within climbing force data to accurately categorize climbers by skill.

Keywords: Classification, climbing, data imbalance, data scarcity, machine learning, time sequence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 501
7317 FCNN-MR: A Parallel Instance Selection Method Based on Fast Condensed Nearest Neighbor Rule

Authors: Lu Si, Jie Yu, Shasha Li, Jun Ma, Lei Luo, Qingbo Wu, Yongqi Ma, Zhengji Liu

Abstract:

Instance selection (IS) technique is used to reduce the data size to improve the performance of data mining methods. Recently, to process very large data set, several proposed methods divide the training set into some disjoint subsets and apply IS algorithms independently to each subset. In this paper, we analyze the limitation of these methods and give our viewpoint about how to divide and conquer in IS procedure. Then, based on fast condensed nearest neighbor (FCNN) rule, we propose a large data sets instance selection method with MapReduce framework. Besides ensuring the prediction accuracy and reduction rate, it has two desirable properties: First, it reduces the work load in the aggregation node; Second and most important, it produces the same result with the sequential version, which other parallel methods cannot achieve. We evaluate the performance of FCNN-MR on one small data set and two large data sets. The experimental results show that it is effective and practical.

Keywords: Instance selection, data reduction, MapReduce, kNN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 978
7316 Optimization of Real Time Measured Data Transmission, Given the Amount of Data Transmitted

Authors: Michal Kopcek, Tomas Skulavik, Michal Kebisek, Gabriela Krizanova

Abstract:

The operation of nuclear power plants involves continuous monitoring of the environment in their area. This monitoring is performed using a complex data acquisition system, which collects status information about the system itself and values of many important physical variables e.g. temperature, humidity, dose rate etc. This paper describes a proposal and optimization of communication that takes place in teledosimetric system between the central control server responsible for the data processing and storing and the decentralized measuring stations, which are measuring the physical variables. Analyzes of ongoing communication were performed and consequently the optimization of the system architecture and communication was done.

Keywords: Communication protocol, transmission optimization, data acquisition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1778
7315 Digital Paradoxes in Learning Theories

Authors: Marcello Bettoni

Abstract:

As a learning theory tries to borrow from science a framework to found its method, it shows paradoxes and paralysing contraddictions. This results, on one hand, from adopting a learning/teaching model as it were a mere “transfer of data" (mechanical learning approach), and on the other hand from borrowing the complexity theory (an indeterministic and non-linear model), that risks to vanish every educational effort. This work is aimed at describing existing criticism, unveiling the antinomic nature of such paradoxes, focussing on a view where neither the mechanical learning perspective nor the chaotic and nonlinear model can threaten and jeopardize the educational work. Author intends to go back over the steps that led to these paradoxes and to unveil their antinomic nature. Actually this could serve the purpose to explain some current misunderstandings about the real usefulness of Ict within the youth-s learning process and growth.

Keywords: Antinomy, complexity, Leibniz, paradox

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1241
7314 Empirical Process Monitoring Via Chemometric Analysis of Partially Unbalanced Data

Authors: Hyun-Woo Cho

Abstract:

Real-time or in-line process monitoring frameworks are designed to give early warnings for a fault along with meaningful identification of its assignable causes. In artificial intelligence and machine learning fields of pattern recognition various promising approaches have been proposed such as kernel-based nonlinear machine learning techniques. This work presents a kernel-based empirical monitoring scheme for batch type production processes with small sample size problem of partially unbalanced data. Measurement data of normal operations are easy to collect whilst special events or faults data are difficult to collect. In such situations, noise filtering techniques can be helpful in enhancing process monitoring performance. Furthermore, preprocessing of raw process data is used to get rid of unwanted variation of data. The performance of the monitoring scheme was demonstrated using three-dimensional batch data. The results showed that the monitoring performance was improved significantly in terms of detection success rate of process fault.

Keywords: Process Monitoring, kernel methods, multivariate filtering, data-driven techniques, quality improvement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1715
7313 Physiological Effects during Aerobatic Flights on Science Astronaut Candidates

Authors: Pedro Llanos, Diego García

Abstract:

Spaceflight is considered the last frontier in terms of science, technology, and engineering. But it is also the next frontier in terms of human physiology and performance. After more than 200,000 years humans have evolved under earth’s gravity and atmospheric conditions, spaceflight poses environmental stresses for which human physiology is not adapted. Hypoxia, accelerations, and radiation are among such stressors, our research involves suborbital flights aiming to develop effective countermeasures in order to assure sustainable human space presence. The physiologic baseline of spaceflight participants is subject to great variability driven by age, gender, fitness, and metabolic reserve. The objective of the present study is to characterize different physiologic variables in a population of STEM practitioners during an aerobatic flight. Cardiovascular and pulmonary responses were determined in Science Astronaut Candidates (SACs) during unusual attitude aerobatic flight indoctrination. Physiologic data recordings from 20 subjects participating in high-G flight training were analyzed. These recordings were registered by wearable sensor-vest that monitored electrocardiographic tracings (ECGs), signs of dysrhythmias or other electric disturbances during all the flight. The same cardiovascular parameters were also collected approximately 10 min pre-flight, during each high-G/unusual attitude maneuver and 10 min after the flights. The ratio (pre-flight/in-flight/post-flight) of the cardiovascular responses was calculated for comparison of inter-individual differences. The resulting tracings depicting the cardiovascular responses of the subjects were compared against the G-loads (Gs) during the aerobatic flights to analyze cardiovascular variability aspects and fluid/pressure shifts due to the high Gs. In-flight ECG revealed cardiac variability patterns associated with rapid Gs onset in terms of reduced heart rate (HR) and some scattered dysrhythmic patterns (15% premature ventricular contractions-type) that were considered as triggered physiological responses to high-G/unusual attitude training and some were considered as instrument artifact. Variation events were observed in subjects during the +Gz and –Gz maneuvers and these may be due to preload and afterload, sudden shift. Our data reveal that aerobatic flight influenced the breathing rate of the subject, due in part by the various levels of energy expenditure due to the increased use of muscle work during these aerobatic maneuvers. Noteworthy was the high heterogeneity in the different physiological responses among a relatively small group of SACs exposed to similar aerobatic flights with similar Gs exposures. The cardiovascular responses clearly demonstrated that SACs were subjected to significant flight stress. Routine ECG monitoring during high-G/unusual attitude flight training is recommended to capture pathology underlying dangerous dysrhythmias in suborbital flight safety. More research is currently being conducted to further facilitate the development of robust medical screening, medical risk assessment approaches, and suborbital flight training in the context of the evolving commercial human suborbital spaceflight industry. A more mature and integrative medical assessment method is required to understand the physiology state and response variability among highly diverse populations of prospective suborbital flight participants.

Keywords: Aerobatic maneuvers, G force, hypoxia, suborbital flight, commercial astronauts.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 506
7312 A Comparison of Image Data Representations for Local Stereo Matching

Authors: André Smith, Amr Abdel-Dayem

Abstract:

The stereo matching problem, while having been present for several decades, continues to be an active area of research. The goal of this research is to find correspondences between elements found in a set of stereoscopic images. With these pairings, it is possible to infer the distance of objects within a scene, relative to the observer. Advancements in this field have led to experimentations with various techniques, from graph-cut energy minimization to artificial neural networks. At the basis of these techniques is a cost function, which is used to evaluate the likelihood of a particular match between points in each image. While at its core, the cost is based on comparing the image pixel data; there is a general lack of consistency as to what image data representation to use. This paper presents an experimental analysis to compare the effectiveness of more common image data representations. The goal is to determine the effectiveness of these data representations to reduce the cost for the correct correspondence relative to other possible matches.

Keywords: Colour data, local stereo matching, stereo correspondence, disparity map.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 871
7311 Flexible, Adaptable and Scaleable Business Rules Management System for Data Validation

Authors: Kashif Kamran, Farooque Azam

Abstract:

The policies governing the business of any organization are well reflected in her business rules. The business rules are implemented by data validation techniques, coded during the software development process. Any change in business policies results in change in the code written for data validation used to enforce the business policies. Implementing the change in business rules without changing the code is the objective of this paper. The proposed approach enables users to create rule sets at run time once the software has been developed. The newly defined rule sets by end users are associated with the data variables for which the validation is required. The proposed approach facilitates the users to define business rules using all the comparison operators and Boolean operators. Multithreading is used to validate the data entered by end user against the business rules applied. The evaluation of the data is performed by a newly created thread using an enhanced form of the RPN (Reverse Polish Notation) algorithm.

Keywords: Business Rules, data validation, multithreading, Reverse Polish Notation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2237
7310 Spatial Data Mining by Decision Trees

Authors: S. Oujdi, H. Belbachir

Abstract:

Existing methods of data mining cannot be applied on spatial data because they require spatial specificity consideration, as spatial relationships. This paper focuses on the classification with decision trees, which are one of the data mining techniques. We propose an extension of the C4.5 algorithm for spatial data, based on two different approaches Join materialization and Querying on the fly the different tables. Similar works have been done on these two main approaches, the first - Join materialization - favors the processing time in spite of memory space, whereas the second - Querying on the fly different tables- promotes memory space despite of the processing time. The modified C4.5 algorithm requires three entries tables: a target table, a neighbor table, and a spatial index join that contains the possible spatial relationship among the objects in the target table and those in the neighbor table. Thus, the proposed algorithms are applied to a spatial data pattern in the accidentology domain. A comparative study of our approach with other works of classification by spatial decision trees will be detailed.

Keywords: C4.5 Algorithm, Decision trees, S-CART, Spatial data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2947
7309 Affine Projection Algorithm with Variable Data-Reuse Factor

Authors: ChangWoo Lee, Young Kow Lee, Sung Jun Ban, SungHoo Choi, Sang Woo Kim

Abstract:

This paper suggests a new Affine Projection (AP) algorithm with variable data-reuse factor using the condition number as a decision factor. To reduce computational burden, we adopt a recently reported technique which estimates the condition number of an input data matrix. Several simulations show that the new algorithm has better performance than that of the conventional AP algorithm.

Keywords: Affine projection algorithm, variable data-reuse factor, condition number, convergence rate, misalignment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1511
7308 Influence of Shading on a BIPV System’s Performance in an Urban Context: Case Study of BIPV Systems of the Science Center of Complexity Building of the National and Autonomous University of Mexico in Mexico City

Authors: Viridiana Edith Ardura Perea, José Luis Bermúdez Alcocer

Abstract:

The purpose of this paper is to establish the influence of shading on a Building Integrated Photovoltaic (BIPV) system´s performance in an urban context. The PV systems of the Science Center of Complexity (Centro de Ciencias de la Complejidad) Building based in the Main Campus of the National and Autonomous University of Mexico (UNAM) in Mexico City was taken as case study.  The PV systems are placed on the rooftop and on the south façade of the building.  The south-façade PV system, operating as sunshades, consists of two strings:  one at the ground floor and the other one at the first floor.  According to the building’s facility manager, the south-façade PV system generates 42% less electricity per kilowatt peak (kWp) installed than the one on the roof.  The methods applied in this study were Solar Radiation Analysis (SRA) simulations performed with the Insight 360 Plug-in from Revit 2018® and an on-site measurement using specialized tools.  The results of the SRA simulations showed that the shading casted by the PV system placed on the first floor on top of the PV system of the ground floor decreases its solar incident radiation over 50%.  The simulation outcome was compared and validated to the measured data obtained from the on-site measurement.  In conclusion, the loss factor achieved from the shading of the PVs is due to the surroundings and the PV system´s own design.  The south-façade BIPV system’s deficient design generates critical losses on its performance and decreases its profitability.

Keywords: Building integrated photovoltaics (BIPV) design, energy analysis software, shading losses, solar radiation analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1457
7307 Selecting Negative Examples for Protein-Protein Interaction

Authors: Mohammad Shoyaib, M. Abdullah-Al-Wadud, Oksam Chae

Abstract:

Proteomics is one of the largest areas of research for bioinformatics and medical science. An ambitious goal of proteomics is to elucidate the structure, interactions and functions of all proteins within cells and organisms. Predicting Protein-Protein Interaction (PPI) is one of the crucial and decisive problems in current research. Genomic data offer a great opportunity and at the same time a lot of challenges for the identification of these interactions. Many methods have already been proposed in this regard. In case of in-silico identification, most of the methods require both positive and negative examples of protein interaction and the perfection of these examples are very much crucial for the final prediction accuracy. Positive examples are relatively easy to obtain from well known databases. But the generation of negative examples is not a trivial task. Current PPI identification methods generate negative examples based on some assumptions, which are likely to affect their prediction accuracy. Hence, if more reliable negative examples are used, the PPI prediction methods may achieve even more accuracy. Focusing on this issue, a graph based negative example generation method is proposed, which is simple and more accurate than the existing approaches. An interaction graph of the protein sequences is created. The basic assumption is that the longer the shortest path between two protein-sequences in the interaction graph, the less is the possibility of their interaction. A well established PPI detection algorithm is employed with our negative examples and in most cases it increases the accuracy more than 10% in comparison with the negative pair selection method in that paper.

Keywords: Interaction graph, Negative training data, Protein-Protein interaction, Support vector machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1668
7306 Using Data Mining for Learning and Clustering FCM

Authors: Somayeh Alizadeh, Mehdi Ghazanfari, Mohammad Fathian

Abstract:

Fuzzy Cognitive Maps (FCMs) have successfully been applied in numerous domains to show relations between essential components. In some FCM, there are more nodes, which related to each other and more nodes means more complex in system behaviors and analysis. In this paper, a novel learning method used to construct FCMs based on historical data and by using data mining and DEMATEL method, a new method defined to reduce nodes number. This method cluster nodes in FCM based on their cause and effect behaviors.

Keywords: Clustering, Data Mining, Fuzzy Cognitive Map(FCM), Learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1987
7305 Modeling Low Voltage Power Line as a Data Communication Channel

Authors: Eklas Hossain, Sheroz Khan, Ahad Ali

Abstract:

Power line communications may be used as a data communication channel in public and indoor distribution networks so that it does not require the installing of new cables. Industrial low voltage distribution network may be utilized for data transfer required by the on-line condition monitoring of electric motors. This paper presents a pilot distribution network for modeling low voltage power line as data transfer channel. The signal attenuation in communication channels in the pilot environment is presented and the analysis is done by varying the corresponding parameters for the signal attenuation.

Keywords: Data communication, indoor distribution networks, low voltage, power line.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3248
7304 Generating Concept Trees from Dynamic Self-organizing Map

Authors: Norashikin Ahmad, Damminda Alahakoon

Abstract:

Self-organizing map (SOM) provides both clustering and visualization capabilities in mining data. Dynamic self-organizing maps such as Growing Self-organizing Map (GSOM) has been developed to overcome the problem of fixed structure in SOM to enable better representation of the discovered patterns. However, in mining large datasets or historical data the hierarchical structure of the data is also useful to view the cluster formation at different levels of abstraction. In this paper, we present a technique to generate concept trees from the GSOM. The formation of tree from different spread factor values of GSOM is also investigated and the quality of the trees analyzed. The results show that concept trees can be generated from GSOM, thus, eliminating the need for re-clustering of the data from scratch to obtain a hierarchical view of the data under study.

Keywords: dynamic self-organizing map, concept formation, clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1421
7303 Optical Fiber Data Throughput in a Quantum Communication System

Authors: Arash Kosari, Ali Araghi

Abstract:

A mathematical model for an optical-fiber communication channel is developed which results in an expression that calculates the throughput and loss of the corresponding link. The data are assumed to be transmitted by using of separate photons with different polarizations. The derived model also shows the dependency of data throughput with length of the channel and depolarization factor. It is observed that absorption of photons affects the throughput in a more intensive way in comparison with that of depolarization. Apart from that, the probability of depolarization and the absorption of radiated photons are obtained.

Keywords: Absorption, data throughput, depolarization, optical fiber.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1637
7302 Fuzzy Clustering of Categorical Attributes and its Use in Analyzing Cultural Data

Authors: George E. Tsekouras, Dimitris Papageorgiou, Sotiris Kotsiantis, Christos Kalloniatis, Panagiotis Pintelas

Abstract:

We develop a three-step fuzzy logic-based algorithm for clustering categorical attributes, and we apply it to analyze cultural data. In the first step the algorithm employs an entropy-based clustering scheme, which initializes the cluster centers. In the second step we apply the fuzzy c-modes algorithm to obtain a fuzzy partition of the data set, and the third step introduces a novel cluster validity index, which decides the final number of clusters.

Keywords: Categorical data, cultural data, fuzzy logic clustering, fuzzy c-modes, cluster validity index.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1669
7301 Weighted Data Replication Strategy for Data Grid Considering Economic Approach

Authors: N. Mansouri, A. Asadi

Abstract:

Data Grid is a geographically distributed environment that deals with data intensive application in scientific and enterprise computing. Data replication is a common method used to achieve efficient and fault-tolerant data access in Grids. In this paper, a dynamic data replication strategy, called Enhanced Latest Access Largest Weight (ELALW) is proposed. This strategy is an enhanced version of Latest Access Largest Weight strategy. However, replication should be used wisely because the storage capacity of each Grid site is limited. Thus, it is important to design an effective strategy for the replication replacement task. ELALW replaces replicas based on the number of requests in future, the size of the replica, and the number of copies of the file. It also improves access latency by selecting the best replica when various sites hold replicas. The proposed replica selection selects the best replica location from among the many replicas based on response time that can be determined by considering the data transfer time, the storage access latency, the replica requests that waiting in the storage queue and the distance between nodes. Simulation results utilizing the OptorSim show our replication strategy achieve better performance overall than other strategies in terms of job execution time, effective network usage and storage resource usage.

Keywords: Data grid, data replication, simulation, replica selection, replica placement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2072
7300 A Proposal of an Automatic Formatting Method for Transforming XML Data

Authors: Zhe JIN, Motomichi TOYAMA

Abstract:

PPX(Pretty Printer for XML) is a query language that offers a concise description method of formatting the XML data into HTML. In this paper, we propose a simple specification of formatting method that is a combination description of automatic layout operators and variables in the layout expression of the GENERATE clause of PPX. This method can automatically format irregular XML data included in a part of XML with layout decision rule that is referred to DTD. In the experiment, a quick comparison shows that PPX requires far less description compared to XSLT or XQuery programs doing same tasks.

Keywords: PPX, Irregular XML data, Layout decision rule, HTML.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1385
7299 A Programming Assessment Software Artefact Enhanced with the Help of Learners

Authors: Romeo A. Botes, Imelda Smit

Abstract:

The demands of an ever changing and complex higher education environment, along with the profile of modern learners challenge current approaches to assessment and feedback. More learners enter the education system every year. The younger generation expects immediate feedback. At the same time, feedback should be meaningful. The assessment of practical activities in programming poses a particular problem, since both lecturers and learners in the information and computer science discipline acknowledge that paper-based assessment for programming subjects lacks meaningful real-life testing. At the same time, feedback lacks promptness, consistency, comprehensiveness and individualisation. Most of these aspects may be addressed by modern, technology-assisted assessment. The focus of this paper is the continuous development of an artefact that is used to assist the lecturer in the assessment and feedback of practical programming activities in a senior database programming class. The artefact was developed using three Design Science Research cycles. The first implementation allowed one programming activity submission per assessment intervention. This pilot provided valuable insight into the obstacles regarding the implementation of this type of assessment tool. A second implementation improved the initial version to allow multiple programming activity submissions per assessment. The focus of this version is on providing scaffold feedback to the learner – allowing improvement with each subsequent submission. It also has a built-in capability to provide the lecturer with information regarding the key problem areas of each assessment intervention.

Keywords: Programming, computer-aided assessment, technology-assisted assessment, programming assessment software, design science research, mixed-method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 960
7298 Data Mining in Oral Medicine Using Decision Trees

Authors: Fahad Shahbaz Khan, Rao Muhammad Anwer, Olof Torgersson, Göran Falkman

Abstract:

Data mining has been used very frequently to extract hidden information from large databases. This paper suggests the use of decision trees for continuously extracting the clinical reasoning in the form of medical expert-s actions that is inherent in large number of EMRs (Electronic Medical records). In this way the extracted data could be used to teach students of oral medicine a number of orderly processes for dealing with patients who represent with different problems within the practice context over time.

Keywords: Data mining, Oral Medicine, Decision Trees, WEKA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2460
7297 An Efficient Data Collection Approach for Wireless Sensor Networks

Authors: Hanieh Alipour, Alireza Nemaney Pour

Abstract:

One of the most important applications of wireless sensor networks is data collection. This paper proposes as efficient approach for data collection in wireless sensor networks by introducing Member Forward List. This list includes the nodes with highest priority for forwarding the data. When a node fails or dies, this list is used to select the next node with higher priority. The benefit of this node is that it prevents the algorithm from repeating when a node fails or dies. The results show that Member Forward List decreases power consumption and latency in wireless sensor networks.

Keywords: Data Collection, Wireless Sensor Network, SensorNode, Tree-Based

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2372
7296 Scientific Production on Lean Supply Chains Published in Journals Indexed by SCOPUS and Web of Science Databases: A Bibliometric Study

Authors: T. Botelho de Sousa, F. Raphael Cabral Furtado, O. Eduardo da Silva Ferri, A. Batista, W. Augusto Varella, C. Eduardo Pinto, J. Mimar Santa Cruz Yabarrena, S. Gibran Ruwer, F. Müller Guerrini, L. Adalberto Philippsen Júnior

Abstract:

Lean Supply Chain Management (LSCM) is an emerging research field in Operations Management (OM). As a strategic model that focuses on reduced cost and waste with fulfilling the needs of customers, LSCM attracts great interest among researchers and practitioners. The purpose of this paper is to present an overview of Lean Supply Chains literature, based on bibliometric analysis through 57 papers published in indexed journals by SCOPUS and/or Web of Science databases. The results indicate that the last three years (2015, 2016, and 2017) were the most productive on LSCM discussion, especially in Supply Chain Management and International Journal of Lean Six Sigma journals. India, USA, and UK are the most productive countries; nevertheless, cross-country studies by collaboration among researchers were detected, by social network analysis, as a research practice, appearing to play a more important role on LSCM studies. Despite existing limitation, such as limited indexed journal database, bibliometric analysis helps to enlighten ongoing efforts on LSCM researches, including most used technical procedures and collaboration network, showing important research gaps, especially, for development countries researchers.

Keywords: Lean supply chains, bibliometric study, SCOPUS, web of Science.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 896