Search results for: Large Data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8997

Search results for: Large Data

8817 Simulation of Large Deformations of Rubbers by the RKPM Method

Authors: M. Foroutan, H. Dalayeli, M. Sadeghian

Abstract:

In this paper processes including large deformations of a rubber with hyperelastic material behavior are simulated by the RKPM method. Due to the loss of kronecker delta properties in the mesh less shape functions, the imposition of essential boundary conditions consumes significant CPU time in mesh free computations. In this work transformation method is used for imposition of essential boundary conditions. A RKPM material shape function is used in this analysis. The support of the material shape functions covers the same set of particles during material deformation and hence the transformation matrix is formed only once at the initial stages. A computer program in MATLAB is developed for simulations.

Keywords: RKPM, large deformations, transformation, essentialboundary conditions.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1893
8816 Lean in Large Enterprises: Study Results

Authors: Dorota Stadnicka, Katarzyna Antosz

Abstract:

The idea of Lean Manufacturing has been known for 20 years. In Polish enterprises its first implementation took place in the automotive industry in the 90s. Many companies, in order to reduce costs, use lower quality materials or overload employees with work and do not notice other possibilities for improving the company’s effectiveness. Enterprises are afraid of the unknown. And that is the problem in many cases with the Lean Manufacturing conception. This article presents the study results conducted in enterprises, the aim of which was to identify waste awareness and the need of lean manufacturing implementation. The authors also wanted to gain information on the most commonly used tools and the way of implementation and the methods of the effects assessment of using the mentioned conception. The study was conducted in large enterprises located on a limited area.

Keywords: Large enterprises, lean manufacturing, tools, wastes elimination.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3586
8815 Standard Languages for Creating a Database to Display Financial Statements on a Web Application

Authors: Vladimir Simovic, Matija Varga, Predrag Oreski

Abstract:

XHTML and XBRL are the standard languages for creating a database for the purpose of displaying financial statements on web applications. Today, XBRL is one of the most popular languages for business reporting. A large number of countries in the world recognize the role of XBRL language for financial reporting and the benefits that the reporting format provides in the collection, analysis, preparation, publication and the exchange of data (information) which is the positive side of this language. Here we present all advantages and opportunities that a company may have by using the XBRL format for business reporting. Also, this paper presents XBRL and other languages that are used for creating the database, such XML, XHTML, etc. The role of the AJAX complex model and technology will be explained in detail, and during the exchange of financial data between the web client and web server. Here will be mentioned basic layers of the network for data exchange via the web.

Keywords: XHTML, XBRL, XML, JavaScript, AJAX technology, data exchange.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1070
8814 Revised PLWAP Tree with Non-frequent Items for Mining Sequential Pattern

Authors: R. Vishnu Priya, A. Vadivel

Abstract:

Sequential pattern mining is a challenging task in data mining area with large applications. One among those applications is mining patterns from weblog. Recent times, weblog is highly dynamic and some of them may become absolute over time. In addition, users may frequently change the threshold value during the data mining process until acquiring required output or mining interesting rules. Some of the recently proposed algorithms for mining weblog, build the tree with two scans and always consume large time and space. In this paper, we build Revised PLWAP with Non-frequent Items (RePLNI-tree) with single scan for all items. While mining sequential patterns, the links related to the nonfrequent items are not considered. Hence, it is not required to delete or maintain the information of nodes while revising the tree for mining updated transactions. The algorithm supports both incremental and interactive mining. It is not required to re-compute the patterns each time, while weblog is updated or minimum support changed. The performance of the proposed tree is better, even the size of incremental database is more than 50% of existing one. For evaluation purpose, we have used the benchmark weblog dataset and found that the performance of proposed tree is encouraging compared to some of the recently proposed approaches.

Keywords: Sequential pattern mining, weblog, frequent and non-frequent items, incremental and interactive mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1931
8813 Visual Text Analytics Technologies for Real-Time Big Data: Chronological Evolution and Issues

Authors: Siti Azrina B. A. Aziz, Siti Hafizah A. Hamid

Abstract:

New approaches to analyze and visualize data stream in real-time basis is important in making a prompt decision by the decision maker. Financial market trading and surveillance, large-scale emergency response and crowd control are some example scenarios that require real-time analytic and data visualization. This situation has led to the development of techniques and tools that support humans in analyzing the source data. With the emergence of Big Data and social media, new techniques and tools are required in order to process the streaming data. Today, ranges of tools which implement some of these functionalities are available. In this paper, we present chronological evolution evaluation of technologies for supporting of real-time analytic and visualization of the data stream. Based on the past research papers published from 2002 to 2014, we gathered the general information, main techniques, challenges and open issues. The techniques for streaming text visualization are identified based on Text Visualization Browser in chronological order. This paper aims to review the evolution of streaming text visualization techniques and tools, as well as to discuss the problems and challenges for each of identified tools.

Keywords: Information visualization, visual analytics, text mining, visual text analytics tools, big data visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1002
8812 Reducing SAGE Data Using Genetic Algorithms

Authors: Cheng-Hong Yang, Tsung-Mu Shih, Li-Yeh Chuang

Abstract:

Serial Analysis of Gene Expression is a powerful quantification technique for generating cell or tissue gene expression data. The profile of the gene expression of cell or tissue in several different states is difficult for biologists to analyze because of the large number of genes typically involved. However, feature selection in machine learning can successfully reduce this problem. The method allows reducing the features (genes) in specific SAGE data, and determines only relevant genes. In this study, we used a genetic algorithm to implement feature selection, and evaluate the classification accuracy of the selected features with the K-nearest neighbor method. In order to validate the proposed method, we used two SAGE data sets for testing. The results of this study conclusively prove that the number of features of the original SAGE data set can be significantly reduced and higher classification accuracy can be achieved.

Keywords: Serial Analysis of Gene Expression, Feature selection, Genetic Algorithm, K-nearest neighbor method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1609
8811 Lightweight Mirrors for Space X-Ray Telescopes

Authors: M. Mika, L. Pina, M. Landova, L. Sveda, R. Havlikova, V. Marsikova, R. Hudec, A. Inneman

Abstract:

Future astronomical projects on large space x-ray imaging telescopes require novel substrates and technologies for the construction of their reflecting mirrors. The mirrors must be lightweight and precisely shaped to achieve large collecting area with high angular resolution. The new materials and technologies must be cost-effective. Currently, the most promising materials are glass or silicon foils. We focused on precise shaping these foils by thermal forming process. We studied free and forced slumping in the temperature region of hot plastic deformation and compared the shapes obtained by the different slumping processes. We measured the shapes and the surface quality of the foils. In the experiments, we varied both heat-treatment temperature and time following our experiment design. The obtained data and relations we can use for modeling and optimizing the thermal forming procedure.

Keywords: Glass, silicon, thermal forming, x-ray

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1385
8810 A Methodology for Automatic Diversification of Document Categories

Authors: Dasom Kim, Chen Liu, Myungsu Lim, Soo-Hyeon Jeon, Byeoung Kug Jeon, Kee-Young Kwahk, Namgyu Kim

Abstract:

Recently, numerous documents including large volumes of unstructured data and text have been created because of the rapid increase in the use of social media and the Internet. Usually, these documents are categorized for the convenience of users. Because the accuracy of manual categorization is not guaranteed, and such categorization requires a large amount of time and incurs huge costs. Many studies on automatic categorization have been conducted to help mitigate the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to categorize complex documents with multiple topics because they work on the assumption that individual documents can be categorized into single categories only. Therefore, to overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, the learning process employed in these studies involves training using a multi-categorized document set. These methods therefore cannot be applied to the multi-categorization of most documents unless multi-categorized training sets using traditional multi-categorization algorithms are provided. To overcome this limitation, in this study, we review our novel methodology for extending the category of a single-categorized document to multiple categorizes, and then introduce a survey-based verification scenario for estimating the accuracy of our automatic categorization methodology.

Keywords: Big Data Analysis, Document Classification, Text Mining, Topic Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1745
8809 Localization for Indoor Service Robot Using Natural Landmark on the Ceiling

Authors: Seung-Hun Kim, Changwoo Park

Abstract:

In this paper, we present a localization of a mobile robot with localization modules which have two ceiling-view cameras in indoor environments. We propose two kinds of localization method. The one is the localization in the local space; we use the line feature and the corner feature between the ceiling and wall. The other is the localization in the large space; we use the natural features such as bulbs, structures on the ceiling. These methods are installed on the embedded module able to mount on the robot. The embedded module has two cameras to be able to localize in both the local space and the large spaces. The experiment is practiced in our indoor test-bed and a government office. The proposed method is proved by the experimental results.

Keywords: Robot, Localization, Indoor, Ceiling vision, Local space, Large space, Complex space.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2177
8808 A Weighted Sum Technique for the Joint Optimization of Performance and Power Consumption in Data Centers

Authors: Samee Ullah Khan, C.Ardil

Abstract:

With data centers, end-users can realize the pervasiveness of services that will be one day the cornerstone of our lives. However, data centers are often classified as computing systems that consume the most amounts of power. To circumvent such a problem, we propose a self-adaptive weighted sum methodology that jointly optimizes the performance and power consumption of any given data center. Compared to traditional methodologies for multi-objective optimization problems, the proposed self-adaptive weighted sum technique does not rely on a systematical change of weights during the optimization procedure. The proposed technique is compared with the greedy and LR heuristics for large-scale problems, and the optimal solution for small-scale problems implemented in LINDO. the experimental results revealed that the proposed selfadaptive weighted sum technique outperforms both of the heuristics and projects a competitive performance compared to the optimal solution.

Keywords: Meta-heuristics, distributed systems, adaptive methods, resource allocation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1836
8807 Design and Implementation of Security Middleware for Data Warehouse Signature Framework

Authors: Mayada AlMeghari

Abstract:

Recently, grid middlewares have provided large integrated use of network resources as the shared data and the CPU to become a virtual supercomputer. In this work, we present the design and implementation of the middleware for Data Warehouse Signature (DWS) Framework. The aim of using the middleware in the proposed DWS framework is to achieve the high performance by the parallel computing. This middleware is developed on Alchemi.Net framework to increase the security among the network nodes through the authentication and group-key distribution model. This model achieves the key security and prevents any intermediate attacks in the middleware. This paper presents the flow process structures of the middleware design. In addition, the paper ensures the implementation of security for DWS middleware enhancement with the authentication and group-key distribution model. Finally, from the analysis of other middleware approaches, the developed middleware of DWS framework is the optimal solution of a complete covering of security issues.

Keywords: Middleware, parallel computing, data warehouse, security, group-key, high performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 337
8806 Parallel Computation of Data Summation for Multiple Problem Spaces on Partitioned Optical Passive Stars Network

Authors: Khin Thida Latt, Mineo Kaneko, Yoichi Shinoda

Abstract:

In Partitioned Optical Passive Stars POPS network,nodes and couplers become free after slot to slot in some computation.It is necessary to efficiently utilize free couplers and nodes to be cost effective. Improving parallelism, we present the fast data summation algorithm for multiple problem spaces on P OP S(g, g) with smaller number of nodes for the case of d =n = g. For the case of d >n > g, we simulate the calculation of large number of data items dedicated to larger system with many nodes on smaller system with smaller number of nodes. The algorithm is faster than the best know algorithm and using smaller number of nodes and groups make the system low cost and practical.

Keywords: Partitioned optical passive stars network, parallelcomputing, optical computing, data sum

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1179
8805 A Tree Based Association Rule Approach for XML Data with Semantic Integration

Authors: D. Sasikala, K. Premalatha

Abstract:

The use of eXtensible Markup Language (XML) in web, business and scientific databases lead to the development of methods, techniques and systems to manage and analyze XML data. Semi-structured documents suffer due to its heterogeneity and dimensionality. XML structure and content mining represent convergence for research in semi-structured data and text mining. As the information available on the internet grows drastically, extracting knowledge from XML documents becomes a harder task. Certainly, documents are often so large that the data set returned as answer to a query may also be very big to convey the required information. To improve the query answering, a Semantic Tree Based Association Rule (STAR) mining method is proposed. This method provides intentional information by considering the structure, content and the semantics of the content. The method is applied on Reuter’s dataset and the results show that the proposed method outperforms well.

Keywords: Semi--structured Document, Tree based Association Rule (TAR), Semantic Association Rule Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2352
8804 A Survey in Techniques for Imbalanced Intrusion Detection System Datasets

Authors: Najmeh Abedzadeh, Matthew Jacobs

Abstract:

An intrusion detection system (IDS) is a software application that monitors malicious activities and generates alerts if any are detected. However, most network activities in IDS datasets are normal, and the relatively few numbers of attacks make the available data imbalanced. Consequently, cyber-attacks can hide inside a large number of normal activities, and machine learning algorithms have difficulty learning and classifying the data correctly. In this paper, a comprehensive literature review is conducted on different types of algorithms for both implementing the IDS and methods in correcting the imbalanced IDS dataset. The most famous algorithms are machine learning (ML), deep learning (DL), synthetic minority over-sampling technique (SMOTE), and reinforcement learning (RL). Most of the research use the CSE-CIC-IDS2017, CSE-CIC-IDS2018, and NSL-KDD datasets for evaluating their algorithms.

Keywords: IDS, intrusion detection system, imbalanced datasets, sampling algorithms, big data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1125
8803 Data Traffic Dynamics and Saturation on a Single Link

Authors: Reginald D. Smith

Abstract:

The dynamics of User Datagram Protocol (UDP) traffic over Ethernet between two computers are analyzed using nonlinear dynamics which shows that there are two clear regimes in the data flow: free flow and saturated. The two most important variables affecting this are the packet size and packet flow rate. However, this transition is due to a transcritical bifurcation rather than phase transition in models such as in vehicle traffic or theorized large-scale computer network congestion. It is hoped this model will help lay the groundwork for further research on the dynamics of networks, especially computer networks.

Keywords: congestion, packet flow, Internet, traffic dynamics, transcritical bifurcation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1615
8802 Influential Parameters in Estimating Soil Properties from Cone Penetrating Test: An Artificial Neural Network Study

Authors: Ahmed G. Mahgoub, Dahlia H. Hafez, Mostafa A. Abu Kiefa

Abstract:

The Cone Penetration Test (CPT) is a common in-situ test which generally investigates a much greater volume of soil more quickly than possible from sampling and laboratory tests. Therefore, it has the potential to realize both cost savings and assessment of soil properties rapidly and continuously. The principle objective of this paper is to demonstrate the feasibility and efficiency of using artificial neural networks (ANNs) to predict the soil angle of internal friction (Φ) and the soil modulus of elasticity (E) from CPT results considering the uncertainties and non-linearities of the soil. In addition, ANNs are used to study the influence of different parameters and recommend which parameters should be included as input parameters to improve the prediction. Neural networks discover relationships in the input data sets through the iterative presentation of the data and intrinsic mapping characteristics of neural topologies. General Regression Neural Network (GRNN) is one of the powerful neural network architectures which is utilized in this study. A large amount of field and experimental data including CPT results, plate load tests, direct shear box, grain size distribution and calculated data of overburden pressure was obtained from a large project in the United Arab Emirates. This data was used for the training and the validation of the neural network. A comparison was made between the obtained results from the ANN's approach, and some common traditional correlations that predict Φ and E from CPT results with respect to the actual results of the collected data. The results show that the ANN is a very powerful tool. Very good agreement was obtained between estimated results from ANN and actual measured results with comparison to other correlations available in the literature. The study recommends some easily available parameters that should be included in the estimation of the soil properties to improve the prediction models. It is shown that the use of friction ration in the estimation of Φ and the use of fines content in the estimation of E considerable improve the prediction models.

Keywords: Angle of internal friction, Cone penetrating test, General regression neural network, Soil modulus of elasticity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2282
8801 Distributed Splay Suffix Arrays: A New Structure for Distributed String Search

Authors: Tu Kun, Gu Nai-jie, Bi Kun, Liu Gang, Dong Wan-li

Abstract:

As a structure for processing string problem, suffix array is certainly widely-known and extensively-studied. But if the string access pattern follows the “90/10" rule, suffix array can not take advantage of the fact that we often find something that we have just found. Although the splay tree is an efficient data structure for small documents when the access pattern follows the “90/10" rule, it requires many structures and an excessive amount of pointer manipulations for efficiently processing and searching large documents. In this paper, we propose a new and conceptually powerful data structure, called splay suffix arrays (SSA), for string search. This data structure combines the features of splay tree and suffix arrays into a new approach which is suitable to implementation on both conventional and clustered computers.

Keywords: suffix arrays, splay tree, string search, distributedalgorithm

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1777
8800 Inversion of Electrical Resistivity Data: A Review

Authors: Shrey Sharma, Gunjan Kumar Verma

Abstract:

High density electrical prospecting has been widely used in groundwater investigation, civil engineering and environmental survey. For efficient inversion, the forward modeling routine, sensitivity calculation, and inversion algorithm must be efficient. This paper attempts to provide a brief summary of the past and ongoing developments of the method. It includes reviews of the procedures used for data acquisition, processing and inversion of electrical resistivity data based on compilation of academic literature. In recent times there had been a significant evolution in field survey designs and data inversion techniques for the resistivity method. In general 2-D inversion for resistivity data is carried out using the linearized least-square method with the local optimization technique .Multi-electrode and multi-channel systems have made it possible to conduct large 2-D, 3-D and even 4-D surveys efficiently to resolve complex geological structures that were not possible with traditional 1-D surveys. 3-D surveys play an increasingly important role in very complex areas where 2-D models suffer from artifacts due to off-line structures. Continued developments in computation technology, as well as fast data inversion techniques and software, have made it possible to use optimization techniques to obtain model parameters to a higher accuracy. A brief discussion on the limitations of the electrical resistivity method has also been presented.

Keywords: Resistivity, inversion, optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6073
8799 A Materialized View Approach to Support Aggregation Operations over Long Periods in Sensor Networks

Authors: Minsoo Lee, Julee Choi, Sookyung Song

Abstract:

The increasing interest on processing data created by sensor networks has evolved into approaches to implement sensor networks as databases. The aggregation operator, which calculates a value from a large group of data such as computing averages or sums, etc. is an essential function that needs to be provided when implementing such sensor network databases. This work proposes to add the DURING clause into TinySQL to calculate values during a specific long period and suggests a way to implement the aggregation service in sensor networks by applying materialized view and incremental view maintenance techniques that is used in data warehouses. In sensor networks, data values are passed from child nodes to parent nodes and an aggregation value is computed at the root node. As such root nodes need to be memory efficient and low powered, it becomes a problem to recompute aggregate values from all past and current data. Therefore, applying incremental view maintenance techniques can reduce the memory consumption and support fast computation of aggregate values.

Keywords: Aggregation, Incremental View Maintenance, Materialized view, Sensor Network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1540
8798 The Fuel Consumption and Non Linear Model Metropolitan and Large City Transportation System

Authors: Mudjiastuti Handajani

Abstract:

The national economy development affects the vehicle ownership which ultimately increases fuel consumption. The rise of the vehicle ownership is dominated by the increasing number of motorcycles. This research aims to analyze and identify the characteristics of fuel consumption, the city transportation system, and to analyze the relationship and the effect of the city transportation system on the fuel consumption. A multivariable analysis is used in this study. The data analysis techniques include: a Multivariate Multivariable Analysis by using the R software. More than 84% of fuel on Java is consumed in metropolitan and large cities. The city transportation system variables that strongly effect the fuel consumption are population, public vehicles, private vehicles and private bus. This method can be developed to control the fuel consumption by considering the urban transport system and city tipology. The effect can reducing subsidy on the fuel consumption, increasing state economic.

Keywords: city, consumption, fuel, transportation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1950
8797 Ranking DMUs by Ideal PPS in Data Envelopment Analysis

Authors: V.Rezaie, M.Khanmohammady

Abstract:

An original DEA model is to evaluate each DMU optimistically, but the interval DEA Model proposed in this paper has been formulated to obtain an efficiency interval consisting of Evaluations from both the optimistic and the pessimistic view points. DMUs are improved so that their lower bounds become so large as to attain the maximum Value one. The points obtained by this method are called ideal points. Ideal PPS is calculated by ideal of efficiency DMUs. The purpose of this paper is to rank DMUs by this ideal PPS. Finally we extend the efficiency interval of a DMU under variable RTS technology.

Keywords: Data envelopment analysis (DEA), Decision makingunit (DMU), Interval DEA, Ideal points, Ideal PPS, Return to scale(RTS).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1926
8796 Machine Learning Facing Behavioral Noise Problem in an Imbalanced Data Using One Side Behavioral Noise Reduction: Application to a Fraud Detection

Authors: Salma El Hajjami, Jamal Malki, Alain Bouju, Mohammed Berrada

Abstract:

With the expansion of machine learning and data mining in the context of Big Data analytics, the common problem that affects data is class imbalance. It refers to an imbalanced distribution of instances belonging to each class. This problem is present in many real world applications such as fraud detection, network intrusion detection, medical diagnostics, etc. In these cases, data instances labeled negatively are significantly more numerous than the instances labeled positively. When this difference is too large, the learning system may face difficulty when tackling this problem, since it is initially designed to work in relatively balanced class distribution scenarios. Another important problem, which usually accompanies these imbalanced data, is the overlapping instances between the two classes. It is commonly referred to as noise or overlapping data. In this article, we propose an approach called: One Side Behavioral Noise Reduction (OSBNR). This approach presents a way to deal with the problem of class imbalance in the presence of a high noise level. OSBNR is based on two steps. Firstly, a cluster analysis is applied to groups similar instances from the minority class into several behavior clusters. Secondly, we select and eliminate the instances of the majority class, considered as behavioral noise, which overlap with behavior clusters of the minority class. The results of experiments carried out on a representative public dataset confirm that the proposed approach is efficient for the treatment of class imbalances in the presence of noise.

Keywords: Machine learning, Imbalanced data, Data mining, Big data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1137
8795 Data Preprocessing for Supervised Leaning

Authors: S. B. Kotsiantis, D. Kanellopoulos, P. E. Pintelas

Abstract:

Many factors affect the success of Machine Learning (ML) on a given task. The representation and quality of the instance data is first and foremost. If there is much irrelevant and redundant information present or noisy and unreliable data, then knowledge discovery during the training phase is more difficult. It is well known that data preparation and filtering steps take considerable amount of processing time in ML problems. Data pre-processing includes data cleaning, normalization, transformation, feature extraction and selection, etc. The product of data pre-processing is the final training set. It would be nice if a single sequence of data pre-processing algorithms had the best performance for each data set but this is not happened. Thus, we present the most well know algorithms for each step of data pre-processing so that one achieves the best performance for their data set.

Keywords: Data mining, feature selection, data cleaning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6091
8794 System Survivability in Networks in the Context of Defense/Attack Strategies: The Large Scale

Authors: A. Ben Yaghlane, M. N. Azaiez, M. Mrad

Abstract:

We investigate the large scale of networks in the context of network survivability under attack. We use appropriate techniques to evaluate and the attacker-based- and the defenderbased- network survivability. The attacker is unaware of the operated links by the defender. Each attacked link has some pre-specified probability to be disconnected. The defender choice is so that to maximize the chance of successfully sending the flow to the destination node. The attacker however will select the cut-set with the highest chance to be disabled in order to partition the network. Moreover, we extend the problem to the case of selecting the best p paths to operate by the defender and the best k cut-sets to target by the attacker, for arbitrary integers p,k>1. We investigate some variations of the problem and suggest polynomial-time solutions.

Keywords: Defense/attack strategies, large scale, networks, partitioning a network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1481
8793 Solving the Teacher Assignment-Course Scheduling Problem by a Hybrid Algorithm

Authors: Aldy Gunawan, Kien Ming Ng, Kim Leng Poh

Abstract:

This paper presents a hybrid algorithm for solving a timetabling problem, which is commonly encountered in many universities. The problem combines both teacher assignment and course scheduling problems simultaneously, and is presented as a mathematical programming model. However, this problem becomes intractable and it is unlikely that a proven optimal solution can be obtained by an integer programming approach, especially for large problem instances. A hybrid algorithm that combines an integer programming approach, a greedy heuristic and a modified simulated annealing algorithm collaboratively is proposed to solve the problem. Several randomly generated data sets of sizes comparable to that of an institution in Indonesia are solved using the proposed algorithm. Computational results indicate that the algorithm can overcome difficulties of large problem sizes encountered in previous related works.

Keywords: Timetabling problem, mathematical programming model, hybrid algorithm, simulated annealing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4573
8792 A Spatial Point Pattern Analysis to Recognize Fail Bit Patterns in Semiconductor Manufacturing

Authors: Youngji Yoo, Seung Hwan Park, Daewoong An, Sung-Shick Kim, Jun-Geol Baek

Abstract:

The yield management system is very important to produce high-quality semiconductor chips in the semiconductor manufacturing process. In order to improve quality of semiconductors, various tests are conducted in the post fabrication (FAB) process. During the test process, large amount of data are collected and the data includes a lot of information about defect. In general, the defect on the wafer is the main causes of yield loss. Therefore, analyzing the defect data is necessary to improve performance of yield prediction. The wafer bin map (WBM) is one of the data collected in the test process and includes defect information such as the fail bit patterns. The fail bit has characteristics of spatial point patterns. Therefore, this paper proposes the feature extraction method using the spatial point pattern analysis. Actual data obtained from the semiconductor process is used for experiments and the experimental result shows that the proposed method is more accurately recognize the fail bit patterns.

Keywords: Semiconductor, wafer bin map (WBM), feature extraction, spatial point patterns, contour map.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2500
8791 Comparative Analysis of Measures to Secure Two-Way Evacuation Routes for Vulnerable People during Large Disasters in a Historic Area

Authors: Nobuo Mishima, Naomi Miyamoto, Yoko Taguchi

Abstract:

Historic preservation areas are extremely vulnerable to disasters because they are home to many vulnerable people and contain many closely spaced wooden houses. However, the narrow streets in these regions have historic meaning, which means that they cannot be widened and can become blocked easily during large disasters. Here, we describe our efforts to establish a methodology for the planning of evacuation route sin such historic preservation areas. In particular, this study aims to clarify the effectiveness of measures intended to secure two-way evacuation routes for vulnerable people during large disasters in a historic area preserved under the Cultural Properties Protection Law, Japan.

Keywords: Historic preservation, evacuation route analysis, vulnerable people, street blockade.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1588
8790 Replicating Data Objects in Large-scale Distributed Computing Systems using Extended Vickrey Auction

Authors: Samee Ullah Khan, Ishfaq Ahmad

Abstract:

This paper proposes a novel game theoretical technique to address the problem of data object replication in largescale distributed computing systems. The proposed technique draws inspiration from computational economic theory and employs the extended Vickrey auction. Specifically, players in a non-cooperative environment compete for server-side scarce memory space to replicate data objects so as to minimize the total network object transfer cost, while maintaining object concurrency. Optimization of such a cost in turn leads to load balancing, fault-tolerance and reduced user access time. The method is experimentally evaluated against four well-known techniques from the literature: branch and bound, greedy, bin-packing and genetic algorithms. The experimental results reveal that the proposed approach outperforms the four techniques in both the execution time and solution quality.

Keywords: Auctions, data replication, pricing, static allocation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1465
8789 Influence of Displacement Amplitude and Vertical Load on the Horizontal Dynamic and Static Behavior of Helical Wire Rope Isolators

Authors: Nicolò Vaiana, Mariacristina Spizzuoco, Giorgio Serino

Abstract:

In this paper, the results of experimental tests performed on a Helical Wire Rope Isolator (HWRI) are presented in order to describe the dynamic and static behavior of the selected metal device in three different displacements ranges, namely small, relatively large, and large displacements ranges, without and under the effect of a vertical load. A testing machine, allowing to apply horizontal displacement or load histories to the tested bearing with a constant vertical load, has been adopted to perform the dynamic and static tests. According to the experimental results, the dynamic behavior of the tested device depends on the applied displacement amplitude. Indeed, the HWRI displays a softening and a hardening stiffness at small and relatively large displacements, respectively, and a stronger nonlinear stiffening behavior at large displacements. Furthermore, the experimental tests reveal that the application of a vertical load allows to have a more flexible device with higher damping properties and that the applied vertical load affects much less the dynamic response of the metal device at large displacements. Finally, a decrease in the static to dynamic effective stiffness ratio with increasing displacement amplitude has been observed.

Keywords: Base isolation, earthquake engineering, experimental hysteresis loops, wire rope isolators.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1333
8788 A Prediction-Based Reversible Watermarking for MRI Images

Authors: Nuha Omran Abokhdair, Azizah Bt Abdul Manaf

Abstract:

Reversible watermarking is a special branch of image watermarking, that is able to recover the original image after extracting the watermark from the image. In this paper, an adaptive prediction-based reversible watermarking scheme is presented, in order to increase the payload capacity of MRI medical images. The scheme divides the image into two parts, Region of Interest (ROI) and Region of Non-Interest (RONI). Two bits are embedded in each embeddable pixel of RONI and one bit is embedded in each embeddable pixel of ROI. The experimental results demonstrate that the proposed scheme is able to achieve high embedding capacity. This is mainly caused by two reasons. First, the pixels that were excluded from data embedding due to overflow/underflow are used for data embedding. Second, large location map that need to be added to watermark data as overhead is eliminated and thus lower data embedding capacity is prevented. Moreover, the scheme provides good visual quality to the watermarked image.

Keywords: Medical image watermarking, reversible watermarking, Difference Expansion, Prediction-Error Expansion.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1916