Search results for: Large Data
8547 Vehicular Ad Hoc Network
Authors: S. Swapna Kumar
Abstract:
A Vehicular Ad-Hoc Network (VANET) is a mobile Ad-Hoc Network that provides connectivity moving device to fixed equipments. Such type of device is equipped with vehicle provides safety for the passengers. In the recent research areas of traffic management there observed the wide scope of design of new methodology of extension of wireless sensor networks and ad-hoc network principal for development of VANET technology. This paper provides the wide research view of the VANET and MANET concept for the researchers to contribute the better optimization technique for the development of effective and fast atomization technique for the large size of data exchange in this complex networks.
Keywords: Ad-Hoc, MANET, Sensors, Security, VANET
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 45018546 Identifying Corporate Managerial Topics with Web Pages
Authors: Juan Llopis, Reyes Gonzalez, Jose Gasco
Abstract:
This paper has as its main aim to analyse how corporate web pages can become an essential tool in order to detect strategic trends by firms or sectors, and even a primary source for benchmarking. This technique has made it possible to identify the key issues in the strategic management of the most excellent large Spanish firms and also to describe trends in their long-range planning, a way of working that can be generalised to any country or firm group. More precisely, two objectives were sought. The first one consisted in showing the way in which corporate websites make it possible to obtain direct information about the strategic variables which can define firms. This tool is dynamic (since web pages are constantly updated) as well as direct and reliable, since the information comes from the firm itself, not from comments of third parties (such as journalists, academicians, consultants...). When this information is analysed for a group of firms, one can observe their characteristics in terms of both managerial tasks and business management. As for the second objective, the methodology proposed served to describe the corporate profile of the large Spanish enterprises included in the Ibex35 (the Ibex35 or Iberia Index is the reference index in the Spanish Stock Exchange and gathers periodically the 35 most outstanding Spanish firms). An attempt is therefore made to define the long-range planning that would be characteristic of the largest Spanish firms.Keywords: Web Pages, Strategic Management, Corporate Description, Large Firms, Spain.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15768545 The Impact of System and Data Quality on Organizational Success in the Kingdom of Bahrain
Authors: Amal M. Alrayes
Abstract:
Data and system quality play a central role in organizational success, and the quality of any existing information system has a major influence on the effectiveness of overall system performance. Given the importance of system and data quality to an organization, it is relevant to highlight their importance on organizational performance in the Kingdom of Bahrain. This research aims to discover whether system quality and data quality are related, and to study the impact of system and data quality on organizational success. A theoretical model based on previous research is used to show the relationship between data and system quality, and organizational impact. We hypothesize, first, that system quality is positively associated with organizational impact, secondly that system quality is positively associated with data quality, and finally that data quality is positively associated with organizational impact. A questionnaire was conducted among public and private organizations in the Kingdom of Bahrain. The results show that there is a strong association between data and system quality, that affects organizational success.Keywords: Data quality, performance, system quality.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21188544 Relative Navigation with Laser-Based Intermittent Measurement for Formation Flying Satellites
Authors: Jongwoo Lee, Dae-Eun Kang, Sang-Young Park
Abstract:
This study presents a precise relative navigational method for satellites flying in formation using laser-based intermittent measurement data. The measurement data for the relative navigation between two satellites consist of a relative distance measured by a laser instrument and relative attitude angles measured by attitude determination. The relative navigation solutions are estimated by both the Extended Kalman filter (EKF) and unscented Kalman filter (UKF). The solutions estimated by the EKF may become inaccurate or even diverge as measurement outage time gets longer because the EKF utilizes a linearization approach. However, this study shows that the UKF with the appropriate scaling parameters provides a stable and accurate relative navigation solutions despite the long measurement outage time and large initial error as compared to the relative navigation solutions of the EKF. Various navigation results have been analyzed by adjusting the scaling parameters of the UKF.
Keywords: Satellite relative navigation, laser-based measurement, intermittent measurement, unscented kalman filter.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11018543 Integration of Multi-Source Data to Monitor Coral Biodiversity
Authors: K. Jitkue, W. Srisang, C. Yaiprasert, K. Jaroensutasinee, M. Jaroensutasinee
Abstract:
This study aims at using multi-source data to monitor coral biodiversity and coral bleaching. We used coral reef at Racha Islands, Phuket as a study area. There were three sources of data: coral diversity, sensor based data and satellite data.Keywords: Coral reefs, Remote sensing, Sea surfacetemperatue, Satellite imagery.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15538542 Pose-Dependency of Machine Tool Structures: Appearance, Consequences, and Challenges for Lightweight Large-Scale Machines
Authors: S. Apprich, F. Wulle, A. Lechler, A. Pott, A. Verl
Abstract:
Large-scale machine tools for the manufacturing of large work pieces, e.g. blades, casings or gears for wind turbines, feature pose-dependent dynamic behavior. Small structural damping coefficients lead to long decay times for structural vibrations that have negative impacts on the production process. Typically, these vibrations are handled by increasing the stiffness of the structure by adding mass. This is counterproductive to the needs of sustainable manufacturing as it leads to higher resource consumption both in material and in energy. Recent research activities have led to higher resource efficiency by radical mass reduction that is based on controlintegrated active vibration avoidance and damping methods. These control methods depend on information describing the dynamic behavior of the controlled machine tools in order to tune the avoidance or reduction method parameters according to the current state of the machine. This paper presents the appearance, consequences and challenges of the pose-dependent dynamic behavior of lightweight large-scale machine tool structures in production. It starts with the theoretical introduction of the challenges of lightweight machine tool structures resulting from reduced stiffness. The statement of the pose-dependent dynamic behavior is corroborated by the results of the experimental modal analysis of a lightweight test structure. Afterwards, the consequences of the pose-dependent dynamic behavior of lightweight machine tool structures for the use of active control and vibration reduction methods are explained. Based on the state of the art of pose-dependent dynamic machine tool models and the modal investigation of an FE-model of the lightweight test structure, the criteria for a pose-dependent model for use in vibration reduction are derived. The description of the approach for a general posedependent model of the dynamic behavior of large lightweight machine tools that provides the necessary input to the aforementioned vibration avoidance and reduction methods to properly tackle machine vibrations is the outlook of the paper.Keywords: Dynamic behavior, lightweight, machine tool, pose-dependency.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 28448541 Decision Support System Based on Data Warehouse
Authors: Yang Bao, LuJing Zhang
Abstract:
Typical Intelligent Decision Support System is 4-based, its design composes of Data Warehouse, Online Analytical Processing, Data Mining and Decision Supporting based on models, which is called Decision Support System Based on Data Warehouse (DSSBDW). This way takes ETL,OLAP and DM as its implementing means, and integrates traditional model-driving DSS and data-driving DSS into a whole. For this kind of problem, this paper analyzes the DSSBDW architecture and DW model, and discusses the following key issues: ETL designing and Realization; metadata managing technology using XML; SQL implementing, optimizing performance, data mapping in OLAP; lastly, it illustrates the designing principle and method of DW in DSSBDW.
Keywords: Decision Support System, Data Warehouse, Data Mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 38628540 Stability Enhancement of a Large-Scale Power System Using Power System Stabilizer Based on Adaptive Neuro Fuzzy Inference System
Authors: Agung Budi Muljono, I Made Ginarsa, I Made Ari Nrartha
Abstract:
A large-scale power system (LSPS) consists of two or more sub-systems connected by inter-connecting transmission. Loading pattern on an LSPS always changes from time to time and varies depend on consumer need. The serious instability problem is appeared in an LSPS due to load fluctuation in all of the bus. Adaptive neuro-fuzzy inference system (ANFIS)-based power system stabilizer (PSS) is presented to cover the stability problem and to enhance the stability of an LSPS. The ANFIS control is presented because the ANFIS control is more effective than Mamdani fuzzy control in the computation aspect. Simulation results show that the presented PSS is able to maintain the stability by decreasing peak overshoot to the value of −2.56 × 10−5 pu for rotor speed deviation Δω2−3. The presented PSS also makes the settling time to achieve at 3.78 s on local mode oscillation. Furthermore, the presented PSS is able to improve the peak overshoot and settling time of Δω3−9 to the value of −0.868 × 10−5 pu and at the time of 3.50 s for inter-area oscillation.Keywords: ANFIS, large-scale, power system, PSS, stability enhancement.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11948539 Methodology of Realization for Supervisor and Simulator Dedicated to a Semiconductor Research and Production Factory
Authors: Hanane Ondella, Pierre Ladet, David Ferrand, Pat Sloan
Abstract:
In the micro and nano-technology industry, the «clean-rooms» dedicated to manufacturing chip, are equipped with the most sophisticated equipment-tools. There use a large number of resources in according to strict specifications for an optimum working and result. The distribution of «utilities» to the production is assured by teams who use a supervision tool. The studies show the interest to control the various parameters of production or/and distribution, in real time, through a reliable and effective supervision tool. This document looks at a large part of the functions that the supervisor must assure, with complementary functionalities to help the diagnosis and simulation that prove very useful in our case where the supervised installations are complexed and in constant evolution.Keywords: Control-Command, evolution, non regression, performances, real time, simulation, supervision.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12608538 A New History Based Method to Handle the Recurring Concept Shifts in Data Streams
Authors: Hossein Morshedlou, Ahmad Abdollahzade Barforoush
Abstract:
Recent developments in storage technology and networking architectures have made it possible for broad areas of applications to rely on data streams for quick response and accurate decision making. Data streams are generated from events of real world so existence of associations, which are among the occurrence of these events in real world, among concepts of data streams is logical. Extraction of these hidden associations can be useful for prediction of subsequent concepts in concept shifting data streams. In this paper we present a new method for learning association among concepts of data stream and prediction of what the next concept will be. Knowing the next concept, an informed update of data model will be possible. The results of conducted experiments show that the proposed method is proper for classification of concept shifting data streams.Keywords: Data Stream, Classification, Concept Shift, History.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12788537 Incremental Learning of Independent Topic Analysis
Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda
Abstract:
In this paper, we present a method of applying Independent Topic Analysis (ITA) to increasing the number of document data. The number of document data has been increasing since the spread of the Internet. ITA was presented as one method to analyze the document data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis (ICA). ICA is a technique in the signal processing; however, it is difficult to apply the ITA to increasing number of document data. Because ITA must use the all document data so temporal and spatial cost is very high. Therefore, we present Incremental ITA which extracts the independent topics from increasing number of document data. Incremental ITA is a method of updating the independent topics when the document data is added after extracted the independent topics from a just previous the data. In addition, Incremental ITA updates the independent topics when the document data is added. And we show the result applied Incremental ITA to benchmark datasets.Keywords: Text mining, topic extraction, independent, incremental, independent component analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10588536 An efficient Activity Network Reduction Algorithm based on the Label Correcting Tracing Algorithm
Authors: Weng Ming Chu
Abstract:
When faced with stochastic networks with an uncertain duration for their activities, the securing of network completion time becomes problematical, not only because of the non-identical pdf of duration for each node, but also because of the interdependence of network paths. As evidenced by Adlakha & Kulkarni [1], many methods and algorithms have been put forward in attempt to resolve this issue, but most have encountered this same large-size network problem. Therefore, in this research, we focus on network reduction through a Series/Parallel combined mechanism. Our suggested algorithm, named the Activity Network Reduction Algorithm (ANRA), can efficiently transfer a large-size network into an S/P Irreducible Network (SPIN). SPIN can enhance stochastic network analysis, as well as serve as the judgment of symmetry for the Graph Theory.Keywords: Series/Parallel network, Stochastic network, Network reduction, Interdictive Graph, Complexity Index.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13798535 Powerful Tool to Expand Business Intelligence: Text Mining
Authors: Li Gao, Elizabeth Chang, Song Han
Abstract:
With the extensive inclusion of document, especially text, in the business systems, data mining does not cover the full scope of Business Intelligence. Data mining cannot deliver its impact on extracting useful details from the large collection of unstructured and semi-structured written materials based on natural languages. The most pressing issue is to draw the potential business intelligence from text. In order to gain competitive advantages for the business, it is necessary to develop the new powerful tool, text mining, to expand the scope of business intelligence. In this paper, we will work out the strong points of text mining in extracting business intelligence from huge amount of textual information sources within business systems. We will apply text mining to each stage of Business Intelligence systems to prove that text mining is the powerful tool to expand the scope of BI. After reviewing basic definitions and some related technologies, we will discuss the relationship and the benefits of these to text mining. Some examples and applications of text mining will also be given. The motivation behind is to develop new approach to effective and efficient textual information analysis. Thus we can expand the scope of Business Intelligence using the powerful tool, text mining.Keywords: Business intelligence, document warehouse, text mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 26608534 Evolutionary Feature Selection for Text Documents using the SVM
Authors: Daniel I. Morariu, Lucian N. Vintan, Volker Tresp
Abstract:
Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step, the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of feature selection methods to reduce the dimensionality of the document-representation vector. In this paper, we present three feature selection methods: Information Gain, Support Vector Machine feature selection called (SVM_FS) and Genetic Algorithm with SVM (called GA_SVM). We show that the best results were obtained with GA_SVM method for a relatively small dimension of the feature vector.Keywords: Feature Selection, Learning with Kernels, Support Vector Machine, Genetic Algorithm, and Classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17068533 A Framework for Data Mining Based Multi-Agent: An Application to Spatial Data
Authors: H. Baazaoui Zghal, S. Faiz, H. Ben Ghezala
Abstract:
Data mining is an extraordinarily demanding field referring to extraction of implicit knowledge and relationships, which are not explicitly stored in databases. A wide variety of methods of data mining have been introduced (classification, characterization, generalization...). Each one of these methods includes more than algorithm. A system of data mining implies different user categories,, which mean that the user-s behavior must be a component of the system. The problem at this level is to know which algorithm of which method to employ for an exploratory end, which one for a decisional end, and how can they collaborate and communicate. Agent paradigm presents a new way of conception and realizing of data mining system. The purpose is to combine different algorithms of data mining to prepare elements for decision-makers, benefiting from the possibilities offered by the multi-agent systems. In this paper the agent framework for data mining is introduced, and its overall architecture and functionality are presented. The validation is made on spatial data. Principal results will be presented.
Keywords: Databases, data mining, multi-agent, spatial datamart.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20458532 Feature Selection Methods for an Improved SVM Classifier
Authors: Daniel Morariu, Lucian N. Vintan, Volker Tresp
Abstract:
Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step, the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of feature selection methods to reduce the dimensionality of the document-representation vector. In this paper, three feature selection methods are evaluated: Random Selection, Information Gain (IG) and Support Vector Machine feature selection (called SVM_FS). We show that the best results were obtained with SVM_FS method for a relatively small dimension of the feature vector. Also we present a novel method to better correlate SVM kernel-s parameters (Polynomial or Gaussian kernel).Keywords: Feature Selection, Learning with Kernels, SupportVector Machine, and Classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18298531 Assessing the Effect of Grid Connection of Large-Scale Wind Farms on Power System Small-Signal Angular Stability
Authors: Wenjuan Du, Jingtian Bi, Tong Wang, Haifeng Wang
Abstract:
Grid connection of a large-scale wind farm affects power system small-signal angular stability in two aspects. Firstly, connection of the wind farm brings about the change of load flow and configuration of a power system. Secondly, the dynamic interaction is introduced by the wind farm with the synchronous generators (SGs) in the power system. This paper proposes a method to assess the two aspects of the effect of the wind farm on power system small-signal angular stability. The effect of the change of load flow/system configuration brought about by the wind farm can be examined separately by displacing wind farms with constant power sources, then the effect of the dynamic interaction of the wind farm with the SGs can be also computed individually. Thus, a clearer picture and better understanding on the power system small-signal angular stability as affected by grid connection of the large-scale wind farm are provided. In the paper, an example power system with grid connection of a wind farm is presented to demonstrate the proposed approach.Keywords: power system small-signal angular stability, power system low-frequency oscillations, electromechanical oscillation modes, wind farms, double fed induction generator (DFIG)
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18198530 Latent Topic Based Medical Data Classification
Authors: Jian-hua Yeh, Shi-yi Kuo
Abstract:
This paper discusses the classification process for medical data. In this paper, we use the data from ACM KDDCup 2008 to demonstrate our classification process based on latent topic discovery. In this data set, the target set and outliers are quite different in their nature: target set is only 0.6% size in total, while the outliers consist of 99.4% of the data set. We use this data set as an example to show how we dealt with this extremely biased data set with latent topic discovery and noise reduction techniques. Our experiment faces two major challenge: (1) extremely distributed outliers, and (2) positive samples are far smaller than negative ones. We try to propose a suitable process flow to deal with these issues and get a best AUC result of 0.98.
Keywords: classification, latent topics, outlier adjustment, feature scaling
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16428529 Improving Fault Resilience and Reconstruction of Overlay Multicast Tree Using Leaving Time of Participants
Authors: Bhed Bahadur Bista
Abstract:
Network layer multicast, i.e. IP multicast, even after many years of research, development and standardization, is not deployed in large scale due to both technical (e.g. upgrading of routers) and political (e.g. policy making and negotiation) issues. Researchers looked for alternatives and proposed application/overlay multicast where multicast functions are handled by end hosts, not network layer routers. Member hosts wishing to receive multicast data form a multicast delivery tree. The intermediate hosts in the tree act as routers also, i.e. they forward data to the lower hosts in the tree. Unlike IP multicast, where a router cannot leave the tree until all members below it leave, in overlay multicast any member can leave the tree at any time thus disjoining the tree and disrupting the data dissemination. All the disrupted hosts have to rejoin the tree. This characteristic of the overlay multicast causes multicast tree unstable, data loss and rejoin overhead. In this paper, we propose that each node sets its leaving time from the tree and sends join request to a number of nodes in the tree. The nodes in the tree will reject the request if their leaving time is earlier than the requesting node otherwise they will accept the request. The node can join at one of the accepting nodes. This makes the tree more stable as the nodes will join the tree according to their leaving time, earliest leaving time node being at the leaf of the tree. Some intermediate nodes may not follow their leaving time and leave earlier than their leaving time thus disrupting the tree. For this, we propose a proactive recovery mechanism so that disrupted nodes can rejoin the tree at predetermined nodes immediately. We have shown by simulation that there is less overhead when joining the multicast tree and the recovery time of the disrupted nodes is much less than the previous works. KeywordsKeywords: Network layer multicast, Fault Resilience, IP multicast
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13878528 A Parallel Algorithm for 2-D Cylindrical Geometry Transport Equation with Interface Corrections
Authors: Wei Jun-xia, Yuan Guang-wei, Yang Shu-lin, Shen Wei-dong
Abstract:
In order to make conventional implicit algorithm to be applicable in large scale parallel computers , an interface prediction and correction of discontinuous finite element method is presented to solve time-dependent neutron transport equations under 2-D cylindrical geometry. Domain decomposition is adopted in the computational domain.The numerical experiments show that our parallel algorithm with explicit prediction and implicit correction has good precision, parallelism and simplicity. Especially, it can reach perfect speedup even on hundreds of processors for large-scale problems.
Keywords: Transport Equation, Discontinuous Finite Element, Domain Decomposition, Interface Prediction And Correction
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16658527 Dynamic Features Selection for Heart Disease Classification
Authors: Walid MOUDANI
Abstract:
The healthcare environment is generally perceived as being information rich yet knowledge poor. However, there is a lack of effective analysis tools to discover hidden relationships and trends in data. In fact, valuable knowledge can be discovered from application of data mining techniques in healthcare system. In this study, a proficient methodology for the extraction of significant patterns from the Coronary Heart Disease warehouses for heart attack prediction, which unfortunately continues to be a leading cause of mortality in the whole world, has been presented. For this purpose, we propose to enumerate dynamically the optimal subsets of the reduced features of high interest by using rough sets technique associated to dynamic programming. Therefore, we propose to validate the classification using Random Forest (RF) decision tree to identify the risky heart disease cases. This work is based on a large amount of data collected from several clinical institutions based on the medical profile of patient. Moreover, the experts- knowledge in this field has been taken into consideration in order to define the disease, its risk factors, and to establish significant knowledge relationships among the medical factors. A computer-aided system is developed for this purpose based on a population of 525 adults. The performance of the proposed model is analyzed and evaluated based on set of benchmark techniques applied in this classification problem.Keywords: Multi-Classifier Decisions Tree, Features Reduction, Dynamic Programming, Rough Sets.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25328526 Data Transformation Services (DTS): Creating Data Mart by Consolidating Multi-Source Enterprise Operational Data
Authors: J. D. D. Daniel, K. N. Goh, S. M. Yusop
Abstract:
Trends in business intelligence, e-commerce and remote access make it necessary and practical to store data in different ways on multiple systems with different operating systems. As business evolve and grow, they require efficient computerized solution to perform data update and to access data from diverse enterprise business applications. The objective of this paper is to demonstrate the capability of DTS [1] as a database solution for automatic data transfer and update in solving business problem. This DTS package is developed for the sales of variety of plants and eventually expanded into commercial supply and landscaping business. Dimension data modeling is used in DTS package to extract, transform and load data from heterogeneous database systems such as MySQL, Microsoft Access and Oracle that consolidates into a Data Mart residing in SQL Server. Hence, the data transfer from various databases is scheduled to run automatically every quarter of the year to review the efficient sales analysis. Therefore, DTS is absolutely an attractive solution for automatic data transfer and update which meeting today-s business needs.Keywords: Data Transformation Services (DTS), ObjectLinking and Embedding Database (OLEDB), Data Mart, OnlineAnalytical Processing (OLAP), Online Transactional Processing(OLTP).
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20388525 Extraction of Data from Web Pages: A Vision Based Approach
Authors: P. S. Hiremath, Siddu P. Algur
Abstract:
With the explosive growth of information sources available on the World Wide Web, it has become increasingly difficult to identify the relevant pieces of information, since web pages are often cluttered with irrelevant content like advertisements, navigation-panels, copyright notices etc., surrounding the main content of the web page. Hence, tools for the mining of data regions, data records and data items need to be developed in order to provide value-added services. Currently available automatic techniques to mine data regions from web pages are still unsatisfactory because of their poor performance and tag-dependence. In this paper a novel method to extract data items from the web pages automatically is proposed. It comprises of two steps: (1) Identification and Extraction of the data regions based on visual clues information. (2) Identification of data records and extraction of data items from a data region. For step1, a novel and more effective method is proposed based on visual clues, which finds the data regions formed by all types of tags using visual clues. For step2 a more effective method namely, Extraction of Data Items from web Pages (EDIP), is adopted to mine data items. The EDIP technique is a list-based approach in which the list is a linear data structure. The proposed technique is able to mine the non-contiguous data records and can correctly identify data regions, irrespective of the type of tag in which it is bound. Our experimental results show that the proposed technique performs better than the existing techniques.
Keywords: Web data records, web data regions, web mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19018524 Visual-Graphical Methods for Exploring Longitudinal Data
Authors: H. W. Ker
Abstract:
Longitudinal data typically have the characteristics of changes over time, nonlinear growth patterns, between-subjects variability, and the within errors exhibiting heteroscedasticity and dependence. The data exploration is more complicated than that of cross-sectional data. The purpose of this paper is to organize/integrate of various visual-graphical techniques to explore longitudinal data. From the application of the proposed methods, investigators can answer the research questions include characterizing or describing the growth patterns at both group and individual level, identifying the time points where important changes occur and unusual subjects, selecting suitable statistical models, and suggesting possible within-error variance.Keywords: Data exploration, exploratory analysis, HLMs/LMEs, longitudinal data, visual-graphical methods.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20948523 A Materialized Approach to the Integration of XML Documents: the OSIX System
Authors: H. Ahmad, S. Kermanshahani, A. Simonet, M. Simonet
Abstract:
The data exchanged on the Web are of different nature from those treated by the classical database management systems; these data are called semi-structured data since they do not have a regular and static structure like data found in a relational database; their schema is dynamic and may contain missing data or types. Therefore, the needs for developing further techniques and algorithms to exploit and integrate such data, and extract relevant information for the user have been raised. In this paper we present the system OSIX (Osiris based System for Integration of XML Sources). This system has a Data Warehouse model designed for the integration of semi-structured data and more precisely for the integration of XML documents. The architecture of OSIX relies on the Osiris system, a DL-based model designed for the representation and management of databases and knowledge bases. Osiris is a viewbased data model whose indexing system supports semantic query optimization. We show that the problem of query processing on a XML source is optimized by the indexing approach proposed by Osiris.Keywords: Data integration, semi-structured data, views, XML.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15908522 Information Retrieval: A Comparative Study of Textual Indexing Using an Oriented Object Database (db4o) and the Inverted File
Authors: Mohammed Erritali
Abstract:
The growth in the volume of text data such as books and articles in libraries for centuries has imposed to establish effective mechanisms to locate them. Early techniques such as abstraction, indexing and the use of classification categories have marked the birth of a new field of research called "Information Retrieval". Information Retrieval (IR) can be defined as the task of defining models and systems whose purpose is to facilitate access to a set of documents in electronic form (corpus) to allow a user to find the relevant ones for him, that is to say, the contents which matches with the information needs of the user. Most of the models of information retrieval use a specific data structure to index a corpus which is called "inverted file" or "reverse index". This inverted file collects information on all terms over the corpus documents specifying the identifiers of documents that contain the term in question, the frequency of each term in the documents of the corpus, the positions of the occurrences of the word... In this paper we use an oriented object database (db4o) instead of the inverted file, that is to say, instead to search a term in the inverted file, we will search it in the db4o database. The purpose of this work is to make a comparative study to see if the oriented object databases may be competing for the inverse index in terms of access speed and resource consumption using a large volume of data.
Keywords: Information Retrieval, indexation, oriented object database (db4o), inverted file.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17348521 Artificial Neural Network-Based Short-Term Load Forecasting for Mymensingh Area of Bangladesh
Authors: S. M. Anowarul Haque, Md. Asiful Islam
Abstract:
Electrical load forecasting is considered to be one of the most indispensable parts of a modern-day electrical power system. To ensure a reliable and efficient supply of electric energy, special emphasis should have been put on the predictive feature of electricity supply. Artificial Neural Network-based approaches have emerged to be a significant area of interest for electric load forecasting research. This paper proposed an Artificial Neural Network model based on the particle swarm optimization algorithm for improved electric load forecasting for Mymensingh, Bangladesh. The forecasting model is developed and simulated on the MATLAB environment with a large number of training datasets. The model is trained based on eight input parameters including historical load and weather data. The predicted load data are then compared with an available dataset for validation. The proposed neural network model is proved to be more reliable in terms of day-wise load forecasting for Mymensingh, Bangladesh.Keywords: Load forecasting, artificial neural network, particle swarm optimization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6868520 Embedding a Large Amount of Information Using High Secure Neural Based Steganography Algorithm
Authors: Nameer N. EL-Emam
Abstract:
In this paper, we construct and implement a new Steganography algorithm based on learning system to hide a large amount of information into color BMP image. We have used adaptive image filtering and adaptive non-uniform image segmentation with bits replacement on the appropriate pixels. These pixels are selected randomly rather than sequentially by using new concept defined by main cases with sub cases for each byte in one pixel. According to the steps of design, we have been concluded 16 main cases with their sub cases that covere all aspects of the input information into color bitmap image. High security layers have been proposed through four layers of security to make it difficult to break the encryption of the input information and confuse steganalysis too. Learning system has been introduces at the fourth layer of security through neural network. This layer is used to increase the difficulties of the statistical attacks. Our results against statistical and visual attacks are discussed before and after using the learning system and we make comparison with the previous Steganography algorithm. We show that our algorithm can embed efficiently a large amount of information that has been reached to 75% of the image size (replace 18 bits for each pixel as a maximum) with high quality of the output.Keywords: Adaptive image segmentation, hiding with high capacity, hiding with high security, neural networks, Steganography.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19898519 Co-Articulation between Consonant and Vowel in Cantonese Syllables
Authors: Wai-Sum Lee
Abstract:
This study investigates C-V and V-C co-articulation in Cantonese monosyllables of the CV, VC or CVC structure, with C = one of the three stop consonants [p, t, k] and V = one of the three corner vowels [i, a, u]. Five repetitions of each test syllable on a randomized list were elicited from Cantonese young adult speakers in their early-20s. A research tool, EMA AG500, was used to record the synchronized audio signals and articulatory data at three different locations of the tongue – tongue tip, tongue middle, and tongue back – and the positions of the upper and lower lips during the test syllables. The main findings based on the articulatory data collected from two male Cantonese speakers are as follows: (i) For the syllable-initial [p-], strong co-articulation is observed when [p-] preceding the high vowel [i] or [u], but not the low vowel [a]. As for the syllable-final [-p], it is strongly co-articulated with the preceding vowel, even when the vowel is [a]. (ii) The co-articulation between the initial [t-] and the following vowel of any type is weak. In the syllable-final position, the degree of co-articulatory resistance of [-t] is also large when following the vowel [u], but [-t] is largely co-articulated with the preceding vowel when the vowel is [i] or [a]. (iii) The strength of co-articulation differs when the initial [k-] precedes the different types of vowel. A stronger co-articulation between [k-] and [i] than between [k-] and [u], and the strength of co-articulation is much reduced between [k-] and [a]. However, in the syllable-final position, there is strong co-articulation between [-k] and the preceding vowel [a]. (iv) Among the three types of stop consonants in the syllable-initial position, the decreasing degree of co-articulatory resistance (CR) is [t-] > [k-] > [p-], and the degree of CR is reduced during all three types of stop in the syllable-final position. In general, the data on co-articulation between consonant and vowel in the Cantonese monosyllables are similar to those in other languages reported in previous studies.
Keywords: Cantonese, co-articulation, consonant, vowel.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11238518 Data-Driven Decision-Making in Digital Entrepreneurship
Authors: Abeba Nigussie Turi, Xiangming Samuel Li
Abstract:
Data-driven business models are more typical for established businesses than early-stage startups that strive to penetrate a market. This paper provided an extensive discussion on the principles of data analytics for early-stage digital entrepreneurial businesses. Here, we developed data-driven decision-making (DDDM) framework that applies to startups prone to multifaceted barriers in the form of poor data access, technical and financial constraints, to state some. The startup DDDM framework proposed in this paper is novel in its form encompassing startup data analytics enablers and metrics aligning with startups' business models ranging from customer-centric product development to servitization which is the future of modern digital entrepreneurship.
Keywords: Startup data analytics, data-driven decision-making, data acquisition, data generation, digital entrepreneurship.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 827