Search results for: Distributed Data Mining
7314 Big Brain: A Single Database System for a Federated Data Warehouse Architecture
Authors: X. Gumara Rigol, I. Martínez de Apellaniz Anzuola, A. Garcia Serrano, A. Franzi Cros, O. Vidal Calbet, A. Al Maruf
Abstract:
Traditional federated architectures for data warehousing work well when corporations have existing regional data warehouses and there is a need to aggregate data at a global level. Schibsted Media Group has been maturing from a decentralised organisation into a more globalised one and needed to build some of the regional data warehouses for some brands at the same time as the global one. In this paper, we present the architectural alternatives studied and explain why a custom federated approach was recommended for the implementation. Although the data warehouses are logically federated, the implementation uses a single database system, which presented many advantages, such as cost reduction and improved data access for global users, giving consumers of the data a common data model for detailed analysis across different geographies and a flexible layer for local specific needs in the same place.
Keywords: Data integration, data warehousing, federated architecture, online analytical processing.
Downloads: 710
7313 An In-Depth Analysis of Open Data Portals as an Emerging Public E-Service
Authors: Martin Lnenicka
Abstract:
Governments collect and produce large amounts of data. Increasingly, governments worldwide have started to implement open data initiatives and to launch open data portals to enable the release of these data in open and reusable formats. As a result, a large number of open data repositories, catalogues and portals have been emerging in the world. The greater availability of interoperable and linkable open government data catalyzes secondary use of such data, so they can be used for building useful applications which leverage their value, allow insight, provide access to government services, and support transparency. The efficient development of successful open data portals makes it necessary to evaluate them systematically, in order to understand them better, assess the various types of value they generate, and identify the improvements required for increasing this value. Thus, the attention of this paper is directed particularly to the field of open data portals. The main aim of this paper is to compare selected open data portals at the national level using content analysis and to propose a new evaluation framework, which further improves the quality of these portals. It also establishes a set of considerations for involving businesses and citizens in creating e-services and applications that leverage the datasets available from these portals.
Keywords: Big data, content analysis, criteria comparison, data quality, open data, open data portals, public sector.
Downloads: 3082
7312 Mobile Augmented Reality for Collaboration in Operation
Authors: Chong-Yang Qiao
Abstract:
Mobile augmented reality (MAR) tracks targets in the surroundings and aids operators through interactive visualization of data and procedures, making equipment and systems easier to understand. Operators communicate and coordinate remotely with each other for continuous tasks and for information and data exchange between the control room and the work site. In routine work, distributed control system (DCS) monitoring and work-site manipulation require operators to interact in real time. The critical question is how to improve the user experience in cooperative work by applying augmented reality in the traditional industrial field. The purpose of this exploratory study is to find the cognitive model for multiple-task performance with MAR. In particular, the focus is on the comparison between different tasks and on the environmental factors that influence information processing. Three experiments use interface and interaction design, with the content of start-up, maintenance and stop procedures embedded in the mobile application. With the evaluation criteria of time demands and human errors, and an analysis of the mental process and the behavioral action during the multiple tasks, heuristic evaluation was used to find the operators' performance under different situational factors and to record the information processing in recognition, interpretation, judgment and reasoning. The research will find the functional properties of MAR and constrain the development of the cognitive model. The conclusions suggest that MAR is easy to use and useful for operators in remote collaborative work.
Keywords: Mobile augmented reality, remote collaboration, user experience, cognitive model.
Downloads: 1338
7311 Efficient Design Optimization of Multi-State Flow Network for Multiple Commodities
Authors: Yu-Cheng Chou, Po Ting Lin
Abstract:
The design of networks for delivering commodities is an important problem in daily life and in many transportation applications. The delivery performance is evaluated based on the system reliability of delivering commodities from a source node to a sink node in the network. The system reliability is thus maximized to find the optimal routing. However, the design problem is not simple because (1) each path segment has randomly distributed attributes; (2) there are multiple commodities that consume various path capacities; (3) the optimal routing must successfully complete the delivery process within the allowable time constraints. In this paper, we focus on the design optimization of the Multi-State Flow Network (MSFN) for multiple commodities. We propose an efficient approach to evaluate the system reliability in the MSFN with respect to randomly distributed path attributes and to find the optimal routing subject to the allowable time constraints. The delivery rates, also known as delivery currents, of the path segments are evaluated and the minimal-current arcs are eliminated to reduce the complexity of the MSFN. Accordingly, the correct optimal routing is found and the worst-case reliability is evaluated. It has been shown that the reliability of the optimal routing is no lower than the worst-case measure. Two benchmark examples are used to demonstrate the proposed method. The comparisons between the original and the reduced networks show that the proposed method is very efficient.
Keywords: Multiple Commodities, Multi-State Flow Network (MSFN), Time Constraints, Worst-Case Reliability (WCR)
Downloads: 1450
7310 Crowdfunding for Saudi Arabia Green Projects
Authors: Saleh Komies, Mona Alharbi, Razan Alhayyani, Mozah Almulhim, Roseanne Khawaja, Ahmed Alradhi
Abstract:
One of the proposed solutions that face some challenges is encouraging sustainable energy consumption across Saudi Arabia through crowdfunding platforms. To address these challenges, we need to determine the level of awareness of crowdfunding and green projects, as well as the preferences and willingness of Saudis to utilize crowdfunding as an alternative funding source for green projects in Saudi Arabia. In this study, we aim to determine the influence of environmental awareness and concern on the propensity to crowdfund green projects. The survey is being conducted as part of environmental initiatives to assess public perceptions and opinions on crowdfunding green projects in Saudi Arabia. A total of 450 responses to an online questionnaire distributed via convenience and snowball sampling were utilized for data analysis. The survey reveals that Saudis have a low understanding of crowdfunding concepts and a relatively high understanding of implementing green projects. The public is interested in crowdfunding green projects if there is a return on investment.
Keywords: Crowdfunding, green projects, renewable energy, Saudi Arabia, solar farms, wind resources.
Downloads: 258
7309 File System-Based Data Protection Approach
Authors: Jaechun No
Abstract:
As the amount of data to be stored in storage subsystems increases tremendously, data protection techniques have become more important than ever for providing data availability and reliability. In this paper, we present a file system-based data protection approach (WOWSnap) that has been implemented using the WORM (Write-Once-Read-Many) scheme. In WOWSnap, once WORM files have been created, only privileged read requests to them are allowed, to protect data against any intentional or accidental intrusions. Furthermore, all WORM files are associated with a protection cycle, a time period during which the WORM files should be securely protected. Once their protection cycle has expired, the WORM files are automatically moved to the general-purpose data section without any user intervention. This prevents the WORM data section from being consumed by unnecessary files. We evaluated the performance of WOWSnap on a Linux cluster.
Keywords: Data protection, Protection cycle, WORM
Downloads: 1679
7308 A Survey of WhatsApp as a Tool for Instructor-Learner Dialogue, Learner-Content Dialogue, and Learner-Learner Dialogue
Authors: Ebrahim Panah, Muhammad Yasir Babar
Abstract:
Thanks to the development of online technology and social networks, people are able to communicate as well as learn. WhatsApp is a social network that is steadily gaining popularity. This app can be used for communication as well as education. It can be used for instructor-learner, learner-learner, and learner-content interactions; however, very little knowledge is available on these potentials of WhatsApp. The current study was undertaken to investigate university students’ perceptions of WhatsApp used as a tool for instructor-learner dialogue, learner-content dialogue, and learner-learner dialogue. The study adopted a survey approach and distributed a questionnaire developed with Google Forms to 54 (11 males and 43 females) university students. The obtained data were analyzed using SPSS version 20. The result of data analysis indicates that students have positive attitudes towards WhatsApp as a tool for Instructor-Learner Dialogue: it is easy to reach the lecturer (4.07), the instructor gives me valuable feedback on my assignment (4.02), the instructor is supportive during course discussion and offers continuous support with the class (4.00); Learner-Content Dialogue: WhatsApp allows me to academically engage with lecturers anytime, anywhere (4.00), it helps to send graphics such as pictures or charts directly to the students (3.98), it also provides out-of-class, extra learning materials and homework (3.96); and Learner-Learner Dialogue: WhatsApp is a good tool for sharing knowledge with others (4.09), WhatsApp allows me to academically engage with peers anytime, anywhere (4.07), and we can interact with others through the use of group discussion (4.02). It was also found that there are significant positive correlations between students’ perceptions of Instructor-Learner Dialogue (ILD), Learner-Content Dialogue (LCD), Learner-Learner Dialogue (LLD) and the WhatsApp application in the classroom.
The findings of the study have implications for lecturers, policy makers and curriculum developers.
Keywords: Instructor-learner dialogue, learners-contents dialogue, learner-learner dialogue, WhatsApp.
Downloads: 683
7307 A Text Clustering System based on k-means Type Subspace Clustering and Ontology
Authors: Liping Jing, Michael K. Ng, Xinhua Yang, Joshua Zhexue Huang
Abstract:
This paper presents a text clustering system based on a k-means type subspace clustering algorithm for clustering large, high-dimensional and sparse text data. In this algorithm, a new step is added to the k-means clustering process to automatically calculate the weights of keywords in each cluster, so that the important words of a cluster can be identified by their weight values. To support understanding and interpretation of the clustering results, a few keywords that best represent the semantic topic are extracted from each cluster. Two methods are used to extract the representative words. The candidate words are first selected according to their weights calculated by our new algorithm. Then, the candidates are fed to WordNet to identify the set of noun words and to consolidate synonyms and hyponyms. Experimental results have shown that the clustering algorithm is superior to other subspace clustering algorithms, such as PROCLUS and HARP, and to k-means type algorithms, e.g., Bisecting-KMeans. Furthermore, the word extraction method is effective in selecting the words that represent the topics of the clusters.
Keywords: Subspace Clustering, Text Mining, Feature Weighting, Cluster Interpretation, Ontology
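The extra weighting step described in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the weight-update formula, so the exponential update below (weights shrink as a feature's within-cluster spread grows) is an assumption borrowed from entropy-regularized k-means variants, not the authors' exact method. The names `weighted_kmeans`, `farthest_first` and the parameter `gamma` are hypothetical.

```python
import math


def farthest_first(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the
    point farthest from every center chosen so far."""
    centers = [list(points[0])]
    while len(centers) < k:
        def gap(p):
            return min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
        centers.append(list(max(points, key=gap)))
    return centers


def weighted_kmeans(points, k, gamma=1.0, iters=10):
    """k-means with one feature-weight vector per cluster (illustrative only)."""
    dim = len(points[0])
    centers = farthest_first(points, k)
    weights = [[1.0 / dim] * dim for _ in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Step 1: assign each point to the nearest center, measuring distance
        # with that cluster's own feature weights.
        for i, p in enumerate(points):
            labels[i] = min(range(k), key=lambda c: sum(
                weights[c][j] * (p[j] - centers[c][j]) ** 2 for j in range(dim)))
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if not members:
                continue  # empty cluster: keep its previous center and weights
            # Step 2: move the center to the mean of its members.
            centers[c] = [sum(m[j] for m in members) / len(members)
                          for j in range(dim)]
            # Step 3 (the added step): features with a small within-cluster
            # spread get a large weight. ASSUMED update rule, see lead-in.
            spread = [sum((m[j] - centers[c][j]) ** 2 for m in members) / len(members)
                      for j in range(dim)]
            low = min(spread)
            raw = [math.exp(-(s - low) / gamma) for s in spread]
            total = sum(raw)
            weights[c] = [r / total for r in raw]
    return labels, centers, weights
```

In a text setting each feature would be a term (for example a TF-IDF column), and the highest-weighted terms of a cluster would serve as the candidate words that are later filtered through WordNet.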
Downloads: 2462
7306 Comparative Analysis of Machine Learning Tools: A Review
Authors: S. Sarumathi, M. Vaishnavi, S. Geetha, P. Ranjetha
Abstract:
Machine learning is a new and exciting area of artificial intelligence nowadays. Machine learning is the most valuable, time, supervised, and cost-effective approach. It is not a narrow learning approach; it also includes a wide range of methods and techniques that can be applied to a wide range of complex real-world problems and time domains. Biological image classification, adaptive testing, computer vision, natural language processing, object detection, cancer detection, face recognition, handwriting recognition, speech recognition, and many other applications of machine learning are widely used in research, industry, and government. Every day, more data are generated, and conventional machine learning techniques are becoming obsolete as users move to distributed and real-time operations. By providing fundamental knowledge of machine learning tools and research opportunities in the field, the aim of this article is to serve as both a comprehensive overview and a guide. A diverse set of machine learning resources is demonstrated and contrasted with the key features in this survey.
Keywords: Artificial intelligence, machine learning, deep learning, machine learning algorithms, machine learning tools.
Downloads: 1848
7305 Landscape Data Transformation: Categorical Descriptions to Numerical Descriptors
Authors: Dennis A. Apuan
Abstract:
Categorical data based on descriptions of the agricultural landscape impose some mathematical and analytical limitations. This problem, however, can be overcome by data transformation through a coding scheme and the use of a non-parametric multivariate approach. The present study describes data transformation from qualitative to numerical descriptors. In a collection of 103 random soil samples over a 60 hectare field, categorical data were obtained for the following variables: levels of nitrogen, phosphorus, potassium, pH, hue, chroma, value, and data on topography, vegetation type, and the presence of rocks. Categorical data were coded, and Spearman's rho correlation was then calculated using PAST software ver. 1.78, on which Principal Component Analysis was based. Results revealed successful data transformation, generating 1030 quantitative descriptors. Visualization based on the new set of descriptors showed clear differences among sites, and the amount of variation was successfully measured. Possible applications of data transformation are discussed.
Keywords: data transformation, numerical descriptors, principal component analysis
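As a rough illustration of the coding-then-correlation step, the Python sketch below maps ordinal categories to integer codes and computes Spearman's rho as the Pearson correlation of tie-averaged ranks. The coding table `LEVEL_CODES` and the sample values are hypothetical; the study itself used PAST, and its actual coding scheme is not given in the abstract.

```python
import math

# Hypothetical coding scheme for an ordinal soil variable.
LEVEL_CODES = {"low": 1, "medium": 2, "high": 3}


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Illustrative samples only, not data from the study.
nitrogen = [LEVEL_CODES[v] for v in ["low", "medium", "high", "high", "low"]]
potassium = [LEVEL_CODES[v] for v in ["low", "low", "high", "medium", "medium"]]
rho = spearman_rho(nitrogen, potassium)
```

A matrix of such pairwise coefficients is the kind of input on which a rank-based principal component analysis could then be run.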
Downloads: 1505
7304 Disaster Preparedness for Academic Libraries in Malaysia: An Exploratory Study
Authors: Siti Juryiah Mohd Khalid, Norazlina Dol
Abstract:
Academic libraries in Malaysia are still not prepared for disaster even though several occasions have been reported. The study sets out to assess the current status of preparedness in disaster management among Malaysian academic libraries in the State of Selangor and the Federal Territory of Kuala Lumpur. To obtain a base level of knowledge on disaster preparedness of current practices, a questionnaire was distributed to chief librarians or their assignees in charge of disaster or emergency preparedness at 40 academic libraries and 34 responses were received. The study revolved around the current status of preparedness, on various issues including existence of disaster preparedness plan among academic libraries in Malaysia, disaster experiences by the academic libraries, funding, risk assessment activities and involvement of library staff in disaster management. Frequency and percentage tables were used in the analysis of the data collected. Some of the academic libraries under study have experienced one form of disaster or the other. Most of the academic libraries do not have a written disaster preparedness plan. The risk assessments and staff involvement in disaster preparedness by these libraries were generally adequate.
Keywords: Academic libraries, disaster preparedness plan, disaster management, emergency plan.
Downloads: 3339
7303 Performance Analysis of Search Medical Imaging Service on Cloud Storage Using Decision Trees
Authors: González A. Julio, Ramírez L. Leonardo, Puerta A. Gabriel
Abstract:
Telemedicine services use a large amount of data, most of which are diagnostic images in Digital Imaging and Communications in Medicine (DICOM) and Health Level Seven (HL7) formats. Metadata is generated from each related image to support their identification. This study presents the use of decision trees for the optimization of information search processes for diagnostic images, hosted on the cloud server. To analyze the performance in the server, the following quality of service (QoS) metrics are evaluated: delay, bandwidth, jitter, latency and throughput in five test scenarios for a total of 26 experiments during the loading and downloading of DICOM images, hosted by the telemedicine group server of the Universidad Militar Nueva Granada, Bogotá, Colombia. By applying decision trees as a data mining technique and comparing it with the sequential search, it was possible to evaluate the search times of diagnostic images in the server. The results show that by using the metadata in decision trees, the search times are substantially improved, the computational resources are optimized and the request management of the telemedicine image service is improved. Based on the experiments carried out, search efficiency increased by 45% in relation to the sequential search, given that, when downloading a diagnostic image, false positives are avoided in management and acquisition processes of said information. It is concluded that, for the diagnostic images services in telemedicine, the technique of decision trees guarantees the accessibility and robustness in the acquisition and manipulation of medical images, in improvement of the diagnoses and medical procedures in patients.
Keywords: Cloud storage, decision trees, diagnostic image, search, telemedicine.
Downloads: 948
7302 Inferential Reasoning for Heterogeneous Multi-Agent Mission
Authors: Sagir M. Yusuf, Chris Baber
Abstract:
We describe issues bedeviling the coordination of heterogeneous multi-agent missions (agents carrying different sensors), such as belief conflict and situation reasoning. We applied Bayesian and agents' presumptions inferential reasoning to solve the outlined issues with heterogeneous multi-agent belief variation and situation-based reasoning. A Bayesian Belief Network (BBN) was used to model the agents' belief conflict due to sensor variations. Simulation experiments were designed, and cases from agents’ missions were used to train the BBN using gradient descent and expectation-maximization algorithms. The output network is a well-trained BBN for making inferences for both agents and human experts. We claim that the prediction capacity of the Bayesian learning algorithm improves with the amount of training data and argue that it enhances multi-agent robustness and resolves agents’ sensor conflicts.
Keywords: Distributed constraint optimization problem, multi-agent system, multi-robot coordination, autonomous system, swarm intelligence.
Downloads: 641
7301 Computer Aided X-Ray Diffraction Intensity Analysis for Spinels: Hands-On Computing Experience
Authors: Ashish R. Tanna, Hiren H. Joshi
Abstract:
The mineral with the chemical formula MgAl2O4 is called “spinel”. Ferrites that crystallize in the spinel structure are known as spinel ferrites or ferro-spinels. The spinel structure has an fcc cage of oxygen ions, and the metallic cations are distributed among tetrahedral (A) and octahedral (B) interstitial voids (sites). The X-ray diffraction (XRD) intensity of each Bragg plane is sensitive to the distribution of cations in the interstitial voids of the spinel lattice. This leads to a method for determining the distribution of cations in spinel oxides through XRD intensity analysis. A computer program for XRD intensity analysis has been developed in the C language and tested in a real experimental situation by synthesizing the spinel ferrite materials Mg0.6Zn0.4AlxFe2-xO4 and characterizing them by X-ray diffractometry. The compositions of Mg0.6Zn0.4AlxFe2-xO4 (x = 0.0 to 0.6) ferrites were prepared by the ceramic method, and powder X-ray diffraction patterns were recorded. The authenticity of the program is thus checked by comparing the theoretically calculated data from computer simulation with the experimental ones. Further, the deduced cation distributions were used to fit the magnetization data using the Localized canting of spins approach to explain the “recovery” of collinear spin structure due to Al3+ substitution in Mg-Zn ferrites, which is the case of A-site magnetic dilution and non-collinear spin structure. Since the distribution of cations in the spinel ferrites plays a very important role with regard to their electrical and magnetic properties, it is essential to determine the cation distribution in the spinel lattice.
Keywords: Spinel ferrites, Localized canting of spins, X-ray diffraction, Programming in Borland C.
Downloads: 3807
7300 A Proposal of Community based Facility Management Performance (CbFM) in the Education System of Batubara District in Indonesia
Authors: Amilia Hasbullah, Wan Zahari Wan Yussof, Maziah Ismail
Abstract:
The primary education system in Indonesia involves the community, represented by the school committee, in achieving quality education through school facility performance. The low level of school committee involvement in the education system has become an issue in the development of education and is reflected in the quality of education. This paper discusses the conceptual framework and methodology for assessing the performance of school committees in the management of school facilities in the Batubara district of Indonesia. The concepts of Community based Facility Management (CbFM) and Logometrix are used as a basis for measuring school committee performance in order to address the needs of quality school management. The data will be taken from questionnaires distributed to those who work in and use school facilities spread over seven sub-districts of Batubara, Indonesia. The result of this study is expected to provide a guide for evaluating the performance of existing school committees in improving the quality of education in Indonesia.
Keywords: community based facility management, School facility management, School committee performance.
Downloads: 2095
7299 Architecture of Large-Scale Systems
Authors: Arne Koschel, Irina Astrova, Elena Deutschkämer, Jacob Ester, Johannes Feldmann
Abstract:
In this paper various techniques in relation to large-scale systems are presented. At first, explanation of large-scale systems and differences from traditional systems are given. Next, possible specifications and requirements on hardware and software are listed. Finally, examples of large-scale systems are presented.
Keywords: Distributed file systems, caching, large scale systems, MapReduce algorithm, NoSQL databases.
Downloads: 3057
7298 An Assessment of Technological Competencies on Professional Service Firms Business Performance
Authors: Sulaiman Ainin, Yusniza Kamar ulzaman, Abdul Ghani Farinda
Abstract:
This study was initiated with a three-pronged objective: first, to identify the relationship between Technological Competencies factors (Technical Capability, Firm Innovativeness and E-Business Practices) and professional service firms' business performance; second, to investigate the predictors of professional service firms' business performance; and finally, to evaluate the predictors of business performance according to the type of professional service firm. A survey questionnaire was deployed to collect empirical data. The questionnaire was distributed to the owners of professional small and medium-sized service enterprises in the Accounting, Legal, Engineering and Architecture sectors. Analysis showed that all three Technology Competency factors have a moderate effect on business performance. In addition, the regression models indicate that technical capability is the most influential determinant of business performance, followed by e-business practices and firm innovativeness. Consequently, the main predictor of business performance for all types of firms is technical capability.
Keywords: technology competency, technology capability, innovativeness, E-business practice
Downloads: 1637
7297 Perceived Ease-of-Use and Intention to Use E-Government Services in Ghana: The Moderating Role of Perceived Usefulness
Authors: Isaac Kofi Mensah
Abstract:
Public sector organizations, ministries, departments and local government agencies are adopting e-government as a means to provide efficient and quality service delivery to citizens. The purpose of this research paper is to examine the extent to which perceived usefulness (PU) of e-government services moderates the relationship between perceived ease-of-use (PEOU) of e-government services and intention to use (IU) e-government services in Ghana. A structured research questionnaire was developed and administered to 700 potential respondents in Ghana, of which 693 responded, representing 99% of the questionnaires distributed. The Technology Acceptance Model (TAM) was used as the theoretical framework for the study. The Statistical Package for Social Science (SPSS) was used to capture and analyze the data. The results indicate that, although PU and PEOU are main determinants of citizens’ intention to adopt and use e-government services in Ghana, the relationship between PEOU and IU was not shown to be significantly moderated by the PU of e-government services. The implications of this finding for theory and practice are further discussed.
Keywords: E-government services, intention to use, moderating role, perceived ease-of-use, perceived usefulness, Ghana, technology acceptance model.
Downloads: 1533
7296 Barriers to the Use of Factoring Accounts Receivables: The Ghanaian Contractor’s Perception
Authors: E. Kissi, V. K. Acheamfour, J. J. Gyimah, T. Adjei-Kumi
Abstract:
Factoring accounts receivable is widely accepted as an alternative financing source and is utilized in almost every industry that sells business-to-business or business-to-government. However, its patronage in the construction industry is very limited, as some barriers hinder its application there. This study aims at assessing the barriers to the use of factoring accounts receivables in the Ghanaian construction industry. The study adopted the sequential exploratory research method, in which structured and unstructured questionnaires were distributed by convenience sampling to D1K1 and D2K2 construction firms in Ghana. Data were analyzed using the one-sample t-test and Kendall’s coefficient of concordance. The most severe challenge identified is the high cost of factoring patronage. Other critical challenges identified were low knowledge of factoring processes, inadequate access to information on factoring, and the high risks involved in factoring. Hence, it is recommended that contractors be made aware of the prospects of factoring accounts receivables in the construction industry. This study serves as a basis for further rigorous research into factoring of accounts receivables in the industry.
Keywords: Barriers, contractors, factoring accounts receivables, Ghanaian, perception.
Downloads: 549
7295 Efficient Lossless Compression of Weather Radar Data
Authors: Wei-hua Ai, Wei Yan, Xiang Li
Abstract:
Data compression is used operationally to reduce bandwidth and storage requirements. An efficient method for achieving lossless weather radar data compression is presented. The characteristics of the data are taken into account and the optical linear prediction is used for the PPI images in the weather radar data in the proposed method. The next PPI image is identical to the current one and a dramatic reduction in source entropy is achieved by using the prediction algorithm. Some lossless compression methods are used to compress the predicted data. Experimental results show that for the weather radar data, the method proposed in this paper outperforms the other methods.
Keywords: Lossless compression, weather radar data, optical linear prediction, PPI image
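The predict-then-compress idea in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the predictor is the simplest possible one (the next frame is predicted to equal the current frame), standing in for the paper's linear predictor, whose coefficients are not given, and `zlib` stands in for the unspecified lossless back-end. The function names `encode` and `decode` and the synthetic frames are hypothetical.

```python
import random
import zlib


def encode(prev_frame: bytes, frame: bytes) -> bytes:
    """Compress `frame` as its residual against a prediction from `prev_frame`.

    ASSUMPTION: the prediction is simply prev_frame itself (order-one
    predictor with coefficient 1), not the paper's predictor.
    """
    if len(prev_frame) != len(frame):
        raise ValueError("frames must have the same size")
    # Residuals are taken modulo 256 so each one fits in a byte and the
    # transform stays exactly invertible.
    residual = bytes((cur - pred) % 256 for cur, pred in zip(frame, prev_frame))
    return zlib.compress(residual, 9)


def decode(prev_frame: bytes, payload: bytes) -> bytes:
    """Invert `encode`: add the prediction back onto the stored residual."""
    residual = zlib.decompress(payload)
    return bytes((res + pred) % 256 for res, pred in zip(residual, prev_frame))


# Synthetic stand-in for two consecutive PPI images: the second frame differs
# from the first in only a few samples.
rng = random.Random(0)
frame_a = bytes(rng.randrange(256) for _ in range(4096))
frame_b = bytearray(frame_a)
for i in range(0, len(frame_b), 100):
    frame_b[i] = (frame_b[i] + 1) % 256
frame_b = bytes(frame_b)

payload = encode(frame_a, frame_b)
restored = decode(frame_a, payload)
```

Because the residual is mostly zeros when consecutive frames are similar, its entropy is far lower than that of the raw frame, which is what lets a general-purpose lossless coder shrink it.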
Downloads: 2257
7294 Artificial Intelligence Techniques Applications for Power Disturbances Classification
Authors: K. Manimala, Dr. K. Selvi, R. Ahila
Abstract:
Artificial Intelligence (AI) methods are increasingly being used for problem solving. This paper concerns the use of AI-type learning machines for the power quality problem, a problem of general interest to power systems, which must provide quality power to all appliances. Electrical power of good quality is essential for the proper operation of electronic equipment such as computers and PLCs. Malfunction of such equipment may lead to loss of production or disruption of critical services, resulting in huge financial and other losses. It is therefore necessary that critical loads be supplied with electricity of acceptable quality. Recognizing the presence of any disturbance and classifying an existing disturbance into a particular type is the first step in combating the problem. In this work, two classes of AI methods for power quality data mining are studied: Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs). We show that SVMs are superior to ANNs in two critical respects: SVMs train and run an order of magnitude faster, and SVMs give higher classification accuracy.
Keywords: back propagation network, power quality, probabilistic neural network, radial basis function support vector machine
Downloads: 1557
7293 Conceptualizing the Knowledge to Manage and Utilize Data Assets in the Context of Digitization: Case Studies of Multinational Industrial Enterprises
Authors: Martin Böhmer, Agatha Dabrowski, Boris Otto
Abstract:
The trend of digitization significantly changes the role of data for enterprises. Data turn from an enabler into an intangible organizational asset that requires management and qualifies as a tradeable good. The idea of a networked economy has gained momentum in the data domain as collaborative approaches to data management emerge. Traditional organizational knowledge consequently needs to be extended by comprehensive knowledge about data. This knowledge about data is vital for organizations to ensure that data quality requirements are met and that data can be effectively utilized and sovereignly governed. As this specific knowledge has so far received little attention from academics, the aim of the research presented in this paper is to conceptualize it by proposing a “data knowledge model”. Relevant model entities have been identified using a design science research (DSR) approach that iteratively integrates insights from various industry case studies and literature research.
Keywords: Data management, digitization, Industry 4.0, knowledge engineering, metamodel.
Downloads 1458
7292 A Methodology for Data Migration between Different Database Management Systems
Authors: Bogdan Walek, Cyril Klimes
Abstract:
Data migration is currently a highly topical area. Current tools for migrating data between relational databases have several disadvantages, which are presented in this paper. We propose a methodology for migrating database tables and their data between various types of relational database management systems (RDBMS). The proposed methodology contains an expert system. The expert system contains a knowledge base composed of IF-THEN rules and, based on the input data, suggests appropriate data types for the columns of the database tables. The proposed tool, which contains the expert system, can also optimize the data types in the target RDBMS tables based on the processed data of the source RDBMS tables. The proposed expert system is demonstrated on the migration of a selected database from the source RDBMS to the target RDBMS.
Keywords: Expert system, fuzzy, data migration, database, relational database, data type, relational database management system.
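A rule base that suggests column data types from the source data can be pictured with a small sketch. The rules, thresholds, and type names below are hypothetical and use crisp conditions for brevity; they are not the paper's knowledge base, which the keywords suggest is fuzzy.

```python
def _is_int(text):
    """True if the string parses as an integer."""
    try:
        int(text)
        return True
    except ValueError:
        return False

def _is_float(text):
    """True if the string parses as a floating-point number."""
    try:
        float(text)
        return True
    except ValueError:
        return False

def suggest_column_type(values):
    """Suggest a generic SQL type for a column from its string values.

    Each branch is one IF-THEN rule (illustrative assumptions only);
    None stands for NULL and is ignored.
    """
    present = [v for v in values if v is not None]
    # IF the column has no data THEN fall back to the smallest text type.
    if not present:
        return "VARCHAR(1)"
    # IF every value is an integer THEN pick the narrowest integer type.
    if all(_is_int(v) for v in present):
        largest = max(abs(int(v)) for v in present)
        if largest < 2**15:
            return "SMALLINT"
        if largest < 2**31:
            return "INTEGER"
        return "BIGINT"
    # IF every value is numeric THEN use a floating-point type.
    if all(_is_float(v) for v in present):
        return "DOUBLE PRECISION"
    # OTHERWISE size a text column to the longest observed value.
    return f"VARCHAR({max(len(v) for v in present)})"
```

For example, `suggest_column_type(["1", "70000"])` falls through the `SMALLINT` rule and returns `"INTEGER"`, which illustrates how the processed source data can narrow the target type.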
Downloads 3492
7291 Opening up Government Datasets for Big Data Analysis to Support Policy Decisions
Authors: K. Hardy, A. Maurushat
Abstract:
Policy makers are increasingly looking to make evidence-based decisions. Evidence-based decisions have historically used rigorous methodologies of empirical studies by research institutes, as well as less reliable immediate surveys and polls, often with limited sample sizes. As we move into the era of Big Data analytics, policy makers are looking to different methodologies to deliver reliable empirics in real time. The question is not why these people did this for the last 10 years, but why these people are doing this now, whether this is undesirable, and how we can have an impact to promote change immediately. Big data analytics rely heavily on government data that has been released into the public domain. The open data movement promises greater productivity and more efficient delivery of services; however, Australian government agencies remain reluctant to release their data to the general public. This paper considers the barriers to releasing government data as open data, and how these barriers might be overcome.
Keywords: Big data, open data, productivity, transparency.
Downloads 1636
7290 Forthcoming Big Data on Smart Buildings and Cities: An Experimental Study on Correlations among Urban Data
Authors: Yu-Mi Song, Sung-Ah Kim, Dongyoun Shin
Abstract:
Cities are complex systems of diverse and intertwined activities. These activities and their complex interrelationships create diverse urban phenomena, which in turn have considerable influence on the lives of citizens. This research aimed to develop a method to reveal the causes and effects among diverse urban elements in order to enable a better understanding of urban activities and, from that, better urban planning strategies. Specifically, this study was conducted to solve a data-recommendation problem found on a Korean public data homepage. First, a correlation analysis was conducted to find the correlations among random urban data. Then, based on the results of that correlation analysis, a weighted data network for each urban dataset was provided to users. It is expected that the weights of urban data thereby obtained will provide insights into cities and show how diverse urban activities influence each other and induce feedback.
Keywords: Big data, correlation analysis, data recommendation system, urban data network.
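The two steps described in the abstract, a pairwise correlation analysis followed by a weighted data network, can be sketched as below. This assumes Pearson correlation and a cut-off threshold, neither of which the abstract specifies; the dataset names and values are invented for illustration.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series.

    Assumes neither series is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def correlation_network(datasets, threshold=0.5):
    """Return {(name_a, name_b): r} for pairs whose |r| reaches the threshold."""
    edges = {}
    for a, b in combinations(sorted(datasets), 2):
        r = pearson(datasets[a], datasets[b])
        if abs(r) >= threshold:
            edges[(a, b)] = r
    return edges

# Hypothetical urban series observed over the same five periods.
datasets = {
    "bus_riders": [10, 20, 30, 40, 50],
    "taxi_trips": [50, 40, 30, 20, 10],
    "rainfall": [3, 1, 4, 1, 5],
}
network = correlation_network(datasets)
```

Here only the strongly (negatively) correlated pair `bus_riders` and `taxi_trips` survives the threshold. A recommender could then suggest, for any dataset a user opens, its neighbours in `network` ranked by `abs(r)`. Correlation alone does not establish the direction of cause and effect.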
Downloads 1105
7289 On the Combination of Patient-Generated Data with Data from a Secure Clinical Network Environment – A Practical Example
Authors: Jeroen S. de Bruin, Karin Schindler, Christian Schuh
Abstract:
With more and more mobile health applications appearing due to the popularity of smartphones, the possibility arises that the data they generate can be used to improve the medical diagnostic process, as well as the overall quality of healthcare, while at the same time lowering costs. However, as yet there have been no reports of a successful combination of patient-generated data from smartphones with data from clinical routine. In this paper, we describe how these two types of data can be combined in a secure way without modification to hospital information systems, and how together they can be used in a medical expert system for automatic nutritional classification and triage.
Keywords: Data integration, disease-related malnutrition, expert systems, mobile health.
Downloads 2200
7288 Precious and Rare Metals in Overburden Carbonaceous Rocks: Methods of Extraction
Authors: Tatyana Alexandrova, Alexandr Alexandrov, Nadezhda Nikolaeva
Abstract:
The comprehensive development of mineral resources is an urgent, high-priority problem aimed at ecologically safe processing; one of its components is revealing the influence of the forms of element compounds in raw materials and in processing products. In view of the depletion of precious metal reserves at traditional deposits, in the 21st century large open-cast deposits localized in black shale strata are beginning to play the leading role. Carbonaceous (black) shales carry a heightened metallogenic potential. Black shales with a high carbon content are widely distributed within the Bureinsky massif. According to academician Hanchuk's data, the black shales of the Sutirskaya series generally contain PGEs in native form. The presence in crude ore of gold and PGE compounds that are highly sorptive towards carbonaceous matter decreases the extraction of valuable components because of their sorption onto dispersed carbonaceous matter.
Keywords: Carbonaceous rocks, bitumens, precious metals, concentration, extraction.
Downloads 1647
7287 Full-genomic Network Inference for Non-model organisms: A Case Study for the Fungal Pathogen Candida albicans
Authors: Jörg Linde, Ekaterina Buyko, Robert Altwasser, Udo Hahn, Reinhard Guthke
Abstract:
Reverse engineering of full-genomic interaction networks based on compendia of expression data has been successfully applied to a number of model organisms. This study adapts these approaches to an important non-model organism: the major human fungal pathogen Candida albicans. During the infection process, the pathogen can adapt to a wide range of environmental niches and reversibly changes its growth form. Given the importance of these processes, it is important to know how they are regulated. This study presents a reverse engineering strategy able to infer full-genomic interaction networks for C. albicans based on linear regression, utilizing the sparseness criterion (LASSO). To overcome the limited amount of expression data and the small number of known interactions, we utilize different prior-knowledge sources to guide the network inference towards a knowledge-driven solution. Since no database of known interactions for C. albicans exists, we use a text-mining system which utilizes full-text research papers to identify known regulatory interactions. By comparing with these known regulatory interactions, we find optimal values for the global modelling parameters weighting the influence of the sparseness criterion and the prior knowledge. Furthermore, we show that soft integration of prior knowledge additionally improves the performance. Finally, we compare the performance of our approach to state-of-the-art network inference approaches.
Keywords: Pathogen, network inference, text-mining, Candida albicans, LASSO, mutual information, reverse engineering, linear regression, modelling.
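One common way to combine a sparseness criterion with soft prior knowledge is to give each candidate regulator its own penalty weight, so that interactions supported by prior knowledge are penalized less. The sketch below assumes that formulation, which the abstract does not confirm, and solves it with plain coordinate descent; all names and the toy data are hypothetical.

```python
def soft_threshold(rho, t):
    """Shrink rho towards zero by t; values within [-t, t] become exactly 0."""
    if rho > t:
        return rho - t
    if rho < -t:
        return rho + t
    return 0.0

def weighted_lasso(X, y, lam, weights, n_iter=200):
    """Minimize (1/2n)*||y - X b||^2 + lam * sum_j weights[j] * |b_j|.

    X is a list of rows (samples), y the target gene's expression, and
    weights[j] < 1 marks regulator j as supported by prior knowledge.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            scale = 0.0
            for i in range(n):
                # Prediction for sample i without regulator j's contribution.
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                scale += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam * weights[j]) / (scale / n)
    return beta

# Toy data: the target depends strongly on regulator 0 and weakly on regulator 1.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [1.3, 0.7, -0.7, -1.3]

plain = weighted_lasso(X, y, lam=0.5, weights=[1.0, 1.0])     # no prior knowledge
informed = weighted_lasso(X, y, lam=0.5, weights=[1.0, 0.2])  # prior supports regulator 1
```

With uniform weights the weak interaction is shrunk to exactly zero, while lowering the penalty weight of regulator 1 lets it remain in the model. For a full network, one such regression would be fitted per target gene, with `lam` and the prior weighting tuned against the known interactions.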
Downloads 1673
7286 Comparison of Imputation Techniques for Efficient Prediction of Software Fault Proneness in Classes
Authors: Geeta Sikka, Arvinder Kaur Takkar, Moin Uddin
Abstract:
Missing data is a persistent problem in almost all areas of empirical research. Missing data must be treated very carefully, as data plays a fundamental role in every analysis. Improper treatment can distort the analysis or generate biased results. In this paper, we compare and contrast various imputation techniques on missing data sets and make an empirical evaluation of these methods so as to construct quality software models. Our empirical study is based on two public NASA datasets, KC4 and KC1. The actual data sets, of 125 cases and 2107 cases respectively, without any missing values, were considered. The data sets were used to create Missing at Random (MAR) data. Listwise Deletion (LD), Mean Substitution (MS), Interpolation, Regression with an error term, and Expectation-Maximization (EM) approaches were used to compare the effects of the various techniques.
Keywords: Missing data, Imputation, Missing Data Techniques.
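Three of the simpler techniques named above can be illustrated in a few lines. This is a minimal sketch with invented values, not the study's procedure; `None` marks a missing value, and the interpolation variant assumes that the order of the cases is meaningful.

```python
def listwise_deletion(rows):
    """Drop every case (row) that has at least one missing value."""
    return [row for row in rows if None not in row]

def mean_substitution(column):
    """Replace each missing value with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

def linear_interpolation(column):
    """Fill interior gaps on a straight line between the nearest observed
    neighbours; leading and trailing gaps are left as None."""
    filled = list(column)
    known = [i for i, v in enumerate(column) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (column[right] - column[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = column[left] + step * (i - left)
    return filled

# Hypothetical metric column with three missing entries.
column = [2.0, None, 6.0, None, None, 12.0]
by_mean = mean_substitution(column)
by_line = linear_interpolation(column)
complete_cases = listwise_deletion([(1, 2), (None, 3), (4, None), (5, 6)])
```

Listwise deletion shrinks the sample, mean substitution keeps every case but flattens the variance of the column, and interpolation preserves local trends only when neighbouring cases are related. Trade-offs of this kind are what an empirical comparison of imputation methods measures.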
Downloads 1667
7285 Cluster Analysis for the Statistical Modeling of Aesthetic Judgment Data Related to Comics Artists
Authors: George E. Tsekouras, Evi Sampanikou
Abstract:
We compare three categorical data clustering algorithms with respect to the problem of classifying cultural data related to the aesthetic judgment of comics artists. Such a classification is very important in Comics Art theory, since determining any classes of similarity in this kind of data will provide art historians with very fruitful information about the evolution of Comics Art. To establish this, we use a categorical data set and study it by employing three categorical data clustering algorithms. The performances of these algorithms are compared with each other, and interpretations of the clustering results are also given.
Keywords: Aesthetic judgment, comics artists, cluster analysis, categorical data.
Downloads 1634