Search results for: data scarcity
24849 Analysis and Rule Extraction of Coronary Artery Disease Data Using Data Mining
Authors: Rezaei Hachesu Peyman, Oliyaee Azadeh, Salahzadeh Zahra, Alizadeh Somayyeh, Safaei Naser
Abstract:
Coronary Artery Disease (CAD) is one major cause of disability in adults and one main cause of death in developed. In this study, data mining techniques including Decision Trees, Artificial neural networks (ANNs), and Support Vector Machine (SVM) analyze CAD data. Data of 4948 patients who had suffered from heart diseases were included in the analysis. CAD is the target variable, and 24 inputs or predictor variables are used for the classification. The performance of these techniques is compared in terms of sensitivity, specificity, and accuracy. The most significant factor influencing CAD is chest pain. Elderly males (age > 53) have a high probability to be diagnosed with CAD. SVM algorithm is the most useful way for evaluation and prediction of CAD patients as compared to non-CAD ones. Application of data mining techniques in analyzing coronary artery diseases is a good method for investigating the existing relationships between variables.Keywords: classification, coronary artery disease, data-mining, knowledge discovery, extract
Procedia PDF Downloads 65724848 Sensor Data Analysis for a Large Mining Major
Authors: Sudipto Shanker Dasgupta
Abstract:
One of the largest mining companies wanted to look at health analytics for their driverless trucks. These trucks were the key to their supply chain logistics. The automated trucks had multi-level sub-assemblies which would send out sensor information. The use case that was worked on was to capture the sensor signal from the truck subcomponents and analyze the health of the trucks from repair and replacement purview. Open source software was used to stream the data into a clustered Hadoop setup in Amazon Web Services cloud and Apache Spark SQL was used to analyze the data. All of this was achieved through a 10 node amazon 32 core, 64 GB RAM setup real-time analytics was achieved on ‘300 million records’. To check the scalability of the system, the cluster was increased to 100 node setup. This talk will highlight how Open Source software was used to achieve the above use case and the insights on the high data throughput on a cloud set up.Keywords: streaming analytics, data science, big data, Hadoop, high throughput, sensor data
Procedia PDF Downloads 40224847 Data-Centric Anomaly Detection with Diffusion Models
Authors: Sheldon Liu, Gordon Wang, Lei Liu, Xuefeng Liu
Abstract:
Anomaly detection, also referred to as one-class classification, plays a crucial role in identifying product images that deviate from the expected distribution. This study introduces Data-centric Anomaly Detection with Diffusion Models (DCADDM), presenting a systematic strategy for data collection and further diversifying the data with image generation via diffusion models. The algorithm addresses data collection challenges in real-world scenarios and points toward data augmentation with the integration of generative AI capabilities. The paper explores the generation of normal images using diffusion models. The experiments demonstrate that with 30% of the original normal image size, modeling in an unsupervised setting with state-of-the-art approaches can achieve equivalent performances. With the addition of generated images via diffusion models (10% equivalence of the original dataset size), the proposed algorithm achieves better or equivalent anomaly localization performance.Keywords: diffusion models, anomaly detection, data-centric, generative AI
Procedia PDF Downloads 8124846 Temporal Effects on Chemical Composition of Treated Wastewater and Borehole Water Used for Irrigation in Limpopo Province, South Africa
Authors: Pholosho M. Kgopa, Phatu W. Mashela, Alen Manyevere
Abstract:
Increasing incidents of drought spells in most Sub-Saharan Africa call for using alternative sources of water for irrigation in arid and semi-arid regions. A study was conducted to investigate chemical composition of borehole and treated wastewater from different sampling disposal sites at University of Limpopo Experimental Farm (ULEF). A 4 × 5 factorial experiment, with the borehole as a reference sampling site and three other sampling sites along the wastewater disposal system was conducted over five months. Water samples were collected at four sites namely, (a) exit from Pond 16 into the furrow, (b) entry into night-dam, (c) exit from night dam to irrigated fields and (d) exit from borehole to irrigated fields. Water samples were collected in the middle of each month, starting from July to November 2016. Samples were analysed for pH, EC, Ca, Mg, Na, K, Al, B, Zn, Cu, Cr, Pb, Cd and As. The site × time interactions were highly significant for Ca, Mg, Zn, Cu, Cr, Pb, Cd, and As variables, but not for Na and K. Sampling site was highly significant on all variables, with sampling period not significant for K and Na. Relative to water from the borehole, Na concentration in wastewater samples from the night-dam exit, night-dam entry and Pond16 exit were lower by 69, 34 and 55%, respectively. Relative to borehole water, Al was higher in wastewater sampling sites. In conclusion, both sampling site and period affected the chemical composition of treated wastewater.Keywords: irrigation water quality, spatial effects, temporal effects, water reuse, water scarcity
Procedia PDF Downloads 23824845 Regulation on the Protection of Personal Data Versus Quality Data Assurance in the Healthcare System Case Report
Authors: Elizabeta Krstić Vukelja
Abstract:
Digitization of personal data is a consequence of the development of information and communication technologies that create a new work environment with many advantages and challenges, but also potential threats to privacy and personal data protection. Regulation (EU) 2016/679 of the European Parliament and of the Council is becoming a law and obligation that should address the issues of personal data protection and information security. The existence of the Regulation leads to the conclusion that national legislation in the field of virtual environment, protection of the rights of EU citizens and processing of their personal data is insufficiently effective. In the health system, special emphasis is placed on the processing of special categories of personal data, such as health data. The healthcare industry is recognized as a particularly sensitive area in which a large amount of medical data is processed, the digitization of which enables quick access and quick identification of the health insured. The protection of the individual requires quality IT solutions that guarantee the technical protection of personal categories. However, the real problems are the technical and human nature and the spatial limitations of the application of the Regulation. Some conclusions will be drawn by analyzing the implementation of the basic principles of the Regulation on the example of the Croatian health care system and comparing it with similar activities in other EU member states.Keywords: regulation, healthcare system, personal dana protection, quality data assurance
Procedia PDF Downloads 3724844 Parallel Vector Processing Using Multi Level Orbital DATA
Authors: Nagi Mekhiel
Abstract:
Many applications use vector operations by applying single instruction to multiple data that map to different locations in conventional memory. Transferring data from memory is limited by access latency and bandwidth affecting the performance gain of vector processing. We present a memory system that makes all of its content available to processors in time so that processors need not to access the memory, we force each location to be available to all processors at a specific time. The data move in different orbits to become available to other processors in higher orbits at different time. We use this memory to apply parallel vector operations to data streams at first orbit level. Data processed in the first level move to upper orbit one data element at a time, allowing a processor in that orbit to apply another vector operation to deal with serial code limitations inherited in all parallel applications and interleaved it with lower level vector operations.Keywords: Memory Organization, Parallel Processors, Serial Code, Vector Processing
Procedia PDF Downloads 26824843 Reconstructability Analysis for Landslide Prediction
Authors: David Percy
Abstract:
Landslides are a geologic phenomenon that affects a large number of inhabited places and are constantly being monitored and studied for the prediction of future occurrences. Reconstructability analysis (RA) is a methodology for extracting informative models from large volumes of data that work exclusively with discrete data. While RA has been used in medical applications and social science extensively, we are introducing it to the spatial sciences through applications like landslide prediction. Since RA works exclusively with discrete data, such as soil classification or bedrock type, working with continuous data, such as porosity, requires that these data are binned for inclusion in the model. RA constructs models of the data which pick out the most informative elements, independent variables (IVs), from each layer that predict the dependent variable (DV), landslide occurrence. Each layer included in the model retains its classification data as a primary encoding of the data. Unlike other machine learning algorithms that force the data into one-hot encoding type of schemes, RA works directly with the data as it is encoded, with the exception of continuous data, which must be binned. The usual physical and derived layers are included in the model, and testing our results against other published methodologies, such as neural networks, yields accuracy that is similar but with the advantage of a completely transparent model. The results of an RA session with a data set are a report on every combination of variables and their probability of landslide events occurring. In this way, every combination of informative state combinations can be examined.Keywords: reconstructability analysis, machine learning, landslides, raster analysis
Procedia PDF Downloads 6424842 The Effect of Foreign Owned Firms and Licensed Manufacturing Agreements on Innovation: Case of Pharmaceutical Firms in Developing Countries
Authors: Ilham Benali, Nasser Hajji, Nawfal Acha
Abstract:
Given the fact that the pharmaceutical industry is a commonly studied sector in the context of innovation, the majority of innovation research is devoted to the developed markets known by high research and development (R&D) assets and intensive innovation. In contrast, in developing countries where R&D assets are very low, there is relatively little research to mention in the area of pharmaceutical sector innovation, characterized mainly by two principal elements which are the presence of foreign-owned firms and licensed manufacturing agreements between local firms and multinationals. With the scarcity of research in this field, this paper attempts to study the effect of these two elements on the firms’ innovation tendencies. Other traditional factors that influence innovation, which are the age and the size of the firm, the R&D activities and the market structure, revealed in the literature review, will be included in the study in order to try to make this work more exhaustive. The study starts by examining innovation tendency in pharmaceutical firms located in developing countries before analyzing the effect of foreign-owned firms and licensed manufacturing agreements between local firms and multinationals on technological, organizational and marketing innovation. Based on the related work and on the theoretical framework developed, there is a probability that foreign-owned firms and licensed manufacturing agreements between local firms and multinationals have a negative influence on technological innovation. The opposite effect is possible in the case of organizational and marketing innovation.Keywords: developing countries, foreign owned firms, innovation, licensed manufacturing agreements, pharmaceutical industry
Procedia PDF Downloads 16324841 A Preliminary Literature Review of Digital Transformation Case Studies
Authors: Vesna Bosilj Vukšić, Lucija Ivančić, Dalia Suša Vugec
Abstract:
While struggling to succeed in today’s complex market environment and provide better customer experience and services, enterprises encompass digital transformation as a means for reaching competitiveness and foster value creation. A digital transformation process consists of information technology implementation projects, as well as organizational factors such as top management support, digital transformation strategy, and organizational changes. However, to the best of our knowledge, there is little evidence about digital transformation endeavors in organizations and how they perceive it – is it only about digital technologies adoption or a true organizational shift is needed? In order to address this issue and as the first step in our research project, a literature review is conducted. The analysis included case study papers from Scopus and Web of Science databases. The following attributes are considered for classification and analysis of papers: time component; country of case origin; case industry and; digital transformation concept comprehension, i.e. focus. Research showed that organizations – public, as well as private ones, are aware of change necessity and employ digital transformation projects. Also, the changes concerning digital transformation affect both manufacturing and service-based industries. Furthermore, we discovered that organizations understand that besides technologies implementation, organizational changes must also be adopted. However, with only 29 relevant papers identified, research positioned digital transformation as an unexplored and emerging phenomenon in information systems research. The scarcity of evidence-based papers calls for further examination of this topic on cases from practice.Keywords: digital strategy, digital technologies, digital transformation, literature review
Procedia PDF Downloads 21724840 Data Analytics in Hospitality Industry
Authors: Tammy Wee, Detlev Remy, Arif Perdana
Abstract:
In the recent years, data analytics has become the buzzword in the hospitality industry. The hospitality industry is another example of a data-rich industry that has yet fully benefited from the insights of data analytics. Effective use of data analytics can change how hotels operate, market and position themselves competitively in the hospitality industry. However, at the moment, the data obtained by individual hotels remain under-utilized. This research is a preliminary research on data analytics in the hospitality industry, using an in-depth face-to-face interview on one hotel as a start to a multi-level research. The main case study of this research, hotel A, is a chain brand of international hotel that has been systematically gathering and collecting data on its own customer for the past five years. The data collection points begin from the moment a guest book a room until the guest leave the hotel premises, which includes room reservation, spa booking, and catering. Although hotel A has been gathering data intelligence on its customer for some time, they have yet utilized the data to its fullest potential, and they are aware of their limitation as well as the potential of data analytics. Currently, the utilization of data analytics in hotel A is limited in the area of customer service improvement, namely to enhance the personalization of service for each individual customer. Hotel A is able to utilize the data to improve and enhance their service which in turn, encourage repeated customers. According to hotel A, 50% of their guests returned to their hotel, and 70% extended nights because of the personalized service. Apart from using the data analytics for enhancing customer service, hotel A also uses the data in marketing. Hotel A uses the data analytics to predict or forecast the change in consumer behavior and demand, by tracking their guest’s booking preference, payment preference and demand shift between properties. However, hotel A admitted that the data they have been collecting was not fully utilized due to two challenges. The first challenge of using data analytics in hotel A is the data is not clean. At the moment, the data collection of one guest profile is meaningful only for one department in the hotel but meaningless for another department. Cleaning up the data and getting standards correctly for usage by different departments are some of the main concerns of hotel A. The second challenge of using data analytics in hotel A is the non-integral internal system. At the moment, the internal system used by hotel A do not integrate with each other well, limiting the ability to collect data systematically. Hotel A is considering another system to replace the current one for more comprehensive data collection. Hotel proprietors recognized the potential of data analytics as reported in this research, however, the current challenges of implementing a system to collect data come with a cost. This research has identified the current utilization of data analytics and the challenges faced when it comes to implementing data analytics.Keywords: data analytics, hospitality industry, customer relationship management, hotel marketing
Procedia PDF Downloads 17824839 Realization of a (GIS) for Drilling (DWS) through the Adrar Region
Authors: Djelloul Benatiallah, Ali Benatiallah, Abdelkader Harouz
Abstract:
Geographic Information Systems (GIS) include various methods and computer techniques to model, capture digitally, store, manage, view and analyze. Geographic information systems have the characteristic to appeal to many scientific and technical field, and many methods. In this article we will present a complete and operational geographic information system, following the theoretical principles of data management and adapting to spatial data, especially data concerning the monitoring of drinking water supply wells (DWS) Adrar region. The expected results of this system are firstly an offer consulting standard features, updating and editing beneficiaries and geographical data, on the other hand, provides specific functionality contractors entered data, calculations parameterized and statistics.Keywords: GIS, DWS, drilling, Adrar
Procedia PDF Downloads 30824838 Generic Data Warehousing for Consumer Electronics Retail Industry
Authors: S. Habte, K. Ouazzane, P. Patel, S. Patel
Abstract:
The dynamic and highly competitive nature of the consumer electronics retail industry means that businesses in this industry are experiencing different decision making challenges in relation to pricing, inventory control, consumer satisfaction and product offerings. To overcome the challenges facing retailers and create opportunities, we propose a generic data warehousing solution which can be applied to a wide range of consumer electronics retailers with a minimum configuration. The solution includes a dimensional data model, a template SQL script, a high level architectural descriptions, ETL tool developed using C#, a set of APIs, and data access tools. It has been successfully applied by ASK Outlets Ltd UK resulting in improved productivity and enhanced sales growth.Keywords: consumer electronics, data warehousing, dimensional data model, generic, retail industry
Procedia PDF Downloads 40924837 Sequential Data Assimilation with High-Frequency (HF) Radar Surface Current
Authors: Lei Ren, Michael Hartnett, Stephen Nash
Abstract:
The abundant measured surface current from HF radar system in coastal area is assimilated into model to improve the modeling forecasting ability. A simple sequential data assimilation scheme, Direct Insertion (DI), is applied to update model forecast states. The influence of Direct Insertion data assimilation over time is analyzed at one reference point. Vector maps of surface current from models are compared with HF radar measurements. Root-Mean-Squared-Error (RMSE) between modeling results and HF radar measurements is calculated during the last four days with no data assimilation.Keywords: data assimilation, CODAR, HF radar, surface current, direct insertion
Procedia PDF Downloads 57124836 Measured versus Default Interstate Traffic Data in New Mexico, USA
Authors: M. A. Hasan, M. R. Islam, R. A. Tarefder
Abstract:
This study investigates how the site specific traffic data differs from the Mechanistic Empirical Pavement Design Software default values. Two Weigh-in-Motion (WIM) stations were installed in Interstate-40 (I-40) and Interstate-25 (I-25) to developed site specific data. A computer program named WIM Data Analysis Software (WIMDAS) was developed using Microsoft C-Sharp (.Net) for quality checking and processing of raw WIM data. A complete year data from November 2013 to October 2014 was analyzed using the developed WIM Data Analysis Program. After that, the vehicle class distribution, directional distribution, lane distribution, monthly adjustment factor, hourly distribution, axle load spectra, average number of axle per vehicle, axle spacing, lateral wander distribution, and wheelbase distribution were calculated. Then a comparative study was done between measured data and AASHTOWare default values. It was found that the measured general traffic inputs for I-40 and I-25 significantly differ from the default values.Keywords: AASHTOWare, traffic, weigh-in-motion, axle load distribution
Procedia PDF Downloads 34024835 Effect of Thermal Energy on Inorganic Coagulation for the Treatment of Industrial Wastewater
Authors: Abhishek Singh, Rajlakshmi Barman, Tanmay Shah
Abstract:
Coagulation is considered to be one of the predominant water treatment processes which improve the cost effectiveness of wastewater. The sole purpose of this experiment on thermal coagulation is to increase the efficiency and the rate of reaction. The process uses renewable sources of energy which comprises of improved and minimized time method in order to eradicate the water scarcity of the regions which are on the brink of depletion. This paper includes the various effects of temperature on the standard coagulation treatment of wastewater and their effect on water quality. In addition, the coagulation is done with the mix of bottom/fly-ash that will act as an adsorbent and removes most of the minor and macro particles by means of adsorption which not only helps to reduce the environmental burden of fly ash but also enhance economic benefit. Also, the method of sand filtration is amalgamated in the process. The sand filter is an environmentally-friendly wastewater treatment method, which is relatively simple and inexpensive. The existing parameters were satisfied with the experimental results obtained in this study and were found satisfactory. The initial turbidity of the wastewater is 162 NTU. The initial temperature of the wastewater is 27 C. The temperature variation of the entire process is 50 C-80 C. The concentration of alum in wastewater is 60mg/L-320mg/L. The turbidity range is 8.31-28.1 NTU after treatment. pH variation is 7.73-8.29. The effective time taken is 10 minutes for thermal mixing and sedimentation. The results indicate that the presence of thermal energy affects the coagulation treatment process. The influence of thermal energy on turbidity is assessed along with renewable energy sources and increase of the rate of reaction of the treatment process.Keywords: adsorbent, sand filter, temperature, thermal coagulation
Procedia PDF Downloads 31924834 The Influence of English Immersion Program on Academic Performance: Case Study at a Sino-US Cooperative University in China
Authors: Leah Li Echiverri, Haoyu Shang, Yue Li
Abstract:
Wenzhou-Kean University (WKU) is a Sino-US Cooperative University in China. It practices the English Immersion Program (EIP), where all the courses are taught in English. Class discussions and presentations are pervasively interwoven in designing students’ learning experiences. This WKU model has brought positive influences on students and is in some way ahead of traditional college English majors. However, literature to support the perceptions on the positive outcomes of this teaching and learning model remain scarce. The distinctive profile of Chinese-ESL students in an English Medium of Instruction (EMI) environment contributes further to the scarcity of literature compared to existing studies conducted among ESL learners in Western educational settings. Hence, the study investigated the students’ perceptions towards the English Immersion Program and determine how it influences Chinese-ESL students’ academic performance (AP). This research can provide empirical data that would be helpful to educators, teaching practitioners, university administrators, and other researchers in making informed decisions when developing curricular reforms, instructional and pedagogical methods, and university-wide support programs using this educational model. The purpose of the study was to establish the relationship between the English Immersion Program and Academic Performance among Chinese-ESL students enrolled at WKU for the academic year 2020-2021. Course length, immersion location, course type, and instructional design were the constructs of the English immersion program. English language learning, learning efficiency, and class participation were used to measure academic performance. Descriptive-correlational design was used in this cross-sectional research project. A quantitative approach for data analysis was applied to determine the relationship between the English immersion program and Chinese-ESL students’ academic performance. The research was conducted at WKU; a Chinese-American jointly established higher educational institution located in Wenzhou, Zhejiang province. Convenience, random, and snowball sampling of 283 students, a response rate of 10.5%, were applied to represent the WKU student population. The questionnaire was posted through the survey website named Wenjuanxing and shared to QQ or WeChat. Cronbach’s alpha was used to test the reliability of the research instrument. Findings revealed that when professors integrate technology (PowerPoint, videos, and audios) in teaching, students pay more attention. This contributes to the acquisition of more professional knowledge in their major courses. As to course immersion, students perceive WKU as a good place to study, providing them a high degree of confidence to talk with their professors in English. This also contributes to their English fluency and better pronunciation in their communication. In the construct of designing instruction, the use of pictures, video clips, and professors’ non-verbal communication, and demonstration of concern for students encouraged students to be more active in-class participation. Findings on course length and academic performance indicated that students’ perception regarding taking courses during fall and spring terms can moderately contribute to their academic performance. In conclusion, the findings revealed a significantly strong positive relationship between course type, immersion location, instructional design, and academic performance.Keywords: class participation, English immersion program, English language learning, learning efficiency
Procedia PDF Downloads 17324833 Design of Knowledge Management System with Geographic Information System
Authors: Angga Hidayah Ramadhan, Luciana Andrawina, M. Azani Hasibuan
Abstract:
Data will be as a core of the decision if it has a good treatment or process, which is process that data into information, and information into knowledge to make a wisdom or decision. Today, many companies have not realize it include XYZ University Admission Directorate as executor of National Admission called Seleksi Masuk Bersama (SMB) that during the time, the workers only uses their feeling to make a decision. Whereas if it done, then that company can analyze the data to make a right decision to get a pin sales from student candidate or registrant that follow SMB as many as possible. Therefore, needs Knowledge Management System (KMS) with Geographic Information System (GIS) use 5C4C that can process that company data becomes more useful and can help make decisions. This information system can process data into information based on the pin sold data with 5C (Contextualized, Categorize, Calculation, Correction, Condensed) and convert information into knowledge with 4C (Comparing, Consequence, Connection, Conversation) that has been several steps until these data can be useful to make easier to take a decision or wisdom, resolve problems, communicate, and quicker to learn to the employees have not experience and also for ease of viewing/visualization based on spatial data that equipped with GIS functionality that can be used to indicate events in each province with indicator that facilitate in this system. The system also have a function to save the tacit on the system then to be proceed into explicit in expert system based on the problems that will be found from the consequences of information. With the system each team can make a decision with same ways, structured, and the important is based on the actual event/data.Keywords: 5C4C, data, information, knowledge
Procedia PDF Downloads 46024832 A Policy Strategy for Building Energy Data Management in India
Authors: Shravani Itkelwar, Deepak Tewari, Bhaskar Natarajan
Abstract:
The energy consumption data plays a vital role in energy efficiency policy design, implementation, and impact assessment. Any demand-side energy management intervention's success relies on the availability of accurate, comprehensive, granular, and up-to-date data on energy consumption. The Building sector, including residential and commercial, is one of the largest consumers of energy in India after the Industrial sector. With economic growth and increasing urbanization, the building sector is projected to grow at an unprecedented rate, resulting in a 5.6 times escalation in energy consumption till 2047 compared to 2017. Therefore, energy efficiency interventions will play a vital role in decoupling the floor area growth and associated energy demand, thereby increasing the need for robust data. In India, multiple institutions are involved in the collection and dissemination of data. This paper focuses on energy consumption data management in the building sector in India for both residential and commercial segments. It evaluates the robustness of data available through administrative and survey routes to estimate the key performance indicators and identify critical data gaps for making informed decisions. The paper explores several issues in the data, such as lack of comprehensiveness, non-availability of disaggregated data, the discrepancy in different data sources, inconsistent building categorization, and others. The identified data gaps are justified with appropriate examples. Moreover, the paper prioritizes required data in order of relevance to policymaking and groups it into "available," "easy to get," and "hard to get" categories. The paper concludes with recommendations to address the data gaps by leveraging digital initiatives, strengthening institutional capacity, institutionalizing exclusive building energy surveys, and standardization of building categorization, among others, to strengthen the management of building sector energy consumption data.Keywords: energy data, energy policy, energy efficiency, buildings
Procedia PDF Downloads 18424831 A Survey on Data-Centric and Data-Aware Techniques for Large Scale Infrastructures
Authors: Silvina Caíno-Lores, Jesús Carretero
Abstract:
Large scale computing infrastructures have been widely developed with the core objective of providing a suitable platform for high-performance and high-throughput computing. These systems are designed to support resource-intensive and complex applications, which can be found in many scientific and industrial areas. Currently, large scale data-intensive applications are hindered by the high latencies that result from the access to vastly distributed data. Recent works have suggested that improving data locality is key to move towards exascale infrastructures efficiently, as solutions to this problem aim to reduce the bandwidth consumed in data transfers, and the overheads that arise from them. There are several techniques that attempt to move computations closer to the data. In this survey we analyse the different mechanisms that have been proposed to provide data locality for large scale high-performance and high-throughput systems. This survey intends to assist scientific computing community in understanding the various technical aspects and strategies that have been reported in recent literature regarding data locality. As a result, we present an overview of locality-oriented techniques, which are grouped in four main categories: application development, task scheduling, in-memory computing and storage platforms. Finally, the authors include a discussion on future research lines and synergies among the former techniques.Keywords: data locality, data-centric computing, large scale infrastructures, cloud computing
Procedia PDF Downloads 25724830 Wind Speed Data Analysis in Colombia in 2013 and 2015
Authors: Harold P. Villota, Alejandro Osorio B.
Abstract:
The energy meteorology is an area for study energy complementarity and the use of renewable sources in interconnected systems. Due to diversify the energy matrix in Colombia with wind sources, is necessary to know the data bases about this one. However, the time series given by 260 automatic weather stations have empty, and no apply data, so the purpose is to fill the time series selecting two years to characterize, impute and use like base to complete the data between 2005 and 2020.Keywords: complementarity, wind speed, renewable, colombia, characteri, characterization, imputation
Procedia PDF Downloads 16224829 Industrial Process Mining Based on Data Pattern Modeling and Nonlinear Analysis
Authors: Hyun-Woo Cho
Abstract:
Unexpected events may occur with serious impacts on industrial process. This work utilizes a data representation technique to model and to analyze process data pattern for the purpose of diagnosis. In this work, the use of triangular representation of process data is evaluated using simulation process. Furthermore, the effect of using different pre-treatment techniques based on such as linear or nonlinear reduced spaces was compared. This work extracted the fault pattern in the reduced space, not in the original data space. The results have shown that the non-linear technique based diagnosis method produced more reliable results and outperforms linear method.Keywords: process monitoring, data analysis, pattern modeling, fault, nonlinear techniques
Procedia PDF Downloads 38624828 Recommender System Based on Mining Graph Databases for Data-Intensive Applications
Authors: Mostafa Gamal, Hoda K. Mohamed, Islam El-Maddah, Ali Hamdi
Abstract:
In recent years, many digital documents on the web have been created due to the rapid growth of ’social applications’ communities or ’Data-intensive applications’. The evolution of online-based multimedia data poses new challenges in storing and querying large amounts of data for online recommender systems. Graph data models have been shown to be more efficient than relational data models for processing complex data. This paper will explain the key differences between graph and relational databases, their strengths and weaknesses, and why using graph databases is the best technology for building a realtime recommendation system. Also, The paper will discuss several similarity metrics algorithms that can be used to compute a similarity score of pairs of nodes based on their neighbourhoods or their properties. Finally, the paper will discover how NLP strategies offer the premise to improve the accuracy and coverage of realtime recommendations by extracting the information from the stored unstructured knowledge, which makes up the bulk of the world’s data to enrich the graph database with this information. As the size and number of data items are increasing rapidly, the proposed system should meet current and future needs.Keywords: graph databases, NLP, recommendation systems, similarity metrics
Procedia PDF Downloads 10324827 Digital Revolution a Veritable Infrastructure for Technological Development
Authors: Osakwe Jude Odiakaosa
Abstract:
Today’s digital society is characterized by e-education or e-learning, e-commerce, and so on. All these have been propelled by digital revolution. Digital technology such as computer technology, Global Positioning System (GPS) and Geographic Information System (GIS) has been having a tremendous impact on the field of technology. This development has positively affected the scope, methods, speed of data acquisition, data management and the rate of delivery of the results (map and other map products) of data processing. This paper tries to address the impact of revolution brought by digital technology.Keywords: digital revolution, internet, technology, data management
Procedia PDF Downloads 44724826 Status of Production, Distribution and Determinants of Biomass Briquette Acceptability in Kampala, Uganda
Authors: David B. Kisakye, Paul Mugabi
Abstract:
Biomass briquettes have been identified as a plausible and close alternative to commonly used energy fuels such as charcoal and firewood, whose prices are escalating due to the dwindling natural resource base. However, briquettes do not seem to be as popular as would be expected. This study assessed the production, distribution, and acceptability of the briquettes in the Kampala district. A total of 60 respondents, 50 of whom were briquette users and 10 briquette producers, were sampled from five divisions of Kampala district to evaluate consumer acceptability, preference for briquette type and shape. Households and institutions were identified to be the major consumers of briquettes, while community-based organizations were the major distributors of briquettes. The Chi-square test of independence showed a significant association between briquette acceptability and briquette attributes of substitutability and low cost (p < 0,05). The Kruskal Wallis test showed that low-income class people preferred non-carbonized briquettes. Gender, marital status, and income level also cause variation in preference for spherical, stick, and honeycomb briquettes (p < 0,05). The major challenges faced by briquette users in Kampala were; production of a lot of ash, frequent crushing, and limited access to briquettes. The producers of briquettes were mainly challenged by regular machine breakdown, raw material scarcity, and poor carbonizing units. It was concluded that briquettes have a market and are generally accepted in Kampala. However, user preferences need to be taken into account by briquette produces, suitable cookstoves should be availed to users, and there is a need for standards to ensure the quality of briquettes.Keywords: consumer acceptability, biomass residues, briquettes, briquette producers, distribution, fuel, marketability, wood fuel
Procedia PDF Downloads 14224825 BigCrypt: A Probable Approach of Big Data Encryption to Protect Personal and Business Privacy
Authors: Abdullah Al Mamun, Talal Alkharobi
Abstract:
As data size is growing up, people are became more familiar to store big amount of secret information into cloud storage. Companies are always required to need transfer massive business files from one end to another. We are going to lose privacy if we transmit it as it is and continuing same scenario repeatedly without securing the communication mechanism means proper encryption. Although asymmetric key encryption solves the main problem of symmetric key encryption but it can only encrypt limited size of data which is inapplicable for large data encryption. In this paper we propose a probable approach of pretty good privacy for encrypt big data using both symmetric and asymmetric keys. Our goal is to achieve encrypt huge collection information and transmit it through a secure communication channel for committing the business and personal privacy. To justify our method an experimental dataset from three different platform is provided. We would like to show that our approach is working for massive size of various data efficiently and reliably.Keywords: big data, cloud computing, cryptography, hadoop, public key
Procedia PDF Downloads 32024824 Implementation of Big Data Concepts Led by the Business Pressures
Authors: Snezana Savoska, Blagoj Ristevski, Violeta Manevska, Zlatko Savoski, Ilija Jolevski
Abstract:
Big data is widely accepted by the pharmaceutical companies as a result of business demands create through legal pressure. Pharmaceutical companies have many legal demands as well as standards’ demands and have to adapt their procedures to the legislation. To manage with these demands, they have to standardize the usage of the current information technology and use the latest software tools. This paper highlights some important aspects of experience with big data projects implementation in a pharmaceutical Macedonian company. These projects made improvements of their business processes by the help of new software tools selected to comply with legal and business demands. They use IT as a strategic tool to obtain competitive advantage on the market and to reengineer the processes towards new Internet economy and quality demands. The company is required to manage vast amounts of structured as well as unstructured data. For these reasons, they implement projects for emerging and appropriate software tools which have to deal with big data concepts accepted in the company.Keywords: big data, unstructured data, SAP ERP, documentum
Procedia PDF Downloads 26924823 Saving Energy at a Wastewater Treatment Plant through Electrical and Production Data Analysis
Authors: Adriano Araujo Carvalho, Arturo Alatrista Corrales
Abstract:
This paper intends to show how electrical energy consumption and production data analysis were used to find opportunities to save energy at Taboada wastewater treatment plant in Callao, Peru. In order to access the data, it was used independent data networks for both electrical and process instruments, which were taken to analyze under an ISO 50001 energy audit, which considered, thus, Energy Performance Indexes for each process and a step-by-step guide presented in this text. Due to the use of aforementioned methodology and data mining techniques applied on information gathered through electronic multimeters (conveniently placed on substation switchboards connected to a cloud network), it was possible to identify thoroughly the performance of each process and thus, evidence saving opportunities which were previously hidden before. The data analysis brought both costs and energy reduction, allowing the plant to save significant resources and to be certified under ISO 50001.Keywords: energy and production data analysis, energy management, ISO 50001, wastewater treatment plant energy analysis
Procedia PDF Downloads 19224822 Life Cycle Assessment-Based Environmental Assessment of the Production and Maintenance of Wooden Windows
Authors: Pamela Del Rosario, Elisabetta Palumbo, Marzia Traverso
Abstract:
The building sector plays an important role in addressing pressing environmental issues such as climate change and resource scarcity. The energy performance of buildings is considerably affected by the external envelope. In fact, a considerable proportion of the building energy demand is due to energy losses through the windows. Nevertheless, according to literature, to pay attention only to the contribution of windows to the building energy performance, i.e., their influence on energy use during building operation, could result in a partial evaluation. Hence, it is important to consider not only the building energy performance but also the environmental performance of windows, and this not only during the operational stage but along its complete life cycle. Life Cycle Assessment (LCA) according to ISO 14040:2006 and ISO 14044:2006+A1:2018 is one of the most adopted and robust methods to evaluate the environmental performance of products throughout their complete life cycle. This life-cycle based approach avoids the shift of environmental impacts of a life cycle stage to another, allowing to allocate them to the stage in which they originated and to adopt measures that optimize the environmental performance of the product. Moreover, the LCA method is widely implemented in the construction sector to assess whole buildings as well as construction products and materials. LCA is regulated by the European Standards EN 15978:2011, at the building level, and EN 15804:2012+A2:2019, at the level of construction products and materials. In this work, the environmental performance of wooden windows was assessed by implementing the LCA method and adopting primary data. More specifically, the emphasis is given to embedded and operational impacts. Furthermore, correlations are made between these environmental impacts and aspects such as type of wood and window transmittance. In the particular case of the operational impacts, special attention is set on the definition of suitable maintenance scenarios that consider the potential climate influence on the environmental impacts. For this purpose, a literature review was conducted, and expert consultation was carried out. The study underlined the variability of the embedded environmental impacts of wooden windows by considering different wood types and transmittance values. The results also highlighted the need to define appropriate maintenance scenarios for precise assessment results. It was found that both the service life and the window maintenance requirements in terms of treatment and its frequency are highly dependent not only on the wood type and its treatment during the manufacturing process but also on the weather conditions of the place where the window is installed. In particular, it became evident that maintenance-related environmental impacts were the highest for climate regions with the lowest temperatures and the greatest amount of precipitation.Keywords: embedded impacts, environmental performance, life cycle assessment, LCA, maintenance stage, operational impacts, wooden windows
Procedia PDF Downloads 23124821 Data Clustering in Wireless Sensor Network Implemented on Self-Organization Feature Map (SOFM) Neural Network
Authors: Krishan Kumar, Mohit Mittal, Pramod Kumar
Abstract:
Wireless sensor network is one of the most promising communication networks for monitoring remote environmental areas. In this network, all the sensor nodes are communicated with each other via radio signals. The sensor nodes have capability of sensing, data storage and processing. The sensor nodes collect the information through neighboring nodes to particular node. The data collection and processing is done by data aggregation techniques. For the data aggregation in sensor network, clustering technique is implemented in the sensor network by implementing self-organizing feature map (SOFM) neural network. Some of the sensor nodes are selected as cluster head nodes. The information aggregated to cluster head nodes from non-cluster head nodes and then this information is transferred to base station (or sink nodes). The aim of this paper is to manage the huge amount of data with the help of SOM neural network. Clustered data is selected to transfer to base station instead of whole information aggregated at cluster head nodes. This reduces the battery consumption over the huge data management. The network lifetime is enhanced at a greater extent.Keywords: artificial neural network, data clustering, self organization feature map, wireless sensor network
Procedia PDF Downloads 51624820 Review and Comparison of Associative Classification Data Mining Approaches
Authors: Suzan Wedyan
Abstract:
Data mining is one of the main phases in the Knowledge Discovery Database (KDD) which is responsible of finding hidden and useful knowledge from databases. There are many different tasks for data mining including regression, pattern recognition, clustering, classification, and association rule. In recent years a promising data mining approach called associative classification (AC) has been proposed, AC integrates classification and association rule discovery to build classification models (classifiers). This paper surveys and critically compares several AC algorithms with reference of the different procedures are used in each algorithm, such as rule learning, rule sorting, rule pruning, classifier building, and class allocation for test cases.Keywords: associative classification, classification, data mining, learning, rule ranking, rule pruning, prediction
Procedia PDF Downloads 534