Search results for: design exploration and data mining
34611 Predicting Medical Check-Up Patient Re-Coming Using Sequential Pattern Mining and Association Rules
Authors: Rizka Aisha Rahmi Hariadi, Chao Ou-Yang, Han-Cheng Wang, Rajesri Govindaraju
Abstract:
As medical check-ups grow in popularity, a huge amount of medical check-up data is stored in databases without being put to use. These data can be very useful for future strategic planning if mined correctly. On the other hand, unpredictable patient arrivals combined with limited facilities keep hospitals from delivering medical check-up services at full capacity. To address this problem, this study used medical check-up data to predict patient re-coming. Sequential pattern mining (SPM) and association rules were chosen because these methods are suitable for predicting patient re-coming from sequential data. First, based on patient personal information, the data were grouped into … groups, then discriminant analysis was performed to check the significance of the grouping. Second, for each group, frequent patterns were generated using the SPM method. Third, based on the frequent patterns of each group, pairs of variables were extracted using association rules to obtain a general pattern of patient re-coming. Last, the discussion and conclusion give some implications of the results.
Keywords: patient re-coming, medical check-up, health examination, data mining, sequential pattern mining, association rules, discriminant analysis
Procedia PDF Downloads 640
34610 Advanced Data Visualization Techniques for Effective Decision-making in Oil and Gas Exploration and Production
Authors: Deepak Singh, Rail Kuliev
Abstract:
This research article explores the significance of advanced data visualization techniques in enhancing decision-making processes within the oil and gas exploration and production domain. With the oil and gas industry facing numerous challenges, effective interpretation and analysis of vast and diverse datasets are crucial for optimizing exploration strategies, production operations, and risk assessment. The article highlights the importance of data visualization in managing big data, aiding the decision-making process, and facilitating communication with stakeholders. Various advanced data visualization techniques, including 3D visualization, augmented reality (AR), virtual reality (VR), interactive dashboards, and geospatial visualization, are discussed in detail, showcasing their applications and benefits in the oil and gas sector. The article presents case studies demonstrating the successful use of these techniques in optimizing well placement, real-time operations monitoring, and virtual reality training. Additionally, the article addresses the challenges of data integration and scalability, emphasizing the need for future developments in AI-driven visualization. In conclusion, this research emphasizes the immense potential of advanced data visualization in revolutionizing decision-making processes, fostering data-driven strategies, and promoting sustainable growth and improved operational efficiency within the oil and gas exploration and production industry.
Keywords: augmented reality (AR), virtual reality (VR), interactive dashboards, real-time operations monitoring
Procedia PDF Downloads 86
34609 The Impact of Gold Mining on Disability: Experiences from the Obuasi Municipal Area
Authors: Mavis Yaa Konadu Agyemang
Abstract:
Despite provisions to uphold and safeguard the rights of persons with disability in Ghana, there is evidence that they still encounter several challenges which limit their full and effective involvement in mainstream society, including the gold mining sector. The study sought to explore how persons with physical disability (PWPDs) experience gold mining in the Obuasi Municipal Area. A qualitative research design was used to discover and understand the experiences of PWPDs regarding mining. The purposive sampling technique was used to select five key informants for the study with the age range of (24-52 years) while snowball sampling aided the selection of 16 persons with various forms of physical disability with the age range of (24-60 years). In-depth interviews were used to gather data. The interviews lasted from forty-five minutes to an hour. In relation to the setting, the interviews of thirteen (13) of the participants with disability were done in their houses, two (2) were done on the phone, and one (1) was done in the office. Whereas the interviews of the five (5) key informants were all done in their offices. Data were analyzed using Creswell’s (2009) concept of thematic analysis. The findings suggest that even though land degradation affected everyone in the area, persons with mobility and visual impairment experienced many difficulties trekking the undulating land for long distances in search of arable land. Also, although mining activities are mostly labour-intensive, PWPDs were not employed even in areas where they could work. Further, the cost of items, in general, was high, affecting PWPDs more due to their economic immobility and paying for other sources of water due to land degradation and water pollution. The study also discovered that the peculiar conditions of PWPDs were not factored into compensation payments, and neither were females with physical disability engaged in compensation negotiations. 
Also, although some of the infrastructure provided by the gold mining companies in the area was physically accessible to some extent, it was not accessible in terms of information delivery. There is a need to educate the public on the effects of mining on PWPDs, their needs as well as disability issues in general. The Minerals and Mining Act (703) should be amended to include provisions that would consider the peculiar needs of PWPDs in compensation payment.
Keywords: mining, resettlement, compensation, environmental, social, disability
Procedia PDF Downloads 55
34608 Modelling of Powered Roof Supports Work
Authors: Marcin Michalak
Abstract:
Due to increasing efforts to protect the natural environment, a change in the structure of energy resources can be observed: a growing share of renewable energy sources. In many countries traditional underground coal mining is losing its significance, but there are still countries, like Poland or Germany, in which coal-based technologies have the largest share of total energy production. This makes it necessary to limit the costs and negative effects of underground coal mining. The longwall complex is an essential part of underground coal mining. The safety and effectiveness of the work depend strongly on the diagnostic state of the powered roof supports. Building a useful and reliable diagnostic system requires a lot of data. Because data cannot be acquired for every possible operating condition, it is important to be able to generate the required artificial working characteristics. This paper presents a new approach to modelling leg pressure in a single unit of a powered roof support. The model is the result of an analysis of typical working cycles.
Keywords: machine modelling, underground mining, coal mining, structure
Procedia PDF Downloads 368
34607 Neural Networks Models for Measuring Hotel Users Satisfaction
Authors: Asma Ameur, Dhafer Malouche
Abstract:
Nowadays, user comments on the Internet have an important impact on hotel bookings. This confirms that e-reputation can influence the likelihood of customer loyalty to a hotel. In this way, e-reputation has become a real differentiator between hotels. For this reason, the opinion mining field offers a unique opportunity to analyze the comments, since it makes it possible to extract information related to the polarity of user reviews. This sentiment study (opinion mining) represents a new line of research for analyzing unstructured textual data. Knowing the e-reputation score helps hoteliers better manage their marketing strategy. The score obtained is translated into the image of hotels to differentiate between them. Therefore, the present research highlights the importance of hotel satisfaction scoring. To calculate the satisfaction score, sentiment analysis can be carried out with several machine learning techniques. This study treats the extracted textual data using the Artificial Neural Networks (ANNs) approach. In this context, we adopt this technique to extract information from the comments available on the ‘Trip Advisor’ website. The paper details the description and modeling of the ANNs approach for scoring online hotel reviews. In summary, the validation of this method provides a significant model for hotel sentiment analysis, making it possible to determine precisely the polarity of hotel user reviews. The empirical results show that ANNs are an accurate approach for sentiment analysis. The results also show that the proposed approach supports dimensionality reduction for textual data clustering. Thus, this study provides researchers with a useful exploration of this technique. Finally, we outline guidelines for future research in the hotel e-reputation field, such as comparing ANNs with other techniques.
Keywords: clustering, consumer behavior, data mining, e-reputation, machine learning, neural network, online hotel reviews, opinion mining, scoring
Procedia PDF Downloads 136
34606 Distributed Perceptually Important Point Identification for Time Series Data Mining
Authors: Tak-Chung Fu, Ying-Kit Hung, Fu-Lai Chung
Abstract:
In the field of time series data mining, the Perceptually Important Point (PIP) identification process was first introduced in 2001. The process was originally designed for financial time series pattern matching and was later found suitable for time series dimensionality reduction and representation. Its strength lies in preserving the overall shape of the time series by identifying its salient points. With the rise of Big Data, time series account for a major proportion of the data, especially data generated by sensors in the Internet of Things (IoT) environment. Given the nature of PIP identification and its successful applications, it is worth exploring further the opportunity to apply PIP to time series ‘Big Data’. However, the performance of PIP identification is commonly regarded as its limitation when dealing with ‘Big’ time series data. In this paper, two distributed versions of PIP identification based on the Specialized Binary (SB) Tree are proposed. The proposed approaches remove the bottleneck of running the PIP identification process on a standalone computer. The distributed versions achieve an improvement in speed.
Keywords: distributed computing, performance analysis, Perceptually Important Point identification, time series data mining
Procedia PDF Downloads 433
34605 Incorporation of Safety into Design by Safety Cube
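The PIP idea summarized in this abstract can be illustrated with a minimal, single-machine sketch. It is not the distributed SB-Tree method the paper proposes. It assumes the commonly described formulation: the first and last points are the initial PIPs, and each further PIP is the point with the largest vertical distance to the line joining its two neighbouring PIPs. The function names `vertical_distance` and `find_pips` are hypothetical.

```python
import bisect


def vertical_distance(series, left, right, i):
    """Vertical gap between point i and the line through points left and right."""
    y_left, y_right = series[left], series[right]
    interpolated = y_left + (y_right - y_left) * (i - left) / (right - left)
    return abs(series[i] - interpolated)


def find_pips(series, n_pips):
    """Return the sorted indices of up to n_pips perceptually important points.

    Naive sequential version: every round rescans all non-PIP points, which is
    the kind of cost that motivates faster or distributed variants.
    """
    if len(series) < 2 or n_pips < 2:
        raise ValueError("need at least two points and two PIPs")
    pips = [0, len(series) - 1]  # the end points are always PIPs
    while len(pips) < min(n_pips, len(series)):
        best_index, best_distance = None, -1.0
        # Scan each gap between adjacent PIPs for the most distant point.
        for left, right in zip(pips, pips[1:]):
            for i in range(left + 1, right):
                d = vertical_distance(series, left, right, i)
                if d > best_distance:
                    best_index, best_distance = i, d
        if best_index is None:  # no candidate points left
            break
        bisect.insort(pips, best_index)
    return pips


if __name__ == "__main__":
    prices = [0, 1, 5, 1, 0, -4, 0]
    print(find_pips(prices, 4))  # [0, 2, 5, 6]
```

Keeping only the values at the returned indices gives a reduced representation that preserves the main peak and trough of the series.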
Authors: Mohammad Rajabalinejad
Abstract:
Safety is often seen as a requirement or a performance indicator through the design process, and this does not always result in optimally safe products or systems. This paper suggests integrating the best safety practices with the design process to enrich the exploration experience for designers and add extra value for customers. For this purpose, the commonly practiced safety standards and design methods have been reviewed and their common blocks have been merged to form the Safety Cube. The Safety Cube combines common blocks for design, hazard identification, risk assessment and risk reduction through an integral approach. An example application presents the use of the Safety Cube for the design of machinery.
Keywords: safety, safety cube, product, system, machinery, design
Procedia PDF Downloads 246
34604 A Theoretical Model for Pattern Extraction in Large Datasets
Authors: Muhammad Usman
Abstract:
Pattern extraction has been used in the past to extract hidden and interesting patterns from large datasets. Recently, these techniques have been advanced with support for multi-level mining, effective dimension reduction, advanced evaluation, and visualization. This paper reviews the current techniques in the literature on the basis of these parameters. The literature review suggests that most of the techniques which provide multi-level mining and dimension reduction do not handle mixed-type data during the process. Patterns are not extracted using advanced algorithms for large datasets. Moreover, the evaluation of patterns is not done using advanced measures suited to high-dimensional data. Techniques which provide visualization support are unable to handle a large number of rules in a small space. We present a theoretical model to handle these issues. The implementation of the model is beyond the scope of this paper.
Keywords: association rule mining, data mining, data warehouses, visualization of association rules
Procedia PDF Downloads 223
34603 “Octopub”: Geographical Sentiment Analysis Using Named Entity Recognition from Social Networks for Geo-Targeted Billboard Advertising
Authors: Oussama Hafferssas, Hiba Benyahia, Amina Madani, Nassima Zeriri
Abstract:
Although data nowadays takes multiple forms, from text to images and from audio to video, text is still the most used form at a public level. At an academic and research level, and unlike other forms, text can be considered the easiest form to process. Therefore, a branch of data mining research called "Text Mining" has long been devoted to it. Its concept is just like data mining’s: finding valuable patterns in large collections and tremendous volumes of data, in this case text. Named entity recognition (NER) is one of Text Mining’s disciplines; it aims to extract and classify references such as proper names, locations, expressions of time and dates, organizations and more in a given text. Our approach, "Octopub", does not aim to find new ways to improve the named entity recognition process. Rather, it is about finding a new and smart way to use NER so that we can extract the sentiments of millions of people, using social networks as a limitless information source and marketing for product promotion as the main domain of application.
Keywords: text mining, named entity recognition (NER), sentiment analysis, social media networks (SN, SMN), business intelligence (BI), marketing
Procedia PDF Downloads 589
34602 Emotion Mining and Attribute Selection for Actionable Recommendations to Improve Customer Satisfaction
Authors: Jaishree Ranganathan, Poonam Rajurkar, Angelina A. Tzacheva, Zbigniew W. Ras
Abstract:
In today’s world, business often depends on customer feedback and reviews. Sentiment analysis helps identify and extract information about the sentiment or emotion of a topic or document. Attribute selection is a challenging problem, especially with large datasets in actionable pattern mining algorithms. Action Rule Mining is one of the methods to discover actionable patterns from data. Action Rules are rules that describe specific actions to be taken, in the form of conditions that help achieve the desired outcome. The rules help to change from an undesirable or negative state to a more desirable or positive state. In this paper, we present a lexicon-based weighted scheme approach to identify emotions from customer feedback data in the area of manufacturing business. We also use Rough sets and explore the attribute selection method for large-scale datasets. Then we apply actionable pattern mining to extract possible emotion change recommendations. Such recommendations help business analysts improve their customer service, which leads to customer satisfaction and increased sales revenue.
Keywords: actionable pattern discovery, attribute selection, business data, data mining, emotion
Procedia PDF Downloads 199
34601 Knowledge-Driven Decision Support System Based on Knowledge Warehouse and Data Mining by Improving Apriori Algorithm with Fuzzy Logic
Authors: Pejman Hosseinioun, Hasan Shakeri, Ghasem Ghorbanirostam
Abstract:
In recent years, research on knowledge sources, decision support systems, data mining, and the process of knowledge discovery in databases has grown in importance, and each of these aspects is considered to affect the others. In this article, we merge information sources and knowledge sources to propose a knowledge-based system, within the limits of management, based on the storage and retrieval of knowledge to manage information and improve decision making and resources. We use data mining and the Apriori algorithm in the knowledge discovery process. One of the problems of the Apriori algorithm is that the user must specify the minimum support threshold. Imagine that a user wants to apply the Apriori algorithm to a database with millions of transactions. The user certainly does not have the necessary knowledge of all transactions in that database and therefore cannot specify a suitable threshold. Our purpose in this article is to improve the Apriori algorithm. To achieve this goal, we use fuzzy logic to put the data into different clusters before applying the Apriori algorithm to the data in the database, and we also try to suggest the most suitable threshold to the user automatically.
Keywords: decision support system, data mining, knowledge discovery, data discovery, fuzzy logic
Procedia PDF Downloads 335
34600 Aqua Logo Design 2013 Decomposition and Meanings
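The threshold problem described in this abstract is easier to see with a small sketch of the classic Apriori procedure. It is not the authors' fuzzy-logic extension. The `apriori` function and the toy basket data are hypothetical, and the minimum support is passed as a fraction of transactions, which is an assumption about how the threshold is expressed.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support.

    Classic level-wise search: frequent k-itemsets are joined into
    (k+1)-candidates, and a candidate is kept only if all of its
    k-subsets were frequent (the Apriori pruning property).
    """
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        return sum(1 for b in baskets if itemset <= b) / n

    candidates = {frozenset([item]) for b in baskets for item in b}
    frequent = {}
    k = 1
    while candidates:
        level = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                level[c] = s
        frequent.update(level)
        k += 1
        # Join step: unions of frequent itemsets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


if __name__ == "__main__":
    toy_baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    # The result depends heavily on the user-chosen threshold.
    print(len(apriori(toy_baskets, 0.5)))   # 5 frequent itemsets
    print(len(apriori(toy_baskets, 0.75)))  # 2 frequent itemsets
```

On this toy data, raising the threshold from 0.5 to 0.75 removes every itemset except the two most common single items, which shows why a user who does not know the data can easily pick an unsuitable value.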
Authors: Peni Rizki
Abstract:
This article presents a decomposition of the Aqua logo design 2013 as well as an exploration of the meanings denoting its marketing resolution. The analysis describes the decomposition of the Aqua logo design 2013 in detail, as an implementation of semiotics in a marketing enterprise. The 2013 design differs in parts from the one first established in 1973. Accordingly, design elements such as pictures and colors are examined through semiotic theories of the sign, used as directives to the meaning constructed. Each part of the design is analyzed based on its significations, which generate denotation and connotation as well as myth. The article concludes with what the Aqua logo design 2013 conveys in reflection of its initial marketing creativity: what pictures and colors do in it.
Keywords: design, aqua, semiotics, signification
Procedia PDF Downloads 377
34599 Personalize E-Learning System Based on Clustering and Sequence Pattern Mining Approach
Authors: H. S. Saini, K. Vijayalakshmi, Rishi Sayal
Abstract:
Network-based education has been growing rapidly in size and quality. Knowledge clustering is becoming more important in personalized information retrieval for web learning. A personalized learning service is provided after the learners’ knowledge has been classified with clustering. Through automatic analysis of learners’ behaviors, groups of learners with a similar level and interests may be discovered, so as to provide learners with content that best matches their educational needs for collaborative learning. We present a specific mining tool and a recommender engine that we have integrated into the online learning system in order to help the teacher carry out the whole e-learning process. We propose to use sequential pattern mining algorithms to discover the paths most used by the students and, from this information, to recommend links to new students automatically while they browse the course. We have developed a specific authoring tool to help the teacher apply the whole data mining process. We report on several experiments with real data that indicate the quality of using clustering and sequential pattern mining algorithms together for personalized e-learning systems.
Keywords: e-learning, cluster, personalization, sequence, pattern
Procedia PDF Downloads 428
34598 Sequential Pattern Mining from Data of Medical Record with Sequential Pattern Discovery Using Equivalent Classes (SPADE) Algorithm (A Case Study : Bolo Primary Health Care, Bima)
Authors: Rezky Rifaini, Raden Bagus Fajriya Hakim
Abstract:
This research was conducted at the Bolo Primary Health Care (PHC) center in Bima Regency. The purpose of the research is to find the association patterns formed in the medical record database of Bolo PHC patients. The data used are secondary data from the PHC medical records database. Sequential pattern mining is the method used for the analysis. Transaction data were generated from Patient_ID, Check_Date, and diagnosis. Sequential Pattern Discovery Using Equivalent Classes (SPADE) is one of the algorithms in sequential pattern mining; it finds frequent sequences in transaction data using a vertical database and a sequence join process. The result of the SPADE algorithm is a set of frequent sequences, which are then used to form rules. This technique is used to find association patterns between item combinations. Association rule sequential analysis with the SPADE algorithm, with a minimum support of 0.03 and a minimum confidence of 0.75, yielded 3 sequential association patterns based on the sequence of Patient_ID, Check_Date, and diagnosis data in the Bolo PHC.
Keywords: diagnosis, primary health care, medical record, data mining, sequential pattern mining, SPADE algorithm
Procedia PDF Downloads 401
34597 Comparative Analysis of the Computer Methods' Usage for Calculation of Hydrocarbon Reserves in the Baltic Sea
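The support and confidence thresholds quoted in this abstract can be made concrete with a short sketch. It does not implement SPADE itself (no vertical database or sequence joins). It only shows how records of the form (Patient_ID, Check_Date, diagnosis) become per-patient sequences and how support and confidence of a sequential rule are computed. The sample rows, diagnosis labels, and function names are all hypothetical.

```python
from collections import defaultdict


def build_sequences(rows):
    """Group (patient_id, check_date, diagnosis) rows into date-ordered sequences."""
    sequences = defaultdict(list)
    # ISO-formatted dates sort correctly as plain strings.
    for patient_id, _check_date, diagnosis in sorted(rows, key=lambda r: (r[0], r[1])):
        sequences[patient_id].append(diagnosis)
    return dict(sequences)


def contains(sequence, pattern):
    """True if pattern occurs in sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def support(sequences, pattern):
    """Fraction of patients whose sequence contains the pattern."""
    hits = sum(1 for seq in sequences.values() if contains(seq, pattern))
    return hits / len(sequences)


def confidence(sequences, antecedent, consequent):
    """Confidence of the sequential rule antecedent -> consequent."""
    return support(sequences, antecedent + consequent) / support(sequences, antecedent)


if __name__ == "__main__":
    # Hypothetical records, not data from the study.
    rows = [
        ("p1", "2017-01-05", "flu"), ("p1", "2017-02-01", "cough"),
        ("p1", "2017-03-01", "fever"), ("p2", "2017-01-10", "flu"),
        ("p2", "2017-01-20", "fever"), ("p3", "2017-02-02", "cough"),
        ("p3", "2017-02-09", "flu"), ("p4", "2017-03-03", "fever"),
    ]
    seqs = build_sequences(rows)
    print(support(seqs, ["flu", "fever"]))       # 0.5
    print(confidence(seqs, ["flu"], ["fever"]))  # about 0.667
```

A rule would be reported only if both values clear the chosen minimums, so with a minimum confidence of 0.75 the hypothetical rule flu -> fever above would be rejected.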
Authors: Pavel Shcherban, Vlad Golovanov
Abstract:
Nowadays, the depletion of hydrocarbon deposits on the land of the Kaliningrad region leads to active geological exploration and development of oil and natural gas reserves in the southeastern part of the Baltic Sea. LLC 'Lukoil-Kaliningradmorneft' implements a comprehensive program for the development of the region's shelf in 2014-2023. Due to the heterogeneity of reservoir rocks in various open fields, as well as ambiguous conclusions on the contours of deposits, additional geological prospecting and refinement of the recoverable oil reserves are carried out. The key element is the use of an effective technique of computer stock modeling at the first stage of processing the received data. The following step uses the information for cluster analysis, which makes it possible to optimize the field development approaches. The article analyzes the effectiveness of various methods for calculating reserves and of computer modelling methods for offshore hydrocarbon fields. Cluster analysis makes it possible to measure the influence of the obtained data on the development of a technical and economic model for mining deposits. The relationship between the accuracy of the calculation of recoverable reserves and the need for modernization of existing mining infrastructure, as well as the optimization of the scheme of opening and development of oil deposits, is observed.
Keywords: cluster analysis, computer modelling of deposits, correction of the feasibility study, offshore hydrocarbon fields
Procedia PDF Downloads 166
34596 Configuration Design and Optimization of the Movable Leg-Foot Lunar Soft-Landing Device
Authors: Shan Jia, Jinbao Chen, Jinhua Zhou, Jiacheng Qian
Abstract:
Lunar exploration is a necessary foundation for deep-space exploration. To address the functional limitations of the fixed landers that are widely used today, which must rely on wheeled rovers with unavoidable path repeatability to expand the detection range, a movable lunar soft-landing device based on a cantilever-type buffer mechanism and a leg-foot-type walking mechanism is presented. Firstly, a 20-DoF quadruped configuration based on pushrods is proposed. The configuration has bionic characteristics such as hip, knee, and ankle joints, and keeps the kinematics of the whole mechanism unchanged before and after buffering. Secondly, the multi-function main/auxiliary buffers based on crumple-energy absorption and a screw-nut mechanism are designed, as well as the telescopic device which can be used to protect the plantar force sensors during the buffer process. Finally, the kinematic model of the whole mechanism is established, and the configuration optimization of the whole mechanism is completed based on the performance requirements of slope adaptation and obstacle crossing. This research can provide a technical solution integrating soft-landing, large-scale inspection and material-transfer for future lunar exploration and even Mars exploration, and can also serve as the technical basis for developing reusable landers.
Keywords: configuration design, lunar soft-landing device, movable, optimization
Procedia PDF Downloads 158
34595 Secure Multiparty Computations for Privacy Preserving Classifiers
Authors: M. Sumana, K. S. Hareesha
Abstract:
Secure computations are essential when performing privacy-preserving data mining. Distributed privacy-preserving data mining involves two or more sites that cannot pool their data with a third party because doing so would violate laws protecting the individual. Hence, in order to model the private data without compromising privacy or losing information, secure multiparty computations are used. Secure computations of the product, mean, variance, dot product, and sigmoid function using the additive and multiplicative homomorphic properties are discussed. The computations are performed on vertically partitioned data with a single site holding the class value.
Keywords: homomorphic property, secure product, secure mean and variance, secure dot product, vertically partitioned data
Procedia PDF Downloads 412
34594 Establishment of a Test Bed for Integrated Map of Underground Space and Verification of GPR Exploration Equipment
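As an illustration of how an additive homomorphic property can yield a secure dot product between two sites, here is a toy sketch using the Paillier cryptosystem. The abstract does not name the scheme the authors use, so Paillier is an assumption, and the tiny primes, function names, and vectors are hypothetical and give no real security.

```python
import math
import random


def keygen(p, q):
    """Toy Paillier key pair from two small primes (insecure, for illustration)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public key n."""
    n_sq = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(n + 1, m, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(n, private_key, c):
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def encrypted_dot(n, encrypted_a, b):
    """Site B combines Site A's ciphertexts with its own plaintext vector b.

    Multiplying ciphertexts adds the plaintexts, and raising a ciphertext to
    b_i multiplies its plaintext by b_i, so the result encrypts sum(a_i * b_i).
    """
    n_sq = n * n
    result = encrypt(n, 0)
    for c, b_i in zip(encrypted_a, b):
        result = (result * pow(c, b_i, n_sq)) % n_sq
    return result


if __name__ == "__main__":
    n, private_key = keygen(61, 53)           # Site A owns the key pair
    a = [3, 1, 4]                             # Site A's private attribute values
    b = [2, 7, 1]                             # Site B's private attribute values
    encrypted_a = [encrypt(n, x) for x in a]  # sent to Site B
    c = encrypted_dot(n, encrypted_a, b)      # computed by B, returned to A
    print(decrypt(n, private_key, c))         # 17
```

Site B never sees the values in `a`, and Site A only learns the final dot product, provided the true result is smaller than `n`.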
Authors: Jisong Ryu, Woosik Lee, Yonggu Jang
Abstract:
The paper discusses the process of establishing a reliable test bed for verifying the usability of Ground Penetrating Radar (GPR) exploration equipment based on an integrated underground spatial map in Korea. The aim of this study is to construct a test bed consisting of metal and non-metal pipelines to verify the performance of GPR equipment and improve the accuracy of the underground spatial integrated map. The study involved the design and construction of a test bed for metal and non-metal pipe detecting tests. The test bed was built in the SOC Demonstration Research Center (Yeoncheon) of the Korea Institute of Civil Engineering and Building Technology, burying metal and non-metal pipelines up to a depth of 5m. The test bed was designed in both vehicle-type and cart-type GPR-mounted equipment. The study collected data through the construction of the test bed and conducting metal and non-metal pipe detecting tests. The study analyzed the reliability of GPR detecting results by comparing them with the basic drawings, such as the underground space integrated map. The study contributes to the improvement of GPR equipment performance evaluation and the accuracy of the underground spatial integrated map, which is essential for urban planning and construction. The study addressed the question of how to verify the usability of GPR exploration equipment based on an integrated underground spatial map and improve its performance. The study found that the test bed is reliable for verifying the performance of GPR exploration equipment and accurately detecting metal and non-metal pipelines using an integrated underground spatial map. The study concludes that the establishment of a test bed for verifying the usability of GPR exploration equipment based on an integrated underground spatial map is essential. 
The proposed Korean-style test bed can be used for the evaluation of GPR equipment performance and support the construction of a national non-metal pipeline exploration equipment performance evaluation center in Korea.
Keywords: Korea-style GPR testbed, GPR, metal pipe detecting, non-metal pipe detecting
Procedia PDF Downloads 100
34593 Exploration of Two Selected Sculptural Forms in the Department of Fine and Applied Arts, Federal Capital Territory College of Education Zuba-Abuja, Nigeria as Motifs for Wax Print Pattern and Design
Authors: Adeoti Adebowale, Abduljaleel, Ejiogu Fidelis Onyekwo
Abstract:
Form and image development are fundamental to creative expression in visual arts. The form is an element that distinguishes the difference between two-dimension and three-dimension among the branches of visual arts. Particularly, the sculpture is a three-dimensional form, while the textile design is a two-dimensional form of its visual appearance. The visual expression of each of them is embedded in the creative practice of the artist, which is easily understood and interpreted by the viewer. In this research, an attempt is made to explore and analyse sculptural forms adopted as a motif for wax print in textile design, aiming at breeding yet another pattern and motif suitable for various design uses. For instance, the dynamics of sculptural form adaptation into other areas of creativity, such as architecture, pictorial arts and pottery, as well as automobile bodies, is a discernible image everywhere. The research is studio exploratory, while a camera and descriptive analysis were used to process the data. Two sculptural forms were adopted from the Department of Fine and Applied Arts, Federal Capital Territory College of Education Zuba-Abuja, in this study due to the uniqueness of their technique of execution. The findings resulted in ten (10) paper designs showing the dexterity of studio practice in the development of design for various fashion and textile uses. However, the paper concludes that sculptural form is a source of inspiration for generating design concepts for a textile designer.
Keywords: exploration, design, motifs, sculptural forms, wax print
Procedia PDF Downloads 70
34592 Performance Evaluation of Production Schedules Based on Process Mining
Authors: Kwan Hee Han
Abstract:
External environment of enterprise is rapidly changing majorly by global competition, cost reduction pressures, and new technology. In these situations, production scheduling function plays a critical role to meet customer requirements and to attain the goal of operational efficiency. It deals with short-term decision making in the production process of the whole supply chain. The major task of production scheduling is to seek a balance between customer orders and limited resources. In manufacturing companies, this task is so difficult because it should efficiently utilize resource capacity under the careful consideration of many interacting constraints. At present, many computerized software solutions have been utilized in many enterprises to generate a realistic production schedule to overcome the complexity of schedule generation. However, most production scheduling systems do not provide sufficient information about the validity of the generated schedule except limited statistics. Process mining only recently emerged as a sub-discipline of both data mining and business process management. Process mining techniques enable the useful analysis of a wide variety of processes such as process discovery, conformance checking, and bottleneck analysis. In this study, the performance of generated production schedule is evaluated by mining event log data of production scheduling software system by using the process mining techniques since every software system generates event logs for the further use such as security investigation, auditing and error bugging. An application of process mining approach is proposed for the validation of the goodness of production schedule generated by scheduling software systems in this study. 
By using process mining techniques, major evaluation criteria such as utilization of workstation, existence of bottleneck workstations, critical process route patterns, and work load balance of each machine over time are measured, and finally, the goodness of production schedule is evaluated. By using the proposed process mining approach for evaluating the performance of generated production schedule, the quality of production schedule of manufacturing enterprises can be improved.Keywords: data mining, event log, process mining, production scheduling
Procedia PDF Downloads 279
34591 Decision Support System in Air Pollution Using Data Mining
Authors: E. Fathallahi Aghdam, V. Hosseini
Abstract:
Environmental pollution is not limited to a specific region or country; that is why sustainable development, as a necessary process for improvement, pays attention to issues such as the destruction of natural resources, degradation of biological systems, global pollution, and climate change in the world, especially in developing countries. According to the World Health Organization, Tehran (the capital of Iran), a developing city, is one of the most polluted cities in the world in terms of air pollution. In this study, three pollutants, namely particulate matter smaller than 10 microns, nitrogen oxides, and sulfur dioxide, were evaluated in Tehran using data mining techniques and the Crisp approach. Data from 21 air pollution measuring stations in different areas of Tehran were collected from 1999 to 2013. The commercial software Clementine was selected for this study. Using the software, Tehran was divided into distinct clusters in terms of the mentioned pollutants. As a data mining technique, clustering is usually used as a prologue to other analyses; therefore, the similarity of clusters was evaluated in this study by analyzing local conditions, traffic behavior, and industrial activities. The results of this research can support decision-making systems, help managers improve performance and decision making, and assist in urban studies.
Keywords: data mining, clustering, air pollution, crisp approach
Procedia PDF Downloads 427
34590 Enhance the Power of Sentiment Analysis
Authors: Yu Zhang, Pedro Desouza
Abstract:
Since big data has become substantially more accessible and manageable due to the development of powerful tools for dealing with unstructured data, people are eager to mine information from social media resources that could not be handled in the past. Sentiment analysis, as a novel branch of text mining, has in the last decade become increasingly important in marketing analysis, customer risk prediction and other fields. Scientists and researchers have undertaken significant work in creating and improving their sentiment models. In this paper, we present a concept of selecting appropriate classifiers based on the features and qualities of data sources by comparing the performance of five classifiers on three popular social media data sources: Twitter, Amazon Customer Reviews, and Movie Reviews. We introduce a couple of innovative models that outperform traditional sentiment classifiers for these data sources and provide insights on how to further improve the predictive power of sentiment analysis. The modelling and testing work was done in R and Greenplum in-database analytic tools.
Keywords: sentiment analysis, social media, Twitter, Amazon, data mining, machine learning, text mining
Procedia PDF Downloads 353
34589 A Brief Exploration on the Green Urban Design for Carbon Neutrality
Authors: Gaoyuan Wang, Tian Chen
Abstract:
China’s emission peak and carbon neutrality strategies lead to the transformation of development patterns and call for new green urban design thinking. This paper begins by revealing the evolution of green urban design thinking during the periods of carbon enlightenment, carbon dependency, and carbon decoupling from the perspective of the energy transition. Combined with the current energy situation, national strengths, and technological trends, the emergence of green urban design towards carbon neutrality becomes inevitable. Based on the preliminary analysis of its connotation, the characteristics of the new type of green urban design are generalized as low-carbon orientation, carbon-related objects, carbon-reduction means, and carbon-control patterns. Its theory is briefly clarified in terms of the human-earth synergism, quality-energy interconnection, and form-flow interpromotion. Then, its mechanism is analyzed combined with the core tasks of carbon neutrality, and the scope of design issues is defined, including carbon flow mapping, carbon source regulation, carbon sink construction, and carbon emission management. Finally, a multi-scale spatial response system is proposed across the region, city, cluster, and neighborhood level. The discussion aims to provide support for the innovation of green urban design theories and methods in the context of peak neutrality.
Keywords: carbon neutrality, green urban design, energy transition, theoretical exploration
Procedia PDF Downloads 175
34588 Optimizing Communications Overhead in Heterogeneous Distributed Data Streams
Authors: Rashi Bhalla, Russel Pears, M. Asif Naeem
Abstract:
In this 'Information Explosion Era', analyzing data, 'a critical commodity', and mining knowledge from vertically distributed data streams incur a huge communication cost. However, efforts to decrease communication in the distributed environment adversely affect classification accuracy; therefore, a research challenge lies in maintaining a balance between transmission cost and accuracy. This paper proposes a method based on Bayesian inference to reduce the communication volume in a heterogeneous distributed environment while retaining prediction accuracy. Our experimental evaluation reveals that a significant reduction in communication can be achieved across a diverse range of dataset types.
Keywords: big data, Bayesian inference, distributed data stream mining, heterogeneous-distributed data
Procedia PDF Downloads 161
34587 Critical Review of Web Content Mining Extraction Mechanisms
Authors: Rabia Bashir, Sajjad Akbar
Abstract:
There is an inevitable demand for web mining due to the rapid increase in the amount of information on the Internet, but the striking variety of web structures has made retrieving the required content a difficult task. To counter this issue, Web Content Mining (WCM) has emerged as a potential candidate that extracts and integrates suitable data resources for users. In the past few years, research has been done on several extraction techniques for WCM, i.e., agent-based, template-based, assumption-based, statistic-based, wrapper-based, and machine learning. However, it is still unclear whether these approaches efficiently tackle the significant challenges of WCM. To answer this question, this paper identifies these challenges, such as language independence, structure flexibility, performance, automation, dynamicity, redundancy handling, intelligence, relevant content retrieval, and privacy. Further, these challenges are mapped to existing extraction mechanisms, which helps in adopting the most suitable WCM approach given the conditions and characteristics at hand.
Keywords: content mining challenges, web content mining, web content extraction approaches, web information retrieval
Procedia PDF Downloads 548
34586 Evaluation of Classification Algorithms for Diagnosis of Asthma in Iranian Patients
Authors: Taha SamadSoltani, Peyman Rezaei Hachesu, Marjan GhaziSaeedi, Maryam Zolnoori
Abstract:
Introduction: Data mining is defined as a process for finding patterns and relationships in the data in a database in order to build predictive models. The application of data mining has extended to many sectors, including healthcare services. Medical data mining aims to solve real-world problems in the diagnosis and treatment of diseases. It applies various techniques and algorithms that differ in accuracy and precision. The purpose of this study was to apply knowledge discovery and data mining techniques to the diagnosis of asthma based on patient symptoms and history. Method: Data mining includes several steps and decisions made by the user. It starts with developing an understanding of the scope and the application of previous knowledge in the area and identifying the KD process from the stakeholders' point of view, and it finishes with acting on the discovered knowledge by using it, integrating it with other systems, and documenting and reporting it. In this study, a stepwise methodology was followed to achieve a logical outcome. Results: The sensitivity, specificity, and accuracy of the KNN, SVM, Naïve Bayes, NN, classification tree, and CN2 algorithms and of related similar studies were evaluated, and ROC curves were plotted to show the performance of the system. Conclusion: The results show that asthma can be diagnosed with approximately ninety percent accuracy based on demographic and clinical data. The study also showed that methods based on pattern discovery and data mining have a higher sensitivity than expert and knowledge-based systems. On the other hand, medical guidelines and evidence-based medicine should be the basis of diagnostic methods; it is therefore recommended that machine learning algorithms be used in combination with knowledge-based algorithms.
Keywords: asthma, data mining, classification, machine learning
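The three measures reported in the Results of this abstract (sensitivity, specificity, and accuracy) follow directly from a binary confusion matrix. The sketch below shows the standard definitions; the labels are invented for illustration (1 = asthma, 0 = no asthma) and are not data from the study, and the function name is hypothetical.

```python
def diagnostic_metrics(y_true, y_pred):
    """Compute sensitivity, specificity, and accuracy for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "sensitivity": tp / (tp + fn),  # share of actual positives detected
        "specificity": tn / (tn + fp),  # share of actual negatives correctly cleared
        "accuracy": (tp + tn) / len(pairs),
    }


# Illustrative labels only, not results from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
print(diagnostic_metrics(y_true, y_pred))
```

The sketch assumes both classes occur in `y_true`; otherwise one of the denominators is zero.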
Procedia PDF Downloads 447
34585 Exploring Legal Liabilities of Mining Companies for Human Rights Abuses: Case Study of Mongolian Mine
Authors: Azzaya Enkhjargal
Abstract:
Context: The mining industry has a long history of human rights abuses, including forced labor, environmental pollution, and displacement of communities. In recent years, there has been growing international pressure to hold mining companies accountable for these abuses. Research Aim: This study explores the legal liabilities of mining companies for human rights abuses. The study specifically examines the case of Erdenet Mining Corporation (EMC), a large mining company in Mongolia that has been accused of human rights abuses. Methodology: The study used a mixed-methods approach, which included a review of legal literature, interviews with community members and NGOs, and a case study of EMC. Findings: The study found that mining companies can be held liable for human rights abuses under a variety of regulatory frameworks, including soft law and self-regulatory instruments in the mining industry, international law, national law, and corporate law. The study also found that there are a number of challenges to holding mining companies accountable for human rights abuses, including the lack of effective enforcement mechanisms and the difficulty of proving causation. Theoretical Importance: The study contributes to the growing body of literature on the legal liabilities of mining companies for human rights abuses. The study also provides insights into the challenges of holding mining companies accountable for human rights abuses. Data Collection: The data for the study was collected through a variety of methods, including a review of legal literature, interviews with community members and NGOs, and a case study of EMC. Analysis Procedures: The data was analyzed using a variety of methods, including content analysis, thematic analysis, and case study analysis. Conclusion: The study concludes that mining companies can be held liable for human rights abuses under a variety of legal and regulatory frameworks. 
There are positive developments in ensuring greater accountability and protection of affected communities and the environment in countries with a strong economy. Regrettably, access to avenues of redress is reasonably low in less developed countries, where the governments have not implemented a robust mechanism to enforce liability requirements in the mining industry. The study recommends that governments and mining companies take more ambitious steps to enhance corporate accountability.
Keywords: human rights, human rights abuses, ESG, litigation, Erdenet Mining Corporation, corporate social responsibility, soft law, self-regulation, mining industry, parent company liability, sustainability, environment, UN
Procedia PDF Downloads 80
34584 Failure Statistics Analysis of China’s Spacecraft in Full-Life
Authors: Xin-Yan Ji
Abstract:
Historical spacecraft failure data are very useful for improving spacecraft design and test philosophies and for reducing flight risk. A study of spacecraft failure data was performed, which is the most comprehensive statistical survey of spacecraft in China. In total, 2593 on-orbit failure records and 1298 ground records from 150 spacecraft launched from 2000 to 2016 were identified and collected, covering navigation satellites, communication satellites, remote sensing, deep space exploration, and manned spaceflight platforms. In this paper, the failures were analyzed to compare different spacecraft subsystems and estimate their impact on the mission; then the development of spacecraft in China was evaluated in terms of design, software, workmanship, management, parts, and materials. Finally, the lessons learned from the past years show that electrical and mechanical failures account for the largest share, and that the key to reducing in-orbit failures lies in improved design technology, sufficient redundancy, adequate space environment protection measures, and adequate ground testing.
Keywords: spacecraft anomalies, anomalies mechanism, failure cause, spacecraft testing
Procedia PDF Downloads 117
34583 A Concept of Data Mining with XML Document
Authors: Akshay Agrawal, Anand K. Srivastava
Abstract:
The increasing number of XML datasets available to casual users increases the need to investigate techniques for extracting knowledge from these data. Data mining is widely applied in the database research area in order to extract frequent correlations of values from both structured and semi-structured datasets. The increasing availability of heterogeneous XML sources has raised a number of issues concerning how to represent and manage these semi-structured data. In recent years, due to the importance of managing these resources and extracting knowledge from them, many methods have been proposed to represent and cluster them in different ways.
Keywords: XML, similarity measure, clustering, cluster quality, semantic clustering
Procedia PDF Downloads 379
34582 Data Mining Approach: Classification Model Evaluation
Authors: Lubabatu Sada Sodangi
Abstract:
The rapid growth in the exchange and accessibility of information via the internet leads many organisations to acquire data on their own operations. The aim of data mining is to analyse the different behaviours of a dataset through observation. However, the subset of the dataset being analysed may not display all the behaviours and relationships of the entire data and, therefore, may not represent other parts of the dataset. A range of techniques is used in data mining to uncover hidden or unknown information in datasets. In this paper, the performance of two algorithms, Chi-Square Automatic Interaction Detection (CHAID) and multilayer perceptron (MLP), is compared using the Adult dataset to find the percentage of adults who earn > 50k and of those who earn <= 50k per year. The two algorithms were studied and compared using IBM SPSS Statistics software. The CHAID result shows that the most important predictors are relationship and education. The algorithm shows that individuals who are married (husband), hold a Bachelor, Masters, Doctorate or Prof-school qualification, and are older than 41 and younger than 57 earn > 50k. The multilayer perceptron, in turn, identifies marital status and capital gain as the most important predictors of income. It also shows that individuals whose capital gain is less than 6,849 and who are single, separated or widowed earn <= 50k, whereas individuals whose capital gain is > 6,849, who work > 35 hrs/wk, and who are older than 27 earn > 50k. Comparing the two algorithms, both are reliable, but CHAID is the more reliable and clearly shows that relationship and education contribute to the prediction, as displayed in the data visualisation.
Keywords: data mining, CHAID, multi-layer perceptron, SPSS, Adult dataset
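CHAID ranks categorical predictors by a chi-square test of independence against the target, which is how a predictor such as relationship or education can surface as the most important one. The sketch below computes only the Pearson chi-square statistic for a contingency table; it is not the full CHAID procedure (no category merging, p-values, or Bonferroni adjustment) and not the SPSS implementation. The counts and the function name are hypothetical and are not taken from the Adult dataset.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the predictor and the target were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts: rows = predictor category, columns = (<= 50k, > 50k).
strong_predictor = [[30, 10], [10, 30]]
weak_predictor = [[20, 20], [20, 20]]
print(chi_square_statistic(strong_predictor))
print(chi_square_statistic(weak_predictor))
```

A larger statistic for the same table shape indicates a stronger association with the income class, so the first table would be preferred as a split candidate over the second.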
Procedia PDF Downloads 378