Search results for: Content mining
2030 Granularity Analysis for Spatio-Temporal Web Sensors
Authors: Shun Hattori
Abstract:
In recent years, much research has actively mined the rapidly growing Web, especially User Generated Content (UGC) such as weblogs, for knowledge about phenomena and events in the physical world, and Web services built on Web-mined knowledge have begun to be developed for the public. However, there are few detailed investigations of how accurately Web-mined data reflect physical-world data. Using Web-mined data uncritically in public Web services without sufficiently ensuring their accuracy is problematic. Therefore, this paper introduces the simplest Web Sensor and a spatiotemporally normalized Web Sensor to extract spatiotemporal data about a target phenomenon from weblogs retrieved by keyword(s) representing that phenomenon. It tries to validate the potential and reliability of the Web-sensed spatiotemporal data through four kinds of granularity analysis of the correlation coefficient with the Japan Meteorological Agency's per-day, per-region statistics on temperature, rainfall, snowfall, and earthquakes as physical-world data: spatial granularity (a region's population density), temporal granularity (time period, e.g., per day vs. per week), representation granularity (e.g., "rain" vs. "heavy rain"), and media granularity (weblogs vs. microblogs such as Tweets).
Keywords: Granularity analysis, knowledge extraction, spatiotemporal data mining, Web credibility, Web mining, Web sensor.
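The temporal granularity analysis described in this abstract can be illustrated with a minimal sketch that compares the correlation coefficient per day and per week. The post counts and rainfall values below are hypothetical, the function names are illustrative, and the paper's own sensors and normalization are not reproduced here.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def aggregate(values, period):
    """Sum consecutive values into blocks of `period` days (a ragged tail is dropped)."""
    return [sum(values[i:i + period])
            for i in range(0, len(values) - period + 1, period)]

# Hypothetical data: daily counts of posts mentioning "rain" and daily rainfall in mm.
posts = [3, 0, 5, 1, 0, 7, 2, 4, 1, 6, 0, 0, 8, 3, 0, 1, 0, 2, 0, 1, 0]
rain_mm = [10, 0, 12, 0, 2, 20, 5, 9, 0, 15, 1, 0, 25, 4, 0, 3, 0, 6, 0, 2, 0]

r_daily = pearson(posts, rain_mm)                                 # per-day granularity
r_weekly = pearson(aggregate(posts, 7), aggregate(rain_mm, 7))    # per-week granularity
print(f"daily r = {r_daily:.3f}, weekly r = {r_weekly:.3f}")
```

Comparing the two coefficients shows how the choice of time period alone can change the apparent agreement between Web-sensed and physical-world data.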
Downloads: 1882
2029 Improved FP-growth Algorithm with Multiple Minimum Supports Using Maximum Constraints
Authors: Elsayeda M. Elgaml, Dina M. Ibrahim, Elsayed A. Sallam
Abstract:
Association rule mining is one of the most important fields of data mining and knowledge discovery. In this paper, we propose an efficient multiple-support frequent pattern growth algorithm, which we call “MSFP-growth”, that enhances the FP-growth algorithm by adding an infrequent child node pruning step with multiple minimum supports using maximum constraints. The algorithm is implemented and compared with other common algorithms: Apriori with multiple minimum supports using maximum constraints, and FP-growth. The experimental results show that the rules mined by the proposed algorithm are interesting and that our algorithm achieves better performance than the other algorithms without sacrificing accuracy.
Keywords: Association Rules, FP-growth, Multiple minimum supports, Weka Tool
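As a rough illustration of multiple minimum supports under a maximum constraint, the sketch below assumes that an itemset's threshold is the largest minimum support among its items. It uses brute-force enumeration, not the FP-tree or the pruning step of MSFP-growth, and the transactions, thresholds, and function name are hypothetical.

```python
from itertools import combinations

def frequent_itemsets(transactions, item_minsup, max_size=3):
    """Return {itemset: support} for itemsets that meet the maximum constraint.

    Assumption: an itemset is frequent when its support is at least the
    largest per-item minimum support among the items it contains.
    """
    n = len(transactions)
    tsets = [frozenset(t) for t in transactions]
    items = sorted({i for t in tsets for i in t})
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            cset = frozenset(combo)
            support = sum(1 for t in tsets if cset <= t) / n
            threshold = max(item_minsup[i] for i in combo)
            if support >= threshold:
                result[combo] = support
    return result

# Hypothetical transactions and per-item minimum supports.
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
    ["bread", "caviar"],
]
item_minsup = {"bread": 0.35, "milk": 0.35, "butter": 0.35, "caviar": 0.15}

found = frequent_itemsets(transactions, item_minsup)
for itemset, support in sorted(found.items()):
    print(itemset, support)
```

Here the rare item "caviar" is frequent on its own because of its low threshold, while the pair ("bread", "caviar") is rejected because the pair inherits the higher threshold of "bread".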
Downloads: 3318
2028 Soil Moisture Content in Hill-Filed Side Slope
Authors: A. Aboufayed
Abstract:
The soil moisture content is an important property of the soil. The results of mean weekly gravimetric soil moisture content, measured for the three soil layers within the A horizon, showed that it was higher for the top 5 cm over the whole period of monitoring (15/7/2004 up to 10/11/05), with the variation becoming greater during winter time. This reflects the pattern of rainfall in Ireland, which is spread over the whole year, and shows that light rainfall events during summer time were compensated by loss through evapotranspiration, but only in the top 5 cm of soil. This layer had the highest porosity and highest moisture holding capacity due to the high content of organic matter. The gravimetric soil moisture contents of the top 5 cm and the underlying 5-15 and 15-25 cm layers show that the bottom site of the Hill Field had higher soil moisture content than the middle and top sites during the whole period of monitoring.
Keywords: Soil, Soil moisture, Gravimetric soil moisture content.
Downloads: 2372
2027 Teaching Science Content Area Literacy to 21st Century Learners
Authors: Melissa C. LaDuke
Abstract:
The use of new literacies within science classrooms needs to be balanced by teachers to both teach different forms of communication while assessing content area proficiency. Using new literacies such as Twitter and Facebook needs to be incorporated into science content area literacy studies in addition to continuing to use generally-accepted forms of scientific content area presentation which include scientific papers and textbooks. The research question this literature review seeks to answer is “What are some ways in which new forms of literacy are better suited to teach scientific content area literacy to 21st century learners?” The research question is addressed through a literature review that highlights methods currently being used to educate the next wave of learners in the world of science content area literacy. Both temporal discourse analysis (TDA) and critical discourse analysis (CDA) were used to determine the need to use new literacies to teach science content area literacy. Increased use of digital technologies and a change in science content area pedagogy were explored.
Keywords: Science content area literacy, new literacies, critical discourse analysis, temporal discourse analysis.
Downloads: 465
2026 Analysis of Users’ Behavior on Book Loan Log Based On Association Rule Mining
Authors: Kanyarat Bussaban, Kunyanuth Kularbphettong
Abstract:
This research aims to create a model for analyzing student behavior in using library resources, based on data mining techniques, in the case of Suan Sunandha Rajabhat University. The model was created using association rules and the Apriori algorithm. The analysis found 14 rules, which were tested with a testing data set; the classification accuracy was 79.24 percent and the MSE was 22.91. The results showed that a user behavior model built with the association rule technique can be used to manage the library resources.
Keywords: Behavior, data mining technique, Apriori algorithm.
Downloads: 2306
2025 Applying Sequential Pattern Mining to Generate Block for Scheduling Problems
Authors: Meng-Hui Chen, Chen-Yu Kao, Chia-Yu Hsu, Pei-Chann Chang
Abstract:
The main idea of this paper is to use sequential pattern mining to find information that is helpful for finding high-performance solutions. This information is combined and defined as blocks. Using the blocks to generate artificial chromosomes (ACs) could improve the structure of solutions. Estimation of Distribution Algorithms (EDAs) are adapted to solve the combinatorial problems. Although many of these approaches are advantageous for this application, only some of them are used to enhance its efficiency. Generating ACs using patterns and EDAs could increase diversity. According to the experimental results, the proposed algorithm performs better in solving permutation flow-shop problems.
Keywords: Combinatorial problems, Sequential Pattern Mining, Estimation of Distribution Algorithms, Artificial Chromosomes.
Downloads: 1718
2024 On Pattern-Based Programming towards the Discovery of Frequent Patterns
Authors: Kittisak Kerdprasop, Nittaya Kerdprasop
Abstract:
The problem of frequent pattern discovery is defined as the process of searching for patterns, such as sets of features or items, that appear in data frequently. Finding such frequent patterns has become an important data mining task because it reveals associations, correlations, and many other interesting relationships hidden in a database. Most of the proposed frequent pattern mining algorithms have been implemented with imperative programming languages. Such a paradigm is inefficient when the set of patterns is large and the frequent patterns are long. We suggest applying a high-level declarative style of programming to the problem of frequent pattern discovery. We consider two languages: Haskell and Prolog. Our intuitive idea is that the problem of finding frequent patterns should be efficiently and concisely implemented via a declarative paradigm, since pattern matching is a fundamental feature supported by most functional languages and Prolog. Our frequent pattern mining implementation using the Haskell and Prolog languages confirms our hypothesis about the conciseness of the program. Comparative performance studies on lines of code, speed, and memory usage of declarative versus imperative programming are reported in the paper.
Keywords: Frequent pattern mining, functional programming, pattern matching, logic programming.
Downloads: 1343
2023 Effects of Heavy Pumping and Artificial Groundwater Recharge Pond on the Aquifer System of Langat Basin, Malaysia
Authors: R. May, K. Jinno, I. Yusoff
Abstract:
The paper aims to evaluate the effects of heavy groundwater withdrawal and artificial groundwater recharge from an ex-mining pond on the aquifer system of the Langat Basin through three-dimensional (3D) numerical modeling. Many mining sites have been left behind from the massive mining exploitation in Malaysia during the British colonial era and over the last few decades. These sites are able to accommodate more than a million cubic meters of water from precipitation, runoff, groundwater, and rivers. Most of the time, the mining sites are turned into ponds for recreational activities. In the current study, artificial groundwater recharge from an ex-mining pond in the Langat Basin was proposed due to its capacity to store >50 million m3 of water. The pond is located near the Langat River and opposite a steel company where >4 million gallons of groundwater are withdrawn on a daily basis. The 3D numerical simulation was developed using the Groundwater Modeling System (GMS). The calibrated model (error about 0.7 m) was utilized to simulate two scenarios: (1) Case 1: artificial recharge pond with no pumping, and (2) Case 2: artificial pond with pumping. The results showed that in Case 1, the pond played a very important role in supplying additional water to the aquifer and river. About 90,916 m3/d of water from the pond, 1,173 m3/d from the Langat River, and 67,424 m3/d from the direct recharge of precipitation infiltrated into the aquifer system. In Case 2, the abstraction of groundwater by the company caused a steep depression around the wells, river, and pond. The water budget showed an increased rate of inflow in the pond and river, at 92,493 m3/d and 3,881 m3/d respectively. The outcome of the current study provides useful information on the aquifer behavior of the Langat Basin.
Keywords: Groundwater and surface water interaction, groundwater modeling, GMS, artificial recharge pond, ex-mining site.
Downloads: 2655
2022 Review and Comparison of Associative Classification Data Mining Approaches
Authors: Suzan Wedyan
Abstract:
Associative classification (AC) is a data mining approach that combines association rules and classification to build classification models (classifiers). AC has attracted significant attention from researchers, mainly because it derives accurate classifiers that contain simple yet effective rules. In the last decade, a number of associative classification algorithms have been proposed, such as Classification based Association (CBA), Classification based on Multiple Association Rules (CMAR), Class based Associative Classification (CACA), and Classification based on Predicted Association Rule (CPAR). This paper surveys major AC algorithms and compares the steps and methods performed in each algorithm, including rule learning, rule sorting, rule pruning, classifier building, and class prediction.
Keywords: Associative Classification, Classification, Data Mining, Learning, Rule Ranking, Rule Pruning, Prediction.
Downloads: 6633
2021 Questions Categorization in E-Learning Environment Using Data Mining Technique
Authors: Vilas P. Mahatme, K. K. Bhoyar
Abstract:
Nowadays, education cannot be imagined without digital technologies, which broaden the horizons of teaching and learning processes. Several universities offer online courses. For evaluation purposes, e-examination systems are being widely adopted in academic environments, and multiple-choice tests are extremely popular. In moving from traditional examinations to e-examination, Moodle is being used as a Learning Management System (LMS). Moodle logs every click that students make for attempting and navigational purposes in an e-examination. Data mining has been applied in various domains, including retail sales and bioinformatics. In recent years, there has been increasing interest in the use of data mining in e-learning environments, where it has been applied to discover, extract, and evaluate parameters related to students' learning performance. The combination of data mining and e-learning is still in its infancy. Log data generated by students during online examinations can be used to discover knowledge with the help of data mining techniques. In web-based applications, the number of right and wrong answers in the test result is not sufficient to assess and evaluate a student's performance, so assessment techniques must be intelligent enough. If a student cannot answer the question asked by the instructor, an easier question can be asked; otherwise, a more difficult question on a similar topic can be posed. To do so, it is necessary to identify the difficulty level of the questions. The proposed work concentrates on this issue. Data mining techniques, specifically clustering, are used in this work. This method decides the difficulty levels of the questions and categorizes them as tough, easy, or moderate; later, they are served to the appropriate students based on their performance. The proposed experiment categorizes the question set and also groups the students based on their performance in the examination. This will help the instructor to guide the students more specifically. In short, mined knowledge helps to support, guide, facilitate, and enhance learning as a whole.
Keywords: Data mining, e-examination, e-learning, moodle.
Downloads: 2075
2020 The Research of Fuzzy Classification Rules Applied to CRM
Authors: Chien-Hua Wang, Meng-Ying Chou, Chin-Tzong Pang
Abstract:
In an era of great competition, understanding and satisfying customers' requirements are critical tasks for a company to make a profit. Customer relationship management (CRM) has thus become an important business issue. With the help of data mining techniques, a manager can explore and analyze a large quantity of data to discover meaningful patterns and rules. Among all methods, the well-known association rule is the most commonly seen. This paper is based on the Apriori algorithm and uses genetic algorithms combined with a data mining method to discover fuzzy classification rules. The mined results can be applied in CRM to help decision makers make correct business decisions for marketing strategies.
Keywords: Customer relationship management (CRM), Data mining, Apriori algorithm, Genetic algorithm, Fuzzy classification rules.
Downloads: 1661
2019 Predicting Oil Content of Fresh Palm Fruit Using Transmission-Mode Ultrasonic Technique
Authors: Sutthawee Suwannarat, Thanate Khaorapapong, Mitchai Chongcheawchamnan
Abstract:
In this paper, an ultrasonic technique is proposed to predict the oil content of a fresh palm fruit. This is accomplished by measuring the attenuation in ultrasonic transmission mode. Several palm fruit samples with oil content known from Soxhlet extraction (ISO9001:2008) were tested with our ultrasonic measurement, and amplitude attenuation data were collected for all palm samples. Feedforward Neural Networks (FNNs) are applied to predict the oil content of the samples. The Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) of the FNN model for predicting oil content percentage are 7.6186 and 5.2287, with a correlation coefficient (R) of 0.9193.
Keywords: Non-destructive, ultrasonic testing, oil content, fresh palm fruit, neural network.
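For readers unfamiliar with the two error measures quoted in this abstract, the following minimal sketch computes RMSE and MAE from their standard definitions. The oil content values are hypothetical and are not taken from the paper.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared difference."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean Absolute Error: mean of the absolute differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical oil content percentages (measured vs. model prediction).
measured = [40.0, 50.0, 60.0]
predicted = [43.0, 46.0, 60.0]

print(f"RMSE = {rmse(measured, predicted):.4f}, MAE = {mae(measured, predicted):.4f}")
```

RMSE is never smaller than MAE on the same data because squaring weights large errors more heavily, which is consistent with the two values reported above.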
Downloads: 1807
2018 Heavy Metal Pollution of the Soils around the Mining Area near Shamlugh Town (Armenia) and Related Risks to the Environment
Authors: G. A. Gevorgyan, K. A. Ghazaryan, T. H. Derdzyan
Abstract:
The heavy metal pollution of the soils around the mining area near Shamlugh town and related risks to human health were assessed. The investigations showed that the soils were polluted with heavy metals that can be ranked by degree of anthropogenic pollution as follows: Cu>Pb>As>Co>Ni>Zn. The main sources of the anthropogenic metal pollution of the soils were the copper mining area near Shamlugh town, the Chochkan tailings storage facility, and the trucks transferring ore from the mining area. The degree of copper pollution at some observation sites was unacceptable for agricultural production. The total non-carcinogenic chronic hazard index (THI) values in some places, including observation sites in Shamlugh town, were above the safe level (THI<1) for children living in this territory. Although the highest degree of heavy metal enrichment in the soils was registered for copper, the highest health risks to humans, especially children, were posed by cobalt, which is explained by the fact that heavy metals have different toxicity levels and penetration characteristics.
Keywords: Armenia, copper mine, heavy metal pollution of soil, health risks.
Downloads: 2379
2017 A Location Routing Model for the Logistic System in the Mining Collection Centers of the Northern Region of Boyacá-Colombia
Authors: Erika Ruíz, Luis Amaya, Diego Carreño
Abstract:
The main objective of this study is to design a mathematical model for the logistics of mining collection centers in the northern region of the department of Boyacá (Colombia), determining the structure that facilitates the flow of products along the supply chain. To achieve this, it is necessary to define a suitable design of the distribution network, taking into account the products, customers' characteristics, and the availability of information. Other aspects must also be defined, such as the number and capacity of collection centers to establish and the routes that must be taken to deliver products to the customers, among others. This research will use an operations research problem employed in the design of distribution networks, known as the Location Routing Problem (LRP).
Keywords: Location routing problem, logistic, mining collection, model.
Downloads: 789
2016 Application of Data Mining Tools to Predicate Completion Time of a Project
Authors: Seyed Hossein Iranmanesh, Zahra Mokhtari
Abstract:
Estimating the time and cost of work completion in a project and following them up during execution contribute to the success or failure of a project and are very important for the project management team. Delivering on time and within the budgeted cost requires managing and controlling projects well. To deal with the complex task of controlling and modifying the baseline project schedule during execution, earned value management systems have been set up and are widely used to measure and communicate the real physical progress of a project. However, they often fail to predict the total duration of the project. In this paper, data mining techniques are used to predict the total project duration in terms of Time Estimate At Completion, EAC(t). For this purpose, we used a project with 90 activities that was updated day by day. We then used the regular indexes from the literature and applied the Earned Duration Method to calculate the time estimate at completion, and set these as input data for prediction and for identifying the major parameters among them using Clem software. By using data mining, the parameters that affect EAC and the relationships between them can be extracted, which is very useful for managing a project with minimum delay risk. As we state, this could be a simple, safe, and applicable method for predicting the completion time of a project during execution.
Keywords: Data Mining Techniques, Earned Duration Method, Earned Value, Estimate At Completion.
Downloads: 1803
2015 Impact of Foliar Application of Zinc on Micro and Macro Elements Distribution in Phyllanthus amarus
Authors: Nguyen Cao Nguyen, Krasimir I. Ivanov, Penka S. Zapryanova
Abstract:
The present study was carried out to investigate the interaction of foliar-applied zinc with other elements in Phyllanthus amarus plants. The plant samples for our experiment were collected from Lam Dong province, Vietnam. Seven suspension solutions of nanosized zinc hydroxide nitrate (Zn5(OH)8(NO3)2·2H2O) with different Zn concentrations were used. Fertilization and irrigation were the same for all variants. The Zn content and the content of selected micro (Cu, Fe, Mn) and macro (Ca, Mg, P and K) nutrients in plant roots, stems, and leaves were determined. It was concluded that the zinc content of plant roots varies narrowly, with no significant impact of ZnHN fertilization. The same trend can be seen in the content of Cu, Mn, and macronutrients. The zinc content of plant stems and leaves varies within wide limits, with a significant impact of ZnHN fertilization. The trends in the content of Cu, Mn, and macronutrients remain the same as in the roots, whereas the iron content tends to increase with increasing zinc content.
Keywords: Zinc fertilizers, micro and macro elements, Phyllanthus amarus.
Downloads: 597
2014 Analysis of Medical Data using Data Mining and Formal Concept Analysis
Authors: Anamika Gupta, Naveen Kumar, Vasudha Bhatnagar
Abstract:
This paper focuses on analyzing medical diagnostic data using classification rules in data mining and context reduction in formal concept analysis. It helps in finding redundancies among the various medical examination tests used in diagnosis of a disease. Classification rules have been derived from positive and negative association rules using the Concept lattice structure of the Formal Concept Analysis. Context reduction technique given in Formal Concept Analysis along with classification rules has been used to find redundancies among the various medical examination tests. Also it finds out whether expensive medical tests can be replaced by some cheaper tests.
Keywords: Data Mining, Formal Concept Analysis, Medical Data, Negative Classification Rules.
Downloads: 1738
2013 Searching for Similar Informational Articles in the Internet Channel
Authors: Sung Ho Ha, Seong Hyeon Joo, Hyun U. Pae
Abstract:
In terms of total online audience, newspapers are the most successful form of online content to date. The online audience for newspapers continues to demand higher-quality services, including personalized news services. News providers should be able to offer appropriate content to the right users. In this paper, a news article recommender system is suggested based on a user's preferences when he or she visits an Internet news site and reads the published articles. This system helps raise the user's satisfaction and increase customer loyalty toward the content provider.
Keywords: Content classification, content recommendation, customer profiling, documents clustering.
Downloads: 1607
2012 Network Mobility Support in Content-Centric Internet
Authors: Zhiwei Yan, Jong-Hyouk Lee, Yong-Jin Park, Xiaodong Lee
Abstract:
In this paper, we analyze NEtwork MObility (NEMO) supporting problems in Content-Centric Networking (CCN), and propose the CCN-NEMO which can well support the deployment of the content-centric paradigm in large-scale mobile Internet. The CCN-NEMO extends the signaling message of the basic CCN protocol, to support the mobility discovery and fast trigger of Interest re-issuing during the network mobility. Besides, the Mobile Router (MR) is extended to optimize the content searching and relaying in the local subnet. These features can be employed by the nested NEMO to maximize the advantages of content retrieving with CCN. Based on the analysis, we compare the performance on handover latency between the basic CCN and our proposed CCN-NEMO. The results show that our scheme can facilitate the content-retrieving in the NEMO scenario with improved performance.
Keywords: CCN, handover, NEMO, mobility management.
Downloads: 1535
2011 Using Data Clustering in Oral Medicine
Authors: Fahad Shahbaz Khan, Rao Muhammad Anwer, Olof Torgersson
Abstract:
The vast amount of information hidden in huge databases has created tremendous interest in the field of data mining. This paper examines the possibility of using data clustering techniques in oral medicine to identify functional relationships between different attributes and to classify similar patient examinations. Commonly used data clustering algorithms have been reviewed, and as a result several interesting results have been gathered.
Keywords: Oral Medicine, Cluto, Data Clustering, Data Mining.
Downloads: 1977
2010 Optimization of Air Pollution Control Model for Mining
Authors: Zunaira Asif, Zhi Chen
Abstract:
Sustainable air quality management is recognized as one of the most serious environmental concerns in mining regions. Mining operations emit various types of pollutants that have significant impacts on the environment. This study presents a stochastic control strategy by developing an air pollution control model to achieve a cost-effective solution. The optimization method is formulated to predict the cost of treatment using linear programming with an objective function and multiple constraints. The constraints mainly focus on two factors: production of metal should not exceed the available resources, and air quality should meet the standard criteria for each pollutant. The applicability of this model is explored through a case study of an open pit metal mine in Utah, USA. The method simultaneously uses meteorological data as a dispersion transfer function to reflect practical local conditions. The probabilistic analysis and the uncertainties in the meteorological conditions are handled by Monte Carlo simulation. Reasonable results have been obtained for selecting the optimized treatment technology for PM2.5, PM10, NOx, and SO2. Additional comparison analysis shows that a baghouse is the least-cost option compared to electrostatic precipitators and wet scrubbers for particulate matter, whereas non-selective catalytic reduction and dry flue gas desulfurization are suitable for NOx and SO2 reduction respectively. Thus, this model can help planners reduce these pollutants at a marginal cost by suggesting pollution control devices, while accounting for dynamic meteorological conditions and mining activities.
Keywords: Air pollution, linear programming, mining, optimization, treatment technologies.
Downloads: 1606
2009 Data Mining for Cancer Management in Egypt Case Study: Childhood Acute Lymphoblastic Leukemia
Authors: Nevine M. Labib, Michael N. Malek
Abstract:
Data Mining aims at discovering knowledge from data and presenting it in a form that is easily comprehensible to humans. One of its useful applications in Egypt is cancer management, especially the management of Acute Lymphoblastic Leukemia (ALL), which is the most common type of cancer in children. This paper discusses the process of designing a prototype that can help in the management of childhood ALL, which has great significance in the health care field. Besides, it has a social impact on decreasing the rate of infection in children in Egypt. It also provides valuable information about the distribution and segmentation of ALL in Egypt, which may be linked to the possible risk factors. Undirected Knowledge Discovery is used since, in the case of this research project, there is no target field, as the data provided is mainly subjective. This is done in order to quantify the subjective variables. Therefore, the computer will be asked to identify significant patterns in the provided medical data about ALL. This may be achieved through collecting the data necessary for the system, determining the data mining technique to be used for the system, and choosing the most suitable implementation tool for the domain. The research makes use of a data mining tool, Clementine, to apply the Decision Trees technique. We feed it with data extracted from real-life cases taken from specialized Cancer Institutes. Relevant medical case details, such as patient medical history and diagnosis, are analyzed, classified, and clustered in order to improve the disease management.
Keywords: Data Mining, Decision Trees, Knowledge Discovery, Leukemia.
Downloads: 2215
2008 Data Mining Approach for Commercial Data Classification and Migration in Hybrid Storage Systems
Authors: Mais Haj Qasem, Maen M. Al Assaf, Ali Rodan
Abstract:
Parallel hybrid storage systems consist of a hierarchy of different storage devices that vary in terms of data reading speed. As we ascend the hierarchy, data reading becomes faster. Thus, migrating the application's important data that will be accessed in the near future to the uppermost level reduces the application's I/O waiting time and hence its elapsed execution time. In this research, we implement a trace-driven two-level parallel hybrid storage system prototype that consists of HDDs and SSDs. The prototype uses data mining techniques to classify the application's data in order to determine its near-future data accesses in parallel with its on-demand requests. The important data (i.e., the data that the application will access in the near future) are continuously migrated to the uppermost level of the hierarchy. Our simulation results show that our data migration approach integrated with data mining techniques reduces the application's elapsed execution time by at least 22% when using a variety of traces.
Keywords: Data mining, hybrid storage system, recurrent neural network, support vector machine.
Downloads: 1736
2007 Energy Requirement for Cutting Corn Stalks (Single Cross 704 Var.)
Authors: M. Azadbakht, A. Rezaei Asl, K. Tamaskani Zahedi
Abstract:
Corn is cultivated in most countries because of its high consumption, quality, and food value. This study evaluated the energy required for cutting corn stems at different levels of cutting height and moisture content. For this purpose, a test device was fabricated and then calibrated. The device works on the principle of conservation of energy. The results were analyzed using a split plot design and SAS software. The results showed that the effects of height and moisture content, and their interaction effect, on cutting energy are significant (P<1%). The maximum cutting energy was 3.22 kJ at 63 (w.b.%) moisture content and the minimum cutting energy was 1.63 kJ at 83.25 (w.b.%) moisture content.
Keywords: Cutting energy, Corn stalk, Cutting height, Moisture content, Impact cutting.
Downloads 3143
2006 Estimation Model of Dry Docking Duration Using Data Mining
Authors: Isti Surjandari, Riara Novita
Abstract:
Maintenance is one of the most important activities in the shipyard industry. However, it is not always supported by adequate services from the shipyard, and inaccurate estimates of ship maintenance duration are still common. This makes estimating ship maintenance duration crucial. This study uses a data mining approach, namely CART (Classification and Regression Tree), to estimate the duration of ship maintenance that is limited to dock works, known as dry docking. Using the volume of dock works as the input for estimating maintenance duration, four classes of dry docking duration were obtained, each with a different linear model and job criteria. These linear models can then be used to estimate the duration of dry docking based on job criteria.
Keywords: Classification and regression tree (CART), data mining, dry docking, maintenance duration.
Downloads 2433
2005 A New Algorithm for Cluster Initialization
Authors: Moth'd Belal. Al-Daoud
Abstract:
Clustering is a very well known technique in data mining. One of the most widely used clustering techniques is the k-means algorithm. Solutions obtained from this technique are dependent on the initialization of cluster centers. In this article we propose a new algorithm to initialize the clusters. The proposed algorithm is based on finding a set of medians extracted from a dimension with maximum variance. The algorithm has been applied to different data sets and good results are obtained.
Keywords: clustering, k-means, data mining.
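The initialization idea in this abstract can be illustrated with a minimal Python sketch of one plausible reading: find the dimension with maximum variance, sort the points along it, split them into k groups, and use the median point of each group as an initial center. The function name `median_init`, the near-equal contiguous split, and the choice of the upper-middle point in even-sized groups are assumptions made for illustration; the abstract does not confirm these details.

```python
from statistics import pvariance


def median_init(points, k):
    """Pick k initial cluster centers (hypothetical helper, not the paper's code)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    dims = len(points[0])
    # Find the dimension whose values have the largest variance.
    best_dim = max(range(dims), key=lambda d: pvariance([p[d] for p in points]))
    # Sort all points along that dimension.
    ordered = sorted(points, key=lambda p: p[best_dim])
    n = len(ordered)
    centers = []
    for i in range(k):
        # Assumption: split the sorted points into k contiguous, near-equal groups.
        group = ordered[i * n // k:(i + 1) * n // k]
        # Use the middle point of the group (upper median for even sizes).
        centers.append(group[len(group) // 2])
    return centers
```

For example, `median_init([(1, 10), (2, 10), (3, 10), (10, 11), (11, 11), (12, 11)], 2)` sorts along the first coordinate, which has the larger variance, and picks the middle point of each half. The returned points would then seed an ordinary k-means run.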
Downloads 2103
2004 Arsenic Mobility from Mining Tailings of Monte San Nicolas to Presa de Mata in Guanajuato, Mexico
Authors: I. Cano-Aguilera, B. E. Rubio-Campos, G. De la Rosa, A. F. Aguilera-Alvarado
Abstract:
Mining tailings are a source of material rich in heavy metals that poses a potential danger to public health and the environment, since these metals can, under certain conditions, leach into and contaminate aqueous systems that serve as sources of potable water. The strategy for this work is based on observation, experimentation, and simulation, combining the observed hydrodynamic behavior of metals leached from mining tailings with the applied mathematics that provides the logical structure to decipher the individual effects of the overall physicochemical phenomenon. The case study presented herein focuses on mining tailings deposits located at Monte San Nicolas, Guanajuato, Mexico, an abandoned mine. This site was considered the contamination source that, under certain physicochemical conditions, can favor metal leaching and transport towards aqueous systems. In addition, the cartography, meteorology, geology, hydrodynamics, and hydrological characteristics of the area will help determine how and when these systems can interact. Preliminary results demonstrated that arsenic is highly mobile, since it was identified in several surface aqueous systems of the micro-watershed, as well as in sediments, at concentrations that exceed the maximum limits established in the official norms. Variations in pH and oxidation-reduction potential were also recorded, conditions that favor the presence of different species of this element, its solubility, and therefore its mobility.
Keywords: Arsenic, mining tailings, transport.
Downloads 1689
2003 Spreading Japan's National Image through China during the Era of Mass Tourism: The Japan National Tourism Organization’s Use of Sina Weibo
Authors: Abigail Qian Zhou
Abstract:
Since China has entered an era of mass tourism, there has been a fundamental change in the way Chinese people approach and perceive the image of other countries. With the advent of the new media era, social networking sites such as Sina Weibo have become a tool for many foreign governmental organizations to spread and promote their national image. Among them, the Japan National Tourism Organization (JNTO) was one of the first foreign official tourism agencies to register with Sina Weibo and actively implement communication activities. Due to historical and political reasons, cognition of Japan's national image by the Chinese has always been complicated and contradictory. However, since 2015, China has become the largest source of tourists visiting Japan. This clearly indicates that the broadening of Japan's national image in China has been effective and has value worthy of reference in promoting a positive Chinese perception of Japan and encouraging Japanese tourism. Within this context and using the method of content analysis in media studies through content mining software, this study analyzed how JNTO’s Sina Weibo accounts have constructed and spread Japan's national image. This study also summarized the characteristics of its content and form, and finally revealed the strategy of JNTO in building its international image. The findings of this study not only add a tourism-based perspective to traditional national image communications research, but also provide some reference for the effective international dissemination of national image in the future.
Keywords: National image, tourism, international communication, Japan, China.
Downloads 994
2002 Association of Smoking with Chest Radiographic and Lung Function Findings in Retired Bauxite Mining Workers
Authors: L. R. Ferreira, R. C. G. Bianchi, L. C.R. Ferreira, C. M. Galhardi, E. P. Baciuk, L. H. Oliveira
Abstract:
Inhalation hazards are associated with potentially injurious exposure and increased risk for lung diseases within the bauxite mining industry, especially for smelter workers. Smoking is related to decreased lung function and leads to chronic lung diseases. This study aimed to evaluate whether smoking is related to functional and radiographic respiratory changes in retired bauxite mining workers. Methods: This was a retrospective, cross-sectional study involving the analysis of database information on 140 retired bauxite mining workers from Poços de Caldas-MG evaluated at the Worker’s Health Reference Center and at the Social Security Brazilian National Institute from July 1st, 2015 until June 30th, 2016. The workers were divided into three groups: non-smokers (n = 47), ex-smokers (n = 46), and smokers (n = 47). The data included age, gender, spirometry results, and the presence or absence of pulmonary pleural and/or parenchymal changes in chest radiographs. The chi-squared test was used (p < 0.05). Results: In the smokers’ group, 83% of spirometry tests and 64% of chest x-rays were altered. In the non-smokers’ group, 19% of spirometry tests and 13% of chest x-rays were altered. In the ex-smokers’ group, 35% of spirometry tests and 30% of chest x-rays were altered. Most of the results were statistically significant. Results demonstrated a significant difference between the smokers’ and non-smokers’ groups with regard to spirometric and radiographic pulmonary alterations. The ex-smokers’ and non-smokers’ groups demonstrated better results than the smokers’ group in relation to altered spirometry and radiograph findings. These data may contribute to planning strategies to enhance smoking cessation programs within the bauxite mining industry.
Keywords: Bauxite mining, spirometry, chest radiography, smoking.
Downloads 701
2001 Dry Matter, Moisture, Ash and Crude Fibre Content in Distinct Segments of ‘Durian Kampung’ Husk
Authors: Norhanim Nordin, Rosnah Shamsudin, Azrina Azlan, Mohammad Effendy Ya’acob
Abstract:
An environmentally friendly approach to disposing of voluminous durian husk waste could be implemented by converting it into various valuable commodities, such as healthcare and biofuel products. Thus, studying the composition of each segment of durian husk is crucial for determining suitable proportions of nutrients that need to be added and mixed into the product. A total of 12 ‘Durian Kampung’ fruits from Sg Ruan, Pahang were selected, and each fruit husk was divided into four segments labelled P-L (thin neck area of white inner husk), P-B (thick bottom area of white inner husk), H (green and thorny outer husk), and W (whole combination of P-B and H). Four experiments were carried out to determine the dry matter, moisture, ash, and crude fibre content. The results show that the H segment has the highest dry matter content (30.47%), while the P-B segment has the highest moisture (81.83%) and ash (6.95%) content. It was calculated that the higher moisture level of the P-B segment causes its ash content to increase by about 2.89% relative to the P-L segment. These data show that each segment of durian husk differs significantly in composition, which might be useful information for fully utilizing every part of the durian husk in the future.
Keywords: Durian husk, crude fibre content, dry matter content, moisture content.
Downloads 2197