Search results for: time series data mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 38952

Search results for: time series data mining

38772 Determination of Surface Deformations with Global Navigation Satellite System Time Series

Authors: Ibrahim Tiryakioglu, Mehmet Ali Ugur, Caglar Ozkaymak

Abstract:

The development of GNSS technology has led to increasingly widespread and successful applications of GNSS surveys for monitoring crustal movements. However, multi-period GPS survey solutions have not been applied in monitoring vertical surface deformation. This study uses long-term GNSS time series that are required to determine vertical deformations. In recent years, the surface deformations that are parallel and semi-parallel to Bolvadin fault have occurred in Western Anatolia. These surface deformations have continued to occur in Bolvadin settlement area that is located mostly on alluvium ground. Due to these surface deformations, a number of cracks in the buildings located in the residential areas and breaks in underground water and sewage systems have been observed. In order to determine the amount of vertical surface deformations, two continuous GNSS stations have been established in the region. The stations have been operating since 2015 and 2017, respectively. In this study, GNSS observations from the mentioned two GNSS stations were processed with GAMIT/GLOBK (GNSS Analysis Massachusetts Institute of Technology/GLOBal Kalman) program package to create a coordinate time series. With the time series analyses, the GNSS stations’ behavior models (linear, periodical, etc.), the causes of these behaviors, and mathematical models were determined. The study results from the time series analysis of these two 2 GNSS stations shows approximately 50-80 mm/yr vertical movement.

Keywords: Bolvadin fault, GAMIT, GNSS time series, surface deformations

Procedia PDF Downloads 164
38771 New Machine Learning Optimization Approach Based on Input Variables Disposition Applied for Time Series Prediction

Authors: Hervice Roméo Fogno Fotsoa, Germaine Djuidje Kenmoe, Claude Vidal Aloyem Kazé

Abstract:

One of the main applications of machine learning is the prediction of time series. But a more accurate prediction requires a more optimal model of machine learning. Several optimization techniques have been developed, but without considering the input variables disposition of the system. Thus, this work aims to present a new machine learning architecture optimization technique based on their optimal input variables disposition. The validations are done on the prediction of wind time series, using data collected in Cameroon. The number of possible dispositions with four input variables is determined, i.e., twenty-four. Each of the dispositions is used to perform the prediction, with the main criteria being the training and prediction performances. The results obtained from a static architecture and a dynamic architecture of neural networks have shown that these performances are a function of the input variable's disposition, and this is in a different way from the architectures. This analysis revealed that it is necessary to take into account the input variable's disposition for the development of a more optimal neural network model. Thus, a new neural network training algorithm is proposed by introducing the search for the optimal input variables disposition in the traditional back-propagation algorithm. The results of the application of this new optimization approach on the two single neural network architectures are compared with the previously obtained results step by step. Moreover, this proposed approach is validated in a collaborative optimization method with a single objective optimization technique, i.e., genetic algorithm back-propagation neural networks. From these comparisons, it is concluded that each proposed model outperforms its traditional model in terms of training and prediction performance of time series. Thus the proposed optimization approach can be useful in improving the accuracy of time series forecasts. This proves that the proposed optimization approach can be useful in improving the accuracy of time series prediction based on machine learning.

Keywords: input variable disposition, machine learning, optimization, performance, time series prediction

Procedia PDF Downloads 109
38770 Application of Stochastic Models to Annual Extreme Streamflow Data

Authors: Karim Hamidi Machekposhti, Hossein Sedghi

Abstract:

This study was designed to find the best stochastic model (using of time series analysis) for annual extreme streamflow (peak and maximum streamflow) of Karkheh River at Iran. The Auto-regressive Integrated Moving Average (ARIMA) model used to simulate these series and forecast those in future. For the analysis, annual extreme streamflow data of Jelogir Majin station (above of Karkheh dam reservoir) for the years 1958–2005 were used. A visual inspection of the time plot gives a little increasing trend; therefore, series is not stationary. The stationarity observed in Auto-Correlation Function (ACF) and Partial Auto-Correlation Function (PACF) plots of annual extreme streamflow was removed using first order differencing (d=1) in order to the development of the ARIMA model. Interestingly, the ARIMA(4,1,1) model developed was found to be most suitable for simulating annual extreme streamflow for Karkheh River. The model was found to be appropriate to forecast ten years of annual extreme streamflow and assist decision makers to establish priorities for water demand. The Statistical Analysis System (SAS) and Statistical Package for the Social Sciences (SPSS) codes were used to determinate of the best model for this series.

Keywords: stochastic models, ARIMA, extreme streamflow, Karkheh river

Procedia PDF Downloads 146
38769 Application of Seasonal Autoregressive Integrated Moving Average Model for Forecasting Monthly Flows in Waterval River, South Africa

Authors: Kassahun Birhanu Tadesse, Megersa Olumana Dinka

Abstract:

Reliable future river flow information is basic for planning and management of any river systems. For data scarce river system having only a river flow records like the Waterval River, a univariate time series models are appropriate for river flow forecasting. In this study, a univariate Seasonal Autoregressive Integrated Moving Average (SARIMA) model was applied for forecasting Waterval River flow using GRETL statistical software. Mean monthly river flows from 1960 to 2016 were used for modeling. Different unit root tests and Mann-Kendall trend analysis were performed to test the stationarity of the observed flow time series. The time series was differenced to remove the seasonality. Using the correlogram of seasonally differenced time series, different SARIMA models were identified, their parameters were estimated, and diagnostic check-up of model forecasts was performed using white noise and heteroscedasticity tests. Finally, based on minimum Akaike Information (AIc) and Hannan-Quinn (HQc) criteria, SARIMA (3, 0, 2) x (3, 1, 3)12 was selected as the best model for Waterval River flow forecasting. Therefore, this model can be used to generate future river information for water resources development and management in Waterval River system. SARIMA model can also be used for forecasting other similar univariate time series with seasonality characteristics.

Keywords: heteroscedasticity, stationarity test, trend analysis, validation, white noise

Procedia PDF Downloads 203
38768 An Optimized Association Rule Mining Algorithm

Authors: Archana Singh, Jyoti Agarwal, Ajay Rana

Abstract:

Data Mining is an efficient technology to discover patterns in large databases. Association Rule Mining techniques are used to find the correlation between the various item sets in a database, and this co-relation between various item sets are used in decision making and pattern analysis. In recent years, the problem of finding association rules from large datasets has been proposed by many researchers. Various research papers on association rule mining (ARM) are studied and analyzed first to understand the existing algorithms. Apriori algorithm is the basic ARM algorithm, but it requires so many database scans. In DIC algorithm, less amount of database scan is needed but complex data structure lattice is used. The main focus of this paper is to propose a new optimized algorithm (Friendly Algorithm) and compare its performance with the existing algorithms A data set is used to find out frequent itemsets and association rules with the help of existing and proposed (Friendly Algorithm) and it has been observed that the proposed algorithm also finds all the frequent itemsets and essential association rules from databases as compared to existing algorithms in less amount of database scan. In the proposed algorithm, an optimized data structure is used i.e. Graph and Adjacency Matrix.

Keywords: association rules, data mining, dynamic item set counting, FP-growth, friendly algorithm, graph

Procedia PDF Downloads 419
38767 Data Mining Model for Predicting the Status of HIV Patients during Drug Regimen Change

Authors: Ermias A. Tegegn, Million Meshesha

Abstract:

Human Immunodeficiency Virus and Acquired Immunodeficiency Syndrome (HIV/AIDS) is a major cause of death for most African countries. Ethiopia is one of the seriously affected countries in sub Saharan Africa. Previously in Ethiopia, having HIV/AIDS was almost equivalent to a death sentence. With the introduction of Antiretroviral Therapy (ART), HIV/AIDS has become chronic, but manageable disease. The study focused on a data mining technique to predict future living status of HIV/AIDS patients at the time of drug regimen change when the patients become toxic to the currently taking ART drug combination. The data is taken from University of Gondar Hospital ART program database. Hybrid methodology is followed to explore the application of data mining on ART program dataset. Data cleaning, handling missing values and data transformation were used for preprocessing the data. WEKA 3.7.9 data mining tools, classification algorithms, and expertise are utilized as means to address the research problem. By using four different classification algorithms, (i.e., J48 Classifier, PART rule induction, Naïve Bayes and Neural network) and by adjusting their parameters thirty-two models were built on the pre-processed University of Gondar ART program dataset. The performances of the models were evaluated using the standard metrics of accuracy, precision, recall, and F-measure. The most effective model to predict the status of HIV patients with drug regimen substitution is pruned J48 decision tree with a classification accuracy of 98.01%. This study extracts interesting attributes such as Ever taking Cotrim, Ever taking TbRx, CD4 count, Age, Weight, and Gender so as to predict the status of drug regimen substitution. The outcome of this study can be used as an assistant tool for the clinician to help them make more appropriate drug regimen substitution. Future research directions are forwarded to come up with an applicable system in the area of the study.

Keywords: HIV drug regimen, data mining, hybrid methodology, predictive model

Procedia PDF Downloads 141
38766 Mining Educational Data to Support Students’ Major Selection

Authors: Kunyanuth Kularbphettong, Cholticha Tongsiri

Abstract:

This paper aims to create the model for student in choosing an emphasized track of student majoring in computer science at Suan Sunandha Rajabhat University. The objective of this research is to develop the suggested system using data mining technique to analyze knowledge and conduct decision rules. Such relationships can be used to demonstrate the reasonableness of student choosing a track as well as to support his/her decision and the system is verified by experts in the field. The sampling is from student of computer science based on the system and the questionnaire to see the satisfaction. The system result is found to be satisfactory by both experts and student as well.

Keywords: data mining technique, the decision support system, knowledge and decision rules, education

Procedia PDF Downloads 421
38765 An IM-COH Algorithm Neural Network Optimization with Cuckoo Search Algorithm for Time Series Samples

Authors: Wullapa Wongsinlatam

Abstract:

Back propagation algorithm (BP) is a widely used technique in artificial neural network and has been used as a tool for solving the time series problems, such as decreasing training time, maximizing the ability to fall into local minima, and optimizing sensitivity of the initial weights and bias. This paper proposes an improvement of a BP technique which is called IM-COH algorithm (IM-COH). By combining IM-COH algorithm with cuckoo search algorithm (CS), the result is cuckoo search improved control output hidden layer algorithm (CS-IM-COH). This new algorithm has a better ability in optimizing sensitivity of the initial weights and bias than the original BP algorithm. In this research, the algorithm of CS-IM-COH is compared with the original BP, the IM-COH, and the original BP with CS (CS-BP). Furthermore, the selected benchmarks, four time series samples, are shown in this research for illustration. The research shows that the CS-IM-COH algorithm give the best forecasting results compared with the selected samples.

Keywords: artificial neural networks, back propagation algorithm, time series, local minima problem, metaheuristic optimization

Procedia PDF Downloads 151
38764 Predicting Medical Check-Up Patient Re-Coming Using Sequential Pattern Mining and Association Rules

Authors: Rizka Aisha Rahmi Hariadi, Chao Ou-Yang, Han-Cheng Wang, Rajesri Govindaraju

Abstract:

As the increasing of medical check-up popularity, there are a huge number of medical check-up data stored in database and have not been useful. These data actually can be very useful for future strategic planning if we mine it correctly. In other side, a lot of patients come with unpredictable coming and also limited available facilities make medical check-up service offered by hospital not maximal. To solve that problem, this study used those medical check-up data to predict patient re-coming. Sequential pattern mining (SPM) and association rules method were chosen because these methods are suitable for predicting patient re-coming using sequential data. First, based on patient personal information the data was grouped into … groups then discriminant analysis was done to check significant of the grouping. Second, for each group some frequent patterns were generated using SPM method. Third, based on frequent patterns of each group, pairs of variable can be extracted using association rules to get general pattern of re-coming patient. Last, discussion and conclusion was done to give some implications of the results.

Keywords: patient re-coming, medical check-up, health examination, data mining, sequential pattern mining, association rules, discriminant analysis

Procedia PDF Downloads 640
38763 Generating Real-Time Visual Summaries from Located Sensor-Based Data with Chorems

Authors: Z. Bouattou, R. Laurini, H. Belbachir

Abstract:

This paper describes a new approach for the automatic generation of the visual summaries dealing with cartographic visualization methods and sensors real time data modeling. Hence, the concept of chorems seems an interesting candidate to visualize real time geographic database summaries. Chorems have been defined by Roger Brunet (1980) as schematized visual representations of territories. However, the time information is not yet handled in existing chorematic map approaches, issue has been discussed in this paper. Our approach is based on spatial analysis by interpolating the values recorded at the same time, by sensors available, so we have a number of distributed observations on study areas and used spatial interpolation methods to find the concentration fields, from these fields and by using some spatial data mining procedures on the fly, it is possible to extract important patterns as geographic rules. Then, those patterns are visualized as chorems.

Keywords: geovisualization, spatial analytics, real-time, geographic data streams, sensors, chorems

Procedia PDF Downloads 400
38762 Modelling of Powered Roof Supports Work

Authors: Marcin Michalak

Abstract:

Due to the increasing efforts on saving our natural environment a change in the structure of energy resources can be observed - an increasing fraction of a renewable energy sources. In many countries traditional underground coal mining loses its significance but there are still countries, like Poland or Germany, in which the coal based technologies have the greatest fraction in a total energy production. This necessitates to make an effort to limit the costs and negative effects of underground coal mining. The longwall complex is as essential part of the underground coal mining. The safety and the effectiveness of the work is strongly dependent of the diagnostic state of powered roof supports. The building of a useful and reliable diagnostic system requires a lot of data. As the acquisition of a data of any possible operating conditions it is important to have a possibility to generate a demanded artificial working characteristics. In this paper a new approach of modelling a leg pressure in the single unit of powered roof support. The model is a result of the analysis of a typical working cycles.

Keywords: machine modelling, underground mining, coal mining, structure

Procedia PDF Downloads 366
38761 Income-Consumption Relationships in Pakistan (1980-2011): A Cointegration Approach

Authors: Himayatullah Khan, Alena Fedorova

Abstract:

The present paper analyses the income-consumption relationships in Pakistan using annual time series data from 1980-81 to 2010-1. The paper uses the Augmented Dickey-Fuller test to check the unit root and stationarity in these two time series. The paper finds that the two time series are nonstationary but stationary at their first difference levels. The Augmented Engle-Granger test and the Cointegrating Regression Durbin-Watson test imply that the two time series of consumption and income are cointegrated and that long-run marginal propensity to consume is 0.88 which is given by the estimated (static) equilibrium relation. The paper also used the error correction mechanism to find out to model dynamic relationship. The purpose of the ECM is to indicate the speed of adjustment from the short-run equilibrium to the long-run equilibrium state. The results show that MPC is equal to 0.93 and is highly significant. The coefficient of Engle-Granger residuals is negative but insignificant. Statistically, the equilibrium error term is zero, which suggests that consumption adjusts to changes in GDP in the same period. The short-run changes in GDP have a positive impact on short-run changes in consumption. The paper concludes that we may interpret 0.93 as the short-run MPC. The pair-wise Granger Causality test shows that both GDP and consumption Granger cause each other.

Keywords: cointegrating regression, Augmented Dickey Fuller test, Augmented Engle-Granger test, Granger causality, error correction mechanism

Procedia PDF Downloads 414
38760 Discovering User Behaviour Patterns from Web Log Analysis to Enhance the Accessibility and Usability of Website

Authors: Harpreet Singh

Abstract:

Finding relevant information on the World Wide Web is becoming highly challenging day by day. Web usage mining is used for the extraction of relevant and useful knowledge, such as user behaviour patterns, from web access log records. Web access log records all the requests for individual files that the users have requested from the website. Web usage mining is important for Customer Relationship Management (CRM), as it can ensure customer satisfaction as far as the interaction between the customer and the organization is concerned. Web usage mining is helpful in improving website structure or design as per the user’s requirement by analyzing the access log file of a website through a log analyzer tool. The focus of this paper is to enhance the accessibility and usability of a guitar selling web site by analyzing their access log through Deep Log Analyzer tool. The results show that the maximum number of users is from the United States and that they use Opera 9.8 web browser and the Windows XP operating system.

Keywords: web usage mining, web mining, log file, data mining, deep log analyzer

Procedia PDF Downloads 248
38759 A Theoretical Model for Pattern Extraction in Large Datasets

Authors: Muhammad Usman

Abstract:

Pattern extraction has been done in past to extract hidden and interesting patterns from large datasets. Recently, advancements are being made in these techniques by providing the ability of multi-level mining, effective dimension reduction, advanced evaluation and visualization support. This paper focuses on reviewing the current techniques in literature on the basis of these parameters. Literature review suggests that most of the techniques which provide multi-level mining and dimension reduction, do not handle mixed-type data during the process. Patterns are not extracted using advanced algorithms for large datasets. Moreover, the evaluation of patterns is not done using advanced measures which are suited for high-dimensional data. Techniques which provide visualization support are unable to handle a large number of rules in a small space. We present a theoretical model to handle these issues. The implementation of the model is beyond the scope of this paper.

Keywords: association rule mining, data mining, data warehouses, visualization of association rules

Procedia PDF Downloads 222
38758 Emotion Mining and Attribute Selection for Actionable Recommendations to Improve Customer Satisfaction

Authors: Jaishree Ranganathan, Poonam Rajurkar, Angelina A. Tzacheva, Zbigniew W. Ras

Abstract:

In today’s world, business often depends on the customer feedback and reviews. Sentiment analysis helps identify and extract information about the sentiment or emotion of the of the topic or document. Attribute selection is a challenging problem, especially with large datasets in actionable pattern mining algorithms. Action Rule Mining is one of the methods to discover actionable patterns from data. Action Rules are rules that help describe specific actions to be made in the form of conditions that help achieve the desired outcome. The rules help to change from any undesirable or negative state to a more desirable or positive state. In this paper, we present a Lexicon based weighted scheme approach to identify emotions from customer feedback data in the area of manufacturing business. Also, we use Rough sets and explore the attribute selection method for large scale datasets. Then we apply Actionable pattern mining to extract possible emotion change recommendations. This kind of recommendations help business analyst to improve their customer service which leads to customer satisfaction and increase sales revenue.

Keywords: actionable pattern discovery, attribute selection, business data, data mining, emotion

Procedia PDF Downloads 199
38757 Knowledge-Driven Decision Support System Based on Knowledge Warehouse and Data Mining by Improving Apriori Algorithm with Fuzzy Logic

Authors: Pejman Hosseinioun, Hasan Shakeri, Ghasem Ghorbanirostam

Abstract:

In recent years, we have seen an increasing importance of research and study on knowledge source, decision support systems, data mining and procedure of knowledge discovery in data bases and it is considered that each of these aspects affects the others. In this article, we have merged information source and knowledge source to suggest a knowledge based system within limits of management based on storing and restoring of knowledge to manage information and improve decision making and resources. In this article, we have used method of data mining and Apriori algorithm in procedure of knowledge discovery one of the problems of Apriori algorithm is that, a user should specify the minimum threshold for supporting the regularity. Imagine that a user wants to apply Apriori algorithm for a database with millions of transactions. Definitely, the user does not have necessary knowledge of all existing transactions in that database, and therefore cannot specify a suitable threshold. Our purpose in this article is to improve Apriori algorithm. To achieve our goal, we tried using fuzzy logic to put data in different clusters before applying the Apriori algorithm for existing data in the database and we also try to suggest the most suitable threshold to the user automatically.

Keywords: decision support system, data mining, knowledge discovery, data discovery, fuzzy logic

Procedia PDF Downloads 334
38756 Satellite Data to Understand Changes in Carbon Dioxide for Surface Mining and Green Zone

Authors: Carla Palencia-Aguilar

Abstract:

In order to attain the 2050’s zero emissions goal, it is necessary to know the carbon dioxide changes over time either from pollution to attenuations in the mining industry versus at green zones to establish real goals and redirect efforts to reduce greenhouse effects. Two methods were used to compute the amount of CO2 tons in specific mining zones in Colombia. The former by means of NPP with MODIS MOD17A3HGF from years 2000 to 2021. The latter by using MODIS MYD021KM bands 33 to 36 with maximum values of 644 data points distributed in 7 sites corresponding to surface mineral mining of: coal, nickel, iron and limestone. The green zones selected were located at the proximities of the studied sites, but further than 1 km to avoid information overlapping. Year 2012 was selected for method 2 to compare the results with data provided by the Colombian government to determine range of values. Some data was compared with 2022 MODIS energy values and converted to kton of CO2 by using the Greenhouse Gas Equivalencies Calculator by EPA. The results showed that Nickel mining was the least pollutant with 81 kton of CO2 e.q on average and maximum of 102 kton of CO2 e.q. per year, with green zones attenuating carbon dioxide in 103 kton of CO2 on average and 125 kton maximum per year in the last 22 years. Following Nickel, there was Coal with average kton of CO2 per year of 152 and maximum of 188, values very similar to the subjacent green zones with average and maximum kton of CO2 of 157 and 190 respectively. Iron had similar results with respect to 3 Limestone sites with average values of 287 kton of CO2 for mining and 310 kton for green zones, and maximum values of 310 kton for iron mining and 356 kton for green zones. One of the limestone sites exceeded the other sites with an average value of 441 kton per year and maximum of 490 kton per year, eventhough it had higher attenuation by green zones than a close Limestore site (3.5 Km apart): 371 kton versus 281 kton on average and maximum 416 kton versus 323 kton, such vegetation contribution is not enough, meaning that manufacturing process should be improved for the most pollutant site. By comparing bands 33 to 36 for years 2012 and 2022 from January to August, it can be seen that on average the kton of CO2 were similar for mining sites and green zones; showing an average yearly balance of carbon dioxide emissions and attenuation. However, efforts on improving manufacturing process are needed to overcome the carbon dioxide effects specially during emissions’ peaks because surrounding vegetation cannot fully attenuate it.

Keywords: carbon dioxide, MODIS, surface mining, vegetation

Procedia PDF Downloads 99
38755 Medical Knowledge Management since the Integration of Heterogeneous Data until the Knowledge Exploitation in a Decision-Making System

Authors: Nadjat Zerf Boudjettou, Fahima Nader, Rachid Chalal

Abstract:

Knowledge management is to acquire and represent knowledge relevant to a domain, a task or a specific organization in order to facilitate access, reuse and evolution. This usually means building, maintaining and evolving an explicit representation of knowledge. The next step is to provide access to that knowledge, that is to say, the spread in order to enable effective use. Knowledge management in the medical field aims to improve the performance of the medical organization by allowing individuals in the care facility (doctors, nurses, paramedics, etc.) to capture, share and apply collective knowledge in order to make optimal decisions in real time. In this paper, we propose a knowledge management approach based on integration technique of heterogeneous data in the medical field by creating a data warehouse, a technique of extracting knowledge from medical data by choosing a technique of data mining, and finally an exploitation technique of that knowledge in a case-based reasoning system.

Keywords: data warehouse, data mining, knowledge discovery in database, KDD, medical knowledge management, Bayesian networks

Procedia PDF Downloads 394
38754 Small Target Recognition Based on Trajectory Information

Authors: Saad Alkentar, Abdulkareem Assalem

Abstract:

Recognizing small targets has always posed a significant challenge in image analysis. Over long distances, the image signal-to-noise ratio tends to be low, limiting the amount of useful information available to detection systems. Consequently, visual target recognition becomes an intricate task to tackle. In this study, we introduce a Track Before Detect (TBD) approach that leverages target trajectory information (coordinates) to effectively distinguish between noise and potential targets. By reframing the problem as a multivariate time series classification, we have achieved remarkable results. Specifically, our TBD method achieves an impressive 97% accuracy in separating target signals from noise within a mere half-second time span (consisting of 10 data points). Furthermore, when classifying the identified targets into our predefined categories—airplane, drone, and bird—we achieve an outstanding classification accuracy of 96% over a more extended period of 1.5 seconds (comprising 30 data points).

Keywords: small targets, drones, trajectory information, TBD, multivariate time series

Procedia PDF Downloads 44
38753 Oil Demand Forecasting in China: A Structural Time Series Analysis

Authors: Tehreem Fatima, Enjun Xia

Abstract:

The research investigates the relationship between total oil consumption and transport oil consumption, GDP, oil price, and oil reserve in order to forecast future oil demand in China. Annual time series data is used over the period of 1980 to 2015, and for this purpose, an oil demand function is estimated by applying structural time series model (STSM). The technique also uncovers the Underline energy demand trend (UEDT) for China oil demand and GDP, oil reserve, oil price and UEDT are considering important drivers of China oil demand. The long-run elasticity of total oil consumption with respect to GDP and price are (0.5, -0.04) respectively while GDP, oil reserve, and price remain (0.17; 0.23; -0.05) respectively. Moreover, the Estimated results of long-run elasticity of transport oil consumption with respect to GDP and price are (0.5, -0.00) respectively long-run estimates remain (0.28; 37.76;-37.8) for GDP, oil reserve, and price respectively. For both model estimated underline energy demand trend (UEDT) remains nonlinear and stochastic and with an increasing trend of (UEDT) and based on estimated equations, it is predicted that China total oil demand somewhere will be 9.9 thousand barrel per day by 2025 as compare to 9.4 thousand barrel per day in 2015, while transport oil demand predicting value is 9.0 thousand barrel per day by 2020 as compare to 8.8 thousand barrel per day in 2015.

Keywords: china, forecasting, oil, structural time series model (STSM), underline energy demand trend (UEDT)

Procedia PDF Downloads 283
38752 Personalize E-Learning System Based on Clustering and Sequence Pattern Mining Approach

Authors: H. S. Saini, K. Vijayalakshmi, Rishi Sayal

Abstract:

Network-based education has been growing rapidly in size and quality. Knowledge clustering becomes more important in personalized information retrieval for web-learning. A personalized-Learning service after the learners’ knowledge has been classified with clustering. Through automatic analysis of learners’ behaviors, their partition with similar data level and interests may be discovered so as to produce learners with contents that best match educational needs for collaborative learning. We present a specific mining tool and a recommender engine that we have integrated in the online learning in order to help the teacher to carry out the whole e-learning process. We propose to use sequential pattern mining algorithms to discover the most used path by the students and from this information can recommend links to the new students automatically meanwhile they browse in the course. We have Developed a specific author tool in order to help the teacher to apply all the data mining process. We tend to report on many experiments with real knowledge so as to indicate the quality of using both clustering and sequential pattern mining algorithms together for discovering personalized e-learning systems.

Keywords: e-learning, cluster, personalization, sequence, pattern

Procedia PDF Downloads 426
38751 The Women-In-Mining Discourse: A Study Combining Corpus Linguistics and Discourse Analysis

Authors: Ylva Fältholm, Cathrine Norberg

Abstract:

One of the major threats identified to successful future mining is that women do not find the industry attractive. Many attempts have been made, for example in Sweden and Australia, to create organizational structures and mining communities attractive to both genders. Despite such initiatives, many mining areas are developing into gender-segregated fly-in/fly out communities dominated by men with both social and economic consequences. One of the challenges facing many mining companies is thus to break traditional gender patterns and structures. To do this increased knowledge about gender in the context of mining is needed. Since language both constitutes and reproduces knowledge, increased knowledge can be gained through an exploration and description of the mining discourse from a gender perspective. The aim of this study is to explore what conceptual ideas are activated in connection to the physical/geographical mining area and to work within the mining industry. We use a combination of critical discourse analysis implying close reading of selected texts, such as policy documents, interview materials, applications and research and innovation agendas, and analyses of linguistic patterns found in large language corpora covering millions of words of contemporary language production. The quantitative corpus data serves as a point of departure for the qualitative analysis of the texts, that is, suggests what patterns to explore further. The study shows that despite technological and organizational development, one of the most persistent discourses about mining is the conception of dangerous and unfriendly areas infused with traditional notions of masculinity ideals and manual hard work. Although some of the texts analyzed highlight gender issues, and describe gender-equalizing initiatives, such as wage-mapping systems, female networks and recruitment efforts for women executives, and thereby render the discourse less straightforward, it is shown that these texts are not unambiguous examples of a counter-discourse. They rather illustrate that discourses are not stable but include opposing discourses, in dialogue with each other. For example, many texts highlight why and how women are important to mining, at the same time as they suggest that gender and diversity are all about women: why mining is a problem for them, how they should be, and what they should do to fit in. Drawing on a constitutive view of discourse, knowledge about such conflicting perceptions of women is a prerequisite for succeeding in attracting women to the mining industry and thereby contributing to the development of future mining.

Keywords: discourse, corpus linguistics, gender, mining

Procedia PDF Downloads 263
38750 Sequential Pattern Mining from Data of Medical Record with Sequential Pattern Discovery Using Equivalent Classes (SPADE) Algorithm (A Case Study : Bolo Primary Health Care, Bima)

Authors: Rezky Rifaini, Raden Bagus Fajriya Hakim

Abstract:

This research was conducted at the Bolo primary health Care in Bima Regency. The purpose of the research is to find out the association pattern that is formed of medical record database from Bolo Primary health care’s patient. The data used is secondary data from medical records database PHC. Sequential pattern mining technique is the method that used to analysis. Transaction data generated from Patient_ID, Check_Date and diagnosis. Sequential Pattern Discovery Algorithms Using Equivalent Classes (SPADE) is one of the algorithm in sequential pattern mining, this algorithm find frequent sequences of data transaction, using vertical database and sequence join process. Results of the SPADE algorithm is frequent sequences that then used to form a rule. It technique is used to find the association pattern between items combination. Based on association rules sequential analysis with SPADE algorithm for minimum support 0,03 and minimum confidence 0,75 is gotten 3 association sequential pattern based on the sequence of patient_ID, check_Date and diagnosis data in the Bolo PHC.

Keywords: diagnosis, primary health care, medical record, data mining, sequential pattern mining, SPADE algorithm

Procedia PDF Downloads 401
38749 Assessment of Indigenous People Living Condition in Coal Mining Region: An Evidence from Dhanbad, India

Authors: Arun Kumar Yadav

Abstract:

Coal contributes a significant role in India’s developmental mission. But, ironically, on the other side it causes large scale population displacement and significant changes in indigenous people’s livelihood mechanism. Dhanbad which is regarded as one of the oldest and large mining area, as well as a “Coal Capital of India”. Here, mining exploration work started nearly a century ago. But with the passage of time, mining brings a lot of changes in the life of local people. In this context, study tries to do comparative situational analysis of the changes in the living condition of dwellers living in mines affected and non-mines affected villages based on livelihood approach. Since, this place has long history of mining so it is very difficult to conduct before and after comparison between mines and non-mines affected areas. Consequently, the present study is based on relative comparison approach to elucidate the actual scenario. By using primary survey data which was collected by the author during the month of September 2014 to March 2015 at Dhanbad, Jharkhand. The data were collected from eight villages, these were categorised broadly into mines and non-mines affected villages. Further at micro level, mines affected villages has been categorised into open cast and underground mines. This categorization will help us to capture the deeper understanding about the issues of mine affected villages group. Total of 400 household were surveyed. Result depicts that in every sphere mining affected villages are more vulnerable. Regarding financial capital, although mine affected villages are engaged in mining work and get higher mean income. But in contrast, non-mine affected villages are more occupationally diversified. They have an opportunity to earn money from diversified extents like agricultural land, working in mining area, selling coal informally as well as receiving remittances. Non-mines affected villages are in better physical capital which comprises of basic infrastructure to support livelihood. They have an access to secured shelter, adequate water supply & sanitation, and affordable information and transport. Mining affected villages are more prone to health risks. Regarding social capital, it shows that in comparison to last five years, law and order has been improved in mine affected villages.

Keywords: displacement, indigenous, livelihood, mining

Procedia PDF Downloads 311
38748 Change Point Detection Using Random Matrix Theory with Application to Frailty in Elderly Individuals

Authors: Malika Kharouf, Aly Chkeir, Khac Tuan Huynh

Abstract:

Detecting change points in time series data is a challenging problem, especially in scenarios where there is limited prior knowledge regarding the data’s distribution and the nature of the transitions. We present a method designed for detecting changes in the covariance structure of high-dimensional time series data, where the number of variables closely matches the data length. Our objective is to achieve unbiased test statistic estimation under the null hypothesis. We delve into the utilization of Random Matrix Theory to analyze the behavior of our test statistic within a high-dimensional context. Specifically, we illustrate that our test statistic converges pointwise to a normal distribution under the null hypothesis. To assess the effectiveness of our proposed approach, we conduct evaluations on a simulated dataset. Furthermore, we employ our method to examine changes aimed at detecting frailty in the elderly.

Keywords: change point detection, hypothesis tests, random matrix theory, frailty in elderly

Procedia PDF Downloads 51
38747 Secure Multiparty Computations for Privacy Preserving Classifiers

Authors: M. Sumana, K. S. Hareesha

Abstract:

Secure computations are essential while performing privacy preserving data mining. Distributed privacy preserving data mining involve two to more sites that cannot pool in their data to a third party due to the violation of law regarding the individual. Hence in order to model the private data without compromising privacy and information loss, secure multiparty computations are used. Secure computations of product, mean, variance, dot product, sigmoid function using the additive and multiplicative homomorphic property is discussed. The computations are performed on vertically partitioned data with a single site holding the class value.

Keywords: homomorphic property, secure product, secure mean and variance, secure dot product, vertically partitioned data

Procedia PDF Downloads 410
38746 A Stepwise Approach to Automate the Search for Optimal Parameters in Seasonal ARIMA Models

Authors: Manisha Mukherjee, Diptarka Saha

Abstract:

Reliable forecasts of univariate time series data are often necessary for several contexts. ARIMA models are quite popular among practitioners in this regard. Hence, choosing correct parameter values for ARIMA is a challenging yet imperative task. Thus, a stepwise algorithm is introduced to provide automatic and robust estimates for parameters (p; d; q)(P; D; Q) used in seasonal ARIMA models. This process is focused on improvising the overall quality of the estimates, and it alleviates the problems induced due to the unidimensional nature of the methods that are currently used such as auto.arima. The fast and automated search of parameter space also ensures reliable estimates of the parameters that possess several desirable qualities, consequently, resulting in higher test accuracy especially in the cases of noisy data. After vigorous testing on real as well as simulated data, the algorithm doesn’t only perform better than current state-of-the-art methods, it also completely obviates the need for human intervention due to its automated nature.

Keywords: time series, ARIMA, auto.arima, ARIMA parameters, forecast, R function

Procedia PDF Downloads 161
38745 Expanding Trading Strategies By Studying Sentiment Correlation With Data Mining Techniques

Authors: Ved Kulkarni, Karthik Kini

Abstract:

This experiment aims to understand how the media affects the power markets in the mainland United States and study the duration of reaction time between news updates and actual price movements. it have taken into account electric utility companies trading in the NYSE and excluded companies that are more politically involved and move with higher sensitivity to Politics. The scrapper checks for any news related to keywords, which are predefined and stored for each specific company. Based on this, the classifier will allocate the effect into five categories: positive, negative, highly optimistic, highly negative, or neutral. The effect on the respective price movement will be studied to understand the response time. Based on the response time observed, neural networks would be trained to understand and react to changing market conditions, achieving the best strategy in every market. The stock trader would be day trading in the first phase and making option strategy predictions based on the black holes model. The expected result is to create an AI-based system that adjusts trading strategies within the market response time to each price movement.

Keywords: data mining, language processing, artificial neural networks, sentiment analysis

Procedia PDF Downloads 16
38744 Design of Electromagnetic Field of PMSG for VTOL Series-Hybrid UAV

Authors: Sooyoung Cho, In-Gun Kim, Hyun-Seok Hong, Dong-Woo Kang, Ju Lee

Abstract:

Series hybrid UAV(Unmanned aerial vehicle) that is proposed in this paper performs VTOL(Vertical take-off and landing) using the battery and generator, and it applies the series hybrid system with combination of the small engine and generator when cruising flight. This system can be described as the next-generation system that can dramatically increase the UAV flight times. Also, UAV systems require a large energy at the time of VTOL to be conducted for a short time. Therefore, this paper designs PMSG(Permanent Magnet Synchronous Generator) having a high specific power considering VTOL through the FEA.

Keywords: PMSG, VTOL, UAV, high specific power density

Procedia PDF Downloads 516
38743 Identification of Classes of Bilinear Time Series Models

Authors: Anthony Usoro

Abstract:

In this paper, two classes of bilinear time series model are obtained under certain conditions from the general bilinear autoregressive moving average model. Bilinear Autoregressive (BAR) and Bilinear Moving Average (BMA) Models have been identified. From the general bilinear model, BAR and BMA models have been proved to exist for q = Q = 0, => j = 0, and p = P = 0, => i = 0 respectively. These models are found useful in modelling most of the economic and financial data.

Keywords: autoregressive model, bilinear autoregressive model, bilinear moving average model, moving average model

Procedia PDF Downloads 405