Search results for: time series data mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 37636

Search results for: time series data mining

37336 Directional Dust Deposition Measurements: The Influence of Seasonal Changes and the Meteorological Conditions Influencing in Witbank Area and Carletonville Area

Authors: Maphuti Georgina Kwata

Abstract:

Coal mining in Mpumalanga Province is known of contributing to the atmospheric pollution from various activities. Gold mining in North-West Province is known of also contributing to the atmospheric pollution especially with the production of radon gas. In this research directional dust deposition gauge was used to measure source of direction and meteorological data was used to determine the wind rose blowing and the influence of the seasonal changes. Fourteen months of dust collection was undertaken in Witbank Area and Carletonville Area. The results shows that the sources of direction for Ericson Dam its East in February 2010 and Tip Area shows that the source of direction its West in October 2010. In the East direction there were mining operations, power stations which contributed to the East to be the sources of direction. In the West direction there were smelters, power stations and agricultural activities which contributed for the source of direction to be the West direction for Driefontein Mine: East Recreational Village Club. The East of Leslie Williams hospital is the source of direction which also indicated that there dust generating activities such as mining operation, agricultural activities. The meteorological results for Emalahleni Area in summer and winter the wind rose blow with wind speed of 5-10 ms-1 from the East sector. Annual average for the wind rose blow its East South eastern sector with 20 ms-1 and day time the wind rose from northwestern sector with excess of 20 ms-1. The night time wind direction East-eastern direction with a maximum wind speed of 20 ms-1. The meteorogical results for Driefontein Mine show that North-western sector and north-eastern sector wind rose is blowing with 5-10 ms-1 win speed. Day time wind blows from the West sector and night time wind blows from the north sector. In summer the wind blows North-east sector with 5-10 ms-1 and winter wind blows from North-west and it’s also predominant. In spring wind blows from north-east. The conclusion is that not only mining operation where the directional dust deposit gauge were installed contributed to the source of direction also the power stations, smelters, and other activities nearby the mining operation contributed. The recommendations are the dust suppressant for unpaved roads should be used on a regular basis and there should be monitoring of the weather conditions (the wind speed and direction prior to blasting to ensure minimal emissions).

Keywords: directional dust deposition gauge, BS part 5 1747 dust deposit gauge, wind rose, wind blowing

Procedia PDF Downloads 464
37335 L1-Convergence of Modified Trigonometric Sums

Authors: Sandeep Kaur Chouhan, Jatinderdeep Kaur, S. S. Bhatia

Abstract:

The existence of sine and cosine series as a Fourier series, their L1-convergence seems to be one of the difficult question in theory of convergence of trigonometric series in L1-metric norm. In the literature so far available, various authors have studied the L1-convergence of cosine and sine trigonometric series with special coefficients. In this paper, we present a modified cosine and sine sums and criterion for L1-convergence of these modified sums is obtained. Also, a necessary and sufficient condition for the L1-convergence of the cosine and sine series is deduced as corollaries.

Keywords: conjugate Dirichlet kernel, Dirichlet kernel, L1-convergence, modified sums

Procedia PDF Downloads 322
37334 Design of a Small and Medium Enterprise Growth Prediction Model Based on Web Mining

Authors: Yiea Funk Te, Daniel Mueller, Irena Pletikosa Cvijikj

Abstract:

Small and medium enterprises (SMEs) play an important role in the economy of many countries. When the overall world economy is considered, SMEs represent 95% of all businesses in the world, accounting for 66% of the total employment. Existing studies show that the current business environment is characterized as highly turbulent and strongly influenced by modern information and communication technologies, thus forcing SMEs to experience more severe challenges in maintaining their existence and expanding their business. To support SMEs at improving their competitiveness, researchers recently turned their focus on applying data mining techniques to build risk and growth prediction models. However, data used to assess risk and growth indicators is primarily obtained via questionnaires, which is very laborious and time-consuming, or is provided by financial institutes, thus highly sensitive to privacy issues. Recently, web mining (WM) has emerged as a new approach towards obtaining valuable insights in the business world. WM enables automatic and large scale collection and analysis of potentially valuable data from various online platforms, including companies’ websites. While WM methods have been frequently studied to anticipate growth of sales volume for e-commerce platforms, their application for assessment of SME risk and growth indicators is still scarce. Considering that a vast proportion of SMEs own a website, WM bears a great potential in revealing valuable information hidden in SME websites, which can further be used to understand SME risk and growth indicators, as well as to enhance current SME risk and growth prediction models. This study aims at developing an automated system to collect business-relevant data from the Web and predict future growth trends of SMEs by means of WM and data mining techniques. The envisioned system should serve as an 'early recognition system' for future growth opportunities. In an initial step, we examine how structured and semi-structured Web data in governmental or SME websites can be used to explain the success of SMEs. WM methods are applied to extract Web data in a form of additional input features for the growth prediction model. The data on SMEs provided by a large Swiss insurance company is used as ground truth data (i.e. growth-labeled data) to train the growth prediction model. Different machine learning classification algorithms such as the Support Vector Machine, Random Forest and Artificial Neural Network are applied and compared, with the goal to optimize the prediction performance. The results are compared to those from previous studies, in order to assess the contribution of growth indicators retrieved from the Web for increasing the predictive power of the model.

Keywords: data mining, SME growth, success factors, web mining

Procedia PDF Downloads 236
37333 Identifying the Factors affecting on the Success of Energy Usage Saving in Municipality of Tehran

Authors: Rojin Bana Derakhshan, Abbas Toloie

Abstract:

For the purpose of optimizing and developing energy efficiency in building, it is required to recognize key elements of success in optimization of energy consumption before performing any actions. Surveying Principal Components is one of the most valuable result of Linear Algebra because the simple and non-parametric methods are become confusing. So that energy management system implemented according to energy management system international standard ISO50001:2011 and all energy parameters in building to be measured through performing energy auditing. In this essay by simulating used of data mining, the key impressive elements on energy saving in buildings to be determined. This approach is based on data mining statistical techniques using feature selection method and fuzzy logic and convert data from massive to compressed type and used to increase the selected feature. On the other side, influence portion and amount of each energy consumption elements in energy dissipation in percent are recognized as separated norm while using obtained results from energy auditing and after measurement of all energy consuming parameters and identified variables. Accordingly, energy saving solution divided into 3 categories, low, medium and high expense solutions.

Keywords: energy saving, key elements of success, optimization of energy consumption, data mining

Procedia PDF Downloads 434
37332 The Usage of Bridge Estimator for Hegy Seasonal Unit Root Tests

Authors: Huseyin Guler, Cigdem Kosar

Abstract:

The aim of this study is to propose Bridge estimator for seasonal unit root tests. Seasonality is an important factor for many economic time series. Some variables may contain seasonal patterns and forecasts that ignore important seasonal patterns have a high variance. Therefore, it is very important to eliminate seasonality for seasonal macroeconomic data. There are some methods to eliminate the impacts of seasonality in time series. One of them is filtering the data. However, this method leads to undesired consequences in unit root tests, especially if the data is generated by a stochastic seasonal process. Another method to eliminate seasonality is using seasonal dummy variables. Some seasonal patterns may result from stationary seasonal processes, which are modelled using seasonal dummies but if there is a varying and changing seasonal pattern over time, so the seasonal process is non-stationary, deterministic seasonal dummies are inadequate to capture the seasonal process. It is not suitable to use seasonal dummies for modeling such seasonally nonstationary series. Instead of that, it is necessary to take seasonal difference if there are seasonal unit roots in the series. Different alternative methods are proposed in the literature to test seasonal unit roots, such as Dickey, Hazsa, Fuller (DHF) and Hylleberg, Engle, Granger, Yoo (HEGY) tests. HEGY test can be also used to test the seasonal unit root in different frequencies (monthly, quarterly, and semiannual). Another issue in unit root tests is the lag selection. Lagged dependent variables are added to the model in seasonal unit root tests as in the unit root tests to overcome the autocorrelation problem. In this case, it is necessary to choose the lag length and determine any deterministic components (i.e., a constant and trend) first, and then use the proper model to test for seasonal unit roots. However, this two-step procedure might lead size distortions and lack of power in seasonal unit root tests. Recent studies show that Bridge estimators are good in selecting optimal lag length while differentiating nonstationary versus stationary models for nonseasonal data. The advantage of this estimator is the elimination of the two-step nature of conventional unit root tests and this leads a gain in size and power. In this paper, the Bridge estimator is proposed to test seasonal unit roots in a HEGY model. A Monte-Carlo experiment is done to determine the efficiency of this approach and compare the size and power of this method with HEGY test. Since Bridge estimator performs well in model selection, our approach may lead to some gain in terms of size and power over HEGY test.

Keywords: bridge estimators, HEGY test, model selection, seasonal unit root

Procedia PDF Downloads 299
37331 Emotion Classification Using Recurrent Neural Network and Scalable Pattern Mining

Authors: Jaishree Ranganathan, MuthuPriya Shanmugakani Velsamy, Shamika Kulkarni, Angelina Tzacheva

Abstract:

Emotions play an important role in everyday life. An-alyzing these emotions or feelings from social media platforms like Twitter, Facebook, blogs, and forums based on user comments and reviews plays an important role in various factors. Some of them include brand monitoring, marketing strategies, reputation, and competitor analysis. The opinions or sentiments mined from such data helps understand the current state of the user. It does not directly provide intuitive insights on what actions to be taken to benefit the end user or business. Actionable Pattern Mining method provides suggestions or actionable recommendations on what changes or actions need to be taken in order to benefit the end user. In this paper, we propose automatic classification of emotions in Twitter data using Recurrent Neural Network - Gated Recurrent Unit. We achieve training accuracy of 87.58% and validation accuracy of 86.16%. Also, we extract action rules with respect to the user emotion that helps to provide actionable suggestion.

Keywords: emotion mining, twitter, recurrent neural network, gated recurrent unit, actionable pattern mining

Procedia PDF Downloads 133
37330 Analyzing Medical Workflows Using Market Basket Analysis

Authors: Mohit Kumar, Mayur Betharia

Abstract:

Healthcare domain, with the emergence of Electronic Medical Record (EMR), collects a lot of data which have been attracting Data Mining expert’s interest. In the past, doctors have relied on their intuition while making critical clinical decisions. This paper presents the means to analyze the Medical workflows to get business insights out of huge dumped medical databases. Market Basket Analysis (MBA) which is a special data mining technique, has been widely used in marketing and e-commerce field to discover the association between products bought together by customers. It helps businesses in increasing their sales by analyzing the purchasing behavior of customers and pitching the right customer with the right product. This paper is an attempt to demonstrate Market Basket Analysis applications in healthcare. In particular, it discusses the Market Basket Analysis Algorithm ‘Apriori’ applications within healthcare in major areas such as analyzing the workflow of diagnostic procedures, Up-selling and Cross-selling of Healthcare Systems, designing healthcare systems more user-friendly. In the paper, we have demonstrated the MBA applications using Angiography Systems, but can be extrapolated to other modalities as well.

Keywords: data mining, market basket analysis, healthcare applications, knowledge discovery in healthcare databases, customer relationship management, healthcare systems

Procedia PDF Downloads 138
37329 A Suggested Study Plan for Mining Engineering Program in Northern Border University (NBU) to Match the Requirements of the Local Mining Industry

Authors: Mohammad Aljuhani, Yasamina Aljuhani

Abstract:

The Mining Engineering Department at College of Engineering in NBU is under establishment. It is essential to establish such department in NBU. This is because, it is the only university in the region. Moreover, the mining industry is very active in the northern borders region. However, there is no mining engineering department in KSA except one in King Abdulziz University, which is 1400 km from the mining industry in the northern borders. As a result, department graduates from KAU find difficulties to get suitable jobs in their specialization in spite of their few numbers graduated per year and the presence of many jobs vacancies at the local mining sector. Therefore, the objectives of this research are to identify, measure and analyze the above mentioned problem from educational point of view. One more objective is to add a contribution towards solving such vital, society affecting problem. For achieving the first task of the research, that is problem size identification and analyses, a questionnaire was designed. The questionnaire was directed towards experienced engineers, in the mining and related industries, including the ministry of petroleum and minerals, Saudi Geological Survey, and Ma’aden Company as being prospective employers for the mining sector. The questionnaire target was to evaluate the Saudi mining engineers from an industrial point of view and to detect the main reasons behind their failure to find jobs. In addition, the study focuses in the demand of mining engineers in the northern borders region. Moreover, the study plan of the suggested department is designed based on the requirements of the mining industry. The feedback received from the industry reflected major educational shortcomings. In order to overcome the revealed defects, the second objective of the research was achieved where a suggested study plan “curriculum” has been prepared to take into consideration all the points of weakness so as to improve the graduates’ quality to fit the local mining work market.

Keywords: mining engineering, labor market, qualifications, curriculum, mining industry, mining engineers

Procedia PDF Downloads 241
37328 The Impact of Gold Mining on Disability: Experiences from the Obuasi Municipal Area

Authors: Mavis Yaa Konadu Agyemang

Abstract:

Despite provisions to uphold and safeguard the rights of persons with disability in Ghana, there is evidence that they still encounter several challenges which limit their full and effective involvement in mainstream society, including the gold mining sector. The study sought to explore how persons with physical disability (PWPDs) experience gold mining in the Obuasi Municipal Area. A qualitative research design was used to discover and understand the experiences of PWPDs regarding mining. The purposive sampling technique was used to select five key informants for the study with the age range of (24-52 years) while snowball sampling aided the selection of 16 persons with various forms of physical disability with the age range of (24-60 years). In-depth interviews were used to gather data. The interviews lasted from forty-five minutes to an hour. In relation to the setting, the interviews of thirteen (13) of the participants with disability were done in their houses, two (2) were done on the phone, and one (1) was done in the office. Whereas the interviews of the five (5) key informants were all done in their offices. Data were analyzed using Creswell’s (2009) concept of thematic analysis. The findings suggest that even though land degradation affected everyone in the area, persons with mobility and visual impairment experienced many difficulties trekking the undulating land for long distances in search of arable land. Also, although mining activities are mostly labour-intensive, PWPDs were not employed even in areas where they could work. Further, the cost of items, in general, was high, affecting PWPDs more due to their economic immobility and paying for other sources of water due to land degradation and water pollution. The study also discovered that the peculiar conditions of PWPDs were not factored into compensation payments, and neither were females with physical disability engaged in compensation negotiations. Also, although some of the infrastructure provided by the gold mining companies in the area was physically accessible to some extent, it was not accessible in terms of information delivery. There is a need to educate the public on the effects of mining on PWPDs, their needs as well as disability issues in general. The Minerals and Mining Act (703) should be amended to include provisions that would consider the peculiar needs of PWPDs in compensation payment.

Keywords: mining, resettlement, compensation, environmental, social, disability

Procedia PDF Downloads 24
37327 Data Recording for Remote Monitoring of Autonomous Vehicles

Authors: Rong-Terng Juang

Abstract:

Autonomous vehicles offer the possibility of significant benefits to social welfare. However, fully automated cars might not be going to happen in the near further. To speed the adoption of the self-driving technologies, many governments worldwide are passing laws requiring data recorders for the testing of autonomous vehicles. Currently, the self-driving vehicle, (e.g., shuttle bus) has to be monitored from a remote control center. When an autonomous vehicle encounters an unexpected driving environment, such as road construction or an obstruction, it should request assistance from a remote operator. Nevertheless, large amounts of data, including images, radar and lidar data, etc., have to be transmitted from the vehicle to the remote center. Therefore, this paper proposes a data compression method of in-vehicle networks for remote monitoring of autonomous vehicles. Firstly, the time-series data are rearranged into a multi-dimensional signal space. Upon the arrival, for controller area networks (CAN), the new data are mapped onto a time-data two-dimensional space associated with the specific CAN identity. Secondly, the data are sampled based on differential sampling. Finally, the whole set of data are encoded using existing algorithms such as Huffman, arithmetic and codebook encoding methods. To evaluate system performance, the proposed method was deployed on an in-house built autonomous vehicle. The testing results show that the amount of data can be reduced as much as 1/7 compared to the raw data.

Keywords: autonomous vehicle, data compression, remote monitoring, controller area networks (CAN), Lidar

Procedia PDF Downloads 131
37326 Development and Management of Integrated Mineral Resource Policy for Environmental Sustainability: The Mindanao Experience, the Philippines

Authors: Davidson E. Egirani, Nanfe R. Poyi, Napoleon Wessey

Abstract:

This paper would report the environmental challenges faced by stakeholders in the development and management of mineral resources in Mindanao mining region of the Philippines. The paper would proffer solutions via the development and management of integrated mineral resource framework. This is by interfacing the views of government, operating mining companies and the mining host communities. The project methods involved the desktop review of existing local, regional, national environmental and mining legislation. This was followed up with visits to mining sites and discussions were held with stakeholders in the mineral sector. The findings from a 2-year investigation would reveal lack of information, education, and communication campaign by stakeholders on environmental, health, political, and social issues in the mining industry. Small-scale miners lack the professional muscles for a balance shift of emphasis to sustainable and responsible mining to avoid environmental degradation and human health effect. Therefore, there is a need to balance ecological requirements, sustainability of the environment and development of mineral resources. This paper would provide an environmentally friendly mineral resource development framework.

Keywords: ecological requirements, environmental degradation, human health, mining legislation, responsible mining

Procedia PDF Downloads 100
37325 The Predictive Value of Serum Bilirubin in the Post-Transplant De Novo Malignancy: A Data Mining Approach

Authors: Nasim Nosoudi, Amir Zadeh, Hunter White, Joshua Conrad, Joon W. Shim

Abstract:

De novo Malignancy has become one of the major causes of death after transplantation, so early cancer diagnosis and detection can drastically improve survival rates post-transplantation. Most previous work focuses on using artificial intelligence (AI) to predict transplant success or failure outcomes. In this work, we focused on predicting de novo malignancy after liver transplantation using AI. We chose the patients that had malignancy after liver transplantation with no history of malignancy pre-transplant. Their donors were cancer-free as well. We analyzed 254,200 patient profiles with post-transplant malignancy from the US Organ Procurement and Transplantation Network (OPTN). Several popular data mining methods were applied to the resultant dataset to build predictive models to characterize de novo malignancy after liver transplantation. Recipient's bilirubin, creatinine, weight, gender, number of days recipient was on the transplant waiting list, Epstein Barr Virus (EBV), International normalized ratio (INR), and ascites are among the most important factors affecting de novo malignancy after liver transplantation

Keywords: De novo malignancy, bilirubin, data mining, transplantation

Procedia PDF Downloads 74
37324 Localization of Geospatial Events and Hoax Prediction in the UFO Database

Authors: Harish Krishnamurthy, Anna Lafontant, Ren Yi

Abstract:

Unidentified Flying Objects (UFOs) have been an interesting topic for most enthusiasts and hence people all over the United States report such findings online at the National UFO Report Center (NUFORC). Some of these reports are a hoax and among those that seem legitimate, our task is not to establish that these events confirm that they indeed are events related to flying objects from aliens in outer space. Rather, we intend to identify if the report was a hoax as was identified by the UFO database team with their existing curation criterion. However, the database provides a wealth of information that can be exploited to provide various analyses and insights such as social reporting, identifying real-time spatial events and much more. We perform analysis to localize these time-series geospatial events and correlate with known real-time events. This paper does not confirm any legitimacy of alien activity, but rather attempts to gather information from likely legitimate reports of UFOs by studying the online reports. These events happen in geospatial clusters and also are time-based. We look at cluster density and data visualization to search the space of various cluster realizations to decide best probable clusters that provide us information about the proximity of such activity. A random forest classifier is also presented that is used to identify true events and hoax events, using the best possible features available such as region, week, time-period and duration. Lastly, we show the performance of the scheme on various days and correlate with real-time events where one of the UFO reports strongly correlates to a missile test conducted in the United States.

Keywords: time-series clustering, feature extraction, hoax prediction, geospatial events

Procedia PDF Downloads 345
37323 Investigating Dynamic Transition Process of Issues Using Unstructured Text Analysis

Authors: Myungsu Lim, William Xiu Shun Wong, Yoonjin Hyun, Chen Liu, Seongi Choi, Dasom Kim, Namgyu Kim

Abstract:

The amount of real-time data generated through various mass media has been increasing rapidly. In this study, we had performed topic analysis by using the unstructured text data that is distributed through news article. As one of the most prevalent applications of topic analysis, the issue tracking technique investigates the changes of the social issues that identified through topic analysis. Currently, traditional issue tracking is conducted by identifying the main topics of documents that cover an entire period at the same time and analyzing the occurrence of each topic by the period of occurrence. However, this traditional issue tracking approach has limitation that it cannot discover dynamic mutation process of complex social issues. The purpose of this study is to overcome the limitations of the existing issue tracking method. We first derived core issues of each period, and then discover the dynamic mutation process of various issues. In this study, we further analyze the mutation process from the perspective of the issues categories, in order to figure out the pattern of issue flow, including the frequency and reliability of the pattern. In other words, this study allows us to understand the components of the complex issues by tracking the dynamic history of issues. This methodology can facilitate a clearer understanding of complex social phenomena by providing mutation history and related category information of the phenomena.

Keywords: Data Mining, Issue Tracking, Text Mining, topic Analysis, topic Detection, Trend Detection

Procedia PDF Downloads 377
37322 Phillips Curve Estimation in an Emerging Economy: Evidence from Sub-National Data of Indonesia

Authors: Harry Aginta

Abstract:

Using Phillips curve framework, this paper seeks for new empirical evidence on the relationship between inflation and output in a major emerging economy. By exploiting sub-national data, the contribution of this paper is threefold. First, it resolves the issue of using on-target national inflation rates that potentially causes weakening inflation-output nexus. This is very relevant for Indonesia as its central bank has been adopting inflation targeting framework based on national consumer price index (CPI) inflation. Second, the study tests the relevance of mining sector in output gap estimation. The test for mining sector is important to control for the effects of mining regulation and nominal effects of coal prices on real economic activities. Third, the paper applies panel econometric method by incorporating regional variation that help to improve model estimation. The results from this paper confirm the strong presence of Phillips curve in Indonesia. Positive output gap that reflects excess demand condition gives rise to the inflation rates. In addition, the elasticity of output gap is higher if the mining sector is excluded from output gap estimation. In addition to inflation adaptation, the dynamics of exchange rate and international commodity price are also found to affect inflation significantly. The results are robust to the alternative measurement of output gap

Keywords: Phillips curve, inflation, Indonesia, panel data

Procedia PDF Downloads 95
37321 Modeling Activity Pattern Using XGBoost for Mining Smart Card Data

Authors: Eui-Jin Kim, Hasik Lee, Su-Jin Park, Dong-Kyu Kim

Abstract:

Smart-card data are expected to provide information on activity pattern as an alternative to conventional person trip surveys. The focus of this study is to propose a method for training the person trip surveys to supplement the smart-card data that does not contain the purpose of each trip. We selected only available features from smart card data such as spatiotemporal information on the trip and geographic information system (GIS) data near the stations to train the survey data. XGboost, which is state-of-the-art tree-based ensemble classifier, was used to train data from multiple sources. This classifier uses a more regularized model formalization to control the over-fitting and show very fast execution time with well-performance. The validation results showed that proposed method efficiently estimated the trip purpose. GIS data of station and duration of stay at the destination were significant features in modeling trip purpose.

Keywords: activity pattern, data fusion, smart-card, XGboost

Procedia PDF Downloads 214
37320 A General Framework for Knowledge Discovery from Echocardiographic and Natural Images

Authors: S. Nandagopalan, N. Pradeep

Abstract:

The aim of this paper is to propose a general framework for storing, analyzing, and extracting knowledge from two-dimensional echocardiographic images, color Doppler images, non-medical images, and general data sets. A number of high performance data mining algorithms have been used to carry out this task. Our framework encompasses four layers namely physical storage, object identification, knowledge discovery, user level. Techniques such as active contour model to identify the cardiac chambers, pixel classification to segment the color Doppler echo image, universal model for image retrieval, Bayesian method for classification, parallel algorithms for image segmentation, etc., were employed. Using the feature vector database that have been efficiently constructed, one can perform various data mining tasks like clustering, classification, etc. with efficient algorithms along with image mining given a query image. All these facilities are included in the framework that is supported by state-of-the-art user interface (UI). The algorithms were tested with actual patient data and Coral image database and the results show that their performance is better than the results reported already.

Keywords: active contour, Bayesian, echocardiographic image, feature vector

Procedia PDF Downloads 416
37319 Reduction of Plants Biodiversity in Hyrcanian Forest by Coal Mining Activities

Authors: Mahsa Tavakoli, Seyed Mohammad Hojjati, Yahya Kooch

Abstract:

Considering that coal mining is one of the important industrial activities, it may cause damages to environment. According to the author’s best knowledge, the effect of traditional coal mining activities on plant biodiversity has not been investigated in the Hyrcanian forests. Therefore, in this study, the effect of coal mining activities on vegetation and tree diversity was investigated in Hyrcanian forest, North Iran. After filed visiting and determining the mine, 16 plots (20×20 m2) were established by systematic-randomly (60×60 m2) in an area of 4 ha (200×200 m2-mine entrance placed at center). An area adjacent to the mine was not affected by the mining activity, and it is considered as the control area. In each plot, the data about trees such as number and type of species were recorded. The biodiversity of vegetation cover was considered 5 square sub-plots (1 m2) in each plot. PAST software and Ecological Methodology were used to calculate Biodiversity indices. The value of Shannon Wiener and Simpson diversity indices for tree cover in control area (1.04±0.34 and 0.62±0.20) was significantly higher than mining area (0.78±0.27 and 0.45±0.14). The value of evenness indices for tree cover in the mining area was significantly lower than that of the control area. The value of Shannon Wiener and Simpson diversity indices for vegetation cover in the control area (1.37±0.06 and 0.69±0.02) was significantly higher than the mining area (1.02±0.13 and 0.50±0.07). The value of evenness index in the control area was significantly higher than the mining area. Plant communities are a good indicator of the changes in the site. Study about changes in vegetation biodiversity and plant dynamics in the degraded land can provide necessary information for forest management and reforestation of these areas.

Keywords: vegetation biodiversity, species composition, traditional coal mining, Caspian forest

Procedia PDF Downloads 149
37318 Review of Friction Stir Welding of Dissimilar 5000 and 6000 Series Aluminum Alloy Plates

Authors: K. Subbaiah

Abstract:

Friction stir welding is a solid state welding process. Friction stir welding process eliminates the defects found in fusion welding processes. It is environmentally friend process. 5000 and 6000 series aluminum alloys are widely used in the transportation industries. The Al-Mg-Mn (5000) and Al-Mg-Si (6000) alloys are preferably offer best combination of use in Marine construction. The medium strength and high corrosion resistant 5000 series alloys are the aluminum alloys, which are found maximum utility in the world. In this review, the tool pin profile, process parameters such as hardness, yield strength and tensile strength, and microstructural evolution of friction stir welding of Al-Mg alloys 5000 Series and 6000 series have been discussed.

Keywords: 5000 series and 6000 series Al alloys, friction stir welding, tool pin profile, microstructure and properties

Procedia PDF Downloads 429
37317 Arabic Light Stemmer for Better Search Accuracy

Authors: Sahar Khedr, Dina Sayed, Ayman Hanafy

Abstract:

Arabic is one of the most ancient and critical languages in the world. It has over than 250 million Arabic native speakers and more than twenty countries having Arabic as one of its official languages. In the past decade, we have witnessed a rapid evolution in smart devices, social network and technology sector which led to the need to provide tools and libraries that properly tackle the Arabic language in different domains. Stemming is one of the most crucial linguistic fundamentals. It is used in many applications especially in information extraction and text mining fields. The motivation behind this work is to enhance the Arabic light stemmer to serve the data mining industry and leverage it in an open source community. The presented implementation works on enhancing the Arabic light stemmer by utilizing and enhancing an algorithm that provides an extension for a new set of rules and patterns accompanied by adjusted procedure. This study has proven a significant enhancement for better search accuracy with an average 10% improvement in comparison with previous works.

Keywords: Arabic data mining, Arabic Information extraction, Arabic Light stemmer, Arabic stemmer

Procedia PDF Downloads 280
37316 Forecasting Model to Predict Dengue Incidence in Malaysia

Authors: W. H. Wan Zakiyatussariroh, A. A. Nasuhar, W. Y. Wan Fairos, Z. A. Nazatul Shahreen

Abstract:

Forecasting dengue incidence in a population can provide useful information to facilitate the planning of the public health intervention. Many studies on dengue cases in Malaysia were conducted but are limited in modeling the outbreak and forecasting incidence. This article attempts to propose the most appropriate time series model to explain the behavior of dengue incidence in Malaysia for the purpose of forecasting future dengue outbreaks. Several seasonal auto-regressive integrated moving average (SARIMA) models were developed to model Malaysia’s number of dengue incidence on weekly data collected from January 2001 to December 2011. SARIMA (2,1,1)(1,1,1)52 model was found to be the most suitable model for Malaysia’s dengue incidence with the least value of Akaike information criteria (AIC) and Bayesian information criteria (BIC) for in-sample fitting. The models further evaluate out-sample forecast accuracy using four different accuracy measures. The results indicate that SARIMA (2,1,1)(1,1,1)52 performed well for both in-sample fitting and out-sample evaluation.

Keywords: time series modeling, Box-Jenkins, SARIMA, forecasting

Procedia PDF Downloads 447
37315 Opinion Mining and Sentiment Analysis on DEFT

Authors: Najiba Ouled Omar, Azza Harbaoui, Henda Ben Ghezala

Abstract:

Current research practices sentiment analysis with a focus on social networks, DEfi Fouille de Texte (DEFT) (Text Mining Challenge) evaluation campaign focuses on opinion mining and sentiment analysis on social networks, especially social network Twitter. It aims to confront the systems produced by several teams from public and private research laboratories. DEFT offers participants the opportunity to work on regularly renewed themes and proposes to work on opinion mining in several editions. The purpose of this article is to scrutinize and analyze the works relating to opinions mining and sentiment analysis in the Twitter social network realized by DEFT. It examines the tasks proposed by the organizers of the challenge and the methods used by the participants.

Keywords: opinion mining, sentiment analysis, emotion, polarity, annotation, OSEE, figurative language, DEFT, Twitter, Tweet

Procedia PDF Downloads 110
37314 Real-Time Visualization Using GPU-Accelerated Filtering of LiDAR Data

Authors: Sašo Pečnik, Borut Žalik

Abstract:

This paper presents a real-time visualization technique and filtering of classified LiDAR point clouds. The visualization is capable of displaying filtered information organized in layers by the classification attribute saved within LiDAR data sets. We explain the used data structure and data management, which enables real-time presentation of layered LiDAR data. Real-time visualization is achieved with LOD optimization based on the distance from the observer without loss of quality. The filtering process is done in two steps and is entirely executed on the GPU and implemented using programmable shaders.

Keywords: filtering, graphics, level-of-details, LiDAR, real-time visualization

Procedia PDF Downloads 272
37313 Grid and Market Integration of Large Scale Wind Farms using Advanced Predictive Data Mining Techniques

Authors: Umit Cali

Abstract:

The integration of intermittent energy sources like wind farms into the electricity grid has become an important challenge for the utilization and control of electric power systems, because of the fluctuating behaviour of wind power generation. Wind power predictions improve the economic and technical integration of large amounts of wind energy into the existing electricity grid. Trading, balancing, grid operation, controllability and safety issues increase the importance of predicting power output from wind power operators. Therefore, wind power forecasting systems have to be integrated into the monitoring and control systems of the transmission system operator (TSO) and wind farm operators/traders. The wind forecasts are relatively precise for the time period of only a few hours, and, therefore, relevant with regard to Spot and Intraday markets. In this work predictive data mining techniques are applied to identify a statistical and neural network model or set of models that can be used to predict wind power output of large onshore and offshore wind farms. These advanced data analytic methods helps us to amalgamate the information in very large meteorological, oceanographic and SCADA data sets into useful information and manageable systems. Accurate wind power forecasts are beneficial for wind plant operators, utility operators, and utility customers. An accurate forecast allows grid operators to schedule economically efficient generation to meet the demand of electrical customers. This study is also dedicated to an in-depth consideration of issues such as the comparison of day ahead and the short-term wind power forecasting results, determination of the accuracy of the wind power prediction and the evaluation of the energy economic and technical benefits of wind power forecasting.

Keywords: renewable energy sources, wind power, forecasting, data mining, big data, artificial intelligence, energy economics, power trading, power grids

Procedia PDF Downloads 482
37312 A Long Short-Term Memory Based Deep Learning Model for Corporate Bond Price Predictions

Authors: Vikrant Gupta, Amrit Goswami

Abstract:

The fixed income market forms the basis of the modern financial market. All other assets in financial markets derive their value from the bond market. Owing to its over-the-counter nature, corporate bonds have relatively less data publicly available and thus is researched upon far less compared to Equities. Bond price prediction is a complex financial time series forecasting problem and is considered very crucial in the domain of finance. The bond prices are highly volatile and full of noise which makes it very difficult for traditional statistical time-series models to capture the complexity in series patterns which leads to inefficient forecasts. To overcome the inefficiencies of statistical models, various machine learning techniques were initially used in the literature for more accurate forecasting of time-series. However, simple machine learning methods such as linear regression, support vectors, random forests fail to provide efficient results when tested on highly complex sequences such as stock prices and bond prices. hence to capture these intricate sequence patterns, various deep learning-based methodologies have been discussed in the literature. In this study, a recurrent neural network-based deep learning model using long short term networks for prediction of corporate bond prices has been discussed. Long Short Term networks (LSTM) have been widely used in the literature for various sequence learning tasks in various domains such as machine translation, speech recognition, etc. In recent years, various studies have discussed the effectiveness of LSTMs in forecasting complex time-series sequences and have shown promising results when compared to other methodologies. LSTMs are a special kind of recurrent neural networks which are capable of learning long term dependencies due to its memory function which traditional neural networks fail to capture. In this study, a simple LSTM, Stacked LSTM and a Masked LSTM based model has been discussed with respect to varying input sequences (three days, seven days and 14 days). In order to facilitate faster learning and to gradually decompose the complexity of bond price sequence, an Empirical Mode Decomposition (EMD) has been used, which has resulted in accuracy improvement of the standalone LSTM model. With a variety of Technical Indicators and EMD decomposed time series, Masked LSTM outperformed the other two counterparts in terms of prediction accuracy. To benchmark the proposed model, the results have been compared with traditional time series models (ARIMA), shallow neural networks and above discussed three different LSTM models. In summary, our results show that the use of LSTM models provide more accurate results and should be explored more within the asset management industry.

Keywords: bond prices, long short-term memory, time series forecasting, empirical mode decomposition

Procedia PDF Downloads 105
37311 A General Framework for Knowledge Discovery Using High Performance Machine Learning Algorithms

Authors: S. Nandagopalan, N. Pradeep

Abstract:

The aim of this paper is to propose a general framework for storing, analyzing, and extracting knowledge from two-dimensional echocardiographic images, color Doppler images, non-medical images, and general data sets. A number of high performance data mining algorithms have been used to carry out this task. Our framework encompasses four layers namely physical storage, object identification, knowledge discovery, user level. Techniques such as active contour model to identify the cardiac chambers, pixel classification to segment the color Doppler echo image, universal model for image retrieval, Bayesian method for classification, parallel algorithms for image segmentation, etc., were employed. Using the feature vector database that have been efficiently constructed, one can perform various data mining tasks like clustering, classification, etc. with efficient algorithms along with image mining given a query image. All these facilities are included in the framework that is supported by state-of-the-art user interface (UI). The algorithms were tested with actual patient data and Coral image database and the results show that their performance is better than the results reported already.

Keywords: active contour, bayesian, echocardiographic image, feature vector

Procedia PDF Downloads 389
37310 High Performance Computing and Big Data Analytics

Authors: Branci Sarra, Branci Saadia

Abstract:

Because of the multiplied data growth, many computer science tools have been developed to process and analyze these Big Data. High-performance computing architectures have been designed to meet the treatment needs of Big Data (view transaction processing standpoint, strategic, and tactical analytics). The purpose of this article is to provide a historical and global perspective on the recent trend of high-performance computing architectures especially what has a relation with Analytics and Data Mining.

Keywords: high performance computing, HPC, big data, data analysis

Procedia PDF Downloads 484
37309 Determining Abnomal Behaviors in UAV Robots for Trajectory Control in Teleoperation

Authors: Kiwon Yeom

Abstract:

Change points are abrupt variations in a data sequence. Detection of change points is useful in modeling, analyzing, and predicting time series in application areas such as robotics and teleoperation. In this paper, a change point is defined to be a discontinuity in one of its derivatives. This paper presents a reliable method for detecting discontinuities within a three-dimensional trajectory data. The problem of determining one or more discontinuities is considered in regular and irregular trajectory data from teleoperation. We examine the geometric detection algorithm and illustrate the use of the method on real data examples.

Keywords: change point, discontinuity, teleoperation, abrupt variation

Procedia PDF Downloads 137
37308 Comparing Performance of Neural Network and Decision Tree in Prediction of Myocardial Infarction

Authors: Reza Safdari, Goli Arji, Robab Abdolkhani Maryam zahmatkeshan

Abstract:

Background and purpose: Cardiovascular diseases are among the most common diseases in all societies. The most important step in minimizing myocardial infarction and its complications is to minimize its risk factors. The amount of medical data is increasingly growing. Medical data mining has a great potential for transforming these data into information. Using data mining techniques to generate predictive models for identifying those at risk for reducing the effects of the disease is very helpful. The present study aimed to collect data related to risk factors of heart infarction from patients’ medical record and developed predicting models using data mining algorithm. Methods: The present work was an analytical study conducted on a database containing 350 records. Data were related to patients admitted to Shahid Rajaei specialized cardiovascular hospital, Iran, in 2011. Data were collected using a four-sectioned data collection form. Data analysis was performed using SPSS and Clementine version 12. Seven predictive algorithms and one algorithm-based model for predicting association rules were applied to the data. Accuracy, precision, sensitivity, specificity, as well as positive and negative predictive values were determined and the final model was obtained. Results: five parameters, including hypertension, DLP, tobacco smoking, diabetes, and A+ blood group, were the most critical risk factors of myocardial infarction. Among the models, the neural network model was found to have the highest sensitivity, indicating its ability to successfully diagnose the disease. Conclusion: Risk prediction models have great potentials in facilitating the management of a patient with a specific disease. Therefore, health interventions or change in their life style can be conducted based on these models for improving the health conditions of the individuals at risk.

Keywords: decision trees, neural network, myocardial infarction, Data Mining

Procedia PDF Downloads 404
37307 Missing Link Data Estimation with Recurrent Neural Network: An Application Using Speed Data of Daegu Metropolitan Area

Authors: JaeHwan Yang, Da-Woon Jeong, Seung-Young Kho, Dong-Kyu Kim

Abstract:

In terms of ITS, information on link characteristic is an essential factor for plan or operation. But in practical cases, not every link has installed sensors on it. The link that does not have data on it is called “Missing Link”. The purpose of this study is to impute data of these missing links. To get these data, this study applies the machine learning method. With the machine learning process, especially for the deep learning process, missing link data can be estimated from present link data. For deep learning process, this study uses “Recurrent Neural Network” to take time-series data of road. As input data, Dedicated Short-range Communications (DSRC) data of Dalgubul-daero of Daegu Metropolitan Area had been fed into the learning process. Neural Network structure has 17 links with present data as input, 2 hidden layers, for 1 missing link data. As a result, forecasted data of target link show about 94% of accuracy compared with actual data.

Keywords: data estimation, link data, machine learning, road network

Procedia PDF Downloads 484