Search results for: frequent item sets mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3698

Search results for: frequent item sets mining

3398 Location-Domination on Join of Two Graphs and Their Complements

Authors: Analen Malnegro, Gina Malacas

Abstract:

Dominating sets and related topics have been studied extensively in the past few decades. A dominating set of a graph G is a subset D of V such that every vertex not in D is adjacent to at least one member of D. The domination number γ(G) is the number of vertices in a smallest dominating set for G. Some problems involving detection devices can be modeled with graphs. Finding the minimum number of devices needed according to the type of devices and the necessity of locating the object gives rise to locating-dominating sets. A subset S of vertices of a graph G is called locating-dominating set, LD-set for short, if it is a dominating set and if every vertex v not in S is uniquely determined by the set of neighbors of v belonging to S. The location-domination number λ(G) is the minimum cardinality of an LD-set for G. The complement of a graph G is a graph Ḡ on same vertices such that two distinct vertices of Ḡ are adjacent if and only if they are not adjacent in G. An LD-set of a graph G is global if it is an LD-set of both G and its complement Ḡ. The global location-domination number λg(G) is defined as the minimum cardinality of a global LD-set of G. In this paper, global LD-sets on the join of two graphs are characterized. Global location-domination numbers of these graphs are also determined.

Keywords: dominating set, global locating-dominating set, global location-domination number, locating-dominating set, location-domination number

Procedia PDF Downloads 184
3397 A New Learning Automata-Based Algorithm to the Priority-Based Target Coverage Problem in Directional Sensor Networks

Authors: Shaharuddin Salleh, Sara Marouf, Hosein Mohammadi

Abstract:

Directional sensor networks (DSNs) have recently attracted a great deal of attention due to their extensive applications in a wide range of situations. One of the most important problems associated with DSNs is covering a set of targets in a given area and, at the same time, maximizing the network lifetime. This is due to limitation in sensing angle and battery power of the directional sensors. This problem gets more complicated by the possibility that targets may have different coverage requirements. In the present study, this problem is referred to as priority-based target coverage (PTC). As sensors are often densely deployed, organizing the sensors into several cover sets and then activating these cover sets successively is a promising solution to this problem. In this paper, we propose a learning automata-based algorithm to organize the directional sensors into several cover sets in such a way that each cover set could satisfy coverage requirements of all the targets. Several experiments are conducted to evaluate the performance of the proposed algorithm. The results demonstrated that the algorithms were able to contribute to solving the problem.

Keywords: directional sensor networks, target coverage problem, cover set formation, learning automata

Procedia PDF Downloads 415
3396 Mining in Peru and Local Governance: Assessing the Contribution of CRS Projects

Authors: Sandra Carrillo Hoyos

Abstract:

Mining activities in South America have significantly grown during the last decades, given the abundance of natural resources, the implemented governmental policies to incentivize foreign investment as well as the boom in international prices for metals and oil between 2002 and 2008. While this context allowed the region to occupy a leading position between the top producers of minerals around the world, it has also meant an increase in socio-environmental conflicts which have generated costs and negative impacts not only for the companies but especially for the governments and local communities.During the latest decade, the mining sector in Peru has faced with the social resistance of a large number of communities, which began organizing actions against the implementation of high investing projects. The dissatisfaction has derived in the prevalence of socio-environmental conflicts associated with mining activities, some of them never solved into an agreement. In order to prevent those socio-environmental conflicts and obtain the social license from local communities, most of the mining companies have developed diverse initiatives within the framework of policies and practices of corporate social responsibility (CSR). This paper has assessed the mining sector’s contribution toward the local development management along the last decade, as part of CSR strategies as well as the policies promoted by the Peruvian State. This assessment found that, in the beginning, these initiatives have been based on a philanthropic approach and were reacting to pressures from local stakeholders to maintain the consent to operate from the surrounding communities as well as to create, as a result, a harmonious atmosphere for operations. Due to the weak State presence, such practices have increased the expectations of communities related to the participation of mining companies in solving structural development problems, especially those related to primary needs, infrastructure, education, health, among others. In other words, this paper was focused on analyze in what extent these initiatives have promoted local empowerment for development planning and integrated management of natural resources from a territorial approach. From this perspective, the analysis demonstrates that, while the design and planning of social investment initiatives have improved due to the sector´s sustainability approach, many companies have developed actions beyond their competence during this process. In some cases, the referenced actions have generated dependency with communities, even though this relationship has not exempted the companies of conflict situations with unfortunate consequences. Furthermore, the social programs developed have not necessarily generated a significant impact in improving the quality of life of affected populations. In fact, it is possible to identify that those regions with high mining resources and investment are facing with a situation of poverty and high dependency on mining production. In spite of the revenues derived from mining industry, local governments have not been able to translate the royalties into sustainable development opportunities. For this reason, the proposed paper suggests some challenges for the mining sector contribution to local development based on the best practices and lessons learnt from a benchmarking for the leading mining companies.

Keywords: corporate social responsibility, local development, mining, socio-environmental conflict

Procedia PDF Downloads 408
3395 A Study on the Different Components of a Typical Back-Scattered Chipless RFID Tag Reflection

Authors: Fatemeh Babaeian, Nemai Chandra Karmakar

Abstract:

Chipless RFID system is a wireless system for tracking and identification which use passive tags for encoding data. The advantage of using chipless RFID tag is having a planar tag which is printable on different low-cost materials like paper and plastic. The printed tag can be attached to different items in the labelling level. Since the price of chipless RFID tag can be as low as a fraction of a cent, this technology has the potential to compete with the conventional optical barcode labels. However, due to the passive structure of the tag, data processing of the reflection signal is a crucial challenge. The captured reflected signal from a tag attached to an item consists of different components which are the reflection from the reader antenna, the reflection from the item, the tag structural mode RCS component and the antenna mode RCS of the tag. All these components are summed up in both time and frequency domains. The effect of reflection from the item and the structural mode RCS component can distort/saturate the frequency domain signal and cause difficulties in extracting the desired component which is the antenna mode RCS. Therefore, it is required to study the reflection of the tag in both time and frequency domains to have a better understanding of the nature of the captured chipless RFID signal. The other benefits of this study can be to find an optimised encoding technique in tag design level and to find the best processing algorithm the chipless RFID signal in decoding level. In this paper, the reflection from a typical backscattered chipless RFID tag with six resonances is analysed, and different components of the signal are separated in both time and frequency domains. Moreover, the time domain signal corresponding to each resonator of the tag is studied. The data for this processing was captured from simulation in CST Microwave Studio 2017. The outcome of this study is understanding different components of a measured signal in a chipless RFID system and a discovering a research gap which is a need to find an optimum detection algorithm for tag ID extraction.

Keywords: antenna mode RCS, chipless RFID tag, resonance, structural mode RCS

Procedia PDF Downloads 200
3394 Factor Structure of the University of California, Los Angeles (UCLA) Loneliness Scale: Gender, Age, and Marital Status Differences

Authors: Hamzeh Dodeen

Abstract:

This study aims at examining the effects of item wording effects on the factor structure of the University of California, Los Angeles (UCLA) Loneliness Scale: gender, age, and marital status differences. A total of 2374 persons from the UAE participated, representing six different populations (teenagers/elderly, males/females, and married/unmarried). The results of the exploratory factor analysis using principal axis factoring with (oblique) rotation revealed that two factors were extracted from the 20 items of the scale. The nine positively worded items were highly loaded on the first factor, while 10 out of the 11 negatively worded items were highly loaded on the second factor. The two-factor solution was confirmed on the six different populations based on age, gender, and marital status. It has been concluded that the rating of the UCLA scale is affected by a response style related to the item wording.

Keywords: UCLA Loneliness Scale, loneliness, positively worded items, factor structure, negatively worded items

Procedia PDF Downloads 354
3393 Lead and Cadmium Spatial Pattern and Risk Assessment around Coal Mine in Hyrcanian Forest, North Iran

Authors: Mahsa Tavakoli, Seyed Mohammad Hojjati, Yahya Kooch

Abstract:

In this study, the effect of coal mining activities on lead and cadmium concentrations and distribution in soil was investigated in Hyrcanian forest, North Iran. 16 plots (20×20 m2) were established by systematic-randomly (60×60 m2) in an area of 4 ha (200×200 m2-mine entrance placed at center). An area adjacent to the mine was not affected by the mining activity; considered as the controlled area. In order to investigate soil lead and cadmium concentration, one sample was taken from the 0-10 cm in each plot. To study the spatial pattern of soil properties and lead and cadmium concentrations in the mining area, an area of 80×80m2 (the mine as the center) was considered and 80 soil samples were systematic-randomly taken (10 m intervals). Geostatistical analysis was performed via Kriging method and GS+ software (version 5.1). In order to estimate the impact of coal mining activities on soil quality, pollution index was measured. Lead and cadmium concentrations were significantly higher in mine area (Pb: 10.97±0.30, Cd: 184.47±6.26 mg.kg-1) in comparison to control area (Pb: 9.42±0.17, Cd: 131.71±15.77 mg.kg-1). The mean values of the PI index indicate that Pb (1.16) and Cd (1.77) presented slightly polluted. Results of the NIPI index showed that Pb (1.44) and Cd (2.52) presented slight pollution and moderate pollution respectively. Results of variography and kriging method showed that it is possible to prepare interpolation maps of lead and cadmium around the mining areas in Hyrcanian forest. According to results of pollution and risk assessments, forest soil was contaminated by heavy metals (lead and cadmium); therefore, using reclamation and remediation techniques in these areas is necessary.

Keywords: traditional coal mining, heavy metals, pollution indicators, geostatistics, Caspian forest

Procedia PDF Downloads 180
3392 Building Scalable and Accurate Hybrid Kernel Mapping Recommender

Authors: Hina Iqbal, Mustansar Ali Ghazanfar, Sandor Szedmak

Abstract:

Recommender systems uses artificial intelligence practices for filtering obscure information and can predict if a user likes a specified item. Kernel mapping Recommender systems have been proposed which are accurate and state-of-the-art algorithms and resolve recommender system’s design objectives such as; long tail, cold-start, and sparsity. The aim of research is to propose hybrid framework that can efficiently integrate different versions— namely item-based and user-based KMR— of KMR algorithm. We have proposed various heuristic algorithms that integrate different versions of KMR (into a unified framework) resulting in improved accuracy and elimination of problems associated with conventional recommender system. We have tested our system on publically available movies dataset and benchmark with KMR. The results (in terms of accuracy, precision, recall, F1 measure and ROC metrics) reveal that the proposed algorithm is quite accurate especially under cold-start and sparse scenarios.

Keywords: Kernel Mapping Recommender Systems, hybrid recommender systems, cold start, sparsity, long tail

Procedia PDF Downloads 341
3391 Analysis of Production Forecasting in Unconventional Gas Resources Development Using Machine Learning and Data-Driven Approach

Authors: Dongkwon Han, Sangho Kim, Sunil Kwon

Abstract:

Unconventional gas resources have dramatically changed the future energy landscape. Unlike conventional gas resources, the key challenges in unconventional gas have been the requirement that applies to advanced approaches for production forecasting due to uncertainty and complexity of fluid flow. In this study, artificial neural network (ANN) model which integrates machine learning and data-driven approach was developed to predict productivity in shale gas. The database of 129 wells of Eagle Ford shale basin used for testing and training of the ANN model. The Input data related to hydraulic fracturing, well completion and productivity of shale gas were selected and the output data is a cumulative production. The performance of the ANN using all data sets, clustering and variables importance (VI) models were compared in the mean absolute percentage error (MAPE). ANN model using all data sets, clustering, and VI were obtained as 44.22%, 10.08% (cluster 1), 5.26% (cluster 2), 6.35%(cluster 3), and 32.23% (ANN VI), 23.19% (SVM VI), respectively. The results showed that the pre-trained ANN model provides more accurate results than the ANN model using all data sets.

Keywords: unconventional gas, artificial neural network, machine learning, clustering, variables importance

Procedia PDF Downloads 196
3390 Quantile Coherence Analysis: Application to Precipitation Data

Authors: Yaeji Lim, Hee-Seok Oh

Abstract:

The coherence analysis measures the linear time-invariant relationship between two data sets and has been studied various fields such as signal processing, engineering, and medical science. However classical coherence analysis tends to be sensitive to outliers and focuses only on mean relationship. In this paper, we generalized cross periodogram to quantile cross periodogram and provide richer inter-relationship between two data sets. This is a general version of Laplace cross periodogram. We prove its asymptotic distribution under the long range process and compare them with ordinary coherence through numerical examples. We also present real data example to confirm the usefulness of quantile coherence analysis.

Keywords: coherence, cross periodogram, spectrum, quantile

Procedia PDF Downloads 392
3389 Study and Analysis of the Factors Affecting Road Safety Using Decision Tree Algorithms

Authors: Naina Mahajan, Bikram Pal Kaur

Abstract:

The purpose of traffic accident analysis is to find the possible causes of an accident. Road accidents cannot be totally prevented but by suitable traffic engineering and management the accident rate can be reduced to a certain extent. This paper discusses the classification techniques C4.5 and ID3 using the WEKA Data mining tool. These techniques use on the NH (National highway) dataset. With the C4.5 and ID3 technique it gives best results and high accuracy with less computation time and error rate.

Keywords: C4.5, ID3, NH(National highway), WEKA data mining tool

Procedia PDF Downloads 339
3388 Phillips Curve Estimation in an Emerging Economy: Evidence from Sub-National Data of Indonesia

Authors: Harry Aginta

Abstract:

Using Phillips curve framework, this paper seeks for new empirical evidence on the relationship between inflation and output in a major emerging economy. By exploiting sub-national data, the contribution of this paper is threefold. First, it resolves the issue of using on-target national inflation rates that potentially causes weakening inflation-output nexus. This is very relevant for Indonesia as its central bank has been adopting inflation targeting framework based on national consumer price index (CPI) inflation. Second, the study tests the relevance of mining sector in output gap estimation. The test for mining sector is important to control for the effects of mining regulation and nominal effects of coal prices on real economic activities. Third, the paper applies panel econometric method by incorporating regional variation that help to improve model estimation. The results from this paper confirm the strong presence of Phillips curve in Indonesia. Positive output gap that reflects excess demand condition gives rise to the inflation rates. In addition, the elasticity of output gap is higher if the mining sector is excluded from output gap estimation. In addition to inflation adaptation, the dynamics of exchange rate and international commodity price are also found to affect inflation significantly. The results are robust to the alternative measurement of output gap

Keywords: Phillips curve, inflation, Indonesia, panel data

Procedia PDF Downloads 125
3387 Research of the Three-Dimensional Visualization Geological Modeling of Mine Based on Surpac

Authors: Honggang Qu, Yong Xu, Rongmei Liu, Zhenji Gao, Bin Wang

Abstract:

Today's mining industry is advancing gradually toward digital and visual direction. The three-dimensional visualization geological modeling of mine is the digital characterization of mineral deposits and is one of the key technology of digital mining. Three-dimensional geological modeling is a technology that combines geological spatial information management, geological interpretation, geological spatial analysis and prediction, geostatistical analysis, entity content analysis and graphic visualization in a three-dimensional environment with computer technology and is used in geological analysis. In this paper, the three-dimensional geological modeling of an iron mine through the use of Surpac is constructed, and the weight difference of the estimation methods between the distance power inverse ratio method and ordinary kriging is studied, and the ore body volume and reserves are simulated and calculated by using these two methods. Compared with the actual mine reserves, its result is relatively accurate, so it provides scientific bases for mine resource assessment, reserve calculation, mining design and so on.

Keywords: three-dimensional geological modeling, geological database, geostatistics, block model

Procedia PDF Downloads 80
3386 Using Data Mining Technique for Scholarship Disbursement

Authors: J. K. Alhassan, S. A. Lawal

Abstract:

This work is on decision tree-based classification for the disbursement of scholarship. Tree-based data mining classification technique is used in other to determine the generic rule to be used to disburse the scholarship. The system based on the defined rules from the tree is able to determine the class (status) to which an applicant shall belong whether Granted or Not Granted. The applicants that fall to the class of granted denote a successful acquirement of scholarship while those in not granted class are unsuccessful in the scheme. An algorithm that can be used to classify the applicants based on the rules from tree-based classification was also developed. The tree-based classification is adopted because of its efficiency, effectiveness, and easy to comprehend features. The system was tested with the data of National Information Technology Development Agency (NITDA) Abuja, a Parastatal of Federal Ministry of Communication Technology that is mandated to develop and regulate information technology in Nigeria. The system was found working according to the specification. It is therefore recommended for all scholarship disbursement organizations.

Keywords: classification, data mining, decision tree, scholarship

Procedia PDF Downloads 378
3385 Assessing Carbon Stock and Sequestration of Reforestation Species on Old Mining Sites in Morocco Using the DNDC Model

Authors: Nabil Elkhatri, Mohamed Louay Metougui, Ngonidzashe Chirinda

Abstract:

Mining activities have left a legacy of degraded landscapes, prompting urgent efforts for ecological restoration. Reforestation holds promise as a potent tool to rehabilitate these old mining sites, with the potential to sequester carbon and contribute to climate change mitigation. This study focuses on evaluating the carbon stock and sequestration potential of reforestation species in the context of Morocco's mining areas, employing the DeNitrification-DeComposition (DNDC) model. The research is grounded in recognizing the need to connect theoretical models with practical implementation, ensuring that reforestation efforts are informed by accurate and context-specific data. Field data collection encompasses growth patterns, biomass accumulation, and carbon sequestration rates, establishing an empirical foundation for the study's analyses. By integrating the collected data with the DNDC model, the study aims to provide a comprehensive understanding of carbon dynamics within reforested ecosystems on old mining sites. The major findings reveal varying sequestration rates among different reforestation species, indicating the potential for species-specific optimization of reforestation strategies to enhance carbon capture. This research's significance lies in its potential to contribute to sustainable land management practices and climate change mitigation strategies. By quantifying the carbon stock and sequestration potential of reforestation species, the study serves as a valuable resource for policymakers, land managers, and practitioners involved in ecological restoration and carbon management. Ultimately, the study aligns with global objectives to rejuvenate degraded landscapes while addressing pressing climate challenges.

Keywords: carbon stock, carbon sequestration, DNDC model, ecological restoration, mining sites, Morocco, reforestation, sustainable land management.

Procedia PDF Downloads 77
3384 Using Textual Pre-Processing and Text Mining to Create Semantic Links

Authors: Ricardo Avila, Gabriel Lopes, Vania Vidal, Jose Macedo

Abstract:

This article offers a approach to the automatic discovery of semantic concepts and links in the domain of Oil Exploration and Production (E&P). Machine learning methods combined with textual pre-processing techniques were used to detect local patterns in texts and, thus, generate new concepts and new semantic links. Even using more specific vocabularies within the oil domain, our approach has achieved satisfactory results, suggesting that the proposal can be applied in other domains and languages, requiring only minor adjustments.

Keywords: semantic links, data mining, linked data, SKOS

Procedia PDF Downloads 181
3383 Text Mining of Twitter Data Using a Latent Dirichlet Allocation Topic Model and Sentiment Analysis

Authors: Sidi Yang, Haiyi Zhang

Abstract:

Twitter is a microblogging platform, where millions of users daily share their attitudes, views, and opinions. Using a probabilistic Latent Dirichlet Allocation (LDA) topic model to discern the most popular topics in the Twitter data is an effective way to analyze a large set of tweets to find a set of topics in a computationally efficient manner. Sentiment analysis provides an effective method to show the emotions and sentiments found in each tweet and an efficient way to summarize the results in a manner that is clearly understood. The primary goal of this paper is to explore text mining, extract and analyze useful information from unstructured text using two approaches: LDA topic modelling and sentiment analysis by examining Twitter plain text data in English. These two methods allow people to dig data more effectively and efficiently. LDA topic model and sentiment analysis can also be applied to provide insight views in business and scientific fields.

Keywords: text mining, Twitter, topic model, sentiment analysis

Procedia PDF Downloads 179
3382 Application of Advanced Remote Sensing Data in Mineral Exploration in the Vicinity of Heavy Dense Forest Cover Area of Jharkhand and Odisha State Mining Area

Authors: Hemant Kumar, R. N. K. Sharma, A. P. Krishna

Abstract:

The study has been carried out on the Saranda in Jharkhand and a part of Odisha state. Geospatial data of Hyperion, a remote sensing satellite, have been used. This study has used a wide variety of patterns related to image processing to enhance and extract the mining class of Fe and Mn ores.Landsat-8, OLI sensor data have also been used to correctly explore related minerals. In this way, various processes have been applied to increase the mineralogy class and comparative evaluation with related frequency done. The Hyperion dataset for hyperspectral remote sensing has been specifically verified as an effective tool for mineral or rock information extraction within the band range of shortwave infrared used. The abundant spatial and spectral information contained in hyperspectral images enables the differentiation of different objects of any object into targeted applications for exploration such as exploration detection, mining.

Keywords: Hyperion, hyperspectral, sensor, Landsat-8

Procedia PDF Downloads 125
3381 Heritage Value and Industrial Tourism Potential of the Urals, Russia

Authors: Anatoly V. Stepanov, Maria Y. Ilyushkina, Alexander S. Burnasov

Abstract:

Expansion of tourism, especially after WWII, has led to significant improvements in the regional infrastructure. The present study has revealed a lot of progress in the advancement of industrial heritage narrative in the Central Urals. The evidence comes from the general public’s increased fascination with some of Europe’s oldest mining and industrial sites, and the agreement of many stakeholders that the Urals industrial heritage should be preserved. The development of tourist sites in Nizhny Tagil and Nevyansk, gold-digging in Beryosovsky, gemstone search in Murzinka, and the progress with the Urals Gemstone Ring project are the examples showing the immense opportunities of industrial heritage tourism development in the region that are still to be realized. Regardless of the economic future of the Central Urals, whether it will remain an industrial region or experience a deeper deindustrialization, the sprouts of the industrial heritage tourism should be advanced and amplified for the benefit of local communities and the tourist community at large as it is hard to imagine a more suitable site for the discovery of industrial and mining heritage than the Central Urals Region of Russia.

Keywords: industrial heritage, mining heritage, Central Urals, Russia

Procedia PDF Downloads 138
3380 Using Data Mining Techniques to Evaluate the Different Factors Affecting the Academic Performance of Students at the Faculty of Information Technology in Hashemite University in Jordan

Authors: Feras Hanandeh, Majdi Shannag

Abstract:

This research studies the different factors that could affect the Faculty of Information Technology in Hashemite University students’ accumulative average. The research paper verifies the student information, background, their academic records, and how this information will affect the student to get high grades. The student information used in the study is extracted from the student’s academic records. The data mining tools and techniques are used to decide which attribute(s) will affect the student’s accumulative average. The results show that the most important factor which affects the students’ accumulative average is the student Acceptance Type. And we built a decision tree model and rules to determine how the student can get high grades in their courses. The overall accuracy of the model is 44% which is accepted rate.

Keywords: data mining, classification, extracting rules, decision tree

Procedia PDF Downloads 417
3379 Statistical Models and Time Series Forecasting on Crime Data in Nepal

Authors: Dila Ram Bhandari

Abstract:

Throughout the 20th century, new governments were created where identities such as ethnic, religious, linguistic, caste, communal, tribal, and others played a part in the development of constitutions and the legal system of victim and criminal justice. Acute issues with extremism, poverty, environmental degradation, cybercrimes, human rights violations, crime against, and victimization of both individuals and groups have recently plagued South Asian nations. Everyday massive number of crimes are steadfast, these frequent crimes have made the lives of common citizens restless. Crimes are one of the major threats to society and also for civilization. Crime is a bone of contention that can create a societal disturbance. The old-style crime solving practices are unable to live up to the requirement of existing crime situations. Crime analysis is one of the most important activities of the majority of intelligent and law enforcement organizations all over the world. The South Asia region lacks such a regional coordination mechanism, unlike central Asia of Asia Pacific regions, to facilitate criminal intelligence sharing and operational coordination related to organized crime, including illicit drug trafficking and money laundering. There have been numerous conversations in recent years about using data mining technology to combat crime and terrorism. The Data Detective program from Sentient as a software company, uses data mining techniques to support the police (Sentient, 2017). The goals of this internship are to test out several predictive model solutions and choose the most effective and promising one. First, extensive literature reviews on data mining, crime analysis, and crime data mining were conducted. Sentient offered a 7-year archive of crime statistics that were daily aggregated to produce a univariate dataset. Moreover, a daily incidence type aggregation was performed to produce a multivariate dataset. Each solution's forecast period lasted seven days. Statistical models and neural network models were the two main groups into which the experiments were split. For the crime data, neural networks fared better than statistical models. This study gives a general review of the applied statistics and neural network models. A detailed image of each model's performance on the available data and generalizability is provided by a comparative analysis of all the models on a comparable dataset. Obviously, the studies demonstrated that, in comparison to other models, Gated Recurrent Units (GRU) produced greater prediction. The crime records of 2005-2019 which was collected from Nepal Police headquarter and analysed by R programming. In conclusion, gated recurrent unit implementation could give benefit to police in predicting crime. Hence, time series analysis using GRU could be a prospective additional feature in Data Detective.

Keywords: time series analysis, forecasting, ARIMA, machine learning

Procedia PDF Downloads 166
3378 Fuzzy Logic Classification Approach for Exponential Data Set in Health Care System for Predication of Future Data

Authors: Manish Pandey, Gurinderjit Kaur, Meenu Talwar, Sachin Chauhan, Jagbir Gill

Abstract:

Health-care management systems are a unit of nice connection as a result of the supply a straightforward and fast management of all aspects relating to a patient, not essentially medical. What is more, there are unit additional and additional cases of pathologies during which diagnosing and treatment may be solely allotted by victimization medical imaging techniques. With associate ever-increasing prevalence, medical pictures area unit directly acquired in or regenerate into digital type, for his or her storage additionally as sequent retrieval and process. Data Mining is the process of extracting information from large data sets through using algorithms and Techniques drawn from the field of Statistics, Machine Learning and Data Base Management Systems. Forecasting may be a prediction of what's going to occur within the future, associated it's an unsure method. Owing to the uncertainty, the accuracy of a forecast is as vital because the outcome foretold by foretelling the freelance variables. A forecast management should be wont to establish if the accuracy of the forecast is within satisfactory limits. Fuzzy regression strategies have normally been wont to develop shopper preferences models that correlate the engineering characteristics with shopper preferences relating to a replacement product; the patron preference models offer a platform, wherever by product developers will decide the engineering characteristics so as to satisfy shopper preferences before developing the merchandise. Recent analysis shows that these fuzzy regression strategies area units normally will not to model client preferences. We tend to propose a Testing the strength of Exponential Regression Model over regression toward the mean Model.

Keywords: health-care management systems, fuzzy regression, data mining, forecasting, fuzzy membership function

Procedia PDF Downloads 280
3377 Relay Mining: Verifiable Multi-Tenant Distributed Rate Limiting

Authors: Daniel Olshansky, Ramiro Rodrıguez Colmeiro

Abstract:

Relay Mining presents a scalable solution employing probabilistic mechanisms and crypto-economic incentives to estimate RPC volume usage, facilitating decentralized multitenant rate limiting. Network traffic from individual applications can be concurrently serviced by multiple RPC service providers, with costs, rewards, and rate limiting governed by a native cryptocurrency on a distributed ledger. Building upon established research in token bucket algorithms and distributed rate-limiting penalty models, our approach harnesses a feedback loop control mechanism to adjust the difficulty of mining relay rewards, dynamically scaling with network usage growth. By leveraging crypto-economic incentives, we reduce coordination overhead costs and introduce a mechanism for providing RPC services that are both geopolitically and geographically distributed.

Keywords: remote procedure call, crypto-economic, commit-reveal, decentralization, scalability, blockchain, rate limiting, token bucket

Procedia PDF Downloads 55
3376 Exploring Counting Methods for the Vertices of Certain Polyhedra with Uncertainties

Authors: Sammani Danwawu Abdullahi

Abstract:

Vertex Enumeration Algorithms explore the methods and procedures of generating the vertices of general polyhedra formed by system of equations or inequalities. These problems of enumerating the extreme points (vertices) of general polyhedra are shown to be NP-Hard. This lead to exploring how to count the vertices of general polyhedra without listing them. This is also shown to be #P-Complete. Some fully polynomial randomized approximation schemes (fpras) of counting the vertices of some special classes of polyhedra associated with Down-Sets, Independent Sets, 2-Knapsack problems and 2 x n transportation problems are presented together with some discovered open problems.

Keywords: counting with uncertainties, mathematical programming, optimization, vertex enumeration

Procedia PDF Downloads 359
3375 Data Mining Approach: Classification Model Evaluation

Authors: Lubabatu Sada Sodangi

Abstract:

The rapid growth in exchange and accessibility of information via the internet makes many organisations acquire data on their own operation. The aim of data mining is to analyse the different behaviour of a dataset using observation. Although, the subset of the dataset being analysed may not display all the behaviours and relationships of the entire data and, therefore, may not represent other parts that exist in the dataset. There is a range of techniques used in data mining to determine the hidden or unknown information in datasets. In this paper, the performance of two algorithms Chi-Square Automatic Interaction Detection (CHAID) and multilayer perceptron (MLP) would be matched using an Adult dataset to find out the percentage of an/the adults that earn > 50k and those that earn <= 50k per year. The two algorithms were studied and compared using IBM SPSS statistics software. The result for CHAID shows that the most important predictors are relationship and education. The algorithm shows that those are married (husband) and have qualification: Bachelor, Masters, Doctorate or Prof-school whose their age is > 41<57 earn > 50k. Also, multilayer perceptron displays marital status and capital gain as the most important predictors of the income. It also shows that individuals that their capital gain is less than 6,849 and are single, separated or widow, earn <= 50K, whereas individuals with their capital gain is > 6,849, work > 35 hrs/wk, and > 27yrs their income will be > 50k. By comparing the two algorithms, it is observed that both algorithms are reliable but there is strong reliability in CHAID which clearly shows that relation and education contribute to the prediction as displayed in the data visualisation.

Keywords: data mining, CHAID, multi-layer perceptron, SPSS, Adult dataset

Procedia PDF Downloads 378
3374 On Exploring Search Heuristics for improving the efficiency in Web Information Extraction

Authors: Patricia Jiménez, Rafael Corchuelo

Abstract:

Nowadays the World Wide Web is the most popular source of information that relies on billions of on-line documents. Web mining is used to crawl through these documents, collect the information of interest and process it by applying data mining tools in order to use the gathered information in the best interest of a business, what enables companies to promote theirs. Unfortunately, it is not easy to extract the information a web site provides automatically when it lacks an API that allows to transform the user-friendly data provided in web documents into a structured format that is machine-readable. Rule-based information extractors are the tools intended to extract the information of interest automatically and offer it in a structured format that allow mining tools to process it. However, the performance of an information extractor strongly depends on the search heuristic employed since bad choices regarding how to learn a rule may easily result in loss of effectiveness and/or efficiency. Improving search heuristics regarding efficiency is of uttermost importance in the field of Web Information Extraction since typical datasets are very large. In this paper, we employ an information extractor based on a classical top-down algorithm that uses the so-called Information Gain heuristic introduced by Quinlan and Cameron-Jones. Unfortunately, the Information Gain relies on some well-known problems so we analyse an intuitive alternative, Termini, that is clearly more efficient; we also analyse other proposals in the literature and conclude that none of them outperforms the previous alternative.

Keywords: information extraction, search heuristics, semi-structured documents, web mining.

Procedia PDF Downloads 338
3373 Bioengineering of a Plant System to Sustainably Remove Heavy Metals and to Harvest Rare Earth Elements (REEs) from Industrial Wastes

Authors: Edmaritz Hernandez-Pagan, Kanjana Laosuntisuk, Alex Harris, Allison Haynes, David Buitrago, Michael Kudenov, Colleen Doherty

Abstract:

Rare Earth Elements (REEs) are critical metals for modern electronics, green technologies, and defense systems. However, due to their dispersed nature in the Earth’s crust, frequent co-occurrence with radioactive materials, and similar chemical properties, acquiring and purifying REEs is costly and environmentally damaging, restricting access to these metals. Plants could serve as resources for bioengineering REE mining systems. Although there is limited information on how REEs affect plants at a cellular and molecular level, plants with high REE tolerance and hyperaccumulation have been identified. This dissertation aims to develop a plant-based system for harvesting REEs from industrial waste material with a focus on Acid Mine Drainage (AMD), a toxic coal mining product. The objectives are 1) to develop a non-destructive, in vivo detection method for REE detection in Phytolacca plants (REE hyperaccumulator) plants utilizing fluorescence spectroscopy and with a primary focus on dysprosium, 2) to characterize the uptake of REE and Heavy Metals in Phytolacca americana and Phytolacca acinosa (REE hyperaccumulator) in AMD for potential implementation in the plant-based system, 3) to implement the REE detection method to identify REE-binding proteins and peptides for potential enhancement of uptake and selectivity for targeted REEs in the plants implemented in the plant-based system. The candidates are known REE-binding peptides or proteins, orthologs of known metal-binding proteins from REE hyperaccumulator plants, and novel proteins and peptides identified by comparative plant transcriptomics. Lanmodulin, a high-affinity REE-binding protein from methylotrophic bacteria, is used as a benchmark for the REE-protein binding fluorescence assays and expression in A. thaliana to test for changes in REE plant tolerance and uptake.

Keywords: phytomining, agromining, rare earth elements, pokeweed, phytolacca

Procedia PDF Downloads 18
3372 Automatic Lead Qualification with Opinion Mining in Customer Relationship Management Projects

Authors: Victor Radich, Tania Basso, Regina Moraes

Abstract:

Lead qualification is one of the main procedures in Customer Relationship Management (CRM) projects. Its main goal is to identify potential consumers who have the ideal characteristics to establish a profitable and long-term relationship with a certain organization. Social networks can be an important source of data for identifying and qualifying leads since interest in specific products or services can be identified from the users’ expressed feelings of (dis)satisfaction. In this context, this work proposes the use of machine learning techniques and sentiment analysis as an extra step in the lead qualification process in order to improve it. In addition to machine learning models, sentiment analysis or opinion mining can be used to understand the evaluation that the user makes of a particular service, product, or brand. The results obtained so far have shown that it is possible to extract data from social networks and combine the techniques for a more complete classification.

Keywords: lead qualification, sentiment analysis, opinion mining, machine learning, CRM, lead scoring

Procedia PDF Downloads 89
3371 Application of a New Efficient Normal Parameter Reduction Algorithm of Soft Sets in Online Shopping

Authors: Xiuqin Ma, Hongwu Qin

Abstract:

A new efficient normal parameter reduction algorithm of soft set in decision making was proposed. However, up to the present, few documents have focused on real-life applications of this algorithm. Accordingly, we apply a New Efficient Normal Parameter Reduction algorithm into real-life datasets of online shopping, such as Blackberry Mobile Phone Dataset. Experimental results show that this algorithm is not only suitable but feasible for dealing with the online shopping.

Keywords: soft sets, parameter reduction, normal parameter reduction, online shopping

Procedia PDF Downloads 511
3370 What the Future Holds for Social Media Data Analysis

Authors: P. Wlodarczak, J. Soar, M. Ally

Abstract:

The dramatic rise in the use of Social Media (SM) platforms such as Facebook and Twitter provide access to an unprecedented amount of user data. Users may post reviews on products and services they bought, write about their interests, share ideas or give their opinions and views on political issues. There is a growing interest in the analysis of SM data from organisations for detecting new trends, obtaining user opinions on their products and services or finding out about their online reputations. A recent research trend in SM analysis is making predictions based on sentiment analysis of SM. Often indicators of historic SM data are represented as time series and correlated with a variety of real world phenomena like the outcome of elections, the development of financial indicators, box office revenue and disease outbreaks. This paper examines the current state of research in the area of SM mining and predictive analysis and gives an overview of the analysis methods using opinion mining and machine learning techniques.

Keywords: social media, text mining, knowledge discovery, predictive analysis, machine learning

Procedia PDF Downloads 425
3369 Phonological Variation in the Speech of Grade 1 Teachers in Select Public Elementary Schools in the Philippines

Authors: M. Leonora D. Guerrero

Abstract:

The study attempted to uncover the most and least frequent phonological variation evident in the speech patterns of grade 1 teachers in select public elementary schools in the Philippines. It also determined the lectal description of the participants based on Tayao’s consonant charts for American and Philippine English. Descriptive method was utilized. A total of 24 grade 1 teachers participated in the study. The instrument used was word list. Each column in the word list is represented by words with the target consonant phonemes: labiodental fricatives f/ and /v/ and lingua-alveolar fricative /z/. These phonemes were in the initial, medial, and final positions, respectively. Findings of the study revealed that the most frequent variation happened when the participants read words with /z/ in the final position while the least frequent variation happened when the participants read words with /z/ in the initial position. The study likewise proved that the grade 1 teachers exhibited the segmental features of both the mesolect and basilect. Based on these results, it is suggested that teachers of English in the Philippines must aspire to manifest the features of the mesolect, if not, the acrolect since it is expected of the academicians not to be displaying the phonological features of the acrolects since this variety is only used by the 'uneducated.' This is especially so with grade 1 teachers who are often mimicked by their students who classify their speech as the 'standard.'

Keywords: consonant phonemes, lectal description, Philippine English, phonological variation

Procedia PDF Downloads 215