Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 1080

Search results for: deep-sea mining tailings

360 Cross-Knowledge Graph Relation Completion for Non-Isomorphic Cross-Lingual Entity Alignment

Authors: Yuhong Zhang, Dan Lu, Chenyang Bu, Peipei Li, Kui Yu, Xindong Wu

Abstract:

The Cross-Lingual Entity Alignment (CLEA) task aims to find the aligned entities that refer to the same identity from two knowledge graphs (KGs) in different languages. It is an effective way to enhance the performance of data mining for KGs with scarce resources. In real-world applications, the neighborhood structures of the same entities in different KGs tend to be non-isomorphic, which makes the representation of entities contain diverse semantic information and then poses a great challenge for CLEA. In this paper, we try to address this challenge from two perspectives. On the one hand, the cross-KG relation completion rules are designed with the alignment constraint of entities and relations to improve the topology isomorphism of two KGs. On the other hand, a representation method combining isomorphic weights is designed to include more isomorphic semantics for counterpart entities, which will benefit the CLEA. Experiments show that our model can improve the isomorphism of two KGs and the alignment performance, especially for two non-isomorphic KGs.

Keywords: knowledge graphs, cross-lingual entity alignment, non-isomorphic, relation completion

Procedia PDF Downloads 100

359 FCNN-MR: A Parallel Instance Selection Method Based on Fast Condensed Nearest Neighbor Rule

Authors: Lu Si, Jie Yu, Shasha Li, Jun Ma, Lei Luo, Qingbo Wu, Yongqi Ma, Zhengji Liu

Abstract:

Instance selection (IS) technique is used to reduce the data size to improve the performance of data mining methods. Recently, to process very large data set, several proposed methods divide the training set into some disjoint subsets and apply IS algorithms independently to each subset. In this paper, we analyze the limitation of these methods and give our viewpoint about how to divide and conquer in IS procedure. Then, based on fast condensed nearest neighbor (FCNN) rule, we propose a large data sets instance selection method with MapReduce framework. Besides ensuring the prediction accuracy and reduction rate, it has two desirable properties: First, it reduces the work load in the aggregation node; Second and most important, it produces the same result with the sequential version, which other parallel methods cannot achieve. We evaluate the performance of FCNN-MR on one small data set and two large data sets. The experimental results show that it is effective and practical.

Keywords: instance selection, data reduction, MapReduce, kNN

Procedia PDF Downloads 234

358 Development of Innovative Islamic Web Applications

Authors: Farrukh Shahzad

Abstract:

The rich Islamic resources related to religious text, Islamic sciences, and history are widely available in print and in electronic format online. However, most of these works are only available in Arabic language. In this research, an attempt is made to utilize these resources to create interactive web applications in Arabic, English and other languages. The system utilizes the Pattern Recognition, Knowledge Management, Data Mining, Information Retrieval and Management, Indexing, storage and data-analysis techniques to parse, store, convert and manage the information from authentic Arabic resources. These interactive web Apps provide smart multi-lingual search, tree based search, on-demand information matching and linking. In this paper, we provide details of application architecture, design, implementation and technologies employed. We also presented the summary of web applications already developed. We have also included some screen shots from the corresponding web sites. These web applications provide an Innovative On-line Learning Systems (eLearning and computer based education).

Keywords: Islamic resources, Muslim scholars, hadith, narrators, history, fiqh

Procedia PDF Downloads 261

357 Semi-Automatic Method to Assist Expert for Association Rules Validation

Authors: Amdouni Hamida, Gammoudi Mohamed Mohsen

Abstract:

In order to help the expert to validate association rules extracted from data, some quality measures are proposed in the literature. We distinguish two categories: objective and subjective measures. The first one depends on a fixed threshold and on data quality from which the rules are extracted. The second one consists on providing to the expert some tools in the objective to explore and visualize rules during the evaluation step. However, the number of extracted rules to validate remains high. Thus, the manually mining rules task is very hard. To solve this problem, we propose, in this paper, a semi-automatic method to assist the expert during the association rule's validation. Our method uses rule-based classification as follow: (i) We transform association rules into classification rules (classifiers), (ii) We use the generated classifiers for data classification. (iii) We visualize association rules with their quality classification to give an idea to the expert and to assist him during validation process.

Keywords: association rules, rule-based classification, classification quality, validation

Procedia PDF Downloads 413

356 Development of Terrorist Threat Prediction Model in Indonesia by Using Bayesian Network

Authors: Hilya Mudrika Arini, Nur Aini Masruroh, Budi Hartono

Abstract:

There are more than 20 terrorist threats from 2002 to 2012 in Indonesia. Despite of this fact, preventive solution through studies in the field of national security in Indonesia has not been conducted comprehensively. This study aims to provide a preventive solution by developing prediction model of the terrorist threat in Indonesia by using Bayesian network. There are eight stages to build the model, started from literature review, build and verify Bayesian belief network to what-if scenario. In order to build the model, four experts from different perspectives are utilized. This study finds several significant findings. First, news and the readiness of terrorist group are the most influent factor. Second, according to several scenarios of the news portion, it can be concluded that the higher positive news proportion, the higher probability of terrorist threat will occur. Therefore, the preventive solution to reduce the terrorist threat in Indonesia based on the model is by keeping the positive news portion to a maximum of 38%.

Keywords: Bayesian network, decision analysis, national security system, text mining

Procedia PDF Downloads 368

355 Frequent-Pattern Tree Algorithm Application to S&P and Equity Indexes

Authors: E. Younsi, H. Andriamboavonjy, A. David, S. Dokou, B. Lemrabet

Abstract:

Software and time optimization are very important factors in financial markets, which are competitive fields, and emergence of new computer tools further stresses the challenge. In this context, any improvement of technical indicators which generate a buy or sell signal is a major issue. Thus, many tools have been created to make them more effective. This worry about efficiency has been leading in present paper to seek best (and most innovative) way giving largest improvement in these indicators. The approach consists in attaching a signature to frequent market configurations by application of frequent patterns extraction method which is here most appropriate to optimize investment strategies. The goal of proposed trading algorithm is to find most accurate signatures using back testing procedure applied to technical indicators for improving their performance. The problem is then to determine the signatures which, combined with an indicator, outperform this indicator alone. To do this, the FP-Tree algorithm has been preferred, as it appears to be the most efficient algorithm to perform this task.

Keywords: quantitative analysis, back-testing, computational models, apriori algorithm, pattern recognition, data mining, FP-tree

Procedia PDF Downloads 339

354 Modeling the Present Economic and Social Alienation of Working Class in South Africa in the Musical Production ‘from Marikana to Mahagonny’ at Durban University of Technology (DUT)

Authors: Pamela Tancsik

Abstract:

The stage production in 2018, titled ‘From‘Marikana to Mahagonny’, began with a prologue in the form of the award-winning documentary ‘Miners Shot Down' by Rehad Desai, followed by Brecht/Weill’s song play or scenic cantata ‘Mahagonny’, premièred in Baden-Baden 1927. The central directorial concept of the DUT musical production ‘From Marikana to Mahagonny’ was to show a connection between the socio-political alienation of mineworkers in present-day South Africa and Brecht’s alienation effect in his scenic cantata ‘Mahagonny’. Marikana is a mining town about 50 km west of South Africa’s capital Pretoria. Mahagonny is a fantasy name for a utopian mining town in the United States. The characters, setting, and lyrics refer to America with of songs like ‘Benares’ and ‘Moon of Alabama’ and the use of typical American inventions such as dollars, saloons, and the telephone. The six singing characters in ‘Mahagonny’ all have typical American names: Charlie, Billy, Bobby, Jimmy, and the two girls they meet later are called Jessie and Bessie. The four men set off to seek Mahagonny. For them, it is the ultimate dream destination promising the fulfilment of all their desires, such as girls, alcohol, and dollars – in short, materialistic goals. Instead of finding a paradise, they experience how money and the practice of exploitive capitalism, and the lack of any moral and humanity is destroying their lives. In the end, Mahagonny gets demolished by a hurricane, an event which happened in 1926 in the United States. ‘God’ in person arrives disillusioned and bitter, complaining about violent and immoral mankind. In the end, he sends them all to hell. Charlie, Billy, Bobby, and Jimmy reply that this punishment does not mean anything to them because they have already been in hell for a long time – hell on earth is a reality, so the threat of hell after life is meaningless. Human life was also taken during the stand-off between striking mineworkers and the South African police on 16 August 2012. Miners from the Lonmin Platinum Mine went on an illegal strike, equipped with bush knives and spears. They were striking because their living conditions had never improved; they still lived in muddy shacks with no running water and electricity. Wages were as low as R4,000 (South African Rands), equivalent to just over 200 Euro per month. By August 2012, the negotiations between Lonmin management and the mineworkers’ unions, asking for a minimum wage of R12,500 per month, had failed. Police were sent in by the Government, and when the miners did not withdraw, the police shot at them. 34 were killed, some by bullets in their backs while running away and trying to hide behind rocks. In the musical play ‘From Marikana to Mahagonny’ audiences in South Africa are confronted with a documentary about Marikana, followed by Brecht/Weill’s scenic cantata, highlighting the tragic parallels between the Mahagonny story and characters from 1927 America and the Lonmin workers today in South Africa, showing that in 95 years, capitalism has not changed.

Keywords: alienation, brecht/Weill, mahagonny, marikana/South Africa, musical theatre

Procedia PDF Downloads 69

353 Semantic Textual Similarity on Contracts: Exploring Multiple Negative Ranking Losses for Sentence Transformers

Authors: Yogendra Sisodia

Abstract:

Researchers are becoming more interested in extracting useful information from legal documents thanks to the development of large-scale language models in natural language processing (NLP), and deep learning has accelerated the creation of powerful text mining models. Legal fields like contracts benefit greatly from semantic text search since it makes it quick and easy to find related clauses. After collecting sentence embeddings, it is relatively simple to locate sentences with a comparable meaning throughout the entire legal corpus. The author of this research investigated two pre-trained language models for this task: MiniLM and Roberta, and further fine-tuned them on Legal Contracts. The author used Multiple Negative Ranking Loss for the creation of sentence transformers. The fine-tuned language models and sentence transformers showed promising results.

Keywords: legal contracts, multiple negative ranking loss, natural language inference, sentence transformers, semantic textual similarity

Procedia PDF Downloads 75

352 The Economic Geology of Ijero Ekiti, South Western Nigeria: A Need for Sustainable Mining for a Responsible Socio-Economic Growth and Development

Authors: Olagunju John Olusesan-Remi

Abstract:

The study area Ijero-Ekiti falls within the Ilesha-Ekiti Schist belt, originating from the long year of the Pan-Africa orogenic events and various cataclysmic tectonic activities in history. Ijero-Ekiti is situated within latitude 7 degree 45N and 7 Degree 55N. Ijero Ekiti is bordered between the Dahomean Basin and the southern Bida/Benue basin on the Geological map of Nigeria. This research work centers on majorly on investigating the chemical composition and as well as the mineralogical distribution of the various mineral-bearing rocks that composed the study area. This work is essentially carried out with a view to assessing and at the same time ascertaining the economic potentials and or the industrial significance of the area to Ekiti-south western region and the Nigeria nation as a whole. The mineralogical distribution pattern is of particular interest to us in this study. In this regard essential focus is put on the mostly the economic gemstones distributions within the various mineral bearing rocks in the zone, some of which includes the tourmaline formation, cassiterite deposit, tin-ore, tantalum columbite, smoky quartz, amethyst, polychrome and emerald variety beryl among others as they occurred within the older granite of the Precambrian rocks. To this end, samples of the major rock types were taken from various locations within the study area for detail scientific analysis as follows: The Igemo pegmatite of Ijero west, the epidiorite of Idaho, the biotitic hornblende gneiss of Ikoro-Ijero north and the beryl crystalline rock types to mention a few. The slides of the each rock from the aforementioned zones were later prepared and viewed under a cross Nichol petro graphic microscope with a particular focus on the light reflection ability of the constituent minerals in each rock samples. The results from the physical analysis viewed from the colour had it that the pegmatite samples ranges from pure milky white to fairly pinkish coloration. Other physical properties investigated include the streak, luster, form, specific gravity, cleavage/fracture pattern etc. The optical examination carried out centers on the refractive indices and pleochroism of the minerals present while the chemical analysis reveals from the tourmaline samples a differing correlation coefficient of the various oxides in each samples collected through which the mineral presence was established. In conclusion, it was inferred that the various minerals outlined above were in reasonable quantity within the Ijero area. With the above discoveries, therefore, we strongly recommend a detailed scientific investigation to be carried out such that will lead to a comprehensive mining of the area. Above all, it is our conclusion that a comprehensive mineralogical exploitation of this area will not only boost the socio-economic potential of the area but at the same time will go a long way contributing immensely to the socio-economic growth and development of the Nation-Nigeria at large.

Keywords: Ijero Ekiti, Southwestern Nigeria, economic minerals, pegmatite of the pan African origin, cataclastic tectonic activities, Ilesha Schistbelt, precambrian formations

Procedia PDF Downloads 233

351 Modeling Food Popularity Dependencies Using Social Media Data

Authors: DEVASHISH KHULBE, MANU PATHAK

Abstract:

The rise in popularity of major social media platforms have enabled people to share photos and textual information about their daily life. One of the popular topics about which information is shared is food. Since a lot of media about food are attributed to particular locations and restaurants, information like spatio-temporal popularity of various cuisines can be analyzed. Tracking the popularity of food types and retail locations across space and time can also be useful for business owners and restaurant investors. In this work, we present an approach using off-the shelf machine learning techniques to identify trends and popularity of cuisine types in an area using geo-tagged data from social media, Google images and Yelp. After adjusting for time, we use the Kernel Density Estimation to get hot spots across the location and model the dependencies among food cuisines popularity using Bayesian Networks. We consider the Manhattan borough of New York City as the location for our analyses but the approach can be used for any area with social media data and information about retail businesses.

Keywords: Web Mining, Geographic Information Systems, Business popularity, Spatial Data Analyses

Procedia PDF Downloads 87

350 A Deep Learning Based Approach for Dynamically Selecting Pre-processing Technique for Images

Authors: Revoti Prasad Bora, Nikita Katyal, Saurabh Yadav

Abstract:

Pre-processing plays an important role in various image processing applications. Most of the time due to the similar nature of images, a particular pre-processing or a set of pre-processing steps are sufficient to produce the desired results. However, in the education domain, there is a wide variety of images in various aspects like images with line-based diagrams, chemical formulas, mathematical equations, etc. Hence a single pre-processing or a set of pre-processing steps may not yield good results. Therefore, a Deep Learning based approach for dynamically selecting a relevant pre-processing technique for each image is proposed. The proposed method works as a classifier to detect hidden patterns in the images and predicts the relevant pre-processing technique needed for the image. This approach experimented for an image similarity matching problem but it can be adapted to other use cases too. Experimental results showed significant improvement in average similarity ranking with the proposed method as opposed to static pre-processing techniques.

Keywords: deep-learning, classification, pre-processing, computer vision, image processing, educational data mining

Procedia PDF Downloads 124

349 Topic Modelling Using Latent Dirichlet Allocation and Latent Semantic Indexing on SA Telco Twitter Data

Authors: Phumelele Kubheka, Pius Owolawi, Gbolahan Aiyetoro

Abstract:

Twitter is one of the most popular social media platforms where users can share their opinions on different subjects. As of 2010, The Twitter platform generates more than 12 Terabytes of data daily, ~ 4.3 petabytes in a single year. For this reason, Twitter is a great source for big mining data. Many industries such as Telecommunication companies can leverage the availability of Twitter data to better understand their markets and make an appropriate business decision. This study performs topic modeling on Twitter data using Latent Dirichlet Allocation (LDA). The obtained results are benchmarked with another topic modeling technique, Latent Semantic Indexing (LSI). The study aims to retrieve topics on a Twitter dataset containing user tweets on South African Telcos. Results from this study show that LSI is much faster than LDA. However, LDA yields better results with higher topic coherence by 8% for the best-performing model represented in Table 1. A higher topic coherence score indicates better performance of the model.

Keywords: big data, latent Dirichlet allocation, latent semantic indexing, telco, topic modeling, twitter

Procedia PDF Downloads 128

348 Analyzing Semantic Feature Using Multiple Information Sources for Reviews Summarization

Authors: Yu Hung Chiang, Hei Chia Wang

Abstract:

Nowadays, tourism has become a part of life. Before reserving hotels, customers need some information, which the most important source is online reviews, about hotels to help them make decisions. Due to the dramatic growing of online reviews, it is impossible for tourists to read all reviews manually. Therefore, designing an automatic review analysis system, which summarizes reviews, is necessary for them. The main purpose of the system is to understand the opinion of reviews, which may be positive or negative. In other words, the system would analyze whether the customers who visited the hotel like it or not. Using sentiment analysis methods will help the system achieve the purpose. In sentiment analysis methods, the targets of opinion (here they are called the feature) should be recognized to clarify the polarity of the opinion because polarity of the opinion may be ambiguous. Hence, the study proposes an unsupervised method using Part-Of-Speech pattern and multi-lexicons sentiment analysis to summarize all reviews. We expect this method can help customers search what they want information as well as make decisions efficiently.

Keywords: text mining, sentiment analysis, product feature extraction, multi-lexicons

Procedia PDF Downloads 307

347 Automated Process Quality Monitoring and Diagnostics for Large-Scale Measurement Data

Authors: Hyun-Woo Cho

Abstract:

Continuous monitoring of industrial plants is one of necessary tasks when it comes to ensuring high-quality final products. In terms of monitoring and diagnosis, it is quite critical and important to detect some incipient abnormal events of manufacturing processes in order to improve safety and reliability of operations involved and to reduce related losses. In this work a new multivariate statistical online diagnostic method is presented using a case study. For building some reference models an empirical discriminant model is constructed based on various past operation runs. When a fault is detected on-line, an on-line diagnostic module is initiated. Finally, the status of the current operating conditions is compared with the reference model to make a diagnostic decision. The performance of the presented framework is evaluated using a dataset from complex industrial processes. It has been shown that the proposed diagnostic method outperforms other techniques especially in terms of incipient detection of any faults occurred.

Keywords: data mining, empirical model, on-line diagnostics, process fault, process monitoring

Procedia PDF Downloads 379

346 The Impact of Interrelationship between Business Intelligence and Knowledge Management on Decision Making Process: An Empirical Investigation of Banking Sector in Jordan

Authors: Issa M. Shehabat, Huda F. Y. Nimri

Abstract:

This paper aims to study the relationship between knowledge management in its processes, including knowledge creation, knowledge sharing, knowledge organization, and knowledge application, and business intelligence tools, including OLAP, data mining, and data warehouse, and their impact on the decision-making process in the banking sector in Jordan. A total of 200 questionnaires were distributed to the sample of the study. The study hypotheses were tested using the statistical package SPSS. Study findings suggest that decision-making processes were positively related to knowledge management processes. Additionally, the components of business intelligence had a positive impact on decision-making. The study recommended conducting studies similar to this study in other sectors such as the industrial, telecommunications, and service sectors to contribute to enhancing understanding of the role of the knowledge management processes and business intelligence tools.

Keywords: business intelligence, knowledge management, decision making, Jordan, banking sector

Procedia PDF Downloads 114

345 Developing Serious Games to Improve Learning Experience of Programming: A Case Study

Authors: Shan Jiang, Xinyu Tang

Abstract:

Game-based learning is an emerging pedagogy to make the learning experience more effective, enjoyable, and fun. However, most games used in classroom settings have been overly simplistic. This paper presents a case study on a Python-based online game designed to improve the effectiveness in both teaching and research in higher education. The proposed game system not only creates a fun and enjoyable experience for students to learn various topics in programming but also improves the effectiveness of teaching in several aspects, including material presentation, helping students to recognize the importance of the subjects, and linking theoretical concepts to practice. The proposed game system also serves as an information cyber-infrastructure that automatically collects and stores data from players. The data could be useful in research areas including human-computer interaction, decision making, opinion mining, and artificial intelligence. They further provide other possibilities beyond these areas due to the customizable nature of the game.

Keywords: game-based learning, programming, research-teaching integration, Hearthstone

Procedia PDF Downloads 143

344 Development of Knowledge Discovery Based Interactive Decision Support System on Web Platform for Maternal and Child Health System Strengthening

Authors: Partha Saha, Uttam Kumar Banerjee

Abstract:

Maternal and Child Healthcare (MCH) has always been regarded as one of the important issues globally. Reduction of maternal and child mortality rates and increase of healthcare service coverage were declared as one of the targets in Millennium Development Goals till 2015 and thereafter as an important component of the Sustainable Development Goals. Over the last decade, worldwide MCH indicators have improved but could not match the expected levels. Progress of both maternal and child mortality rates have been monitored by several researchers. Each of the studies has stated that only less than 26% of low-income and middle income countries (LMICs) were on track to achieve targets as prescribed by MDG4. Average worldwide annual rate of reduction of under-five mortality rate and maternal mortality rate were 2.2% and 1.9% as on 2011 respectively whereas rates should be minimum 4.4% and 5.5% annually to achieve targets. In spite of having proven healthcare interventions for both mothers and children, those could not be scaled up to the required volume due to fragmented health systems, especially in the developing and under-developed countries. In this research, a knowledge discovery based interactive Decision Support System (DSS) has been developed on web platform which would assist healthcare policy makers to develop evidence-based policies. To achieve desirable results in MCH, efficient resource planning is very much required. In maximum LMICs, resources are big constraint. Knowledge, generated through this system, would help healthcare managers to develop strategic resource planning for combatting with issues like huge inequity and less coverage in MCH. This system would help healthcare managers to accomplish following four tasks. Those are a) comprehending region wise conditions of variables related with MCH, b) identifying relationships within variables, c) segmenting regions based on variables status, and d) finding out segment wise key influential variables which have major impact on healthcare indicators. Whole system development process has been divided into three phases. Those were i) identifying contemporary issues related with MCH services and policy making; ii) development of the system; and iii) verification and validation of the system. More than 90 variables under three categories, such as a) educational, social, and economic parameters; b) MCH interventions; and c) health system building blocks have been included into this web-based DSS and five separate modules have been developed under the system. First module has been designed for analysing current healthcare scenario. Second module would help healthcare managers to understand correlations among variables. Third module would reveal frequently-occurring incidents along with different MCH interventions. Fourth module would segment regions based on previously mentioned three categories and in fifth module, segment-wise key influential interventions will be identified. India has been considered as case study area in this research. Data of 601 districts of India has been used for inspecting effectiveness of those developed modules. This system has been developed by importing different statistical and data mining techniques on Web platform. Policy makers would be able to generate different scenarios from the system before drawing any inference, aided by its interactive capability.

Keywords: maternal and child heathcare, decision support systems, data mining techniques, low and middle income countries

Procedia PDF Downloads 231

343 Classification of Political Affiliations by Reduced Number of Features

Authors: Vesile Evrim, Aliyu Awwal

Abstract:

By the evolvement in technology, the way of expressing opinions switched the direction to the digital world. The domain of politics as one of the hottest topics of opinion mining research merged together with the behavior analysis for affiliation determination in text which constitutes the subject of this paper. This study aims to classify the text in news/blogs either as Republican or Democrat with the minimum number of features. As an initial set, 68 features which 64 are constituted by Linguistic Inquiry and Word Count (LIWC) features are tested against 14 benchmark classification algorithms. In the later experiments, the dimensions of the feature vector reduced based on the 7 feature selection algorithms. The results show that Decision Tree, Rule Induction and M5 Rule classifiers when used with SVM and IGR feature selection algorithms performed the best up to 82.5% accuracy on a given dataset. Further tests on a single feature and the linguistic based feature sets showed the similar results. The feature “function” as an aggregate feature of the linguistic category, is obtained as the most differentiating feature among the 68 features with 81% accuracy by itself in classifying articles either as Republican or Democrat.

Keywords: feature selection, LIWC, machine learning, politics

Procedia PDF Downloads 361

342 Modelling Fluoride Pollution of Groundwater Using Artificial Neural Network in the Western Parts of Jharkhand

Authors: Neeta Kumari, Gopal Pathak

Abstract:

Artificial neural network has been proved to be an efficient tool for non-parametric modeling of data in various applications where output is non-linearly associated with input. It is a preferred tool for many predictive data mining applications because of its power , flexibility, and ease of use. A standard feed forward networks (FFN) is used to predict the groundwater fluoride content. The ANN model is trained using back propagated algorithm, Tansig and Logsig activation function having varying number of neurons. The models are evaluated on the basis of statistical performance criteria like Root Mean Squarred Error (RMSE) and Regression coefficient (R2), bias (mean error), Coefficient of variation (CV), Nash-Sutcliffe efficiency (NSE), and the index of agreement (IOA). The results of the study indicate that Artificial neural network (ANN) can be used for groundwater fluoride prediction in the limited data situation in the hard rock region like western parts of Jharkhand with sufficiently good accuracy.

Keywords: Artificial neural network (ANN), FFN (Feed-forward network), backpropagation algorithm, Levenberg-Marquardt algorithm, groundwater fluoride contamination

Procedia PDF Downloads 516

341 User Modeling from the Perspective of Improvement in Search Results: A Survey of the State of the Art

Authors: Samira Karimi-Mansoub, Rahem Abri

Abstract:

Currently, users expect high quality and personalized information from search results. To satisfy user’s needs, personalized approaches to web search have been proposed. These approaches can provide the most appropriate answer for user’s needs by using user context and incorporating information about query provided by combining search technologies. To carry out personalized web search, there is a need to make different techniques on whole of user search process. There are the number of possible deployment of personalized approaches such as personalized web search, personalized recommendation, personalized summarization and filtering systems and etc. but the common feature of all approaches in various domains is that user modeling is utilized to provide personalized information from the Web. So the most important work in personalized approaches is user model mining. User modeling applications and technologies can be used in various domains depending on how the user collected information may be extracted. In addition to, the used techniques to create user model is also different in each of these applications. Since in the previous studies, there was not a complete survey in this field, our purpose is to present a survey on applications and techniques of user modeling from the viewpoint of improvement in search results by considering the existing literature and researches.

Keywords: filtering systems, personalized web search, user modeling, user search behavior

Procedia PDF Downloads 257

340 Benthic Foraminiferal Responses to Coastal Pollution for Some Selected Sites along Red Sea, Egypt

Authors: Ramadan M. El-Kahawy, M. A. El-Shafeiy, Mohamed Abd El-Wahab, S. A. Helal, Nabil Aboul-Ela

Abstract:

Due to the economic importance of Safaga Bay, Quseir harbor and Ras Gharib harbor , a multidisciplinary approach was adopted to invistigate 27 surfecial sediment samples from the three sites and 9 samples for each in order to use the benthic foraminifera as bio-indicators for characterization of the environmental variations. Grain size analyses indicate that the bottom facies in the inner part of quseir is muddy while the inner part of Ras Gharib and Safaga is silty sand and those close to the entrance of Safaga bay and Ras Gharib is sandy facies while quseir still also muddy facies. geochemical data show high concentration of heavy-metals mainly in Ras Gharib due to oil leakage from the hydrocarbon oil field and Safaga bay due to the phosphate mining while quseir is medium concentration due to anthropocentric effect.micropaelontological analyses indicate the boundaries of the highest concentration of heavy metals and those of low concentration as well.the dominant benthic foraminifera in these three sites are Ammonia beccarii, Amphistigina and sorites. the study highlights the worsening of environmental conditions and also show that the areas in need of a priority recovery.

Keywords: benthic foraminifera, Ras Gharib, Safaga, Quseir, Red Sea, Egypt

Procedia PDF Downloads 324

339 Sentiment Analysis of Ensemble-Based Classifiers for E-Mail Data

Authors: Muthukumarasamy Govindarajan

Abstract:

Detection of unwanted, unsolicited mails called spam from email is an interesting area of research. It is necessary to evaluate the performance of any new spam classifier using standard data sets. Recently, ensemble-based classifiers have gained popularity in this domain. In this research work, an efficient email filtering approach based on ensemble methods is addressed for developing an accurate and sensitive spam classifier. The proposed approach employs Naive Bayes (NB), Support Vector Machine (SVM) and Genetic Algorithm (GA) as base classifiers along with different ensemble methods. The experimental results show that the ensemble classifier was performing with accuracy greater than individual classifiers, and also hybrid model results are found to be better than the combined models for the e-mail dataset. The proposed ensemble-based classifiers turn out to be good in terms of classification accuracy, which is considered to be an important criterion for building a robust spam classifier.

Keywords: accuracy, arcing, bagging, genetic algorithm, Naive Bayes, sentiment mining, support vector machine

Procedia PDF Downloads 115

338 Product Features Extraction from Opinions According to Time

Authors: Kamal Amarouche, Houda Benbrahim, Ismail Kassou

Abstract:

Nowadays, e-commerce shopping websites have experienced noticeable growth. These websites have gained consumers’ trust. After purchasing a product, many consumers share comments where opinions are usually embedded about the given product. Research on the automatic management of opinions that gives suggestions to potential consumers and portrays an image of the product to manufactures has been growing recently. After launching the product in the market, the reviews generated around it do not usually contain helpful information or generic opinions about this product (e.g. telephone: great phone...); in the sense that the product is still in the launching phase in the market. Within time, the product becomes old. Therefore, consumers perceive the advantages/ disadvantages about each specific product feature. Therefore, they will generate comments that contain their sentiments about these features. In this paper, we present an unsupervised method to extract different product features hidden in the opinions which influence its purchase, and that combines Time Weighting (TW) which depends on the time opinions were expressed with Term Frequency-Inverse Document Frequency (TF-IDF). We conduct several experiments using two different datasets about cell phones and hotels. The results show the effectiveness of our automatic feature extraction, as well as its domain independent characteristic.

Keywords: opinion mining, product feature extraction, sentiment analysis, SentiWordNet

Procedia PDF Downloads 371

337 Artificial Reproduction System and Imbalanced Dataset: A Mendelian Classification

Authors: Anita Kushwaha

Abstract:

We propose a new evolutionary computational model called Artificial Reproduction System which is based on the complex process of meiotic reproduction occurring between male and female cells of the living organisms. Artificial Reproduction System is an attempt towards a new computational intelligence approach inspired by the theoretical reproduction mechanism, observed reproduction functions, principles and mechanisms. A reproductive organism is programmed by genes and can be viewed as an automaton, mapping and reducing so as to create copies of those genes in its off springs. In Artificial Reproduction System, the binding mechanism between male and female cells is studied, parameters are chosen and a network is constructed also a feedback system for self regularization is established. The model then applies Mendel’s law of inheritance, allele-allele associations and can be used to perform data analysis of imbalanced data, multivariate, multiclass and big data. In the experimental study Artificial Reproduction System is compared with other state of the art classifiers like SVM, Radial Basis Function, neural networks, K-Nearest Neighbor for some benchmark datasets and comparison results indicates a good performance.

Keywords: bio-inspired computation, nature- inspired computation, natural computing, data mining

Procedia PDF Downloads 243

336 Feature Weighting Comparison Based on Clustering Centers in the Detection of Diabetic Retinopathy

Authors: Kemal Polat

Abstract:

In this paper, three feature weighting methods have been used to improve the classification performance of diabetic retinopathy (DR). To classify the diabetic retinopathy, features extracted from the output of several retinal image processing algorithms, such as image-level, lesion-specific and anatomical components, have been used and fed them into the classifier algorithms. The dataset used in this study has been taken from University of California, Irvine (UCI) machine learning repository. Feature weighting methods including the fuzzy c-means clustering based feature weighting, subtractive clustering based feature weighting, and Gaussian mixture clustering based feature weighting, have been used and compered with each other in the classification of DR. After feature weighting, five different classifier algorithms comprising multi-layer perceptron (MLP), k- nearest neighbor (k-NN), decision tree, support vector machine (SVM), and Naïve Bayes have been used. The hybrid method based on combination of subtractive clustering based feature weighting and decision tree classifier has been obtained the classification accuracy of 100% in the screening of DR. These results have demonstrated that the proposed hybrid scheme is very promising in the medical data set classification.

Keywords: machine learning, data weighting, classification, data mining

Procedia PDF Downloads 307

335 An Evaluation of Edible Plants for Remediation of Contaminated Soil- Can Edible Plants Be Used to Remove Heavy Metals on Soil?

Authors: Celia Marilia Martins, Sonia I. V. Guilundo, Iris M. Victorino, Antonio O. Quilambo

Abstract:

In Mozambique rapid industrialization (mining, aluminium and cement activities) and urbanization processes has led to the incorporation of heavy metals on soil, thus degrading not only the quality of the environment, but also affecting plants, animals and human healthy. Several methods have been used to remediate contaminated soils, but most of them are costly and diﬃcult to get optimum results. Currently, phytoremediation is an eﬀective and aﬀordable technological solution used to extract or remove inactive metals from contaminated soil. Phytoremediation is the use of plants to clean up a contamination from soils, sediments, and water. This technology is environmental friendly and potentially cost eﬀective. The present investigation summarised the potential of edible vegetable to grow under the high level of heavy metals such as lead and zinc. The plants used in these studies include Tomatoes, lettuce and Soya beans. The studies have shown that edible plants can be grown under the high level of heavy metals on the soil. Further investigations are identifying mechanisms used by plants to ensure a safe and sustainable use for remediation of contaminated soils by heavy metals.

Keywords: contaminated soil, edible plants, heavy metals, phytoremediation

Procedia PDF Downloads 345

334 Domain Adaptive Dense Retrieval with Query Generation

Authors: Rui Yin, Haojie Wang, Xun Li

Abstract:

Recently, mainstream dense retrieval methods have obtained state-of-the-art results on some datasets and tasks. However, they require large amounts of training data, which is not available in most domains. The severe performance degradation of dense retrievers on new data domains has limited the use of dense retrieval methods to only a few domains with large training datasets. In this paper, we propose an unsupervised domain-adaptive approach based on query generation. First, a generative model is used to generate relevant queries for each passage in the target corpus, and then, the generated queries are used for mining negative passages. Finally, the query-passage pairs are labeled with a cross-encoder and used to train a domain-adapted dense retriever. We also explore contrastive learning as a method for training domain-adapted dense retrievers and show that it leads to strong performance in various retrieval settings. Experiments show that our approach is more robust than previous methods in target domains that require less unlabeled data.

Keywords: dense retrieval, query generation, contrastive learning, unsupervised training

Procedia PDF Downloads 68

333 Aquatic and Marshy Flora from Fresh Water Wetlands on Quartz Sands in Pinar Del Río, Cuba

Authors: Vidal Pérez Hernández, Enrique González Pendás

Abstract:

The most of the aquatic and marshy flora in Cuba, is located on quartzitic sands ecosystems and they are represented by a wide variety of freshwater wetlands, which are spread in the whole south and south-western plain of Pinar del Río. The survey carried out in these ecosystems offers an updated inventory of these species, showing up their biological type, habit, distribution, and the threat grade to which are subjected, taking into account categories granted by UICN. A remarkable decrease is evidenced, in the total of these species respect to this area; due to deposit processes and deforestation, which are taken place by the human activity and the climatic change. It is linked to others threats like, limitless use of their water reserves for irrigating groves, the cattle raising and intensive fishing. Added to it, its sand with 99% pure crystal quartz, are used for the mining. The combination of all factors has a negative influence on a flora that stores more than 250 species, most of them herbaceous and hydrophytes. In these particular ecosystems were found a 40% endemism from total flora, and more than 80%, are evaluated inside the most sensitive threat categories, and already some of them have been declared as extinct.

Keywords: aquatic flora, marshy flora, quartzitic sands, wetlands

Procedia PDF Downloads 200

332 Quantifying User-Related, System-Related, and Context-Related Patterns of Smartphone Use

Authors: Andrew T. Hendrickson, Liven De Marez, Marijn Martens, Gytha Muller, Tudor Paisa, Koen Ponnet, Catherine Schweizer, Megan Van Meer, Mariek Vanden Abeele

Abstract:

Quantifying and understanding the myriad ways people use their phones and how that impacts their relationships, cognitive abilities, mental health, and well-being is increasingly important in our phone-centric society. However, most studies on the patterns of phone use have focused on theory-driven tests of specific usage hypotheses using self-report questionnaires or analyses of smaller datasets. In this work we present a series of analyses from a large corpus of over 3000 users that combine data-driven and theory-driven analyses to identify reliable smartphone usage patterns and clusters of similar users. Furthermore, we compare the stability of user clusters across user- and system-initiated sessions, as well as during the hypothesized ritualized behavior times directly before and after sleeping. Our results indicate support for some hypothesized usage patterns but present a more complete and nuanced view of how people use smartphones.

Keywords: data mining, experience sampling, smartphone usage, health and well being

Procedia PDF Downloads 140

331 Internet of Things, Edge and Cloud Computing in Rock Mechanical Investigation for Underground Surveys

Authors: Esmael Makarian, Ayub Elyasi, Fatemeh Saberi, Olusegun Stanley Tomomewo

Abstract:

Rock mechanical investigation is one of the most crucial activities in underground operations, especially in surveys related to hydrocarbon exploration and production, geothermal reservoirs, energy storage, mining, and geotechnics. There is a wide range of traditional methods for driving, collecting, and analyzing rock mechanics data. However, these approaches may not be suitable or work perfectly in some situations, such as fractured zones. Cutting-edge technologies have been provided to solve and optimize the mentioned issues. Internet of Things (IoT), Edge, and Cloud Computing technologies (ECt & CCt, respectively) are among the most widely used and new artificial intelligence methods employed for geomechanical studies. IoT devices act as sensors and cameras for real-time monitoring and mechanical-geological data collection of rocks, such as temperature, movement, pressure, or stress levels. Structural integrity, especially for cap rocks within hydrocarbon systems, and rock mass behavior assessment, to further activities such as enhanced oil recovery (EOR) and underground gas storage (UGS), or to improve safety risk management (SRM) and potential hazards identification (P.H.I), are other benefits from IoT technologies. EC techniques can process, aggregate, and analyze data immediately collected by IoT on a real-time scale, providing detailed insights into the behavior of rocks in various situations (e.g., stress, temperature, and pressure), establishing patterns quickly, and detecting trends. Therefore, this state-of-the-art and useful technology can adopt autonomous systems in rock mechanical surveys, such as drilling and production (in hydrocarbon wells) or excavation (in mining and geotechnics industries). Besides, ECt allows all rock-related operations to be controlled remotely and enables operators to apply changes or make adjustments. It must be mentioned that this feature is very important in environmental goals. More often than not, rock mechanical studies consist of different data, such as laboratory tests, field operations, and indirect information like seismic or well-logging data. CCt provides a useful platform for storing and managing a great deal of volume and different information, which can be very useful in fractured zones. Additionally, CCt supplies powerful tools for predicting, modeling, and simulating rock mechanical information, especially in fractured zones within vast areas. Also, it is a suitable source for sharing extensive information on rock mechanics, such as the direction and size of fractures in a large oil field or mine. The comprehensive review findings demonstrate that digital transformation through integrated IoT, Edge, and Cloud solutions is revolutionizing traditional rock mechanical investigation. These advanced technologies have empowered real-time monitoring, predictive analysis, and data-driven decision-making, culminating in noteworthy enhancements in safety, efficiency, and sustainability. Therefore, by employing IoT, CCt, and ECt, underground operations have experienced a significant boost, allowing for timely and informed actions using real-time data insights. The successful implementation of IoT, CCt, and ECt has led to optimized and safer operations, optimized processes, and environmentally conscious approaches in underground geological endeavors.

Keywords: rock mechanical studies, internet of things, edge computing, cloud computing, underground surveys, geological operations

Procedia PDF Downloads 35