Search results for: underground coal mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1665

435 Topic Modelling Using Latent Dirichlet Allocation and Latent Semantic Indexing on SA Telco Twitter Data

Authors: Phumelele Kubheka, Pius Owolawi, Gbolahan Aiyetoro

Abstract:

Twitter is one of the most popular social media platforms where users can share their opinions on different subjects. As of 2010, the Twitter platform generated more than 12 terabytes of data daily, or about 4.3 petabytes in a single year. For this reason, Twitter is a great source for big data mining. Many industries, such as telecommunication companies, can leverage the availability of Twitter data to better understand their markets and make appropriate business decisions. This study performs topic modeling on Twitter data using Latent Dirichlet Allocation (LDA). The obtained results are benchmarked against another topic modeling technique, Latent Semantic Indexing (LSI). The study aims to retrieve topics from a Twitter dataset containing user tweets on South African telcos. Results from this study show that LSI is much faster than LDA. However, LDA yields better results, with topic coherence 8% higher for the best-performing model represented in Table 1. A higher topic coherence score indicates better performance of the model.

Keywords: big data, latent Dirichlet allocation, latent semantic indexing, telco, topic modeling, twitter

Procedia PDF Downloads 152
434 Analyzing Semantic Feature Using Multiple Information Sources for Reviews Summarization

Authors: Yu Hung Chiang, Hei Chia Wang

Abstract:

Nowadays, tourism has become a part of life. Before reserving hotels, customers need information about them to help make decisions, and the most important source of that information is online reviews. Due to the dramatic growth of online reviews, it is impossible for tourists to read all reviews manually. Therefore, an automatic review analysis system that summarizes reviews is necessary. The main purpose of the system is to understand the opinion expressed in reviews, which may be positive or negative. In other words, the system analyzes whether the customers who visited the hotel liked it or not. Sentiment analysis methods help the system achieve this purpose. In sentiment analysis, the targets of an opinion (here called features) should be recognized to clarify the polarity of the opinion, because the polarity may otherwise be ambiguous. Hence, the study proposes an unsupervised method using Part-Of-Speech patterns and multi-lexicon sentiment analysis to summarize all reviews. We expect this method to help customers find the information they want and make decisions efficiently.

Keywords: text mining, sentiment analysis, product feature extraction, multi-lexicons

Procedia PDF Downloads 331
433 Automated Process Quality Monitoring and Diagnostics for Large-Scale Measurement Data

Authors: Hyun-Woo Cho

Abstract:

Continuous monitoring of industrial plants is one of the necessary tasks for ensuring high-quality final products. In monitoring and diagnosis, it is critical to detect incipient abnormal events in manufacturing processes in order to improve the safety and reliability of the operations involved and to reduce related losses. In this work, a new multivariate statistical on-line diagnostic method is presented using a case study. To build the reference models, an empirical discriminant model is constructed based on various past operation runs. When a fault is detected on-line, an on-line diagnostic module is initiated. Finally, the status of the current operating conditions is compared with the reference model to make a diagnostic decision. The performance of the presented framework is evaluated using a dataset from complex industrial processes. It has been shown that the proposed diagnostic method outperforms other techniques, especially in terms of incipient detection of faults.

Keywords: data mining, empirical model, on-line diagnostics, process fault, process monitoring

Procedia PDF Downloads 401
432 The Impact of Interrelationship between Business Intelligence and Knowledge Management on Decision Making Process: An Empirical Investigation of Banking Sector in Jordan

Authors: Issa M. Shehabat, Huda F. Y. Nimri

Abstract:

This paper aims to study the relationship between knowledge management in its processes, including knowledge creation, knowledge sharing, knowledge organization, and knowledge application, and business intelligence tools, including OLAP, data mining, and data warehouse, and their impact on the decision-making process in the banking sector in Jordan. A total of 200 questionnaires were distributed to the sample of the study. The study hypotheses were tested using the statistical package SPSS. Study findings suggest that decision-making processes were positively related to knowledge management processes. Additionally, the components of business intelligence had a positive impact on decision-making. The study recommended conducting studies similar to this study in other sectors such as the industrial, telecommunications, and service sectors to contribute to enhancing understanding of the role of the knowledge management processes and business intelligence tools.

Keywords: business intelligence, knowledge management, decision making, Jordan, banking sector

Procedia PDF Downloads 145
431 Krill-Herd Step-Up Approach Based Energy Efficiency Enhancement Opportunities in the Offshore Mixed Refrigerant Natural Gas Liquefaction Process

Authors: Kinza Qadeer, Muhammad Abdul Qyyum, Moonyong Lee

Abstract:

Natural gas has become an attractive energy source in comparison with other fossil fuels because of its lower CO₂ and other air pollutant emissions. Therefore, compared to the demand for coal and oil, that for natural gas is increasing rapidly worldwide. The transportation of natural gas over long distances as a liquid (LNG) is preferable for several reasons, including economic, technical, political, and safety factors. However, LNG production is an energy-intensive process because of the large amount of power required to compress the refrigerants, which provide sufficient cold energy to liquefy natural gas. Therefore, one of the major issues in the LNG industry is to improve the energy efficiency of existing LNG processes through a cost-effective approach, namely optimization. In this context, a bio-inspired Krill-herd (KH) step-up approach was examined to enhance the energy efficiency of a single mixed refrigerant (SMR) natural gas liquefaction (LNG) process, which is considered one of the most promising candidates for offshore LNG production (FPSO). The optimal design of a natural gas liquefaction process involves multivariable non-linear thermodynamic interactions, which lead to exergy destruction and contribute to process irreversibility. As key decision variables, the optimal values of mixed refrigerant flow rates and process operating pressures were determined based on the herding behavior of krill individuals corresponding to the minimum energy consumption for LNG production. To perform the rigorous process analysis, the SMR process was simulated in Aspen Hysys® software and the resulting model was connected with the Krill-herd approach coded in MATLAB. The optimal operating conditions found by the proposed approach significantly reduced the overall energy consumption of the SMR process, by up to 22.5%, and also improved the coefficient of performance in comparison with the base case. 
The proposed approach was also compared with other well-proven optimization algorithms, such as genetic and particle swarm optimization algorithms, and was found to exhibit a superior performance over these existing approaches.

Keywords: energy efficiency, Krill-herd, LNG, optimization, single mixed refrigerant

Procedia PDF Downloads 155
430 Application of Thermal Dimensioning Tools to Consider Different Strategies for the Disposal of High-Heat-Generating Waste

Authors: David Holton, Michelle Dickinson, Giovanni Carta

Abstract:

The principle of geological disposal is to isolate higher-activity radioactive wastes deep inside a suitable rock formation to ensure that no harmful quantities of radioactivity reach the surface environment. To achieve this, wastes will be placed in an engineered underground containment facility, the geological disposal facility (GDF), which will be designed so that natural and man-made barriers work together to minimise the escape of radioactivity. Internationally, various multi-barrier concepts have been developed for the disposal of higher-activity radioactive wastes. High-heat-generating wastes (HLW, spent fuel and Pu) pose a number of technical challenges that differ from those associated with the disposal of low-heat-generating waste. Thermal management of the disposal system must be taken into consideration in GDF design; temperature constraints might apply to the wasteform, container, buffer and host rock. Of these, the temperature limit placed on the buffer component of the engineered barrier system (EBS) can be the most constraining factor. The heat must therefore be managed such that the properties of the buffer are not compromised to the extent that it cannot deliver the required level of safety. The maximum temperature of a buffer surrounding a container at the centre of a fixed array of heat-generating sources arises due to heat diffusing from neighbouring heat-generating wastes, incrementally contributing to the temperature of the EBS. A range of strategies can be employed for managing heat in a GDF, including the spatial arrangements or patterns of those containers; different geometrical configurations can influence the overall thermal density in a disposal facility (or area within a facility) and therefore the maximum buffer temperature. A semi-analytical thermal dimensioning tool and methodology have been applied at a generic stage to explore a range of strategies to manage the disposal of high-heat-generating waste. 
A number of examples, including different geometrical layouts and chequer-boarding, have been illustrated to demonstrate how these tools can be used to consider safety margins and inform strategic disposal options when faced with uncertainty, at a generic stage of the development of a GDF.

Keywords: buffer, geological disposal facility, high-heat-generating waste, spent fuel

Procedia PDF Downloads 286
429 Balancing Electricity Demand and Supply to Protect a Company from Load Shedding: A Review

Authors: G. W. Greubel, A. Kalam

Abstract:

This paper provides a review of the technical problems facing the South African electricity system and discusses a hypothetical ‘virtual grid’ concept that may assist in solving the problems. The proposed solution has potential application across emerging markets with constrained power infrastructure or for companies who wish to be entirely powered by renewable energy. South Africa finds itself at a confluence of forces where the national electricity supply system is constrained with under-supply primarily from old and failing coal-fired power stations and congested and inadequate transmission and distribution systems. Simultaneously, the country attempts to meet carbon reduction targets driven by both an alignment with international goals and a consumer-driven requirement. The constrained electricity system is an aspect of an economy characterized by very low economic growth, high unemployment, and frequent and significant load shedding. The fiscus does not have the funding to build new generation capacity or strengthen the grid. The under-supply is increasingly alleviated by the penetration of wind and solar generation capacity and embedded roof-top solar. However, this increased penetration results in less inertia, less synchronous generation, and less capability for fast frequency response, with resultant instability. The renewable energy facilities assist in solving the under-supply issues but merely ‘kick the can down the road’ by not contributing to grid stability or by substituting the lost inertia, thus creating an expanding issue for the grid to manage. By technically balancing its electricity demand and supply a company with facilities located across the country can be protected from the effects of load shedding, and thus ensure financial and production performance, protect jobs, and contribute meaningfully to the economy. 
By treating the company’s load (across the country) and its various distributed generation facilities as a ‘virtual grid’, which by design will provide ancillary services to the grid, one is able to create a win-win situation for both the company and the grid.

Keywords: load shedding, renewable energy integration, smart grid, virtual grid, virtual power plant

Procedia PDF Downloads 60
428 Developing Serious Games to Improve Learning Experience of Programming: A Case Study

Authors: Shan Jiang, Xinyu Tang

Abstract:

Game-based learning is an emerging pedagogy to make the learning experience more effective, enjoyable, and fun. However, most games used in classroom settings have been overly simplistic. This paper presents a case study on a Python-based online game designed to improve the effectiveness in both teaching and research in higher education. The proposed game system not only creates a fun and enjoyable experience for students to learn various topics in programming but also improves the effectiveness of teaching in several aspects, including material presentation, helping students to recognize the importance of the subjects, and linking theoretical concepts to practice. The proposed game system also serves as an information cyber-infrastructure that automatically collects and stores data from players. The data could be useful in research areas including human-computer interaction, decision making, opinion mining, and artificial intelligence. They further provide other possibilities beyond these areas due to the customizable nature of the game.

Keywords: game-based learning, programming, research-teaching integration, Hearthstone

Procedia PDF Downloads 166
427 3D Modeling for Frequency and Time-Domain Airborne EM Systems with Topography

Authors: C. Yin, B. Zhang, Y. Liu, J. Cai

Abstract:

Airborne EM (AEM) is an effective geophysical exploration tool, especially suitable for rugged mountain areas. In these areas, topography has serious effects on AEM system responses. However, until now few studies have been reported on the topographic effect on airborne EM systems. In this paper, an edge-based unstructured finite-element (FE) method is developed for 3D topographic modeling for both frequency- and time-domain airborne EM systems. Starting from the frequency-domain Maxwell equations, a vector Helmholtz equation is derived to obtain a stable and accurate solution. Considering that the AEM transmitter and receiver are both located in the air, the scattered field method is used in our modeling. The Galerkin method is applied to discretize the Helmholtz equation for the final FE equations. Solving the FE equations, the frequency-domain AEM responses are obtained. To accelerate the calculation, the response of the source in free space is used as the primary field and the PARDISO direct solver is used to deal with the problem of multiple transmitting sources. After calculating the frequency-domain AEM responses, a Hankel transform is applied to obtain the time-domain AEM responses. To check the accuracy of the present algorithm and to analyze the characteristics of the topographic effect on airborne EM systems, both the frequency- and time-domain AEM responses for 3 model groups are simulated: 1) a flat half-space model that has a semi-analytical solution of the EM response; 2) a valley or hill earth model; 3) a valley or hill earth with an abnormal body embedded. Numerical experiments show that close to the node points of the topography, AEM responses demonstrate sharp changes. Special attention needs to be paid to the topographic effects when interpreting AEM survey data over rugged topographic areas. Besides, the profile of the AEM responses presents a mirror relation with the topographic earth surface. 
In comparison to the topographic effect that mainly occurs at the high-frequency end and early time channels, the EM responses of underground conductors mainly occur at low frequencies and later time channels. For the signal of the same time channel, the dB/dt field reflects the change of conductivity better than the B-field. The research of this paper will serve airborne EM in the identification and correction of the topographic effects.

Keywords: 3D, Airborne EM, forward modeling, topographic effect

Procedia PDF Downloads 318
426 Development of Knowledge Discovery Based Interactive Decision Support System on Web Platform for Maternal and Child Health System Strengthening

Authors: Partha Saha, Uttam Kumar Banerjee

Abstract:

Maternal and Child Healthcare (MCH) has always been regarded as one of the important issues globally. Reduction of maternal and child mortality rates and an increase in healthcare service coverage were declared targets in the Millennium Development Goals until 2015 and thereafter an important component of the Sustainable Development Goals. Over the last decade, worldwide MCH indicators have improved but could not match the expected levels. Progress on both maternal and child mortality rates has been monitored by several researchers. Each of the studies has stated that less than 26% of low-income and middle-income countries (LMICs) were on track to achieve the targets prescribed by MDG4. The average worldwide annual rates of reduction of the under-five mortality rate and the maternal mortality rate were 2.2% and 1.9% respectively as of 2011, whereas the rates should be at least 4.4% and 5.5% annually to achieve the targets. Despite the existence of proven healthcare interventions for both mothers and children, these could not be scaled up to the required volume due to fragmented health systems, especially in developing and under-developed countries. In this research, a knowledge discovery based interactive Decision Support System (DSS) has been developed on a web platform to assist healthcare policy makers in developing evidence-based policies. To achieve desirable results in MCH, efficient resource planning is required. In most LMICs, resources are a major constraint. Knowledge generated through this system would help healthcare managers to develop strategic resource planning for combating issues like large inequity and low coverage in MCH. This system would help healthcare managers to accomplish the following four tasks. 
These are a) comprehending region-wise conditions of variables related to MCH, b) identifying relationships among variables, c) segmenting regions based on variable status, and d) finding segment-wise key influential variables which have a major impact on healthcare indicators. The system development process was divided into three phases: i) identifying contemporary issues related to MCH services and policy making; ii) development of the system; and iii) verification and validation of the system. More than 90 variables under three categories, namely a) educational, social, and economic parameters; b) MCH interventions; and c) health system building blocks, have been included in this web-based DSS, and five separate modules have been developed under the system. The first module has been designed for analysing the current healthcare scenario. The second module would help healthcare managers to understand correlations among variables. The third module would reveal frequently occurring incidents along with different MCH interventions. The fourth module would segment regions based on the three categories mentioned previously, and in the fifth module, segment-wise key influential interventions will be identified. India has been considered as the case study area in this research. Data from 601 districts of India have been used for inspecting the effectiveness of the developed modules. The system has been developed by implementing different statistical and data mining techniques on a web platform. Aided by its interactive capability, policy makers would be able to generate different scenarios from the system before drawing any inference.
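The required annual rates of reduction quoted in the abstract (4.4% and 5.5%) can be reproduced with a short calculation. The sketch below rests on two assumptions that the abstract does not state: that the MDG targets were a two-thirds cut in under-five mortality and a three-quarters cut in maternal mortality over the 25 years from 1990 to 2015, and that the annual rate of reduction is the log-based (continuously compounded) rate. The function names are illustrative only.

```python
import math


def required_arr(reduction_fraction, years):
    """Log-based annual rate of reduction needed to cut a rate by
    `reduction_fraction` (e.g. 2/3) over `years` years."""
    remaining = 1.0 - reduction_fraction
    return math.log(1.0 / remaining) / years


def years_to_target(reduction_fraction, annual_rate):
    """Years needed to achieve the same cut at a given log-based annual rate."""
    remaining = 1.0 - reduction_fraction
    return math.log(1.0 / remaining) / annual_rate


# Assumed targets (not stated in the abstract): 1990-2015, i.e. 25 years.
mdg4_required = required_arr(2 / 3, 25)   # under-five mortality
mdg5_required = required_arr(3 / 4, 25)   # maternal mortality

print(f"Required U5MR reduction rate: {100 * mdg4_required:.1f}% per year")
print(f"Required MMR reduction rate:  {100 * mdg5_required:.1f}% per year")

# At the observed rates quoted in the abstract (2.2% and 1.9%), the same
# cuts would take considerably longer than 25 years.
print(f"Years needed at 2.2%: {years_to_target(2 / 3, 0.022):.0f}")
print(f"Years needed at 1.9%: {years_to_target(3 / 4, 0.019):.0f}")
```

Under these assumptions the required rates round to 4.4% and 5.5% per year, which matches the figures in the abstract and shows why the observed rates fall well short of the targets.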

Keywords: maternal and child healthcare, decision support systems, data mining techniques, low and middle income countries

Procedia PDF Downloads 259
425 Classification of Political Affiliations by Reduced Number of Features

Authors: Vesile Evrim, Aliyu Awwal

Abstract:

With the evolution of technology, the expression of opinions has shifted to the digital world. Politics, one of the most active topics of opinion mining research, has merged with behavior analysis for determining affiliation in text, which constitutes the subject of this paper. This study aims to classify the text in news/blogs as either Republican or Democrat with the minimum number of features. As an initial set, 68 features, of which 64 are Linguistic Inquiry and Word Count (LIWC) features, are tested against 14 benchmark classification algorithms. In the later experiments, the dimensions of the feature vector are reduced based on 7 feature selection algorithms. The results show that Decision Tree, Rule Induction and M5 Rule classifiers, when used with SVM and IGR feature selection algorithms, performed best, reaching up to 82.5% accuracy on the given dataset. Further tests on a single feature and on the linguistic-based feature sets showed similar results. The feature “function”, an aggregate feature of the linguistic category, is found to be the most differentiating feature among the 68 features, with 81% accuracy by itself in classifying articles as either Republican or Democrat.

Keywords: feature selection, LIWC, machine learning, politics

Procedia PDF Downloads 383
424 Modelling Fluoride Pollution of Groundwater Using Artificial Neural Network in the Western Parts of Jharkhand

Authors: Neeta Kumari, Gopal Pathak

Abstract:

Artificial neural networks have proved to be an efficient tool for non-parametric modeling of data in various applications where the output is non-linearly associated with the input. They are a preferred tool for many predictive data mining applications because of their power, flexibility, and ease of use. A standard feed-forward network (FFN) is used to predict the groundwater fluoride content. The ANN model is trained using the backpropagation algorithm with Tansig and Logsig activation functions and a varying number of neurons. The models are evaluated on the basis of statistical performance criteria such as root mean squared error (RMSE), regression coefficient (R²), bias (mean error), coefficient of variation (CV), Nash-Sutcliffe efficiency (NSE), and the index of agreement (IOA). The results of the study indicate that an artificial neural network (ANN) can be used for groundwater fluoride prediction with sufficiently good accuracy in a limited-data situation in a hard rock region like the western parts of Jharkhand.
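Several of the performance criteria named in this abstract have simple closed forms. The sketch below uses common textbook definitions of RMSE, bias, NSE and Willmott's index of agreement; the abstract does not give the exact formulas the authors used, so these definitions are assumptions, and the sample values are made up purely for illustration.

```python
import math


def evaluation_metrics(observed, predicted):
    """Return RMSE, bias (mean error), NSE and index of agreement (IOA)
    for paired observed and predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    errors = [p - o for o, p in zip(observed, predicted)]

    sse = sum(e * e for e in errors)                      # sum of squared errors
    sst = sum((o - mean_obs) ** 2 for o in observed)      # spread of observations
    potential = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for o, p in zip(observed, predicted)
    )                                                     # Willmott's potential error

    return {
        "rmse": math.sqrt(sse / n),
        "bias": sum(errors) / n,
        "nse": 1.0 - sse / sst,
        "ioa": 1.0 - sse / potential,
    }


# Hypothetical fluoride concentrations in mg/L (not data from the study).
observed = [0.8, 1.2, 1.9, 2.4, 0.6]
predicted = [0.9, 1.1, 1.7, 2.6, 0.7]

for name, value in evaluation_metrics(observed, predicted).items():
    print(f"{name}: {value:.3f}")
```

An NSE of 1 indicates a perfect fit, while an NSE of 0 means the model does no better than predicting the mean of the observations.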

Keywords: Artificial neural network (ANN), FFN (Feed-forward network), backpropagation algorithm, Levenberg-Marquardt algorithm, groundwater fluoride contamination

Procedia PDF Downloads 551
423 User Modeling from the Perspective of Improvement in Search Results: A Survey of the State of the Art

Authors: Samira Karimi-Mansoub, Rahem Abri

Abstract:

Currently, users expect high-quality and personalized information from search results. To satisfy users’ needs, personalized approaches to web search have been proposed. These approaches can provide the most appropriate answer for a user’s needs by using the user context and incorporating information about the query, combining several search technologies. Carrying out personalized web search requires applying different techniques across the whole user search process. There are a number of possible deployments of personalized approaches, such as personalized web search, personalized recommendation, personalized summarization and filtering systems, but the common feature of all approaches in the various domains is that user modeling is utilized to provide personalized information from the Web. The most important task in personalized approaches is therefore user model mining. User modeling applications and technologies can be used in various domains depending on how the collected user information can be extracted. In addition, the techniques used to create the user model differ across these applications. Since previous studies have not provided a complete survey of this field, our purpose is to present a survey of the applications and techniques of user modeling from the viewpoint of improving search results, based on the existing literature and research.

Keywords: filtering systems, personalized web search, user modeling, user search behavior

Procedia PDF Downloads 280
422 Study on Pd Catalyst Supported on Carbon Materials for C₂ Hydrogenation

Authors: Huanru Wang, Jianzhun Jiang

Abstract:

At present, preparing the catalyst on a carbon carrier is one of the improvement directions for the C₂ pre-hydrogenation catalyst. Carbon materials can be prepared from coal direct liquefaction residues, coconut shells, biomass, etc., and the pore structure of carbon carrier materials can be adjusted through the preparation process; at high temperatures, the carbon carrier itself also shows certain catalytic activity. Therefore, this paper selected typical activated carbon and coconut shell carbon as carbon carrier materials, studied their microstructure and surface properties, prepared a series of carbon-based catalysts loaded with Pd, and investigated the effects of the content of promoter Ag and the concentration of reductant on the structure of the catalyst and its catalytic performance for the pre-hydrogenation of C₂. In this paper, the carbon supports from the two sources and the catalysts prepared from them were characterized in detail. The results showed that the morphology and structure of the different supports, and the performance of the catalysts prepared from them, were obviously different. The catalyst supported on coconut shell carbon has a small specific surface area and large pore diameter. The catalyst supported on activated carbon has a large specific surface area and rich pore structure. The activated carbon support is mainly a mixture of amorphous graphite and microcrystalline graphite. For the catalyst prepared with coconut shell carbon as the carrier, the sample is very uneven, and its specific surface area and pore volume are irregular. Compared with coconut shell carbon, activated carbon is more suitable as the carrier of the C₂ hydrogenation catalyst. The conversion of acetylene, methyl acetylene, and butadiene decreased, and the ethylene selectivity increased after Ag was added to the supported Pd catalyst. When the amount of promoter Ag is 0.01-0.015%, the catalyst has relatively good catalytic performance. 
Ag and Pd form an alloying effect, thus reducing the effective demand for Ag. The Pd Ag ratio is the key factor affecting the catalytic performance. When the addition amount of Ag is 0.01-0.015%, the dispersion of Pd on the carbon support surface can be significantly improved, and the size of active particles can be reduced. The Pd Ag ratio is the main factor in improving the selectivity of the catalyst. When the additional amount of sodium formate is 1%, the catalyst prepared has both high acetylene conversion and high ethylene selectivity.

Keywords: C₂ hydrogenation, activated carbon, Ag promoter, Pd catalysts

Procedia PDF Downloads 121
421 Benthic Foraminiferal Responses to Coastal Pollution for Some Selected Sites along Red Sea, Egypt

Authors: Ramadan M. El-Kahawy, M. A. El-Shafeiy, Mohamed Abd El-Wahab, S. A. Helal, Nabil Aboul-Ela

Abstract:

Due to the economic importance of Safaga Bay, Quseir harbor and Ras Gharib harbor, a multidisciplinary approach was adopted to investigate 27 surficial sediment samples, 9 from each of the three sites, in order to use benthic foraminifera as bio-indicators for characterizing environmental variations. Grain size analyses indicate that the bottom facies in the inner part of Quseir is muddy, while the inner parts of Ras Gharib and Safaga are silty sand; close to the entrances of Safaga Bay and Ras Gharib the facies is sandy, while at Quseir it remains muddy. Geochemical data show high concentrations of heavy metals, mainly in Ras Gharib due to oil leakage from the hydrocarbon oil field and in Safaga Bay due to phosphate mining, while Quseir shows medium concentrations due to anthropogenic effects. Micropaleontological analyses also indicate the boundaries between the areas of highest and lowest heavy-metal concentration. The dominant benthic foraminifera at the three sites are Ammonia beccarii, Amphistegina and Sorites. The study highlights the worsening of environmental conditions and also shows the areas in need of priority recovery.

Keywords: benthic foraminifera, Ras Gharib, Safaga, Quseir, Red Sea, Egypt

Procedia PDF Downloads 351
420 Investigation of Turbulent Flow in a Bubble Column Photobioreactor and Consequent Effects on Microalgae Cultivation Using Computational Fluid Dynamic Simulation

Authors: Geetanjali Yadav, Arpit Mishra, Parthsarathi Ghosh, Ramkrishna Sen

Abstract:

The world is facing problems of increasing global CO2 emissions, climate change and fuel crisis. Therefore, several renewable and sustainable energy alternatives should be investigated to replace non-renewable fuels in the future. Algae present themselves as a versatile feedstock for the production of a variety of fuels (biodiesel, bioethanol, bio-hydrogen etc.) and high-value compounds for food, fodder, cosmetics and pharmaceuticals. Microalgae are simple microorganisms that require water, light, CO2 and nutrients for growth by photosynthesis and can grow in extreme environments and utilize waste gas (flue gas) and waste waters. Mixing, however, is a crucial parameter within the culture system for the uniform distribution of light, nutrients and gaseous exchange, in addition to preventing settling/sedimentation, creation of dark zones etc. The overarching goal of the present study is to improve photobioreactor (PBR) design for enhancing dissolution of CO2 from ambient air (0.039%, v/v), pure CO2 and coal-fired flue gas (10 ± 2%) into microalgal PBRs. Computational fluid dynamics (CFD), a state-of-the-art technique, has been used to solve the partial differential equations with turbulence closure that represent the dynamics of the fluid in a photobioreactor. In this paper, the hydrodynamic performance of the modified PBR has been characterized and compared with that of the conventional bubble column PBR using CFD. Parameters such as flow rate (Q), mean velocity (u) and mean turbulent kinetic energy (TKE) were characterized for each experiment, tested across different aeration schemes. The results showed that the modified PBR design had superior liquid circulation properties and gas-liquid transfer, which resulted in a more uniform environment inside the PBR compared to the conventional bubble column PBR. 
The CFD technique has been shown to be promising for PBR design and paves the way for future research to develop PBRs that can be made commercially available for scaled-up microalgal production.

Keywords: computational fluid dynamics, microalgae, bubble column photobioreactor, flue gas, simulation

Procedia PDF Downloads 232
419 Sentiment Analysis of Ensemble-Based Classifiers for E-Mail Data

Authors: Muthukumarasamy Govindarajan

Abstract:

Detection of unwanted, unsolicited e-mails, called spam, is an interesting area of research. It is necessary to evaluate the performance of any new spam classifier using standard data sets. Recently, ensemble-based classifiers have gained popularity in this domain. In this research work, an efficient e-mail filtering approach based on ensemble methods is proposed for developing an accurate and sensitive spam classifier. The proposed approach employs Naive Bayes (NB), Support Vector Machine (SVM) and Genetic Algorithm (GA) as base classifiers along with different ensemble methods. The experimental results show that the ensemble classifier achieved greater accuracy than the individual classifiers, and the hybrid model results are found to be better than those of the combined models for the e-mail dataset. The proposed ensemble-based classifiers perform well in terms of classification accuracy, which is considered to be an important criterion for building a robust spam classifier.
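The idea of combining several base classifiers can be illustrated with a plain majority vote. This is a minimal sketch only: the abstract does not say which combination rule the author used (its keywords mention bagging and arcing), and the three rule-based functions below are hypothetical stand-ins for trained NB, SVM and GA models.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label predicted by the most base classifiers."""
    return Counter(labels).most_common(1)[0][0]


def ensemble_predict(classifiers, message):
    """Ask every base classifier for a label and combine by majority vote."""
    return majority_vote([classify(message) for classify in classifiers])


# Hypothetical toy classifiers standing in for trained models.
def mentions_free(message):
    return "spam" if "free" in message.lower() else "ham"


def mentions_money(message):
    return "spam" if "money" in message.lower() else "ham"


def many_exclamations(message):
    return "spam" if message.count("!") >= 2 else "ham"


base_classifiers = [mentions_free, mentions_money, many_exclamations]

# Two of the three toy rules fire on the first message, none on the second.
print(ensemble_predict(base_classifiers, "Free money, click now"))
print(ensemble_predict(base_classifiers, "Meeting moved to 3pm"))
```

With an odd number of base classifiers and two labels, a tie cannot occur, so the vote is always well defined; a single misfiring rule is outvoted by the other two.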

Keywords: accuracy, arcing, bagging, genetic algorithm, Naive Bayes, sentiment mining, support vector machine

Procedia PDF Downloads 143
418 The KAPSARC Energy Policy Database: Introducing a Quantified Library of China's Energy Policies

Authors: Philipp Galkin

Abstract:

Government policy is a critical factor in the understanding of energy markets. Regardless, it is rarely approached systematically from a research perspective. Gaining a precise understanding of what policies exist, their intended outcomes, geographical extent, duration, evolution, etc. would enable the research community to answer a variety of questions that, for now, are either oversimplified or ignored. Policy, on its surface, also seems a rather unstructured and qualitative undertaking. There may be quantitative components, but incorporating the concept of policy analysis into quantitative analysis remains a challenge. The KAPSARC Energy Policy Database (KEPD) is intended to address these two energy policy research limitations. Our approach is to represent policies within a quantitative library of the specific policy measures contained within a set of legal documents. Each of these measures is recorded into the database as a single entry characterized by a set of qualitative and quantitative attributes. Initially, we have focused on the major laws at the national level that regulate coal in China. However, KAPSARC is engaged in various efforts to apply this methodology to other energy policy domains. To ensure scalability and sustainability of our project, we are exploring semantic processing using automated computer algorithms. Automated coding can provide a more convenient input data for human coders and serve as a quality control option. Our initial findings suggest that the methodology utilized in KEPD could be applied to any set of energy policies. It also provides a convenient tool to facilitate understanding in the energy policy realm enabling the researcher to quickly identify, summarize, and digest policy documents and specific policy measures. The KEPD captures a wide range of information about each individual policy contained within a single policy document. 
This enables a variety of analyses, such as structural comparison of policy documents, tracing policy evolution, stakeholder analysis, and exploring interdependencies of policies and their attributes with exogenous datasets using statistical tools. The usability and broad range of research implications suggest a need for the continued expansion of the KEPD to encompass a larger scope of policy documents across geographies and energy sectors.

Keywords: China, energy policy, policy analysis, policy database

Procedia PDF Downloads 323
417 Product Features Extraction from Opinions According to Time

Authors: Kamal Amarouche, Houda Benbrahim, Ismail Kassou

Abstract:

E-commerce shopping websites have grown noticeably and have gained consumers’ trust. After purchasing a product, many consumers share comments in which opinions about the product are embedded. Research on the automatic management of opinions, which gives suggestions to potential consumers and portrays an image of the product to manufacturers, has been growing recently. Just after a product is launched, the reviews generated around it usually contain little helpful information, or only generic opinions about the product (e.g. telephone: great phone...), because the product is still in its launch phase. Over time, the product ages and consumers come to perceive the advantages and disadvantages of each specific product feature; they then write comments that express their sentiments about these features. In this paper, we present an unsupervised method for extracting the product features hidden in opinions that influence purchase decisions. The method combines Time Weighting (TW), which depends on the time at which opinions were expressed, with Term Frequency-Inverse Document Frequency (TF-IDF). We conduct several experiments on two datasets, about cell phones and hotels. The results show the effectiveness of our automatic feature extraction, as well as its domain independence.
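One way to picture combining a time weight with TF-IDF is sketched below. The abstract does not give the TW formula, so the exponential decay by review age, the half_life_days parameter and the function name time_weighted_tfidf are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def time_weighted_tfidf(reviews, half_life_days=180.0):
    """Score candidate feature terms across a set of reviews.

    reviews: list of (age_in_days, tokens) pairs, where age 0 is the newest review.
    The time weight halves every `half_life_days` (an assumed decay scheme).
    Returns a dict mapping each term to its summed time-weighted TF-IDF score.
    """
    n_docs = len(reviews)

    # Document frequency: number of reviews in which each term appears.
    doc_freq = Counter()
    for _, tokens in reviews:
        doc_freq.update(set(tokens))

    scores = Counter()
    for age_days, tokens in reviews:
        if not tokens:
            continue
        weight = 0.5 ** (age_days / half_life_days)  # assumed time weight
        for term, count in Counter(tokens).items():
            tf = count / len(tokens)
            idf = math.log(n_docs / doc_freq[term])
            scores[term] += weight * tf * idf
    return dict(scores)


if __name__ == "__main__":
    sample = [
        (0, ["battery", "great"]),    # recent review
        (360, ["screen", "great"]),   # older review
    ]
    for term, score in sorted(time_weighted_tfidf(sample).items()):
        print(term, round(score, 4))
```

In this toy input, a term that appears in every review ("great") gets an IDF of zero, while a term from the recent review outranks an equally frequent term from the older one.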

Keywords: opinion mining, product feature extraction, sentiment analysis, SentiWordNet

Procedia PDF Downloads 415
416 Artificial Reproduction System and Imbalanced Dataset: A Mendelian Classification

Authors: Anita Kushwaha

Abstract:

We propose a new evolutionary computational model called the Artificial Reproduction System, which is based on the complex process of meiotic reproduction occurring between the male and female cells of living organisms. The Artificial Reproduction System is an attempt at a new computational intelligence approach inspired by the theoretical reproduction mechanism and by observed reproduction functions, principles and mechanisms. A reproductive organism is programmed by genes and can be viewed as an automaton, mapping and reducing so as to create copies of those genes in its offspring. In the Artificial Reproduction System, the binding mechanism between male and female cells is studied, parameters are chosen, a network is constructed, and a feedback system for self-regularization is established. The model then applies Mendel’s law of inheritance and allele-allele associations, and can be used to analyze imbalanced, multivariate, multiclass and big data. In the experimental study, the Artificial Reproduction System is compared with other state-of-the-art classifiers such as SVM, Radial Basis Function, neural networks and K-Nearest Neighbor on some benchmark datasets, and the comparison results indicate good performance.

Keywords: bio-inspired computation, nature-inspired computation, natural computing, data mining

Procedia PDF Downloads 274
415 Feature Weighting Comparison Based on Clustering Centers in the Detection of Diabetic Retinopathy

Authors: Kemal Polat

Abstract:

In this paper, three feature weighting methods have been used to improve the classification performance for diabetic retinopathy (DR). To classify diabetic retinopathy, features extracted from the output of several retinal image processing algorithms, such as image-level, lesion-specific and anatomical components, have been fed into the classifier algorithms. The dataset used in this study was taken from the University of California, Irvine (UCI) machine learning repository. Three feature weighting methods, namely fuzzy c-means clustering based feature weighting, subtractive clustering based feature weighting, and Gaussian mixture clustering based feature weighting, have been used and compared with each other in the classification of DR. After feature weighting, five different classifier algorithms comprising multi-layer perceptron (MLP), k-nearest neighbor (k-NN), decision tree, support vector machine (SVM), and Naïve Bayes have been used. The hybrid method combining subtractive clustering based feature weighting with the decision tree classifier obtained a classification accuracy of 100% in the screening of DR. These results demonstrate that the proposed hybrid scheme is very promising for medical data set classification.

Keywords: machine learning, data weighting, classification, data mining

Procedia PDF Downloads 327
414 Effects of Audiovisual Contextualization of L2 Idioms on Enhancing Students’ Comprehension and Retention

Authors: Monica Karlsson

Abstract:

The positive effect of a supportive written context on comprehension and retention when faced with a previously unknown idiomatic expression is today an indisputable fact, especially if relevant clues are given in close proximity of the item in question. Also, giving learners a chance of visualizing the meaning of an idiom by offering them its source domain and/or by elaborating etymologically, i.e. providing a mental picture in addition to the spoken/written form (referred to as dual coding), seems to enhance comprehension and retention even further, especially if the idiom is of a more transparent kind. For example, by explaining that walk the plank has a maritime origin and a canary in a coal mine comes from the time when canaries were kept in cages to warn miners if gas was leaking out at which point the canaries succumbed immediately, learners’ comprehension and retention have been shown to increase. The present study aims to investigate whether contextualization of an audiovisual kind could help increase comprehension and retention of L2 idioms. 40 Swedish first-term university students studying English as part of their education to become middle-school teachers participated in the investigation, which tested 24 idioms, all of which were ascertained to be previously unknown to the informants. While half of the learners were subjected to a test in which they were asked to watch scenes from various TV programmes, each scene including one idiomatic expression in a supportive context, the remaining 20 students, as a point of reference, were only offered written contexts, though equally supportive. Immediately after these sessions, both groups were given the same idioms in a decontextualized form and asked to give their meaning. After five weeks, finally, the students were subjected to yet another decontextualized comprehension test. 
Furthermore, since mastery of idioms in one’s L1 appears to correlate to a great extent with a person’s ability to comprehend idioms in an L2, all the informants were also asked to take a test focusing on idioms in their L1. The result on this test is thus seen to indicate each student’s potential for understanding and memorizing various idiomatic expressions from a more general perspective. Preliminary results clearly show that audiovisual contextualization indeed has a positive effect on learners’ retention. In addition, preliminary results also show that the learners who were able to recall the most meanings were those who had a propensity for idiom comprehension in their L1.

Keywords: English, L2, idioms, audiovisual context

Procedia PDF Downloads 347
413 Analysis of the Introduction of Carsharing in the Context of Developing Countries: A Case Study Based on On-Board Carsharing Survey in Kabul, Afghanistan

Authors: Mustafa Rezazada, Takuya Maruyama

Abstract:

Cars have been closely integrated with human life since their introduction, and this interaction is most evident in the urban context. Shifting city residents from driving private vehicles to public transit has therefore been a big challenge. Carsharing, as an innovative and environmentally friendly transport alternative, has contributed significantly to this transition so far. It has helped to reduce household car ownership, demand for on-street parking, and the number of kilometers traveled by car, and it affects the future of mobility by decreasing greenhouse gas (GHG) emissions and the number of new cars that would otherwise be purchased. However, the majority of carsharing research has been conducted in highly developed cities, and less attention has been paid to cities in developing countries. This study was conducted in Kabul, the capital of Afghanistan, to investigate the current transport pattern and user behavior and to examine the possibility of introducing a carsharing system. The study established a new survey method called the Onboard Carsharing Survey (OCS). In this survey, carpooling passengers are interviewed on board following the Onboard Transit Survey (OTS) guideline with a few refinements. The survey focuses on respondents’ daily travel behavior and hypothetical stated choice of carsharing opportunities, and it is followed by an aggregate analysis. The survey results indicate the following: nearly two-thirds of the respondents (62%) have been carpooling every day for five years or more; more than half of the respondents are not satisfied with current modes; and, among other attributes, Traffic Congestion, Environment and Insufficient Public Transport were ranked the most critical in daily transportation by survey participants. Moreover, 68.24% of the respondents chose carsharing over carpooling under different choice game scenarios.
Overall, the findings of this research show that Kabul City is potential ground for the introduction of carsharing in the future. Taken together, insufficient public transit, dissatisfaction with current modes, and respondents’ stated interest will positively affect the future of carsharing in Kabul City. The modal choice in this study is limited to carpooling and carsharing; more choice sets, including bus, cycling, and walking, will have to be added for further evaluation.

Keywords: carsharing, developing countries, Kabul Afghanistan, onboard carsharing survey, transportation, urban planning

Procedia PDF Downloads 138
412 An Evaluation of Edible Plants for Remediation of Contaminated Soil- Can Edible Plants Be Used to Remove Heavy Metals on Soil?

Authors: Celia Marilia Martins, Sonia I. V. Guilundo, Iris M. Victorino, Antonio O. Quilambo

Abstract:

In Mozambique, rapid industrialization (mining, aluminium and cement activities) and urbanization have led to the incorporation of heavy metals into the soil, degrading not only the quality of the environment but also affecting plant, animal and human health. Several methods have been used to remediate contaminated soils, but most of them are costly and rarely achieve optimum results. Currently, phytoremediation is an effective and affordable technological solution used to extract or remove inactive metals from contaminated soil. Phytoremediation is the use of plants to clean up contamination from soils, sediments, and water. This technology is environmentally friendly and potentially cost-effective. The present investigation summarises the potential of edible vegetables to grow under high levels of heavy metals such as lead and zinc. The plants used in these studies include tomatoes, lettuce and soya beans. The studies have shown that edible plants can be grown under high levels of heavy metals in the soil. Further investigations are identifying the mechanisms used by plants, to ensure safe and sustainable use for the remediation of soils contaminated by heavy metals.

Keywords: contaminated soil, edible plants, heavy metals, phytoremediation

Procedia PDF Downloads 377
411 Selection and Identification of Some Spontaneous Plant Species Having the Ability to Grow Naturally on Crude Oil Contaminated Soil for a Possible Approach to Decontaminate and Rehabilitate an Industrial Area

Authors: Salima Agoun-Bahar, Ouzna Abrous-Belbachir, Souad Amelal

Abstract:

Industrial areas generally contain heavy metals; thus, negative consequences can appear in the medium and long term for the fauna and flora, but also for the food chain, of which man constitutes the final link. The SONATRACH Company has become aware of the importance of environmental protection and has set up a rehabilitation program for polluted sites in order to avoid major ecological disasters and find both curative and preventive solutions. The aim of this work is to study the industrial pollution around a crude oil storage tank in the Algiers refinery of Sidi R'cine and to select the plants that accumulate the most heavy metals for possible use in phytotechnology. Whole plants were sampled with their soil clod around the pollution source at a depth of twenty centimeters and then transported to the laboratory for identification. The heavy metals lead, zinc, copper, and nickel were quantified by flame atomic absorption spectrophotometry in the soil and in the aerial and underground parts of the plants. Ten plant species were recorded in the polluted site: three belonging to the grass family, with a dominance percentage higher than 50%, followed by three species belonging to the Composite family, representing 12%, and one species for each of the families Linaceae, Plantaginaceae, Papilionaceae, and Boraginaceae. Koeleria phleoïdes L. and Avena sterilis L. of the grass family seem to be the dominant plants, although they are quite far from the pollution source. Lead pollution of soils is the most pronounced for all stations, with values varying from 237.5 to 2682.5 µg.g⁻¹. Other peaks are observed for zinc (1177 µg.g⁻¹) and copper (635 µg.g⁻¹) at station 8 and nickel (1800 µg.g⁻¹) at station 10. Among the inventoried plants, some species accumulate a significant amount of metals: Trifolium sp. and K. phleoides for lead and zinc, P. lanceolata and G. tomentosa for nickel, and A. clavatus for zinc.
K. phleoides is a very interesting species because it accumulates a large quantity of heavy metals, especially in its aerial part. This accumulation pattern is characteristic of phytoextraction, which would facilitate recovery of the pollutants by simple removal of the shoots.

Keywords: heavy metals, industrial pollution, phytotechnology, rehabilitation

Procedia PDF Downloads 66
410 Domain Adaptive Dense Retrieval with Query Generation

Authors: Rui Yin, Haojie Wang, Xun Li

Abstract:

Recently, mainstream dense retrieval methods have obtained state-of-the-art results on some datasets and tasks. However, they require large amounts of training data, which is not available in most domains. The severe performance degradation of dense retrievers on new data domains has limited the use of dense retrieval methods to only a few domains with large training datasets. In this paper, we propose an unsupervised domain-adaptive approach based on query generation. First, a generative model is used to generate relevant queries for each passage in the target corpus, and then, the generated queries are used for mining negative passages. Finally, the query-passage pairs are labeled with a cross-encoder and used to train a domain-adapted dense retriever. We also explore contrastive learning as a method for training domain-adapted dense retrievers and show that it leads to strong performance in various retrieval settings. Experiments show that our approach is more robust than previous methods in target domains that require less unlabeled data.

Keywords: dense retrieval, query generation, contrastive learning, unsupervised training

Procedia PDF Downloads 105
409 Aquatic and Marshy Flora from Fresh Water Wetlands on Quartz Sands in Pinar Del Río, Cuba

Authors: Vidal Pérez Hernández, Enrique González Pendás

Abstract:

Most of the aquatic and marshy flora of Cuba is located in quartzitic sand ecosystems, represented by a wide variety of freshwater wetlands spread across the whole southern and south-western plain of Pinar del Río. The survey carried out in these ecosystems offers an updated inventory of these species, showing their biological type, habit, distribution, and the degree of threat to which they are subjected, according to the categories assigned by the IUCN. A remarkable decrease in the total number of these species in this area is evident, due to deposition processes and deforestation caused by human activity and climate change. This is linked to other threats, such as the unrestricted use of water reserves for irrigating groves, cattle raising and intensive fishing. In addition, the sands, which are 99% pure crystalline quartz, are mined. The combination of all these factors has a negative influence on a flora that comprises more than 250 species, most of them herbaceous and hydrophytic. In these particular ecosystems, 40% of the total flora is endemic, more than 80% is placed in the most sensitive threat categories, and some species have already been declared extinct.

Keywords: aquatic flora, marshy flora, quartzitic sands, wetlands

Procedia PDF Downloads 229
408 Economic Evaluation of an Advanced Bioethanol Manufacturing Technology Using Maize as a Feedstock in South Africa

Authors: Ayanda Ndokwana, Stanley Fore

Abstract:

Industrial prosperity and rapid expansion of human population in South Africa over the past two decades, have increased the use of conventional fossil fuels such as crude oil, coal and natural gas to meet the country’s energy demands. However, the inevitable depletion of fossil fuel reserves, global volatile oil price and large carbon footprint are some of the crucial reasons the South African Government needs to make a considerable investment in the development of the biofuel industry. In South Africa, this industry is still at the introductory stage with no large scale manufacturing plant that has been commissioned yet. Bioethanol is a potential replacement of gasoline which is a fossil fuel that is used in motor vehicles. Using bioethanol for the transport sector as a source of fuel will help Government to save heavy foreign exchange incurred during importation of oil and create many job opportunities in rural farming. In 2007, the South African Government developed the National Biofuels Industrial Strategy in an effort to make provision for support and attract investment in bioethanol production. However, capital investment in the production of bioethanol on a large scale, depends on the sound economic assessment of the available manufacturing technologies. The aim of this study is to evaluate the profitability of an advanced bioethanol manufacturing technology which uses maize as a feedstock in South Africa. The impact of fiber or bran fractionation in this technology causes it to possess a number of merits such as energy efficiency, low capital expenditure, and profitability compared to a conventional dry-mill bioethanol technology. Quantitative techniques will be used to collect and analyze numerical data from suitable organisations in South Africa. The dependence of three profitability indicators such as the Discounted Payback Period (DPP), Net Present Value (NPV) and Return On Investment (ROI) on plant capacity will be evaluated. 
Profitability analysis will be done for the following plant capacities: 100 000 ton/year, 150 000 ton/year and 200 000 ton/year. The plant capacity with the shortest Discounted Payback Period, a positive Net Present Value and the highest Return On Investment will warrant further consideration for capital investment.
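For readers unfamiliar with the three indicators, the sketch below shows one common way to compute them from a series of yearly cash flows. The definitions used (whole-year discounted payback, undiscounted ROI) are standard textbook variants assumed here, and the cash-flow figures in the example are made up for illustration; they are not data from the study.

```python
def npv(rate, cash_flows):
    """Net Present Value. cash_flows[0] is the year-0 flow (usually a negative investment)."""
    return sum(cf / (1 + rate) ** year for year, cf in enumerate(cash_flows))


def discounted_payback_period(rate, cash_flows):
    """First whole year in which cumulative discounted cash flow is non-negative.

    Returns None if the investment is never recovered within the series.
    """
    cumulative = 0.0
    for year, cf in enumerate(cash_flows):
        cumulative += cf / (1 + rate) ** year
        if cumulative >= 0:
            return year
    return None


def roi(cash_flows):
    """Simple (undiscounted) Return On Investment: net profit / initial investment."""
    investment = -cash_flows[0]
    return sum(cash_flows) / investment


if __name__ == "__main__":
    # Hypothetical plant: invest 1000 now, receive 500 per year for three years.
    flows = [-1000.0, 500.0, 500.0, 500.0]
    discount_rate = 0.10  # assumed discount rate
    print("NPV:", round(npv(discount_rate, flows), 2))
    print("DPP (years):", discounted_payback_period(discount_rate, flows))
    print("ROI:", roi(flows))
```

Running the same three functions over the cash-flow series of each candidate plant capacity would allow the capacities to be compared on equal terms.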

Keywords: bioethanol, economic evaluation, maize, profitability indicators

Procedia PDF Downloads 233
407 Taxonomy of Araceous Plants on Limestone Mountains in Lop Buri and Saraburi Provinces, Thailand

Authors: Duangchai Sookchaloem, Sutida Maneeanakekul

Abstract:

Araceae is a monocotyledon family containing numerous potentially useful plants. Two hundred and ten species of Araceae have been reported in Thailand, of which 43 species were reported as threatened plants. Fifty percent of the plants with endemic or rare status were recorded in limestone areas. Currently, these areas are seriously threatened by land-use changes. This study on the taxonomy of Araceous plants was carried out in the Lop Buri and Saraburi limestone mountains from February 2011 to May 2015. Its purposes were to study species diversity, taxonomic characters and ecological habitats. Fifty-five specimens were collected from various limestone areas, including Pra Phut Tabat National forest (Pra Phut Tabat Mountain, Khao Pra Phut Tabat Noi Mountains, Wat Thum Krabog Mountain) and Tab Khwang and Muak Lek National forest (Pha Lad mountain and Muak Lek waterfall) in Saraburi province, and Wang Plaeng Ta Muang and Lumnarai National forest (Wat Thum chang phuk mountain), Panead National forest (Wat Khao Samo Khon Mountain) and Lan Ta Ridge National forest (Khao Wong Prachan mountain, Wat Pa Chumchon) in Lop Buri province. Twenty species of Araceous plants were identified using characteristics of the underground stem, phyllotaxis and leaf blade, spathe and spadix. The species are Aglaonema cochinchinense, A. simplex, Alocasia acuminata, Amorphophallus paeoniifolius, A. albispathus, A. saraburiensis, A. pseudoharmandii, Pycnospatha arietina, Hapaline kerri, Lasia spinosa, Pothos scandens, Typhonium laoticum, T. orbifolium, T. saraburiense, T. trilobatum, T. sp. 1, T. sp. 2, Cryptocoryne crispatula var. balansae, Scindapsus sp., and Rhaphidophora peepla. Five species are new locality records. One species (Typhonium sp. 1) is considered a new species. Seven species were reported as threatened plants in the Thailand Red Data Book. Taxonomic features were used to construct a key to the species.
Araceous specimens were found in mixed deciduous forests and dry evergreen forests at elevations of 50-470 m. New ecological habitats of Typhonium laoticum, T. orbifolium, and T. saraburiense are reported in this study.

Keywords: ecology, limestone mountains, Lopburi and Saraburi provinces, species diversity, taxonomic character

Procedia PDF Downloads 241
406 Desulphurization of Waste Tire Pyrolytic Oil (TPO) Using Photodegradation and Adsorption Techniques

Authors: Moshe Mello, Hilary Rutto, Tumisang Seodigeng

Abstract:

Tires are extremely challenging to recycle because their polymer is chemically cross-linked; they are therefore neither fusible nor soluble and cannot be remolded into other shapes without serious degradation. Open dumping of tires pollutes the soil, contaminates underground water and provides ideal breeding grounds for disease-carrying vermin. The thermal decomposition of tires by pyrolysis produces char, gases and oil. The oils derived from waste tires have properties in common with commercial diesel fuel. The problem with the light oil derived from pyrolysis of waste tires is its high sulfur content (> 1.0 wt.%), which means it emits harmful sulfur oxide (SOx) gases to the atmosphere when combusted in diesel engines. Desulphurization of TPO is necessary because of increasingly stringent environmental regulations worldwide. Hydrodesulphurization (HDS) is the commonly practiced technique for removing sulfur species from liquid hydrocarbons. However, the HDS technique fails in the presence of complex sulfur species such as dibenzothiophene (DBT), which are present in TPO. This study aims to investigate the viability of photodegradation (photocatalytic oxidative desulphurization) and adsorptive desulphurization technologies for the efficient removal of complex and non-complex sulfur species in TPO. It focuses on optimizing the cleaning process (removal of impurities and asphaltenes) by varying the process parameters: temperature, stirring speed, acid/oil ratio and time. The treated TPO will then be sent for vacuum distillation to obtain the desired diesel-like fuel. The effects of temperature, pressure and time will be determined for vacuum distillation of both raw TPO and the acid-treated oil for comparison purposes.
Polycyclic sulfides present in the distilled (diesel-like) light oil will be oxidized predominantly to the corresponding sulfoxides and sulfones in a photocatalyzed system using TiO2 as the catalyst and hydrogen peroxide as the oxidizing agent; acetonitrile will then be used as the extraction solvent. Adsorptive desulphurization will be used to adsorb the traces of sulfurous compounds that remain after the photocatalytic desulphurization step. This desulphurization sequence is expected to give high desulphurization efficiency with reasonable oil recovery.

Keywords: adsorption, asphaltenes, photocatalytic oxidation, pyrolysis

Procedia PDF Downloads 273