Search results for: incremental mining
408 Topic Modelling Using Latent Dirichlet Allocation and Latent Semantic Indexing on SA Telco Twitter Data
Authors: Phumelele Kubheka, Pius Owolawi, Gbolahan Aiyetoro
Abstract:
Twitter is one of the most popular social media platforms where users can share their opinions on different subjects. As of 2010, the Twitter platform generated more than 12 terabytes of data daily, or about 4.3 petabytes in a single year. For this reason, Twitter is a great source for big data mining. Many industries, such as telecommunication companies, can leverage the availability of Twitter data to better understand their markets and make appropriate business decisions. This study performs topic modeling on Twitter data using Latent Dirichlet Allocation (LDA). The obtained results are benchmarked against another topic modeling technique, Latent Semantic Indexing (LSI). The study aims to retrieve topics from a Twitter dataset containing user tweets about South African telcos. Results from this study show that LSI is much faster than LDA. However, LDA yields better results, with topic coherence 8% higher for the best-performing model represented in Table 1. A higher topic coherence score indicates better performance of the model.
Keywords: big data, latent Dirichlet allocation, latent semantic indexing, telco, topic modeling, twitter
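The abstract compares models by topic coherence but does not name the measure used. As a minimal sketch, assuming a UMass-style coherence (an assumption, not necessarily the authors' metric), the score for one topic can be computed from document co-occurrence counts. The function name and the toy tweets are hypothetical.

```python
import math
from itertools import combinations

def umass_coherence(top_words, documents):
    """UMass-style coherence: mean of log((D(w_i, w_j) + 1) / D(w_i)) over word pairs.

    top_words is ordered from most to least probable; documents are token lists.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every given word.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    scores = []
    for i, j in combinations(range(len(top_words)), 2):
        higher, lower = top_words[i], top_words[j]
        d_higher = doc_freq(higher)
        if d_higher == 0:
            continue  # word never occurs, so the pair is undefined
        scores.append(math.log((doc_freq(higher, lower) + 1) / d_higher))
    return sum(scores) / len(scores) if scores else 0.0

# Hypothetical toy corpus of tokenized tweets.
tweets = [
    ["network", "data", "bundle"],
    ["network", "data", "price"],
    ["network", "signal"],
    ["airtime", "price"],
]
coherent = umass_coherence(["network", "data"], tweets)
incoherent = umass_coherence(["network", "airtime"], tweets)
```

Words that co-occur in the same tweets score closer to zero (higher), which matches the abstract's reading that a higher coherence indicates a better model.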
Procedia PDF Downloads 150
407 Analyzing Semantic Feature Using Multiple Information Sources for Reviews Summarization
Authors: Yu Hung Chiang, Hei Chia Wang
Abstract:
Nowadays, tourism has become a part of life. Before reserving hotels, customers need information about them to help make decisions, and the most important source is online reviews. Due to the dramatic growth of online reviews, it is impossible for tourists to read all reviews manually. Therefore, an automatic review analysis system that summarizes reviews is necessary. The main purpose of the system is to understand the opinion expressed in reviews, which may be positive or negative. In other words, the system analyzes whether the customers who visited the hotel liked it or not. Sentiment analysis methods help the system achieve this purpose. In sentiment analysis, the targets of opinion (here called features) should be recognized to clarify the polarity of the opinion, because the polarity may otherwise be ambiguous. Hence, the study proposes an unsupervised method using Part-Of-Speech patterns and multi-lexicon sentiment analysis to summarize all reviews. We expect this method to help customers find the information they want and make decisions efficiently.
Keywords: text mining, sentiment analysis, product feature extraction, multi-lexicons
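The combination of Part-Of-Speech patterns with several sentiment lexicons can be illustrated with a small sketch. It assumes the tokens are already tagged, uses a single hypothetical pattern (an adjective immediately followed by a noun), and averages the polarity over whichever lexicons contain the opinion word. The pattern, the tag names, and the toy lexicons are illustrative assumptions, not the authors' actual rules.

```python
def extract_feature_opinions(tagged_tokens):
    """Return (feature, opinion) pairs for each adjective directly followed by a noun."""
    pairs = []
    for (word1, tag1), (word2, tag2) in zip(tagged_tokens, tagged_tokens[1:]):
        if tag1 == "JJ" and tag2 == "NN":
            pairs.append((word2, word1))
    return pairs

def lexicon_polarity(word, lexicons):
    """Average the polarity of `word` over the lexicons that list it (0.0 if none do)."""
    scores = [lexicon[word] for lexicon in lexicons if word in lexicon]
    return sum(scores) / len(scores) if scores else 0.0

def summarize(tagged_tokens, lexicons):
    """Map each extracted feature to the mean polarity of the opinions attached to it."""
    per_feature = {}
    for feature, opinion in extract_feature_opinions(tagged_tokens):
        per_feature.setdefault(feature, []).append(lexicon_polarity(opinion, lexicons))
    return {feature: sum(v) / len(v) for feature, v in per_feature.items()}

# Hypothetical pre-tagged review fragment and two toy lexicons.
review = [("clean", "JJ"), ("room", "NN"), ("and", "CC"), ("rude", "JJ"), ("staff", "NN")]
lexicons = [{"clean": 1.0, "rude": -1.0}, {"clean": 0.5}]
summary = summarize(review, lexicons)
```

Averaging across lexicons is only one way to reconcile them; a majority vote or a priority order would be equally plausible readings of "multi-lexicon".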
Procedia PDF Downloads 331
406 Automated Process Quality Monitoring and Diagnostics for Large-Scale Measurement Data
Authors: Hyun-Woo Cho
Abstract:
Continuous monitoring of industrial plants is one of the necessary tasks for ensuring high-quality final products. In terms of monitoring and diagnosis, it is critical to detect incipient abnormal events in manufacturing processes in order to improve the safety and reliability of the operations involved and to reduce related losses. In this work, a new multivariate statistical online diagnostic method is presented using a case study. To build reference models, an empirical discriminant model is constructed from various past operation runs. When a fault is detected on-line, an on-line diagnostic module is initiated. Finally, the status of the current operating conditions is compared with the reference model to make a diagnostic decision. The performance of the presented framework is evaluated using a dataset from complex industrial processes. The results show that the proposed diagnostic method outperforms other techniques, especially in the incipient detection of faults.
Keywords: data mining, empirical model, on-line diagnostics, process fault, process monitoring
Procedia PDF Downloads 401
405 The Impact of Interrelationship between Business Intelligence and Knowledge Management on Decision Making Process: An Empirical Investigation of Banking Sector in Jordan
Authors: Issa M. Shehabat, Huda F. Y. Nimri
Abstract:
This paper aims to study the relationship between knowledge management in its processes, including knowledge creation, knowledge sharing, knowledge organization, and knowledge application, and business intelligence tools, including OLAP, data mining, and data warehouse, and their impact on the decision-making process in the banking sector in Jordan. A total of 200 questionnaires were distributed to the sample of the study. The study hypotheses were tested using the statistical package SPSS. Study findings suggest that decision-making processes were positively related to knowledge management processes. Additionally, the components of business intelligence had a positive impact on decision-making. The study recommended conducting studies similar to this study in other sectors such as the industrial, telecommunications, and service sectors to contribute to enhancing understanding of the role of the knowledge management processes and business intelligence tools.
Keywords: business intelligence, knowledge management, decision making, Jordan, banking sector
Procedia PDF Downloads 144
404 Engineering Topology of Ecological Model for Orientation Impact of Sustainability Urban Environments: The Spatial-Economic Modeling
Authors: Moustafa Osman Mohammed
Abstract:
The modeling of a spatial-economic database is crucial in relating economic network structure to social development. Sustainability within the spatial-economic model gives attention to green businesses to comply with Earth’s Systems. The natural exchange patterns of ecosystems have consistent and periodic cycles to preserve energy and materials flow in systems ecology. When network topology influences formal and informal communication to function in systems ecology, ecosystems are postulated to valence the basic level of spatial sustainable outcome (i.e., project compatibility success). These referred instrumentalities impact various aspects of the second level of spatial sustainable outcomes (i.e., participant social security satisfaction). The sustainability outcomes model a composite structure based on a network analysis model to calculate the prosperity of panel databases for efficiency value, from 2005 to 2025. The database models spatial structure to represent state-of-the-art value-orientation impact and the corresponding complexity of sustainability issues (e.g., build a consistent database necessary to approach spatial structure; construct the spatial-economic-ecological model; develop a set of sustainability indicators associated with the model; allow quantification of social, economic and environmental impact; use the value-orientation as a set of important sustainability policy measures), and demonstrates spatial structure reliability. The structure of the spatial-ecological model is established for management schemes from the perspective of pollutants from multiple sources through the input–output criteria. These criteria evaluate the spillover effect to conduct Monte Carlo simulations and sensitivity analysis in a unique spatial structure. The balance within “equilibrium patterns,” such as collective biosphere features, has a composite index of many distributed feedback flows.
These flows have a dynamic structure related to physical and chemical properties, for gradual prolongation to incremental patterns. While these spatial structures argue from ecological modeling of resource savings, static loads are not decisive from an artistic/architectural perspective. The model attempts to unify analytic and analogical spatial structure for the development of urban environments in a relational database setting, using optimization software to integrate spatial structure where the process is based on the engineering topology of systems ecology.
Keywords: ecological modeling, spatial structure, orientation impact, composite index, industrial ecology
Procedia PDF Downloads 68
403 Developing Serious Games to Improve Learning Experience of Programming: A Case Study
Authors: Shan Jiang, Xinyu Tang
Abstract:
Game-based learning is an emerging pedagogy to make the learning experience more effective, enjoyable, and fun. However, most games used in classroom settings have been overly simplistic. This paper presents a case study on a Python-based online game designed to improve the effectiveness in both teaching and research in higher education. The proposed game system not only creates a fun and enjoyable experience for students to learn various topics in programming but also improves the effectiveness of teaching in several aspects, including material presentation, helping students to recognize the importance of the subjects, and linking theoretical concepts to practice. The proposed game system also serves as an information cyber-infrastructure that automatically collects and stores data from players. The data could be useful in research areas including human-computer interaction, decision making, opinion mining, and artificial intelligence. They further provide other possibilities beyond these areas due to the customizable nature of the game.
Keywords: game-based learning, programming, research-teaching integration, Hearthstone
Procedia PDF Downloads 165
402 Development of Knowledge Discovery Based Interactive Decision Support System on Web Platform for Maternal and Child Health System Strengthening
Authors: Partha Saha, Uttam Kumar Banerjee
Abstract:
Maternal and Child Healthcare (MCH) has always been regarded as one of the important issues globally. Reduction of maternal and child mortality rates and increase of healthcare service coverage were declared as targets in the Millennium Development Goals until 2015 and thereafter as an important component of the Sustainable Development Goals. Over the last decade, worldwide MCH indicators have improved but could not match the expected levels. Progress in both maternal and child mortality rates has been monitored by several researchers. Each of the studies has stated that fewer than 26% of low-income and middle-income countries (LMICs) were on track to achieve the targets prescribed by MDG4. The average worldwide annual rates of reduction of the under-five mortality rate and the maternal mortality rate were 2.2% and 1.9%, respectively, as of 2011, whereas the rates should be at least 4.4% and 5.5% annually to achieve the targets. Despite the availability of proven healthcare interventions for both mothers and children, these could not be scaled up to the required volume due to fragmented health systems, especially in developing and under-developed countries. In this research, a knowledge discovery based interactive Decision Support System (DSS) has been developed on a web platform to assist healthcare policy makers in developing evidence-based policies. To achieve desirable results in MCH, efficient resource planning is required. In most LMICs, resources are a major constraint. Knowledge generated through this system would help healthcare managers develop strategic resource planning to combat issues such as large inequity and low coverage in MCH. This system would help healthcare managers accomplish the following four tasks.
These are a) comprehending region-wise conditions of variables related to MCH, b) identifying relationships among variables, c) segmenting regions based on variable status, and d) finding segment-wise key influential variables that have a major impact on healthcare indicators. The whole system development process has been divided into three phases: i) identifying contemporary issues related to MCH services and policy making; ii) development of the system; and iii) verification and validation of the system. More than 90 variables under three categories, namely a) educational, social, and economic parameters; b) MCH interventions; and c) health system building blocks, have been included in this web-based DSS, and five separate modules have been developed under the system. The first module has been designed for analysing the current healthcare scenario. The second module would help healthcare managers understand correlations among variables. The third module would reveal frequently occurring incidents along with different MCH interventions. The fourth module would segment regions based on the three previously mentioned categories, and in the fifth module, segment-wise key influential interventions will be identified. India has been considered as the case study area in this research. Data from 601 districts of India have been used to inspect the effectiveness of the developed modules. This system has been developed by importing different statistical and data mining techniques onto a web platform. Aided by its interactive capability, policy makers would be able to generate different scenarios from the system before drawing any inference.
Keywords: maternal and child healthcare, decision support systems, data mining techniques, low and middle income countries
Procedia PDF Downloads 258
401 Long Short-Term Memory Stream Cruise Control Method for Automated Drift Detection and Adaptation
Authors: Mohammad Abu-Shaira, Weishi Shi
Abstract:
Adaptive learning, a commonly employed solution to drift, involves updating predictive models online during their operation to react to concept drifts, thereby serving as a critical component and natural extension for online learning systems that learn incrementally from each example. This paper introduces LSTM-SCCM (Long Short-Term Memory Stream Cruise Control Method), a drift adaptation-as-a-service framework for online learning. LSTM-SCCM automates drift adaptation through prompt detection, drift magnitude quantification, dynamic hyperparameter tuning, short-term optimization and model recalibration for immediate adjustments, and, when necessary, long-term model recalibration to ensure deeper enhancements in model performance. LSTM-SCCM is incorporated into a suite of cutting-edge online regression models, and their performance is assessed across various types of concept drift using diverse datasets with varying characteristics. The findings demonstrate that LSTM-SCCM represents a notable advancement in both model performance and efficacy in handling concept drift occurrences. LSTM-SCCM stands out as the sole framework adept at effectively tackling concept drifts within regression scenarios. Its proactive approach to drift adaptation distinguishes it from conventional reactive methods, which typically rely on retraining after significant degradation to model performance caused by drifts. Additionally, LSTM-SCCM employs an in-memory approach combined with the Self-Adjusting Memory (SAM) architecture to enhance real-time processing and adaptability. The framework incorporates variable thresholding techniques and does not assume any particular data distribution, making it an ideal choice for managing high-dimensional datasets and efficiently handling large-scale data.
Our experiments, which include abrupt, incremental, and gradual drifts across both low- and high-dimensional datasets with varying noise levels, applied to four state-of-the-art online regression models, demonstrate that LSTM-SCCM is versatile and effective, rendering it a valuable solution for online regression models to address concept drift.
Keywords: automated drift detection and adaptation, concept drift, hyperparameters optimization, online and adaptive learning, regression
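To make the idea of detecting drift in an online regression setting concrete, the sketch below implements the classical Page-Hinkley test on a stream of absolute prediction errors. This is a well-known baseline detector and is not the LSTM-SCCM method described in the abstract; the `delta` and `threshold` values and the synthetic error stream are illustrative assumptions.

```python
class PageHinkley:
    """Page-Hinkley test for detecting an upward shift in the mean of a stream."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm level (often written as lambda)
        self.n = 0
        self.mean = 0.0
        self.cumulative = 0.0
        self.min_cumulative = 0.0

    def update(self, value):
        """Feed one observation; return True when drift is signalled."""
        self.n += 1
        self.mean += (value - self.mean) / self.n          # running mean
        self.cumulative += value - self.mean - self.delta  # cumulative deviation
        self.min_cumulative = min(self.min_cumulative, self.cumulative)
        return self.cumulative - self.min_cumulative > self.threshold

# Hypothetical stream of absolute errors: stable at 1.0, then an abrupt jump to 3.0.
errors = [1.0] * 50 + [3.0] * 50
detector = PageHinkley()
alarms = [i for i, e in enumerate(errors) if detector.update(e)]
first_alarm = alarms[0] if alarms else None
```

A reactive detector like this only raises an alarm after errors have already grown, which is the behaviour the abstract contrasts with its proactive approach.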
Procedia PDF Downloads 11
400 Classification of Political Affiliations by Reduced Number of Features
Authors: Vesile Evrim, Aliyu Awwal
Abstract:
With the evolution of technology, the expression of opinions has shifted to the digital world. Politics, one of the hottest topics in opinion mining research, merges with behavior analysis for determining affiliation in text, which constitutes the subject of this paper. This study aims to classify text in news/blogs as either Republican or Democrat with the minimum number of features. As an initial set, 68 features, of which 64 are Linguistic Inquiry and Word Count (LIWC) features, are tested against 14 benchmark classification algorithms. In later experiments, the dimensions of the feature vector are reduced based on 7 feature selection algorithms. The results show that the Decision Tree, Rule Induction and M5 Rule classifiers, when used with the SVM and IGR feature selection algorithms, performed best, with up to 82.5% accuracy on a given dataset. Further tests on a single feature and on the linguistic-based feature sets showed similar results. The feature “function”, an aggregate feature of the linguistic category, is found to be the most differentiating feature among the 68 features, with 81% accuracy by itself in classifying articles as either Republican or Democrat.
Keywords: feature selection, LIWC, machine learning, politics
Procedia PDF Downloads 382
399 Hypersonic Propulsion Requirements for Sustained Hypersonic Flight for Air Transportation
Authors: James Rate, Apostolos Pesiridis
Abstract:
In this paper, the propulsion requirements for achieving sustained hypersonic flight for commercial air transportation are evaluated. In addition, a design methodology is developed and used to determine the propulsive capabilities of both ramjet and scramjet engines. Twelve configurations are proposed for hypersonic flight using varying combinations of turbojet, turbofan, ramjet and scramjet engines. The optimal configuration was determined based on how well each of the configurations met the projected requirements for hypersonic commercial transport. The configurations were separated into four sub-configurations, each comprising three unique derivations. The first sub-configuration comprised four afterburning turbojets and either one or two ramjets idealised for Mach 5 cruise. The number of ramjets required was dependent on the thrust required to accelerate the vehicle from the speed where the turbojets cut out to Mach 5 cruise. The second comprised four afterburning turbojets and either one or two scramjets, similar to the first configuration. The third used four turbojets, one scramjet and one ramjet to aid acceleration from Mach 3 to Mach 5. The fourth configuration was the same as the third, but instead of turbojets, it used turbofan engines for the preliminary acceleration of the vehicle. From calculations which determined the fuel consumption at incremental Mach numbers, this paper found that the ideal solution would require four turbojet engines and two scramjet engines. The ideal mission profile was determined to be an 8000 km sortie, based on an averaging of popular long-haul flights with strong business ties, which included Los Angeles to Tokyo, London to New York and Dubai to Beijing. This paper deemed that these routes would benefit from hypersonic transport links based on the previously mentioned factors.
This paper has found that this configuration would be sufficient for the 8000 km flight to be completed in approximately two and a half hours and would consume less fuel than Concorde in doing so. However, this propulsion configuration still results in a greater fuel cost than a conventional passenger aircraft. In this regard, this investigation contributes towards the specification of the engine requirements throughout a mission profile for a hypersonic passenger vehicle. A number of assumptions have had to be made for this theoretical approach, but the authors believe that this investigation lays the groundwork for appropriate framing of the propulsion requirements for sustained hypersonic flight for commercial air transportation. Despite this, it does serve as a crucial step in the development of the propulsion systems required for hypersonic commercial air transportation. This paper provides a methodology and a focus for the development of the propulsion systems that would be required for sustained hypersonic flight for commercial air transportation.
Keywords: hypersonic, ramjet, propulsion, Scramjet, Turbojet, turbofan
Procedia PDF Downloads 320
398 Modelling Fluoride Pollution of Groundwater Using Artificial Neural Network in the Western Parts of Jharkhand
Authors: Neeta Kumari, Gopal Pathak
Abstract:
The artificial neural network has proven to be an efficient tool for non-parametric modeling of data in various applications where the output is non-linearly associated with the input. It is a preferred tool for many predictive data mining applications because of its power, flexibility, and ease of use. A standard feed-forward network (FFN) is used to predict the groundwater fluoride content. The ANN model is trained using the backpropagation algorithm and Tansig and Logsig activation functions with a varying number of neurons. The models are evaluated on the basis of statistical performance criteria such as Root Mean Squared Error (RMSE), regression coefficient (R2), bias (mean error), coefficient of variation (CV), Nash-Sutcliffe efficiency (NSE), and the index of agreement (IOA). The results of the study indicate that the artificial neural network (ANN) can be used for groundwater fluoride prediction with sufficiently good accuracy in limited-data situations in a hard rock region such as the western parts of Jharkhand.
Keywords: Artificial neural network (ANN), FFN (Feed-forward network), backpropagation algorithm, Levenberg-Marquardt algorithm, groundwater fluoride contamination
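Several of the performance criteria listed above have standard closed forms. The sketch below computes RMSE, bias (mean error), Nash-Sutcliffe efficiency, and Willmott's index of agreement for paired observed and predicted values. The sample fluoride values are hypothetical, and the abstract does not state which exact variants of these criteria the authors used.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)

def mean_error(observed, predicted):
    """Bias: mean of (predicted - observed)."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)

def nash_sutcliffe(observed, predicted):
    """NSE = 1 - sum((O - P)^2) / sum((O - mean(O))^2); 1.0 is a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance

def index_of_agreement(observed, predicted):
    """Willmott's d = 1 - sum((P - O)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    potential = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
                    for o, p in zip(observed, predicted))
    return 1.0 - residual / potential

# Hypothetical fluoride concentrations (mg/L): observed vs. model output.
obs = [1.0, 2.0, 3.0]
pred = [2.0, 3.0, 4.0]
metrics = {
    "rmse": rmse(obs, pred),
    "bias": mean_error(obs, pred),
    "nse": nash_sutcliffe(obs, pred),
    "ioa": index_of_agreement(obs, pred),
}
```

In this toy case the model is biased high by a constant 1.0, so NSE goes negative while the index of agreement stays fairly high, which is one reason such studies report several criteria together.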
Procedia PDF Downloads 550
397 User Modeling from the Perspective of Improvement in Search Results: A Survey of the State of the Art
Authors: Samira Karimi-Mansoub, Rahem Abri
Abstract:
Currently, users expect high-quality, personalized information from search results. To satisfy users’ needs, personalized approaches to web search have been proposed. These approaches can provide the most appropriate answer for a user’s needs by using the user context and incorporating information about the query, combining search technologies. To carry out personalized web search, different techniques need to be applied across the whole user search process. There are a number of possible deployments of personalized approaches, such as personalized web search, personalized recommendation, personalized summarization, and filtering systems, but the common feature of all approaches in the various domains is that user modeling is utilized to provide personalized information from the Web. So the most important task in personalized approaches is user model mining. User modeling applications and technologies can be used in various domains depending on how the collected user information is extracted. In addition, the techniques used to create the user model also differ in each of these applications. Since previous studies have not provided a complete survey of this field, our purpose is to present a survey of applications and techniques of user modeling from the viewpoint of improving search results, considering the existing literature and research.
Keywords: filtering systems, personalized web search, user modeling, user search behavior
Procedia PDF Downloads 279
396 Simulation Study on Polymer Flooding with Thermal Degradation in Elevated-Temperature Reservoirs
Authors: Lin Zhao, Hanqiao Jiang, Junjian Li
Abstract:
Polymers injected into elevated-temperature reservoirs inevitably suffer from thermal degradation, resulting in severe viscosity loss and poor flooding performance. However, for polymer flooding in such reservoirs, present simulators fail to provide accurate results because they lack a description of thermal degradation. In light of this, the objectives of this paper are to provide a simulation model for polymer flooding with thermal degradation and to study the effect of thermal degradation on polymer flooding in elevated-temperature reservoirs. Firstly, a thermal degradation experiment was conducted to obtain the degradation law of polymer concentration and viscosity. Different types of polymers were degraded in a thermo tank at elevated temperatures. Afterward, based on the obtained law, a streamline-assistant model was proposed to simulate the degradation process under in-situ flow conditions. Model validation was performed with field data from a well group of an offshore oilfield. Finally, the effect of thermal degradation on polymer flooding was studied using the proposed model. Experimental results showed that the polymer concentration remained unchanged after degradation, while the viscosity degraded exponentially with time. The polymer viscosity was functionally dependent on the polymer degradation time (PDT), which represents the time elapsed since the polymer particle was injected. Tracing the real flow path of the polymer particle was therefore required, which is why the presented simulation model is streamline-assistant. An equation relating PDT to time of flight (TOF) along a streamline was built from the law of polymer particle transport. Based on the field polymer sample and dynamic data, the new model proved its accuracy.
The study of the degradation effect on polymer flooding indicated that: (1) the viscosity loss increased exponentially with TOF in the main body of the polymer slug and remained constant in the slug front; (2) the response time of polymer flooding was delayed, but the effective time was prolonged; (3) the breakthrough of subsequent water was eased; (4) the capacity of the polymer to adjust the injection profile was diminished; (5) the incremental recovery was reduced significantly. In general, the effect of thermal degradation on polymer flooding performance was rather negative. This paper provides a more comprehensive insight into polymer thermal degradation in both the physical process and field application. The proposed simulation model offers an effective means of simulating the polymer flooding process with thermal degradation. The negative effect of thermal degradation suggests that polymer thermal stability should be given full consideration when designing polymer flooding projects in elevated-temperature reservoirs.
Keywords: polymer flooding, elevated-temperature reservoir, thermal degradation, numerical simulation
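The reported exponential dependence of viscosity on polymer degradation time can be sketched with a simple first-order decay law. The functional form mu0 * exp(-k * PDT), the rate constant, and the numbers below are assumptions for illustration; the abstract does not give the fitted degradation law or its parameters.

```python
import math

def degraded_viscosity(initial_viscosity, rate_constant, pdt):
    """Viscosity after `pdt` time units, assuming mu = mu0 * exp(-k * PDT)."""
    return initial_viscosity * math.exp(-rate_constant * pdt)

def viscosity_loss_fraction(rate_constant, pdt):
    """Fraction of the initial viscosity lost after `pdt` time units."""
    return 1.0 - math.exp(-rate_constant * pdt)

# Hypothetical case: 40 mPa·s at injection and a 10-day viscosity half-life.
k = math.log(2) / 10.0
viscosity_at_10_days = degraded_viscosity(40.0, k, 10.0)
loss_at_10_days = viscosity_loss_fraction(k, 10.0)
```

In a streamline-based simulator the `pdt` argument would come from the PDT versus time-of-flight relation along each streamline, which is not reproduced here.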
Procedia PDF Downloads 143
395 Environmental Resilience in Sustainability Outcomes of Spatial-Economic Model Structure on the Topology of Construction Ecology
Authors: Moustafa Osman Mohammed
Abstract:
The resilience and sustainability of construction ecology are essential to the world’s socio-economic development. Environmental resilience is crucial in relating construction ecology to the topology of the spatial-economic model. Sustainability of the spatial-economic model gives attention to green business to comply with Earth’s System for the natural exchange patterns of ecosystems. Systems ecology has consistent and periodic cycles to preserve energy and materials flow in Earth’s System. When the model structure influences communication between internal and external features in system networks, it postulates the valence of the first-level spatial outcomes (i.e., project compatibility success). These instrumentalities are dependent on second-level outcomes (i.e., participant security satisfaction). These outcomes of the model are based on measuring database efficiency, from 2015 to 2025. The model topology has a state-of-the-art value-orientation impact and corresponds to the complexity of sustainability issues (e.g., build a consistent database necessary to approach spatial structure; construct the spatial-economic model; develop a set of sustainability indicators associated with the model; allow quantification of social, economic and environmental impact; use the value-orientation as a set of important sustainability policy measures), and demonstrates environmental resilience. The model manages and develops schemes from the perspective of pollutants from multiple sources through the input–output criteria. These criteria evaluate the effects of external insertions to conduct Monte Carlo simulations and analysis using matrices in a unique spatial structure. The balance of “equilibrium patterns,” such as collective biosphere features, has a composite index of the distributed feedback flows. These feedback flows have a dynamic structure with physical and chemical properties for gradual prolongation of incremental patterns.
While these structures argue from system ecology, static loads are not decisive from an artistic/architectural perspective. The popularity of system resilience in the systems structure related to ecology has not been achieved without generating confusion and vagueness. However, this topic is relevant for forecasting future scenarios where industrial regions will need to keep dealing with the impact of relative environmental deviations. The model attempts to unify the analytic and analogical structure of urban environments using database software to integrate sustainability outcomes, where the process is based on the systems topology of construction ecology.
Keywords: system ecology, construction ecology, industrial ecology, spatial-economic model, systems topology
Procedia PDF Downloads 19
394 Benthic Foraminiferal Responses to Coastal Pollution for Some Selected Sites along Red Sea, Egypt
Authors: Ramadan M. El-Kahawy, M. A. El-Shafeiy, Mohamed Abd El-Wahab, S. A. Helal, Nabil Aboul-Ela
Abstract:
Due to the economic importance of Safaga Bay, Quseir harbor and Ras Gharib harbor, a multidisciplinary approach was adopted to investigate 27 surficial sediment samples, 9 from each of the three sites, in order to use benthic foraminifera as bio-indicators for characterizing environmental variations. Grain size analyses indicate that the bottom facies in the inner part of Quseir is muddy, while the inner parts of Ras Gharib and Safaga are silty sand; close to the entrances of Safaga Bay and Ras Gharib the facies is sandy, while Quseir remains muddy. Geochemical data show high concentrations of heavy metals, mainly in Ras Gharib due to oil leakage from the hydrocarbon oil field and in Safaga Bay due to phosphate mining, while Quseir shows medium concentrations due to anthropogenic effects. Micropaleontological analyses also delineate the boundaries between areas of the highest and low heavy-metal concentrations. The dominant benthic foraminifera at these three sites are Ammonia beccarii, Amphistegina and Sorites. The study highlights the worsening of environmental conditions and also identifies the areas in need of priority recovery.
Keywords: benthic foraminifera, Ras Gharib, Safaga, Quseir, Red Sea, Egypt
Procedia PDF Downloads 350
393 Sentiment Analysis of Ensemble-Based Classifiers for E-Mail Data
Authors: Muthukumarasamy Govindarajan
Abstract:
Detection of unwanted, unsolicited mail, called spam, in email is an interesting area of research. It is necessary to evaluate the performance of any new spam classifier using standard data sets. Recently, ensemble-based classifiers have gained popularity in this domain. In this research work, an efficient email filtering approach based on ensemble methods is presented for developing an accurate and sensitive spam classifier. The proposed approach employs Naive Bayes (NB), Support Vector Machine (SVM) and Genetic Algorithm (GA) as base classifiers along with different ensemble methods. The experimental results show that the ensemble classifier achieved higher accuracy than the individual classifiers, and that the hybrid model results are better than those of the combined models for the e-mail dataset. The proposed ensemble-based classifiers perform well in terms of classification accuracy, which is considered an important criterion for building a robust spam classifier.
Keywords: accuracy, arcing, bagging, genetic algorithm, Naive Bayes, sentiment mining, support vector machine
Procedia PDF Downloads 142392 Product Features Extraction from Opinions According to Time
Authors: Kamal Amarouche, Houda Benbrahim, Ismail Kassou
Abstract:
Nowadays, e-commerce shopping websites have experienced noticeable growth and have gained consumers’ trust. After purchasing a product, many consumers share comments in which opinions about the product are usually embedded. Research on the automatic management of opinions, which gives suggestions to potential consumers and portrays an image of the product to manufacturers, has been growing recently. Just after a product is launched, the reviews generated around it usually contain either no helpful information or only generic opinions about the product (e.g. telephone: great phone...), because the product is still in its launch phase. Over time, the product ages, and consumers come to perceive the advantages and disadvantages of each specific product feature. They then generate comments that contain their sentiments about these features. In this paper, we present an unsupervised method to extract the different product features hidden in the opinions that influence purchase decisions. The method combines Time Weighting (TW), which depends on the time at which opinions were expressed, with Term Frequency-Inverse Document Frequency (TF-IDF). We conduct several experiments using two different datasets about cell phones and hotels. The results show the effectiveness of our automatic feature extraction, as well as its domain-independent character.
Keywords: opinion mining, product feature extraction, sentiment analysis, SentiWordNet
Procedia PDF Downloads 410
391 Artificial Reproduction System and Imbalanced Dataset: A Mendelian Classification
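The abstract names the two ingredients (TW and TF-IDF) but not the formula that combines them. As a rough illustration only, the sketch below multiplies a per-review time weight into a standard TF-IDF score and sums over reviews. The function name `time_weighted_tfidf`, the linear weight `days / 365`, and the toy reviews are all assumptions made for this example, not the authors' definitions.

```python
import math
from collections import Counter


def time_weighted_tfidf(reviews, time_weight):
    """Score terms by TF-IDF scaled by a time weight (illustrative assumption).

    reviews: list of (tokens, days_since_launch) pairs.
    time_weight: function mapping days_since_launch to a non-negative weight.
    """
    n_docs = len(reviews)
    # Document frequency: number of reviews in which each term appears
    doc_freq = Counter()
    for tokens, _ in reviews:
        doc_freq.update(set(tokens))

    scores = Counter()
    for tokens, days in reviews:
        counts = Counter(tokens)
        for term, count in counts.items():
            tf = count / len(tokens)
            idf = math.log(n_docs / doc_freq[term])
            scores[term] += time_weight(days) * tf * idf
    return dict(scores)


# Toy reviews (hypothetical): an early generic opinion, then feature-specific ones
reviews = [
    (["great", "phone"], 1),
    (["battery", "drains", "fast"], 200),
    (["battery", "great", "screen"], 300),
]
# Assumed weight: later reviews count more, following the abstract's motivation
scores = time_weighted_tfidf(reviews, lambda days: days / 365)
print(sorted(scores, key=scores.get, reverse=True))
```

Passing the weight as a function keeps the sketch neutral about the actual TW definition; a decaying or saturating weight could be swapped in without touching the TF-IDF part.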
Authors: Anita Kushwaha
Abstract:
We propose a new evolutionary computational model called Artificial Reproduction System, which is based on the complex process of meiotic reproduction occurring between male and female cells of living organisms. Artificial Reproduction System is an attempt towards a new computational intelligence approach inspired by the theoretical reproduction mechanism and by observed reproduction functions, principles, and mechanisms. A reproductive organism is programmed by genes and can be viewed as an automaton, mapping and reducing so as to create copies of those genes in its offspring. In Artificial Reproduction System, the binding mechanism between male and female cells is studied, parameters are chosen, a network is constructed, and a feedback system for self-regularization is established. The model then applies Mendel’s law of inheritance and allele-allele associations, and can be used to perform data analysis of imbalanced, multivariate, multiclass, and big data. In the experimental study, Artificial Reproduction System is compared with other state-of-the-art classifiers such as SVM, Radial Basis Function, neural networks, and K-Nearest Neighbor on some benchmark datasets, and the comparison results indicate good performance.
Keywords: bio-inspired computation, nature-inspired computation, natural computing, data mining
Procedia PDF Downloads 272
390 Feature Weighting Comparison Based on Clustering Centers in the Detection of Diabetic Retinopathy
Authors: Kemal Polat
Abstract:
In this paper, three feature weighting methods have been used to improve the classification performance of diabetic retinopathy (DR). To classify diabetic retinopathy, features extracted from the output of several retinal image processing algorithms, such as image-level, lesion-specific, and anatomical components, have been fed into the classifier algorithms. The dataset used in this study has been taken from the University of California, Irvine (UCI) machine learning repository. Feature weighting methods including fuzzy c-means clustering based feature weighting, subtractive clustering based feature weighting, and Gaussian mixture clustering based feature weighting have been used and compared with each other in the classification of DR. After feature weighting, five different classifier algorithms comprising multi-layer perceptron (MLP), k-nearest neighbor (k-NN), decision tree, support vector machine (SVM), and Naïve Bayes have been used. The hybrid method combining subtractive clustering based feature weighting with the decision tree classifier obtained a classification accuracy of 100% in the screening of DR. These results demonstrate that the proposed hybrid scheme is very promising for medical data set classification.
Keywords: machine learning, data weighting, classification, data mining
Procedia PDF Downloads 325
389 An Evaluation of Edible Plants for Remediation of Contaminated Soil- Can Edible Plants Be Used to Remove Heavy Metals on Soil?
Authors: Celia Marilia Martins, Sonia I. V. Guilundo, Iris M. Victorino, Antonio O. Quilambo
Abstract:
In Mozambique, rapid industrialization (mining, aluminium, and cement activities) and urbanization processes have led to the incorporation of heavy metals in soil, thus degrading not only the quality of the environment but also affecting plants, animals, and human health. Several methods have been used to remediate contaminated soils, but most of them are costly and rarely achieve optimum results. Currently, phytoremediation is an effective and affordable technological solution used to extract or remove inactive metals from contaminated soil. Phytoremediation is the use of plants to clean up contamination from soils, sediments, and water. This technology is environmentally friendly and potentially cost-effective. The present investigation summarises the potential of edible vegetables to grow under high levels of heavy metals such as lead and zinc. The plants used in these studies include tomatoes, lettuce, and soya beans. The studies have shown that edible plants can be grown under high levels of heavy metals in the soil. Further investigations are identifying the mechanisms used by plants, in order to ensure their safe and sustainable use for the remediation of soils contaminated by heavy metals.
Keywords: contaminated soil, edible plants, heavy metals, phytoremediation
Procedia PDF Downloads 376
388 Domain Adaptive Dense Retrieval with Query Generation
Authors: Rui Yin, Haojie Wang, Xun Li
Abstract:
Recently, mainstream dense retrieval methods have obtained state-of-the-art results on some datasets and tasks. However, they require large amounts of training data, which is not available in most domains. The severe performance degradation of dense retrievers on new data domains has limited the use of dense retrieval methods to only a few domains with large training datasets. In this paper, we propose an unsupervised domain-adaptive approach based on query generation. First, a generative model is used to generate relevant queries for each passage in the target corpus, and then, the generated queries are used for mining negative passages. Finally, the query-passage pairs are labeled with a cross-encoder and used to train a domain-adapted dense retriever. We also explore contrastive learning as a method for training domain-adapted dense retrievers and show that it leads to strong performance in various retrieval settings. Experiments show that our approach is more robust than previous methods in target domains that require less unlabeled data.
Keywords: dense retrieval, query generation, contrastive learning, unsupervised training
Procedia PDF Downloads 103
387 Aquatic and Marshy Flora from Fresh Water Wetlands on Quartz Sands in Pinar Del Río, Cuba
Authors: Vidal Pérez Hernández, Enrique González Pendás
Abstract:
Most of the aquatic and marshy flora in Cuba is located in quartzitic sand ecosystems, which are represented by a wide variety of freshwater wetlands spread across the whole southern and south-western plain of Pinar del Río. The survey carried out in these ecosystems offers an updated inventory of these species, showing their biological type, habit, distribution, and the degree of threat to which they are subjected, taking into account the categories assigned by the IUCN. A remarkable decrease in the total number of these species in this area is evident, due to deposition processes and deforestation caused by human activity and climate change. This is linked to other threats such as the unrestricted use of water reserves for irrigating groves, cattle raising, and intensive fishing. In addition, the sand, which is 99% pure crystalline quartz, is mined. The combination of all these factors has a negative influence on a flora that comprises more than 250 species, most of them herbaceous and hydrophytes. In these particular ecosystems, 40% of the total flora was found to be endemic, and more than 80% falls within the most sensitive threat categories; some species have already been declared extinct.
Keywords: aquatic flora, marshy flora, quartzitic sands, wetlands
Procedia PDF Downloads 228
386 How Virtualization, Decentralization, and Network-Building Change the Manufacturing Landscape: An Industry 4.0 Perspective
Authors: Malte Brettel, Niklas Friederichsen, Michael Keller, Marius Rosenberg
Abstract:
The German manufacturing industry has to withstand an increasing global competition on product quality and production costs. As labor costs are high, several industries have suffered severely under the relocation of production facilities towards aspiring countries, which have managed to close the productivity and quality gap substantially. Established manufacturing companies have recognized that customers are not willing to pay large price premiums for incremental quality improvements. As a consequence, many companies from the German manufacturing industry adjust their production focusing on customized products and fast time to market. Leveraging the advantages of novel production strategies such as Agile Manufacturing and Mass Customization, manufacturing companies transform into integrated networks, in which companies unite their core competencies. Hereby, virtualization of the process- and supply-chain ensures smooth inter-company operations providing real-time access to relevant product and production information for all participating entities. Boundaries of companies deteriorate, as autonomous systems exchange data, gained by embedded systems throughout the entire value chain. By including Cyber-Physical-Systems, advanced communication between machines is tantamount to their dialogue with humans. The increasing utilization of information and communication technology allows digital engineering of products and production processes alike. Modular simulation and modeling techniques allow decentralized units to flexibly alter products and thereby enable rapid product innovation. The present article describes the developments of Industry 4.0 within the literature and reviews the associated research streams. Hereby, we analyze eight scientific journals with regards to the following research fields: Individualized production, end-to-end engineering in a virtual process chain and production networks. 
We employ cluster analysis to assign sub-topics to the respective research fields. To assess the practical implications, we conducted face-to-face interviews with managers from industry as well as from the consulting business, using a structured interview guideline. The results reveal reasons for the adoption and refusal of Industry 4.0 practices from a managerial point of view. Our findings contribute to the emerging research stream of Industry 4.0 and support decision-makers in assessing their need for transformation towards Industry 4.0 practices.
Keywords: Industry 4.0, mass customization, production networks, virtual process-chain
Procedia PDF Downloads 277
385 Quantifying User-Related, System-Related, and Context-Related Patterns of Smartphone Use
Authors: Andrew T. Hendrickson, Liven De Marez, Marijn Martens, Gytha Muller, Tudor Paisa, Koen Ponnet, Catherine Schweizer, Megan Van Meer, Mariek Vanden Abeele
Abstract:
Quantifying and understanding the myriad ways people use their phones and how that impacts their relationships, cognitive abilities, mental health, and well-being is increasingly important in our phone-centric society. However, most studies on the patterns of phone use have focused on theory-driven tests of specific usage hypotheses using self-report questionnaires or analyses of smaller datasets. In this work we present a series of analyses from a large corpus of over 3000 users that combine data-driven and theory-driven analyses to identify reliable smartphone usage patterns and clusters of similar users. Furthermore, we compare the stability of user clusters across user- and system-initiated sessions, as well as during the hypothesized ritualized behavior times directly before and after sleeping. Our results indicate support for some hypothesized usage patterns but present a more complete and nuanced view of how people use smartphones.
Keywords: data mining, experience sampling, smartphone usage, health and well being
Procedia PDF Downloads 163
384 Internet of Things, Edge and Cloud Computing in Rock Mechanical Investigation for Underground Surveys
Authors: Esmael Makarian, Ayub Elyasi, Fatemeh Saberi, Olusegun Stanley Tomomewo
Abstract:
Rock mechanical investigation is one of the most crucial activities in underground operations, especially in surveys related to hydrocarbon exploration and production, geothermal reservoirs, energy storage, mining, and geotechnics. There is a wide range of traditional methods for driving, collecting, and analyzing rock mechanics data. However, these approaches may not be suitable or work perfectly in some situations, such as fractured zones. Cutting-edge technologies have been provided to solve and optimize the mentioned issues. Internet of Things (IoT), Edge, and Cloud Computing technologies (ECt & CCt, respectively) are among the most widely used and new artificial intelligence methods employed for geomechanical studies. IoT devices act as sensors and cameras for real-time monitoring and mechanical-geological data collection of rocks, such as temperature, movement, pressure, or stress levels. Structural integrity, especially for cap rocks within hydrocarbon systems, and rock mass behavior assessment, to further activities such as enhanced oil recovery (EOR) and underground gas storage (UGS), or to improve safety risk management (SRM) and potential hazards identification (P.H.I), are other benefits from IoT technologies. EC techniques can process, aggregate, and analyze data immediately collected by IoT on a real-time scale, providing detailed insights into the behavior of rocks in various situations (e.g., stress, temperature, and pressure), establishing patterns quickly, and detecting trends. Therefore, this state-of-the-art and useful technology can adopt autonomous systems in rock mechanical surveys, such as drilling and production (in hydrocarbon wells) or excavation (in mining and geotechnics industries). Besides, ECt allows all rock-related operations to be controlled remotely and enables operators to apply changes or make adjustments. It must be mentioned that this feature is very important in environmental goals. 
More often than not, rock mechanical studies draw on different kinds of data, such as laboratory tests, field operations, and indirect information like seismic or well-logging data. CCt provides a useful platform for storing and managing large volumes of heterogeneous information, which can be very useful in fractured zones. Additionally, CCt supplies powerful tools for predicting, modeling, and simulating rock mechanical information, especially in fractured zones within vast areas. It is also a suitable means of sharing extensive information on rock mechanics, such as the direction and size of fractures in a large oil field or mine. The comprehensive review findings demonstrate that digital transformation through integrated IoT, Edge, and Cloud solutions is revolutionizing traditional rock mechanical investigation. These advanced technologies have enabled real-time monitoring, predictive analysis, and data-driven decision-making, culminating in noteworthy enhancements in safety, efficiency, and sustainability. By employing IoT, CCt, and ECt, underground operations have therefore experienced a significant boost, allowing for timely and informed actions based on real-time data insights. The successful implementation of IoT, CCt, and ECt has led to safer operations, optimized processes, and environmentally conscious approaches in underground geological endeavors.
Keywords: rock mechanical studies, internet of things, edge computing, cloud computing, underground surveys, geological operations
Procedia PDF Downloads 62
383 Cost Sensitive Feature Selection in Decision-Theoretic Rough Set Models for Customer Churn Prediction: The Case of Telecommunication Sector Customers
Authors: Emel Kızılkaya Aydogan, Mihrimah Ozmen, Yılmaz Delice
Abstract:
In recent years, the telecommunications sector has been undergoing change and continuous development in the global market. In this sector, churn analysis techniques are commonly used to analyse why some customers terminate their service subscriptions prematurely. Customer churn is of the utmost significance in this sector since it causes substantial business loss. Many companies carry out research in order to prevent losses while increasing customer loyalty. Although a large quantity of accumulated data is available in this sector, its usefulness is limited by data quality and relevance. In this paper, a cost-sensitive feature selection framework is developed with the aim of obtaining feature reducts for predicting customer churn. The framework is a cost-based, optional pre-processing stage that removes redundant features for churn management. In addition, this cost-based feature selection algorithm is applied at a telecommunication company in Turkey, and the results obtained with this algorithm are reported.
Keywords: churn prediction, data mining, decision-theoretic rough set, feature selection
Procedia PDF Downloads 446
382 Morphological Processing of Punjabi Text for Sentiment Analysis of Farmer Suicides
Authors: Jaspreet Singh, Gurvinder Singh, Prabhsimran Singh, Rajinder Singh, Prithvipal Singh, Karanjeet Singh Kahlon, Ravinder Singh Sawhney
Abstract:
Morphological evaluation of Indian languages is one of the burgeoning fields in the area of Natural Language Processing (NLP). The evaluation of a language is an important task in the era of information retrieval and text mining. The extraction and classification of knowledge from text can be exploited for sentiment analysis and morphological evaluation. This study combines morphological evaluation and sentiment analysis for the task of classifying farmer suicide cases reported in the state of Punjab, India. The pre-processing of Punjabi text involves morphological evaluation and normalization of Punjabi word tokens, followed by training of the proposed model using deep learning classification on Punjabi-language text extracted from online Punjabi news reports. The class-wise accuracies of sentiment prediction for four negatively oriented classes of farmer suicide cases are 93.85%, 88.53%, 83.3%, and 95.45%, respectively. The overall accuracy of sentiment classification obtained using the proposed framework on 275 Punjabi text documents is 90.29%.
Keywords: deep neural network, farmer suicides, morphological processing, punjabi text, sentiment analysis
Procedia PDF Downloads 326
381 A Method for the Extraction of the Character's Tendency from Korean Novels
Authors: Min-Ha Hong, Kee-Won Kim, Seung-Hoon Kim
Abstract:
Characters are one of the core elements for understanding story-based content such as novels and movies. In particular, a character’s tendency is an important factor in analyzing story-based content, because it has a significant influence on the storyline. If readers know the tendencies of the characters before reading a novel, it is easier for them to understand the structure of conflicts, episodes, and relationships between characters in the novel. This knowledge may therefore help readers select a novel they want to read. In this paper, we propose a method of extracting the tendency of the characters from a novel written in Korean. First, we build a dictionary of pairs of emotion words in Korean and English, since the emotion words in a novel’s sentences express the characters’ feelings. We rate the degree of polarity (positive or negative) of the words in our emotion word dictionary based on SenticNet. Then we extract characters and emotion words from the sentences of a novel. Since the polarity of a word becomes stronger or weaker depending on sentence features such as quotations and modifiers, our proposed method takes these features into account when calculating the polarity of characters. The extracted character polarity information can be used in book search or book recommendation services.
Keywords: character tendency, data mining, emotion word, Korean novel
Procedia PDF Downloads 334
380 Mineralogical Study of the Triassic Clay of Maaziz and the Miocene Marl of Akrach in Morocco: Analysis and Evaluating of the Two Geomaterials for the Construction of Ceramic Bricks
Authors: Sahar El Kasmi, Ayoub Aziz, Saadia Lharti, Mohammed El Janati, Boubker Boukili, Nacer El Motawakil, Mayom Chol Luka Awan
Abstract:
Two types of geomaterials (red Triassic clay from the Maaziz region and yellow Pliocene clay from the Akrach region) were used to create different mixtures for the fabrication of ceramic bricks. This study investigated the influence of the Pliocene clay on the overall composition and mechanical properties of the Triassic clay. The red Triassic clay, sourced from Maaziz, underwent various mechanical processes and treatments to facilitate its transformation into ceramic bricks for construction. The Triassic clay was placed in a drying chamber and a heating chamber at 100°C to remove moisture. Subsequently, the dried clay samples were processed using a Planetary Babs ll Mill to reduce particle size and improve homogeneity. The resulting clay material was sieved, and the fine particles below 100 mm were collected for further analysis. In parallel, the Miocene marl obtained from the Akrach region was fragmented into finer particles and subjected to the same drying, grinding, and sieving procedures as the Triassic clay. The two clay samples were then amalgamated and homogenized in different proportions. Precise measurements were taken using a weighing balance, and mixtures of 90%, 80%, and 70% Triassic clay with 10%, 20%, and 30% yellow clay, respectively, were prepared. To evaluate the impact of Pliocene marl on the composition, the prepared clay mixtures were spread evenly and treated with a water modifier to enhance plasticity. The clay was then molded using a brick-making machine, and the initial manipulation process was observed. Additional batches were prepared with incremental amounts of Pliocene marl to further investigate its effect on the fracture behavior of the clay, specifically its resistance. The molded clay bricks were subjected to compression tests to measure their strength and resistance to deformation. 
Additional tests, such as water absorption tests, were also conducted to assess the overall performance of the ceramic bricks fabricated from the different clay mixtures. The results were analyzed to determine the influence of the Pliocene marl on the strength and durability of the Triassic clay bricks. The results indicated that the incorporation of Pliocene clay reduced fracturing of the Triassic clay, with a noticeable reduction observed at 10% addition. No fractures were observed when 20% and 30% of yellow clay were added. These findings suggest that yellow clay can enhance the mechanical properties and structural integrity of red clay-based products.
Keywords: triassic clay, pliocene clay, mineralogical composition, geo-materials, ceramics, akrach region, maaziz region, morocco
Procedia PDF Downloads 88
379 Methotrexate Associated Skin Cancer: A Signal Review of Pharmacovigilance Center
Authors: Abdulaziz Alakeel, Abdulrahman Alomair, Mohammed Fouda
Abstract:
Introduction: Methotrexate (MTX) is an antimetabolite used to treat multiple conditions, including neoplastic diseases, severe psoriasis, and rheumatoid arthritis. Skin cancer is the out-of-control growth of abnormal cells in the epidermis, the outermost skin layer, caused by unrepaired DNA damage that triggers mutations. These mutations lead the skin cells to multiply rapidly and form malignant tumors. The aim of this review is to evaluate the risk of skin cancer associated with the use of methotrexate and to suggest regulatory recommendations if required. Methodology: The Signal Detection team at the Saudi Food and Drug Authority (SFDA) performed a safety review using the National Pharmacovigilance Center (NPC) database and the World Health Organization (WHO) VigiBase, alongside literature screening, to retrieve information for assessing the causality between skin cancer and methotrexate. The search was conducted in July 2020. Results: Four published articles support the association found in the literature search. A recent randomized controlled trial published in 2020 revealed a statistically significant increase in skin cancer among MTX users. Another study reported that methotrexate increases the risk of non-melanoma skin cancer when used in combination with immunosuppressant and biologic agents. In addition, the incidence of melanoma among methotrexate users was 3-fold higher than in the general population in a cohort study of rheumatoid arthritis patients. The last article, a cohort study estimating the risk of cutaneous malignant melanoma (CMM), observed a statistically significant increase in CMM risk in MTX-exposed patients. The WHO database (VigiBase) was searched for individual case safety reports (ICSRs) reported for “Skin Cancer” and 'Methotrexate' use, which yielded 121 ICSRs. The initial review revealed that 106 cases were insufficiently documented for proper medical assessment. 
However, the remaining fifteen cases were extensively evaluated by applying the WHO criteria of causality assessment. As a result, 30 percent of the cases showed that MTX could possibly cause skin cancer; five cases showed an unlikely association, and five cases were un-assessable due to lack of information. The Saudi NPC database was searched for cases reported under the combined terms methotrexate/skin cancer; however, no local cases have been reported to date. In data mining, the observed and expected reporting rates for a drug/adverse drug reaction pair are compared using the information component (IC), a tool developed by the WHO Uppsala Monitoring Centre to measure the reporting ratio. A positive IC reflects a higher statistical association, while negative values indicate a weaker statistical association, with a null value of zero. The results showed that the combination of 'Methotrexate' and 'Skin cancer' was observed more often than expected when compared to other medications in the WHO database (IC value of 1.2). Conclusion: The weighted cumulative evidence identified from global cases, data mining, and published literature is sufficient to support a causal association between the risk of skin cancer and methotrexate. Therefore, health care professionals should be aware of this possible risk and may consider monitoring for any signs or symptoms of skin cancer in patients treated with methotrexate.
Keywords: methotrexate, skin cancer, signal detection, pharmacovigilance
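To make the observed-versus-expected comparison concrete, here is a minimal sketch of an information component calculation. It assumes the commonly cited shrinkage form, IC = log2((observed + 0.5) / (expected + 0.5)), with the expected count derived from the drug's and the reaction's overall report counts; the abstract does not state which variant was used. The function name and all counts below are hypothetical and are not taken from VigiBase.

```python
import math


def information_component(n_observed, n_drug, n_reaction, n_total):
    """Observed-to-expected reporting ratio on a log2 scale (assumed form).

    n_observed: reports mentioning both the drug and the reaction
    n_drug:     all reports mentioning the drug
    n_reaction: all reports mentioning the reaction
    n_total:    all reports in the database
    """
    # Expected count if drug and reaction were reported independently
    n_expected = n_drug * n_reaction / n_total
    # The +0.5 terms shrink the estimate towards zero for small counts
    return math.log2((n_observed + 0.5) / (n_expected + 0.5))


# Hypothetical counts, for illustration only
ic_null = information_component(10, 100, 100, 1000)  # observed equals expected
ic_high = information_component(40, 100, 100, 1000)  # observed above expected
ic_low = information_component(2, 100, 100, 1000)    # observed below expected
print(ic_null, ic_high, ic_low)
```

On this scale, a value of zero means the pair is reported exactly as often as expected, matching the null value described in the abstract.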
Procedia PDF Downloads 114