Search results for: predictive mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2056

Search results for: predictive mining

856 A Comparative Study between Different Techniques of Off-Page and On-Page Search Engine Optimization

Authors: Ahmed Ishtiaq, Maeeda Khalid, Umair Sajjad

Abstract:

In the fast-moving world, information is the key to success. If information is easily available, then it makes work easy. The Internet is the biggest collection and source of information nowadays, and with every single day, the data on internet increases, and it becomes difficult to find required data. Everyone wants to make his/her website at the top of search results. This can be possible when you have applied some techniques of SEO inside your application or outside your application, which are two types of SEO, onsite and offsite SEO. SEO is an abbreviation of Search Engine Optimization, and it is a set of techniques, methods to increase users of a website on World Wide Web or to rank up your website in search engine indexing. In this paper, we have compared different techniques of Onpage and Offpage SEO, and we have suggested many things that should be changed inside webpage, outside web page and mentioned some most powerful and search engine considerable elements and techniques in both types of SEO in order to gain high ranking on Search Engine.

Keywords: auto-suggestion, search engine optimization, SEO, query, web mining, web crawler

Procedia PDF Downloads 151
855 A Comparative Analysis of Machine Learning Techniques for PM10 Forecasting in Vilnius

Authors: Mina Adel Shokry Fahim, Jūratė Sužiedelytė Visockienė

Abstract:

With the growing concern over air pollution (AP), it is clear that this has gained more prominence than ever before. The level of consciousness has increased and a sense of knowledge now has to be forwarded as a duty by those enlightened enough to disseminate it to others. This realisation often comes after an understanding of how poor air quality indices (AQI) damage human health. The study focuses on assessing air pollution prediction models specifically for Lithuania, addressing a substantial need for empirical research within the region. Concentrating on Vilnius, it specifically examines particulate matter concentrations 10 micrometers or less in diameter (PM10). Utilizing Gaussian Process Regression (GPR) and Regression Tree Ensemble, and Regression Tree methodologies, predictive forecasting models are validated and tested using hourly data from January 2020 to December 2022. The study explores the classification of AP data into anthropogenic and natural sources, the impact of AP on human health, and its connection to cardiovascular diseases. The study revealed varying levels of accuracy among the models, with GPR achieving the highest accuracy, indicated by an RMSE of 4.14 in validation and 3.89 in testing.

Keywords: air pollution, anthropogenic and natural sources, machine learning, Gaussian process regression, tree ensemble, forecasting models, particulate matter

Procedia PDF Downloads 54
854 Comparisons of Surveying with Terrestrial Laser Scanner and Total Station for Volume Determination of Overburden and Coal Excavations in Large Open-Pit Mine

Authors: B. Keawaram, P. Dumrongchai

Abstract:

The volume of overburden and coal excavations in open-pit mine is generally determined by conventional survey such as total station. This study aimed to evaluate the accuracy of terrestrial laser scanner (TLS) used to measure overburden and coal excavations, and to compare TLS survey data sets with the data of the total station. Results revealed that, the reference points measured with the total station showed 0.2 mm precision for both horizontal and vertical coordinates. When using TLS on the same points, the standard deviations of 4.93 cm and 0.53 cm for horizontal and vertical coordinates, respectively, were achieved. For volume measurements covering the mining areas of 79,844 m2, TLS yielded the mean difference of about 1% and the surface error margin of 6 cm at the 95% confidence level when compared to the volume obtained by total station.

Keywords: mine, survey, terrestrial laser scanner, total station

Procedia PDF Downloads 388
853 Providing a Practical Model to Reduce Maintenance Costs: A Case Study in GeG Company

Authors: Iman Atighi, Jalal Soleimannejad, Reza Pourjafarabadi, Saeid Moradpour

Abstract:

In the past, we could increase profit by increasing product prices. But in the new decade, a competitive market does not let us to increase profit with increased prices. Therefore, the only way to increase profit will be to reduce costs. A significant percentage of production costs are the maintenance costs, and analysis of these costs could achieve more profit. Most maintenance strategies such as RCM (Reliability-Center-Maintenance), TPM (Total Productivity Maintenance), PM (Preventive Maintenance) and etc., are trying to reduce maintenance costs. In this paper, decreasing the maintenance costs of Concentration Plant of Golgohar Iron Ore Mining & Industrial Company (GeG) was examined by using of MTBF (Mean Time Between Failures) and MTTR (Mean Time To Repair) analyses. These analyses showed that instead of buying new machines and increasing costs in order to promote capacity, the improving of MTBF and MTTR indexes would solve capacity problems in the best way and decrease costs.

Keywords: GeG company, maintainability, maintenance costs, reliability-center-maintenance

Procedia PDF Downloads 222
852 Treatment of Acid Mine Drainage with Modified Fly Ash

Authors: Sukla Saha, Alok Sinha

Abstract:

Acid mine drainage (AMD) is the generation of acidic water from active as well as abandoned mines. AMD generates due to the oxidation of pyrites present in the rock in mining areas. Sulfur oxidizing bacteria such as Thiobacillus ferrooxidans acts as a catalyst in this oxidation process. The characteristics of AMD is extreme low pH (2-3) with elevated concentration of different heavy metals such as Fe, Al, Zn, Mn, Cu and Co and anions such sulfate and chloride. AMD contaminate the ground water as well as surface water which leads to the degradation of water quality. Moreover, it carries detrimental effect for aquatic organism and degrade the environment. In the present study, AMD is treated with fly ash, modified with alkaline agent (NaOH). This modified fly ash (MFA) was experimentally proven as a very effective neutralizing agent for the treatment of AMD. It was observed that pH of treated AMD raised to 9.22 from 1.51 with 100g/L of MFA dose. Approximately, 99% removal of Fe, Al, Mn, Cu and Co took place with the same MFA dose. The treated water comply with the effluent discharge standard of (IS: 2490-1981).

Keywords: acid mine drainage, heavy metals, modified fly ash, neutralization

Procedia PDF Downloads 153
851 Cost Analysis of Optimized Fast Network Mobility in IEEE 802.16e Networks

Authors: Seyyed Masoud Seyyedoshohadaei, Borhanuddin Mohd Ali

Abstract:

To support group mobility, the NEMO Basic Support Protocol has been standardized as an extension of Mobile IP that enables an entire network to change its point of attachment to the Internet. Using NEMO in IEEE 802.16e (WiMax) networks causes latency in handover procedure and affects seamless communication of real-time applications. To decrease handover latency and service disruption time, an integrated scheme named Optimized Fast NEMO (OFNEMO) was introduced by authors of this paper. In OFNEMO a pre-establish multi tunnels concept, cross function optimization and cross layer design are used. In this paper, an analytical model is developed to evaluate total cost consisting of signaling and packet delivery costs of the OFNEMO compared with RFC3963. Results show that OFNEMO increases probability of predictive mode compared with RFC3963 due to smaller handover latency. Even though OFNEMO needs extra signalling to pre-establish multi tunnel, it has less total cost thanks to its optimized algorithm. OFNEMO can minimize handover latency for supporting real time application in moving networks.

Keywords: fast mobile IPv6, handover latency, IEEE802.16e, network mobility

Procedia PDF Downloads 198
850 A Similarity Measure for Classification and Clustering in Image Based Medical and Text Based Banking Applications

Authors: K. P. Sandesh, M. H. Suman

Abstract:

Text processing plays an important role in information retrieval, data-mining, and web search. Measuring the similarity between the documents is an important operation in the text processing field. In this project, a new similarity measure is proposed. To compute the similarity between two documents with respect to a feature the proposed measure takes the following three cases into account: (1) The feature appears in both documents; (2) The feature appears in only one document and; (3) The feature appears in none of the documents. The proposed measure is extended to gauge the similarity between two sets of documents. The effectiveness of our measure is evaluated on several real-world data sets for text classification and clustering problems, especially in banking and health sectors. The results show that the performance obtained by the proposed measure is better than that achieved by the other measures.

Keywords: document classification, document clustering, entropy, accuracy, classifiers, clustering algorithms

Procedia PDF Downloads 519
849 Machine Learning Predictive Models for Hydroponic Systems: A Case Study Nutrient Film Technique and Deep Flow Technique

Authors: Kritiyaporn Kunsook

Abstract:

Machine learning algorithms (MLAs) such us artificial neural networks (ANNs), decision tree, support vector machines (SVMs), Naïve Bayes, and ensemble classifier by voting are powerful data driven methods that are relatively less widely used in the mapping of technique of system, and thus have not been comparatively evaluated together thoroughly in this field. The performances of a series of MLAs, ANNs, decision tree, SVMs, Naïve Bayes, and ensemble classifier by voting in technique of hydroponic systems prospectively modeling are compared based on the accuracy of each model. Classification of hydroponic systems only covers the test samples from vegetables grown with Nutrient film technique (NFT) and Deep flow technique (DFT). The feature, which are the characteristics of vegetables compose harvesting height width, temperature, require light and color. The results indicate that the classification performance of the ANNs is 98%, decision tree is 98%, SVMs is 97.33%, Naïve Bayes is 96.67%, and ensemble classifier by voting is 98.96% algorithm respectively.

Keywords: artificial neural networks, decision tree, support vector machines, naïve Bayes, ensemble classifier by voting

Procedia PDF Downloads 375
848 A Social Decision Support Mechanism for Group Purchasing

Authors: Lien-Fa Lin, Yung-Ming Li, Fu-Shun Hsieh

Abstract:

With the advancement of information technology and development of group commerce, people have obviously changed in their lifestyle. However, group commerce faces some challenging problems. The products or services provided by vendors do not satisfactorily reflect customers’ opinions, so that the sale and revenue of group commerce gradually become lower. On the other hand, the process for a formed customer group to reach group-purchasing consensus is time-consuming and the final decision is not the best choice for each group members. In this paper, we design a social decision support mechanism, by using group discussion message to recommend suitable options for group members and we consider social influence and personal preference to generate option ranking list. The proposed mechanism can enhance the group purchasing decision making efficiently and effectively and venders can provide group products or services according to the group option ranking list.

Keywords: social network, group decision, text mining, group commerce

Procedia PDF Downloads 485
847 Digital Wellbeing: A Multinational Study and Global Index

Authors: Fahad Al Beyahi, Justin Thomas, Md Mamunur Rashid

Abstract:

Various definitions of digital well-being have emerged in recent years, most of which center on the impacts -beneficial and detrimental- of digital technology on health and well-being (psychological, social, and financial). Other definitions go further, emphasizing the attainment of balance, viewing digital well-being as wholly subjective, the individual’s perception of optimal balance between the benefits and ills associated with online connectivity. Based on this broad conceptualization of digital well-being, we undertook a global survey measuring various dimensions of this emerging construct. The survey was administered across 35 nations and 7 world regions, with 1000 participants within each territory (N= 35000). Along with attitudinal, behavioral, and sociodemographic variables, the survey included measures of depression, anxiety, problematic social media use, gaming disorder, and other relevant metrics. Coupled with nation-level policy audits, these data were used to create a multinational (global) digital well-being index. Nations are ranked based on various dimensions of digital well-being, and predictive models are used to identify resilience and risk factors for problem technology use. In this paper, we will discuss key findings from the survey and the index. This work can inform public policy and shape our responses to the emerging implications of lives increasingly lived online and interconnected with digital technology.

Keywords: technology, health, behavioral addiction, digital wellbeing

Procedia PDF Downloads 81
846 Predicting the Diagnosis of Alzheimer’s Disease: Development and Validation of Machine Learning Models

Authors: Jay L. Fu

Abstract:

Patients with Alzheimer's disease progressively lose their memory and thinking skills and, eventually, the ability to carry out simple daily tasks. The disease is irreversible, but early detection and treatment can slow down the disease progression. In this research, publicly available MRI data and demographic data from 373 MRI imaging sessions were utilized to build models to predict dementia. Various machine learning models, including logistic regression, k-nearest neighbor, support vector machine, random forest, and neural network, were developed. Data were divided into training and testing sets, where training sets were used to build the predictive model, and testing sets were used to assess the accuracy of prediction. Key risk factors were identified, and various models were compared to come forward with the best prediction model. Among these models, the random forest model appeared to be the best model with an accuracy of 90.34%. MMSE, nWBV, and gender were the three most important contributing factors to the detection of Alzheimer’s. Among all the models used, the percent in which at least 4 of the 5 models shared the same diagnosis for a testing input was 90.42%. These machine learning models allow early detection of Alzheimer’s with good accuracy, which ultimately leads to early treatment of these patients.

Keywords: Alzheimer's disease, clinical diagnosis, magnetic resonance imaging, machine learning prediction

Procedia PDF Downloads 143
845 Preferred Left-Handed Conformation of Glycyls at Pathogenic Sites

Authors: Purva Mishra, Rajesh Potlia, Kuljeet Singh Sandhu

Abstract:

The role of glycyl residues in the protein structure has lingered within the research community for the last several decades. Glycyl residue is the only amino acid that is achiral due to the lack of a side chain and can, therefore, exhibit Ramachandran conformations that are disallowed for L-amino acids. The structural and functional significance of glycyl residues with L-disallowed conformation, however, remains obscure. Through statistical analysis of various datasets, we found that the glycyls with L-disallowed conformations are over-represented at disease-associated sites and tend to be evolutionarily conserved. The mutations of L-disallowed glycyls tend to destabilize the native conformation, reduce protein solubility, and promote inter-molecular aggregations. We uncovered a structural motif referred to as “β-crescent” formed around the L-disallowed glycyl, which prevents β-sheet aggregation by disrupting the alternating pattern of β-pleats. The L-disallowed conformation of glycyls also holds predictive power to infer the pathogenic missense variants. Altogether, our observations highlight that the L-disallowed conformation of glycyls is selected to facilitate native folding and prevent inter-molecular aggregations. The findings may also have implications for designing more stable proteins and prioritizing the genetic lesions implicated in diseases.

Keywords: Ramachandran plot, β-sheet, protein stability, protein aggregation

Procedia PDF Downloads 72
844 Incremental Learning of Independent Topic Analysis

Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda

Abstract:

In this paper, we present a method of applying Independent Topic Analysis (ITA) to increasing the number of document data. The number of document data has been increasing since the spread of the Internet. ITA was presented as one method to analyze the document data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis (ICA). ICA is a technique in the signal processing; however, it is difficult to apply the ITA to increasing number of document data. Because ITA must use the all document data so temporal and spatial cost is very high. Therefore, we present Incremental ITA which extracts the independent topics from increasing number of document data. Incremental ITA is a method of updating the independent topics when the document data is added after extracted the independent topics from a just previous the data. In addition, Incremental ITA updates the independent topics when the document data is added. And we show the result applied Incremental ITA to benchmark datasets.

Keywords: text mining, topic extraction, independent, incremental, independent component analysis

Procedia PDF Downloads 309
843 The Predictive Value of Extensor Grip Test for the Effectiveness of Treatment for Tennis Elbow: A Randomized Controlled Trial

Authors: Mohammad Javad Zehtab, S. Alireza Mirghasemi, Ali Majlesara, Parvin Tajik, Babak Siavashi

Abstract:

Objective: There are different modalities proposed for tennis elbow treatment with few randomized trials comparing them. We designed a study to compare the effectiveness of five different modalities and determine the usefulness of recently proposed extensor grip test (EGT) in predicting the response to treatment. Methods: In a randomized controlled clinical trial 92 of 98 tennis elbow patients in Sina hospital of Tehran, Iran between 2006 and 2007 fulfill trial entry criteria, among these patients 56 (60.9%) had positive EGT result. Stratified on EGT result, patients allocated randomly to 5 treatment groups: Brace (B) group, physiotherapy (P), brace + physiotherapy (BP), injection (I) and injection + physiotherapy (IP). Results: Patients who had positive result of EGT had better response to treatments: less SOC (p = 0.06), less PFFQ and patients’ satisfaction scores (p < 0.001). Among the treatment IP was the most successful, then BP, P and B, respectively; injection was the worst treatment modality. Response to treatment was comparable in all groups between EGT positive and negative patients except bracing; in which positive EGT was correlated with a dramatic response to treatment. Conclusion: In all patients IP and then BP is recommended but in EGT negatives, bracing seems to be of no use. Injection alone is not recommended in either group.

Keywords: tennis elbow, extensor grip test, physiotherapy, tennis elbow treatment

Procedia PDF Downloads 285
842 Computational Chemical-Composition of Carbohydrates in the Context of Healthcare Informatics

Authors: S. Chandrasekaran, S. Nandita, M. Shivathmika, Srikrishnan Shivakumar

Abstract:

The objective of the research work is to analyze the computational chemical-composition of carbohydrates in the context of healthcare informatics. The computation involves the representation of complex chemical molecular structure of carbohydrate using graph theory and in a deployable Chemical Markup Language (CML). The parallel molecular structure of the chemical molecules with or without other adulterants for the sake of business profit can be analyzed in terms of robustness and derivatization measures. The rural healthcare program should create awareness in malnutrition to reduce ill-effect of decomposition and help the consumers to know the level of such energy storage mixtures in a quantitative way. The earlier works were based on the empirical and wet data which can vary from time to time but cannot be made to reuse the results of mining. The work is carried out on the quantitative computational chemistry on carbohydrates to provide a safe and secure right to food act and its regulations.

Keywords: carbohydrates, chemical-composition, chemical markup, robustness, food safety

Procedia PDF Downloads 374
841 Drivers of Energy Saving Behaviour: The Relative Influence of Normative, Habitual, Intentional, and Situational Processes

Authors: Karlijn Van Den Broek, Ian Walker, Christian Klöckner

Abstract:

Campaigns aiming to induce energy-saving behaviour among householders use a wide range of approaches that address many different drivers thought to underpin this behaviour. However, little research has compared the relative importance of the different factors that influence energy behaviour, meaning campaigns are not informed about where best to focus resources. Therefore, this study applies the Comprehensive Action Determination Model (CADM) to compare the role of normative, intentional, habitual, and situational processes on energy-saving behaviour. An online survey on a sample of households (N = 247) measured the CADM variables and the data was analysed using structural equation modelling. Results showed that situational and habitual processes were best able to account for energy saving behaviour while normative and intentional processes had little predictive power. These findings suggest that policymakers should move away from motivating householders to save energy and should instead focus their efforts on changing energy habits and creating environments that facilitate energy saving behaviour. These findings add to the wider development in social and environmental psychology that emphasizes the importance of extra-personal variables such as the physical environment in shaping behaviour.

Keywords: energy consumption, behavioural modelling, environmental psychology theory, habits, values

Procedia PDF Downloads 258
840 Development of PCI Prediction Models for Distress Evaluation of Asphalt Pavements

Authors: Hamid Noori

Abstract:

A scientific approach is essential for evaluating pavement surface conditions at the network level. The Pavement Condition Index (PCI) is widely used to assess surface conditions and determine appropriate treatments. This study examines three national highways using a network survey vehicle to collect distress data. The first two corridors were used for evaluation and comparison, while the third corridor validated the predicted PCI values. Multiple linear regression (MLR) initially modeled the relationship between PCI and distress variables but showed poor predictive accuracy. Therefore, K-nearest neighbors (KNN) and artificial neural network (ANN) models were developed, providing better results. A methodology for prioritizing pavement sections was introduced, and the pavement sections were based on PCI, IRI, and rut values through Combined Index Rankings (CIR). In addition, a methodology has been proposed for the selection of appropriate treatment of the ranked candidate pavement section. The proposed treatment selection process considers PCI, IRI, rutting, and FWD test results, aligning with a customized PCI rating scale. A Decision Tree was developed to recommend suitable treatments based on these criteria.

Keywords: pavement distresses, pavement condition index, multiple linear regression, artificial neural network, k-nearest neighbors, combined index ranking

Procedia PDF Downloads 0
839 A Heart Arrhythmia Prediction Using Machine Learning’s Classification Approach and the Concept of Data Mining

Authors: Roshani S. Golhar, Neerajkumar S. Sathawane, Snehal Dongre

Abstract:

Background and objectives: As the, cardiovascular illnesses increasing and becoming cause of mortality worldwide, killing around lot of people each year. Arrhythmia is a type of cardiac illness characterized by a change in the linearity of the heartbeat. The goal of this study is to develop novel deep learning algorithms for successfully interpreting arrhythmia using a single second segment. Because the ECG signal indicates unique electrical heart activity across time, considerable changes between time intervals are detected. Such variances, as well as the limited number of learning data available for each arrhythmia, make standard learning methods difficult, and so impede its exaggeration. Conclusions: The proposed method was able to outperform several state-of-the-art methods. Also proposed technique is an effective and convenient approach to deep learning for heartbeat interpretation, that could be probably used in real-time healthcare monitoring systems

Keywords: electrocardiogram, ECG classification, neural networks, convolutional neural networks, portable document format

Procedia PDF Downloads 72
838 Literature Review on Text Comparison Techniques: Analysis of Text Extraction, Main Comparison and Visual Representation Tools

Authors: Andriana Mkrtchyan, Vahe Khlghatyan

Abstract:

The choice of a profession is one of the most important decisions people make throughout their life. With the development of modern science, technologies, and all the spheres existing in the modern world, more and more professions are being arisen that complicate even more the process of choosing. Hence, there is a need for a guiding platform to help people to choose a profession and the right career path based on their interests, skills, and personality. This review aims at analyzing existing methods of comparing PDF format documents and suggests that a 3-stage approach is implemented for the comparison, that is – 1. text extraction from PDF format documents, 2. comparison of the extracted text via NLP algorithms, 3. comparison representation using special shape and color psychology methodology.

Keywords: color psychology, data acquisition/extraction, data augmentation, disambiguation, natural language processing, outlier detection, semantic similarity, text-mining, user evaluation, visual search

Procedia PDF Downloads 79
837 How Unicode Glyphs Revolutionized the Way We Communicate

Authors: Levi Corallo

Abstract:

Typed language made by humans on computers and cell phones has made a significant distinction from previous modes of written language exchanges. While acronyms remain one of the most predominant markings of typed language, another and perhaps more recent revolution in the way humans communicate has been with the use of symbols or glyphs, primarily Emojis—globally introduced on the iPhone keyboard by Apple in 2008. This paper seeks to analyze the use of symbols in typed communication from both a linguistic and machine learning perspective. The Unicode system will be explored and methods of encoding will be juxtaposed with the current machine and human perception. Topics in how typed symbol usage exists in conversation will be explored as well as topics across current research methods dealing with Emojis like sentiment analysis, predictive text models, and so on. This study proposes that sequential analysis is a significant feature for analyzing unicode characters in a corpus with machine learning. Current models that are trying to learn or translate the meaning of Emojis should be starting to learn using bi- and tri-grams of Emoji, as well as observing the relationship between combinations of different Emoji in tandem. The sociolinguistics of an entire new vernacular of language referred to here as ‘typed language’ will also be delineated across my analysis with unicode glyphs from both a semantic and technical perspective.

Keywords: unicode, text symbols, emojis, glyphs, communication

Procedia PDF Downloads 194
836 A Research and Application of Feature Selection Based on IWO and Tabu Search

Authors: Laicheng Cao, Xiangqian Su, Youxiao Wu

Abstract:

Feature selection is one of the important problems in network security, pattern recognition, data mining and other fields. In order to remove redundant features, effectively improve the detection speed of intrusion detection system, proposes a new feature selection method, which is based on the invasive weed optimization (IWO) algorithm and tabu search algorithm(TS). Use IWO as a global search, tabu search algorithm for local search, to improve the results of IWO algorithm. The experimental results show that the feature selection method can effectively remove the redundant features of network data information in feature selection, reduction time, and to guarantee accurate detection rate, effectively improve the speed of detection system.

Keywords: intrusion detection, feature selection, iwo, tabu search

Procedia PDF Downloads 531
835 Predictive Analytics of Student Performance Determinants

Authors: Mahtab Davari, Charles Edward Okon, Somayeh Aghanavesi

Abstract:

Every institute of learning is usually interested in the performance of enrolled students. The level of these performances determines the approach an institute of study may adopt in rendering academic services. The focus of this paper is to evaluate students' academic performance in given courses of study using machine learning methods. This study evaluated various supervised machine learning classification algorithms such as Logistic Regression (LR), Support Vector Machine, Random Forest, Decision Tree, K-Nearest Neighbors, Linear Discriminant Analysis, and Quadratic Discriminant Analysis, using selected features to predict study performance. The accuracy, precision, recall, and F1 score obtained from a 5-Fold Cross-Validation were used to determine the best classification algorithm to predict students’ performances. SVM (using a linear kernel), LDA, and LR were identified as the best-performing machine learning methods. Also, using the LR model, this study identified students' educational habits such as reading and paying attention in class as strong determinants for a student to have an above-average performance. Other important features include the academic history of the student and work. Demographic factors such as age, gender, high school graduation, etc., had no significant effect on a student's performance.

Keywords: student performance, supervised machine learning, classification, cross-validation, prediction

Procedia PDF Downloads 128
834 Performance Evaluation of an Ontology-Based Arabic Sentiment Analysis

Authors: Salima Behdenna, Fatiha Barigou, Ghalem Belalem

Abstract:

Due to the quick increase in the volume of Arabic opinions posted on various social media, Arabic sentiment analysis has become one of the most important areas of research. Compared to English, there is very little works on Arabic sentiment analysis, in particular aspect-based sentiment analysis (ABSA). In ABSA, aspect extraction is the most important task. In this paper, we propose a semantic aspect-based sentiment analysis approach for standard Arabic reviews to extract explicit aspect terms and identify the polarity of the extracted aspects. The proposed approach was evaluated using HAAD datasets. Experiments showed that the proposed approach achieved a good level of performance compared with baseline results. The F-measure was improved by 19% for the aspect term extraction tasks and 55% aspect term polarity task.

Keywords: sentiment analysis, opinion mining, Arabic, aspect level, opinion, polarity

Procedia PDF Downloads 163
833 Topic Sentiments toward the COVID-19 Vaccine on Twitter

Authors: Melissa Vang, Raheyma Khan, Haihua Chen

Abstract:

The coronavirus disease 2019 (COVID‐19) pandemic has changed people's lives from all over the world. More people have turned to Twitter to engage online and discuss the COVID-19 vaccine. This study aims to present a text mining approach to identify people's attitudes towards the COVID-19 vaccine on Twitter. To achieve this purpose, we collected 54,268 COVID-19 vaccine tweets from September 01, 2020, to November 01, 2020, then the BERT model is used for the sentiment and topic analysis. The results show that people had more negative than positive attitudes about the vaccine, and countries with an increasing number of confirmed cases had a higher percentage of negative attitudes. Additionally, the topics discussed in positive and negative tweets are different. The tweet datasets can be helpful to information professionals to inform the public about vaccine-related informational resources. Our findings may have implications for understanding people's cognitions and feelings about the vaccine.

Keywords: BERT, COVID-19 vaccine, sentiment analysis, topic modeling

Procedia PDF Downloads 152
832 Predicting Relative Performance of Sector Exchange Traded Funds Using Machine Learning

Authors: Jun Wang, Ge Zhang

Abstract:

Machine learning has been used in many areas today. It thrives at reviewing large volumes of data and identifying patterns and trends that might not be apparent to a human. Given the huge potential benefit and the amount of data available in the financial market, it is not surprising to see machine learning applied to various financial products. While future prices of financial securities are extremely difficult to forecast, we study them from a different angle. Instead of trying to forecast future prices, we apply machine learning algorithms to predict the direction of future price movement, in particular, whether a sector Exchange Traded Fund (ETF) would outperform or underperform the market in the next week or in the next month. We apply several machine learning algorithms for this prediction. The algorithms are Linear Discriminant Analysis (LDA), k-Nearest Neighbors (KNN), Decision Tree (DT), Gaussian Naive Bayes (GNB), and Neural Networks (NN). We show that these machine learning algorithms, most notably GNB and NN, have some predictive power in forecasting out-performance and under-performance out of sample. We also try to explore whether it is possible to utilize the predictions from these algorithms to outperform the buy-and-hold strategy of the S&P 500 index. The trading strategy to explore out-performance predictions does not perform very well, but the trading strategy to explore under-performance predictions can earn higher returns than simply holding the S&P 500 index out of sample.

Keywords: machine learning, ETF prediction, dynamic trading, asset allocation

Procedia PDF Downloads 100
831 Infrastructure Project Management and Implementation: A Case Study Of the Mokolo-Crocodile Water Augmentation Project in South Africa

Authors: Elkington Sibusiso Mnguni

Abstract:

The Mokolo-Crocodile Water Augmentation Project (MCWAP) is located in the Limpopo Province in the northern-western part of South Africa. Its purpose is to increase water supply by 30 million cubic meters per year to meet current and future demand for users, including power stations, mining houses, and the local municipality in the Lephalale area. This paper documents the planning and implementation aspects of the MCWAP infrastructure project. The study will add to the body of knowledge with respect to bulk water infrastructure development in water-scarce regions. The method used to gather and collate relevant data and information was the desktop study. The key finding was that the project was successfully completed in 2015 using conventional project management and construction methods. The project is currently being operated and maintained by the National Department of Water and Sanitation.

Keywords: construction, contract management, infrastructure project, project management

Procedia PDF Downloads 303
830 Acid Mine Drainage Remediation Using Silane and Phosphate Coatings

Authors: M. Chiliza, H. P. Mbukwane, P Masita, H. Rutto

Abstract:

Acid mine drainage (AMD) one of the main pollutants of water in many countries that have mining activities. AMD results from the oxidation of pyrite and other metal sulfides. When these metals gets exposed to moisture and oxygen, leaching takes place releasing sulphate and Iron. Acid drainage is often noted by 'yellow boy,' an orange-yellow substance that occurs when the pH of acidic mine-influenced water raises above pH 3, so that the previously dissolved iron precipitates out. The possibility of using environmentally friendly silane and phosphate based coatings on pyrite to remediate acid mine drainage and prevention at source was investigated. The results showed that both coatings reduced chemical oxidation of pyrite based on Fe and sulphate release. Furthermore, it was found that silane based coating performs better when coating synthesis take place in a basic hydrolysis than in an acidic state.

Keywords: acid mine drainage, pyrite, silane, phosphate

Procedia PDF Downloads 342
829 Groundwater Flow Assessment Based on Numerical Simulation at Omdurman Area, Khartoum State, Sudan

Authors: Adil Balla Elkrail

Abstract:

Visual MODFLOW computer codes were selected to simulate head distribution, calculate the groundwater budgets of the area, and evaluate the effect of external stresses on the groundwater head and to demonstrate how the groundwater model can be used as a comparative technique in order to optimize utilization of the groundwater resource. A conceptual model of the study area, aquifer parameters, boundary, and initial conditions were used to simulate the flow model. The trial-and-error technique was used to calibrate the model. The most important criteria used to check the calibrated model were Root Mean Square error (RMS), Mean Absolute error (AM), Normalized Root Mean Square error (NRMS) and mass balance. The maps of the simulated heads elaborated acceptable model calibration compared to observed heads map. A time length of eight years and the observed heads of the year 2004 were used for model prediction. The predictive simulation showed that the continuation of pumping will cause relatively high changes in head distribution and components of groundwater budget whereas, the low deficit computed (7122 m3/d) between inflows and outflows cannot create a significant drawdown of the potentiometric level. Hence, the area under consideration may represent a high permeability and productive zone and strongly recommended for further groundwater development.

Keywords: aquifers, model simulation, groundwater, calibrations, trail-and- error, prediction

Procedia PDF Downloads 244
828 Scattered Places in Stories Singularity and Pattern in Geographic Information

Authors: I. Pina, M. Painho

Abstract:

Increased knowledge about the nature of place and the conditions under which space becomes place is a key factor for better urban planning and place-making. Although there is a broad consensus on the relevance of this knowledge, difficulties remain in relating the theoretical framework about place and urban management. Issues related to representation of places are among the greatest obstacles to overcome this gap. With this critical discussion, based on literature review, we intended to explore, in a common framework for geographical analysis, the potential of stories to spell out place meanings, bringing together qualitative text analysis and text mining in order to capture and represent the singularity contained in each person's life history, and the patterns of social processes that shape places. The development of this reasoning is based on the extensive geographical thought about place, and in the theoretical advances in the field of Geographic Information Science (GISc).

Keywords: discourse analysis, geographic information science place, place-making, stories

Procedia PDF Downloads 199
827 A Relationship Extraction Method from Literary Fiction Considering Korean Linguistic Features

Authors: Hee-Jeong Ahn, Kee-Won Kim, Seung-Hoon Kim

Abstract:

The knowledge of the relationship between characters can help readers to understand the overall story or plot of the literary fiction. In this paper, we present a method for extracting the specific relationship between characters from a Korean literary fiction. Generally, methods for extracting relationships between characters in text are statistical or computational methods based on the sentence distance between characters without considering Korean linguistic features. Furthermore, it is difficult to extract the relationship with direction from text, such as one-sided love, because they consider only the weight of relationship, without considering the direction of the relationship. Therefore, in order to identify specific relationships between characters, we propose a statistical method considering linguistic features, such as syntactic patterns and speech verbs in Korean. The result of our method is represented by a weighted directed graph of the relationship between the characters. Furthermore, we expect that proposed method could be applied to the relationship analysis between characters of other content like movie or TV drama.

Keywords: data mining, Korean linguistic feature, literary fiction, relationship extraction

Procedia PDF Downloads 383