Search results for: predictive mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2041

Search results for: predictive mining

1411 Investigating Data Normalization Techniques in Swarm Intelligence Forecasting for Energy Commodity Spot Price

Authors: Yuhanis Yusof, Zuriani Mustaffa, Siti Sakira Kamaruddin

Abstract:

Data mining is a fundamental technique in identifying patterns from large data sets. The extracted facts and patterns contribute in various domains such as marketing, forecasting, and medical. Prior to that, data are consolidated so that the resulting mining process may be more efficient. This study investigates the effect of different data normalization techniques, which are Min-max, Z-score, and decimal scaling, on Swarm-based forecasting models. Recent swarm intelligence algorithms employed includes the Grey Wolf Optimizer (GWO) and Artificial Bee Colony (ABC). Forecasting models are later developed to predict the daily spot price of crude oil and gasoline. Results showed that GWO works better with Z-score normalization technique while ABC produces better accuracy with the Min-Max. Nevertheless, the GWO is more superior that ABC as its model generates the highest accuracy for both crude oil and gasoline price. Such a result indicates that GWO is a promising competitor in the family of swarm intelligence algorithms.

Keywords: artificial bee colony, data normalization, forecasting, Grey Wolf optimizer

Procedia PDF Downloads 475
1410 Predictive Machine Learning Model for Assessing the Impact of Untreated Teeth Grinding on Gingival Recession and Jaw Pain

Authors: Joseph Salim

Abstract:

This paper proposes the development of a supervised machine learning system to predict the consequences of untreated bruxism (teeth grinding) on gingival (gum) recession and jaw pain (most often bilateral jaw pain with possible headaches and limited ability to open the mouth). As a general dentist in a multi-specialty practice, the author has encountered many patients suffering from these issues due to uncontrolled bruxism (teeth grinding) at night. The most effective treatment for managing this problem involves wearing a nightguard during sleep and receiving therapeutic Botox injections to relax the muscles (the masseter muscle) responsible for grinding. However, some patients choose to postpone these treatments, leading to potentially irreversible and costlier consequences in the future. The proposed machine learning model aims to track patients who forgo the recommended treatments and assess the percentage of individuals who will experience worsening jaw pain, gingival (gum) recession, or both within a 3-to-5-year timeframe. By accurately predicting these outcomes, the model seeks to motivate patients to address the root cause proactively, ultimately saving time and pain while improving quality of life and avoiding much costlier treatments such as full-mouth rehabilitation to help recover the loss of vertical dimension of occlusion due to shortened clinical crowns because of bruxism, gingival grafts, etc.

Keywords: artificial intelligence, machine learning, predictive insights, bruxism, teeth grinding, therapeutic botox, nightguard, gingival recession, gum recession, jaw pain

Procedia PDF Downloads 92
1409 A Near-Optimal Domain Independent Approach for Detecting Approximate Duplicates

Authors: Abdelaziz Fellah, Allaoua Maamir

Abstract:

We propose a domain-independent merging-cluster filter approach complemented with a set of algorithms for identifying approximate duplicate entities efficiently and accurately within a single and across multiple data sources. The near-optimal merging-cluster filter (MCF) approach is based on the Monge-Elkan well-tuned algorithm and extended with an affine variant of the Smith-Waterman similarity measure. Then we present constant, variable, and function threshold algorithms that work conceptually in a divide-merge filtering fashion for detecting near duplicates as hierarchical clusters along with their corresponding representatives. The algorithms take recursive refinement approaches in the spirit of filtering, merging, and updating, cluster representatives to detect approximate duplicates at each level of the cluster tree. Experiments show a high effectiveness and accuracy of the MCF approach in detecting approximate duplicates by outperforming the seminal Monge-Elkan’s algorithm on several real-world benchmarks and generated datasets.

Keywords: data mining, data cleaning, approximate duplicates, near-duplicates detection, data mining applications and discovery

Procedia PDF Downloads 385
1408 Delivery Service and Online-and-Offline Purchasing for Collaborative Recommendations on Retail Cross-Channels

Authors: S. H. Liao, J. M. Huang

Abstract:

The delivery service business model is the final link in logistics for both online-and-offline businesses. The online-and-offline business model focuses on the entire customer purchasing process online and offline, placing greater emphasis on the importance of data to optimize overall retail operations. For the retail industry, it is an important task of information and management to strengthen the collection and investigation of consumers' online and offline purchasing data to better understand customers and then recommend products. This study implements two-stage data mining analytics for clustering and association rules analysis to investigate Taiwanese consumers' (n=2,209) preferences for delivery service. This process clarifies online-and-offline purchasing behaviors and preferences to find knowledge profiles/patterns/rules for cross-channel collaborative recommendations. Finally, theoretical and practical implications for methodology and enterprise are presented.

Keywords: delivery service, online-and-offline purchasing, retail cross-channel, collaborative recommendations, data mining analytics

Procedia PDF Downloads 30
1407 Fuzzy Expert Approach for Risk Mitigation on Functional Urban Areas Affected by Anthropogenic Ground Movements

Authors: Agnieszka A. Malinowska, R. Hejmanowski

Abstract:

A number of European cities are strongly affected by ground movements caused by anthropogenic activities or post-anthropogenic metamorphosis. Those are mainly water pumping, current mining operation, the collapse of post-mining underground voids or mining-induced earthquakes. These activities lead to large and small-scale ground displacements and a ground ruptures. The ground movements occurring in urban areas could considerably affect stability and safety of structures and infrastructures. The complexity of the ground deformation phenomenon in relation to the structures and infrastructures vulnerability leads to considerable constraints in assessing the threat of those objects. However, the increase of access to the free software and satellite data could pave the way for developing new methods and strategies for environmental risk mitigation and management. Open source geographical information systems (OS GIS), may support data integration, management, and risk analysis. Lately, developed methods based on fuzzy logic and experts methods for buildings and infrastructure damage risk assessment could be integrated into OS GIS. Those methods were verified base on back analysis proving their accuracy. Moreover, those methods could be supported by ground displacement observation. Based on freely available data from European Space Agency and free software, ground deformation could be estimated. The main innovation presented in the paper is the application of open source software (OS GIS) for integration developed models and assessment of the threat of urban areas. Those approaches will be reinforced by analysis of ground movement based on free satellite data. Those data would support the verification of ground movement prediction models. Moreover, satellite data will enable our mapping of ground deformation in urbanized areas. Developed models and methods have been implemented in one of the urban areas hazarded by underground mining activity. Vulnerability maps supported by satellite ground movement observation would mitigate the hazards of land displacements in urban areas close to mines.

Keywords: fuzzy logic, open source geographic information science (OS GIS), risk assessment on urbanized areas, satellite interferometry (InSAR)

Procedia PDF Downloads 159
1406 Semantic Based Analysis in Complaint Management System with Analytics

Authors: Francis Alterado, Jennifer Enriquez

Abstract:

Semantic Based Analysis in Complaint Management System with Analytics is an enhanced tool of providing complaints by the clients as well as a mechanism for Palawan Polytechnic College to gather, process, and monitor status of these complaints. The study has a mobile application that serves as a remote facility of communication between the students and the school management on the issues encountered by the student and the solution of every complaint received. In processing the complaints, text mining and clustering algorithms were utilized. Every module of the systems was tested and based on the results; these are 100% free from error before integration was done. A system testing was also done by checking the expected functionality of the system which was 100% functional. The system was tested by 10 students by forwarding complaints to 10 departments. Based on results, the students were able to submit complaints, the system was able to process accordingly by identifying to which department the complaints are intended, and the concerned department was able to give feedback on the complaint received to the student. With this, the system gained 4.7 rating which means Excellent.

Keywords: technology adoption, emerging technology, issues challenges, algorithm, text mining, mobile technology

Procedia PDF Downloads 198
1405 Big Data in Telecom Industry: Effective Predictive Techniques on Call Detail Records

Authors: Sara ElElimy, Samir Moustafa

Abstract:

Mobile network operators start to face many challenges in the digital era, especially with high demands from customers. Since mobile network operators are considered a source of big data, traditional techniques are not effective with new era of big data, Internet of things (IoT) and 5G; as a result, handling effectively different big datasets becomes a vital task for operators with the continuous growth of data and moving from long term evolution (LTE) to 5G. So, there is an urgent need for effective Big data analytics to predict future demands, traffic, and network performance to full fill the requirements of the fifth generation of mobile network technology. In this paper, we introduce data science techniques using machine learning and deep learning algorithms: the autoregressive integrated moving average (ARIMA), Bayesian-based curve fitting, and recurrent neural network (RNN) are employed for a data-driven application to mobile network operators. The main framework included in models are identification parameters of each model, estimation, prediction, and final data-driven application of this prediction from business and network performance applications. These models are applied to Telecom Italia Big Data challenge call detail records (CDRs) datasets. The performance of these models is found out using a specific well-known evaluation criteria shows that ARIMA (machine learning-based model) is more accurate as a predictive model in such a dataset than the RNN (deep learning model).

Keywords: big data analytics, machine learning, CDRs, 5G

Procedia PDF Downloads 138
1404 Influence of Dynamic Loads in the Structural Integrity of Underground Rooms

Authors: M. Inmaculada Alvarez-Fernández, Celestino González-Nicieza, M. Belén Prendes-Gero, Fernando López-Gayarre

Abstract:

Among many factors affecting the stability of mining excavations, rock-bursts and tremors play a special role. These dynamic loads occur practically always and have different sources of generation. The most important of them is the commonly used mining technique, which disintegrates a certain area of the rock mass not only in the area of the planned mining, but also creates waves that significantly exceed this area affecting the structural elements. In this work it is analysed the consequences of dynamic loads over the structural elements in an underground room and pillar mine to avoid roof instabilities. With this end, dynamic loads were evaluated through in situ and laboratory tests and simulated with numerical modelling. Initially, the geotechnical characterization of all materials was carried out by mean of large-scale tests. Then, drill holes were done on the roof of the mine and were monitored to determine possible discontinuities in it. Three seismic stations and a triaxial accelerometer were employed to measure the vibrations from blasting tests, establish the dynamic behaviour of roof and pillars and develop the transmission laws. At last, computer simulations by FLAC3D software were done to check the effect of vibrations on the stability of the roofs. The study shows that in-situ tests have a greater reliability than laboratory samples because of eliminating the effect of heterogeneities, that the pillars work decreasing the amplitude of the vibration around them, and that the tensile strength of a beam and depending on its span is overcome with waves in phase and delayed. The obtained transmission law allows designing a blasting which guarantees safety and prevents the risk of future failures.

Keywords: dynamic modelling, long term instability risks, room and pillar, seismic collapse

Procedia PDF Downloads 138
1403 Enhancing the Pricing Expertise of an Online Distribution Channel

Authors: Luis N. Pereira, Marco P. Carrasco

Abstract:

Dynamic pricing is a revenue management strategy in which hotel suppliers define, over time, flexible and different prices for their services for different potential customers, considering the profile of e-consumers and the demand and market supply. This means that the fundamentals of dynamic pricing are based on economic theory (price elasticity of demand) and market segmentation. This study aims to define a dynamic pricing strategy and a contextualized offer to the e-consumers profile in order to improve the number of reservations of an online distribution channel. Segmentation methods (hierarchical and non-hierarchical) were used to identify and validate an optimal number of market segments. A profile of the market segments was studied, considering the characteristics of the e-consumers and the probability of reservation a room. In addition, the price elasticity of demand was estimated for each segment using econometric models. Finally, predictive models were used to define rules for classifying new e-consumers into pre-defined segments. The empirical study illustrates how it is possible to improve the intelligence of an online distribution channel system through an optimal dynamic pricing strategy and a contextualized offer to the profile of each new e-consumer. A database of 11 million e-consumers of an online distribution channel was used in this study. The results suggest that an appropriate policy of market segmentation in using of online reservation systems is benefit for the service suppliers because it brings high probability of reservation and generates more profit than fixed pricing.

Keywords: dynamic pricing, e-consumers segmentation, online reservation systems, predictive analytics

Procedia PDF Downloads 234
1402 Comparison of Different Methods of Microorganism's Identification from a Copper Mining in Pará, Brazil

Authors: Louise H. Gracioso, Marcela P.G. Baltazar, Ingrid R. Avanzi, Bruno Karolski, Luciana J. Gimenes, Claudio O. Nascimento, Elen A. Perpetuo

Abstract:

Introduction: Higher copper concentrations promote a selection pressure on organisms such as plants, fungi and bacteria, which allows surviving only the resistant organisms to the contaminated site. This selective pressure keeps only the organisms most resistant to a specific condition and subsequently increases their bioremediation potential. Despite the bacteria importance for biosphere maintenance, it is estimated that only a small fraction living microbial species has been described and characterized. Due to the molecular biology development, tools based on analysis 16S ribosomal RNA or another specific gene are making a new scenario for the characterization studies and identification of microorganisms in the environment. News identification of microorganisms methods have also emerged like Biotyper (MALDI / TOF), this method mass spectrometry is subject to the recognition of spectroscopic patterns of conserved and features proteins for different microbial species. In view of this, this study aimed to isolate bacteria resistant to copper present in a Copper Processing Area (Sossego Mine, Canaan, PA) and identifies them in two different methods: Recent (spectrometry mass) and conventional. This work aimed to use them for a future bioremediation of this Mining. Material and Methods: Samples were collected at fifteen different sites of five periods of times. Microorganisms were isolated from mining wastes by culture enrichment technique; this procedure was repeated 4 times. The isolates were inoculated into MJS medium containing different concentrations of chloride copper (1mM, 2.5mM, 5mM, 7.5mM and 10 mM) and incubated in plates for 72 h at 28 ºC. These isolates were subjected to mass spectrometry identification methods (Biotyper – MALDI/TOF) and 16S gene sequencing. Results: A total of 105 strains were isolated in this area, bacterial identification by mass spectrometry method (MALDI/TOF) achieved 74% agreement with the conventional identification method (16S), 31% have been unsuccessful in MALDI-TOF and 2% did not obtain identification sequence the 16S. These results show that Biotyper can be a very useful tool in the identification of bacteria isolated from environmental samples, since it has a better value for money (cheap and simple sample preparation and MALDI plates are reusable). Furthermore, this technique is more rentable because it saves time and has a high performance (the mass spectra are compared to the database and it takes less than 2 minutes per sample).

Keywords: copper mining area, bioremediation, microorganisms, identification, MALDI/TOF, RNA 16S

Procedia PDF Downloads 374
1401 Developing Sustainable Tourism Practices in Communities Adjacent to Mines: An Exploratory Study in South Africa

Authors: Felicite Ann Fairer-Wessels

Abstract:

There has always been a disparity between mining and tourism mainly due to the socio-economic and environmental impacts of mines on both the adjacent resident communities and the areas taken up by the mining operation. Although heritage mining tourism has been actively and successfully pursued and developed in the UK, largely Wales, and Scandinavian countries, the debate whether active mining and tourism can have a mutually beneficial relationship remains imminent. This pilot study explores the relationship between the ‘to be developed’ future Nokeng Mine and its adjacent community, the rural community of Moloto, will be investigated in terms of whether sustainable tourism and livelihood activities can potentially be developed with the support of the mine. Concepts such as social entrepreneur, corporate social responsibility, sustainable development and triple bottom line are discussed. Within the South African context as a mineral rich developing country, the government has a statutory obligation to empower disenfranchised communities through social and labour plans and policies. All South African mines must preside over a Social and Labour Plan according to the Mineral and Petroleum Resources Development Act, No 28 of 2002. The ‘social’ component refers to the ‘social upliftment’ of communities within or adjacent to any mine; whereas the ‘labour’ component refers to the mine workers sourced from the specific community. A qualitative methodology is followed using the case study as research instrument for the Nokeng Mine and Moloto community with interviews and focus group discussions. The target population comprised of the Moloto Tribal Council members (8 in-depth interviews), the Moloto community members (17: focus groups); and the Nokeng Mine representatives (4 in-depth interviews). In this pilot study two disparate ‘worlds’ are potentially linked: on the one hand, the mine as social entrepreneur that is searching for feasible and sustainable ideas; and on the other hand, the community adjacent to the mine, with potentially sustainable tourism entrepreneurs that can tap into the resources of the mine should their ideas be feasible to build their businesses. Being an exploratory study the findings are limited but indicate that the possible success of tourism and sustainable livelihood activities lies in the fact that both the Mine and Community are keen to work together – the mine in terms of obtaining labour and profit; and the community in terms of improved and sustainable social and economic conditions; with both parties realizing the importance to mitigate negative environmental impacts. In conclusion, a relationship of trust is imperative between a mine and a community before a long term liaison is possible. However whether tourism is a viable solution for the community to engage in is debatable. The community could initially rather pursue the sustainable livelihoods approach and focus on life-supporting activities such as building, gardening, etc. that once established could feed into possible sustainable tourism activities.

Keywords: community development, mining tourism, sustainability, South Africa

Procedia PDF Downloads 301
1400 Predictive Analysis of Chest X-rays Using NLP and Large Language Models with the Indiana University Dataset and Random Forest Classifier

Authors: Azita Ramezani, Ghazal Mashhadiagha, Bahareh Sanabakhsh

Abstract:

This study researches the combination of Random. Forest classifiers with large language models (LLMs) and natural language processing (NLP) to improve diagnostic accuracy in chest X-ray analysis using the Indiana University dataset. Utilizing advanced NLP techniques, the research preprocesses textual data from radiological reports to extract key features, which are then merged with image-derived data. This improved dataset is analyzed with Random Forest classifiers to predict specific clinical results, focusing on the identification of health issues and the estimation of case urgency. The findings reveal that the combination of NLP, LLMs, and machine learning not only increases diagnostic precision but also reliability, especially in quickly identifying critical conditions. Achieving an accuracy of 99.35%, the model shows significant advancements over conventional diagnostic techniques. The results emphasize the large potential of machine learning in medical imaging, suggesting that these technologies could greatly enhance clinician judgment and patient outcomes by offering quicker and more precise diagnostic approximations.

Keywords: natural language processing (NLP), large language models (LLMs), random forest classifier, chest x-ray analysis, medical imaging, diagnostic accuracy, indiana university dataset, machine learning in healthcare, predictive modeling, clinical decision support systems

Procedia PDF Downloads 42
1399 Diagnostic Accuracy of the Tuberculin Skin Test for Tuberculosis Diagnosis: Interest of Using ROC Curve and Fagan’s Nomogram

Authors: Nouira Mariem, Ben Rayana Hazem, Ennigrou Samir

Abstract:

Background and aim: During the past decade, the frequency of extrapulmonary forms of tuberculosis has increased. These forms are under-diagnosed using conventional tests. The aim of this study was to evaluate the performance of the Tuberculin Skin Test (TST) for the diagnosis of tuberculosis, using the ROC curve and Fagan’s Nomogram methodology. Methods: This was a case-control, multicenter study in 11 anti-tuberculosis centers in Tunisia, during the period from June to November2014. The cases were adults aged between 18 and 55 years with confirmed tuberculosis. Controls were free from tuberculosis. A data collection sheet was filled out and a TST was performed for each participant. Diagnostic accuracy measures of TST were estimated using ROC curve and Area Under Curve to estimate sensitivity and specificity of a determined cut-off point. Fagan’s nomogram was used to estimate its predictive values. Results: Overall, 1053 patients were enrolled, composed of 339 cases (sex-ratio (M/F)=0.87) and 714 controls (sex-ratio (M/F)=0.99). The mean age was 38.3±11.8 years for cases and 33.6±11 years for controls. The mean diameter of the TST induration was significantly higher among cases than controls (13.7mm vs.6.2mm;p=10-6). Area Under Curve was 0.789 [95% CI: 0.758-0.819; p=0.01], corresponding to a moderate discriminating power for this test. The most discriminative cut-off value of the TST, which were associated with the best sensitivity (73.7%) and specificity (76.6%) couple was about 11 mm with a Youden index of 0.503. Positive and Negative predictive values were 3.11% and 99.52%, respectively. Conclusion: In view of these results, we can conclude that the TST can be used for tuberculosis diagnosis with a good sensitivity and specificity. However, the skin induration measurement and its interpretation is operator dependent and remains difficult and subjective. The combination of the TST with another test such as the Quantiferon test would be a good alternative.

Keywords: tuberculosis, tuberculin skin test, ROC curve, cut-off

Procedia PDF Downloads 66
1398 Assessment of Chromium Concentration and Human Health Risk in the Steelpoort River Sub-Catchment of the Olifants River Basin, South Africa

Authors: Abraham Addo-Bediako

Abstract:

Many freshwater ecosystems are facing immense pressure from anthropogenic activities, such as agricultural, industrial and mining. Trace metal pollution in freshwater ecosystems has become an issue of public health concern due to its toxicity and persistence in the environment. Trace elements pose a serious risk not only to the environment and aquatic biota but also humans. Chromium is one of such trace elements and its pollution in surface waters and groundwaters represents a serious environmental problem. In South Africa, agriculture, mining, industrial and domestic wastes are the main contributors to chromium discharge in rivers. The common forms of chromium are chromium (III) and chromium (VI). The latter is the most toxic because it can cause damage to human health. The aim of the study was to assess the contamination of chromium in the water and sediments of two rivers in the Steelpoort River sub-catchment of the Olifants River Basin, South Africa and human health risk. The concentration of Cr was analyzed using inductively coupled plasma–optical emission spectrometry (ICP-OES). The concentration of the metal was found to exceed the threshold limit, mainly in areas of high human activities. The hazard quotient through ingestion exposure did not exceed the threshold limit of 1 for adults and children and cancer risk for adults and children computed did not exceed the threshold limit of 10-4. Thus, there is no potential health risk from chromium through ingestion of drinking water for now. However, with increasing human activities, especially mining, the concentration could increase and become harmful to humans who depend on rivers for drinking water. It is recommended that proper management strategies should be taken to minimize the impact of chromium on the rivers and water from the rivers should properly be treated before domestic use.

Keywords: land use, health risk, metal pollution, water quality

Procedia PDF Downloads 86
1397 Three-Stage Mining Metals Supply Chain Coordination and Product Quality Improvement with Revenue Sharing Contract

Authors: Hamed Homaei, Iraj Mahdavi, Ali Tajdin

Abstract:

One of the main concerns of miners is to increase the quality level of their products because the mining metals price depends on their quality level; however, increasing the quality level of these products has different costs at different levels of the supply chain. These costs usually increase after extractor level. This paper studies the coordination issue of a decentralized three-level supply chain with one supplier (extractor), one mineral processor and one manufacturer in which the increasing product quality level cost at the processor level is higher than the supplier and at the level of the manufacturer is more than the processor. We identify the optimal product quality level for each supply chain member by designing a revenue sharing contract. Finally, numerical examples show that the designed contract not only increases the final product quality level but also provides a win-win condition for all supply chain members and increases the whole supply chain profit.

Keywords: three-stage supply chain, product quality improvement, channel coordination, revenue sharing

Procedia PDF Downloads 182
1396 Application Methodology for the Generation of 3D Thermal Models Using UAV Photogrammety and Dual Sensors for Mining/Industrial Facilities Inspection

Authors: Javier Sedano-Cibrián, Julio Manuel de Luis-Ruiz, Rubén Pérez-Álvarez, Raúl Pereda-García, Beatriz Malagón-Picón

Abstract:

Structural inspection activities are necessary to ensure the correct functioning of infrastructures. Unmanned Aerial Vehicle (UAV) techniques have become more popular than traditional techniques. Specifically, UAV Photogrammetry allows time and cost savings. The development of this technology has permitted the use of low-cost thermal sensors in UAVs. The representation of 3D thermal models with this type of equipment is in continuous evolution. The direct processing of thermal images usually leads to errors and inaccurate results. A methodology is proposed for the generation of 3D thermal models using dual sensors, which involves the application of visible Red-Blue-Green (RGB) and thermal images in parallel. Hence, the RGB images are used as the basis for the generation of the model geometry, and the thermal images are the source of the surface temperature information that is projected onto the model. Mining/industrial facilities representations that are obtained can be used for inspection activities.

Keywords: aerial thermography, data processing, drone, low-cost, point cloud

Procedia PDF Downloads 142
1395 Automatic Lexicon Generation for Domain Specific Dataset for Mining Public Opinion on China Pakistan Economic Corridor

Authors: Tayyaba Azim, Bibi Amina

Abstract:

The increase in the popularity of opinion mining with the rapid growth in the availability of social networks has attracted a lot of opportunities for research in the various domains of Sentiment Analysis and Natural Language Processing (NLP) using Artificial Intelligence approaches. The latest trend allows the public to actively use the internet for analyzing an individual’s opinion and explore the effectiveness of published facts. The main theme of this research is to account the public opinion on the most crucial and extensively discussed development projects, China Pakistan Economic Corridor (CPEC), considered as a game changer due to its promise of bringing economic prosperity to the region. So far, to the best of our knowledge, the theme of CPEC has not been analyzed for sentiment determination through the ML approach. This research aims to demonstrate the use of ML approaches to spontaneously analyze the public sentiment on Twitter tweets particularly about CPEC. Support Vector Machine SVM is used for classification task classifying tweets into positive, negative and neutral classes. Word2vec and TF-IDF features are used with the SVM model, a comparison of the trained model on manually labelled tweets and automatically generated lexicon is performed. The contributions of this work are: Development of a sentiment analysis system for public tweets on CPEC subject, construction of an automatic generation of the lexicon of public tweets on CPEC, different themes are identified among tweets and sentiments are assigned to each theme. It is worth noting that the applications of web mining that empower e-democracy by improving political transparency and public participation in decision making via social media have not been explored and practised in Pakistan region on CPEC yet.

Keywords: machine learning, natural language processing, sentiment analysis, support vector machine, Word2vec

Procedia PDF Downloads 148
1394 Patterns in Fish Diversity and Abundance of an Abandoned Gold Mine Reservoirs

Authors: O. E. Obayemi, M. A. Ayoade, O. O. Komolafe

Abstract:

Fish survey was carried out for an annual cycle covering both rainy and dry seasons using cast nets, gill nets and traps at two different reservoirs. The objective was to examined the fish assemblages of the reservoirs and provide more additional information on the reservoir. The fish species in the reservoirs comprised of twelve species of six families. The results of the study also showed that five species of fish were caught in reservoir five while ten fish species were captured in reservoir six. Species such as Malapterurus electricus, Ctenopoma kingsleyae, Mormyrus rume, Parachanna obscura, Sarotherodon galilaeus, Tilapia mariae, C. guntheri, Clarias macromystax, Coptodon zilii and Clarias gariepinus were caught during the sampling period. There was a significant difference (p=0.014, t = 1.711) in the abundance of fish species in the two reservoirs. Seasonally, reservoirs five (p=0.221, t = 1.859) and six (p=0.453, t = 1.734) showed there was no significant difference in their fish populations. Also, despite being impacted with gold mining the diversity indices were high when compared to less disturbed waterbodies. The study concluded that the environments recorded low abundant fish species which suggests the influence of mining on the abundance and diversity of fish species.

Keywords: Igun, fish, Shannon-Wiener Index, Simpson index, Pielou index

Procedia PDF Downloads 105
1393 Integrating Machine Learning and Rule-Based Decision Models for Enhanced B2B Sales Forecasting and Customer Prioritization

Authors: Wenqi Liu, Reginald Bailey

Abstract:

This study explores an advanced approach to enhancing B2B sales forecasting by integrating machine learning models with a rule-based decision framework. The methodology begins with the development of a machine learning classification model to predict conversion likelihood, aiming to improve accuracy over traditional methods like logistic regression. The classification model's effectiveness is measured using metrics such as accuracy, precision, recall, and F1 score, alongside a feature importance analysis to identify key predictors. Following this, a machine learning regression model is used to forecast sales value, with the objective of reducing mean absolute error (MAE) compared to linear regression techniques. The regression model's performance is assessed using MAE, root mean square error (RMSE), and R-squared metrics, emphasizing feature contribution to the prediction. To bridge the gap between predictive analytics and decision-making, a rule-based decision model is introduced that prioritizes customers based on predefined thresholds for conversion probability and predicted sales value. This approach significantly enhances customer prioritization and improves overall sales performance by increasing conversion rates and optimizing revenue generation. The findings suggest that this combined framework offers a practical, data-driven solution for sales teams, facilitating more strategic decision-making in B2B environments.

Keywords: sales forecasting, machine learning, rule-based decision model, customer prioritization, predictive analytics

Procedia PDF Downloads 14
1392 The Structure and Function Investigation and Analysis of the Automatic Spin Regulator (ASR) in the Powertrain System of Construction and Mining Machines with the Focus on Dump Trucks

Authors: Amir Mirzaei

Abstract:

The powertrain system is one of the most basic and essential components in a machine. The occurrence of motion is practically impossible without the presence of this system. When power is generated by the engine, it is transmitted by the powertrain system to the wheels, which are the last parts of the system. Powertrain system has different components according to the type of use and design. When the force generated by the engine reaches to the wheels, the amount of frictional force between the tire and the ground determines the amount of traction and non-slip or the amount of slip. At various levels, such as icy, muddy, and snow-covered ground, the amount of friction coefficient between the tire and the ground decreases dramatically and considerably, which in turn increases the amount of force loss and the vehicle traction decreases drastically. This condition is caused by the phenomenon of slipping, which, in addition to the waste of energy produced, causes the premature wear of driving tires. It also causes the temperature of the transmission oil to rise too much, as a result, causes a reduction in the quality and become dirty to oil and also reduces the useful life of the clutches disk and plates inside the transmission. this issue is much more important in road construction and mining machinery than passenger vehicles and is always one of the most important and significant issues in the design discussion, in order to overcome. One of these methods is the automatic spin regulator system which is abbreviated as ASR. The importance of this method and its structure and function have solved one of the biggest challenges of the powertrain system in the field of construction and mining machinery. That this research is examined.

Keywords: automatic spin regulator, ASR, methods of reducing slipping, methods of preventing the reduction of the useful life of clutches disk and plate, methods of preventing the premature dirtiness of transmission oil, method of preventing the reduction of the useful life of tires

Procedia PDF Downloads 78
1391 Geochemical Baseline and Origin of Trace Elements in Soils and Sediments around Selibe-Phikwe Cu-Ni Mining Town, Botswana

Authors: Fiona S. Motswaiso, Kengo Nakamura, Takeshi Komai

Abstract:

Heavy metals may occur naturally in rocks and soils, but elevated quantities of them are being gradually released into the environment by anthropogenic activities such as mining. In order to address issues of heavy metal water and soil pollution, a distinction needs to be made between natural and anthropogenic anomalies. The current study aims at characterizing the spatial distribution of trace elements and evaluate site-specific geochemical background concentrations of trace elements in the mine soils examined, and also to discriminate between lithogenic and anthropogenic sources of enrichment around a copper-nickel mining town in Selibe-Phikwe, Botswana. A total of 20 Soil samples, 11 river sediment, and 9 river water samples were collected from an area of 625m² within the precincts of the mine and the smelter. The concentrations of metals (Cu, Ni, Pb, Zn, Cr, Ni, Mn, As, Pb, and Co) were determined by using an ICP-MS after digestion with aqua regia. Major elements were also determined using ED-XRF. Water pH and EC were measured on site and recorded while soil pH and EC were also determined in the laboratory after performing water elution tests. The highest Cu and Ni concentrations in soil are 593mg/kg and 453mg/kg respectively, which is 3 times higher than the crustal composition values and 2 times higher than the South African minimum allowable levels of heavy metals in soils. The level of copper contamination was higher than that of nickel and other contaminants. Water pH levels ranged from basic (9) to very acidic (3) in areas closer to the mine/smelter. There is high variation in heavy metal concentration, eg. Cu suggesting that some sites depict regional natural background concentrations while other depict anthropogenic sources.

Keywords: contamination, geochemical baseline, heavy metals, soils

Procedia PDF Downloads 158
1390 Predictive Analytics of Bike Sharing Rider Parameters

Authors: Bongs Lainjo

Abstract:

The evolution and escalation of bike-sharing programs (BSP) continue unabated. Since the sixties, many countries have introduced different models and strategies of BSP. These include variations ranging from dockless models to electronic real-time monitoring systems. Reasons for using this BSP include recreation, errands, work, etc. And there is all indication that complex, and more innovative rider-friendly systems are yet to be introduced. The objective of this paper is to analyze current variables established by different operators and streamline them identifying the most compelling ones using analytics. Given the contents of available databases, there is a lack of uniformity and common standard on what is required and what is not. Two factors appear to be common: user type (registered and unregistered, and duration of each trip). This article uses historical data provided by one operator based in the greater Washington, District of Columbia, USA area. Several variables including categorical and continuous data types were screened. Eight out of 18 were considered acceptable and significantly contribute to determining a useful and reliable predictive model. Bike-sharing systems have become popular in recent years all around the world. Although this trend has resulted in many studies on public cycling systems, there have been few previous studies on the factors influencing public bicycle travel behavior. A bike-sharing system is a computer-controlled system in which individuals can borrow bikes for a fee or free for a limited period. This study has identified unprecedented useful, and pragmatic parameters required in improving BSP ridership dynamics.

Keywords: sharing program, historical data, parameters, ridership dynamics, trip duration

Procedia PDF Downloads 138
1389 The Analysis of Emergency Shutdown Valves Torque Data in Terms of Its Use as a Health Indicator for System Prognostics

Authors: Ewa M. Laskowska, Jorn Vatn

Abstract:

Industry 4.0 focuses on digital optimization of industrial processes. The idea is to use extracted data in order to build a decision support model enabling use of those data for real time decision making. In terms of predictive maintenance, the desired decision support tool would be a model enabling prognostics of system's health based on the current condition of considered equipment. Within area of system prognostics and health management, a commonly used health indicator is Remaining Useful Lifetime (RUL) of a system. Because the RUL is a random variable, it has to be estimated based on available health indicators. Health indicators can be of different types and come from different sources. They can be process variables, equipment performance variables, data related to number of experienced failures, etc. The aim of this study is the analysis of performance variables of emergency shutdown valves (ESV) used in oil and gas industry. ESV is inspected periodically, and at each inspection torque and time of valve operation are registered. The data will be analyzed by means of machine learning or statistical analysis. The purpose is to investigate whether the available data could be used as a health indicator for a prognostic purpose. The second objective is to examine what is the most efficient way to incorporate the data into predictive model. The idea is to check whether the data can be applied in form of explanatory variables in Markov process or whether other stochastic processes would be a more convenient to build an RUL model based on the information coming from registered data.

Keywords: emergency shutdown valves, health indicator, prognostics, remaining useful lifetime, RUL

Procedia PDF Downloads 90
1388 Short Text Classification Using Part of Speech Feature to Analyze Students' Feedback of Assessment Components

Authors: Zainab Mutlaq Ibrahim, Mohamed Bader-El-Den, Mihaela Cocea

Abstract:

Students' textual feedback can hold unique patterns and useful information about learning process, it can hold information about advantages and disadvantages of teaching methods, assessment components, facilities, and other aspects of teaching. The results of analysing such a feedback can form a key point for institutions’ decision makers to advance and update their systems accordingly. This paper proposes a data mining framework for analysing end of unit general textual feedback using part of speech feature (PoS) with four machine learning algorithms: support vector machines, decision tree, random forest, and naive bays. The proposed framework has two tasks: first, to use the above algorithms to build an optimal model that automatically classifies the whole data set into two subsets, one subset is tailored to assessment practices (assessment related), and the other one is the non-assessment related data. Second task to use the same algorithms to build an optimal model for whole data set, and the new data subsets to automatically detect their sentiment. The significance of this paper is to compare the performance of the above four algorithms using part of speech feature to the performance of the same algorithms using n-grams feature. The paper follows Knowledge Discovery and Data Mining (KDDM) framework to construct the classification and sentiment analysis models, which is understanding the assessment domain, cleaning and pre-processing the data set, selecting and running the data mining algorithm, interpreting mined patterns, and consolidating the discovered knowledge. The results of this paper experiments show that both models which used both features performed very well regarding first task. But regarding the second task, models that used part of speech feature has underperformed in comparison with models that used unigrams and bigrams.

Keywords: assessment, part of speech, sentiment analysis, student feedback

Procedia PDF Downloads 142
1387 Risk Assessment of Trace Metals in the Soil Surface of an Abandoned Mine, El-Abed Northwestern Algeria

Authors: Farida Mellah, Abdelhak Boutaleb, Bachir Henni, Dalila Berdous, Abdelhamid Mellah

Abstract:

Context/Purpose: One of the largest mining operations for lead and zinc deposits in northwestern Algeria in more than thirty years, El Abed is now the abandoned mine that has been inactive since 2004, leaving large amounts of accumulated mining waste under the influence of Wind, erosion, rain, and near agricultural lands. Materials & Methods: This study aims to verify the concentrations and sources of heavy metals for surface samples containing randomly taken soil. Chemical analyses were performed using iCAP 7000 Series ICP-optical emission spectrometer, using a set of environmental quality indicators by calculating the enrichment factor using iron and aluminum references, geographic accumulation index and geographic information system (GIS). On the basis of the spatial distribution. Results: The results indicated that the average metal concentration was: (As = 30,82),(Pb = 1219,27), (Zn = 2855,94), (Cu = 5,3), mg/Kg,based on these results, all metals except Cu passed by GBV in the Earth's crust. Environmental quality indicators were calculated based on the concentrations of trace metals such as lead, arsenic, zinc, copper, iron and aluminum. Interpretation: This study investigated the concentrations and sources of trace metals, and by using quality indicators and statistical methods, lead, zinc, and arsenic were determined from human sources, while copper was a natural source. And based on the spatial analysis on the basis of GIS, many hot spots were identified in the El-Abed region. Conclusion: These results could help in the development of future treatment strategies aimed primarily at eliminating materials from mining waste.

Keywords: soil contamination, trace metals, geochemical indices, El Abed mine, Algeria

Procedia PDF Downloads 69
1386 Developing HRCT Criterion to Predict the Risk of Pulmonary Tuberculosis

Authors: Vandna Raghuvanshi, Vikrant Thakur, Anupam Jhobta

Abstract:

Objective: To design HRCT criterion to forecast the threat of pulmonary tuberculosis. Material and methods: This was a prospective study of 69 patients with clinical suspicion of pulmonary tuberculosis. We studied their medical characteristics, numerous separate HRCT-results, and a combination of HRCT findings to foresee the danger for PTB by utilizing univariate and multivariate investigation. Temporary HRCT diagnostic criteria were planned in view of these outcomes to find out the risk of PTB and tested these criteria on our patients. Results: The results of HRCT chest were analyzed, and Rank was given from 1 to 4 according to the HRCT chest findings. Sensitivity, specificity, positive predictive value, and negative predictive value were calculated. Rank 1: Highly suspected PTB. Rank 2: Probable PTB Rank 3: Nonspecific or difficult to differentiate from other diseases Rank 4: Other suspected diseases • Rank 1 (Highly suspected TB) was present in 22 (31.9%) patients, all of them finally diagnosed to have pulmonary tuberculosis. The sensitivity, specificity, and negative likelihood ratio for RANK 1 on HRCT chest was 53.6%, 100%, and 0.43, respectively. • Rank 2 (Probable TB) was present in 13 patients, out of which 12 were tubercular, and 1 was non-tubercular. • The sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio of the combination of Rank 1 and Rank 2 was 82.9%, 96.4%, 23.22, and 0.18, respectively. • Rank 3 (Non-specific TB) was present in 25 patients, and out of these, 7 were tubercular, and 18 were non-tubercular. • When all these 3 ranks were considered together, the sensitivity approached 100% however, the specificity reduced to 35.7%. The positive likelihood ratio and negative likelihood ratio were 1.56 and 0, respectively. • Rank 4 (Other specific findings) was given to 9 patients, and all of these were non-tubercular. Conclusion: HRCT is useful in selecting individuals with greater chances of pulmonary tuberculosis.

Keywords: pulmonary, tuberculosis, multivariate, HRCT

Procedia PDF Downloads 169
1385 Spirometric Reference Values in 236,606 Healthy, Non-Smoking Chinese Aged 4–90 Years

Authors: Jiashu Shen

Abstract:

Objectives: Spirometry is a basic reference for health evaluation which is widely used in clinical. Previous reference of spirometry is not applicable because of drastic changes of social and natural circumstance in China. A new reference values for the spirometry of the Chinese population is extremely needed. Method: Spirometric reference value was established using the statistical modeling method Generalized Additive Models for Location, Scale and Shape for forced expiratory volume in 1 s (FEV1), forced vital capacity (FVC), FEV1/FVC, and maximal mid-expiratory flow (MMEF). Results: Data from 236,606 healthy non-smokers aged 4–90 years was collected from the MJ Health Check database. Spirometry equations for FEV1, FVC, MMEF, and FEV1/FVC were established, including the predicted values and lower limits of normal (LLNs) by sex. The predictive equations that were developed for the spirometric results elaborated the relationship between spirometry and age, and they eliminated the effects of height as a variable. Most previous predictive equations for Chinese spirometry were significantly overestimated (to be exact, with mean differences of 22.21% in FEV1 and 31.39% in FVC for males, along with differences of 26.93% in FEV1 and 35.76% in FVC for females) or underestimated (with mean differences of -5.81% in MMEF and -14.56% in FEV1/FVC for males, along with a difference of -14.54% in FEV1/FVC for females) the results of lung function measurements as found in this study. Through cross-validation, our equations were established as having good fit, and the means of the measured value and the estimated value were compared, with good results. Conclusions: Our study updates the spirometric reference equations for Chinese people of all ages and provides comprehensive values for both physical examination and clinical diagnosis.

Keywords: Chinese, GAMLSS model, reference values, spirometry

Procedia PDF Downloads 133
1384 A Comprehensive Review of Artificial Intelligence Applications in Sustainable Building

Authors: Yazan Al-Kofahi, Jamal Alqawasmi.

Abstract:

In this study, a comprehensive literature review (SLR) was conducted, with the main goal of assessing the existing literature about how artificial intelligence (AI), machine learning (ML), deep learning (DL) models are used in sustainable architecture applications and issues including thermal comfort satisfaction, energy efficiency, cost prediction and many others issues. For this reason, the search strategy was initiated by using different databases, including Scopus, Springer and Google Scholar. The inclusion criteria were used by two research strings related to DL, ML and sustainable architecture. Moreover, the timeframe for the inclusion of the papers was open, even though most of the papers were conducted in the previous four years. As a paper filtration strategy, conferences and books were excluded from database search results. Using these inclusion and exclusion criteria, the search was conducted, and a sample of 59 papers was selected as the final included papers in the analysis. The data extraction phase was basically to extract the needed data from these papers, which were analyzed and correlated. The results of this SLR showed that there are many applications of ML and DL in Sustainable buildings, and that this topic is currently trendy. It was found that most of the papers focused their discussions on addressing Environmental Sustainability issues and factors using machine learning predictive models, with a particular emphasis on the use of Decision Tree algorithms. Moreover, it was found that the Random Forest repressor demonstrates strong performance across all feature selection groups in terms of cost prediction of the building as a machine-learning predictive model.

Keywords: machine learning, deep learning, artificial intelligence, sustainable building

Procedia PDF Downloads 65
1383 Application of Knowledge Discovery in Database Techniques in Cost Overruns of Construction Projects

Authors: Mai Ghazal, Ahmed Hammad

Abstract:

Cost overruns in construction projects are considered as worldwide challenges since the cost performance is one of the main measures of success along with schedule performance. To overcome this problem, studies were conducted to investigate the cost overruns' factors, also projects' historical data were analyzed to extract new and useful knowledge from it. This research is studying and analyzing the effect of some factors causing cost overruns using the historical data from completed construction projects. Then, using these factors to estimate the probability of cost overrun occurrence and predict its percentage for future projects. First, an intensive literature review was done to study all the factors that cause cost overrun in construction projects, then another review was done for previous researcher papers about mining process in dealing with cost overruns. Second, a proposed data warehouse was structured which can be used by organizations to store their future data in a well-organized way so it can be easily analyzed later. Third twelve quantitative factors which their data are frequently available at construction projects were selected to be the analyzed factors and suggested predictors for the proposed model.

Keywords: construction management, construction projects, cost overrun, cost performance, data mining, data warehousing, knowledge discovery, knowledge management

Procedia PDF Downloads 367
1382 Presenting a Model for Predicting the State of Being Accident-Prone of Passages According to Neural Network and Spatial Data Analysis

Authors: Hamd Rezaeifar, Hamid Reza Sahriari

Abstract:

Accidents are considered to be one of the challenges of modern life. Due to the fact that the victims of this problem and also internal transportations are getting increased day by day in Iran, studying effective factors of accidents and identifying suitable models and parameters about this issue are absolutely essential. The main purpose of this research has been studying the factors and spatial data affecting accidents of Mashhad during 2007- 2008. In this paper it has been attempted to – through matching spatial layers on each other and finally by elaborating them with the place of accident – at the first step by adding landmarks of the accident and through adding especial fields regarding the existence or non-existence of effective phenomenon on accident, existing information banks of the accidents be completed and in the next step by means of data mining tools and analyzing by neural network, the relationship between these data be evaluated and a logical model be designed for predicting accident-prone spots with minimum error. The model of this article has a very accurate prediction in low-accident spots; yet it has more errors in accident-prone regions due to lack of primary data.

Keywords: accident, data mining, neural network, GIS

Procedia PDF Downloads 45